
Open Source solution for input-output tables

Input-output tables are an economic kludge of a format which groups together a set of coefficients per country per year (these are specifically country tables) so that you can work out the relative inputs and exports of a country.
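If you haven't come across them, the core idea behind these tables is the Leontief model: a matrix of technical coefficients tells you how much input from each sector is needed per unit of output of every other sector, and inverting it tells you the total output needed to meet a given final demand. A minimal sketch with made-up numbers (this is just the idea, not the Eora format itself):

import numpy as np

# Hypothetical technical coefficients for a 3-sector economy:
# A[i, j] is the input from sector i needed per unit of output of sector j.
A = np.array([
    [0.10, 0.30, 0.05],
    [0.20, 0.05, 0.15],
    [0.05, 0.10, 0.10],
])

# Made-up final demand per sector.
d = np.array([100.0, 50.0, 75.0])

# Solve the Leontief system x = Ax + d for total output x.
x = np.linalg.solve(np.eye(3) - A, d)
print(x)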


Last weekend I helped run a hackathon on sustainable finance. We had some great entries and some very smart approaches to bundling transition risk into credit risk.


We were given sponsored data by a group called Eora, which gave us a set of input-output tables. For my first pass I tried to use pymrio to read them, but I didn't find it that easy, so I thought I'd write my own parser which our team might be able to use more easily.


I'm a bit of a hacky Python programmer, but all the data scientists in my team would need it in Python, so I thought I'd take advantage of some of the Python 3 features and be more OO in my approach. This is what it ended up looking like.


import sys

# CountryTable and CountryTableSegment come from the parser in the repo below
if __name__ == "__main__":
    # Paths to two Eora country table files, passed on the command line
    filepath_1 = sys.argv[1]
    filepath_2 = sys.argv[2]
    # Parse the primary inputs segment of each country table
    table1 = CountryTable(CountryTableSegment.PrimaryInputs, filepath_1)
    table2 = CountryTable(CountryTableSegment.PrimaryInputs, filepath_2)
    # Combine the two tables into a single Pandas DataFrame
    df = table2.append(table1)
    print(df)
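To give an idea of the shape, here's a cut-down sketch of what the enum and class could look like. This is illustrative rather than the actual repo code: the segment names, the tab-separated layout, and the append behaviour are guesses for the sake of the example.

from enum import Enum
import pandas as pd

class CountryTableSegment(Enum):
    # Segment names are assumptions; an Eora country table has several blocks
    PrimaryInputs = "primary_inputs"
    FinalDemand = "final_demand"

class CountryTable:
    def __init__(self, segment, filepath):
        self.segment = segment
        self.filepath = filepath
        # Assumes the table is a tab-separated text file
        self._df = pd.read_csv(filepath, sep="\t")

    def append(self, other):
        # Stack the other table's rows under this one, returning a DataFrame
        return pd.concat([self._df, other._df], ignore_index=True)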
   

The parser returns Pandas DataFrames. I'm going to make a few updates so that it reads directories and also reads each of the other parts of the country table as well.
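The directory version will probably look something like this (a sketch only; the file glob and the reuse of the class sketched above are assumptions):

from pathlib import Path
import pandas as pd

def read_country_tables(directory, segment):
    # Parse every country table file in a directory for one segment
    # and stack the results into a single DataFrame (file pattern assumed)
    frames = [CountryTable(segment, str(path))._df
              for path in sorted(Path(directory).glob("*.txt"))]
    return pd.concat(frames, ignore_index=True)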


You can learn a little about the solution here and check out the code. I'm trying to finish it within the next few weeks and add it to PyPI.


https://github.com/elastacloud/input-output-tables


Interested economists, please feel free to get in touch.
