This means that the format of the files a converter must write has changed since this document was written. Overall, the information inside the files is the same, but it is organized differently across files and directories.
That said, this page is still a good introduction to the basics of DBnomics vocabulary.
Write a new converter
The aim of this page is to describe a conversion process from source_data to json_data free, starting from a dummy dataset TSV file.
Categories won't be covered here.
Let's consider the following data, in a TSV file:
Country  ccode  Flow    fcode  year  total
France   FR     Import  I      2010  83791
France   FR     Import  I      2011  83332
France   FR     Import  I      2012  82001
Belgium  BE     Import  I      2010  33290
Belgium  BE     Import  I      2011  36002
Belgium  BE     Import  I      2012  39332
Italy    IT     Import  I      2009  ...
Italy    IT     Import  I      2010  ...
Italy    IT     Import  I      2011  77266
Italy    IT     Import  I      2012  89022
France   FR     Export  E      2010  23982
France   FR     Export  E      2011  23777
France   FR     Export  E      2012  24000
Belgium  BE     Export  E      2010  ...
Belgium  BE     Export  E      2011  13277
Belgium  BE     Export  E      2012  14002
Italy    IT     Export  E      2009  ...
Italy    IT     Export  E      2010  ...
Italy    IT     Export  E      2011  59288
Italy    IT     Export  E      2012  61300
Note: some values are unknown (for Italy, for example), i.e. the provider does not know them. In this dataset, unknown values are represented by the string "...".
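As an illustration of the note above, here is a minimal sketch of how a converter might read such a TSV file and detect the "..." marker. It is not the official DBnomics tooling: the names `SAMPLE_TSV`, `parse_total` and `read_rows` are hypothetical, and representing an unknown value as `None` is an assumption made for this example only.

```python
import csv
import io

# Hypothetical excerpt of the TSV file shown above (tab-separated).
SAMPLE_TSV = (
    "Country\tccode\tFlow\tfcode\tyear\ttotal\n"
    "Italy\tIT\tImport\tI\t2010\t...\n"
    "Italy\tIT\tImport\tI\t2011\t77266\n"
)

UNKNOWN_MARKER = "..."  # string used by the provider for an unknown value


def parse_total(raw):
    """Return the value as an int, or None when the provider marks it unknown."""
    raw = raw.strip()
    return None if raw == UNKNOWN_MARKER else int(raw)


def read_rows(tsv_text):
    """Parse the TSV text into a list of dicts with typed 'year' and 'total'."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for row in reader:
        row["year"] = int(row["year"])
        row["total"] = parse_total(row["total"])  # None stands for "unknown" (assumption)
        rows.append(row)
    return rows


rows = read_rows(SAMPLE_TSV)
```

Keeping the marker in a single constant makes it easy to adapt the sketch to a provider that uses another convention for unknown values.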
This dataset contains only 2 dimensions: Country and Flow.
Fixing a value for each of those dimensions makes it possible to extract a series. For example, the series corresponding to Country = 'Belgium' and Flow = 'Export' is:
year  total
2010  ...
2011  13277
2012  14002
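The idea of extracting a series by fixing one value per dimension can be sketched as a simple grouping. This is only an illustration: `ROWS` is a hand-copied subset of the dataset above, `group_into_series` is a hypothetical helper, and `None` is used here as a stand-in for the provider's "..." marker.

```python
from collections import defaultdict

# Subset of the dataset above as (country, flow, year, total) tuples.
# None stands for the provider's "..." marker (assumption for this sketch).
ROWS = [
    ("France", "Export", 2010, 23982),
    ("France", "Export", 2011, 23777),
    ("France", "Export", 2012, 24000),
    ("Belgium", "Export", 2012, 14002),
    ("Belgium", "Export", 2010, None),
    ("Belgium", "Export", 2011, 13277),
    ("Belgium", "Import", 2010, 33290),
]


def group_into_series(rows):
    """Group observations by their (country, flow) dimension values."""
    series = defaultdict(list)
    for country, flow, year, total in rows:
        series[(country, flow)].append((year, total))
    for observations in series.values():
        # Sort by year only, so unknown (None) values are never compared.
        observations.sort(key=lambda obs: obs[0])
    return dict(series)


all_series = group_into_series(ROWS)
belgium_export = all_series[("Belgium", "Export")]
```

Each key of `all_series` is one combination of dimension values, so the number of keys is the number of series found in the rows.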
We didn't use the dimensions_codes given by the provider for the "Country" dimension (i.e. the dimension with dimension_label="Country"): we used "geo" in DBnomics.
We didn't use the dimensions_values_codes given by the provider for the dimension_values_labels "France", "Italy" and "Belgium": we used "fra", "ita" and "bel" (not "FR", "IT" and "BE" as given in the source file).
We used the dimensions_values_codes given by the provider for the "flow" dimension ("I" and "E"). We could have chosen something else.
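The code choices described in these notes can be sketched as plain lookup tables. The codes themselves ("geo", "flow", "fra", "ita", "bel", "I", "E") come from the notes above; the dictionary layout and the helper `to_dbnomics_dimensions` are hypothetical and are not part of any DBnomics API.

```python
# Provider value codes -> value codes chosen for the "geo" dimension.
GEO_VALUE_CODES = {"FR": "fra", "IT": "ita", "BE": "bel"}

# For the "flow" dimension the provider's value codes are kept unchanged.
FLOW_VALUE_CODES = {"I": "I", "E": "E"}


def to_dbnomics_dimensions(ccode, fcode):
    """Map the provider's codes for one row to the chosen dimension codes.

    Hypothetical helper: raises KeyError if a code is not in the tables,
    which makes unexpected provider codes visible early.
    """
    return {
        "geo": GEO_VALUE_CODES[ccode],
        "flow": FLOW_VALUE_CODES[fcode],
    }


belgium_export_dims = to_dbnomics_dimensions("BE", "E")
```

Keeping an explicit table even for the unchanged "flow" codes is a design choice for this sketch: it documents which provider codes are expected and leaves room to rename them later.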