SECO
Challenges
Excel format
Excel (.xls) files with labels internationalized in four languages (German, French, English and Italian), selected with a combo box on the first sheet. The default language is German. Selecting another language automatically translates the labels used in all other sheets.
The Excel language selection macro doesn't work with LibreOffice Calc: none of the labels appear :-(
The solution found is to generate one CSV file per sheet using ssconvert (a command-line tool in the gnumeric package). All CSV files contain German labels. One additional file contains all the labels in the different languages. The Python script first loads the internationalized messages and can then replace the German labels with English ones in the data CSV files.
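The label replacement step can be sketched as follows. This is a minimal illustration, not the project's actual script: the layout of the labels file (one column per language, headed de, fr, en, it), the sample labels and the function names load_labels and translate_rows are all assumptions.

```python
import csv
import io

# Assumed export step, run separately from this script:
#   ssconvert -S data.xls data_%s.csv
# (-S writes one file per sheet; %s is replaced by the sheet name)


def load_labels(labels_file, source="de", target="en"):
    """Map each source-language label to its target-language label."""
    reader = csv.DictReader(labels_file)
    return {row[source]: row[target] for row in reader if row[source]}


def translate_rows(rows, mapping):
    """Replace cells that exactly match a known label; leave other cells alone."""
    return [[mapping.get(cell, cell) for cell in row] for row in rows]


# Hypothetical labels file, one column per language.
labels_file = io.StringIO(
    "de,fr,en,it\n"
    "Jahr,Année,Year,Anno\n"
    "Bruttoinlandprodukt,Produit intérieur brut,"
    "Gross domestic product,Prodotto interno lordo\n"
)
mapping = load_labels(labels_file)

# Hypothetical rows from one data CSV file (German labels, then values).
data = [["Jahr", "Bruttoinlandprodukt"], ["2010", "608831"]]
translated = translate_rows(data, mapping)
```

Only cells that exactly match a German label are replaced, so numeric cells and periods pass through unchanged.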
Time series codes
Data Excel files contain several time series with an (annual or quarterly) period list. Time series names are built by concatenating hierarchical headers.
Time series codes are given, but:
- sometimes codes are absent: a code must be built from the hierarchical headers (assigning one code per label)
- sometimes codes are duplicated within a sheet: in that case too, a code must be built from the hierarchical headers
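One possible way to handle both cases is sketched below. The slug rule, the "." separator and the sample series are assumptions made for illustration, not the scheme actually used by the script.

```python
import re
from collections import Counter


def slug(label):
    """Lowercase a header label and collapse non-alphanumeric runs to '_'."""
    return re.sub(r"[^a-z0-9]+", "_", label.lower()).strip("_")


def resolve_codes(series):
    """series: list of (code, headers) tuples for one sheet.

    A given code is kept when it is present and unique in the sheet;
    otherwise a code is built from the hierarchical headers.
    """
    counts = Counter(code for code, _ in series if code)
    resolved = []
    for code, headers in series:
        if code and counts[code] == 1:
            resolved.append(code)
        else:
            resolved.append(".".join(slug(h) for h in headers))
    return resolved


# Hypothetical sheet: one missing code and one duplicated code.
sheet = [
    ("GDP", ["Gross domestic product"]),
    ("", ["Consumption", "Private households"]),
    ("INV", ["Investment", "Construction"]),
    ("INV", ["Investment", "Equipment"]),
]
codes = resolve_codes(sheet)
```

Because the same label always produces the same slug, each label gets exactly one code, and the hierarchy keeps sibling series distinct.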
Concept Level and Growth rate
In each sheet, each time series takes 2 columns:
- the 1st one contains an int value for the level
- the 2nd one contains a percentage (float) value for the growth rate
But in the sheets called gc_q and gc_y, the growth rate value comes first and the second column remains empty!
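The two column layouts described above can be read with a small helper like the one below. It is a sketch under assumptions: the period column is taken to be already removed, percentages are taken to be plain decimals, and the sheet name real_q and the function names are hypothetical.

```python
def to_int(cell):
    """Parse a level cell; empty cells become None."""
    return int(cell) if cell.strip() else None


def to_float(cell):
    """Parse a growth rate cell; empty cells become None."""
    return float(cell) if cell.strip() else None


def split_row(sheet_name, cells):
    """Split one data row into (level, growth_rate) pairs, one per series."""
    if len(cells) % 2 != 0:
        raise ValueError("expected two columns per time series")
    pairs = []
    for i in range(0, len(cells), 2):
        first, second = cells[i], cells[i + 1]
        if sheet_name.startswith("gc_"):
            # gc_q / gc_y: growth rate comes first, second column is empty
            pairs.append((None, to_float(first)))
        else:
            pairs.append((to_int(first), to_float(second)))
    return pairs


regular = split_row("real_q", ["100", "1.5", "200", "-0.3"])
growth_only = split_row("gc_q", ["1.5", "", "-0.3", ""])
```

Returning None for the missing level keeps the output shape identical for every sheet, so downstream code does not need a special case for gc_q and gc_y.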
Last column
In gc_* sheets, the rightmost column is empty: it is displayed in Excel but not exported to CSV. We have to add a ghost time series to keep the Level and Growth rate time series balanced.
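One reading of this fix is to pad each gc_* row with the empty cell that the CSV export dropped, so that every series again spans two columns. The helper below is a sketch of that reading; the function name and the assumption that the period column has already been removed are not from the original script.

```python
def pad_gc_row(sheet_name, cells):
    """Append one empty 'ghost' cell when a gc_* row has an odd number of data cells."""
    if sheet_name.startswith("gc_") and len(cells) % 2 == 1:
        return cells + [""]
    return cells


# The trailing empty column is missing from the exported gc_y row.
padded = pad_gc_row("gc_y", ["1.5", "", "-0.3"])
# Rows that are already balanced, or from other sheets, are left alone.
untouched = pad_gc_row("gc_y", ["1.5", "", "-0.3", ""])
```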