I want to parse a CSV file from ONS and convert it into JSON and TSV observations in a JSON repository
In order to store the data from ONS
Acceptance criteria
The dataset code MUST be the name of the CSV file without its extension
The dataset name MUST be the title of the web page containing the filename
The number of produced series MUST be equal to the number of columns, except column A
The series names MUST be equal to values of row 1, except column A
The series codes MUST be equal to values of row 2, except column A
The dimension "unit" MAY be the union of values of rows 3 and 4, except column A
The release date (see #104 (closed)) SHOULD (if "released_at" property is voted for) be values of row 5, except column A
Row 6 MUST be stored as-is in a "next_release_at" property of series.json (see #104 (closed))
The series notes MUST be equal to values of row 7, except column A
Starting with row 8, periods and data MUST be stored until column A contains a blank cell or a period that indicates a different frequency
If column A defines several frequencies, they MUST be stored as different series of the same dataset
If column A defines several frequencies, the frequency MUST be concatenated to the series name and code in order to distinguish them
A hard-coded dimension "frequency" MUST be created, using the frequencies found in the CSV and the usual value labels (M = "Monthly", Q = "Quarterly", etc.)
Visual inspection of data and metadata
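The criteria above could be sketched roughly as follows. This is only a minimal illustration, not the actual fetcher code: the period formats in column A ("1997", "1997 Q1", "1997 JAN"), the "CDID.FREQ" code separator, the way the frequency label is appended to the name, the space-joined "unit" value and the shape of the returned dicts are all assumptions made for the example. The conditional "released_at" property is left out.

```python
import csv
import io
import re

# Assumed period formats for column A (not specified in this issue).
PERIOD_PATTERNS = [
    ("A", re.compile(r"^\d{4}$")),           # e.g. "1997"
    ("Q", re.compile(r"^\d{4} Q[1-4]$")),    # e.g. "1997 Q1"
    ("M", re.compile(r"^\d{4} [A-Z]{3}$")),  # e.g. "1997 JAN"
]
FREQUENCY_LABELS = {"A": "Annual", "Q": "Quarterly", "M": "Monthly"}


def infer_frequency(period):
    """Return "A", "Q" or "M" for a column A value, or None if unrecognized."""
    for code, pattern in PERIOD_PATTERNS:
        if pattern.match(period.strip()):
            return code
    return None


def parse_series(csv_text):
    """Build one series per (column, frequency) pair, keyed by series code."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data_rows = rows[:7], rows[7:]
    series = {}
    for row in data_rows:
        if not row or not row[0].strip():
            break  # a blank cell in column A ends the data
        frequency = infer_frequency(row[0])
        if frequency is None:
            break  # hypothetical choice: stop on an unrecognized period
        for col in range(1, len(header[1])):
            code = f"{header[1][col]}.{frequency}"  # hypothetical separator
            if code not in series:
                unit_parts = [header[2][col].strip(), header[3][col].strip()]
                series[code] = {
                    "code": code,
                    "name": f"{header[0][col]} ({FREQUENCY_LABELS[frequency]})",
                    "dimensions": {"frequency": frequency},
                    "unit": " ".join(part for part in unit_parts if part),
                    "next_release_at": header[5][col],  # stored as-is
                    "notes": header[6][col],
                    "observations": [],
                }
            value = row[col] if col < len(row) else ""
            series[code]["observations"].append((row[0].strip(), value))
    return series
```

With a file that has two data columns and both annual and quarterly periods, this would produce four series, which is the number of columns (except column A) multiplied by the number of frequencies found.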
Hints
assert that cell A1 contains string 'Title'
assert that cell A2 contains string 'CDID'
assert that cell A3 contains string 'PreUnit'
assert that cell A4 contains string 'Unit'
assert that cell A5 contains string 'Release Date'
assert that cell A6 contains string 'Next release'
assert that cell A7 contains string 'Important Notes'
use cell A8 to infer frequency and store it in metadata as dimension
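The assertions on cells A1 to A7 listed in these hints could look like the sketch below. The function name `check_header_labels` and the `rows` argument (the CSV already loaded as a list of lists of strings) are hypothetical; the labels are the ones listed above, and the check is "contains", as the hints say.

```python
# Labels expected in column A, rows 1 to 7, as listed in the hints.
EXPECTED_LABELS = [
    "Title",
    "CDID",
    "PreUnit",
    "Unit",
    "Release Date",
    "Next release",
    "Important Notes",
]


def check_header_labels(rows):
    """Assert that cells A1..A7 contain the expected labels.

    `rows` is assumed to be the CSV content as a list of rows,
    each row being a list of cell strings (hypothetical input shape).
    """
    for index, expected in enumerate(EXPECTED_LABELS):
        # Treat a missing row or an empty row as an empty cell.
        actual = rows[index][0] if index < len(rows) and rows[index] else ""
        assert expected in actual, (
            f"cell A{index + 1}: expected {expected!r}, got {actual!r}"
        )
```

Failing early on an unexpected label makes a change of the ONS file layout visible at once, instead of silently storing shifted metadata.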