dbnomics-data-model issues
https://git.nomics.world/dbnomics/dbnomics-data-model/-/issues
2022-01-10T10:40:22Z

Exception in production in complete_series
https://git.nomics.world/dbnomics/dbnomics-data-model/-/issues/6
2022-01-10T10:40:22Z
Christophe Benz <christophe.benz@nomics.world>

## Description
- URL triggering error: https://db.nomics.world/Eurostat/demo_r_mwk3_ts
- Sentry issue: https://sentry.io/organizations/jailbreak_paris/issues/2081069222/?project=5548941&query=&statsPeriod=14d

Encoding issue - when reading series.jsonl
https://git.nomics.world/dbnomics/dbnomics-data-model/-/issues/5
2021-12-14T18:37:35Z
Gyorgy Gyomai

I am working on a new toolbox based version of the Aqicn-fetcher. All seems to be fine but when I attempt to run a validation on the converted output, the program runs into an error traceable back to this line:
https://git.nomics.world/dbnomics/dbnomics-data-model/-/blob/master/dbnomics_data_model/storages/filesystem.py#L196
The file is opened with a wrongly guessed codec, so some lines cannot be decoded and validation aborts. The error is as follows:
```
File "E:\venv\aqicn\lib\site-packages\dbnomics_data_model\storages\filesystem.py", line 202, in iter_series_json_from_jsonl
for line in fp:
File "\\main.oecd.org\em_apps\python\current\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 4777: character maps to <undefined>
- Dataset "aqicn/aqicn" at location aqicn/dataset.json
Error code: storage-error
Message: Could not load "dataset.json"
Encountered errors codes:
- storage-error: 1
```
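The decoding failure in the traceback can be reproduced in isolation. The following is a minimal sketch, not taken from the validator: the sample string is an assumption, chosen only because its UTF-8 encoding contains the byte 0x81, which cp1252 does not map to any character.

```python
# "Á" (U+00C1) is encoded in UTF-8 as the two bytes C3 81.
# The sample string is hypothetical; any text containing 0x81 behaves the same.
raw = "Ávila".encode("utf-8")

try:
    # cp1252 has no character assigned to 0x81, so decoding fails.
    raw.decode("cp1252")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

# Decoding with the codec that was used for writing succeeds.
decoded = raw.decode("utf-8")

print(error_message)
print(decoded)
```

This suggests the data itself is fine and that only the codec chosen when reading differs from the one used when writing.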
We used UTF-8 to write the JSONL file.
What should I do: enforce cp1252 when writing (although I am not sure all the characters would be covered), or is there a way to specify the encoding in the validator?
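Two possible workarounds, sketched under assumptions: `read_jsonl` is a hypothetical helper and not part of dbnomics-data-model, and the sample record is made up. The first option passes an explicit encoding when reading. The second keeps the written file pure ASCII by letting `json.dumps` escape non-ASCII characters, so the file decodes identically under UTF-8 and cp1252.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path, encoding="utf-8"):
    """Hypothetical reader: parse one JSON object per non-empty line."""
    with open(path, encoding=encoding) as fp:
        return [json.loads(line) for line in fp if line.strip()]


# Made-up record containing a non-ASCII character.
sample = {"code": "A1", "name": "Ávila"}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "series.jsonl"

    # Option 1 (reader side): the file holds raw UTF-8 and is read as UTF-8,
    # regardless of the platform's default encoding.
    path.write_text(json.dumps(sample, ensure_ascii=False) + "\n", encoding="utf-8")
    rows_utf8 = read_jsonl(path, encoding="utf-8")

    # Option 2 (writer side): ensure_ascii=True (the json default) writes
    # non-ASCII characters as \uXXXX escapes, so even a cp1252 reader works.
    path.write_text(json.dumps(sample, ensure_ascii=True) + "\n", encoding="utf-8")
    rows_escaped = read_jsonl(path, encoding="cp1252")

print(rows_utf8 == rows_escaped == [sample])
```

If neither side can be changed, enabling Python's UTF-8 mode (the `PYTHONUTF8=1` environment variable or `-X utf8`, available since Python 3.7) makes `open()` default to UTF-8 and may be enough on Windows.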