Commit 07e3f8d6 authored by Christophe Benz
parent 98258826
Pipeline #28110 passed with stage in 20 minutes and 41 seconds
# Changelog
### 0.12.6 -> 0.12.7
Non-breaking changes:
- Set dimensions_codes_order if not defined in `dataset.json`.
### 0.12.5 -> 0.12.6
- configure continuous integration to validate DBnomics data
@@ -41,6 +41,7 @@ Each storage directory is versioned using Git in order to track revisions.
Data MUST NOT be stored if it adds no value or if it can be computed from any other data.
As a consequence:
- series names MUST NOT be generated when not provided by source data;
DBnomics can generate a name from the dimensions values codes
@@ -52,6 +53,7 @@ Any commit in the storage directory of a provider MUST reflect a change from the
Data conversions MUST be stable: running a conversion script on the same source-data MUST NOT change converted data.
As a consequence:
- when series codes are generated from a dimensions `dict`, always use the same order;
- properties of JSON objects MUST be sorted alphabetically;
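The two stability rules above can be sketched in Python as follows. This is only an illustration: the `make_series_code` and `dump_stable` helpers, the `.` separator, and the sample dimension names are assumptions, not part of the data model.

```python
import json


def make_series_code(dimensions, dimensions_codes_order, separator="."):
    """Join dimension value codes in an explicit, fixed order.

    Hypothetical helper: relying on the caller-supplied order (rather than
    on dict iteration order) keeps the generated code identical across runs.
    """
    return separator.join(dimensions[code] for code in dimensions_codes_order)


def dump_stable(obj):
    """Serialize with alphabetically sorted properties for reproducible output."""
    return json.dumps(obj, sort_keys=True, ensure_ascii=False, indent=2)


# Sample data, invented for the example.
dimensions = {"freq": "A", "country": "FR"}
order = ["country", "freq"]

series = {"dimensions": dimensions, "code": make_series_code(dimensions, order)}
print(series["code"])
print(dump_stable(series))
```

Running the conversion twice on the same input then yields byte-identical JSON, so Git reports no spurious change.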
@@ -66,6 +68,7 @@ See [its JSON schema](./dbnomics_data_model/schemas/v0.8/provider.json).
This JSON file contains a tree of categories whose leaves are datasets and nodes are categories.
This file is optional:
- if categories are provided by source data, it SHOULD exist;
- if it's missing, DBnomics will generate the tree as a list of datasets ordered lexicographically;
- it MUST NOT be written if it is identical to the generated list mentioned above (due to the general constraint about minimal data)
@@ -120,7 +123,7 @@ Note: The `✓` symbol means that a constraint is validated by the [validation s
- `YYYY-MM` for months (MUST be padded for `MM`)
- `YYYY-MM-DD` for days (MUST be padded for `MM` and `DD`)
- `YYYY-Q[1-4]` for year quarters
- example: `2018-Q1` represents jan to mar 2018, and `2018-Q4` represents oct to dec 2018
- `YYYY-S[1-2]` for year semesters (aka bi-annual, semi-annual)
- example: `2018-S1` represents jan to jun 2018, and `2018-S2` represents jul to dec 2018
- `YYYY-B[1-6]` for pairs of months (aka bi-monthly)
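As a rough illustration of the period formats listed in this hunk, the following Python sketch recognizes them with regular expressions. The `PERIOD_PATTERNS` table and the `detect_period_format` function are hypothetical names, and only the formats visible here are covered; this is not the project's validation script.

```python
import re

# One pattern per period format shown above (hypothetical table).
# The "day" pattern checks padding and ranges only, not calendar validity.
PERIOD_PATTERNS = {
    "month": re.compile(r"\d{4}-(0[1-9]|1[0-2])"),
    "day": re.compile(r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])"),
    "quarter": re.compile(r"\d{4}-Q[1-4]"),
    "semester": re.compile(r"\d{4}-S[12]"),
    "bimester": re.compile(r"\d{4}-B[1-6]"),
}


def detect_period_format(period):
    """Return the name of the matching format, or None if nothing matches."""
    for name, pattern in PERIOD_PATTERNS.items():
        if pattern.fullmatch(period):
            return name
    return None


for value in ["2018-01", "2018-1", "2018-Q4", "2018-S2", "2018-B6"]:
    print(value, detect_period_format(value))
```

An unpadded month such as `2018-1` matches no pattern, which reflects the padding requirement.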
@@ -142,6 +145,7 @@ Note: The `✓` symbol means that a constraint is validated by the [validation s
### Meta-data
Time series meta-data can be stored either:
- in `{dataset_code}/dataset.json` under the `series` property as a JSON array of objects
- in `{dataset_code}/series.jsonl`, a JSON-lines file, each line being a (non-indented) JSON object
@@ -154,6 +158,7 @@ Constraints additional to the schema:
- ✓ The `code` properties of the series list MUST be unique
- [this dataset](./tests/fixtures/provider1-json-data/dataset1) stores time series meta-data in `dataset.json` under the `series` property
- [this dataset](./tests/fixtures/provider2-json-data/dataset1) stores time series meta-data in `series.jsonl`
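The uniqueness constraint on `code` can be checked along these lines when the meta-data lives in `series.jsonl`. This is a minimal sketch: `find_duplicate_series_codes` is a hypothetical helper, not the project's validation script, and the sample lines are invented.

```python
import json
from collections import Counter


def find_duplicate_series_codes(lines):
    """Return, sorted, the series codes that occur more than once.

    `lines` is any iterable of JSON-lines strings, for example an open file.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        counts[json.loads(line)["code"]] += 1
    return sorted(code for code, count in counts.items() if count > 1)


# Invented sample: one JSON object per line, as in series.jsonl.
sample = [
    '{"code": "A.FR"}',
    '{"code": "A.DE"}',
    '{"code": "A.FR"}',
]
print(find_duplicate_series_codes(sample))
```

Because the function accepts any iterable of lines, a real file can be passed directly, which avoids loading a large `series.jsonl` into memory at once.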
@@ -169,11 +174,11 @@ It is possible to encode this order in `dataset.json` like this:
"dimensions_values_labels": {
"country": [
[ "ALL", "All countries" ],
[ "AF", "Afghanistan" ],
[ "FR", "France" ],
[ "DE", "Germany" ],
[ "OTHER", "Other countries" ]
["ALL", "All countries"],
["AF", "Afghanistan"],
["FR", "France"],
["DE", "Germany"],
["OTHER", "Other countries"]
@@ -184,6 +189,7 @@ Another case is when the dimensions values talk about units, and we want to orde
### Observations
Time-series observations can be stored either:
- in `{dataset_code}/{series_code}.tsv` TSV files
- in `{dataset_code}/series.jsonl`, a JSON-lines file, each line being a (non-indented) JSON object, under the `observations` property of each object.
@@ -192,12 +198,14 @@ When a dataset contains a huge number of time series, the number of TSV files fi
Whatever format you choose, the JSON objects are validated against [this JSON schema](./dbnomics_data_model/schemas/v0.8/series.json).
- [this dataset](./tests/fixtures/provider2-json-data/dataset1) stores observations in TSV files
- [this dataset](./tests/fixtures/provider2-json-data/dataset2) stores observations in `series.jsonl`
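For the TSV option, a reader could look roughly like the sketch below. The column layout (a header row followed by `PERIOD` and `VALUE` columns) and the `read_observations_tsv` helper are assumptions made for the example; check the fixtures above for the actual layout.

```python
import csv
import io


def read_observations_tsv(fp):
    """Parse tab-separated observations into a list of row tuples.

    Values are kept as strings: how missing values are encoded is not
    assumed here.
    """
    reader = csv.reader(fp, delimiter="\t")
    return [tuple(row) for row in reader if row]


# Invented sample standing in for a {series_code}.tsv file.
sample = io.StringIO("PERIOD\tVALUE\n2018-Q1\t1.5\n2018-Q2\t1.7\n")
rows = read_observations_tsv(sample)
header, observations = rows[0], rows[1:]
print(header)
print(observations)
```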
## Adding documentation to data (description and notes fields)
Datasets and series can be documented using `description` and `notes` fields.
- `description` explains what the data means
- `notes` presents some remarks about the data. Example: "Before March 2002, exposures were netted across the banking and trading books. This has necessitated a break in the series."
@@ -242,3 +250,13 @@ dbnomics-validate tests/fixtures/provider2-json-data
## Changelog
See [](./ It contains an upgrade guide explaining how to modify the source code of your fetcher, if the data model changes in unexpected ways.
## Publish a new version
For package maintainers:
rm -rf build dist
python setup.py sdist bdist_wheel
twine upload dist/*
@@ -40,7 +40,7 @@ doc_lines = __doc__.split('\n')
author='DBnomics Team',