Create a data storage library
Part of #554
Related to #818
Description
Goals:
- abstract source code of fetchers and services (Web API, indexation script, Python client) from storage specificities, letting them manipulate a domain-level data model instead of a storage-level one
- concentrate the source code handling data storage at one place
- improve the documentation of the storage model, with all its variants: `git+tsv`, `git+jsonl`, `filesystem+tsv`, `filesystem+jsonl`
Features:
- serialize and deserialize data model instances to many backend storages (e.g. `git+tsv`, `git+jsonl`)
- provide capabilities like accessing past revisions
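To make the first feature more concrete, here is a minimal sketch of what a storage abstraction could look like. Everything in it is an assumption made for illustration: the `Series` model, the `Storage` interface, the `FileSystemJsonLinesStorage` class and the `series.jsonl` file name are hypothetical and do not describe the actual API of the future package. Capabilities such as accessing past revisions are left out.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class Series:
    """Hypothetical domain-level model: a series code and its observations."""

    code: str
    observations: list[tuple[str, float]] = field(default_factory=list)


class Storage(ABC):
    """Hypothetical interface that each storage variant would implement."""

    @abstractmethod
    def save_series(self, dataset_code: str, series: Series) -> None: ...

    @abstractmethod
    def load_series(self, dataset_code: str) -> list[Series]: ...


class FileSystemJsonLinesStorage(Storage):
    """Sketch of a filesystem+jsonl variant: one file per dataset, one JSON object per line."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def _dataset_file(self, dataset_code: str) -> Path:
        # The file layout is an assumption, not the documented storage model.
        return self.root / dataset_code / "series.jsonl"

    def save_series(self, dataset_code: str, series: Series) -> None:
        path = self._dataset_file(dataset_code)
        path.parent.mkdir(parents=True, exist_ok=True)
        # Append one serialized series per line.
        with path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(asdict(series)) + "\n")

    def load_series(self, dataset_code: str) -> list[Series]:
        path = self._dataset_file(dataset_code)
        if not path.exists():
            return []
        result = []
        with path.open(encoding="utf-8") as f:
            for line in f:
                if not line.strip():
                    continue
                raw = json.loads(line)
                # JSON has no tuple type, so rebuild (period, value) tuples.
                observations = [(period, value) for period, value in raw["observations"]]
                result.append(Series(code=raw["code"], observations=observations))
        return result


if __name__ == "__main__":
    # Callers only manipulate Series instances, never file paths or JSON.
    with tempfile.TemporaryDirectory() as tmp:
        storage: Storage = FileSystemJsonLinesStorage(Path(tmp))
        storage.save_series("dataset1", Series("A.FR", [("2020", 1.5)]))
        print(storage.load_series("dataset1"))
```

With an interface of this kind, fetchers and services would depend only on `Storage` and `Series`, and a `git+tsv` or `git+jsonl` variant could be swapped in without changing their code.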
Use cases:
- the convert scripts of fetchers write data model instances to storage
- indexation script and web API read data model instances from storage
- simplify towards one storage model
- get rid of bare repositories by converting to `git+jsonl` the repositories that can't be checked out because they contain too many TSV files
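The last use case can be illustrated with a rough sketch of a TSV to JSON Lines conversion, where many per-series TSV files collapse into a single text with one line per series. The function names, the `PERIOD`/`VALUE` header and the shape of the JSON objects are assumptions chosen for the example, not the documented storage model.

```python
import csv
import io
import json


def tsv_series_to_json_line(series_code: str, tsv_text: str) -> str:
    """Turn the TSV content of one series into a single JSON line (hypothetical layout)."""
    reader = csv.reader(io.StringIO(tsv_text, newline=""), delimiter="\t")
    rows = [row for row in reader if row]  # skip blank lines
    if not rows:
        raise ValueError(f"no rows found for series {series_code!r}")
    # Keep the header row and the raw string values as they appear in the TSV.
    return json.dumps({"code": series_code, "observations": rows})


def convert_dataset(tsv_by_series_code: dict[str, str]) -> str:
    """Merge many per-series TSV contents into one JSON Lines text."""
    return "".join(
        tsv_series_to_json_line(code, tsv_text) + "\n"
        for code, tsv_text in sorted(tsv_by_series_code.items())
    )


if __name__ == "__main__":
    jsonl = convert_dataset(
        {
            "A.FR": "PERIOD\tVALUE\n2020\t1.5\n2021\t2.0\n",
            "A.DE": "PERIOD\tVALUE\n2020\t3.1\n",
        }
    )
    print(jsonl, end="")
```

The design point is the file count: a dataset with N series goes from N TSV files to one JSON Lines file, which is what would make such a repository practical to check out.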
Tasks
- create a new Python package: `dbnomics-storage`
- move and adapt parts of `dbnomics_data_model.storage.*` to new package
- adapt clients of `dbnomics_data_model.storage` to new package
- remove `dbnomics_data_model.storage`
Edited by Christophe Benz