Consume data locally
EPIC: #519
- As a data consumer
- I want to consume data locally
- in order to save bandwidth and server resources.
Acceptance criteria
- a client MUST be able to build a DataFrame from a local dataset (downloaded without using the web API)
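A minimal sketch of this criterion, assuming a hypothetical one-JSON-file-per-series layout for a locally downloaded dataset (the actual DBnomics storage format may differ, and `load_series` is an illustrative helper, not part of the client API):

```python
import json
from pathlib import Path
from tempfile import TemporaryDirectory

def load_series(dataset_dir, series_code):
    """Build (period, value) rows from a locally stored series file.

    The one-JSON-file-per-series layout is an assumption for illustration,
    not the actual DBnomics storage format.
    """
    path = Path(dataset_dir) / f"{series_code}.json"
    doc = json.loads(path.read_text())
    return list(zip(doc["periods"], doc["values"]))

with TemporaryDirectory() as tmp:
    # simulate a dataset obtained without the web API
    (Path(tmp) / "AUS.1.0.0.0.ZUTN.json").write_text(
        json.dumps({"periods": ["2019", "2020"], "values": [5.2, 6.5]})
    )
    rows = load_series(tmp, "AUS.1.0.0.0.ZUTN")
    print(rows)  # [('2019', 5.2), ('2020', 6.5)]
```

The same rows could then be fed to a `pandas.DataFrame` constructor on the client side.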
Description
Problems:
- Downloading a large number of series through the web API consumes too many server resources
- We need a way to let clients replicate data locally and work on it without relying on the DBnomics infrastructure
Goals:
- download a single dataset locally
Ideas:
- load DataFrame from a Git bare repo
```python
# `./ameco-json-data.git` has been cloned from the DBnomics GitLab instance in bare mode
ameco = DBnomics(provider_dir="./ameco-json-data.git")
ameco.to_df(dataset="ZUTN")
ameco.to_df(dataset="ZUTN", series="AUS.1.0.0.0.ZUTN")
# access revisions
ameco.to_df(dataset="ZUTN", series="AUS.1.0.0.0.ZUTN", revisions=True)
ameco.to_df(dataset="ZUTN", series="AUS.1.0.0.0.ZUTN", revision="{SHA1}")
```
- instrument the `git clone` step as well

```python
# `./data` is a black-box directory
client = DBnomics(data_dir="./data")
ameco = client.fetch(provider="AMECO")
ameco.to_df(dataset="ZUTN", series="AUS.1.0.0.0.ZUTN")
```
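One way to sketch the "instrument `git clone`" idea: the client derives the clone command from the provider name and runs it behind the scenes. The repository URL pattern and the `build_clone_command` helper below are assumptions for illustration, not an agreed design:

```python
def build_clone_command(provider, data_dir):
    """Return the `git clone` invocation the client could run internally.

    The GitLab URL pattern below is an assumption for illustration.
    """
    repo = f"{provider.lower()}-json-data.git"
    repo_url = f"https://git.example.org/dbnomics-json-data/{repo}"
    dest = f"{data_dir}/{repo}"
    # bare mode, matching the first idea above
    return ["git", "clone", "--bare", repo_url, dest]

cmd = build_clone_command("AMECO", "./data")
print(" ".join(cmd))
```

The client could then pass this list to `subprocess.run`, keeping `./data` a black box from the user's point of view.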
Questions:
- if the server is restarted while a client is downloading a paginated dataset from the API, will the client miss some data pages, or will the request fail?
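To make the concern concrete, here is a toy illustration (not the DBnomics API) of how offset-based pagination can duplicate or skip items whenever the underlying data changes between two page requests, restart or not:

```python
def fetch_page(items, offset, limit):
    """Naive offset-based pagination over an in-memory list."""
    return items[offset:offset + limit]

series = [f"s{i}" for i in range(6)]
page1 = fetch_page(series, 0, 3)   # ['s0', 's1', 's2']
# a new series appears at the front between the two requests
series.insert(0, "s_new")
page2 = fetch_page(series, 3, 3)   # ['s2', 's3', 's4'] -> 's2' is duplicated
assert page1[-1] == page2[0]
```

A deletion instead of an insertion would shift items the other way and silently drop one. Cursor-based pagination or downloading a consistent snapshot (e.g. a Git commit) avoids this class of problem.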
Tasks
-
...
Edited by Christophe Benz