Consume data locally

EPIC: #519

  • As a data consumer
  • I want to consume data locally
  • in order to save bandwidth (and server resources).

Acceptance criteria

  • a client MUST be able to build a data-frame from a local dataset (downloaded without using the web API)

Description

Problems:

  • Downloading a large number of series through the web API consumes too many server resources
    • We must find a way to let clients replicate data locally and work on it without relying on the DBnomics infrastructure

Goals:

  • download a single dataset locally

Ideas:

  • load DataFrame from a Git bare repo
    # `./ameco-json-data.git` has been cloned from DBnomics GitLab instance in bare mode
    ameco = DBnomics(provider_dir="./ameco-json-data.git")
    ameco.to_df(dataset="ZUTN")
    ameco.to_df(dataset="ZUTN", series="AUS.1.0.0.0.ZUTN")
    # access revisions
    ameco.to_df(dataset="ZUTN", series="AUS.1.0.0.0.ZUTN", revisions=True)
    ameco.to_df(dataset="ZUTN", series="AUS.1.0.0.0.ZUTN", revision="{SHA1}")
  • instrument git clone step also
    # `./data` is an opaque directory managed by the client
    client = DBnomics(data_dir="./data")
    ameco = client.fetch(provider="AMECO")
    ameco.to_df(dataset="ZUTN", series="AUS.1.0.0.0.ZUTN")
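The first idea above could be prototyped with plain file access. The sketch below is a minimal, hypothetical illustration only: the on-disk layout (one JSON file per series with `period` and `value` arrays) and the `load_series` helper are assumptions for illustration, not the actual DBnomics storage format or client API.

```python
import json
import tempfile
from pathlib import Path

def load_series(dataset_dir, series_code):
    """Read one series file from a local dataset directory and return
    (period, value) rows. Assumes a hypothetical layout where each series
    is stored as `<series_code>.json` with "period" and "value" arrays."""
    path = Path(dataset_dir) / f"{series_code}.json"
    doc = json.loads(path.read_text())
    return list(zip(doc["period"], doc["value"]))

# Demo: a temporary directory stands in for a locally cloned dataset.
with tempfile.TemporaryDirectory() as tmp:
    sample = {"period": ["2017", "2018"], "value": [5.6, 5.3]}
    (Path(tmp) / "AUS.1.0.0.0.ZUTN.json").write_text(json.dumps(sample))
    rows = load_series(tmp, "AUS.1.0.0.0.ZUTN")
    print(rows)  # [('2017', 5.6), ('2018', 5.3)]
```

A real implementation would additionally need to read series out of a bare Git repository (e.g. via `git show` or a library such as pygit2) to support the `revisions`/`revision` options sketched above.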

Questions:

  • if the server is restarted while a client is downloading a dataset from the API, using pagination, will the client miss some data pages, or will the request fail?

Tasks

  • ...
Edited Oct 28, 2019 by Christophe Benz