# Outdated!

**Warning: this documentation is old. Some parts may still be valid, others may not.**

**The only reference for the data format is the [`sample-json-data-tree` directory in the `dbnomics-data-model` repo](https://git.nomics.world/dbnomics/dbnomics-data-model).**

### Requirements

...

For pedagogical purposes, we will create the tree for the dbnomics project by creating:

- dbnomics-source-data
- dbnomics-json-data
- dbnomics-fetcher

and by cloning:

- dbnomics-data-model

```bash
...
(nomics_env) me@mylaptop:~$ mkdir dbnomics-fetchers
```

At the end of the procedure, a DBNOMICS fetcher should be laid out on your computer like this:

```bash
(nomics_env) me@mylaptop:~$ tree . -L 2
...
```

... using json-schema

* In SSH mode: you must first add your SSH key to your profile on git.nomics.world.

```bash
(nomics_env) me@mylaptop:~$ git clone git@git.nomics.world:dbnomics/dbnomics-data-model.git
```
You will have to add your fingerprint to the server

...

```bash
(nomics_env) me@mylaptop:~$ git clone https://git.nomics.world/dbnomics/dbnomics-data_model
```

* Check that the clone succeeded:

```bash
(nomics_env) me@mylaptop:~$ ls dbnomics-data_model/
dbnomics_data_model  setup.cfg  setup.py
```

```
...
```

Add it to the package requirements with the current version.

> When pulling dbnomics-converters, remember to reinstall the current version with:

```pip install -e dbnomics-data_model/```
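
To confirm that the editable install is visible to the active environment, a quick import check can help. This is a minimal sketch: the helper name `is_importable` is hypothetical, and it assumes the package is importable as `dbnomics_data_model` (the directory name listed in the clone).

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the current interpreter can locate `module_name`."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: the package name matches the directory seen in the clone.
    name = "dbnomics_data_model"
    print(f"{name} importable: {is_importable(name)}")
```

If the check reports `False`, the install probably ran in a different virtual environment than the one currently active.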

... with the targeted datasets.

#### JSON Data

JSON data is a Git repository that stores the results of converting datasets (source-data repository) into **db-nomics datasets** (json-data).

* Create an empty repository for **json-data**:

...

Inside https://git.nomics.world/dbnomics-fetchers, click on `New Project`.

> Leave the visibility set to public.

* Clone it inside your dbnomics-fetchers folder:

```bash
...
```

Corresponds to the `to_source_data.py` script in your fetcher, which populates the source ...

`<provider_slug>_to_source_data.py` is a script that:

* given a provider
* populates the **source-data** repository
* with the raw data of the provider (the specified datasets mentioned in the Analysis)
* by using the most appropriate method
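
The behaviour listed above can be sketched as follows. This is an illustration only, not the actual fetcher code: the function name `populate_source_data`, the `fetch` callable, and the one-file-per-dataset naming are all assumptions. A real script would download from the provider's website or API using whatever method fits that provider.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List


def populate_source_data(
    target_dir: Path,
    dataset_codes: Iterable[str],
    fetch: Callable[[str], bytes],
) -> List[Path]:
    """Store the raw content of each targeted dataset under `target_dir`."""
    target_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for code in dataset_codes:
        # Hypothetical naming scheme: one raw file per dataset code.
        raw_file = target_dir / f"{code}.raw"
        raw_file.write_bytes(fetch(code))
        written.append(raw_file)
    return written


if __name__ == "__main__":
    # Stand-in for a real download; replace with the provider-specific method.
    def fake_fetch(code: str) -> bytes:
        return f"raw content of {code}".encode("utf-8")

    with tempfile.TemporaryDirectory() as tmp:
        for path in populate_source_data(Path(tmp), ["DATASET_A"], fake_fetch):
            print(path.name)
```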

...

Some useful tips:

* Define the targeted datasets and add assertion checks to detect any change in access to those datasets.
* Specify the **source-data repository** for your provider in your `<provider_slug>_to_source_data.py`. GitLab CI will run this script from the command line, so it should take at least one argument: the destination for the datasets, i.e. the path of the source-data repository corresponding to your provider.
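
Both tips can be sketched with the standard library. The dataset codes, the argument name `target_dir`, and the helper names below are hypothetical; only the general shape (one positional argument for the destination, plus an assertion on the expected datasets) comes from the tips above.

```python
import argparse
from pathlib import Path
from typing import Iterable, List, Optional

# Hypothetical dataset codes chosen during the Analysis step.
TARGETED_DATASETS = ["DATASET_A", "DATASET_B"]


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the command line; the only required argument is the destination."""
    parser = argparse.ArgumentParser(
        description="Download raw provider data into the source-data repository."
    )
    parser.add_argument(
        "target_dir",
        type=Path,
        help="path of the source-data repository for this provider",
    )
    return parser.parse_args(argv)


def check_targeted_datasets(available: Iterable[str]) -> None:
    """Fail loudly if a targeted dataset is no longer offered by the provider."""
    missing = sorted(set(TARGETED_DATASETS) - set(available))
    assert not missing, f"Targeted datasets no longer available: {missing}"
```

A script built this way would call `parse_args()` first, list what the provider currently exposes, pass that list to `check_targeted_datasets`, and only then start downloading into `args.target_dir`.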
### JSON DATA

Corresponds to the `to_dbnomics.py` script in your fetcher, which will populate JSON ...

`<provider_slug>_to_dbnomics.py` is a script that:

* given a data_source
* populates the **json-data** repository
* with the selected and converted data, as mentioned in the Analysis
* by using the most appropriate method and the dbnomics-converters built-in functions to help and validate
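
The conversion step could look roughly like the sketch below. Everything specific here is an assumption made for illustration: the input is taken to be a CSV file with `period` and `value` columns, and the output keys (`code`, `period`, `value`) are hypothetical. The real output layout is defined by the `sample-json-data-tree` directory in `dbnomics-data-model`, and the dbnomics-converters helpers are not shown because their API is not described in this document.

```python
import csv
import json
from pathlib import Path


def convert_dataset(source_file: Path, target_dir: Path) -> Path:
    """Convert one raw CSV file (columns: period, value) into a JSON file."""
    with source_file.open(newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

    # Hypothetical output structure, NOT the official DBnomics data model.
    series = {
        "code": source_file.stem,
        "period": [row["period"] for row in rows],
        "value": [float(row["value"]) for row in rows],
    }

    target_dir.mkdir(parents=True, exist_ok=True)
    target_file = target_dir / f"{source_file.stem}.json"
    target_file.write_text(json.dumps(series, indent=2), encoding="utf-8")
    return target_file
```

Keeping the conversion of one file in a single function makes it easy to loop over every file in the source-data repository and to validate each result before committing it to json-data.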