
Last edited by Bruno Duyé Aug 17, 2020

Test a fetcher on the pre-production server

This documentation applies to the pre-production server. More info on the servers page.

See also: https://git.nomics.world/dbnomics/fetchers-envs

Pyenv is used to get access to the latest stable Python version when the OS does not provide it.

Super quick guide

In a nutshell:

  • clone, download and convert
    • export PROVIDER_SLUG=xxxx
    • cd ~cepremap/fetchers-envs
    • ./create-fetcher-env.sh $PROVIDER_SLUG
  • or, for a specific Python version: ./create-fetcher-env.sh --pyenv x.y.z $PROVIDER_SLUG
    • cd $PROVIDER_SLUG/$PROVIDER_SLUG-fetcher
    • pyenv activate ${PROVIDER_SLUG}-fetcher
    • ../../download.py
    • ../../convert.py
  • Validation & Solr import
    • pyenv activate dbnomics
    • dbnomics-validate ~/json-data/${PROVIDER_SLUG}-json-data/
    • dbnomics-solr index-provider ~/json-data/${PROVIDER_SLUG}-json-data
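The sequence above can be turned into a single dry-run script. This is a hypothetical sketch (no such script exists in the repo): it only echoes each command so you can review the whole plan before executing anything; swap the `echo` in `run` for `"$@"` to actually run the steps.

```shell
#!/bin/sh
# Dry-run sketch of the "in a nutshell" sequence. Each step is echoed, not run.
PROVIDER_SLUG="${PROVIDER_SLUG:-ecb}"   # example slug from this page

run() { echo "+ $*"; }                  # replace `echo "+ $*"` with `"$@"` to execute

run cd ~cepremap/fetchers-envs
run ./create-fetcher-env.sh "$PROVIDER_SLUG"
run cd "$PROVIDER_SLUG/$PROVIDER_SLUG-fetcher"
run pyenv activate "${PROVIDER_SLUG}-fetcher"
run ../../download.py
run ../../convert.py
run pyenv activate dbnomics
run dbnomics-validate ~/json-data/"${PROVIDER_SLUG}"-json-data/
run dbnomics-solr index-provider ~/json-data/"${PROVIDER_SLUG}"-json-data
```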

Detailed guide

Create a new fetcher environment

As a prerequisite, the repositories for the fetcher source code, source data, and JSON data must already exist. If necessary, create them with the create-repositories-for-provider.py script.

The create-fetcher-env.sh script can be found in this repo.

We take ecb as an example.

ssh cepremap@eros.nomics.world

cd ~/fetchers-envs
./create-fetcher-env.sh ecb  # Replace ecb with the actual provider slug.

Mount remote data via sshfs

./mount-fetchers-envs-eros.sh

# unmount
./umount-fetchers-envs-eros.sh
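Before mounting, it can help to check that the target directory is not already a mount point, to avoid stacking sshfs mounts. A sketch of such a guard (the `mnt` path and the skip logic are assumptions; the real mount command lives in mount-fetchers-envs-eros.sh):

```shell
#!/bin/sh
# Guard sketch: only suggest mounting when the directory is not already mounted.
mnt="${HOME}/fetchers-envs-eros"
mkdir -p "$mnt"
if mountpoint -q "$mnt" 2>/dev/null; then
    state="mounted"
    echo "$mnt is already mounted; skipping"
else
    state="unmounted"
    echo "$mnt not mounted; would run: sshfs cepremap@eros.nomics.world:fetchers-envs $mnt"
fi
```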

Running download or convert

We take ecb as an example.

ssh cepremap@eros.nomics.world

PROVIDER_SLUG=ecb

cd ~/fetchers-envs/${PROVIDER_SLUG}
source ${PROVIDER_SLUG}-venv/bin/activate
# or using pyenv
pyenv activate ${PROVIDER_SLUG}-fetcher

cd ${PROVIDER_SLUG}-fetcher

"Manual" method

rm -rf ../${PROVIDER_SLUG}-source-data/*; python download.py ../${PROVIDER_SLUG}-source-data/
# or
rm -rf ../${PROVIDER_SLUG}-json-data/*; python convert.py ../${PROVIDER_SLUG}-source-data/ ../${PROVIDER_SLUG}-json-data/
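The "clean then run" pattern above can be wrapped in a small helper. This is a sketch (not part of the repo), demonstrated with a stand-in `touch` command instead of download.py/convert.py; the `${out_dir:?}` expansion aborts if the variable is empty, so an unset variable can never turn into `rm -rf /*`.

```shell
#!/bin/sh
# Helper sketch: empty the output directory, then run the given command.
clean_run() {
    out_dir=$1; shift
    rm -rf "${out_dir:?}"/*   # :? aborts on empty var, guarding against rm -rf /*
    "$@"
}

# Demo with a temporary directory and a stand-in command:
demo_dir=$(mktemp -d)
touch "$demo_dir/stale.json"
clean_run "$demo_dir" touch "$demo_dir/fresh.json"
ls "$demo_dir"
```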

"Assisted" method - using bduye scripts

You can also use bduye's scripts, which automate:

  • cleaning the source or JSON directory (depending on whether the script is download or convert)
  • calling the fetcher script with the correct directories, plus any custom arguments you pass

Example:

~/fetchers-envs/eurostat$ ../../convert.py --datasets teicp290 --full

is equivalent to:

~/fetchers-envs/eurostat$ ./convert.py ../eurostat-source-data ../eurostat-json-data --datasets teicp290 --full
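What the assisted wrapper does can be emulated in a few lines. This is an assumption about its behavior, not the real bduye script: derive the provider slug from the current directory name, clean the JSON directory, then call convert.py with the standard directories plus your extra arguments. A stand-in convert.py just echoes its arguments.

```shell
#!/bin/sh
# Emulation sketch of the assisted wrapper, in a throwaway directory tree.
work=$(mktemp -d)   # stands in for ~/fetchers-envs
mkdir -p "$work/eurostat" "$work/eurostat-source-data" "$work/eurostat-json-data"
cd "$work/eurostat"
printf '#!/bin/sh\necho "convert.py called with: $*"\n' > convert.py
chmod +x convert.py

slug=$(basename "$PWD")                 # derive the provider slug from the cwd
rm -rf "../${slug:?}-json-data"/*       # clean step
out=$(./convert.py "../${slug}-source-data" "../${slug}-json-data" --datasets teicp290 --full)
echo "$out"
```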

Using pyenv to choose specific Python version

pyenv allows you to choose a specific Python version.

Create a virtualenv using pyenv:

pyenv versions # list installed versions
pyenv virtualenv ${PYTHON_VERSION} ${PROVIDER_SLUG}-fetcher
pyenv activate ${PROVIDER_SLUG}-fetcher

Delete a virtualenv using pyenv:

pyenv virtualenv-delete ${PROVIDER_SLUG}-fetcher

Install a specific Python version:

pyenv install --list # list available versions to install
CONFIGURE_OPTS=--enable-shared pyenv install $PYTHON_VERSION
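After activating an environment, you may want to double-check which interpreter is actually active before running a fetcher that needs a specific version. A small sketch:

```shell
#!/bin/sh
# Print the resolved interpreter path and its version, e.g. before a fetcher run.
py=$(command -v python3 || command -v python)
ver=$("$py" -c 'import sys; print("%d.%d.%d" % sys.version_info[:3])')
echo "using $py ($ver)"
```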

Validating converted data

We take ecb provider as an example.

ssh cepremap@eros.nomics.world

PROVIDER_SLUG=ecb

pyenv activate dbnomics

# optional: update packages
pip install -U dbnomics-data-model dbnomics-solr

dbnomics-validate ~/json-data/${PROVIDER_SLUG}-json-data/
# No error should be displayed. Use --log=debug to see more details.
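When testing several fetchers in one session, you can validate them in a loop and collect a summary. This is a hypothetical sketch: it assumes dbnomics-validate is on PATH inside the dbnomics pyenv, and falls back to a no-op when the command is absent so the sketch can be exercised anywhere.

```shell
#!/bin/sh
# Validate several providers in one pass and summarize the failures.
VALIDATE="${VALIDATE_CMD:-dbnomics-validate}"
command -v "$VALIDATE" >/dev/null 2>&1 || VALIDATE=:   # no-op fallback for the sketch

failed=""
for slug in ecb eurostat; do
    if $VALIDATE ~/json-data/"${slug}"-json-data/; then
        echo "OK: $slug"
    else
        failed="$failed $slug"
    fi
done

if [ -z "$failed" ]; then summary="all providers valid"; else summary="failed:$failed"; fi
echo "$summary"
```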

Importing converted data into Apache Solr

We take ecb provider as an example.

ssh cepremap@eros.nomics.world

PROVIDER_SLUG=ecb

pyenv activate dbnomics

# optional: update packages
pip install -U dbnomics-data-model dbnomics-solr

dbnomics-solr index-provider ~/json-data/${PROVIDER_SLUG}-json-data

# If the indexation raises an error like:
#    "msg":"ERROR: [doc=ELSTAT/DKT15/DKT15-2-1_0020_F_A] unknown field 'dimensions_values_labels'",
# then you must reset the Solr core (see section below).

Now you may verify that ecb is visible in the API and the UI:

  • http://pre.db.nomics.world/providers
  • http://api.pre.db.nomics.world/v22/providers
  • http://api.pre.db.nomics.world/v22/providers/ECB
  • http://pre.db.nomics.world/ECB/MIR (check that dimensions search and charts are OK)

If an internal error is returned by the Web API, follow the "Error handling" section below.

Update the Web API and UI

You may first want to have an up-to-date Web API and UI. Follow the documentation of the respective projects.

Error handling

If the Web API returns an internal error, you can check the server logs.

As root:

tail -f /var/log/uwsgi/app/dbnomics-api-uwsgi-v21.log

See also: troubleshooting

Test Solr queries locally

You can forward the port used by Solr to run queries with HTTP requests.

From your machine:

ssh -N -L 8983:localhost:8983 cepremap@eros.nomics.world

Then open URLs like:

  • http://localhost:8983/solr/dbnomics/query?q=*
  • http://localhost:8983/solr/dbnomics/query?q=provider_code:ECB

Reset the Solr core

Only do this if the converted data follows a data model version newer than the Solr core schema, for example when the import script adds a new field to the Solr schema.

Warning: this deletes absolutely everything in Solr about DBnomics! Be sure to check that you run this on the pre-production server (eros).

As solr user:

./bin/solr delete -c dbnomics
./bin/solr create -c dbnomics
./bin/solr config -c dbnomics -p 8983 -property update.autoCreateFields -value false
rm /var/solr/data/dbnomics/conf/managed-schema

# Get files from https://git.nomics.world/dbnomics/dbnomics-solr/-/tree/master/solr_core_config
cp /path/to/solr_core_config/* /var/solr/data/dbnomics/conf/

As root:

systemctl restart solr.service
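Since the reset deletes everything in Solr about DBnomics, a hostname guard before the destructive commands is a cheap safety net. This is a hypothetical sketch, not part of any existing script:

```shell
#!/bin/sh
# Safety-guard sketch: refuse the destructive Solr reset unless we are on eros.
expected="eros"
host=$(hostname -s 2>/dev/null || hostname)
if [ "$host" = "$expected" ]; then
    check="passed"
    echo "host check passed: $host"
    # ./bin/solr delete -c dbnomics && ./bin/solr create -c dbnomics ...
else
    check="refused"
    echo "refusing: this is $host, not $expected" >&2
fi
```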