Port fetchers using pipeline v1 to v5

Tasks

For each fetcher concerned (see the "Fetchers concerned" section below):

try it in pre-production
review with team
if OK, deploy to production
remove the fetcher from pre-production to lighten the instance (set deploy: false on the fetcher's item in the fetchers.yml of the pre-production instance)

Start with simple fetchers (non-incremental) to quickly port the majority of fetchers to pipeline v5, then do the more complex ones.

Technical tasks:

on pre-production, do not add schedules during the dbnomics-fetcher-ops configure job
instead of setting deploy: false in fetchers.yml, remove the item from the list
in dbnomics-fetcher-ops, remove from fetcher projects the webhooks that trigger indexation and validation jobs (there is a TODO) (example)

Fetchers concerned

All fetchers declaring pipeline: v1 in fetchers.yml
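
To keep track of what remains to be ported, the selection above could be sketched as follows. This is only an illustration: it assumes fetchers.yml has already been parsed (for example with PyYAML's yaml.safe_load) into a list of items, and that each item carries a `code` key next to `pipeline`. The `code` key, the list layout and the fetcher names in the example are assumptions, not taken from the real file.

```python
from typing import Any, Dict, List


def fetchers_to_port(fetchers: List[Dict[str, Any]]) -> List[str]:
    """Return the codes of the fetchers that still declare pipeline v1."""
    return [
        item["code"]  # "code" is an assumed key name
        for item in fetchers
        if item.get("pipeline") == "v1"
    ]


# Hypothetical items, standing in for the parsed content of fetchers.yml.
example = [
    {"code": "fetcher-a", "pipeline": "v1"},
    {"code": "fetcher-b", "pipeline": "v5"},
    {"code": "fetcher-c", "pipeline": "v1", "deploy": False},
]

print(fetchers_to_port(example))
```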

Extend the following list to reflect progress:

done

BCEAO
BI
BOJ
EIA
FHFA
NAR
ELSTAT
ND_GAIN
NBS
ONS
pole-emploi
cbo
dares
fao
fh
ilo
indec

Incremental fetchers

Some fetchers read data from Git to implement incremental mode. Pipelines v1 and v2 gave the fetcher access to the cloned Git repository, but v5 gives it an empty directory, so these fetchers must be adapted when they are ported.

Note: to help detect these fetchers, run find -name requirements.txt -exec grep dulwich {} +. However, this grep is not sufficient: each fetcher must also be checked manually.
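
For reference, the same search can be sketched in Python, which may be handy when the result has to feed another script. The function name `find_dulwich_users` is hypothetical, and like the grep it only gives a first hint: a fetcher may read from Git without listing dulwich in requirements.txt.

```python
from pathlib import Path
from typing import List


def find_dulwich_users(root: Path) -> List[Path]:
    """Return the directories whose requirements.txt mentions dulwich."""
    matches = []
    for requirements in sorted(root.rglob("requirements.txt")):
        text = requirements.read_text(encoding="utf-8", errors="replace")
        if "dulwich" in text:
            matches.append(requirements.parent)
    return matches


if __name__ == "__main__":
    # Scan the current directory, as the find command does by default.
    for fetcher_dir in find_dulwich_users(Path(".")):
        print(fetcher_dir)
```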

Checks to be done:

whether download is incremental: in this case, replace the old date comparison strategy with reading FROM_DATETIME
whether convert is incremental: in this case, simply remove the incremental logic and convert all the downloaded datasets
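
The download check above could translate into something like the sketch below. It rests on assumptions: FROM_DATETIME is taken to be an environment variable holding an ISO 8601 datetime (its exact format is not stated here), an absent or empty value is taken to mean a full download, and the `updated_at` key on each dataset is hypothetical.

```python
import os
from datetime import datetime, timezone
from typing import Any, Dict, List, Mapping, Optional


def read_from_datetime(environ: Mapping[str, str] = os.environ) -> Optional[datetime]:
    """Read FROM_DATETIME, or return None when it is absent or empty."""
    raw = environ.get("FROM_DATETIME")
    if not raw:
        return None
    value = datetime.fromisoformat(raw)  # assumed ISO 8601 format
    if value.tzinfo is None:
        # Assumption: naive values are interpreted as UTC.
        value = value.replace(tzinfo=timezone.utc)
    return value


def select_datasets(
    datasets: List[Dict[str, Any]], from_datetime: Optional[datetime]
) -> List[Dict[str, Any]]:
    """Keep the datasets updated at or after from_datetime (all if None)."""
    if from_datetime is None:
        return list(datasets)
    return [d for d in datasets if d["updated_at"] >= from_datetime]
```

With this shape, the fetcher no longer needs the Git history to decide what to download: the date comes from the environment, and the convert step processes whatever was downloaded.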

Fetchers:

Destatis: incremental download using get_last_commit_date, cf branch 821-read-datetime-from-env
Eurostat: incremental download
IMF: incremental download
INSEE: incremental download using LAST_DOWNLOAD_STARTED_AT env var, managed by .gitlab-ci.yml
-> waiting for current developments by @MichelJuillard before moving to v5
UNCTAD: incremental download reading existing file in get_old_source_json_dict

Edited Jan 10, 2022 by Christophe Benz