management issues
https://git.nomics.world/dbnomics-fetchers/management/-/issues
Updated: 2022-11-10T16:40:49Z

Add missing series for a Macroeconomic presentation
https://git.nomics.world/dbnomics-fetchers/management/-/issues/1123
Updated: 2022-11-10T16:40:49Z
Author: Christophe Benz (christophe.benz@nomics.world)

## Context
Some DBnomics users are building a Macroeconomic presentation, and some series from misc providers are missing on DBnomics. This issue tracks their integration.
## Series
- [x] Slide 4: [OECD weekly growth tracker](https://www.oecd.org/economy/weekly-tracker-of-gdp-growth/) (cf. https://git.nomics.world/dbnomics-fetchers/oecd-fetcher/-/issues/1)
- [x] Slide 7: [Consumer confidence index from University of Michigan](https://data.sca.isr.umich.edu)
  - cf. https://db.nomics.world/SCSMICH/MICS/ICS
- [ ] Priority: [Brent price on Nasdaq](https://www.nasdaq.com/fr/market-activity/commodities/bz:nmx/historical)
  - Web page: https://www.nasdaq.com/fr/market-activity/commodities/bz:nmx/historical
  - Raw data:
  - Create a new provider with code NASDAQ and with name "Nasdaq"
  - Create a new dataset with code BRENT and with name "Brent"
- [ ] Priority: [Add Consumer price index to StatCan](https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=1810000401)
  - Web data: https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=1810000401
  - Create a new dataset with name "Consumer price index" and with code "CPI"
  - One dimension per item category
- [ ] Priority: Add Unemployment data at National Level to BLS provider (Bureau of Labor Statistics)
  - Web page and raw data: https://www.bls.gov/web/empsit/cpseea10.htm
- [ ] Priority: Add Notified vacancies dataset to Destatis provider
  - Web page and raw data: https://www.destatis.de/EN/Themes/Economy/Short-Term-Indicators/Labour-Market/karb830.html#249828
- [ ] Priority: [WTI price on Nasdaq](https://www.nasdaq.com/market-activity/commodities/cl:nmx/historical)
  - Web page: https://www.nasdaq.com/market-activity/commodities/cl:nmx/historical
  - Raw data:
  - Create a new dataset with code WTI and with name "WTI" in the existing provider Nasdaq
- [ ] Priority: [Natural Gas price on Nasdaq](https://www.nasdaq.com/fr/market-activity/commodities/ng:nmx/historical)
  - Web page: https://www.nasdaq.com/fr/market-activity/commodities/ng:nmx/historical
  - Raw data:
  - Create a new dataset with code NATURAL_GAS and with name "Natural Gas" in the existing provider Nasdaq
- [ ] Priority: [Copper price on Nasdaq](https://www.nasdaq.com/market-activity/commodities/hg:cmx/historical)
- [ ] Priority: [CBOT Wheat price on Nasdaq](https://www.nasdaq.com/fr/market-activity/commodities/zw)
  - Web page: https://www.nasdaq.com/fr/market-activity/commodities/zw
  - Raw data:
  - Create a new dataset with code WHEAT and with name "Wheat" in the existing provider Nasdaq
- [x] Priority and already done on previous call (to be added): extract the [ICE](https://www.theice.com/) database, in particular the Dutch TTF price
  - Web page: https://www.theice.com/products/27996665/Dutch-TTF-Gas-Futures/data?marketId=5419234
  - Raw data: https://www.theice.com/marketdata/DelayedMarkets.shtml?getHistoricalChartDataAsJson=&marketId=5508663&historicalSpan=3
  - Create a new provider with code ICE and with name "Intercontinental Exchange"
  - Create a new dataset with code DUTCH_TTF_GAS_FUTURES and with name "Dutch TTF Gas Futures"
- [x] Slide 11: [Extract the databases from the following site](https://economy-finance.eceuropa.eu/economic-forecast-and-surveys/business-and-consumer-surveys/download-business-and-consumer-survey-data/time-series_en)
  - OK, cf. https://db.nomics.world/EC/CONSTRUCTION?dimensions=%7B%22country%22%3A%5B%22EU%22%5D%7D
- [ ] Priority: [Shipping container index from the following database](https://fbx.freightos.com)
- [ ] Priority (but difficult to get; an Excel file is available for download with each publication): [Information extracted from the press room of the following site, to get the "Global reliability schedule"](https://www.sea-intelligence.com/press-room/155-schedule-reliability-improves-to-40-in-june-2022)
- [ ] Priority: [Atlanta wage tracker](https://www.atlantafed.org/chcs/wage-growth-tracker)
  - Web page:
  - Raw data: /-/media/documents/datafiles/chcs/wage-growth-tracker/wage-growth-data.xlsx
  - Create a new provider with code ATLANTA_FED and with name "Federal Reserve Bank of Atlanta"
  - Create a new dataset with code WAGE_TRACKER and with name "Wage growth tracker"
- [ ] Priority: [Fed nominal yield curve](https://www.federalreserve.gov/data/nominal-yield-curve.htm)
  - Web page: https://www.federalreserve.gov/data/nominal-yield-curve.htm
  - Raw data: /data/yield-curve-tables/feds200628.csv
  - Create a new dataset with code "NOMINAL_YLD_CURVE" and with name "Nominal Yield Curve" in the existing Fed provider
- [ ] Priority: [Fed TIPS and inflation compensation](https://www.federalreserve.gov/data/tips-yield-curve-and-inflation-compensation.htm)
  - Web page: https://www.federalreserve.gov/data/tips-yield-curve-and-inflation-compensation.htm
  - Raw data: data/yield-curve-tables/feds200805.csv
  - Create a new dataset with code "TIPS_YLD_CURVE" and with name "TIPS and Inflation compensation" in the existing Fed provider
- [ ] Priority: [Banco de España database](https://www.bde.es/bde/en/areas/estadis/)
  - Add the complete database from the website: https://www.bde.es/bde/en/areas/estadis/
- [ ] Priority (the objective is to get the graph with High Yield US Corporate Spreads and High Yield Euro Corporate Spreads, cf. slide 36): [ICE Bofa database](https://indices.theice.com)
  - Web page: https://indices.theice.com
  - Raw data:
  - Create a new provider with code ICE_Bofa and with name "ICE Bofa"
- [ ] Priority: [S&P 500 indices](https://www.nasdaq.com/fr/market-activity/index/spx/historical)
  - Web page: https://www.nasdaq.com/fr/market-activity/index/spx/historical
  - Raw data:
  - Create a new dataset with code "S&P500_INDICES" and with name "S&P 500 Index" in the existing provider Nasdaq
- [ ] Slide 38: [CAC 40 indices](https://live.euronext.com/fr/product/index/FR0003500008-XPAR)
- [ ] Priority: [Shiller CAPE index](http://www.econ.yale.edu/~shiller/data.htm)
  - Web page: http://www.econ.yale.edu/~shiller/data.htm
  - Raw data: data/ie_data.xls
  - Create a new provider with code "SHILLER" and with name "Robert Shiller"
  - Create a new dataset with code CAPE_RATIO and with name "CAPE Ratio"
- [ ] Priority: [VIX index](https://www.cboe.com/tradable_products/vix/vix_historical_data/)
  - Create a new provider CBOE
  - Create a new dataset with name "VIX index" and with code "VIX_INDEX"
- [ ] Priority: Shiller CAPE ratios from Barclays
  - Web page: https://indices.barclays/IM/21/en/indices/static/historic-cape.app
  - Create a new provider with name "Barclays" and with code "BARCLAYS"
  - Have one dimension per country

SCB Download - some datasets are not downloaded: TypeError: list indices must be integers or slices, not str
https://git.nomics.world/dbnomics-fetchers/management/-/issues/1051
Updated: 2021-09-06T07:00:52Z
Author: Michel Juillard

download.py fails with
```
$ time python download.py "$WORKSPACE_SOURCE_DATA_DIR"
download.py:86: DeprecationWarning: Using 'method_whitelist' with Retry is deprecated and will be removed in v2.0. Use 'allowed_methods' instead
requests_retry = Retry(total=12, backoff_factor=2, status_forcelist=[
INFO: * category AA
INFO: * category AM
INFO: * category BE
INFO: * category BO
INFO: * category EN
INFO: * category FM
INFO: * category HA
INFO: Download of dataset 'OImpExpSITC4Ar' took: 47 minutes 14 seconds
WARNING: This dataset is ignored to avoid too long download times
INFO: Download of dataset 'ImpExpSPIN2007M' took: 30 minutes 49 seconds
WARNING: This dataset is ignored to avoid too long download times
WARNING: This dataset is ignored to avoid too long download times
WARNING: This dataset is ignored to avoid too long download times
WARNING: This dataset is ignored to avoid too long download times
INFO: * category HE
ERROR: Unexpected exception occurred while downloading "HushallT31" dataset
Traceback (most recent call last):
File "download.py", line 193, in download_dataset
response = download_dataset_data(url, dataset_code, metadata)
File "download.py", line 168, in download_dataset_data
for dimension in metadata["variables"]
TypeError: list indices must be integers or slices, not str
ERROR: => Dataset HushallT31 aborted
INFO: Download of dataset 'SamForvInk1c' took: 46 minutes 59 seconds
INFO: Download of dataset 'SamForvInk2' took: 25 minutes 54 seconds
INFO: Download of dataset 'InkAvTjanst' took: 2 hours 30 minutes 7 seconds
INFO: Download of dataset 'BeskForvInk' took: 22 minutes 48 seconds
INFO: Download of dataset 'Skatter' took: 6 hours 29 minutes 8 seconds
INFO: Download of dataset 'Skattereduktioner' took: 43 minutes 55 seconds
INFO: * category JO
INFO: * category LE
INFO: * category ME
INFO: * category MI
ERROR: Unexpected exception occurred while downloading "MI0305T003" dataset
Traceback (most recent call last):
File "download.py", line 193, in download_dataset
response = download_dataset_data(url, dataset_code, metadata)
File "download.py", line 168, in download_dataset_data
for dimension in metadata["variables"]
TypeError: list indices must be integers or slices, not str
ERROR: => Dataset MI0305T003 aborted
INFO: * category NR
INFO: * category NV
INFO: * category OE
ERROR: Unexpected exception occurred while downloading "OffTillgSektor" dataset
Traceback (most recent call last):
File "download.py", line 193, in download_dataset
response = download_dataset_data(url, dataset_code, metadata)
File "download.py", line 168, in download_dataset_data
for dimension in metadata["variables"]
TypeError: list indices must be integers or slices, not str
ERROR: => Dataset OffTillgSektor aborted
INFO: * category PR
INFO: * category TK
INFO: * category UF
INFO: Download of dataset 'YHStudT1dN' took: 55 minutes
INFO: Download of dataset 'UF0542T2B' took: 20 minutes 15 seconds
INFO: Download of dataset 'UF05031cHgsk' took: 2 hours 51 minutes 7 seconds
INFO: Download of dataset 'UF05032c' took: 1 hour 56 minutes 43 seconds
INFO: Download of dataset 'UF05035a' took: 21 minutes 6 seconds
INFO: Download of dataset 'UF05035c' took: 15 hours 25 minutes 33 seconds
ERROR: Job failed: execution took longer than 48h0m0s seconds
```

refactor UNCTAD fetcher
https://git.nomics.world/dbnomics-fetchers/management/-/issues/1027
Updated: 2022-03-01T16:49:00Z
Author: Michel Juillard
Due date: 2022-04-05
Assignee: Michel Juillard

- Bulk download is available
- CSV files
- Updated files are announced on RSS feed

OECD: import datasets from iLibrary
https://git.nomics.world/dbnomics-fetchers/management/-/issues/926
Updated: 2021-03-08T09:36:40Z
Author: Christophe Benz (christophe.benz@nomics.world)

## Description
Data acquisition via the OECD web API is difficult because of rate limiting and response size limits.
DBnomics has access to the OECD iLibrary, which provides datasets as static files.
We need to establish a correspondence table between DBnomics datasets and iLibrary URLs, and evaluate whether those files follow a common data schema that would allow importing them with the same source code.
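
A minimal sketch of how such a correspondence table could be held in code, assuming a plain Python mapping is enough. The names `ILIBRARY_ZIP_URLS`, `ilibrary_item_id` and `common_columns` are hypothetical; the URLs are the ones listed in the table of this issue. The schema check works on header rows supplied by the caller, since the actual layout of the CSV files is not known here.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical mapping: DBnomics OECD dataset code -> iLibrary ZIP URL
# (URLs copied from the table in this issue; a password is needed to download).
ILIBRARY_ZIP_URLS = {
    "KEI": "https://www.oecd-ilibrary.org/key-short-term-indicators-edition-2021-01_e54963ba-en.zip?itemId=%2Fcontent%2Fdata%2Fe54963ba-en&containerItemId=%2Fcontent%2Fcollection%2Fmei-data-en",
    "MEI": "https://www.oecd-ilibrary.org/main-economic-indicators-complete-database-edition-2021-01_f50826d4-en.zip?itemId=%2Fcontent%2Fdata%2Ff50826d4-en&containerItemId=%2Fcontent%2Fcollection%2Fmei-data-en",
    "MIG": "https://www.oecd-ilibrary.org/international-migration-database-edition-2019_9bb6f3f6-en.zip?itemId=%2Fcontent%2Fdata%2F9bb6f3f6-en&containerItemId=%2Fcontent%2Fcollection%2Fmig-data-en",
    "QNA": "https://www.oecd-ilibrary.org/quarterly-national-accounts-edition-2020-2_1c70c4a5-en.zip?itemId=%2Fcontent%2Fdata%2F1c70c4a5-en&containerItemId=%2Fcontent%2Fcollection%2Fna-data-en",
}


def ilibrary_item_id(dataset_code):
    """Return the decoded `itemId` query parameter of a dataset's iLibrary URL."""
    query = parse_qs(urlparse(ILIBRARY_ZIP_URLS[dataset_code]).query)
    return query["itemId"][0]


def common_columns(headers_by_dataset):
    """Return the column names present in every dataset's CSV header row.

    `headers_by_dataset` maps a dataset code to the list of column names read
    from its CSV file; reading the files themselves is left to the caller.
    """
    column_sets = [set(columns) for columns in headers_by_dataset.values()]
    return set.intersection(*column_sets) if column_sets else set()
```

If `common_columns` returns the full header of every file, the files share one schema and a single importer is plausible; a small or empty result suggests per-dataset handling.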
## Correspondence
Copy-pasted from an e-mail from Arnaud (not exhaustive):
| DBnomics | iLibrary | CSV (password needed) |
| --- | --- | --- |
| [Key Short-Term Economic Indicators \[OECD/KEI\]](https://db.nomics.world/OECD/KEI) | [OECD iLibrary \| Main Economic Indicators (oecd-ilibrary.org)](https://www.oecd-ilibrary.org/economics/data/main-economic-indicators_mei-data-en#archive) | https://www.oecd-ilibrary.org/key-short-term-indicators-edition-2021-01_e54963ba-en.zip?itemId=%2Fcontent%2Fdata%2Fe54963ba-en&containerItemId=%2Fcontent%2Fcollection%2Fmei-data-en |
| [Main Economic Indicators Publication \[OECD/MEI\]](https://db.nomics.world/OECD/MEI) | [OECD iLibrary \| Main Economic Indicators (oecd-ilibrary.org)](https://www.oecd-ilibrary.org/economics/data/main-economic-indicators_mei-data-en) | https://www.oecd-ilibrary.org/main-economic-indicators-complete-database-edition-2021-01_f50826d4-en.zip?itemId=%2Fcontent%2Fdata%2Ff50826d4-en&containerItemId=%2Fcontent%2Fcollection%2Fmei-data-en |
| [International Migration Database \[OECD/MIG\]](https://db.nomics.world/OECD/MIG) | [OECD iLibrary \| OECD International Migration Statistics (oecd-ilibrary.org)](https://www.oecd-ilibrary.org/social-issues-migration-health/data/oecd-international-migration-statistics_mig-data-en) | https://www.oecd-ilibrary.org/international-migration-database-edition-2019_9bb6f3f6-en.zip?itemId=%2Fcontent%2Fdata%2F9bb6f3f6-en&containerItemId=%2Fcontent%2Fcollection%2Fmig-data-en |
| [Quarterly National Accounts \[OECD/QNA\]](https://db.nomics.world/OECD/QNA) | [OECD iLibrary \| OECD National Accounts Statistics (oecd-ilibrary.org)](https://www.oecd-ilibrary.org/economics/data/oecd-national-accounts-statistics_na-data-en) | https://www.oecd-ilibrary.org/quarterly-national-accounts-edition-2020-2_1c70c4a5-en.zip?itemId=%2Fcontent%2Fdata%2F1c70c4a5-en&containerItemId=%2Fcontent%2Fcollection%2Fna-data-en |

NBB convert fails: https://git.nomics.world/dbnomics-fetchers/nbb-fetcher/-/jobs/403038
https://git.nomics.world/dbnomics-fetchers/management/-/issues/888
Updated: 2021-07-25T10:24:50Z
Author: Enzo Buthiot
Assignee: Michel Juillard

BEA - reduce download time using parallel download
https://git.nomics.world/dbnomics-fetchers/management/-/issues/874
Updated: 2021-01-12T18:36:08Z
Author: Bruno Duyé

Due to quite strict [API request rate limitations](https://git.nomics.world/dbnomics-fetchers/management/-/issues/723), downloading BEA takes quite long (about 10 hours).
But this time is mostly spent waiting for the API to be kind and let us download something again.
I got the (nasty?) idea of using multiple API keys and downloading data in parallel.

BEA - convert fails: FileNotFoundError: [Errno 2] No such file or directory: 'bea-source-data/NIPA/t20506-A.json'
https://git.nomics.world/dbnomics-fetchers/management/-/issues/867
Updated: 2021-01-07T16:31:30Z
Author: Bruno Duyé

Since 5 Jan 2021, BEA convert fails
https://git.nomics.world/dbnomics-fetchers/bea-fetcher/-/jobs/329008
```
* Appendix: 'NIPA'
Exception while convert 'T20506' dataset
Traceback (most recent call last):
File "convert.py", line 463, in <module>
sys.exit(main())
File "convert.py", line 115, in main
dataset_code, dataset_name = treat_dataset(dataset_dict, appendix_path, appendix_dict)
File "convert.py", line 242, in treat_dataset
with open(dataset_source_filepath) as dataset_file:
FileNotFoundError: [Errno 2] No such file or directory: 'bea-source-data/NIPA/t20506-A.json'
```

FAO convert fails
https://git.nomics.world/dbnomics-fetchers/management/-/issues/856
Updated: 2020-12-22T08:30:23Z
Author: Michel Juillard

For the FAO fetcher, ``convert.py`` has been failing for several days. The data were last updated on November 11, 2020.
See https://git.nomics.world/dbnomics-fetchers/fao-fetcher/-/jobs/314197

CSO - get new category tree
https://git.nomics.world/dbnomics-fetchers/management/-/issues/853
Updated: 2020-12-10T17:58:02Z
Author: Bruno Duyé

Following #631
The CSO website changed a lot, and the category tree can no longer be parsed with the same technique as before. Until now, the category tree was the only way to get the full list of datasets. The temporary solution used in #631 was to fix the category tree using the last downloaded dataset (before the website changed).
Consequences:
- we probably do not provide all datasets
- if new datasets appear, we won't include them in DBnomics
- the category tree present in DBnomics is not up to date
We have to find a way to get the new category tree (I've done some promising tests).

OECD: timeout raises exception instead of continuing
https://git.nomics.world/dbnomics-fetchers/management/-/issues/848
Updated: 2022-01-10T10:37:35Z
Author: Christophe Benz (christophe.benz@nomics.world)

Cf https://git.nomics.world/dbnomics-fetchers/oecd-fetcher/-/jobs/315633#L204

OpenTable convert fails: KeyError badden_wurttemberg
https://git.nomics.world/dbnomics-fetchers/management/-/issues/817
Updated: 2022-01-10T10:37:47Z
Author: Emmanuel Raviart

KeyError: 'baden_wurttemberg' when executing convert.py
```
2020-11-10 18:48:43 + python convert.py /workspace/fetcher-source-data /workspace/fetcher-json-data
2020-11-10 18:48:48 Traceback (most recent call last):
2020-11-10 18:48:48 File "convert.py", line 279, in <module>
2020-11-10 18:48:48 sys.exit(main())
2020-11-10 18:48:48 File "convert.py", line 102, in main
2020-11-10 18:48:48 write_series_jsonl(dataset_dir / "series.jsonl", df, dimension_list)
2020-11-10 18:48:48 File "convert.py", line 193, in write_series_jsonl
2020-11-10 18:48:48 name = " - ".join([dim_v_l["area"][area],
2020-11-10 18:48:48 KeyError: 'baden_wurttemberg'
```

SAFE: convert failed
https://git.nomics.world/dbnomics-fetchers/management/-/issues/815
Updated: 2020-11-27T10:36:57Z
Author: Pierre Dittgen

https://git.nomics.world/dbnomics-fetchers/safe-fetcher/-/jobs/297691

POLE-EMPLOI: download failed
https://git.nomics.world/dbnomics-fetchers/management/-/issues/812
Updated: 2020-11-27T10:36:42Z
Author: Pierre Dittgen

https://git.nomics.world/dbnomics-fetchers/pole-emploi-fetcher/-/jobs/300095

Bank of Indonesia (BI) convert fails with KeyError: -1
https://git.nomics.world/dbnomics-fetchers/management/-/issues/811
Updated: 2020-11-03T17:07:52Z
Author: Emmanuel Raviart

Conversion fails on all tables (TABEL1_1 to TABEL9_9) with `KeyError: -1` when executing `convert_excel_to_dataset`.
And when the fetcher is executed on an empty json-data directory, every dataset directory is empty.
See [attached log for more details](/uploads/46ee43571370b279adb93c2f7b729f57/default-bi-run-mrxgt-fetch-1gi-rnr9h-pod-9j6hb-1604420811245058885.log)

AMECO - download fails - subprocess.CalledProcessError: Command '['git', 'checkout', '-B', 'master', '--quiet']' returned non-zero exit status 128.
https://git.nomics.world/dbnomics-fetchers/management/-/issues/791
Updated: 2021-01-12T15:48:50Z
Author: Bruno Duyé

https://git.nomics.world/dbnomics-fetchers/ameco-fetcher/-/jobs/293005
```
fatal: Unable to create '/fetcher-data/ameco/ameco-source-data/.git/index.lock': File exists.
Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.
```

ECB - ERROR: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))
https://git.nomics.world/dbnomics-fetchers/management/-/issues/778
Updated: 2020-10-13T10:47:39Z
Author: Bruno Duyé

Sometimes (not on every download), some datasets (frequently `GFS` and `E11`) fail to download with the error `ERROR: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))`
Example:
https://git.nomics.world/dbnomics-fetchers/ecb-fetcher/-/jobs/282058
```
ERROR: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))
url: 'https://sdw-wsrest.ecb.europa.eu/service/data/GFS/'
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/urllib3/response.py", line 697, in _update_chunk_length
self.chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: b''
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/urllib3/response.py", line 437, in _error_catcher
yield
File "/usr/local/lib/python3.7/dist-packages/urllib3/response.py", line 764, in read_chunked
self._update_chunk_length()
File "/usr/local/lib/python3.7/dist-packages/urllib3/response.py", line 701, in _update_chunk_length
raise httplib.IncompleteRead(line)
http.client.IncompleteRead: IncompleteRead(0 bytes read)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/requests/models.py", line 751, in generate
for chunk in self.raw.stream(chunk_size, decode_content=True):
File "/usr/local/lib/python3.7/dist-packages/urllib3/response.py", line 572, in stream
for line in self.read_chunked(amt, decode_content=decode_content):
File "/usr/local/lib/python3.7/dist-packages/urllib3/response.py", line 793, in read_chunked
self._original_response.close()
File "/usr/lib/python3.7/contextlib.py", line 130, in __exit__
self.gen.throw(type, value, traceback)
File "/usr/local/lib/python3.7/dist-packages/urllib3/response.py", line 455, in _error_catcher
raise ProtocolError("Connection broken: %r" % e, e)
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "download.py", line 186, in download_file
response = requests_session.get(url)
File "/usr/local/lib/python3.7/dist-packages/requests/sessions.py", line 543, in get
return self.request('GET', url, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/requests/sessions.py", line 530, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.7/dist-packages/requests/sessions.py", line 683, in send
r.content
File "/usr/local/lib/python3.7/dist-packages/requests/models.py", line 829, in content
self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''
File "/usr/local/lib/python3.7/dist-packages/requests/models.py", line 754, in generate
raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))
ERROR: https://sdw-wsrest.ecb.europa.eu/service/data/GFS/
```

ECB - Take last updated date information into account for download
https://git.nomics.world/dbnomics-fetchers/management/-/issues/777
Updated: 2020-10-13T10:36:20Z
Author: Bruno Duyé

As @MichelJuillard pointed out [here](https://git.nomics.world/dbnomics-fetchers/management/-/issues/692#note_19530), there is a way to determine the last update of each dataset.
The downloader should use this information to download datasets selectively (not all datasets each time).

ECB - ICO and IDCS are never downloaded (404 errors)
https://git.nomics.world/dbnomics-fetchers/management/-/issues/776
Updated: 2020-10-26T18:39:53Z
Author: Bruno Duyé

Looking at the last 12 [download jobs](https://git.nomics.world/dbnomics-fetchers/ecb-fetcher/-/jobs), the ICO and IDCS datasets are never downloaded:
```
INFO: * ICO
WARNING: download url 'https://sdw-wsrest.ecb.europa.eu/service/dataflow/ECB/ICO?references=all' - status_code: 404 - reason: Not Found
WARNING: download url 'https://sdw-wsrest.ecb.europa.eu/service/data/ICO/' - status_code: 404 - reason: Not Found
WARNING: !!! Dataset ICO ignored
INFO: * IDCS
WARNING: download url 'https://sdw-wsrest.ecb.europa.eu/service/dataflow/ECB/IDCS?references=all' - status_code: 404 - reason: Not Found
WARNING: !!! Dataset IDCS ignored
```

OECD : download failed : Connection reset by peer
https://git.nomics.world/dbnomics-fetchers/management/-/issues/774
Updated: 2020-10-26T18:39:55Z
Author: Thomas Brand

https://git.nomics.world/dbnomics-fetchers/oecd-fetcher/-/jobs/286579

BEA - FixedAssets - KeyError: 'freq_list_url'
https://git.nomics.world/dbnomics-fetchers/management/-/issues/772
Updated: 2021-01-07T21:55:14Z
Author: Bruno Duyé

```
Traceback (most recent call last):
File "./download.py", line 348, in <module>
sys.exit(main())
File "./download.py", line 105, in main
dimensions_values = get_from_api(appendix_dict['api']['freq_list_url'].format(
KeyError: 'freq_list_url'
```
This error prevents the FixedAssets appendix from being downloaded.
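
One possible way to keep a missing key from aborting the whole run is to check for it before formatting the URL. The sketch below is only an illustration: the function name `split_by_freq_list_url` and the shape of the `appendices` mapping are assumptions, and the only thing taken from the traceback is the `appendix_dict['api']['freq_list_url']` lookup. Whether skipping an appendix is the right fix for FixedAssets is not something this issue establishes.

```python
import logging

log = logging.getLogger(__name__)


def split_by_freq_list_url(appendices):
    """Separate appendices that define api.freq_list_url from those that do not.

    `appendices` is assumed to map an appendix code to its configuration dict
    (hypothetical structure, apart from the 'api' / 'freq_list_url' keys).
    Returns (usable, missing): a dict of usable appendices and a list of the
    codes whose 'api' section lacks 'freq_list_url'.
    """
    usable = {}
    missing = []
    for code, appendix_dict in appendices.items():
        if "freq_list_url" in appendix_dict.get("api", {}):
            usable[code] = appendix_dict
        else:
            # Report the gap instead of raising KeyError deep inside main().
            log.warning("Appendix %s has no 'freq_list_url' in its 'api' section", code)
            missing.append(code)
    return usable, missing
```

The caller can then download the usable appendices and decide separately how to handle the ones reported as missing, for instance by failing at the end of the run rather than at the first lookup.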