WIP: #501: add series names
1 unresolved thread
mentioned in issue management#501 (closed)
@cbenz I wanted to test this on preprod to close #501, but when indexing I have this error:
(dbnomics) cepremap@eros:~/fetchers-envs$ PROVIDER_SLUG='buba'; ~/dbnomics-importer/import_storage_dir.py ~/fetchers-envs/${PROVIDER_SLUG}/${PROVIDER_SLUG}-json-data
INFO:__main__:2019-09-25 16:56:36,804:Received args: Namespace(bare_repo_fallback=False, datasets=None, exclude_datasets=None, full=False, log='INFO', print_json_lines=False, solr_core='dbnomics', solr_hostname='localhost', solr_port=8983, solr_post=PosixPath('/opt/solr/bin/post'), start_from=None, storage_dir=PosixPath('/home/cepremap/fetchers-envs/buba/buba-json-data'))
INFO:__main__:2019-09-25 16:56:36,804:Using indexed_at '2019-09-25T14:56:36.804848Z' for all documents
INFO:__main__:2019-09-25 16:56:36,805:Provider code: 'BUBA'
INFO:__main__:2019-09-25 16:56:36,811:provider.created_at is unknown. Running command 'git log --reverse --format="format:%at" | head -n1'
ERROR:__main__:2019-09-25 16:56:36,814:Could not find provider document in Solr. Indexing all datasets.
INFO:__main__:2019-09-25 16:56:36,815:Mode: full
INFO:__main__:2019-09-25 16:56:36,815:Running command '/opt/solr/bin/post -c dbnomics -type application/json -url http://localhost:8983/solr/dbnomics/update/json/docs -'
INFO:__main__:2019-09-25 16:56:36,817:Processing 45 datasets...
INFO:__main__:2019-09-25 16:56:36,817:Indexing dataset 'BBAI3' (1/45)
INFO:__main__:2019-09-25 16:56:36,829:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBAI3 | head -n1'
java -classpath /opt/solr/dist/solr-core-7.5.0.jar -Dauto=yes -Dtype=application/json -Durl=http://localhost:8983/solr/dbnomics/update/json/docs -Dc=dbnomics -Ddata=stdin org.apache.solr.util.SimplePostTool
SimplePostTool version 5.0.0
POSTing stdin to http://localhost:8983/solr/dbnomics/update/json/docs...
INFO:__main__:2019-09-25 16:56:37,055:Indexing dataset 'BBAPV' (2/45)
INFO:__main__:2019-09-25 16:56:37,089:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBAPV | head -n1'
INFO:__main__:2019-09-25 16:56:37,095:Indexing dataset 'BBASV' (3/45)
INFO:__main__:2019-09-25 16:56:37,163:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBASV | head -n1'
INFO:__main__:2019-09-25 16:56:37,720:Indexing dataset 'BBBP1' (4/45)
INFO:__main__:2019-09-25 16:56:37,729:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBBP1 | head -n1'
INFO:__main__:2019-09-25 16:56:37,750:Indexing dataset 'BBBP2' (5/45)
INFO:__main__:2019-09-25 16:56:37,761:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBBP2 | head -n1'
INFO:__main__:2019-09-25 16:56:37,766:Indexing dataset 'BBBPS' (6/45)
INFO:__main__:2019-09-25 16:56:37,776:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBBPS | head -n1'
INFO:__main__:2019-09-25 16:56:37,781:Indexing dataset 'BBBU2' (7/45)
INFO:__main__:2019-09-25 16:56:37,793:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBBU2 | head -n1'
INFO:__main__:2019-09-25 16:56:37,827:Indexing dataset 'BBBZ1' (8/45)
INFO:__main__:2019-09-25 16:56:37,836:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBBZ1 | head -n1'
INFO:__main__:2019-09-25 16:56:37,844:Indexing dataset 'BBDA1' (9/45)
INFO:__main__:2019-09-25 16:56:37,853:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBDA1 | head -n1'
INFO:__main__:2019-09-25 16:56:37,866:Indexing dataset 'BBDB2' (10/45)
INFO:__main__:2019-09-25 16:56:37,878:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBDB2 | head -n1'
INFO:__main__:2019-09-25 16:56:37,905:Indexing dataset 'BBDE1' (11/45)
INFO:__main__:2019-09-25 16:56:37,928:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBDE1 | head -n1'
INFO:__main__:2019-09-25 16:56:37,961:Indexing dataset 'BBDG1' (12/45)
INFO:__main__:2019-09-25 16:56:37,970:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBDG1 | head -n1'
INFO:__main__:2019-09-25 16:56:37,975:Indexing dataset 'BBDL1' (13/45)
INFO:__main__:2019-09-25 16:56:37,983:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBDL1 | head -n1'
INFO:__main__:2019-09-25 16:56:37,991:Indexing dataset 'BBDP1' (14/45)
INFO:__main__:2019-09-25 16:56:38,001:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBDP1 | head -n1'
INFO:__main__:2019-09-25 16:56:38,009:Indexing dataset 'BBDR1' (15/45)
INFO:__main__:2019-09-25 16:56:38,018:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBDR1 | head -n1'
INFO:__main__:2019-09-25 16:56:38,028:Indexing dataset 'BBDY1' (16/45)
INFO:__main__:2019-09-25 16:56:38,038:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBDY1 | head -n1'
INFO:__main__:2019-09-25 16:56:38,044:Indexing dataset 'BBDZ1' (17/45)
INFO:__main__:2019-09-25 16:56:38,054:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBDZ1 | head -n1'
INFO:__main__:2019-09-25 16:56:38,061:Indexing dataset 'BBEE1' (18/45)
INFO:__main__:2019-09-25 16:56:38,070:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBEE1 | head -n1'
INFO:__main__:2019-09-25 16:56:38,075:Indexing dataset 'BBEX3' (19/45)
INFO:__main__:2019-09-25 16:56:38,109:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBEX3 | head -n1'
INFO:__main__:2019-09-25 16:56:38,237:Indexing dataset 'BBFB1' (20/45)
INFO:__main__:2019-09-25 16:56:38,278:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBFB1 | head -n1'
INFO:__main__:2019-09-25 16:56:38,545:Indexing dataset 'BBFDV' (21/45)
INFO:__main__:2019-09-25 16:56:38,616:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBFDV | head -n1'
INFO:__main__:2019-09-25 16:56:39,645:Indexing dataset 'BBFI1' (22/45)
INFO:__main__:2019-09-25 16:56:39,673:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBFI1 | head -n1'
INFO:__main__:2019-09-25 16:56:39,892:Indexing dataset 'BBFI3' (23/45)
INFO:__main__:2019-09-25 16:56:39,901:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBFI3 | head -n1'
INFO:__main__:2019-09-25 16:56:39,911:Indexing dataset 'BBFN1' (24/45)
INFO:__main__:2019-09-25 16:56:39,934:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBFN1 | head -n1'
INFO:__main__:2019-09-25 16:56:40,203:Indexing dataset 'BBK01' (25/45)
INFO:__main__:2019-09-25 16:56:40,375:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBK01 | head -n1'
INFO:__main__:2019-09-25 16:56:42,497:Indexing dataset 'BBMF1' (26/45)
INFO:__main__:2019-09-25 16:56:42,506:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBMF1 | head -n1'
INFO:__main__:2019-09-25 16:56:42,539:Indexing dataset 'BBMME' (27/45)
INFO:__main__:2019-09-25 16:56:42,549:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBMME | head -n1'
INFO:__main__:2019-09-25 16:56:42,561:Indexing dataset 'BBMMS' (28/45)
INFO:__main__:2019-09-25 16:56:42,569:dataset.created_at is unknown. Running command 'git log --reverse --format="format:%at" -- BBMMS | head -n1'
Traceback (most recent call last):
  File "/home/cepremap/dbnomics-importer/import_storage_dir.py", line 525, in <module>
    sys.exit(main())
  File "/home/cepremap/dbnomics-importer/import_storage_dir.py", line 483, in main
    indexed_at, desired_datasets_codes_actions)
  File "/home/cepremap/dbnomics-importer/import_storage_dir.py", line 253, in process_datasets
    dataset_solr = build_dataset_solr(solr, provider_json, dataset_json, indexed_at, repo)
  File "/home/cepremap/dbnomics-importer/import_storage_dir.py", line 135, in build_dataset_solr
    commit_datetime = datetime.utcfromtimestamp(int(commit_timestamp_str))
ValueError: invalid literal for int() with base 10: ''
(dbnomics) cepremap@eros:~/fetchers-envs$ COMMITting Solr index changes to http://localhost:8983/solr/dbnomics/update/json/docs...
Time spent: 0:00:08.430
I suppose that only 28/45 datasets have been imported. Do you have any idea how to solve this?
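For reference, the traceback shows `int(commit_timestamp_str)` receiving an empty string, which means the `git log ... -- BBMMS | head -n1` command printed nothing for that dataset. The log does not say why. A minimal sketch of a defensive parse follows, assuming the importer could tolerate a missing creation date. The helper name `parse_commit_timestamp` is hypothetical and is not part of `import_storage_dir.py`, and the sketch returns a timezone-aware datetime where the original call returns a naive one.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_commit_timestamp(commit_timestamp_str: str) -> Optional[datetime]:
    """Hypothetical helper: parse the output of `git log --format="format:%at"`.

    Returns None when git printed nothing, instead of letting int('') raise
    ValueError as in the traceback above.
    """
    value = commit_timestamp_str.strip()
    if not value:
        # Empty output: git log found no commit for this path.
        return None
    # %at is a Unix timestamp in seconds; interpret it as UTC.
    return datetime.fromtimestamp(int(value), tz=timezone.utc)
```

The caller would still have to decide what to index when the result is `None`, for example by skipping `created_at` for that dataset or by logging a warning and continuing with the remaining datasets.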
added 3 commits
- 8b6d3ba9...29529b8d - 2 commits from branch master
- 346367c2 - #501: add series names
Closing this MR as issue management#501 (closed) has been fixed by !2 (merged)