# Data Catalogue

_Fetch, validate, convert & serve CESSDA-compliant DDI repositories._

## Installation

```bash
git clone https://git.nomics.world/progedo/data-catalogue.git
cd data-catalogue/
npm install
ln -s example.env .env
```

### Database Creation

#### Using _Debian GNU/Linux_

As `root` user:

```bash
apt install postgresql
su - postgres
psql
```

#### Using _macOS_

```bash
brew install postgresql
psql postgres
```

#### For everybody

```sql
CREATE USER data_catalogue WITH PASSWORD 'data_catalogue';
CREATE DATABASE data_catalogue WITH OWNER data_catalogue;
\connect data_catalogue
CREATE EXTENSION IF NOT EXISTS pg_trgm;
\q
logout # For Debian only
```

As a normal user, create the database tables:

```bash
npm run configure
```

## Usage

### Fetching DDI Files

#### Fetching DDI Files from OAI-PMH Servers

```bash
# ADISP (OAI-PMH): contains all French DDIs
npx babel-node --extensions ".ts" -- src/scripts/retrieve_oai-pmh_ddis.ts --url http://www.progedo-adisp.fr/oai/oai2.php ../public_data/adisp-oai-pmh-ddi/
```

#### Fetching DDI Files from Dataverse Servers

```bash
# data.sciencespo
npx babel-node --extensions ".ts" -- src/scripts/retrieve_dataverse_ddis.ts --url https://data.sciencespo.fr/ --verbose ../public_data/sciencespo-dataverse-ddi/
```

#### Fetching DDI Files from Nesstar Servers

```bash
# ADISP (public Nesstar): contains some French & English DDIs
npx babel-node --extensions ".ts" src/scripts/retrieve_nesstar_ddis.ts --url http://nesstar.progedo-adisp.fr/ ../public_data/adisp-nesstar-ddi/
# CDSP Sciences Po (obsolete & closed)
# npx babel-node --extensions ".ts" src/scripts/retrieve_nesstar_ddis.ts --url http://nesstar.sciences-po.fr/ ../public_data/cdsp-nesstar-ddi/
# INED
npx babel-node --extensions ".ts" src/scripts/retrieve_nesstar_ddis.ts --url http://nesstar.ined.fr/ ../public_data/ined-nesstar-ddi/
# INED - Generations and Gender Survey
npx babel-node --extensions ".ts" src/scripts/retrieve_nesstar_ddis.ts --url http://ggpsurvey.ined.fr/ ../public_data/ined-gpgsurvey-nesstar-ddi/
# UK Data Service
npx babel-node --extensions ".ts" src/scripts/retrieve_nesstar_ddis.ts --url http://nesstar.ukdataservice.ac.uk/ ../public_data/ukdataservice-nesstar-ddi/
# Norwegian Centre for Research Data
npx babel-node --extensions ".ts" src/scripts/retrieve_nesstar_ddis.ts --url http://nsddata.nsd.uib.no ../public_data/nsddata-nesstar-ddi/
```
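
The Nesstar commands above differ only in the server URL and the target directory, so they can be driven from a single list. The following is a minimal sketch, not part of the project: the function names `fetch_nesstar` and `fetch_all_nesstar` and the `DRY_RUN` variable are hypothetical helpers introduced here. By default it only prints the commands; set `DRY_RUN=0` to actually run them.

```shell
#!/bin/sh
# Sketch: fetch DDI files from every active Nesstar server listed above.
# DRY_RUN is a hypothetical switch (not a project option); it defaults to 1,
# which prints each command instead of executing it.
set -eu

fetch_nesstar() {
  # $1: Nesstar server URL, $2: target directory
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Printed for inspection only (without shell quoting).
    echo "npx babel-node --extensions .ts src/scripts/retrieve_nesstar_ddis.ts --url $1 $2"
  else
    # stdin is redirected so the command cannot consume the server list.
    npx babel-node --extensions ".ts" src/scripts/retrieve_nesstar_ddis.ts --url "$1" "$2" < /dev/null
  fi
}

fetch_all_nesstar() {
  # One "URL target-directory" pair per line, copied from the commands above.
  while read -r url target; do
    [ -n "$url" ] || continue
    fetch_nesstar "$url" "$target"
  done <<EOF
http://nesstar.progedo-adisp.fr/ ../public_data/adisp-nesstar-ddi/
http://nesstar.ined.fr/ ../public_data/ined-nesstar-ddi/
http://ggpsurvey.ined.fr/ ../public_data/ined-gpgsurvey-nesstar-ddi/
http://nesstar.ukdataservice.ac.uk/ ../public_data/ukdataservice-nesstar-ddi/
http://nsddata.nsd.uib.no ../public_data/nsddata-nesstar-ddi/
EOF
}

fetch_all_nesstar
```

Because of `set -e`, a real run stops at the first server that fails, which makes it easy to fix the problem and restart.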

### Repairing DDI Files

```bash
# ADISP
npx babel-node --extensions ".ts" src/scripts/repair_adisp_oai-pmh_ddis.ts --source=../public_data/adisp-oai-pmh-ddi/ ../public_data/adisp-oai-pmh-ddi-repaired/
npx babel-node --extensions ".ts" src/scripts/repair_adisp_nesstar_ddis.ts --source=../public_data/adisp-nesstar-ddi/ ../public_data/adisp-nesstar-ddi-repaired/
# INED
npx babel-node --extensions ".ts" src/scripts/repair_ined_nesstar_ddis.ts --source=../public_data/ined-nesstar-ddi/ ../public_data/ined-nesstar-ddi-repaired/
```

### Indexing DDI Files

#### Indexing Progedo DDI Files

```bash
npx babel-node --extensions ".ts" -- src/scripts/index_codebooks.ts --language=fr --path=adisp ../public_data/adisp-oai-pmh-ddi-repaired/
npx babel-node --extensions ".ts" -- src/scripts/index_codebooks.ts --language=fr --path=cdsp ../public_data/sciencespo-dataverse-ddi/
npx babel-node --extensions ".ts" -- src/scripts/index_codebooks.ts --language=fr --path=ined ../public_data/ined-nesstar-ddi-repaired/
```

#### Indexing Other (non Progedo-related) DDI Files

```bash
npx babel-node --extensions ".ts" -- src/scripts/index_codebooks.ts --language=fr --path=adisp-nesstar ../public_data/adisp-nesstar-ddi-repaired/
npx babel-node --extensions ".ts" -- src/scripts/index_codebooks.ts --language=fr --path=cdsp ../public_data/sciencespo-dataverse-ddi/
# Obsolete
# npx babel-node --extensions ".ts" -- src/scripts/index_codebooks.ts --language=fr --path=cdsp-obsolete ../public_data/cdsp-nesstar-ddi/
npx babel-node --extensions ".ts" -- src/scripts/index_codebooks.ts --language=fr --path=ined/gpgsurvey ../public_data/ined-gpgsurvey-nesstar-ddi/
npx babel-node --extensions ".ts" -- src/scripts/index_codebooks.ts --language=en --path=ukdataservice ../public_data/ukdataservice-nesstar-ddi/
npx babel-node --extensions ".ts" -- src/scripts/index_codebooks.ts --language=no --path=nsddata ../public_data/nsddata-nesstar-ddi/
```

### Extracting Words from CodeBooks for Autocompletion

```bash
npx babel-node --extensions ".ts" -- src/scripts/index_words.ts
```
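
For a single source, the fetch, repair, index and word-extraction steps above can be chained into one script. The sketch below does this for the ADISP OAI-PMH repository using only commands already shown in this README. The `run` helper, the `adisp_pipeline` function and the `DRY_RUN` variable are hypothetical names introduced for illustration; by default the script only prints what it would execute.

```shell
#!/bin/sh
# Sketch: full ADISP OAI-PMH pipeline (fetch, repair, index, extract words).
# DRY_RUN is a hypothetical switch (not a project option); it defaults to 1,
# which prints each command instead of executing it.
set -eu

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

adisp_pipeline() {
  raw=../public_data/adisp-oai-pmh-ddi/
  repaired=../public_data/adisp-oai-pmh-ddi-repaired/

  # 1. Fetch the DDI files from the OAI-PMH server.
  run npx babel-node --extensions ".ts" -- src/scripts/retrieve_oai-pmh_ddis.ts --url http://www.progedo-adisp.fr/oai/oai2.php "$raw"
  # 2. Repair them into a separate directory.
  run npx babel-node --extensions ".ts" src/scripts/repair_adisp_oai-pmh_ddis.ts --source="$raw" "$repaired"
  # 3. Index the repaired codebooks.
  run npx babel-node --extensions ".ts" -- src/scripts/index_codebooks.ts --language=fr --path=adisp "$repaired"
  # 4. Rebuild the autocompletion words.
  run npx babel-node --extensions ".ts" -- src/scripts/index_words.ts
}

adisp_pipeline
```

With `DRY_RUN=0`, `set -e` aborts the pipeline as soon as one step fails, so later steps never run on incomplete data.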

## Development

### Extracting TypeScript Raw Types from DDI Files

#### Extracting TypeScript Raw Types from Progedo DDI Files

```bash
npx babel-node --extensions ".ts" --max-old-space-size=10240 -- src/scripts/raw_types_from_ddi_files.ts ../public_data/adisp-oai-pmh-ddi-repaired/ ../public_data/sciencespo-dataverse-ddi/ ../public_data/ined-nesstar-ddi/

npx babel-node --extensions ".ts" --max-old-space-size=10240 -- src/scripts/raw_types_from_ddi_files.ts ../public_data/adisp-oai-pmh-ddi-repaired/ ../public_data/sciencespo-dataverse-ddi/ ../public_data/ined-nesstar-ddi/ --version=1.2.2
npx babel-node --extensions ".ts" --max-old-space-size=10240 -- src/scripts/raw_types_from_ddi_files.ts ../public_data/adisp-oai-pmh-ddi-repaired/ ../public_data/sciencespo-dataverse-ddi/ ../public_data/ined-nesstar-ddi/ --version=1.3
npx babel-node --extensions ".ts" --max-old-space-size=10240 -- src/scripts/raw_types_from_ddi_files.ts ../public_data/adisp-oai-pmh-ddi-repaired/ ../public_data/sciencespo-dataverse-ddi/ ../public_data/ined-nesstar-ddi/ --version=2.5

# Prettify generated TypeScript file:
npm run prettier
```
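
The three `--version` invocations above are identical apart from the version value, so a loop avoids repeating the long command. This is only a sketch: `SOURCES`, `extract_raw_types` and `DRY_RUN` are hypothetical names introduced here, and the script prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: run raw_types_from_ddi_files.ts once per --version value used above.
# DRY_RUN is a hypothetical switch (not a project option); it defaults to 1,
# which prints each command instead of executing it.
set -eu

# Source directories, copied from the commands above.
SOURCES="../public_data/adisp-oai-pmh-ddi-repaired/ ../public_data/sciencespo-dataverse-ddi/ ../public_data/ined-nesstar-ddi/"

extract_raw_types() {
  # $1: value passed to --version
  # $SOURCES is intentionally unquoted so it splits into three arguments.
  set -- npx babel-node --extensions ".ts" --max-old-space-size=10240 -- \
    src/scripts/raw_types_from_ddi_files.ts $SOURCES --version="$1"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

for version in 1.2.2 1.3 2.5; do
  extract_raw_types "$version"
done
```

Run `npm run prettier` afterwards, as above, to format the generated TypeScript.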

#### Extracting TypeScript Raw Types for Other Tests

```bash
npx babel-node --extensions ".ts" --max-old-space-size=8192 -- src/scripts/raw_types_from_ddi_files.ts ../public_data/adisp-manual-ddi/ --target=src/raw_types/codebooks_adisp_manual.ts
npx babel-node --extensions ".ts" --max-old-space-size=8192 -- src/scripts/raw_types_from_ddi_files.ts ../public_data/adisp-nesstar-ddi/ --target=src/raw_types/codebooks_adisp_nesstar.ts
npx babel-node --extensions ".ts" -- src/scripts/raw_types_from_ddi_files.ts ../public_data/sciencespo-dataverse-ddi/ --target=src/raw_types/codebooks_sciencespo_dataverse.ts
npx babel-node --extensions ".ts" -- src/scripts/raw_types_from_ddi_files.ts ../public_data/ined-nesstar-ddi/ --target=src/raw_types/codebooks_ined_nesstar.ts

# Prettify generated TypeScript files:
npm run prettier
```