---
title: "DBnomics R client"
author: "Sébastien Galais^[Banque de France, [https://github.com/s915](https://github.com/s915)], Thomas Brand^[CEPREMAP]"
output:
  html_document:
    highlight: default
    theme: simplex
    smart: false
    toc: true
    toc_float: true
    number_sections: true
  rmarkdown::html_vignette:
    highlight: default
    theme: simplex
    smart: false
    toc: true
    toc_float: true
    number_sections: true
vignette: >
  %\VignetteIndexEntry{DBnomics R client}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

<style type="text/css">
h1.title {
  text-align: center;
  font-weight: bold;
}
h4.author { /* Header 4 - and the author and data headers use this too  */
  text-align: center;
}
h4.date { /* Header 4 - and the author and data headers use this too  */
  text-align: center;
}
</style>

# DBnomics: the world's economic database

Explore all the economic data from different providers (national and international statistical institutes, central banks, etc.), for free, following the link [db.nomics.world](https://db.nomics.world)  
(*N.B.: in the examples, data have already been retrieved on April 6<sup>th</sup> 2020*).

[![](dbnomics001.png)](https://db.nomics.world)

# Fetch time series by `ids`

First, let's assume that we know which series we want to download. A series identifier (`ids`) is defined by three values, formatted like this: `provider_code`/`dataset_code`/`series_code`.
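
The three-part structure can be illustrated with plain string operations. The sketch below uses base R only: `make_series_id()` is a hypothetical helper written for this illustration (it is not a function of `rdbnomics`), and it assumes that none of the three codes contains a `/`.

```{r, eval = FALSE}
# Hypothetical helper (not part of rdbnomics): join the three codes with "/".
make_series_id <- function(provider_code, dataset_code, series_code) {
  paste(provider_code, dataset_code, series_code, sep = "/")
}

id <- make_series_id("AMECO", "ZUTN", "EA19.1.0.0.0.ZUTN")

# Split an identifier back into its three components.
parts <- strsplit(id, "/", fixed = TRUE)[[1]]
names(parts) <- c("provider_code", "dataset_code", "series_code")
parts
```

A character vector of such identifiers is what the `ids` argument receives in the examples that follow.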

## Fetch one series from dataset 'Unemployment rate' (ZUTN) of AMECO provider

```{r, echo = FALSE}
library <- function(...) {
  suppressWarnings(
    suppressPackageStartupMessages(base::library(..., quietly = TRUE))
  )
}
```

```{r}
library(data.table)
library(rdbnomics)
```

```{r, echo = FALSE}
reorder_cols <- function(x) {
  data.table::setDT(x)

  cols <- c(
    "provider_code", "dataset_code", "dataset_name", "series_code",
    "series_name", "original_period", "period", "original_value", "value",
    "@frequency"
  )

  if ("unit" %in% colnames(x)) {
    cols <- c(cols, "unit", "Unit")
  }

  if ("geo" %in% colnames(x)) {
    cols <- c(cols, "geo", "Country")
  }

  if ("freq" %in% colnames(x)) {
    cols <- c(cols, "freq", "Frequency")
  }

  cols_add <- setdiff(colnames(x), cols)
  cols <- c(cols, cols_add)

  cols <- cols[cols %in% colnames(x)]
  
  cols <- match(cols, colnames(x))

  x[, .SD, .SDcols = cols]
}

knitr::opts_chunk$set(dev.args = list(bg = "transparent"))

display_table <- function(DT) {
  DT <- head(DT)
  DT <- as.data.table(
    lapply(DT, function(x) {
      if (is.character(x)) {
        ifelse(
          nchar(x) > 16,
          paste0(substr(x, 1, 16), "..."),
          x
        )
      } else {
        x
      }
    })
  )
  DT[]
}
```

```{r, eval = FALSE}
df <- rdb(ids = "AMECO/ZUTN/EA19.1.0.0.0.ZUTN")
df <- df[!is.na(value)]
```
```{r, eval = TRUE, echo = FALSE}
df <- rdbnomics:::rdbnomics_df001
data.table::setDT(df)
```

In such a `data.table`, you will always find at least ten columns:

- `provider_code`
- `dataset_code`
- `dataset_name`
- `series_code`
- `series_name`
- `original_period` (character string)
- `period` (date of the first day of `original_period`)
- `original_value` (character string)
- `value`
- `@frequency` (harmonized frequency generated by DBnomics)

The other columns depend on the provider and on the dataset. They always come in pairs (for the code and the name). In the data.frame `df`, you have:

- `unit` (code) and `Unit` (name)
- `geo` (code) and `Country` (name)
- `freq` (code) and `Frequency` (name)
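
To make the relation between `original_period` and `period` concrete, here is a rough sketch in base R. `first_day()` is a hypothetical helper written for this illustration; it is not how `rdbnomics` computes `period`, and it assumes period labels of the form `"2019"` (annual) or `"2019-Q3"` (quarterly) only.

```{r, eval = FALSE}
# Hypothetical helper: map a period label to the first day of that period.
# Only annual ("YYYY") and quarterly ("YYYY-Qn") labels are handled here.
first_day <- function(original_period) {
  year <- substr(original_period, 1, 4)
  month <- rep(1L, length(original_period))
  is_quarter <- grepl("^[0-9]{4}-Q[1-4]$", original_period)
  quarter <- as.integer(substr(original_period[is_quarter], 7, 7))
  # Quarter n starts in month 3 * (n - 1) + 1.
  month[is_quarter] <- (quarter - 1L) * 3L + 1L
  as.Date(sprintf("%s-%02d-01", year, month))
}

first_day(c("2019", "2019-Q3"))
```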

```{r, echo = FALSE}
df <- reorder_cols(df)
display_table(df)
```

```{r, echo = FALSE, fig.align = 'center'}
# ggplot(df, aes(x = period, y = value, color = series_name)) +
#   geom_line(size = 1.2) +
#   geom_point(size = 2) +
#   dbnomics()
i <- 1
x1 <- df[series_name == sort(unique(series_name))[i]]$period
y1 <- df[series_name == sort(unique(series_name))[i]]$value
cols <- c("red", "blue", "darkgreen")
PCH <- 18

plot(
  x1, y1, col = cols[1], type = "l", xlab = "", ylab = "",
  xlim = c(min(df$period), max(df$period)),
  ylim = c(min(df$value) - 0.5, max(df$value)),
  panel.first = grid(lty = 1)
)
points(x1, y1, col = cols[1], pch = PCH)
legend(
  "bottomleft", inset = 0.005,
  legend = sort(unique(df$series_name)),
  col = cols,
  lty = 1, pch = PCH,  box.lty = 0, cex = 0.7
)
mtext(
  text = "DBnomics <https://db.nomics.world>",
  side = 3, col = "grey", font = 3, adj = 1
)
```

If `ids` is the only argument you use, you can omit its name and run:
```{r, eval = FALSE}
df <- rdb("AMECO/ZUTN/EA19.1.0.0.0.ZUTN")
```

## Fetch two series from dataset 'Unemployment rate' (ZUTN) of AMECO provider

```{r, eval = FALSE}
df <- rdb(ids = c("AMECO/ZUTN/EA19.1.0.0.0.ZUTN", "AMECO/ZUTN/DNK.1.0.0.0.ZUTN"))
df <- df[!is.na(value)]
```
```{r, eval = TRUE, echo = FALSE}
df <- rdbnomics:::rdbnomics_df002
data.table::setDT(df)
```

```{r, echo = FALSE}
df <- df[order(series_code, period)]
df <- reorder_cols(df)
display_table(df)
```

```{r, echo = FALSE, fig.align = 'center'}
# ggplot(df, aes(x = period, y = value, color = series_name)) +
#   geom_line(size = 1.2) +
#   geom_point(size = 2) +
#   dbnomics()
i <- 1
x1 <- df[series_name == sort(unique(series_name))[i]]$period
y1 <- df[series_name == sort(unique(series_name))[i]]$value
i <- 2
x2 <- df[series_name == sort(unique(series_name))[i]]$period
y2 <- df[series_name == sort(unique(series_name))[i]]$value
cols <- c("red", "blue", "darkgreen")
PCH <- 18

plot(
  x1, y1, col = cols[1], type = "l", xlab = "", ylab = "",
  xlim = c(min(df$period), max(df$period)),
  ylim = c(min(df$value) - 1.7, max(df$value)),
  panel.first = grid(lty = 1)
)
points(x1, y1, col = cols[1], pch = PCH)
lines(x2, y2, col = cols[2], type = "l")
points(x2, y2, col = cols[2], pch = PCH)
legend(
  "bottomleft", inset = 0.005,
  legend = sort(unique(df$series_name)),
  col = cols,
  lty = 1, pch = PCH,  box.lty = 0, cex = 0.7
)
mtext(
  text = "DBnomics <https://db.nomics.world>",
  side = 3, col = "grey", font = 3, adj = 1
)
```

## Fetch two series from different datasets of different providers

```{r, eval = FALSE}
df <- rdb(ids = c("AMECO/ZUTN/EA19.1.0.0.0.ZUTN", "Eurostat/une_rt_q/Q.SA.Y15-24.PC_ACT.T.EA19"))
df <- df[!is.na(value)]
```
```{r, eval = TRUE, echo = FALSE}
df <- rdbnomics:::rdbnomics_df003
data.table::setDT(df)
```

```{r, echo = FALSE}
df <- df[order(series_code, period)]
df <- reorder_cols(df)
display_table(df)
```

```{r, echo = FALSE, fig.align = 'center'}
# ggplot(df, aes(x = period, y = value, color = series_name)) +
#   geom_line(size = 1.2) +
#   geom_point(size = 2) +
#   dbnomics(legend.text = element_text(size = 7))
i <- 1
x1 <- df[series_name == sort(unique(series_name))[i]]$period
y1 <- df[series_name == sort(unique(series_name))[i]]$value
i <- 2
x2 <- df[series_name == sort(unique(series_name))[i]]$period
y2 <- df[series_name == sort(unique(series_name))[i]]$value
cols <- c("red", "blue", "darkgreen")
PCH <- 18
legend_text <- sort(unique(df$series_name))
legend_text[2] <- sapply(
  legend_text[2],
  function(y) {
    paste0(
      paste0(
        strsplit(y, "active ")[[1]], collapse = "active\n"
      ),
      "\n"
    )
  }
)

plot(
  x1, y1, col = cols[1], type = "l", xlab = "", ylab = "",
  xlim = c(min(df$period), max(df$period)),
  ylim = c(min(df$value) - 1.5, max(df$value)),
  panel.first = grid(lty = 1)
)
points(x1, y1, col = cols[1], pch = PCH)
lines(x2, y2, col = cols[2], type = "l")
points(x2, y2, col = cols[2], pch = PCH)
legend(
  "bottomleft", inset = 0.005,
  legend = legend_text,
  col = cols,
  lty = 1, pch = PCH,  box.lty = 0, cex = 0.7
)
mtext(
  text = "DBnomics <https://db.nomics.world>",
  side = 3, col = "grey", font = 3, adj = 1
)
```

# Fetch time series by `mask`
The code mask notation is a very concise way to select one or many time series at once.

## Fetch one series from dataset 'Balance of Payments' (BOP) of IMF
```{r, eval = FALSE}
df <- rdb("IMF", "BOP", mask = "A.FR.BCA_BP6_EUR")
df <- df[!is.na(value)]
```
```{r, eval = TRUE, echo = FALSE}
df <- rdbnomics:::rdbnomics_df004
data.table::setDT(df)
```

```{r, echo = FALSE}
df <- reorder_cols(df)
display_table(df)
```

```{r, echo = FALSE, fig.align = 'center'}
# ggplot(df, aes(x = period, y = value, color = series_name)) +
#   geom_step(size = 1.2) +
#   geom_point(size = 2) +
#   dbnomics()
i <- 1
x1 <- df[series_name == sort(unique(series_name))[i]]$period
y1 <- df[series_name == sort(unique(series_name))[i]]$value
cols <- c("red", "blue", "darkgreen")
PCH <- 18
legend_text <- sort(unique(df$series_name))

plot(
  x1, y1, col = cols[1], type = "s", xlab = "", ylab = "",
  xlim = c(min(df$period), max(df$period)),
  ylim = c(min(df$value), max(df$value)),
  panel.first = grid(lty = 1)
)
points(x1, y1, col = cols[1], pch = PCH)
legend(
  "bottomleft", inset = 0.005,
  legend = legend_text,
  col = cols,
  lty = 1, pch = PCH,  box.lty = 0, cex = 0.7
)
mtext(
  text = "DBnomics <https://db.nomics.world>",
  side = 3, col = "grey", font = 3, adj = 1
)
```

In the event that you only use the arguments `provider_code`, `dataset_code` and `mask`, you can drop the name `mask` and run:
```{r, eval = FALSE}
df <- rdb("IMF", "BOP", "A.FR.BCA_BP6_EUR")
```

## Fetch two series from dataset 'Balance of Payments' (BOP) of IMF

You just have to add a `+` between two different values of a dimension.
```{r, eval = FALSE}
df <- rdb("IMF", "BOP", mask = "A.FR+ES.BCA_BP6_EUR")
df <- df[!is.na(value)]
```
```{r, eval = TRUE, echo = FALSE}
df <- rdbnomics:::rdbnomics_df005
data.table::setDT(df)
```

```{r, echo = FALSE}
df <- df[order(series_code, period)]
df <- reorder_cols(df)
display_table(df)
```

```{r, echo = FALSE, fig.align = 'center'}
# ggplot(df, aes(x = period, y = value, color = series_name)) +
#   geom_step(size = 1.2) +
#   geom_point(size = 2) +
#   dbnomics()
i <- 1
x1 <- df[series_name == sort(unique(series_name))[i]]$period
y1 <- df[series_name == sort(unique(series_name))[i]]$value
i <- 2
x2 <- df[series_name == sort(unique(series_name))[i]]$period
y2 <- df[series_name == sort(unique(series_name))[i]]$value
cols <- c("red", "blue", "darkgreen")
PCH <- 18
legend_text <- sort(unique(df$series_name))

plot(
  x1, y1, col = cols[1], type = "s", xlab = "", ylab = "",
  xlim = c(min(df$period), max(df$period)),
  ylim = c(min(df$value) - 2*10^4, max(df$value)),
  panel.first = grid(lty = 1)
)
points(x1, y1, col = cols[1], pch = PCH)
lines(x2, y2, col = cols[2], type = "s")
points(x2, y2, col = cols[2], pch = PCH)
legend(
  "bottomleft", inset = 0.005,
  legend = legend_text,
  col = cols,
  lty = 1, pch = PCH,  box.lty = 0, cex = 0.7
)
mtext(
  text = "DBnomics <https://db.nomics.world>",
  side = 3, col = "grey", font = 3, adj = 1
)
```

## Fetch all series along one dimension from dataset 'Balance of Payments' (BOP) of IMF

```{r, eval = FALSE}
df <- rdb("IMF", "BOP", mask = "A..BCA_BP6_EUR")
df <- df[!is.na(value)]
df <- df[order(-period, REF_AREA)]
df <- head(df, 100)
```
```{r, eval = TRUE, echo = FALSE}
df <- rdbnomics:::rdbnomics_df006
data.table::setDT(df)
```

```{r, echo = FALSE}
df <- reorder_cols(df)
display_table(df)
```

## Fetch series along multiple dimensions from dataset 'Balance of Payments' (BOP) of IMF

```{r, eval = FALSE}
df <- rdb("IMF", "BOP", mask = "A.FR.BCA_BP6_EUR+IA_BP6_EUR")
df <- df[!is.na(value)]
df <- df[order(period), head(.SD, 50), by = INDICATOR]
```
```{r, eval = TRUE, echo = FALSE}
df <- rdbnomics:::rdbnomics_df007
data.table::setDT(df)
```

```{r, echo = FALSE}
df <- reorder_cols(df)
display_table(df)
```
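
The masks used in this section all follow the same pattern: dimensions are separated by `.`, several values of one dimension are joined with `+`, and an empty position selects every value of that dimension. The sketch below assembles such strings in base R. `build_mask()` is a hypothetical helper, not part of `rdbnomics`, and it assumes the dimensions are supplied in the order expected by the dataset.

```{r, eval = FALSE}
# Hypothetical helper: build a mask from a list with one element per dimension.
# Values of one dimension are joined with "+", dimensions are joined with ".".
# An empty character vector leaves the position empty (all values).
build_mask <- function(dims) {
  paste(vapply(dims, paste, character(1), collapse = "+"), collapse = ".")
}

build_mask(list("A", c("FR", "ES"), "BCA_BP6_EUR"))
build_mask(list("A", character(0), "BCA_BP6_EUR"))
build_mask(list("A", "FR", c("BCA_BP6_EUR", "IA_BP6_EUR")))
```

The three calls reproduce the masks `A.FR+ES.BCA_BP6_EUR`, `A..BCA_BP6_EUR` and `A.FR.BCA_BP6_EUR+IA_BP6_EUR` used above.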

# Fetch time series by `dimensions`
Searching by `dimensions` is a less concise way to select time series than using the code `mask`, but it works with all the different providers. You can find a "*Description of series code*" at the bottom of each dataset page on the [DBnomics website](https://db.nomics.world).

## Fetch one value of one dimension from dataset 'Unemployment rate' (ZUTN) of AMECO provider

```{r, eval = FALSE}
df <- rdb("AMECO", "ZUTN", dimensions = list(geo = "ea19"))
df <- df[!is.na(value)]
# or
# df <- rdb("AMECO", "ZUTN", dimensions = '{"geo": ["ea19"]}')
# df <- df[!is.na(value)]
```
```{r, eval = TRUE, echo = FALSE}
df <- rdbnomics:::rdbnomics_df008
data.table::setDT(df)
```

```{r, echo = FALSE}
df <- reorder_cols(df)
display_table(df)
```

```{r, echo = FALSE, fig.align = 'center'}
# ggplot(df, aes(x = period, y = value, color = series_name)) +
#   geom_line(size = 1.2) +
#   geom_point(size = 2) +
#   dbnomics()
i <- 1
x1 <- df[series_name == sort(unique(series_name))[i]]$period
y1 <- df[series_name == sort(unique(series_name))[i]]$value
cols <- c("red", "blue", "darkgreen")
PCH <- 18
legend_text <- sort(unique(df$series_name))

plot(
  x1, y1, col = cols[1], type = "l", xlab = "", ylab = "",
  xlim = c(min(df$period), max(df$period)),
  ylim = c(min(df$value) - 0.2, max(df$value)),
  panel.first = grid(lty = 1)
)
points(x1, y1, col = cols[1], pch = PCH)
legend(
  "bottomleft", inset = 0.005,
  legend = legend_text,
  col = cols,
  lty = 1, pch = PCH,  box.lty = 0, cex = 0.7
)
mtext(
  text = "DBnomics <https://db.nomics.world>",
  side = 3, col = "grey", font = 3, adj = 1
)
```

## Fetch two values of one dimension from dataset 'Unemployment rate' (ZUTN) of AMECO provider

```{r, eval = FALSE}
df <- rdb("AMECO", "ZUTN", dimensions = list(geo = c("ea19", "dnk")))
df <- df[!is.na(value)]
# or
# df <- rdb("AMECO", "ZUTN", dimensions = '{"geo": ["ea19", "dnk"]}')
# df <- df[!is.na(value)]
```
```{r, eval = TRUE, echo = FALSE}
df <- rdbnomics:::rdbnomics_df009
data.table::setDT(df)
```

```{r, echo = FALSE}
df <- df[order(series_code, period)]
df <- reorder_cols(df)
display_table(df)
```

```{r, echo = FALSE, fig.align = 'center'}
# ggplot(df, aes(x = period, y = value, color = series_name)) +
#   geom_line(size = 1.2) +
#   geom_point(size = 2) +
#   dbnomics()
i <- 1
x1 <- df[series_name == sort(unique(series_name))[i]]$period
y1 <- df[series_name == sort(unique(series_name))[i]]$value
i <- 2
x2 <- df[series_name == sort(unique(series_name))[i]]$period
y2 <- df[series_name == sort(unique(series_name))[i]]$value
cols <- c("red", "blue", "darkgreen")
PCH <- 18
legend_text <- sort(unique(df$series_name))

plot(
  x1, y1, col = cols[1], type = "l", xlab = "", ylab = "",
  xlim = c(min(df$period), max(df$period)),
  ylim = c(min(df$value) - 1.2, max(df$value)),
  panel.first = grid(lty = 1)
)
points(x1, y1, col = cols[1], pch = PCH)
lines(x2, y2, col = cols[2], type = "l")
points(x2, y2, col = cols[2], pch = PCH)
legend(
  "bottomleft", inset = 0.005,
  legend = legend_text,
  col = cols,
  lty = 1, pch = PCH,  box.lty = 0, cex = 0.7
)
mtext(
  text = "DBnomics <https://db.nomics.world>",
  side = 3, col = "grey", font = 3, adj = 1
)
```

## Fetch several values of several dimensions from dataset 'Doing business' (DB) of World Bank

```{r, eval = FALSE}
df <- rdb("WB", "DB", dimensions = list(country = c("DZ", "PE"), indicator = c("ENF.CONT.COEN.COST.ZS", "IC.REG.COST.PC.FE.ZS")))
df <- df[!is.na(value)]
# or
# df <- rdb("WB", "DB", dimensions = '{"country": ["DZ", "PE"], "indicator": ["ENF.CONT.COEN.COST.ZS", "IC.REG.COST.PC.FE.ZS"]}')
# df <- df[!is.na(value)]
```
```{r, eval = TRUE, echo = FALSE}
df <- rdbnomics:::rdbnomics_df010
data.table::setDT(df)
```

```{r, echo = FALSE}
df <- df[order(series_name, period)]
df <- reorder_cols(df)
display_table(df)
```

```{r, echo = FALSE, fig.align = 'center'}
# ggplot(df, aes(x = period, y = value, color = series_name)) +
#   geom_line(size = 1.2) +
#   geom_point(size = 2) +
#   dbnomics()
i <- 1
x1 <- df[series_name == sort(unique(series_name))[i]]$period
y1 <- df[series_name == sort(unique(series_name))[i]]$value
i <- 2
x2 <- df[series_name == sort(unique(series_name))[i]]$period
y2 <- df[series_name == sort(unique(series_name))[i]]$value
i <- 3
x3 <- df[series_name == sort(unique(series_name))[i]]$period
y3 <- df[series_name == sort(unique(series_name))[i]]$value
i <- 4
x4 <- df[series_name == sort(unique(series_name))[i]]$period
y4 <- df[series_name == sort(unique(series_name))[i]]$value
cols <- c("red", "blue", "darkgreen", "purple")
PCH <- 18
legend_text <- sort(unique(df$series_name))

plot(
  x1, y1, col = cols[1], type = "l", xlab = "", ylab = "",
  xlim = c(min(df$period), max(df$period)),
  ylim = c(min(df$value) - 7, max(df$value)),
  panel.first = grid(lty = 1)
)
points(x1, y1, col = cols[1], pch = PCH)
lines(x2, y2, col = cols[2], type = "l")
points(x2, y2, col = cols[2], pch = PCH)
lines(x3, y3, col = cols[3], type = "l")
points(x3, y3, col = cols[3], pch = PCH)
lines(x4, y4, col = cols[4], type = "l")
points(x4, y4, col = cols[4], pch = PCH)
legend(
  "bottomleft", inset = 0.005,
  legend = legend_text,
  col = cols,
  lty = 1, pch = PCH,  box.lty = 0, cex = 0.7
)
mtext(
  text = "DBnomics <https://db.nomics.world>",
  side = 3, col = "grey", font = 3, adj = 1
)
```
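
The two forms of the `dimensions` argument used in this section, a named list or a JSON string, carry the same information. As a rough illustration, the base R sketch below turns the list form into the JSON form. `dims_to_json()` is a hypothetical helper, not part of `rdbnomics`, and it assumes that codes contain no characters needing JSON escaping.

```{r, eval = FALSE}
# Hypothetical helper: write a named list of character vectors as a JSON object
# whose values are arrays of strings. No escaping is performed.
dims_to_json <- function(dims) {
  fields <- vapply(names(dims), function(nm) {
    values <- paste0('"', dims[[nm]], '"', collapse = ", ")
    sprintf('"%s": [%s]', nm, values)
  }, character(1))
  paste0("{", paste(fields, collapse = ", "), "}")
}

dims_to_json(list(geo = c("ea19", "dnk")))
dims_to_json(list(country = c("DZ", "PE"), indicator = "ENF.CONT.COEN.COST.ZS"))
```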

# Fetch time series with a `query`
The query is a Google-like search that will filter/select time series from a provider's dataset.

## Fetch one series from dataset 'WEO by countries (2019-10 release)' (WEO:2019-10) of IMF
```{r, eval = FALSE}
df <- rdb("IMF", "WEO:2019-10", query = "France current account balance percent")
df <- df[!is.na(value)]
```
```{r, eval = TRUE, echo = FALSE}
df <- rdbnomics:::rdbnomics_df014
data.table::setDT(df)
```

```{r, echo = FALSE}
df <- reorder_cols(df)
display_table(df)
```

```{r, echo = FALSE, fig.align = 'center'}
# ggplot(df, aes(x = period, y = value, color = series_name)) +
#   geom_line(size = 1.2) +
#   geom_point(size = 2) +
#   dbnomics()
i <- 1
x1 <- df[series_name == sort(unique(series_name))[i]]$period
y1 <- df[series_name == sort(unique(series_name))[i]]$value
cols <- c("red", "blue", "darkgreen", "purple")
PCH <- 18
legend_text <- sort(unique(df$series_name))

plot(
  x1, y1, col = cols[1], type = "l", xlab = "", ylab = "",
  xlim = c(min(df$period), max(df$period)),
  ylim = c(min(df$value) - 0.5, max(df$value)),
  panel.first = grid(lty = 1)
)
points(x1, y1, col = cols[1], pch = PCH)
legend(
  "bottomleft", inset = 0.005,
  legend = legend_text,
  col = cols,
  lty = 1, pch = PCH,  box.lty = 0, cex = 0.7
)
mtext(
  text = "DBnomics <https://db.nomics.world>",
  side = 3, col = "grey", font = 3, adj = 1
)
```

## Fetch series from dataset 'WEO by countries (2019-10 release)' (WEO:2019-10) of IMF
```{r, eval = FALSE}
df <- rdb("IMF", "WEO:2019-10", query = "current account balance percent")
df <- df[!is.na(value)]
```
```{r, eval = TRUE, echo = FALSE}
df <- rdbnomics:::rdbnomics_df015
data.table::setDT(df)
```

```{r, echo = FALSE}
df <- reorder_cols(df)
display_table(df)
```

```{r, echo = FALSE, fig.align = 'center'}
# ggplot(df, aes(x = period, y = value, color = `WEO Country`)) +
#   geom_line(size = 1.2) +
#   geom_point(size = 2) +
#   ggtitle("Current account balance (% GDP)") +
#   dbnomics(legend.direction = "horizontal")
i <- 1
x1 <- df[series_name == sort(unique(series_name))[i]]$period
y1 <- df[series_name == sort(unique(series_name))[i]]$value
i <- 2
x2 <- df[series_name == sort(unique(series_name))[i]]$period
y2 <- df[series_name == sort(unique(series_name))[i]]$value
i <- 3
x3 <- df[series_name == sort(unique(series_name))[i]]$period
y3 <- df[series_name == sort(unique(series_name))[i]]$value
i <- 4
x4 <- df[series_name == sort(unique(series_name))[i]]$period
y4 <- df[series_name == sort(unique(series_name))[i]]$value
cols <- c("red", "blue", "darkgreen", "purple")
PCH <- 18
legend_text <- sort(unique(df$series_name))

plot(
  x1, y1, col = cols[1], type = "l", xlab = "", ylab = "",
  xlim = c(min(df$period), max(df$period)),
  ylim = c(min(df$value), max(df$value)),
  panel.first = grid(lty = 1)
)
points(x1, y1, col = cols[1], pch = PCH)
lines(x2, y2, col = cols[2], type = "l")
points(x2, y2, col = cols[2], pch = PCH)
lines(x3, y3, col = cols[3], type = "l")
points(x3, y3, col = cols[3], pch = PCH)
lines(x4, y4, col = cols[4], type = "l")
points(x4, y4, col = cols[4], pch = PCH)
legend(
  "bottomleft", inset = 0.005,
  legend = legend_text,
  col = cols,
  lty = 1, pch = PCH,  box.lty = 0, cex = 0.7
)
mtext(
  text = "DBnomics <https://db.nomics.world>",
  side = 3, col = "grey", font = 3, adj = 1
)
```

# Fetch time series found on the web site

When you don't know the codes of the dimensions, provider, dataset or series, you can:

- go to the page of a dataset on the [DBnomics website](https://db.nomics.world), for example [Doing Business](https://db.nomics.world/WB/DB),  

- select some dimensions by using the input widgets of the left column,
![](dbnomics002.png)  

- click on "*Copy API link*" in the menu of the "*Download*" button,
![](dbnomics003.png)  

- use the `rdb(api_link = ...)` function as shown below.  

```{r, eval = FALSE}
df <- rdb(api_link = "https://api.db.nomics.world/v22/series/WB/DB?dimensions=%7B%22country%22%3A%5B%22FR%22%2C%22IT%22%2C%22ES%22%5D%7D&q=IC.REG.PROC.FE.NO&observations=1&format=json&align_periods=1&offset=0&facets=0")
df <- df[!is.na(value)]
```
```{r, eval = TRUE, echo = FALSE}
df <- rdbnomics:::rdbnomics_df011
data.table::setDT(df)
```

```{r, echo = FALSE}
df <- df[order(period, series_name)]
df <- reorder_cols(df)
display_table(df)
```

```{r, echo = FALSE, fig.align = 'center'}
# ggplot(df, aes(x = period, y = value, color = series_name)) +
#   geom_step(size = 1.2) +
#   geom_point(size = 2) +
#   dbnomics()
i <- 1
x1 <- df[series_name == sort(unique(series_name))[i]]$period
y1 <- df[series_name == sort(unique(series_name))[i]]$value
i <- 2
x2 <- df[series_name == sort(unique(series_name))[i]]$period
y2 <- df[series_name == sort(unique(series_name))[i]]$value
i <- 3
x3 <- df[series_name == sort(unique(series_name))[i]]$period
y3 <- df[series_name == sort(unique(series_name))[i]]$value
cols <- c("red", "blue", "darkgreen")
PCH <- 18

plot(
  x1, y1, col = cols[1], type = "s", xlab = "", ylab = "",
  xlim = c(min(df$period), max(df$period)),
  ylim = c(min(df$value) - 1.2, max(df$value)),
  panel.first = grid(lty = 1)
)
points(x1, y1, col = cols[1], pch = PCH)
lines(x2, y2, col = cols[2], type = "s")
points(x2, y2, col = cols[2], pch = PCH)
lines(x3, y3, col = cols[3], type = "s")
points(x3, y3, col = cols[3], pch = PCH)
legend(
  "bottomleft", inset = 0.005,
  legend = sort(unique(df$series_name)),
  col = cols,
  lty = 1, pch = PCH,  box.lty = 0, cex = 0.7
)
mtext(
  text = "DBnomics <https://db.nomics.world>",
  side = 3, col = "grey", font = 3, adj = 1
)
```

In the event that you only use the argument `api_link`, you can drop the name and run:
```{r, eval = FALSE}
df <- rdb("https://api.db.nomics.world/v22/series/WB/DB?dimensions=%7B%22country%22%3A%5B%22FR%22%2C%22IT%22%2C%22ES%22%5D%7D&q=IC.REG.PROC.FE.NO&observations=1&format=json&align_periods=1&offset=0&facets=0")
```
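
The copied link is percent-encoded, which hides the selection it carries. If you want to check what a link requests before passing it to `rdb`, one possible approach (a sketch in base R, independent of `rdbnomics`) is to split the query string and decode each parameter with `utils::URLdecode()`:

```{r, eval = FALSE}
api_link <- "https://api.db.nomics.world/v22/series/WB/DB?dimensions=%7B%22country%22%3A%5B%22FR%22%2C%22IT%22%2C%22ES%22%5D%7D&q=IC.REG.PROC.FE.NO&observations=1&format=json&align_periods=1&offset=0&facets=0"

# Keep what follows the "?" and split it into "name=value" pairs.
query <- sub("^[^?]*\\?", "", api_link)
pairs <- strsplit(strsplit(query, "&", fixed = TRUE)[[1]], "=", fixed = TRUE)

# Decode the values and name them after their parameter.
params <- stats::setNames(
  vapply(pairs, function(p) utils::URLdecode(p[2]), character(1)),
  vapply(pairs, function(p) p[1], character(1))
)

params[["dimensions"]]
params[["q"]]
```

Here `params[["dimensions"]]` is the JSON string `{"country":["FR","IT","ES"]}` and `params[["q"]]` is `IC.REG.PROC.FE.NO`.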

# Fetch time series from the cart

On the cart page of the [DBnomics website](https://db.nomics.world), click on "*Copy API link*" and copy-paste it as an argument of the `rdb(api_link = ...)` function. Please note that when you update your cart, you have to copy this link again, because the link itself contains the ids of the series in the cart.
<center>
![](dbnomics005.png)
</center>
  
```{r, eval = FALSE}
df <- rdb(api_link = "https://api.db.nomics.world/v22/series?observations=1&series_ids=BOE/6008/RPMTDDC,BOE/6231/RPMTBVE")
df <- df[!is.na(value)]
```
```{r, eval = TRUE, echo = FALSE}
df <- rdbnomics:::rdbnomics_df012
data.table::setDT(df)
```

```{r, echo = FALSE}
df[
    ,
    series_name := sapply(
      series_name,
      function(y) {
        paste0(
          paste0(
            strsplit(y, "institutions' ")[[1]], collapse = "institutions'\n"
          ),
          "\n"
        )
      }
    )
  ]
```

```{r, echo = FALSE}
df <- df[order(period, series_name)]
df <- reorder_cols(df)
display_table(df)
```

```{r, echo = FALSE, fig.align = 'center'}
# ggplot(df, aes(x = period, y = value, color = series_name)) +
#   geom_line(size = 1.2) +
#   geom_point(size = 2) +
#   dbnomics()
i <- 1
x1 <- df[series_name == sort(unique(series_name))[i]]$period
y1 <- df[series_name == sort(unique(series_name))[i]]$value
i <- 2
x2 <- df[series_name == sort(unique(series_name))[i]]$period
y2 <- df[series_name == sort(unique(series_name))[i]]$value
cols <- c("red", "blue", "darkgreen")
PCH <- 18

plot(
  x1, y1, col = cols[1], type = "l", xlab = "", ylab = "",
  xlim = c(min(df$period), max(df$period)),
  ylim = c(min(df$value) - 4*10^3, max(df$value)),
  panel.first = grid(lty = 1)
)
points(x1, y1, col = cols[1], pch = PCH)
lines(x2, y2, col = cols[2], type = "l")
points(x2, y2, col = cols[2], pch = PCH)
legend(
  "bottomleft", inset = 0.005,
  legend = sort(unique(df$series_name)),
  col = cols,
  lty = 1, pch = PCH,  box.lty = 0, cex = 0.7
)
mtext(
  text = "DBnomics <https://db.nomics.world>",
  side = 3, col = "grey", font = 3, adj = 1
)
```
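
Because the cart link embeds the series identifiers, they can be recovered from it with plain string handling. The following base R sketch assumes the link has the same shape as the one above, with a comma-separated `series_ids` parameter:

```{r, eval = FALSE}
cart_link <- "https://api.db.nomics.world/v22/series?observations=1&series_ids=BOE/6008/RPMTDDC,BOE/6231/RPMTBVE"

# Isolate the value of the "series_ids" parameter, then split it on commas.
ids_param <- sub("^.*[?&]series_ids=([^&]*).*$", "\\1", cart_link)
series_ids <- strsplit(ids_param, ",", fixed = TRUE)[[1]]
series_ids
```

Each element has the `provider_code`/`dataset_code`/`series_code` form, so the resulting vector is the kind of value that the `ids` argument of `rdb` accepts.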

# Fetch the available datasets of a provider

When fetching series from [DBnomics](https://db.nomics.world), you need
to give a provider and a dataset before specifying correct dimensions. With
the function `rdb_datasets`, you can download the list of the available datasets
for a provider.  
For example, to fetch the **IMF** datasets, you have to use:
```{r, eval = FALSE}
rdb_datasets(provider_code = "IMF")
```

The result is a named list (its name is **IMF**) with one element which is a
`data.table`:
```{r, eval = TRUE, echo = FALSE}
str(rdbnomics:::rdbnomics_df016)
```

With the same function, if you want to fetch the available datasets for multiple
providers, you can give a vector of providers and get a named list.
```{r, eval = FALSE}
rdb_datasets(provider_code = c("IMF", "BDF"))
```
```{r, eval = TRUE, echo = FALSE}
str(rdbnomics:::rdbnomics_df018)
```
```{r, eval = TRUE, echo = FALSE}
DT <- rdbnomics:::rdbnomics_df018
DT <- sapply(DT, function(y) { paste0(": ", nrow(y)) })
DT <- paste0("Number of datasets for ", names(DT), " ", unname(DT))
cat(DT, sep = "\n")
```
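
Since the result is a named list with one table per provider, the usual list tools apply to it. The sketch below works on a toy stand-in built with base `data.frame`s rather than on a real result: the column names `code` and `name` and the `BDF` entry are assumptions made for the illustration.

```{r, eval = FALSE}
# Toy stand-in for the result of rdb_datasets(provider_code = c("IMF", "BDF")).
# The BDF row is a placeholder, not a real dataset.
datasets <- list(
  IMF = data.frame(
    code = c("BOP", "WEO:2019-10"),
    name = c("Balance of Payments", "WEO by countries (2019-10 release)")
  ),
  BDF = data.frame(code = "XXX", name = "Placeholder dataset")
)

# Number of datasets per provider.
n_datasets <- vapply(datasets, nrow, integer(1))
n_datasets

# Dataset codes of one provider.
datasets[["IMF"]]$code
```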

In the event that you only request the datasets for one provider, if you define
`simplify = TRUE`, then the result will be a `data.table` rather than a named list.
```{r, eval = FALSE}
rdb_datasets(provider_code = "IMF", simplify = TRUE)
```
```{r, eval = TRUE, echo = FALSE}
DT <- rdbnomics:::rdbnomics_df017
data.table::setDT(DT)
display_table(DT)
```

The extent of datasets gathered by [DBnomics](https://db.nomics.world) can be
appreciated by using the function with the argument `provider_code` set to
`NULL`:
```{r, eval = FALSE}
options(rdbnomics.progress_bar_datasets = TRUE)
rdb_datasets()
options(rdbnomics.progress_bar_datasets = FALSE)
```
```{r, eval = TRUE, echo = FALSE}
DT <- rdbnomics:::rdbnomics_df019
DT <- data.table(Provider = names(DT), `Number of datasets` = sapply(DT, nrow))
DT <- DT[order(Provider)]
display_table(DT)
```

# Fetch the possible dimensions of available datasets of a provider

When fetching series from [DBnomics](https://db.nomics.world), it can be 
interesting and especially useful to specify dimensions for a particular
dataset to download only the series you want to analyse. With
the function `rdb_dimensions`, you can download these dimensions and their
meanings.  
For example, for the dataset **WEO:2019-10** of the **IMF**, you may use:
```{r, eval = FALSE}
rdb_dimensions(provider_code = "IMF", dataset_code = "WEO:2019-10")
```

The result is a nested named list (its names are **IMF**, **WEO:2019-10** and the
dimension names) with a `data.table` at the end of each branch:
```{r, eval = TRUE, echo = FALSE}
DT <- rdbnomics:::rdbnomics_df020
DT <- DT$IMF$`WEO:2019-10`
DT <- paste0("Number of dimensions for IMF/WEO:2019-10 : ", length(DT))
cat(DT, sep = "\n")
```

```{r, eval = TRUE, echo = FALSE}
DT <- rdbnomics:::rdbnomics_df020
DT <- DT$IMF$`WEO:2019-10`[[1]]
display_table(DT)
```

```{r, eval = TRUE, echo = FALSE}
DT <- rdbnomics:::rdbnomics_df020
DT <- DT$IMF$`WEO:2019-10`[[2]]
display_table(DT)
```

```{r, eval = TRUE, echo = FALSE}
DT <- rdbnomics:::rdbnomics_df020
DT <- DT$IMF$`WEO:2019-10`[[3]]
display_table(DT)
```
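
Navigating the nested list only requires the `$` or `[[` operators; backticks are needed around names such as `WEO:2019-10` that are not syntactic R names. The sketch below uses a toy stand-in whose dimension names (`dim_a`, `dim_b`) and contents are invented for the illustration, with base `data.frame`s in place of `data.table`s.

```{r, eval = FALSE}
# Toy stand-in for the result of rdb_dimensions(); names and values are invented.
dims <- list(
  IMF = list(
    `WEO:2019-10` = list(
      dim_a = data.frame(code = c("a1", "a2"), label = c("First value", "Second value")),
      dim_b = data.frame(code = "b1", label = "Only value")
    )
  )
)

# Provider, then dataset (backticks because of ":" and "-"), then dimension.
weo <- dims$IMF$`WEO:2019-10`
names(weo)
length(weo)
weo[["dim_a"]]$code
```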

In the event that you only request the dimensions for one dataset for one
provider, if you define `simplify = TRUE`, then the result will be a named list
of `data.table`s rather than a nested named list.
```{r, eval = FALSE}
rdb_dimensions(provider_code = "IMF", dataset_code = "WEO:2019-10", simplify = TRUE)
```
```{r, eval = TRUE, echo = FALSE}
str(rdbnomics:::rdbnomics_df021)
```

You can measure the vast extent of datasets gathered by
[DBnomics](https://db.nomics.world) by downloading all the possible
dimensions. To do this, you have to set the arguments
`provider_code` and `dataset_code` to `NULL`.  
<b><font color='red'>&#9888;</font></b> This is relatively slow to run and the result is heavy to display, so we show only
the first 100 rows.
```{r, eval = FALSE}
options(rdbnomics.progress_bar_datasets = TRUE)
rdb_dimensions()
options(rdbnomics.progress_bar_datasets = FALSE)
```
```{r, eval = TRUE, echo = FALSE}
DT <- rdbnomics:::rdbnomics_df022
DT <- DT[order(Provider, Dataset)]
DT <- head(DT, 100)
display_table(DT)
```