Commit fb5bb9c1 authored by Thomas Brand

Update rdbnomics tutorial with new features about proxy configuration.

parent 5c13a98e
Pipeline #35386 failed with stage
in 60 minutes
Title: Financial factors in the US and Euro Area business cycles: a general equilibrium approach
Date: 2016-07-01
Date: 2016-07-03
Category: Policy
Tags: model, DSGE, bayesian estimation, R
Slug: divergence-website
......
---
Title: DBnomics R client, a tutorial
date: 2018-11-30
Title: Access the free economic database DBnomics with R
date: 2019-03-04
Category: Data
Tags: DBnomics, database, R
Slug: rdbnomics-tutorial
Authors: Thomas Brand, Sébastien Galais
Summary: Access DBnomics data series from R.
Summary: Access the whole economic database <a href="https://db.nomics.world/" target="_blank">DBnomics</a> for free from R.
Download: https://git.nomics.world/macro/macro.nomics.world/tree/master/content/rdbnomics-tutorial
output: html_document
---
......@@ -13,7 +13,7 @@ output: html_document
# DBnomics : the world's economic database
You can explore all the economic data from different providers by following the link <a href="https://db.nomics.world/" target="_blank">db.nomics.world</a>.
Explore all the economic data from different providers (national and international statistical institutes, central banks, etc.), for free, following the link <a href="https://db.nomics.world/" target="_blank">db.nomics.world</a>.
![]({filename}/images/dbnomics001.png){width=750px}
......@@ -277,8 +277,11 @@ ggplot(df, aes(x = period, y = value, color = series_code)) +
## Fetch two values of one dimension from dataset 'Unemployment rate' (ZUTN) of AMECO provider
```{r, eval = TRUE}
df <- rdb('AMECO', 'ZUTN', dimensions = '{"geo": ["ea19", "dnk"]}') %>%
df <- rdb('AMECO', 'ZUTN', dimensions = list(geo = c("ea19", "dnk"))) %>%
filter(!is.na(value))
# or
# df <- rdb('AMECO', 'ZUTN', dimensions = '{"geo": ["ea19", "dnk"]}') %>%
# filter(!is.na(value))
```
```{r, echo = FALSE}
......@@ -297,8 +300,11 @@ ggplot(df, aes(x = period, y = value, color = series_code)) +
## Fetch several values of several dimensions from dataset 'Doing business' (DB) of World Bank
```{r, eval = TRUE}
df <- rdb('WB', 'DB', dimensions = '{"country": ["DZ", "PE"],"indicator": ["ENF.CONT.COEN.COST.ZS","IC.REG.COST.PC.FE.ZS"]}') %>%
df <- rdb('WB', 'DB', dimensions = list(country = c("DZ", "PE"), indicator = c("ENF.CONT.COEN.COST.ZS", "IC.REG.COST.PC.FE.ZS"))) %>%
filter(!is.na(value))
# or
# df <- rdb('WB', 'DB', dimensions = '{"country": ["DZ", "PE"],"indicator": ["ENF.CONT.COEN.COST.ZS","IC.REG.COST.PC.FE.ZS"]}') %>%
# filter(!is.na(value))
```
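The list form and the JSON string are meant to describe the same filter. As a rough check, assuming the **jsonlite** package is installed, the list can be serialised and compared by eye with the JSON string. How **rdbnomics** converts the list internally is not shown in this tutorial, so this is only an illustration:

```{r, eval = FALSE}
library(jsonlite)

# Same dimensions as in the call above, written as an R list
dims <- list(
  country = c("DZ", "PE"),
  indicator = c("ENF.CONT.COEN.COST.ZS", "IC.REG.COST.PC.FE.ZS")
)

# Serialise the list to a JSON string and print it
dims_json <- as.character(toJSON(dims))
cat(dims_json, "\n")
```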
```{r, echo = FALSE}
......@@ -329,7 +335,7 @@ When you don't know the codes of the dimensions, provider, dataset or series, yo
- use the `rdb_by_api_link` function such as below.
```{r, eval = TRUE}
df <- rdb_by_api_link("https://api.db.nomics.world/v21/series?dimensions=%7B%22country%22%3A%5B%22FR%22%2C%22IT%22%2C%22ES%22%5D%2C%22indicator%22%3A%5B%22IC.REG.PROC.FE.NO%22%5D%7D&provider_code=WB&dataset_code=DB&format=json") %>%
df <- rdb_by_api_link("https://api.db.nomics.world/v22/series/WB/DB?dimensions=%7B%22country%22%3A%5B%22FR%22%2C%22IT%22%2C%22ES%22%5D%7D&q=IC.REG.PROC.FE.NO&observations=1&format=json&align_periods=1&offset=0&facets=0") %>%
filter(!is.na(value))
```
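The `dimensions` parameter of such a link is percent-encoded JSON, which is hard to read. A minimal sketch to decode it with the base function `URLdecode` (the variable names are only illustrative):

```{r, eval = FALSE}
# Value of the `dimensions` query parameter copied from the link above
encoded <- "%7B%22country%22%3A%5B%22FR%22%2C%22IT%22%2C%22ES%22%5D%7D"

# Turn the percent-encoded characters back into plain JSON
decoded <- utils::URLdecode(encoded)
cat(decoded, "\n")
# {"country":["FR","IT","ES"]}
```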
......@@ -354,7 +360,7 @@ On the cart page of the <a href="https://db.nomics.world/" target="_blank">DBnom
</center>
```{r, eval = TRUE}
df <- rdb_by_api_link("https://api.db.nomics.world/v21/series?series_ids=BOE%2F8745%2FLPMB23A%2CBOE%2F8745%2FLPMB26A&format=json") %>%
df <- rdb_by_api_link("https://api.db.nomics.world/v22/series?series_ids=BOE%2F8745%2FLPMB23A%2CBOE%2F8745%2FLPMB26A&observations=1&format=json&align_periods=1") %>%
filter(!is.na(value))
```
......@@ -388,3 +394,83 @@ ggplot(df, aes(x = period, y = value, color = series_name)) +
scale_y_continuous(labels = function(x) { format(x, big.mark = " ") }) +
dbnomics()
```
# Proxy configuration or connection error `Could not resolve host`
When using the functions `rdb` or `rdb_...`, you may come across the following error:
```{r, eval = FALSE}
Error in open.connection(con, "rb") :
Could not resolve host: api.db.nomics.world
```
To work around this error, you have two options:
1. configure **curl** to use a specific and authorized proxy.
2. use the default R internet connection, i.e. the Internet Explorer proxy defined in *internet2.dll*.
## Configure **curl** to use a specific and authorized proxy
By default, **rdbnomics** fetches the data with the function `curl_fetch_memory` of the package **curl**. If a specific proxy must be used, you can define it for the whole session with the package option `rdbnomics.curl_config`, or on the fly through the argument `curl_config`. Either way, the object is passed to the argument `handle` of the `curl_fetch_memory` function.
To see the available parameters, run `names(curl_options())` in *R* or visit the website <a href="https://curl.haxx.se/libcurl/c/curl_easy_setopt.html" target="_blank">https://curl.haxx.se/libcurl/c/curl_easy_setopt.html</a>. Once you have chosen them, define the curl object as follows:
```{r, eval = FALSE}
h <- curl::new_handle(
proxy = "<proxy>",
proxyport = <port>,
proxyusername = "<username>",
proxypassword = "<password>"
)
```
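The full list returned by `curl_options()` is long. As a small sketch, assuming the **curl** package is installed, its `filter` argument can narrow the list down to the proxy-related options:

```{r, eval = FALSE}
library(curl)

# Keep only the libcurl option names that contain "proxy"
proxy_opts <- names(curl_options(filter = "proxy"))
print(proxy_opts)

# Check whether the options used in the handle above are recognised
# by the installed version of libcurl
c("proxy", "proxyport", "proxyusername", "proxypassword") %in% proxy_opts
```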
### Set the connection up for a session
The curl connection can be set up for a session by modifying the following package option:
```{r, eval = FALSE}
options(rdbnomics.curl_config = h)
```
When fetching the data, the command `curl_fetch_memory(url = <...>, handle = h)` is executed. If you want to pass other arguments, use:
```{r, eval = FALSE}
options(rdbnomics.curl_config = list(handle = h, arg = <...>))
```
After configuration, just use the standard functions of **rdbnomics**, e.g.:
```{r, eval = FALSE}
df1 <- rdb(ids = 'AMECO/ZUTN/EA19.1.0.0.0.ZUTN')
```
This package option can be disabled with:
```{r, eval = FALSE}
options(rdbnomics.curl_config = NULL)
```
### Use the connection only for a function call
If you do not need a session-wide configuration but only an "on the fly" execution, use the argument `curl_config` of the functions `rdb` and `rdb_...`:
```{r, eval = FALSE}
df1 <- rdb(ids = 'AMECO/ZUTN/EA19.1.0.0.0.ZUTN', curl_config = h)
```
## Use the default R internet connection
To retrieve the data with the default R internet connection, **rdbnomics** will use the base function `readLines`.
### Set the connection up for a session
To activate this feature for a session, enable the following package option:
```{r, eval = FALSE}
options(rdbnomics.use_readLines = TRUE)
```
Then use the standard function as follows:
```{r, eval = FALSE}
df1 <- rdb(ids = 'AMECO/ZUTN/EA19.1.0.0.0.ZUTN')
```
This configuration can be disabled with:
```{r, eval = FALSE}
options(rdbnomics.use_readLines = FALSE)
```
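Since `options()` returns the previous values, the option can also be enabled temporarily and then restored to whatever it was before. A minimal sketch using only base R:

```{r, eval = FALSE}
# Enable the option and keep the previous setting returned by options()
old <- options(rdbnomics.use_readLines = TRUE)

# Confirm that the option is active for this session
active <- getOption("rdbnomics.use_readLines")
print(active)

# Restore the value that was set before (or unset the option)
options(old)
```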
### Use the connection only for a function call
If you just want to do it once, you may use the argument `use_readLines` of the functions `rdb` and `rdb_...`:
```{r, eval = FALSE}
df1 <- rdb(ids = 'AMECO/ZUTN/EA19.1.0.0.0.ZUTN', use_readLines = TRUE)
```
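The two approaches can be combined: try the default fetch first and fall back to `readLines` only when it fails. The helper below is a hypothetical sketch, not part of **rdbnomics**; the name `fetch_with_fallback` and its `fetch` argument are assumptions made for illustration.

```{r, eval = FALSE}
# Hypothetical helper: try the default fetch, then retry with readLines.
# `fetch` defaults to rdbnomics::rdb and can be replaced for testing.
fetch_with_fallback <- function(ids, fetch = rdbnomics::rdb) {
  tryCatch(
    fetch(ids = ids),
    error = function(e) {
      # Report why the first attempt failed before retrying
      message("Default fetch failed: ", conditionMessage(e))
      fetch(ids = ids, use_readLines = TRUE)
    }
  )
}

# Usage (requires rdbnomics and an internet connection):
# df1 <- fetch_with_fallback('AMECO/ZUTN/EA19.1.0.0.0.ZUTN')
```

Note that the fallback retries after any error, not only after `Could not resolve host`, so a genuine mistake in the series identifier triggers a second, equally unsuccessful request.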