Disk of the dolos server is almost full because of growing json-data
The 1 TB disk of the production server dolos is 85% full, with 137G free. Eurostat JSON data takes 257G.
Also, the 1 TB disk of the development server ioke is 100% full. I deleted ecb-json-data and also deleted the ecb-json-data webhook that triggered data validation.
Impacts
As mentioned in #550, we are blocked from deploying the newly downloaded OECD data. Given that OECD source data takes 335G, I estimate that the JSON data will take between 200G and 300G. That is more than the 137G of free space on dolos.
Also, data validation on ioke is limited because any pull into a json-data repo is blocked by the full disk.
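As a quick check of the space estimate above, here is a minimal sketch comparing the estimated range with the free space. The function name `fits_on_disk` is made up for illustration; the figures are the ones quoted in this issue.

```python
def fits_on_disk(free_gb: float, estimate_low_gb: float, estimate_high_gb: float) -> str:
    """Classify whether an estimated data size range fits in the free space."""
    if estimate_high_gb <= free_gb:
        return "fits"
    if estimate_low_gb <= free_gb:
        return "may fit"
    return "does not fit"


# Figures from this issue: 137G free on dolos,
# OECD JSON data estimated between 200G and 300G.
print(fits_on_disk(137, 200, 300))  # prints "does not fit"
```

Even the low end of the estimate exceeds the free space, so freeing a few tens of gigabytes on dolos would not be enough.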
Solution
A solution is to set up a shared storage server and have the consumers mount it via NFS. The consumers are the API, the Solr importer job, and the data validation job.
The storage server would have a 10 TB disk, which should be enough to last many years. Indeed, OECD and Eurostat are among the biggest providers, and the whole DBnomics json-data currently weighs less than 2 TB.
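To put a rough number on "many years", here is a small sketch that assumes constant linear growth. The capacity (10 TB) and current usage (2 TB, taken as an upper bound) come from this issue. The growth rates are hypothetical placeholders, not measurements, and would need to be replaced with figures from monitoring.

```python
def years_until_full(capacity_tb: float, used_tb: float, growth_tb_per_year: float) -> float:
    """Years before the disk fills up, assuming constant linear growth."""
    if growth_tb_per_year <= 0:
        raise ValueError("growth_tb_per_year must be positive")
    return max(0.0, (capacity_tb - used_tb) / growth_tb_per_year)


# 10 TB capacity and 2 TB used (upper bound) come from this issue.
# The growth rates below are hypothetical placeholders, not measurements.
for growth in (0.5, 1.0, 2.0):
    years = years_until_full(10, 2, growth)
    print(f"{growth} TB/year -> {years:.1f} years")
```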
Tasks
- explore NFS storage of online.net, preferably a managed block storage, or else create a server with 10 TB of disk (similar to eros) named oizys with an NFS server
- explore how to mount an NFS storage from a CI job
- rsync all json-data repositories on the new server
- optional: adapt Prometheus and Grafana monitoring to new server
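For the rsync task, a minimal sketch of how the per-repository command could be built. The `/srv/json-data` paths are assumptions for illustration, not the actual layout of dolos or oizys. The helper `rsync_command` is hypothetical as well. It defaults to `--dry-run` so that nothing is copied until the command has been reviewed.

```python
import shlex


def rsync_command(repo_name, source_root, dest_host, dest_root, dry_run=True):
    """Build an rsync command that copies one json-data repository to the new server."""
    # The trailing slash on the source copies the directory contents
    # rather than nesting the directory inside the destination.
    src = f"{source_root.rstrip('/')}/{repo_name}/"
    dest = f"{dest_host}:{dest_root.rstrip('/')}/{repo_name}/"
    cmd = ["rsync", "--archive"]
    if dry_run:
        cmd.append("--dry-run")
    cmd += [src, dest]
    return cmd


# "/srv/json-data" is a hypothetical path on both servers.
cmd = rsync_command("ecb-json-data", "/srv/json-data", "oizys", "/srv/json-data")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the dry run looks right.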
Questions
- Once there is a single shared storage, and since the import and data validation jobs are not sequential and both do a clone-or-update of the json-data repository, both jobs can end up doing a "git pull" at the same time. Is this supported by Git?
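Independently of how Git answers this, one option would be to serialize the clone-or-update step with an advisory file lock, so that the two jobs never pull the same repository at once. Below is a minimal sketch using `fcntl.flock`. The names `repo_lock` and `pull_serialized` and the choice of lock path are assumptions. Whether `flock` is honoured across NFS clients depends on the NFS version and mount options, so this would need checking on the actual mount.

```python
import fcntl
import os
import subprocess
from contextlib import contextmanager


@contextmanager
def repo_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # Blocks until no other holder has the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        # Closing the file descriptor releases the lock.
        os.close(fd)


def pull_serialized(repo_dir, lock_path):
    """Run "git pull" in repo_dir while holding the lock (hypothetical helper)."""
    with repo_lock(lock_path):
        subprocess.run(["git", "-C", repo_dir, "pull", "--ff-only"], check=True)
```

Both the import job and the data validation job would have to call `pull_serialized` with the same `lock_path` for a given repository. The lock is advisory, so a job that bypasses it is not blocked.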