Datasets within this collection
32 results
- Knockin’ on H(e)aven’s door. Financial crises and offshore wealth
  Reproduction package for "Knockin’ on H(e)aven’s door. Financial crises and offshore wealth" by Silvia Marchesi and Giovanna Marcolongo. For further information, see the README file.
- Hodrick-Prescott filter with jumps (Maranzano & Pelagatti, 2025)
  We provide data and code to replicate the results presented in "A Hodrick-Prescott Filter with automatically selected breaks" (Maranzano & Pelagatti, 2025). The subfolders allow replicating the following: 1. Simulation experiments discussed in Section 3, "Simulations"; 2. Application results discussed in Section 4, "Assessing structural breaks in the Italian labour market"; 3. Simulation experiments discussed in Section 5, "A comparison with other business cycle extraction methods". Each subfolder includes a readme file describing the reproduction steps.
- ISTADFuels - Italian SpatioTemporal Augmented Dataset on Fuels
  We present a dataset for fuel sales analysis at the Italian provincial (NUTS3) level from January 2015 to October 2023 (release V3, January 2024). Fuel sales data are collected at monthly frequency and are organized by fuel type, usage, and point of sale (highway, municipal road, extra-network road). The fuel data are augmented by a set of socio-economic and geographical variables, which help explain the impact of economic phenomena and topography on fuel sales. The data come from the Monthly Oil Bulletin of the Italian Ministry of Environment and Energy Security (MITE), ISTAT (Istituto Nazionale di Statistica), the Bank of Italy and Eurostat. They were gathered through both automated web scraping and manual downloads, then cleaned and reshaped to be suitable for analysis. The resulting dataset may be useful for spatiotemporal fuel sales forecasting, air quality analysis, urban mobility, econometric research, and machine learning applications. To help users explore the data, an R Shiny app (freely available at https://ale-ch.shinyapps.io/it-fuel-dashboard/) was developed. The app code and the data are fully available in the GitHub repository https://github.com/ale-ch/it-fuel-dashboard. The app consists of interactive plots that let the user visualize every variable in the dataset over different time ranges and locations.
- Data for "Labeled loans and human capital investments"
  Code and database for the published article with DOI 10.1016/j.jdeveco.2023.103053.
- BayesANT: Bayesian Nonparametric Taxonomic classifier for DNA barcoding sequences
  BayesANT is a package for the taxonomic classification of DNA sequences. It trains a taxonomic classifier on a dataset of DNA barcodes and returns probabilistic predictions for query DNA sequences. BayesANT explicitly accounts for potential taxonomic novelty of the query sequences by relying on Bayesian nonparametric species sampling priors to model the taxonomic tree.
- Data and Files for Zito, Rigon and Dunson (2022): "Inferring taxonomic affiliation from DNA barcoding aiding in discovery of new taxa"
  This folder contains the data and the R code to reproduce the figures and tables in the paper Zito, Rigon and Dunson (2022), "Inferring Taxonomic placement from DNA barcoding aiding in discovery of new taxa", accepted as an open access publication in Methods in Ecology and Evolution. The file "main_FinBOL.R" reproduces the tables in the main document and in the online Supporting Information for the analysis of the FinBOL data, while "main_Simulation_Section4_SI.R" reproduces the simulation in Section 4 of the Supporting Information. All data are saved in the folder "data". For replicability, version 2.13 of the RDP classifier is included in the repository, in the folder "RDP/java". It was downloaded from https://sourceforge.net/projects/rdp-classifier/. For questions, contact the author at alessandro.zito@duke.edu.
- ropensci/osmdata: CRAN version 0.1.10
  Major changes: Changed httr dependency for httr2 (#272). Removed two authors of code formerly included for stubbing results, which is now done via the httptest2 package. Minor changes: Moved jsonlite from Imports to Suggests (now only used in tests).
- BayesANT: Bayesian Nonparametric Taxonomic classifier for DNA barcoding sequences
  BayesANT is a package for the taxonomic classification of DNA sequences. It trains a taxonomic classifier on a dataset of DNA barcodes and returns probabilistic predictions for query DNA sequences. BayesANT explicitly accounts for potential taxonomic novelty of the query sequences by relying on Bayesian nonparametric species sampling priors to model the taxonomic tree. The package works with both aligned and unaligned sequences.
- ropensci/stplanr: stplanr 1.0.1
  Fix for breaking change in dodgr (#494).
- AgrImOnIA: Open Access dataset correlating livestock and air quality in the Lombardy region, Italy
  The AgrImOnIA dataset is a comprehensive dataset relating air quality and livestock (expressed as the density of bovines and swine bred), along with weather and other variables. It represents the first step of the AgrImOnIA project. Its purpose is to make it possible to assess the impact of agriculture on air quality in Lombardy through statistical techniques capable of highlighting the relationship between the livestock sector and air pollutant concentrations. The dataset is a collection of estimated daily values for a range of measurements across several dimensions: air quality, meteorology, emissions, livestock animals and land use. The data cover Lombardy and the surrounding area for 2016-2021, inclusive. The surrounding area is obtained by applying a 0.3° buffer to the Lombardy borders. Several aggregation and interpolation methods are used to estimate the measurements for all days. For more details see the paper: A. Fassò, J. Rodeschini, A. Fusta Moro, Q. Shaboviq, P. Maranzano, M. Cameletti, F. Finazzi, N. Golini, R. Ignaccolo, and P. Otto (2022) Agrimonia: a dataset on livestock, meteorology and air quality in the Lombardy region, Italy. arXiv preprint, arXiv:2210.10604.
  The files in the folder are:
  - Agrimonia_Dataset.csv (.Rdata, .mat), which is built by joining the daily time series related to the AQ, WE, EM, LI and LA variables. To simplify access to the variables, each variable name starts with the dimension of the variable, i.e., the names of the variables related to the AQ dimension start with 'AQ_'. This file is also archived in the .mat and .Rdata formats for MATLAB and R, respectively.
  - Metadata_Agrimonia.csv, which provides further information on the sources used, the variables imported, the transformations applied, and the Agrimonia variables.
  - Metadata_AQ_imputation_uncertainty.csv, which contains the daily uncertainty estimates of the AQ observations imputed to mitigate missing data in the hourly time series.
  - Metadata_LA_CORINE_labels.csv, which contains the label and the description associated with each CLC class.
  - Metadata_monitoring_network_registry.csv, which contains all details about the AQ monitoring stations used to build the dataset. Information about pollutant stations includes station type, municipality code, environment type, altitude, pollutants sampled and other information. Each row represents a single sensor.
  - Metadata_LA_SIARL_labels.csv, which contains the label and the description associated with each SIARL class.
  The dataset can be reproduced using the code available on the GitHub page: https://github.com/AgrImOnIA-project/AgrImOnIA_Data
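The AgrImOnIA entry above notes that each variable name starts with its dimension (for example, 'AQ_' for air quality). As a minimal sketch of how that convention could be used, the Python snippet below groups the columns of a CSV header by dimension prefix. The column names in the sample header are hypothetical placeholders, not taken from the dataset; the real names are in Agrimonia_Dataset.csv and Metadata_Agrimonia.csv.

```python
import csv
import io
from collections import defaultdict

# Dimension prefixes named in the dataset description.
DIMENSIONS = {"AQ", "WE", "EM", "LI", "LA"}


def group_columns_by_dimension(header):
    """Group column names by dimension prefix; anything else goes to 'other'."""
    groups = defaultdict(list)
    for name in header:
        prefix = name.partition("_")[0]
        key = prefix if prefix in DIMENSIONS else "other"
        groups[key].append(name)
    return dict(groups)


# Hypothetical header row, used only for illustration.
sample = io.StringIO("IDStations,Time,AQ_pm10,AQ_nox,WE_temp_2m,LI_bovine\n")
header = next(csv.reader(sample))
groups = group_columns_by_dimension(header)
print(groups)
```

With the real file, the header could be read the same way by opening Agrimonia_Dataset.csv in place of the in-memory sample.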