77 datasets found

Average properties of predicted APRs in different datasets.
figshare.com
plos.figshare.com
xls
Updated Jun 5, 2023
Cite
Patrick M. Buck; Sandeep Kumar; Satish K. Singh (2023). Average properties of predicted APRs in different datasets. [Dataset]. http://doi.org/10.1371/journal.pcbi.1003291.t003
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pcbi.1003291.t003
Dataset updated
Jun 5, 2023
Dataset provided by
PLOS Computational Biology
Authors
Patrick M. Buck; Sandeep Kumar; Satish K. Singh
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
* Average length of sequences that contain at least one TANGO/WALTZ-predicted APR. † The average proportion of APR residues in the sequences contained in the different datasets was computed as described in the Methods. Sequences that do not contain APRs were excluded from both the average length and average proportion calculations. Standard deviations (σ) are reported for all averages. The standard error (SE) of the mean can be computed as σ/√n, where n is the number of sequences (see Table 2), and 95% confidence intervals as average ± 1.96 × SE.
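The standard error and confidence interval described above can be sketched in a few lines of Python. This is only an illustration of the stated formula; the function name and the example numbers are hypothetical and are not taken from the table.

```python
import math

def se_and_ci95(average, sigma, n):
    """Return (SE, (lower, upper)) for a reported average.

    SE = sigma / sqrt(n); the 95% interval is average +/- 1.96 * SE,
    following the formula quoted in the dataset description.
    """
    se = sigma / math.sqrt(n)
    half_width = 1.96 * se
    return se, (average - half_width, average + half_width)

# Hypothetical values: an average of 10.0 with sigma = 2.0 over 100 sequences.
se, (low, high) = se_and_ci95(10.0, 2.0, 100)
print(se, low, high)
```

The number of sequences n has to be taken from Table 2 of the source article, which is not reproduced in this listing.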
Global datasets to evaluate a multi-sensor approach for observation of...
zenodo.org
data.niaid.nih.gov
zip
Updated Jul 20, 2023
Cite
Dinuke Munasinghe; Renato PM Frasson; Cédric H. David; Matthew Bonnema; Guy Schumann; G. Robert Brakenridge (2023). Global datasets to evaluate a multi-sensor approach for observation of floods [Dataset]. http://doi.org/10.5281/zenodo.8164503
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.8164503
Dataset updated
Jul 20, 2023
Dataset provided by
Zenodo http://zenodo.org/
Authors
Dinuke Munasinghe; Renato PM Frasson; Cédric H. David; Matthew Bonnema; Guy Schumann; G. Robert Brakenridge
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
1. Overview

This repository contains datasets used to evaluate potential improvements to flood detectability afforded by combining data collected by Landsat, Sentinel-2, and Sentinel-1 for the first time globally. The datasets were produced as part of the manuscript "A multi-sensor approach for increased measurements of floods and their societal impacts from space" which is currently in review.

2. Dataset Descriptions

There are two datasets included here.

(a) A global grid of revisit periods of Landsat, Sentinel-1, Sentinel-2 Satellites and their combination [GlobalMedianRevisits.zip]

A global dataset of revisit periods of individual satellites and their combination based on a 0.5-degree resolution grid.
Revisit periods are defined as the time between two consecutive observations of a particular point on the surface, for the satellite missions Landsat, Sentinel-2 and Sentinel-1. The grid was created using ArcMap 10.8.1, and intersections of the grid were used to create points. For each point, average revisit times (averaged to account for irregular revisits and downlink issues) were calculated for each individual satellite and for the composite of the three satellites. The averages are based on the number of image tiles, acquired between 01 Jan 2016 and 31 Dec 2020, that intersected a particular grid point and were separated from each other by more than 30 minutes.
The following equation is used to calculate revisit periods:

Average revisit time for a grid point = (Number of days between 01 Jan 2016 and 31 Dec 2020 (1827)) / (Total Number of Images captured)

Only revisits over land grid points between 82.5° N and 55° S are considered; Antarctica is omitted from the analysis. For satellite missions that consist of two spacecraft orbiting simultaneously (Sentinel-1 A/B and Sentinel-2 A/B), images acquired by both satellites were used in the average revisit period calculation for a given grid point. The summed image-tile totals of all three missions are used to calculate composite point-based revisit times.
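As a rough illustration of the revisit-period equation, the Python sketch below divides the 1827-day study window by the image count for one grid point, and sums the counts of all missions for the composite. The function names and the image counts are hypothetical; the actual processing was done in ArcMap and is not reproduced here.

```python
from datetime import date

# Inclusive number of days in the study window (01 Jan 2016 to 31 Dec 2020).
WINDOW_DAYS = (date(2020, 12, 31) - date(2016, 1, 1)).days + 1  # 1827

def average_revisit_days(n_images):
    """Average revisit time for one grid point and one mission."""
    if n_images <= 0:
        raise ValueError("no images intersect this grid point")
    return WINDOW_DAYS / n_images

def composite_revisit_days(images_per_mission):
    """Composite revisit time: the image totals of all missions are summed."""
    return average_revisit_days(sum(images_per_mission.values()))

# Hypothetical image counts for a single grid point.
counts = {"Landsat": 100, "Sentinel-1": 200, "Sentinel-2": 309}
print(WINDOW_DAYS, composite_revisit_days(counts))
```

Because the composite uses the summed image totals, it is never longer than the revisit time of any single mission at the same point.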

(b) Average revisit periods of satellites for flood records in the DFO database [FloodInfo.zip]

Average Revisit Times of Landsat, Sentinel-1, Sentinel-2 and their ensemble are calculated for 5130 flood records in the Dartmouth Flood Observatory's (DFO) flood record database. These were appended to the already existing attributes of the database.
Matlab script Stress2Grid - Dataset - B2FIND
b2find.dkrz.de
Updated Jun 29, 2007
Cite
(2007). Matlab script Stress2Grid - Dataset - B2FIND [Dataset]. https://b2find.dkrz.de/dataset/214253a7-91ea-5351-8f46-4c162cb8ac2e
Dataset updated
Jun 29, 2007
Description
The distribution of data records for the maximum horizontal stress orientation SHmax in the Earth’s crust is sparse and very uneven. To analyse the stress pattern and its wavelength, or to predict the mean SHmax orientation on a regular grid, statistical interpolation is necessary, as conducted e.g. by Coblentz and Richardson (1995), Müller et al. (2003), Heidbach and Höhne (2008), Heidbach et al. (2010) or Reiter et al. (2014). Based on their work we wrote the Matlab® script Stress2Grid, which provides several features to analyse the mean SHmax pattern. The script facilitates and speeds up this analysis and extends the functionality compared to the aforementioned publications. It is complemented by a number of example and input files as described in the WSM Technical Report (Ziegler and Heidbach, 2017, http://doi.org/10.2312/wsm.2017.002). The script provides two different concepts to calculate the mean SHmax orientation on a regular grid. The first uses a fixed search radius around the grid point and computes the mean SHmax orientation if sufficient data records are within the search radius. The larger the search radius, the larger the filtered wavelength of the stress pattern. The second uses variable search radii and determines the search radius for which the variance of the mean SHmax orientation is below a given threshold. This approach delivers mean SHmax orientations with a user-defined degree of reliability. It resolves local stress perturbations but yields no result in areas where conflicting information results in a large variance. Furthermore, the script can estimate the deviation between the plate motion direction and the mean SHmax orientation.
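Averaging orientations needs some care because SHmax is axial data: 10° and 190° describe the same orientation. The Python sketch below shows one standard way to handle this, the doubled-angle circular mean, together with a circular variance between 0 and 1. It is an assumption-laden illustration and not the Stress2Grid code; the Matlab script may weight records by quality or distance, which is omitted here, and the function name is hypothetical.

```python
import math

def mean_orientation(azimuths_deg):
    """Mean and circular variance of axial data with a period of 180 degrees.

    Angles are doubled so that opposite azimuths coincide, averaged as unit
    vectors, and the resulting direction is halved again.
    """
    n = len(azimuths_deg)
    c = sum(math.cos(math.radians(2 * a)) for a in azimuths_deg) / n
    s = sum(math.sin(math.radians(2 * a)) for a in azimuths_deg) / n
    mean = (math.degrees(math.atan2(s, c)) / 2) % 180
    variance = 1 - math.hypot(c, s)  # 0 = identical orientations, 1 = no preferred orientation
    return mean, variance

# Hypothetical records within one search radius.
print(mean_orientation([80, 100]))
```

With a fixed search radius, a grid point would receive this mean only if enough records fall inside the radius; with variable radii, the radius would be chosen so that the variance stays below the user-defined threshold.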
Income Distribution by Quintile: Mean Household Income in Two Rivers...
neilsberg.com
csv, json
Updated Jan 11, 2024
Cite
Neilsberg Research (2024). Income Distribution by Quintile: Mean Household Income in Two Rivers Township, Minnesota [Dataset]. https://www.neilsberg.com/research/datasets/950bba80-7479-11ee-949f-3860777c1fe6/
Available download formats: csv, json
Dataset updated
Jan 11, 2024
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Minnesota, Two Rivers Township
Variables measured
Income Level, Mean Household Income
Measurement technique
The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. It delineates income distributions across income quintiles (mentioned above) following an initial analysis and categorization. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). For additional information about these estimations, please contact us via email at research@neilsberg.com
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents the mean household income for each of the five quintiles in Two Rivers Township, Minnesota, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.

Key observations

Income disparities: The mean income of the lowest quintile (20% of households with the lowest income) is 17,983, while the mean income for the highest quintile (20% of households with the highest income) is 230,047. This indicates that the top earners earn roughly 13 times as much as the lowest earners.

Top 5%: The mean household income for the wealthiest population (top 5%) is 379,359, which is 164.90% of the mean for the highest quintile and 2109.54% of the mean for the lowest quintile.
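The comparisons quoted in the key observations can be checked directly from the three means. The short Python sketch below computes each figure both as a percentage of the reference mean and as a percentage increase over it, since the two are easy to confuse; the helper names are illustrative only.

```python
LOWEST_QUINTILE = 17_983
HIGHEST_QUINTILE = 230_047
TOP_5_PERCENT = 379_359

def percent_of(value, reference):
    """Value expressed as a percentage of the reference."""
    return 100 * value / reference

def percent_higher(value, reference):
    """Percentage by which the value exceeds the reference."""
    return 100 * (value - reference) / reference

print(round(HIGHEST_QUINTILE / LOWEST_QUINTILE, 2))               # ratio of highest to lowest quintile
print(round(percent_of(TOP_5_PERCENT, HIGHEST_QUINTILE), 2))      # top 5% as a share of highest quintile
print(round(percent_higher(TOP_5_PERCENT, HIGHEST_QUINTILE), 2))  # top 5% excess over highest quintile
print(round(percent_of(TOP_5_PERCENT, LOWEST_QUINTILE), 2))       # top 5% as a share of lowest quintile
```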

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Income Levels:

Lowest Quintile

Second Quintile

Third Quintile

Fourth Quintile

Highest Quintile

Top 5 Percent

Variables / Data Columns

Income Level: This column lists the income levels (as listed above).

Mean Household Income: Mean household income, in 2022 inflation-adjusted dollars for the specific income level.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is part of the main dataset for Two Rivers Township median household income, which you can refer to for more detail.
Example Groundwater-Level Datasets and Benchmarking Results for the...
catalog.data.gov
data.usgs.gov
Updated Oct 13, 2024
Cite
U.S. Geological Survey (2024). Example Groundwater-Level Datasets and Benchmarking Results for the Automated Regional Correlation Analysis for Hydrologic Record Imputation (ARCHI) Software Package [Dataset]. https://catalog.data.gov/dataset/example-groundwater-level-datasets-and-benchmarking-results-for-the-automated-regional-cor
Dataset updated
Oct 13, 2024
Dataset provided by
U.S. Geological Survey
Description
This data release provides two example groundwater-level datasets used to benchmark the Automated Regional Correlation Analysis for Hydrologic Record Imputation (ARCHI) software package (Levy and others, 2024). The first dataset contains groundwater-level records and site metadata for wells located on Long Island, New York (NY) and some surrounding mainland sites in New York and Connecticut. The second dataset contains groundwater-level records and site metadata for wells located in the southeastern San Joaquin Valley of the Central Valley, California (CA). For ease of exposition these are referred to as NY and CA datasets, respectively. Both datasets are formatted with column headers that can be read by the ARCHI software package within the R computing environment. These datasets were used to benchmark the imputation accuracy of three ARCHI model settings (OLS, ridge, and MOVE.1) against the widely used imputation program missForest (Stekhoven and Bühlmann, 2012). The ARCHI program was used to process the NY and CA datasets on monthly and annual timesteps, respectively, filter out sites with insufficient data for imputation, and create 200 test datasets from each of the example datasets with 5 percent of observations removed at random (herein, referred to as "holdouts"). Imputation accuracy for test datasets was assessed using normalized root mean square error (NRMSE), which is the root mean square error divided by the standard deviation of the observed holdout values. ARCHI produces prediction intervals (PIs) using a non-parametric bootstrapping routine, which were assessed by computing a coverage rate (CR) defined as the proportion of holdout observations falling within the estimated PI. 
The multiple regression models included with the ARCHI package (OLS and ridge) were further tested on all test datasets at eleven different levels of the p_per_n input parameter, which limits the maximum ratio of regression model predictors (p) per observations (n) as a decimal fraction greater than zero and less than or equal to one. This data release contains ten tables formatted as tab-delimited text files. The “CA_data.txt” and “NY_data.txt” tables contain 243,094 and 89,997 depth-to-groundwater measurement values (value, in feet below land surface) indexed by site identifier (site_no) and measurement date (date) for CA and NY datasets, respectively. The “CA_sites.txt” and “NY_sites.txt” tables contain site metadata for the 4,380 and 476 unique sites included in the CA and NY datasets, respectively. The “CA_NRMSE.txt” and “NY_NRMSE.txt” tables contain NRMSE values computed by imputing 200 test datasets with 5 percent random holdouts to assess imputation accuracy for three different ARCHI model settings and missForest using CA and NY datasets, respectively. The “CA_CR.txt” and “NY_CR.txt” tables contain CR values used to evaluate non-parametric PIs generated by bootstrapping regressions with three different ARCHI model settings using the CA and NY test datasets, respectively. The “CA_p_per_n.txt” and “NY_p_per_n.txt” tables contain mean NRMSE values computed for 200 test datasets with 5 percent random holdouts at 11 different levels of p_per_n for OLS and ridge models compared to training error for the same models on the entire CA and NY datasets, respectively. References Cited Levy, Z.F., Stagnitta, T.J., and Glas, R.L., 2024, ARCHI: Automated Regional Correlation Analysis for Hydrologic Record Imputation, v1.0.0: U.S. Geological Survey software release, https://doi.org/10.5066/P1VVHWKE. Stekhoven, D.J., and Bühlmann, P., 2012, MissForest—non-parametric missing value imputation for mixed-type data: Bioinformatics 28(1), 112-118. 
https://doi.org/10.1093/bioinformatics/btr597.
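The two accuracy measures defined in the description, NRMSE and coverage rate, can be sketched in Python as follows. This is not the ARCHI implementation (ARCHI is an R package); the function names are hypothetical, and the use of the sample standard deviation is an assumption because the description does not say which estimator is used.

```python
import math
import statistics

def nrmse(observed, imputed):
    """Root mean square error divided by the standard deviation of the
    observed holdout values (sample standard deviation assumed)."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, imputed)]
    rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
    return rmse / statistics.stdev(observed)

def coverage_rate(observed, lower, upper):
    """Proportion of holdout observations falling within their prediction interval."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return inside / len(observed)

# Hypothetical holdout values and imputations.
print(nrmse([1, 2, 3, 4, 5], [1, 2, 3, 4, 6]))
print(coverage_rate([1, 2, 3, 4], [0, 0, 0, 0], [2, 2, 2, 2]))
```

An NRMSE below 1 means the imputation error is smaller than the natural spread of the held-out observations.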
Seasonal Average Wind Speed - Projections (5km)
climatedataportal.metoffice.gov.uk
Updated Dec 4, 2023
Cite
Met Office (2023). Seasonal Average Wind Speed - Projections (5km) [Dataset]. https://climatedataportal.metoffice.gov.uk/datasets/6b2bee0ed29749caaaf9c49f5ddd3a7f
Dataset updated
Dec 4, 2023
Dataset authored and provided by
Met Office http://www.metoffice.gov.uk/
Description
What does the data show?

The dataset is derived from projections of seasonal mean wind speeds from UKCP18 which are averaged to produce values for the 1981-2000 baseline and two warming levels: 2.0°C and 4.0°C above the pre-industrial (1850-1900) period. All wind speeds have units of metres per second (m / s). These data enable users to compare future seasonal mean wind speeds to those of the baseline period.

What is a warming level and why are they used?

The wind speeds were calculated from the UKCP18 local climate projections which used a high emissions scenario (RCP 8.5) where greenhouse gas emissions continue to grow. Instead of considering future climate change during specific time periods (e.g., decades) for this scenario, the dataset is calculated at two levels of global warming relative to the pre-industrial (1850-1900) period. The world has already warmed by around 1.1°C (between 1850–1900 and 2011–2020), so this dataset allows for the exploration of greater levels of warming.

The global warming levels available in this dataset are 2°C and 4°C in line with recommendations in the third UK Climate Risk Assessment. The data at each warming level were calculated using 20 year periods over which the average warming was equal to 2°C and 4°C. The exact time period will be different for different model ensemble members. To calculate the seasonal mean wind speeds, an average is taken across the 20 year period. Therefore, the seasonal wind speeds represent those for a given level of warming.

We cannot provide a precise likelihood for particular emission scenarios being followed in the real world in the future. However, we do note that RCP8.5 corresponds to emissions considerably above those expected under current international policy agreements. The results are also expressed for several global warming levels because we do not yet know which level will be reached in the real climate; the warming level reached will depend on future greenhouse emission choices and the sensitivity of the climate system, which is uncertain. Estimates based on the assumption of current international agreements on greenhouse gas emissions suggest a median warming level in the region of 2.4-2.8°C, but it could either be higher or lower than this level.

What are the naming conventions and how do I explore the data?

The columns (fields) correspond to each global warming level and two baselines. They are named 'windspeed' (Wind Speed), the season, warming level or baseline, and ‘upper’ ‘median’ or ‘lower’ as per the description below. For example, ‘windspeed winter 2.0 median’ is the median winter wind speed for the 2°C projection. Decimal points are included in field aliases but not field names; e.g., ‘windspeed winter 2.0 median’ is ‘ws_winter_20_median’.
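The alias-to-name rule in the example above can be sketched as a small Python helper. It is inferred only from the single example given ('windspeed winter 2.0 median' becomes 'ws_winter_20_median'); how baseline fields are named is not stated here, so the helper is an assumption and should be checked against the actual field list.

```python
def alias_to_field_name(alias):
    """Convert a field alias to a field name, following the one documented example.

    'windspeed' is abbreviated to 'ws', decimal points are dropped, and the
    parts are joined with underscores.
    """
    parts = alias.split()
    parts = ["ws" if p == "windspeed" else p.replace(".", "") for p in parts]
    return "_".join(parts)

print(alias_to_field_name("windspeed winter 2.0 median"))
```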

To understand how to explore the data, see this page: https://storymaps.arcgis.com/stories/457e7a2bc73e40b089fac0e47c63a578

What do the ‘median’, ‘upper’, and ‘lower’ values mean?

Climate models are numerical representations of the climate system. To capture uncertainty in projections for the future, an ensemble, or group, of climate models is run. Each ensemble member has slightly different starting conditions or model set-ups. Considering all of the model outcomes gives users a range of plausible conditions which could occur in the future.

For this dataset, the model projections consist of 12 separate ensemble members. To select which ensemble members to use, seasonal mean wind speeds were calculated for each ensemble member and then ranked in order from lowest to highest for each location.

The ‘lower’ fields are the second lowest ranked ensemble member. The ‘upper’ fields are the second highest ranked ensemble member. The ‘median’ field is the central value of the ensemble.

This gives a median value, and a spread of the ensemble members indicating the range of possible outcomes in the projections. This spread of outputs can be used to infer the uncertainty in the projections. The larger the difference between the lower and upper fields, the greater the uncertainty.
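The selection of the 'lower', 'median' and 'upper' values for one location can be sketched in Python as below. The ranking rule follows the description; taking the median of 12 members as the mean of the two central values is an assumption, and the wind speeds are made up for illustration.

```python
import statistics

def ensemble_summary(member_values):
    """Return (lower, median, upper) for one location and season.

    lower = second lowest ranked member, upper = second highest ranked member,
    median = central value of the ensemble.
    """
    ranked = sorted(member_values)
    if len(ranked) < 3:
        raise ValueError("need at least three ensemble members")
    return ranked[1], statistics.median(ranked), ranked[-2]

# Hypothetical seasonal mean wind speeds (m/s) from 12 ensemble members.
speeds = [4.1, 3.8, 4.6, 4.0, 4.3, 3.9, 4.4, 4.2, 4.7, 3.7, 4.5, 4.8]
lower, median, upper = ensemble_summary(speeds)
print(lower, median, upper, upper - lower)
```

The difference between upper and lower is the spread that the description suggests using as an indication of uncertainty.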

‘Lower’, ‘median’ and ‘upper’ are also given for the baseline periods as these values also come from the model that was used to produce the projections. This allows a fair comparison between the model projections and recent past.

Data source

The seasonal mean wind speeds were calculated from daily values of wind speeds generated from the UKCP Local climate projections; they are one of the standard UKCP18 products. These projections were created with a 2.2km convection-permitting climate model. To aid comparison with other models and UK-based datasets, the UKCP Local model data were aggregated to a 5km grid on the British National grid; the 5km data were processed to generate the seasonal mean wind speeds.

Useful links

Further information on the UK Climate Projections (UKCP). Further information on understanding climate data within the Met Office Climate Data Portal.
Data from: Median Household Income
geohub.lacity.org
data.lacounty.gov
Updated Dec 21, 2023
Cite
County of Los Angeles (2023). Median Household Income [Dataset]. https://geohub.lacity.org/datasets/lacounty::median-household-income-2
Dataset updated
Dec 21, 2023
Dataset authored and provided by
County of Los Angeles
Description
Includes median household income in the past twelve months (in 2022 inflation-adjusted dollars). Geography-specific median household incomes are calculated as the population-weighted averages of the median household incomes within their respective 2020 census tracts. Median household income is defined as the amount that divides the household income distribution of a population into two equal groups; half of the population has a household income above that amount, whereas the other half has a household income below that amount. Household income is an important driver of life expectancy and other health outcomes, as individuals with higher household incomes, on average, experience better health and live longer than individuals with lower household incomes. This is largely due to increased access to opportunities, resources, and healthier living conditions that higher income individuals experience compared to lower income individuals. For more information about the Community Health Profiles Data Initiative, please see the initiative homepage.
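A population-weighted average of tract medians can be sketched in Python as follows. The tract values are invented for illustration, and the function name is hypothetical; note that the result is a weighted average of medians, not the true median of the combined population.

```python
def population_weighted_income(tracts):
    """Population-weighted average of tract-level median household incomes.

    `tracts` is a list of (median_income, population) pairs.
    """
    total_population = sum(pop for _, pop in tracts)
    if total_population == 0:
        raise ValueError("total population is zero")
    return sum(income * pop for income, pop in tracts) / total_population

# Hypothetical census tracts within one geography.
print(population_weighted_income([(50_000, 1_000), (100_000, 3_000)]))
```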
Data from: An Uncertainty-Aware Approach to Optimal Configuration of Stream...
data.niaid.nih.gov
zenodo.org
Updated Jan 24, 2020
Cite
Casale, Giuliano (2020). An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing Systems [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_56238
Dataset updated
Jan 24, 2020
Dataset provided by
Jamshidi, Pooyan
Casale, Giuliano
License
https://opensource.org/licenses/BSD-3-Clause
Description
The datasets in this release support the results presented in the paper

P. Jamshidi, G. Casale, "An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing Systems", accepted for presentation at MASCOTS 2016.

An open-access version of the paper is available at https://arxiv.org/abs/1606.06543

Open-source code is also available at https://github.com/dice-project/DICE-Configuration-BO4CO

The archive contains 10 comma-separated datasets representing performance measurements (throughput and latency) for 3 different stream benchmark applications. These were collected experimentally on 5 different cloud clusters over the course of 3 months (24/7). Each row in a dataset represents a different configuration setting for the application, and the last two columns represent the average performance of the application measured over the course of 10 minutes under that specific configuration setting. The datasets contain full factorial, exhaustive measurements for all possible settings, limited to a predetermined interval for each variable. Each dataset is named in the following format: "benchmark_application-dimensions-cluster_name". For example, "wc-6d-c1" refers to the WordCount benchmark application with 6 dimensions (i.e., we varied 6 configuration parameters), deployed on the c1 cluster (OpenNebula, see Appendix). This resulted in a dataset of size 2880, i.e., it took 2880 × 10 min = 480 h = 20 days to collect the data.
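The naming scheme and the collection-time arithmetic can be illustrated with a short Python sketch. The helper names are hypothetical, and the parser assumes that the benchmark and dimension parts contain no hyphens, which holds for the one example given.

```python
MINUTES_PER_MEASUREMENT = 10  # each configuration was measured for 10 minutes

def parse_dataset_name(name):
    """Split 'benchmark_application-dimensions-cluster_name', e.g. 'wc-6d-c1'."""
    benchmark, dims, cluster = name.split("-", 2)
    return benchmark, int(dims.rstrip("d")), cluster

def collection_days(n_configurations):
    """Wall-clock days needed to measure every configuration once."""
    return n_configurations * MINUTES_PER_MEASUREMENT / 60 / 24

print(parse_dataset_name("wc-6d-c1"))
print(collection_days(2880))
```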

For more information about the data refer to the appendix of the paper: https://arxiv.org/abs/1606.06543.

When referring to the dataset or code please cite the paper above.
2010 County and City-Level Water-Use Data and Associated Explanatory...
catalog.data.gov
data.usgs.gov
Updated Jul 6, 2024
Cite
U.S. Geological Survey (2024). 2010 County and City-Level Water-Use Data and Associated Explanatory Variables [Dataset]. https://catalog.data.gov/dataset/2010-county-and-city-level-water-use-data-and-associated-explanatory-variables
Dataset updated
Jul 6, 2024
Dataset provided by
U.S. Geological Survey
Description
This data release contains the input-data files and R scripts associated with the analysis presented in [citation of manuscript]. The spatial extent of the data is the contiguous U.S. The input-data files include one comma separated value (csv) file of county-level data, and one csv file of city-level data. The county-level csv (“county_data.csv”) contains data for 3,109 counties. These data include two measures of water use, descriptive information about each county, three grouping variables (climate region, urban class, and economic dependency), and 18 explanatory variables: proportion of population growth from 2000-2010, fraction of withdrawals from surface water, average daily water yield, mean annual maximum temperature from 1970-2010, 2005-2010 maximum temperature departure from the 40-year maximum, mean annual precipitation from 1970-2010, 2005-2010 mean precipitation departure from the 40-year mean, Gini income disparity index, percent of county population with at least some college education, Cook Partisan Voting Index, housing density, median household income, average number of people per household, median age of structures, percent of renters, percent of single family homes, percent apartments, and a numeric version of urban class. The city-level csv (“city_data.csv”) contains data for 83 cities. These data include descriptive information for each city, water-use measures, one grouping variable (climate region), and 6 explanatory variables: type of water bill (increasing block rate, decreasing block rate, or uniform), average price of water bill, number of requirement-oriented water conservation policies, number of rebate-oriented water conservation policies, aridity index, and regional price parity. The R scripts construct fixed-effects and Bayesian hierarchical regression models. The primary difference between these models relates to how they handle possible clustering in the observations that define unique water-use settings.
Fixed-effects models address possible clustering in one of two ways. In a "fully pooled" fixed-effects model, any clustering by group is ignored, and a single, fixed estimate of the coefficient for each covariate is developed using all of the observations. Conversely, in an unpooled fixed-effects model, separate coefficient estimates are developed only using the observations in each group. A hierarchical model provides a compromise between these two extremes. Hierarchical models extend single-level regression to data with a nested structure, whereby the model parameters vary at different levels in the model, including a lower level that describes the actual data and an upper level that influences the values taken by parameters in the lower level. The county-level models were compared using the Watanabe-Akaike information criterion (WAIC) which is derived from the log pointwise predictive density of the models and can be shown to approximate out-of-sample predictive performance. All script files are intended to be used with R statistical software (R Core Team (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org) and Stan probabilistic modeling software (Stan Development Team. 2017. RStan: the R interface to Stan. R package version 2.16.2. http://mc-stan.org).
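The difference between the fully pooled and unpooled extremes can be illustrated with a minimal Python sketch using one covariate and ordinary least squares. It is not the data release's R or Stan code, the records are invented, and the hierarchical model itself is not implemented; its group-level estimates would generally fall between the two extremes shown here.

```python
def ols_fit(xs, ys):
    """Simple least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def pooled_and_unpooled(records):
    """`records` is a list of (group, x, y) tuples.

    The pooled fit ignores the grouping; the unpooled fits use only the
    observations within each group.
    """
    pooled = ols_fit([x for _, x, _ in records], [y for _, _, y in records])
    by_group = {}
    for group, x, y in records:
        xs, ys = by_group.setdefault(group, ([], []))
        xs.append(x)
        ys.append(y)
    unpooled = {g: ols_fit(xs, ys) for g, (xs, ys) in by_group.items()}
    return pooled, unpooled

# Hypothetical data: two groups with opposite trends.
data = [("A", 0, 0), ("A", 1, 2), ("A", 2, 4), ("B", 0, 10), ("B", 1, 9), ("B", 2, 8)]
print(pooled_and_unpooled(data))
```

In this made-up example the pooled slope hides the fact that the two groups trend in opposite directions, which is the kind of clustering the description refers to.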
Global - Annual Average Methane Concentration - Dataset - ENERGYDATA.INFO
energydata.info
Updated Oct 22, 2019
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2019). Global - Annual Average Methane Concentration - Dataset - ENERGYDATA.INFO [Dataset]. https://energydata.info/dataset/global-annual-average-methane-concentration-2016
Dataset updated
Oct 22, 2019
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This raster maps average atmospheric methane concentrations in 2016. The data used to create it are from the NASA Aqua satellite (specifically the AIRS instrument), which records monthly average atmospheric methane concentrations. AIRS collects methane data at different pressure levels. The raster depicts data at the 400 hPa level because that is where the instrument is most sensitive to methane concentration. The monthly data were consolidated using the NASA tool Giovanni (https://giovanni.gsfc.nasa.gov/giovanni/) to create a raster with annual average methane concentrations. Giovanni output two rasters: one for daytime averaged data and one for nighttime averaged data. ArcGIS was then used to combine the two rasters into a single annual raster. The final raster was multiplied by a conversion factor of 1.0e+9 to convert it from mole fractions to parts per billion. The results are attached. The Carnegie Endowment for International Peace would like to eventually incorporate a methane raster as a new layer in the Oil Climate Index (OCI) web tool (http://oci.carnegieendowment.org/). Carnegie is currently updating the OCI, adding greenhouse gas comparisons of global gas fields and visualizing their methane emissions. Carnegie is planning to work with its OCI partners at Stanford to further analyze the methane concentration raster, to separate signal from noise, and to identify potential methane concentration hot spots associated with oil and gas operations. This raster is useful when studying short-term climate risks, especially when it comes to Arctic oil and gas resources.
ERA5 hourly data on single levels from 1940 to present
cds.climate.copernicus.eu
arcticdata.io
grib
Updated Mar 26, 2025
Cite
ECMWF (2025). ERA5 hourly data on single levels from 1940 to present [Dataset]. http://doi.org/10.24381/cds.adbb2d47
Available download formats: grib
Unique identifier
https://doi.org/10.24381/cds.adbb2d47
Dataset updated
Mar 26, 2025
Dataset provided by
European Centre for Medium-Range Weather Forecasts http://ecmwf.int/
Authors
ECMWF
License
https://object-store.os-api.cci2.ecmwf.int:443/cci2-prod-catalogue/licences/licence-to-use-copernicus-products/licence-to-use-copernicus-products_b4b9451f54cffa16ecef5c912c9cebd6979925a956e3fa677976e0cf198c2c18.pdf
Time period covered
Jan 1, 1959 - Mar 20, 2025
Description
ERA5 is the fifth-generation ECMWF reanalysis of the global climate and weather for the past 8 decades. Data is available from 1940 onwards. ERA5 replaces the ERA-Interim reanalysis.

Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset using the laws of physics. This principle, called data assimilation, is based on the method used by numerical weather prediction centres, where every so many hours (12 hours at ECMWF) a previous forecast is combined with newly available observations in an optimal way to produce a new best estimate of the state of the atmosphere, called analysis, from which an updated, improved forecast is issued. Reanalysis works in the same way, but at reduced resolution to allow for the provision of a dataset spanning back several decades. Reanalysis does not have the constraint of issuing timely forecasts, so there is more time to collect observations and, when going further back in time, to ingest improved versions of the original observations, all of which benefits the quality of the reanalysis product.

ERA5 provides hourly estimates for a large number of atmospheric, ocean-wave and land-surface quantities. An uncertainty estimate is sampled by an underlying 10-member ensemble at three-hourly intervals. Ensemble mean and spread have been pre-computed for convenience. Such uncertainty estimates are closely related to the information content of the available observing system, which has evolved considerably over time. They also indicate flow-dependent sensitive areas. To facilitate many climate applications, monthly-mean averages have been pre-calculated too, though monthly means are not available for the ensemble mean and spread.

ERA5 is updated daily with a latency of about 5 days. If serious flaws are detected in this early release (called ERA5T), this data could differ from the final release 2 to 3 months later. If this occurs, users are notified.

The data set presented here is a regridded subset of the full ERA5 data set on native resolution. It is online on spinning disk, which should ensure fast and easy access. It should satisfy the requirements for most common applications. An overview of all ERA5 datasets can be found in this article. Information on access to ERA5 data on native resolution is provided in these guidelines. Data has been regridded to a regular lat-lon grid of 0.25 degrees for the reanalysis and 0.5 degrees for the uncertainty estimate (0.5 and 1 degree respectively for ocean waves). There are four main subsets: hourly and monthly products, both on pressure levels (upper air fields) and single levels (atmospheric, ocean-wave and land surface quantities). The present entry is "ERA5 hourly data on single levels from 1940 to present".
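
To give a feel for the grid spacings quoted above, the following minimal sketch computes how many points a global regular lat-lon grid has at each spacing. It assumes latitudes run from 90 to -90 inclusive and longitudes from 0 up to, but not including, 360; the function name `grid_shape` is illustrative and not part of any ERA5 tooling.

```python
def grid_shape(step_deg):
    """Return (n_lat, n_lon) for a global regular lat-lon grid.

    Assumption: both poles are grid rows, and longitude 360 is not
    repeated because it coincides with longitude 0.
    """
    n_lat = round(180 / step_deg) + 1
    n_lon = round(360 / step_deg)
    return n_lat, n_lon


# Spacings quoted in the description above.
for label, step in [
    ("reanalysis", 0.25),
    ("uncertainty estimate", 0.5),
    ("ocean waves, reanalysis", 0.5),
    ("ocean waves, uncertainty estimate", 1.0),
]:
    print(label, step, grid_shape(step))
```

Under these assumptions, halving the grid spacing roughly quadruples the number of points per field, which is one reason the uncertainty estimate is delivered on the coarser grid.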
Data for Calculating Efficient Outdoor Water Uses
catalog.data.gov
data.cnra.ca.gov
Updated May 14, 2024
Cite
California Department of Water Resources (2024). Data for Calculating Efficient Outdoor Water Uses [Dataset]. https://catalog.data.gov/dataset/data-for-calculating-efficient-outdoor-water-uses-147dd
Dataset updated
May 14, 2024
Dataset provided by
California Department of Water Resources
Description
December 6, 2023 (Final DWR Data)

The 2018 Legislation required DWR to provide or otherwise identify data regarding the unique local conditions to support the calculation of an urban water use objective (CWC 10609(b)(2)(C)). The urban water use objective (UWUO) is an estimate of aggregate efficient water use for the previous year based on adopted water use efficiency standards and local service area characteristics for that year. UWUO is calculated as the sum of efficient indoor residential water use, efficient outdoor residential water use, efficient outdoor irrigation of landscape areas with dedicated irrigation meters for Commercial, Industrial, and Institutional (CII) water use, efficient water losses, and an estimated water use in accordance with variances, as appropriate. Details of urban water use objective calculations can be obtained from DWR’s Recommendations for Guidelines and Methodologies document (Recommendations for Guidelines and Methodologies for Calculating Urban Water Use Objective - https://water.ca.gov/-/media/DWR-Website/Web-Pages/Programs/Water-Use-And-Efficiency/2018-Water-Conservation-Legislation/Performance-Measures/UWUO_GM_WUES-DWR-2021-01B_COMPLETE.pdf).

The datasets provided in the links below enable urban retail water suppliers to calculate efficient outdoor water uses (both residential and CII), agricultural variances, variances for significant uses of water for dust control for horse corrals, and temporary provisions for water use for existing pools (as stated in Water Boards’ draft regulation). DWR will provide technical assistance for estimating the remaining UWUO components, as needed.

Data for calculating outdoor water uses include:

• Reference evapotranspiration (ETo): ETo is evaporation from plant and soil surfaces plus transpiration through the leaves of the standardized grass surfaces over which weather stations stand. Standardization of the surfaces is required because evapotranspiration (ET) depends on combinations of several factors, making it impractical to take measurements under all sets of conditions. Plant factors, known as crop coefficients (Kc) or landscape coefficients (KL), are used to convert ETo to actual water use by a specific crop or plant. The ETo data that DWR provides to urban retail water suppliers for urban water use objective calculation purposes is derived from the California Irrigation Management Information System (CIMIS) program (https://cimis.water.ca.gov/). CIMIS is a network of over 150 automated weather stations throughout the state that measure weather data used to estimate ETo. CIMIS also provides daily maps of ETo on a 2-km grid using the Spatial CIMIS modeling approach, which couples satellite data with point measurements. The ETo data provided below for each urban retail water supplier is an area-weighted average value from the Spatial CIMIS ETo.

• Effective precipitation (Peff): Peff is the portion of total precipitation that becomes available for plant growth. Peff is affected by soil type, slope, land cover type, and intensity and duration of rainfall. DWR is using a soil water balance model, known as Cal-SIMETAW, to estimate daily Peff on a 4-km grid, and an area-weighted average value is calculated at the service area level. Cal-SIMETAW is a model that was developed by UC Davis and DWR, and it is widely used to quantify agricultural, and to some extent urban, water uses for the publication of DWR’s Water Plan Update. Peff from Cal-SIMETAW is capped at 25% of total precipitation to account for potential uncertainties in its estimation. Daily Peff at each grid point is aggregated to produce weighted average annual or seasonal Peff at the service area level. The total precipitation that Cal-SIMETAW uses to estimate Peff comes from the Parameter-elevation Relationships on Independent Slopes Model (PRISM), which is a climate mapping model developed by the PRISM Climate Group at Oregon State University.

• Residential Landscape Area Measurement (LAM): The 2018 Legislation required DWR to provide each urban retail water supplier with data regarding the area of residential irrigable lands in a manner that can reasonably be applied to the standards (CWC 10609.6(b)). DWR delivered the LAM data to all retail water suppliers, and a tabular summary of selected data types will be provided here. The data summary that is provided in this file contains irrigable-irrigated (II), irrigable-not-irrigated (INI), and not irrigable (NI) irrigation status classes, as well as horse corral areas (HCL_area), agricultural areas (Ag_area), and pool areas (Pool_area) for all retail suppliers.
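
The two numerical steps described for effective precipitation, the 25% cap and the area-weighted average over a service area, can be sketched as follows. This is only an illustration under stated assumptions: the grid-cell values are hypothetical, the function names are made up for this example, and the description does not say at which time step (daily or aggregated) DWR applies the cap.

```python
def capped_peff(peff, total_precip, cap_fraction=0.25):
    """Cap effective precipitation at a fraction of total precipitation."""
    return min(peff, cap_fraction * total_precip)


def area_weighted_mean(values, areas):
    """Average grid-cell values, weighting each by its area in the service area."""
    total_area = sum(areas)
    if total_area <= 0:
        raise ValueError("total area must be positive")
    return sum(v * a for v, a in zip(values, areas)) / total_area


# Hypothetical grid cells: (model Peff, total precipitation, overlap area).
# Units are arbitrary but must be consistent across cells.
cells = [(3.0, 10.0, 2.0), (1.0, 8.0, 1.0), (5.0, 12.0, 1.0)]

peff_capped = [capped_peff(p, tp) for p, tp, _ in cells]
areas = [area for _, _, area in cells]
service_area_peff = area_weighted_mean(peff_capped, areas)
print(peff_capped, service_area_peff)
```

The same `area_weighted_mean` helper would apply to the Spatial CIMIS ETo values, since the description states that those are also reported as area-weighted averages per supplier.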
PhysioIntent: Multimodal dataset for human intention prediction research
data.niaid.nih.gov
zenodo.org
Updated Jul 16, 2024
Cite
Ana Filipa Ferreira (2024). PhysioIntent: Multimodal dataset for human intention prediction research [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7054397
Dataset updated
Jul 16, 2024
Dataset provided by
Ana Filipa Ferreira
Vitor Minhoto
João Paulo Cunha
Duarte Dias
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The PhysioIntent database was acquired to research human movement intention through biosignals (electromyogram (EMG), electroencephalogram (EEG) and electrocardiogram (ECG)) and inertial data (9-axis). It was acquired at INESC TEC using several integrated wearable devices to monitor human activity during a defined protocol.

The PhysioIntent database was acquired during a master thesis at INESC TEC. The master thesis will be published in October. The dataset was built to research human movement intention through biosignals (electromyogram (EMG), electroencephalogram (EEG) and electrocardiogram (ECG)) using the Cyton board from OpenBCI [1]. Inertial data (9-axis) was also recorded with a proprietary device from INESC TEC named iHandU [2]. A Logitech C270 HD camera was also used to record video of each participant’s session, which supports post-processing of the recorded data and checking that the participant’s activity matches the protocol. All data was then synchronized with the aid of a photoresistor, correlating the visual stimuli presented to the user with the signals acquired.

The acquisitions are divided into two phases. The 2nd phase was performed to address setbacks encountered in the 1st phase, such as data loss and synchronization issues. The 1st phase included 6 healthy volunteers (age range = 22 to 25; average age = 22.3±0.9; 2 males and 4 females; all right-handed). The 2nd phase included 3 healthy volunteers (age range = 20 to 26; average age = 22.6±2.5; 2 males and 1 female; all right-handed).

The protocol consists of the execution and imagination of upper-limb movements, which are repeated several times throughout the protocol. There are three different movements in a session: hand-grasping, wrist supination, and pick and place. Each sequence of movements (imagination and execution), together with the resting periods, is called a trial. A run is a sequence of trials that ends with a 60 s break.

Phase 1 has a total of four runs with fifteen trials each, while phase 2 has five runs with eighteen trials each. In phase 1, each movement is imagined and executed 5 times in every run, giving a total of 20 repetitions per movement in each session. In phase 2, each movement is imagined and executed 6 times in every run, giving 30 repetitions per movement in each session.
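
The trial counts above follow from simple arithmetic, which the short sketch below reproduces. It assumes that one trial covers exactly one movement (its imagination, execution and rest), which is how the figures in the description add up; the names `PHASES` and `session_counts` are illustrative and not taken from the dataset's supporting scripts.

```python
MOVEMENTS = ("hand-grasping", "wrist supination", "pick and place")

# Per-phase protocol parameters as stated in the description.
PHASES = {
    "phase 1": {"runs": 4, "reps_per_movement_per_run": 5},
    "phase 2": {"runs": 5, "reps_per_movement_per_run": 6},
}


def session_counts(runs, reps_per_movement_per_run):
    """Return (trials per run, repetitions per movement per session)."""
    trials_per_run = len(MOVEMENTS) * reps_per_movement_per_run
    reps_per_movement = runs * reps_per_movement_per_run
    return trials_per_run, reps_per_movement


for name, params in PHASES.items():
    print(name, session_counts(**params))
```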

In phase 1, four muscles were measured: biceps brachii, triceps brachii, flexor carpi radialis, and extensor digitorum. For the EEG, the measured channels were: FP1, FP2, FCZ, C3, CZ, C4, CP3, CP4, P3, and P4. During phase 2, only one muscle, extensor digitorum, was measured. For the EEG, the channels measured were: FP1, FP2, FC3, FCz, FC4, C1, C3, Cz, C2, C4, CP3, CP4, P3, and P4.

Before the experiments, the participants were informed about the experimental protocols, paradigms, and purpose. After ensuring they understood the information, the participants signed a written consent approved by the DPO from INESC TEC.

All files are grouped by subject. A detailed description of how the files are organized is given in the README file. There is also an extra folder called "PhysioIntent supporting material" containing additional material, including a script with functions to help read the data, a description of the experimental protocol, and the setup created for each phase. For each subject, the data is organized according to the data model ("Subject_data_storage_model"), which shows that each type of data is stored in a different folder. For the biosignals (openBCI/ folder), both raw and processed data are provided. Some subjects have an additional README file with particular details of the acquisition.

[1] Cyton + Daisy Biosensing Boards (16-Channels). (2022). Retrieved 23 August 2022, from https://shop.openbci.com/products

[2] Oliveira, Ana, Duarte Dias, Elodie Múrias Lopes, Maria do Carmo Vilas-Boas, and João Paulo Silva Cunha. "SnapKi—An Inertial Easy-to-Adapt Wearable Textile Device for Movement Quantification of Neurological Patients." Sensors 20, no. 14 (2020): 3875.
‘Avocado Prices’ analyzed by Analyst-2
analyst-2.ai
Updated Aug 4, 2020
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2020). ‘Avocado Prices’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-avocado-prices-3f64/94480f27/?iid=007-438&v=presentation
Dataset updated
Aug 4, 2020
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Avocado Prices’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/neuromusic/avocado-prices on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Context

It is a well-known fact that Millennials LOVE Avocado Toast. It's also a well-known fact that all Millennials live in their parents' basements.

Clearly, they aren't buying homes because they are buying too much Avocado Toast!

But maybe there's hope... if a Millennial could find a city with cheap avocados, they could live out the Millennial American Dream.

Content

This data was downloaded from the Hass Avocado Board website in May of 2018 & compiled into a single CSV. Here's how the Hass Avocado Board describes the data on their website:

The table below represents weekly 2018 retail scan data for National retail volume (units) and price. Retail scan data comes directly from retailers’ cash registers based on actual retail sales of Hass avocados. Starting in 2013, the table below reflects an expanded, multi-outlet retail data set. Multi-outlet reporting includes an aggregation of the following channels: grocery, mass, club, drug, dollar and military. The Average Price (of avocados) in the table reflects a per unit (per avocado) cost, even when multiple units (avocados) are sold in bags. The Product Lookup codes (PLU’s) in the table are only for Hass avocados. Other varieties of avocados (e.g. greenskins) are not included in this table.

Some relevant columns in the dataset:

Date - The date of the observation

AveragePrice - the average price of a single avocado

type - conventional or organic

year - the year

Region - the city or region of the observation

Total Volume - Total number of avocados sold

4046 - Total number of avocados with PLU 4046 sold

4225 - Total number of avocados with PLU 4225 sold

4770 - Total number of avocados with PLU 4770 sold
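
As a rough illustration of how the columns listed above could be used, the sketch below averages `AveragePrice` per `Region` with the standard library only. The three sample rows are hypothetical placeholders, not values from the dataset, and the column names are taken from the list above; the actual CSV header may differ in spelling or case.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows using a subset of the documented columns.
SAMPLE = """Date,AveragePrice,type,year,Region,Total Volume
2018-01-07,1.00,conventional,2018,CityA,1000
2018-01-14,1.50,conventional,2018,CityA,3000
2018-01-07,2.00,organic,2018,CityB,500
"""


def mean_price_by_region(csv_text):
    """Return the unweighted mean of AveragePrice for each Region."""
    sums = defaultdict(lambda: [0.0, 0])  # region -> [price total, row count]
    for row in csv.DictReader(io.StringIO(csv_text)):
        acc = sums[row["Region"]]
        acc[0] += float(row["AveragePrice"])
        acc[1] += 1
    return {region: total / count for region, (total, count) in sums.items()}


print(mean_price_by_region(SAMPLE))
```

This treats every weekly observation equally. Weighting each row by `Total Volume` would be a reasonable alternative when comparing regions with very different sales volumes.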

Acknowledgements

Many thanks to the Hass Avocado Board for sharing this data!!

http://www.hassavocadoboard.com/retail/volume-and-price-data

Inspiration

In which cities can Millennials have their avocado toast AND buy a home?

Was the Avocadopocalypse of 2017 real?

--- Original source retains full ownership of the source dataset ---
Comparison of the methods in this paper on three different datasets.
plos.figshare.com
xls
Updated Feb 26, 2025
Cite
Ge Wang; Zikai Sun; Weiyang HU; MengHuan Cai (2025). Comparison of the methods in this paper on three different datasets. [Dataset]. http://doi.org/10.1371/journal.pone.0310992.t005
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0310992.t005
Dataset updated
Feb 26, 2025
Dataset provided by
PLOS ONE
Authors
Ge Wang; Zikai Sun; Weiyang HU; MengHuan Cai
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Comparison of the methods in this paper on three different datasets.
Average Maximum Afternoon Temperature (F)
s.cnmilf.com
data.seattle.gov
Updated Feb 28, 2025
Cite
City of Seattle ArcGIS Online (2025). Average Maximum Afternoon Temperature (F) [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/average-maximum-afternoon-temperature-f
Dataset updated
Feb 28, 2025
Dataset provided by
Description
This data layer references data from a high-resolution tree canopy change-detection layer for Seattle, Washington. Tree canopy change was mapped by using remotely sensed data from two time periods (2016 and 2021). Tree canopy was assigned to three classes: 1) no change, 2) gain, and 3) loss. No change represents tree canopy that remained the same from one time period to the next. Gain represents tree canopy that increased or was newly added, from one time period to the next. Loss represents the tree canopy that was removed from one time period to the next.

Mapping was carried out using an approach that integrated automated feature extraction with manual edits. Care was taken to ensure that changes to the tree canopy were due to actual change in the land cover as opposed to differences in the remotely sensed data stemming from lighting conditions or image parallax. Direct comparison was possible because land-cover maps from both time periods were created using object-based image analysis (OBIA) and included similar source datasets (LiDAR-derived surface models, multispectral imagery, and thematic GIS inputs). OBIA systems work by grouping pixels into meaningful objects based on their spectral and spatial properties, while taking into account boundaries imposed by existing vector datasets. Within the OBIA environment a rule-based expert system was designed to effectively mimic the process of manual image analysis by incorporating the elements of image interpretation (color/tone, texture, pattern, location, size, and shape) into the classification process. A series of morphological procedures were employed to ensure that the end product is both accurate and cartographically pleasing. No accuracy assessment was conducted, but the dataset was subjected to manual review and correction.

University of Vermont Spatial Analysis Laboratory

This dataset consists of hexagons 50 acres in area, or several city blocks. The dataset covers the following tree canopy categories:

Existing tree canopy percent
Possible tree canopy - vegetation percent
Relative percent change
Absolute percent change
Average maximum afternoon temperature (F)
Tree canopy percentage & average afternoon temperature (F)

For more information, please see the 2021 Tree Canopy Assessment.
Mid-year population estimates
find.eks.integration.govuk.digital
data.europa.eu
csv, json1.0, json2.0 +1
Updated Sep 5, 2024
Cite
OpenDataNI (2024). Mid-year population estimates [Dataset]. https://find.eks.integration.govuk.digital/dataset/f061e414-eab5-4620-881c-cac63f84f09d/mid-year-population-estimates
Explore at:
Available download formats: json2.0, csv, json1.0, xlsx
Dataset updated
Sep 5, 2024
Dataset authored and provided by
OpenDataNI
License
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Description
Description of Data

Population estimates for the 850 Super Data Zones in Northern Ireland were published on 25th July 2024.

Time Period

Estimates are provided for mid-2021 and mid-2022.

Notes:
1. Estimated populations are given as of 30th June for the year noted, rounded to the nearest person.
2. Rounding for estimates at this geographic level is independent. As such, figures may not add to higher geography totals.

Methodology

The population estimates for small geographical areas are created from an average of two statistical methods: the ratio change and cohort-component methods. The ratio change method applies the change in secondary (typically administrative) data sources to Census estimates. The 2022 small geographical area estimates use a single statistical dataset which has been created by amalgamating a series of different administrative data sources. This statistical dataset is a de-duplicated admin-based estimate for the usually resident population of NI. The cohort-component method updates the Census estimates by ‘ageing on’ populations and applying information on births, deaths and migration. An average of both methods is taken and constrained to the published population figures. Further information is available at: NISRA 2022 Mid-year Population Estimates webpage
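
The final step of the methodology, averaging the two methods and constraining the result to a published total, can be sketched as below. The description does not say how the constraint is applied, so proportional scaling is an assumption made here for illustration; the input figures are hypothetical and the function name is not from NISRA.

```python
def combine_and_constrain(ratio_change, cohort_component, published_total):
    """Average two sets of small-area estimates and scale to a published total.

    Assumption: the constraint is applied by proportional scaling, so every
    area is multiplied by the same factor.
    """
    averaged = [(a + b) / 2 for a, b in zip(ratio_change, cohort_component)]
    scale = published_total / sum(averaged)
    return [value * scale for value in averaged]


# Hypothetical estimates for three small areas from each method.
ratio_change = [100, 200, 300]
cohort_component = [110, 190, 320]

constrained = combine_and_constrain(ratio_change, cohort_component, 600)
rounded = [round(value) for value in constrained]  # each area rounded independently
print(constrained, rounded)
```

Because each area is rounded on its own, the rounded figures are not guaranteed to sum to the published total, which matches the note above about independent rounding.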

Geographic Referencing

Population Estimates are based on a large number of secondary datasets. Where the full address was available, the Pointer Address database was used to allocate a unique property reference number (UPRN) and geo-spatial co-ordinates to each home address. These can then be used to map the address to particular geographies. Where it was not possible to assign a unique property reference number to an address using the Pointer database, or where the secondary dataset contained only postcode information, the Central Postcode Directory was used to map home address postcodes to higher geographies. A small proportion of records with unknown geography were apportioned based on the spatial characteristics of known records.

Further Information

The next estimates of the population for Northern Ireland will be released later in 2024.

Contact: NISRA Customer Services 02890 255156 census@nisra.gov.uk Responsible Statistician: Jonathan Harvey
Covid-19 reproductiegetal
ckan.mobidatalab.eu
dexes.eu
Updated Jul 13, 2023
Cite
OverheidNl (2023). Covid-19 reproductiegetal [Dataset]. https://ckan.mobidatalab.eu/dataset/12704-covid-19-reproductiegetal
Explore at:
Available download formats: zip (http://publications.europa.eu/resource/authority/file-type/zip)
Dataset updated
Jul 13, 2023
Dataset provided by
OverheidNl
License
Public Domain Mark 1.0 https://creativecommons.org/publicdomain/mark/1.0/
License information was derived automatically
Description
Covid-19 reproduction number

The reproduction number R gives the average number of people infected by one person with COVID-19. To estimate this reproduction number, we use the number of reported COVID-19 hospital admissions per day in the Netherlands. This number of hospital admissions is tracked by the NICE Foundation (National Intensive Care Evaluation). Because a COVID-19 admission is reported with some delay in the reporting system, we correct the number of admissions for this delay [1]. The first day of illness is known for a large proportion of the reported cases. This information is used to estimate the first day of illness for hospital admissions. By displaying the number of COVID-19 admissions per date of the first day of illness, it is immediately possible to see whether the number of infections is increasing, peaking or decreasing.

To calculate the reproduction number, it is also necessary to know the length of time between the first day of illness of a COVID-19 case and the first day of illness of his or her infector. This duration is an average of 4 days for SARS-CoV-2 variants in 2020 and 2021, and an average of 3.5 days for more recent variants, calculated on the basis of COVID-19 reports to the Public Health Service (GGD). With this information, the value of the reproduction number is calculated as described in Wallinga & Lipsitch 2007 [2]. Until June 12, 2020, the reproduction number was calculated on the basis of COVID-19 hospital admissions, and until March 15, 2023, the reproduction number was calculated on the basis of COVID-19 reports to the GGDs.

[1] van de Kassteele J, Eilers PHC, Wallinga J. Nowcasting the Number of New Symptomatic Cases During Infectious Disease Outbreaks Using Constrained P-spline Smoothing. Epidemiology. 2019;30(5):737-745. doi:10.1097/EDE.0000000000001050.

[2] Wallinga J, Lipsitch M. How generation intervals shape the relationship between growth rates and reproductive numbers. Proc Biol Sci. 2007;274(1609):599-604. doi:10.1098/rspb.2006.3754.

Description of the variables:

Version: Version number of the dataset. When the content of the dataset is structurally changed (so not the daily update or a correction at record level), the version number will be adjusted (+1), as will the corresponding metadata in RIVMdata (https://data.rivm.nl).

Version 2 update (February 8, 2022):
- In the calculation of the reproduction number, the date of the positive test result is now used instead of the GGD notification date.

Version 3 update (February 17, 2022):
- The calculation of the reproduction number now takes into account different generation times for different variants. For the variants up to and including Delta, the average generation time is 4 days; from Omicron onwards it is 3.5 days. The reproduction number published here is a weighted average of the reproduction numbers per variant.

Version 4 update (September 1, 2022):
- As of September 1, 2022, this dataset is split into two parts. The first part contains the dates from the start of the pandemic until October 3, 2021 (week 39) and contains "tm" in the file name. This data will no longer be updated. The second part contains the data from October 4, 2021 (week 40) and is updated every Tuesday and Friday.
- Until August 31, the published reproduction number was calculated with the data of the day before publication. From September 1, the published reproduction number is calculated with the data of the day of publication.

Version 5 update (March 31, 2023):
- As of March 15, 2023, the reproduction number is calculated based on COVID-19 hospital admissions according to the NICE hospital registry. From June 13, 2020 to March 14, 2023, the reproduction number was calculated on the basis of COVID-19 reports to the GGD. However, the number of reports is strongly determined by the test policy, and is less suitable as a basis for calculating the reproduction number due to the adjusted test policy as of March 10, 2023 and the closure of the GGD test lanes as of March 17, 2023. Until June 12, 2020, the reproduction number was also calculated on the basis of hospital admissions, but then as reported to the GGD.

Date: Date for which the reproduction number was estimated
Rt_low: Lower bound of the 95% confidence interval
Rt_avg: Estimated reproduction number
Rt_up: Upper bound of the 95% confidence interval
population: Patient population, with value “hosp” for hospitalized patients or “testpos” for test-positive patients

Recent R estimates are less reliable, because their reliability depends on the time between infection and becoming ill and the time between becoming ill and reporting. Therefore, the variable Rt_avg is absent for the last two weeks.
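
To show why the generation time matters and what a weighted average across variants looks like, here is a minimal sketch. It uses one relationship from Wallinga & Lipsitch 2007 for the special case where every generation interval equals its mean, R = exp(r · Tc), with r the epidemic growth rate per day and Tc the generation time. Whether the published series uses this special case is not stated in the description, and the growth rate, per-variant values and weights below are hypothetical.

```python
import math


def r_from_growth_rate(growth_rate_per_day, generation_time_days):
    """R = exp(r * Tc), assuming a fixed (non-varying) generation interval."""
    return math.exp(growth_rate_per_day * generation_time_days)


def weighted_r(r_by_variant, weight_by_variant):
    """Weighted average of per-variant reproduction numbers.

    Assumption: weights are the variants' shares of cases; the description
    does not specify how the weights are chosen.
    """
    total_weight = sum(weight_by_variant.values())
    weighted_sum = sum(r_by_variant[v] * weight_by_variant[v] for v in r_by_variant)
    return weighted_sum / total_weight


# The same growth rate implies a lower R when the generation time is shorter.
print(r_from_growth_rate(0.1, 4.0), r_from_growth_rate(0.1, 3.5))

# Hypothetical per-variant estimates and case shares.
print(weighted_r({"delta": 0.8, "omicron": 1.2}, {"delta": 0.25, "omicron": 0.75}))
```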
Charts of climate statistics and MODIS data for all Bioregional Assessment...
data.gov.au
researchdata.edu.au
Updated Nov 20, 2019
Cite
Bioregional Assessment Program (2019). Charts of climate statistics and MODIS data for all Bioregional Assessment subregions [Dataset]. https://data.gov.au/data/dataset/groups/8a1c5f43-b150-4357-aa25-5f301b1a02e1
Dataset updated
Nov 20, 2019
Dataset provided by
Bioregional Assessment Program
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
Abstract

This dataset was derived by the Bioregional Assessment Programme from 'Mean climate variables for all subregions' and 'fPAR derived from MODIS for BA subregions'. You can find a link to the parent datasets in the Lineage Field in this metadata statement. The History Field in this metadata statement describes how this dataset was derived.

These are charts of climate statistics and MODIS data for each BA subregion. There are six 600dpi PNG files per subregion, with the naming convention BA-[regioncode]-[subregioncode]-[chartname].png. The charts, according to their filename, are: rain (time-series of rainfall; Figure 1), P-PET (average monthly precipitation and potential evapotranspiration; Figure 2), 5line (assorted monthly statistics; Figure 3), trend (monthly long-term trends; Figure 4) and fPAR (fraction of photosynthetically available radiation - an indication of biomass; Figure 5).
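
The naming convention above can be parsed mechanically, as in the sketch below. It assumes that region and subregion codes contain no hyphens (the chart name "P-PET" does), and the codes "XXX" and "YYY" in the usage line are placeholders, not real region codes; the function is illustrative and is not part of BA-ClimateCharts.py.

```python
from pathlib import PurePosixPath

# Chart names listed in the description.
CHART_NAMES = {"rain", "P-PET", "5line", "trend", "fPAR"}


def parse_chart_filename(filename):
    """Split BA-[regioncode]-[subregioncode]-[chartname].png into its parts."""
    path = PurePosixPath(filename)
    if path.suffix.lower() != ".png":
        raise ValueError(f"not a PNG file: {filename}")
    # Split at most three times so a hyphen inside the chart name survives.
    parts = path.stem.split("-", 3)
    if len(parts) != 4 or parts[0] != "BA":
        raise ValueError(f"unexpected file name: {filename}")
    _, region, subregion, chart = parts
    if chart not in CHART_NAMES:
        raise ValueError(f"unknown chart name: {chart}")
    return region, subregion, chart


print(parse_chart_filename("BA-XXX-YYY-P-PET.png"))
```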

This version was created on 18 November 2014, using data that accounted for a modified boundary for the Gippsland Basin bioregion and the combination of two subregions to form the Sydney Basin bioregion.

Purpose

These charts were generated to be included in the Contextual Report (geography) for each subregion.

Dataset History

These charts were generated using MatPlotLib 1.3.0 in Python 2.7.5 (Anaconda distribution v1.7.0 32-bit).

The script for generating these plots is BA-ClimateCharts.py, and is packaged with the dataset. This script only collects data and draws charts; it does not do any analysis. The data are charted as they appear in the parent datasets (see Lineage). A Word document (BA-ClimateGraphs-ReadMe) is also included. This document includes examples of, and approved captions for, each chart.

Dataset Citation

Bioregional Assessment Programme (2014) Charts of climate statistics and MODIS data for all Bioregional Assessment subregions. Bioregional Assessment Derived Dataset. Viewed 14 June 2018, http://data.bioregionalassessments.gov.au/dataset/8a1c5f43-b150-4357-aa25-5f301b1a02e1.

Dataset Ancestors

Derived From Mean climate variables for all subregions

Derived From BILO Gridded Climate Data: Daily Climate Data for each year from 1900 to 2012

Derived From fPar derived from MODIS for BA subregions
A Suite of Perturbed Parameter Ensembles using CESM2.2 CAM6 under a Wide...
rda.ucar.edu
A Suite of Perturbed Parameter Ensembles using CESM2.2 CAM6 under a Wide Range of Temperatures [Dataset]. https://rda.ucar.edu/lookfordata/datasets/?nb=y&b=topic&v=Atmosphere
Description
This dataset originates from a new CESM2 CAM6 perturbed parameter ensemble (PPE) designed to explore climate and hydroclimate dynamics under a wide range of sea surface temperature (SST) conditions. The ... SST varies from 4 degrees Celsius colder to 16 degrees Celsius warmer than preindustrial levels, encompassing a broad spectrum of mean temperatures spanning the past 65 million years. This dataset offers valuable insights into climate and hydroclimate responses, as well as weather and climate extremes under diverse conditions.

The dataset includes results from nine PPE simulations with different SST scenarios: preindustrial (PREI), 4K cooler (M04K), and 4K, 8K, 12K, and 16K warmer (P04K to P16K). For SSTs exceeding 8K warming, sea ice was removed to improve numerical stability. Each PPE set consists of 250 ensemble members, with 45 parameters related to microphysics, convection, turbulence, and aerosols perturbed using Latin Hypercube Sampling. An additional simulation with default parameter settings brings the total to 251 simulations, each running for five years using CAM6.3 (https://github.com/ESCOMP/CAM/tree/cam6_3_026; with additional paleo modifications).

Post-processing converted the data into compressed NetCDF-4 format. All 251 runs were concatenated using ncecat to minimize the number of files. For example, the following file contains monthly surface temperature data from the preindustrial PPE:

f.c6.F1850.f19_f19.paleo_ppe.sst_prei.ens251/atm/proc/tseries/month_1/f.c6.F1850.f19_f19.paleo_ppe.sst_prei.ens251.cam.h0.TS.000101-000512.nc

Parameter values are provided in the PPE Parameter File.
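For readers unfamiliar with Latin Hypercube Sampling, the sketch below shows the general idea for a design of 250 members and 45 parameters. It is a generic textbook-style illustration, not the authors' sampling code. Values are drawn on the unit interval, and mapping them onto real parameter ranges would require the PPE Parameter File, which is not reproduced here. The function name and seed are arbitrary choices made for this example.

```python
import random
from typing import List


def latin_hypercube(n_samples: int, n_dims: int, seed: int = 0) -> List[List[float]]:
    """Return n_samples points in [0, 1)^n_dims, one per stratum in every dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # Draw one point uniformly inside each of n_samples equal-width strata.
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so that strata are paired at random across dimensions.
        rng.shuffle(points)
        columns.append(points)
    # Transpose from per-dimension columns to per-sample rows.
    return [[columns[d][s] for d in range(n_dims)] for s in range(n_samples)]


# 250 ensemble members, 45 perturbed parameters (unit-interval values only).
design = latin_hypercube(250, 45)
print(len(design), "members x", len(design[0]), "parameters")
```

Each parameter's range is thereby covered evenly, which plain random sampling does not guarantee at this ensemble size.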