Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Contains informative data related to the COVID-19 pandemic, specifically first-case and first-death information for every country. The first-case information consists of the date of the first case(s), the number of confirmed cases on the first day, the age of the first patient(s), and the last country visited; the first-death information consists of the date of the first death and the age of the first patient who died, for every country with its corresponding continent. The dataset also contains a binary matrix of the spread chain among different countries and regions.
This dataset illustrates the fluid dynamics of human coughing and breathing by using schlieren imaging. This dataset was used to help inform the general public about the importance of face coverings during the COVID-19 global pandemic.
This dataset contains the spatiotemporal data used to train the spatiotemporal deep neural networks described in "Modeling the Spread of a Livestock Disease With Semi-Supervised Spatiotemporal Deep Neural Networks". The dataset consists of two sets of NumPy arrays: the first set, X_grid.npy and Y_grid.npy, was used to train the convolutional LSTM, while the second set, X_graph.npy, Y_graph.npy, and edge_index.npy, was used to train the graph convolutional LSTM. The data consist of spatiotemporally varying environmental and anthropogenic variables along with case reports of vesicular stomatitis. Resources in this dataset:
Resource Title: NumPy Arrays of Spatiotemporal Features and VS Cases. File Name: vs_data.zip
Resource Description: This is a ZIP archive containing five NumPy arrays of spatiotemporal features and geotagged VS cases.
Resource Software Recommended: NumPy, url: https://numpy.org/
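For orientation, a hedged Python sketch of loading the five arrays after extracting vs_data.zip; the array shapes and dtypes are not documented in this description, so the loop simply prints them for inspection:

import numpy as np

# Gridded tensors used for the convolutional LSTM.
X_grid = np.load("X_grid.npy")
Y_grid = np.load("Y_grid.npy")

# Graph tensors plus edge list used for the graph convolutional LSTM.
X_graph = np.load("X_graph.npy")
Y_graph = np.load("Y_graph.npy")
edge_index = np.load("edge_index.npy")

# First inspection step; adapt downstream code to the actual shapes.
for name, arr in [("X_grid", X_grid), ("Y_grid", Y_grid),
                  ("X_graph", X_graph), ("Y_graph", Y_graph),
                  ("edge_index", edge_index)]:
    print(name, arr.shape, arr.dtype)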
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Compositional data, which is data consisting of fractions or probabilities, is common in many fields including ecology, economics, physical science and political science. If these data would otherwise be normally distributed, their spread can be conveniently represented by a multivariate normal distribution truncated to the non-negative space under a unit simplex. Here this distribution is called the simplex-truncated multivariate normal distribution. For calculations on truncated distributions, it is often useful to obtain rapid estimates of their integral, mean and covariance; these quantities characterising the truncated distribution will generally possess different values to the corresponding non-truncated distribution.
In the paper Adams, Matthew (2022) Integral, mean and covariance of the simplex-truncated multivariate normal distribution. PLoS One, 17(7), Article number: e0272014. https://eprints.qut.edu.au/233964/, three different approaches that can estimate the integral, mean and covariance of any simplex-truncated multivariate normal distribution are described and compared. These three approaches are (1) naive rejection sampling, (2) a method described by Gessner et al. that unifies subset simulation and the Holmes-Diaconis-Ross algorithm with an analytical version of elliptical slice sampling, and (3) a semi-analytical method that expresses the integral, mean and covariance in terms of integrals of hyperrectangularly-truncated multivariate normal distributions, the latter of which are readily computed in modern mathematical and statistical packages. Strong agreement is demonstrated between all three approaches, but the most computationally efficient approach depends strongly both on implementation details and the dimension of the simplex-truncated multivariate normal distribution.
This dataset consists of all code and results for the associated article.
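As a minimal illustration of approach (1), here is a hedged NumPy sketch of naive rejection sampling for a 2-dimensional simplex-truncated multivariate normal; the mean vector and covariance matrix below are illustrative placeholders, not values from the paper:

import numpy as np

rng = np.random.default_rng(0)
mu = np.array([0.3, 0.3])            # illustrative mean
Sigma = np.array([[0.05, 0.01],
                  [0.01, 0.05]])     # illustrative covariance

n = 1_000_000
draws = rng.multivariate_normal(mu, Sigma, size=n)

# Keep draws inside the unit simplex: all components non-negative, sum <= 1.
inside = (draws >= 0).all(axis=1) & (draws.sum(axis=1) <= 1)
accepted = draws[inside]

integral = inside.mean()              # acceptance rate estimates the truncated integral
mean = accepted.mean(axis=0)          # mean of the truncated distribution
cov = np.cov(accepted, rowvar=False)  # covariance of the truncated distribution
print(integral, mean, cov, sep="\n")

As the paper notes, this naive approach is simple but its efficiency degrades as the acceptance region shrinks, which is why the Gessner et al. sampler and the semi-analytical method are compared against it.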
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Concept: Difference between average cost of outstanding loans (ICC) and its average funding cost. Comprises both earmarked and nonearmarked operations. Source: Central Bank of Brazil – Statistics Department 27449-spread-of-the-icc---earmarked
This dataset contains ERA5 surface level analysis parameter data from the 10-member ensemble runs. ERA5 is the 5th generation reanalysis project from the European Centre for Medium-Range Weather Forecasts (ECMWF) - see linked documentation for further details. The ensemble members were used to derive means and spread data (see linked datasets). Ensemble means and spreads were calculated from the ERA5t 10-member ensemble, run at a reduced resolution compared with the single high-resolution (hourly output at 31 km grid spacing) 'HRES' realisation, for which these data have been produced to provide an uncertainty estimate. This dataset contains a limited selection of all available variables, which have been converted to netCDF from the original GRIB files held on the ECMWF system. They have also been translated onto a regular latitude-longitude grid during the extraction process from the ECMWF holdings. For a fuller set of variables please see the Copernicus Data Store (CDS) data tool, linked to from this record. Note, ensemble standard deviation is often referred to as ensemble spread and is calculated as the standard deviation of the 10 members in the ensemble (i.e., including the control). It is not the sample standard deviation, and thus was calculated by dividing by 10 rather than 9 (N-1). See linked datasets for ensemble member and ensemble mean data. The ERA5 global atmospheric reanalysis covers 1979 to 2 months behind the present month. This follows on from the ERA-15, ERA-40 and ERA-Interim reanalysis projects. An initial release of ERA5 data (ERA5t) is made roughly 5 days behind the present date. These data are subsequently reviewed ahead of being released by ECMWF as quality-assured data within 3 months. CEDA holds a 6-month rolling copy of the latest ERA5t data. See related datasets linked to from this record. However, for the period 2000-2006 the initial ERA5 release was found to suffer from stratospheric temperature biases, and so new runs to address this issue were performed, resulting in the ERA5.1 release (see linked datasets). Note, though, that Simmons et al. 2020 (technical memo 859) report that "ERA5.1 is very close to ERA5 in the lower and middle troposphere", but users of data from this period should read technical memo 859 for further details.
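To make the spread convention above concrete, a hedged NumPy sketch on toy data (not actual ERA5 files) showing the population standard deviation over 10 ensemble members (divide by N = 10, ddof=0) versus the sample standard deviation (divide by N-1 = 9, ddof=1):

import numpy as np

# Toy stand-in for 10 ensemble members on a small grid; real ERA5 fields
# would be read from the netCDF files in this dataset.
rng = np.random.default_rng(1)
members = rng.normal(280.0, 0.5, size=(10, 4, 4))  # (member, lat, lon)

ens_mean = members.mean(axis=0)
ens_spread = members.std(axis=0, ddof=0)  # divides by N = 10: the convention described above
sample_std = members.std(axis=0, ddof=1)  # divides by N - 1 = 9: NOT what these files contain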
Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.
This dataset provides information about the number of properties, residents, and average property values for Spread Oak Lane cross streets in Waterbury, CT.
The data included in this publication depict components of wildfire risk specifically for populated areas in the United States. These datasets represent where people live in the United States and the in situ risk from wildfire, i.e., the risk at the location where the adverse effects take place.

National wildfire hazard datasets of annual burn probability and fire intensity, generated by the USDA Forest Service, Rocky Mountain Research Station and Pyrologix LLC, form the foundation of the Wildfire Risk to Communities data. Vegetation and wildland fuels data from LANDFIRE 2020 (version 2.2.0) were used as input to two different but related geospatial fire simulation systems. Annual burn probability was produced with the USFS geospatial fire simulator (FSim) at a relatively coarse cell size of 270 meters (m). To bring the burn probability raster data down to a finer resolution more useful for assessing hazard and risk to communities, we upsampled them to the native 30 m resolution of the LANDFIRE fuel and vegetation data. In this upsampling process, we also spread values of modeled burn probability into developed areas represented in LANDFIRE fuels data as non-burnable. Burn probability rasters represent landscape conditions as of the end of 2020. Fire intensity characteristics were modeled at 30 m resolution using a process that performs a comprehensive set of FlamMap runs spanning the full range of weather-related characteristics that occur during a fire season, and then integrates those runs into a variety of results based on the likelihood of those weather types occurring. Before the fire intensity modeling, the LANDFIRE 2020 data were updated to reflect fuels disturbances occurring in 2021 and 2022. As such, the fire intensity datasets represent landscape conditions as of the end of 2022. The data products in this publication that represent where people live reflect 2021 estimates of housing unit and population counts from the U.S. Census Bureau, combined with building footprint data from Onegeo and USA Structures, both reflecting 2022 conditions.

The specific raster datasets included in this publication are:
- Building Count: a 30-m raster representing the count of buildings in the building footprint dataset located within each 30-m pixel.
- Building Density: a 30-m raster representing the density of buildings in the building footprint dataset (buildings per square kilometer [km²]).
- Building Coverage: a 30-m raster depicting the percentage of habitable land area covered by building footprints.
- Population Count (PopCount): a 30-m raster with pixel values representing residential population count (persons) in each pixel.
- Population Density (PopDen): a 30-m raster of residential population density (people/km²).
- Housing Unit Count (HUCount): a 30-m raster representing the number of housing units in each pixel.
- Housing Unit Density (HUDen): a 30-m raster of housing-unit density (housing units/km²).
- Housing Unit Exposure (HUExposure): a 30-m raster representing the expected number of housing units within a pixel potentially exposed to wildfire in a year. This is a long-term annual average and is not intended to represent the actual number of housing units exposed in any specific year.
- Housing Unit Impact (HUImpact): a 30-m raster representing the relative potential impact of fire to housing units at any pixel, if a fire were to occur. It is an index that incorporates the general consequences of fire on a home as a function of fire intensity, and uses flame length probabilities from wildfire modeling to capture the likely intensity of fire.
- Housing Unit Risk (HURisk): a 30-m raster that integrates all four primary elements of wildfire risk (likelihood, intensity, susceptibility, and exposure) on pixels where housing unit density is greater than zero.

Additional methodology documentation is provided with the data publication download. Note: pixel values in this image service have been altered from the original raster dataset due to data requirements in web services. The service is intended primarily for data visualization. Relative values and spatial patterns have been largely preserved in the service, but users are encouraged to download the source data for quantitative analysis.
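To make the per-pixel units concrete, a hedged Python sketch of how a Building Count pixel relates to the buildings/km² values in the Building Density layer: a 30 m x 30 m pixel covers 0.0009 km², so count divided by pixel area gives density. The rasterio package is assumed, and the GeoTIFF file name is hypothetical:

import rasterio

PIXEL_AREA_KM2 = 0.03 * 0.03  # one 30 m x 30 m pixel = 0.0009 km²

# "BuildingCount.tif" is a hypothetical name for the downloaded Building Count raster.
with rasterio.open("BuildingCount.tif") as src:
    count = src.read(1)  # first band: buildings per pixel

# Convert the per-pixel count to buildings per km², comparable to the Building Density layer.
density = count / PIXEL_AREA_KM2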
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Concept: -- to be defined -- Source: Central Bank of Brazil - Statistics Department 27697-spread-of-the-icc---non-revolving-operations---households
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Novel Coronavirus (COVID-19) daily data of confirmed cases for affected countries and provinces of China, reported between 31st December 2019 and 31st May 2020. The data were collected from the European Centre for Disease Prevention and Control (ECDC) and the Johns Hopkins University CSSE.
The monthly mean temperatures from February to May 2020 for the capital cities of the various nations.
This dataset contains ensemble spreads for the ERA5 surface level analysis parameter data ensemble means (see linked dataset). ERA5 is the 5th generation reanalysis project from the European Centre for Medium-Range Weather Forecasts (ECMWF) - see linked documentation for further details. The ensemble means and spreads are calculated from the ERA5 10-member ensemble, run at a reduced resolution compared with the single high-resolution (hourly output at 31 km grid spacing) 'HRES' realisation, for which these data have been produced to provide an uncertainty estimate. This dataset contains a limited selection of all available variables, which have been converted to netCDF from the original GRIB files held on the ECMWF system. They have also been translated onto a regular latitude-longitude grid during the extraction process from the ECMWF holdings. For a fuller set of variables please see the Copernicus Data Store (CDS) data tool, linked to from this record. Note, ensemble standard deviation is often referred to as ensemble spread and is calculated as the standard deviation of the 10 members in the ensemble (i.e., including the control). It is not the sample standard deviation, and thus was calculated by dividing by 10 rather than 9 (N-1). See linked datasets for ensemble member and ensemble mean data. The ERA5 global atmospheric reanalysis covers 1979 to 2 months behind the present month. This follows on from the ERA-15, ERA-40 and ERA-Interim reanalysis projects. An initial release of ERA5 data (ERA5t) is made roughly 5 days behind the present date. These data are subsequently reviewed ahead of being released by ECMWF as quality-assured data within 3 months. CEDA holds a 6-month rolling copy of the latest ERA5t data. See related datasets linked to from this record. However, for the period 2000-2006 the initial ERA5 release was found to suffer from stratospheric temperature biases, and so new runs to address this issue were performed, resulting in the ERA5.1 release (see linked datasets). Note, though, that Simmons et al. 2020 (technical memo 859) report that "ERA5.1 is very close to ERA5 in the lower and middle troposphere", but users of data from this period should read technical memo 859 for further details.
This dataset contains ensemble spreads for the ERA5 initial release (ERA5t) surface level analysis parameter data ensemble means (see linked dataset). ERA5t is the initial release of the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA5 reanalysis project, available up to 5 days behind the present date. CEDA will maintain a 6-month rolling archive of these data with overlap to the verified ERA5 data - see linked datasets on this record. The ensemble means and spreads are calculated from the ERA5t 10-member ensemble, run at a reduced resolution compared with the single high-resolution (hourly output at 31 km grid spacing) 'HRES' realisation, for which these data have been produced to provide an uncertainty estimate. This dataset contains a limited selection of all available variables, which have been converted to netCDF from the original GRIB files held on the ECMWF system. They have also been translated onto a regular latitude-longitude grid during the extraction process from the ECMWF holdings. For a fuller set of variables please see the Copernicus Data Store (CDS) data tool, linked to from this record. Note, ensemble standard deviation is often referred to as ensemble spread and is calculated as the standard deviation of the 10 members in the ensemble (i.e., including the control). It is not the sample standard deviation, and thus was calculated by dividing by 10 rather than 9 (N-1). See linked datasets for ensemble member and ensemble mean data. The ERA5 global atmospheric reanalysis covers 1979 to 2 months behind the present month. This follows on from the ERA-15, ERA-40 and ERA-Interim reanalysis projects. An initial release of ERA5 data (ERA5t) is made roughly 5 days behind the present date. These data are subsequently reviewed and, if required, amended before the full ERA5 release. CEDA holds a 6-month rolling copy of the latest ERA5t data. See related datasets linked to from this record.
This dataset contains ERA5 initial release (ERA5t) surface level analysis parameter data from the 10-member ensemble runs. ERA5t is the initial release of the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA5 reanalysis project, available up to 5 days behind the present date. CEDA will maintain a 6-month rolling archive of these data with overlap to the verified ERA5 data - see linked datasets on this record. Ensemble means and spreads were calculated from the ERA5t 10-member ensemble, run at a reduced resolution compared with the single high-resolution (hourly output at 31 km grid spacing) 'HRES' realisation, for which these data have been produced to provide an uncertainty estimate. This dataset contains a limited selection of all available variables, which have been converted to netCDF from the original GRIB files held on the ECMWF system. They have also been translated onto a regular latitude-longitude grid during the extraction process from the ECMWF holdings. For a fuller set of variables please see the Copernicus Data Store (CDS) data tool, linked to from this record. Note, ensemble standard deviation is often referred to as ensemble spread and is calculated as the standard deviation of the 10 members in the ensemble (i.e., including the control). It is not the sample standard deviation, and thus was calculated by dividing by 10 rather than 9 (N-1). See linked datasets for ensemble mean and ensemble spread data. The ERA5 global atmospheric reanalysis covers 1979 to 2 months behind the present month. This follows on from the ERA-15, ERA-40 and ERA-Interim reanalysis projects. An initial release of ERA5 data (ERA5t) is made roughly 5 days behind the present date. These data are subsequently reviewed and, if required, amended before the full ERA5 release. CEDA holds a 6-month rolling copy of the latest ERA5t data. See related datasets linked to from this record.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Species distribution models (SDMs) are a tool for predicting the eventual geographical range of an emerging pathogen. Most SDMs, however, rely on an assumption of equilibrium with the environment, which an emerging pathogen, by definition, has not reached. To determine if some SDM approaches work better than others for modelling the spread of emerging, non-equilibrium pathogens, we studied time-sensitive predictive performance of SDMs for Batrachochytrium dendrobatidis, a devastating infectious fungus of amphibians, using multiple methods trained on time-incremented subsets of the available data. We split our data into timeline-based training and testing sets, and evaluated models on each set using standard performance criteria, including AUC, kappa, false negative rate and the Boyce index. Of eight models examined, we found that boosted regression trees and random forests performed best, closely followed by MaxEnt. As expected, predictive performance generally improved with the length of time series used for model training. These results provide information on how quickly the potential extent of an emerging disease may be determined, and identify which modelling frameworks are likely to provide useful information during the early phases of pathogen expansion.
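A hedged sketch of this timeline-based evaluation scheme, using one of the study's methods (random forests) via scikit-learn; the file name, column names, covariates, and year range below are hypothetical placeholders, and the study additionally evaluated boosted regression trees, MaxEnt, kappa, false negative rate, and the Boyce index:

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

# Hypothetical file of presence/absence records with environmental covariates.
df = pd.read_csv("bd_occurrences.csv")
features = ["temperature", "precipitation", "elevation"]  # hypothetical covariates

for cutoff in range(2002, 2010):  # time-incremented training subsets
    train = df[df["year"] <= cutoff]   # records observed up to the cutoff
    test = df[df["year"] > cutoff]     # later records held out for testing
    model = RandomForestClassifier(n_estimators=500, random_state=0)
    model.fit(train[features], train["presence"])
    auc = roc_auc_score(test["presence"], model.predict_proba(test[features])[:, 1])
    print(cutoff, round(auc, 3))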
The Reddit Subreddit Dataset by Dataplex offers a comprehensive and detailed view of Reddit’s vast ecosystem, now enhanced with appended AI-generated columns that provide additional insights and categorization. This dataset includes data from over 2.1 million subreddits, making it an invaluable resource for a wide range of analytical applications, from social media analysis to market research.
Dataset Overview:
This dataset includes detailed information on subreddit activities, user interactions, post frequency, comment data, and more. The inclusion of AI-generated columns adds an extra layer of analysis, offering sentiment analysis, topic categorization, and predictive insights that help users better understand the dynamics of each subreddit.
2.1 Million Subreddits with Enhanced AI Insights: The dataset covers over 2.1 million subreddits and now includes AI-enhanced columns that provide:
- Sentiment Analysis: AI-driven sentiment scores for posts and comments, allowing users to gauge community mood and reactions.
- Topic Categorization: Automated categorization of subreddit content into relevant topics, making it easier to filter and analyze specific types of discussions.
- Predictive Insights: AI models that predict trends, content virality, and user engagement, helping users anticipate future developments within subreddits.
Sourced Directly from Reddit:
All data in this dataset is sourced directly from Reddit, ensuring accuracy and authenticity. The dataset is updated regularly, reflecting the latest trends and user interactions on the platform. This ensures that users have access to the most current and relevant data for their analyses.
Key Features:
Use Cases:
Data Quality and Reliability:
The Reddit Subreddit Dataset emphasizes data quality and reliability. Each record is carefully compiled from Reddit’s vast database, ensuring that the information is both accurate and up-to-date. The AI-generated columns further enhance the dataset's value, providing automated insights that help users quickly identify key trends and sentiments.
Integration and Usability:
The dataset is provided in a format that is compatible with most data analysis tools and platforms, making it easy to integrate into existing workflows. Users can quickly import, analyze, and utilize the data for various applications, from market research to academic studies.
User-Friendly Structure and Metadata:
The data is organized for easy navigation and analysis, with metadata files included to help users identify relevant subreddits and data points. The AI-enhanced columns are clearly labeled and structured, allowing users to efficiently incorporate these insights into their analyses.
Ideal For:
This dataset is an essential resource for anyone looking to understand the intricacies of Reddit's vast ecosystem, offering the data and AI-enhanced insights needed to drive informed decisions and strategies across various fields. Whether you’re tracking emerging trends, analyzing user behavior, or conducting acade...
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The Department actively seeks to expand the range of datasets it shares through the Government's Open Data portal. In this regard, it is the first government department to publish gender-based statistics. This data is drawn from the Department's HR database system and provides, on a headcount basis, a female-male breakdown at grade and grade-equivalent level. These statistics will be published annually, using the position at the end of December. For baseline and comparison purposes, and to assist the reader in forming a picture of how the Department's gender balance has evolved over the last 20+ years, we have also provided these reports as at December 2000, 2010, 2020, 2021, and 2022. The Department is unique in terms of the broad range of grades of its staff. At the end of 2022 there were 75 grades spread across three distinct grade streams: General Service, Professional & Technical, and Industrial. We have uploaded a table which provides a breakdown of these grades within their respective streams and which, in the case of the Professional & Technical grades, also shows their General Service grade equivalent. By way of illustration of the diversity of grades in the Department, our headcount at the end of December 2022 was 1,604, of which 905 were General Service staff serving across 18 different grades and representing 56.42% of our workforce. The Professional & Technical headcount was 533; these staff were spread across 44 grades and accounted for 33.23% of our staffing complement at that time. We had 166 Industrial staff at the end of December; these staff represented 10.35% of our workforce and were spread across 13 different grades.
The Cassini Ion and Neutral Mass Spectrometer (INMS) Packet data set contains all telemetry packets as received from the instrument. One standard product data type is defined for each INMS telemetry packet. In each standard data product, one record is produced for each packet. Each item in the packet is converted from data numbers to dimensional values. The data set contains all science packets for the entire Cassini mission, including telemetry data from the instrument checkout periods, SOI, and the entire Saturn tour. Each standard data product is organized as a spreadsheet with one row for each packet. Each column in the spreadsheet contains the contents of one item in the telemetry packet, converted from data number to dimensional quantities where appropriate.
Due to the relevance of the COVID-19 global pandemic, we are releasing our dataset of tweets acquired from the Twitter Stream related to COVID-19 chatter. The first 9 weeks of data (from January 1st, 2020 to March 11th, 2020) contain very low tweet counts, as we were filtering data collected for other research purposes; however, one can see the dramatic increase as awareness of the virus spread. Dedicated data gathering from March 11th to March 30th yielded over 4 million tweets a day. We have added additional data provided by our new collaborators from January 27th to February 27th to provide extra longitudinal coverage.
The data collected from the stream captures all languages, but the most prevalent are English, Spanish, and French. We release all tweets and retweets in the full_dataset.tsv file (101,400,452 unique tweets), and a cleaned version with no retweets in the full_dataset-clean.tsv file (20,244,746 unique tweets). There are several practical reasons for us to keep the retweets; tracing important tweets and their dissemination is one of them. For NLP tasks we provide the top 1000 frequent terms in frequent_terms.csv, the top 1000 bigrams in frequent_bigrams.csv, and the top 1000 trigrams in frequent_trigrams.csv. Some general statistics per day are included for both datasets in the statistics-full_dataset.tsv and statistics-full_dataset-clean.tsv files.
More details can be found (and will be updated faster) at: https://github.com/thepanacealab/covid19_twitter
As always, the tweets distributed here are only tweet identifiers (with date and time added) due to the terms and conditions Twitter imposes on re-distributing Twitter data. They need to be hydrated to be used.
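For reference, a hedged Python sketch of preparing the identifiers for hydration; the column layout (ID in the first column, header row present) is an assumption about the .tsv files, and twarc is mentioned only as one commonly used hydration tool that requires Twitter API credentials:

import csv

with open("full_dataset-clean.tsv") as src, open("tweet_ids.txt", "w") as out:
    reader = csv.reader(src, delimiter="\t")
    next(reader)                  # assumed header row; drop this line if absent
    for row in reader:
        out.write(row[0] + "\n")  # assumes the tweet ID is the first column

# Then, for example, with twarc v1 installed and API credentials configured:
#   twarc hydrate tweet_ids.txt > tweets.jsonl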
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Concept: Difference between average cost of outstanding loans (ICC) and its average funding cost. Comprises both earmarked and nonearmarked operations. Source: Central Bank of Brazil – Statistics Department 27445-spread-of-the-icc---individuals