Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Aggregate mean and standard deviation of out-of-sample r2 and NRMSE for contemporaneous prediction, indexed by methodology and indicator.
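As a rough illustration of the two metrics named above, the following sketch computes r2 and NRMSE for one set of predictions. The listing does not say how NRMSE is normalized, so dividing the RMSE by the range of the observed values is an assumption, and the function names are hypothetical.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def nrmse(y_true, y_pred):
    """RMSE divided by the observed range (assumed normalization)."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse) / (max(y_true) - min(y_true))


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]    # illustrative values only
    predicted = [1.1, 1.9, 3.2, 3.8]
    print(r2_score(observed, predicted), nrmse(observed, predicted))
```

Other normalizations (by the mean or the standard deviation of the observations) are also common, so the dataset's own documentation should be checked before comparing values.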
Definition Proprietary Limited Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Mean and standard deviation of country-level r2 and NRMSE for contemporaneous prediction, indexed by methodology and indicator.
Context
In e-commerce, datasets are typically proprietary: they are owned and controlled by individual organizations and are rarely made public because of privacy and business considerations. Nevertheless, the UCI Machine Learning Repository, known for its extensive collection of datasets for machine learning and data mining research, has curated and published a dataset of actual transactional data spanning 2010 to 2011. The dataset is maintained on the UCI Machine Learning Repository's site under the title "Online Retail".
Content
The dataset is transnational and captures every transaction made from December 1, 2010, through December 9, 2011, by a UK-based non-store online retail company. The company has no physical stores and sells purely online. Its main products are unique gifts for all occasions. The company serves a diverse range of customers, many of whom are wholesalers.
Acknowledgements
In collaboration with the UCI Machine Learning Repository, the dataset was provided and made available by Dr. Daqing Chen. Dr. Chen is the Director of the Public Analytics group at London South Bank University, UK. Any correspondence regarding this dataset can be sent to Dr. Chen at 'chend' at 'lsbu.ac.uk'. We are grateful to him for providing such an invaluable resource for researchers and data science enthusiasts.
The image used has been sourced from Canva
Inspiration
The rich and extensive data within this dataset opens the door for a multitude of potential analyses. It lends itself well to various methods and techniques in data science, including but not limited to time series analysis, clustering, and classification. By exploring this dataset, one could derive key insights into customer behavior, transaction trends, and product performance, providing ample opportunities for deep and insightful explorations.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This archive includes data from a randomized controlled trial of a produce prescription program. Code is provided to replicate Figure 1 from this paper. NB: The remaining figures use proprietary data that would require requesting access from Geisinger. Please contact Joseph Doyle, jjdoyle@mit.edu, for more information.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Measuring the quality of Question Answering (QA) systems is a crucial task to validate the results of novel approaches. However, there are already indicators of a reproducibility crisis, as many published systems have used outdated datasets or subsets of QA benchmarks, making it hard to compare results. We identified the following core problems: there is no standard data format; instead, the different, partly inconsistent datasets use proprietary data representations. Additionally, the characteristics of the datasets are typically reflected neither by the dataset maintainers nor by the system publishers. To overcome these problems, we established an ontology, the Question Answering Dataset Ontology (QADO), for representing the QA datasets in RDF. The following datasets were mapped into the ontology: the QALD series, LC-QuAD series, RuBQ series, ComplexWebQuestions, and Mintaka. Hence, the integrated data in QADO covers widely used datasets and multilinguality. Additionally, we analyzed the datasets intensively to identify their characteristics, making it easier for researchers to identify specific research questions and to select well-defined subsets. The provided resource will enable the research community to improve the quality of their research and support the reproducibility of experiments.
Here, the mapping results of the QADO process, the SPARQL queries for data analytics, and the archived analytics results file are provided.
Up-to-date statistics can be created automatically by the script provided at the corresponding QADO GitHub RDFizer repository.
SAMSN7L3ZMTG is the Nimbus-7 Stratospheric and Mesospheric Sounder (SAMS) Level 3 Zonal Means Composition Data Product. The Earth's surface is divided into 2.5-deg latitudinal zones that extend from 50 deg South to 67.5 deg North. Retrieved mixing ratios of nitrous oxide (N2O) and methane (CH4) are averaged over day and night, along with errors, at 31 pressure levels between 50 and 0.125 mbar. Because the N2O and CH4 channels cannot function simultaneously, only one type of measurement is made for any nominal day. The data were recovered from the original magnetic tapes, and are now stored online as one file in its original proprietary binary format. The data for this product are available from 1 January 1979 through 30 December 1981. The principal investigators for the SAMS experiment were Prof. John T. Houghton and Dr. Fredric W. Taylor from Oxford University. This product was previously available from the NSSDC with the identifier ESAD-00180 (old ID 78-098A-02C).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the mean household income for each of the five quintiles in Miami-Dade County, FL, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.
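To make "mean household income for each of the five quintiles" concrete, here is a minimal sketch that splits a list of household incomes into five equal-sized groups and averages each one. It uses unweighted, illustrative numbers and a hypothetical function name; the Census Bureau's published estimates come from weighted survey data, so this is not a reproduction of their method.

```python
def quintile_means(incomes):
    """Return the mean income of each fifth of the sorted incomes.

    Assumes at least five values and ignores survey weights.
    """
    ordered = sorted(incomes)
    n = len(ordered)
    means = []
    for k in range(5):
        lo = k * n // 5          # first index of the k-th quintile
        hi = (k + 1) * n // 5    # one past its last index
        group = ordered[lo:hi]
        means.append(sum(group) / len(group))
    return means


if __name__ == "__main__":
    # Illustrative household incomes, not taken from the dataset.
    sample = [12_000, 25_000, 31_000, 40_000, 52_000,
              61_000, 75_000, 90_000, 130_000, 250_000]
    print(quintile_means(sample))
```

The gap between the lowest and highest quintile means is one simple way to read income inequality from such a table.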
Key observations
Figure: Mean household income by quintiles in Miami-Dade County, FL (in 2022 inflation-adjusted dollars). Image: https://i.neilsberg.com/ch/miami-dade-county-fl-mean-household-income-by-quintiles.jpeg
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Income Levels:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Miami-Dade County median household income.
This dataset features over 330,000 high-quality interior design images sourced from photographers worldwide. Designed to support AI and machine learning applications, it provides a richly varied and extensively annotated collection of indoor environment visuals.
Key Features:
1. Comprehensive Metadata: the dataset includes full EXIF data, detailing camera settings such as aperture, ISO, shutter speed, and focal length. Each image is pre-annotated with object and scene detection metadata, making it ideal for tasks such as room classification, furniture detection, and spatial layout analysis. Popularity metrics, derived from engagement on our proprietary platform, are also included.
2. Unique Sourcing Capabilities: the images are collected through a proprietary gamified platform for photographers. Competitions centered on interior design themes ensure a steady stream of fresh, high-quality submissions. Custom datasets can be sourced on-demand within 72 hours to fulfill specific requests, such as particular room types, design styles, or furnishings.
3. Global Diversity: photographs have been sourced from contributors in over 100 countries, covering a wide spectrum of architectural styles, cultural aesthetics, and functional spaces. The images include homes, offices, restaurants, studios, and public interiors, ranging from minimalist and modern to classic and eclectic designs.
4. High-Quality Imagery: the dataset includes standard to ultra-high-definition images that capture fine interior details. Both professionally staged and candid real-life spaces are included, offering versatility for training AI across design evaluation, object detection, and environmental understanding.
5. Popularity Scores: each image is assigned a popularity score based on its performance in GuruShots competitions. This provides valuable insights into global aesthetic trends, helping AI models learn user preferences, design appeal, and stylistic relevance.
6. AI-Ready Design: the dataset is optimized for machine learning tasks such as interior scene recognition, style transfer, virtual staging, and layout generation. It integrates smoothly with popular AI development environments and tools.
7. Licensing & Compliance: the dataset fully complies with data privacy regulations and includes transparent licensing suitable for commercial and academic use.
Use Cases:
1. Training AI for interior design recommendation engines and virtual staging tools.
2. Enhancing smart home applications and spatial recognition systems.
3. Powering AR/VR platforms for virtual tours, furniture placement, and room redesign.
4. Supporting architectural visualization, decor style transfer, and real estate marketing.
This dataset offers a comprehensive, high-quality resource tailored for AI-driven innovation in design, real estate, and spatial computing. Customizations are available upon request. Contact us to learn more!
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
📌 Dataset Description
Get the full version of the dataset here: https://colorstech.net/power-bi/power-bi-tutorial-building-a-thermal-power-plant-efficiency-dashboard-by-ankit-srivastava/
This dataset contains realistic, domain-informed synthetic data for 50 thermal power plants. It captures key operational, environmental, and design parameters that influence thermal power plant efficiency.
The goal is to provide a clean, structured dataset that helps learners, researchers, and ML enthusiasts explore how different factors affect plant performance. The dataset is ideal for regression modeling, performance benchmarking, and training energy analytics workflows.
🌍 Context
Thermal power plants play a major role in electricity generation worldwide. Their efficiency depends on several technical parameters including boiler type, fuel quality, steam conditions, condenser performance, and internal power usage.
Because real plant datasets are rarely public due to industrial confidentiality, this dataset is artificially generated using realistic engineering ranges and relationships between variables. It simulates typical power plant behavior to support learning, research, and experimentation.
🛠️ How the Dataset Was Created
This dataset is synthetically generated using engineering knowledge, covering realistic ranges for:
All numerical values follow realistic industrial behavior and approximate real-world physics without referencing any proprietary data.
📊 Dataset Columns & Meaning
Categorical Variables (Column: Description)
Plant_Name: Unique identifier for each power plant (Plant_1 to Plant_50).
Region: Plant location region: North, South, East, West.
Fuel_Type: Primary fuel used: Coal, Natural Gas, Oil, Biomass.
Boiler_Type: Boiler technology: Subcritical / Supercritical / Ultra Supercritical.
Ownership: Whether the plant is Public or Private.
Operational Parameters (Column: Description)
Fuel_Input_Energy_GJ_per_hr: Heat energy supplied through fuel per hour (Gigajoules/hr).
Electrical_Output_MWh_per_hr: Electrical power generated per hour (Megawatt-hour/hr).
Steam_Temperature_C: Temperature of steam entering turbine (°C).
Steam_Pressure_bar: Steam pressure entering turbine (bar).
Condenser_Pressure_bar: Pressure inside condenser (bar), affecting cooling.
Auxiliary_Power_%: Internal power consumption by pumps, fans, etc.
Efficiency_%: Overall plant efficiency derived from input-output ratio.
🏭 How a Thermal Power Plant Works (Short Explanation)
A thermal power plant converts heat energy from fuel into electricity. Fuel is burned in a boiler to produce high-pressure steam. This steam spins a turbine connected to a generator, generating electricity. After expansion, the steam is cooled in a condenser and converted back to water, which is pumped again to the boiler. This closed-loop system’s performance depends on steam conditions, heat losses, condenser efficiency, and the plant’s internal energy consumption.
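The input-output relationship described above can be sketched as a short calculation. The dataset page does not give its exact formula, so this assumes efficiency is electrical output divided by fuel heat input, without subtracting auxiliary power. The only fixed fact used is the unit conversion 1 MWh = 3.6 GJ. The function names and the sample numbers are hypothetical.

```python
GJ_PER_MWH = 3.6  # 1 MWh = 3600 MJ = 3.6 GJ


def efficiency_percent(fuel_input_gj_per_hr, electrical_output_mwh_per_hr):
    """Electrical output as a percentage of fuel heat input (assumed definition)."""
    output_gj_per_hr = electrical_output_mwh_per_hr * GJ_PER_MWH
    return 100.0 * output_gj_per_hr / fuel_input_gj_per_hr


def heat_rate_gj_per_mwh(fuel_input_gj_per_hr, electrical_output_mwh_per_hr):
    """Fuel energy needed per MWh generated; lower is better."""
    return fuel_input_gj_per_hr / electrical_output_mwh_per_hr


if __name__ == "__main__":
    # Illustrative plant: 1000 GJ/hr of fuel producing 100 MWh/hr.
    print(efficiency_percent(1000.0, 100.0))    # about 36 percent
    print(heat_rate_gj_per_mwh(1000.0, 100.0))  # 10 GJ per MWh
```

Under this definition, heat rate and efficiency carry the same information: efficiency in percent equals 360 divided by the heat rate in GJ/MWh.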
🎯 Possible Use Cases
You can use this dataset for:
Machine Learning
- Efficiency prediction (regression)
- Clustering plants by performance
- Predicting fuel consumption
- Analyzing factor impact using feature importance
Energy Analytics
- Heat rate analysis
- Boiler and turbine performance comparison
- Fuel-type-based performance trends
Education & Simulation
- Demonstrating power plant thermodynamics
- Teaching students how efficiency is calculated
- Building mock energy dashboards
📈 Why This Dataset Is Useful
- Clean and ready for ML/EDA
- Includes both categorical & continuous variables
- Simulates real-world engineering relationships
- Balanced complexity for machine learning projects
- Ideal for beginner to intermediate energy analytics tasks
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the mean household income for each of the five quintiles in Illinois, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.
Key observations
Figure: Mean household income by quintiles in Illinois (in 2022 inflation-adjusted dollars). Image: https://i.neilsberg.com/ch/illinois-mean-household-income-by-quintiles.jpeg
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Income Levels:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Illinois median household income.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the mean household income for each of the five quintiles in Mexico, MO, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.
Key observations
Figure: Mean household income by quintiles in Mexico, MO (in 2022 inflation-adjusted dollars). Image: https://i.neilsberg.com/ch/mexico-mo-mean-household-income-by-quintiles.jpeg
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Income Levels:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Mexico median household income.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the mean household income for each of the five quintiles in King County, WA, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.
Key observations
Figure: Mean household income by quintiles in King County, WA (in 2022 inflation-adjusted dollars). Image: https://i.neilsberg.com/ch/king-county-wa-mean-household-income-by-quintiles.jpeg
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Income Levels:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for King County median household income.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the mean household income for each of the five quintiles in Watertown, SD, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.
Key observations
Figure: Mean household income by quintiles in Watertown, SD (in 2022 inflation-adjusted dollars). Image: https://i.neilsberg.com/ch/watertown-sd-mean-household-income-by-quintiles.jpeg
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Income Levels:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Watertown median household income.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the mean household income for each of the five quintiles in Union Center, WI, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.
Key observations
Figure: Mean household income by quintiles in Union Center, WI (in 2022 inflation-adjusted dollars). Image: https://i.neilsberg.com/ch/union-center-wi-mean-household-income-by-quintiles.jpeg
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Income Levels:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Union Center median household income.