100+ datasets found

Data from: Data Science Problems Dataset
paperswithcode.com
Cite
Shubham Chandel; Colin B. Clement; Guillermo Serrato; Neel Sundaresan, Data Science Problems Dataset [Dataset]. https://paperswithcode.com/dataset/data-science-problems
Explore at:
Authors
Shubham Chandel; Colin B. Clement; Guillermo Serrato; Neel Sundaresan
Description
Evaluate a natural language code generation model on real data science pedagogical notebooks! Data Science Problems (DSP) includes well-posed data science problems in Markdown along with unit tests to verify correctness and a Docker environment for reproducible execution. About 1/3 of the notebooks in this benchmark also include data dependencies, so this benchmark can not only test a model's ability to chain together complex tasks but also evaluate its solutions on real data! See our paper Training and Evaluating a Jupyter Notebook Data Science Assistant for more details about state-of-the-art results and other properties of the dataset.
Data from: Problem Dataset
universe.roboflow.com
zip
Updated Dec 23, 2024
Cite
jin (2024). Problem Dataset [Dataset]. https://universe.roboflow.com/jin-cthqm/problem-tqqcx
Explore at:
Available download formats: zip
Dataset updated
Dec 23, 2024
Dataset authored and provided by
jin
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Problem Bounding Boxes
Description
Problem

## Overview
Problem is a dataset for object detection tasks. It contains Problem annotations for 2,923 images.

## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Replication Data for: Problem Importance across Time and Space: Updating the...
search.dataone.org
dataverse.harvard.edu
Updated Sep 24, 2024
Cite
Williams, Laron (2024). Replication Data for: Problem Importance across Time and Space: Updating the 'Most Important Problem Dataset' [Dataset]. http://doi.org/10.7910/DVN/NDMOFT
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/NDMOFT
Dataset updated
Sep 24, 2024
Dataset provided by
Harvard Dataverse
Authors
Williams, Laron
Description
This page contains the files necessary to reproduce all the empirical analysis found in the Journal of Elections, Public Opinion and Parties article.
data set for open-loop solution for a stochastic problem
ieee-dataport.org
Updated Apr 12, 2023
Cite
Andres Frederic (2023). data set for open-loop solution for a stochastic problem [Dataset]. https://ieee-dataport.org/documents/data-set-open-loop-solution-stochastic-problem-0
Explore at:
Dataset updated
Apr 12, 2023
Authors
Andres Frederic
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The focus of this dataset is to provide an open-loop solution for a stochastic problem with imperfect state information and chance constraints adjusted by an optimal gain.
Data from: Error-Level-Controlled Synthetic Forecasts for Renewable...
catalog.data.gov
data.openei.org
+1more
Updated Nov 30, 2023
Cite
National Renewable Energy Laboratory (NREL) (2023). Error-Level-Controlled Synthetic Forecasts for Renewable Generation [Dataset]. https://catalog.data.gov/dataset/error-level-controlled-synthetic-forecasts-for-renewable-generation
Explore at:
Dataset updated
Nov 30, 2023
Dataset provided by
National Renewable Energy Laboratory (NREL)
Description
Renewable energy resources, including solar and wind energy, play a significant role in sustainable energy systems. However, the inherent uncertainty and intermittency of renewable generation pose challenges to the safe and efficient operation of power systems. Recognizing the importance of short-term (hours ahead) renewable generation forecasting in power systems operation, it becomes crucial to address the potential inaccuracies in these forecasts. To systematically evaluate the performance of controllers in the presence of imperfect forecasts, we generate synthetic forecasts using actual renewable generation profiles (one from solar and one from wind). These synthetic forecasts incorporate different levels of statistical error, allowing us to control and manipulate the accuracy of the predictions. The primary objective is to employ synthetic forecasts with controlled yet realistic error levels to systematically investigate how controllers adapt to variations in forecast accuracy, providing valuable insights into their robustness and effectiveness under real-world conditions.
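A minimal sketch of one way to build a forecast with a controlled error level, assuming additive Gaussian noise rescaled to a target RMSE. The description above does not state the error model actually used for this dataset, so the function names, the sample profile, and the choice of RMSE as the error metric are all illustrative assumptions.

```python
import math
import random


def rmse(actual, forecast):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(
        sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    )


def synthetic_forecast(actual, target_rmse, seed=0):
    """Return `actual` plus Gaussian noise rescaled to hit `target_rmse`.

    Assumption: additive Gaussian noise. The dataset's real error model
    is not described in the listing.
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in actual]
    # Rescale the noise so its root-mean-square equals the target error.
    noise_rms = math.sqrt(sum(e * e for e in noise) / len(noise))
    scale = target_rmse / noise_rms if noise_rms > 0 else 0.0
    return [a + scale * e for a, e in zip(actual, noise)]


# Hypothetical hourly generation profile (arbitrary units).
profile = [0.0, 1.5, 3.0, 4.2, 3.1, 1.0]
for level in (0.1, 0.5, 1.0):
    forecast = synthetic_forecast(profile, level)
    print(level, round(rmse(profile, forecast), 6))
```

The sketch does not clip forecasts to non-negative values, because clipping would change the realized error. A realistic generator would have to handle that trade-off.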
Reduced Order Models Chapter - N.C. Clementi PhD Thesis (problem data set)
data.niaid.nih.gov
zenodo.org
Updated Feb 24, 2021
Cite
Natalia C. Clementi (2021). Reduced Order Models Chapter - N.C. Clementi PhD Thesis (problem data set) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4558104
Explore at:
Dataset updated
Feb 24, 2021
Dataset authored and provided by
Natalia C. Clementi
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Problem folders including all the input files necessary to reproduce the computations of the results related to the Reduced Order Models Chapter of N.C. Clementi PhD Thesis.
Housing Maintenance Code Complaints and Problems
data.cityofnewyork.us
s.cnmilf.com
+1more
application/rdfxml +5
Updated Jul 14, 2025
Cite
Department of Housing Preservation & Development (HPD) (2025). Housing Maintenance Code Complaints and Problems [Dataset]. https://data.cityofnewyork.us/Housing-Development/Housing-Maintenance-Code-Complaints-and-Problems/ygpa-z7cr
Explore at:
Available download formats: xml, tsv, csv, application/rssxml, application/rdfxml, json
Dataset updated
Jul 14, 2025
Dataset provided by
New York City Department of Housing Preservation and Development
Authors
Department of Housing Preservation & Development (HPD)
Description
The Department of Housing Preservation and Development (HPD) records complaints that are made by the public for conditions which violate the New York City Housing Maintenance Code (HMC) or the New York State Multiple Dwelling Law (MDL).
United States SBOI: sa: Most Pressing Problem: A Year Ago: Others
ceicdata.com
Updated Mar 21, 2021
Cite
CEICdata.com (2021). United States SBOI: sa: Most Pressing Problem: A Year Ago: Others [Dataset]. https://www.ceicdata.com/en/united-states/nfib-index-of-small-business-optimism/sboi-sa-most-pressing-problem-a-year-ago-others
Explore at:
Dataset updated
Mar 21, 2021
Dataset provided by
CEIC Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Mar 1, 2024 - Feb 1, 2025
Area covered
United States
Variables measured
Business Confidence Survey
Description
United States SBOI: sa: Most Pressing Problem: A Year Ago: Others data was reported at 5.000 % in Mar 2025. This records a decrease from the previous number of 6.000 % for Feb 2025. The data is updated monthly, with a median of 7.000 % from Jan 2014 to Mar 2025 across 131 observations. It reached an all-time high of 11.000 % in May 2023 and a record low of 3.000 % in Jul 2024. The series remains active in CEIC and is reported by the National Federation of Independent Business. The data is categorized under Global Database’s United States – Table US.S042: NFIB Index of Small Business Optimism. [COVID-19-IMPACT]
Data from: Randomly generated problems for the complexity resolution problem...
dataverse.harvard.edu
dataone.org
Updated Apr 6, 2020
Cite
Mohamed Ossama Hassan; Antoine Saucier; Soumaya Yacout; Francois Soumis (2020). Randomly generated problems for the complexity resolution problem in a multi sector planning context [Dataset]. http://doi.org/10.7910/DVN/II5JZG
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.7910/DVN/II5JZG
Dataset updated
Apr 6, 2020
Dataset provided by
Harvard Dataverse
Authors
Mohamed Ossama Hassan; Antoine Saucier; Soumaya Yacout; Francois Soumis
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
All the randomly generated problems in this data set involve a number A of aircraft passing through a square multi-sector area (MSA) of side 600 km. This MSA is composed of four square adjacent sectors of side 300 km. The aircraft use four different flight levels that belong to the same MSA. The aircraft trajectories are randomly generated in such a way that all aircraft are either flying from bottom to upper MSA borders, or from left to right borders. Taking the origin at the bottom left corner of the MSA, the distance between the first waypoint and the origin is randomly generated using the continuous uniform distribution U[75 km, 595 km]. Each trajectory is composed of three waypoints located on the MSA edges. The first waypoint is located on either the bottom or the left MSA border. The other two waypoints are generated randomly along the opposing sector borders using a uniform distribution. The cruise speeds of the aircraft are randomly generated using the continuous uniform distribution U[458 knots, 506 knots]. The time at which the aircraft enters the MSA follows the continuous uniform distribution U[20 min, 90 min]. The flight level used for each trajectory is randomly generated using a discrete uniform distribution U{1, K}. A constant flight level is used by 90% of the aircraft. The others undergo one flight level change at the internal boundary. For these aircraft, the second flight level is randomly generated using U{1, K} while excluding the first sector flight level.
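The sampling rules in this description can be sketched roughly as follows. This is an illustration, not the authors' generator. It assumes that the entry border is chosen with equal probability, that the 90% share of constant-level aircraft is applied as a per-aircraft probability, and that K is at least 2. The function and field names are hypothetical. The two remaining waypoints are left out because the listing does not fully specify their geometry.

```python
import random


def sample_aircraft(num_levels, rng):
    """Draw one aircraft's parameters from the distributions in the description.

    `num_levels` plays the role of K and must be at least 2.
    """
    first_level = rng.randint(1, num_levels)  # discrete uniform U{1, K}
    # Assumption: each aircraft independently keeps its level with probability 0.9.
    if rng.random() < 0.9:
        second_level = first_level
    else:
        # Second level drawn from U{1, K}, excluding the first-sector level.
        others = [lv for lv in range(1, num_levels + 1) if lv != first_level]
        second_level = rng.choice(others)
    return {
        # Assumption: bottom and left entry borders are equally likely.
        "entry_border": rng.choice(["bottom", "left"]),
        "first_waypoint_km": rng.uniform(75.0, 595.0),   # U[75 km, 595 km]
        "cruise_speed_knots": rng.uniform(458.0, 506.0),  # U[458, 506] knots
        "entry_time_min": rng.uniform(20.0, 90.0),        # U[20 min, 90 min]
        "flight_levels": (first_level, second_level),
    }


rng = random.Random(42)
# K = 4 matches the "four different flight levels" in the description.
fleet = [sample_aircraft(4, rng) for _ in range(5)]
for aircraft in fleet:
    print(aircraft)
```

Passing a seeded `random.Random` instance keeps a generated instance reproducible, which matters when the problems are meant to be shared as a benchmark.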
Baseline Data on Students Problem-Solving (PS) Skills: Physics Concept and...
figshare.com
xlsx
Updated Nov 25, 2024
Cite
Theophile Musengimana (2024). Baseline Data on Students Problem-Solving (PS) Skills: Physics Concept and PS Steps Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.27901977.v1
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.27901977.v1
Dataset updated
Nov 25, 2024
Dataset provided by
figshare (http://figshare.com/)
Authors
Theophile Musengimana
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset has two files. One file contains students' marks for each question (physics concept). The other file contains an analysis of how students followed each of the seven PS steps.
Descriptive statistics of sexual violence victim-survivors in the Crime...
plos.figshare.com
xls
Updated Jan 14, 2025
Cite
Estela Capelas Barbosa; Niels Blom; Annie Bunce (2025). Descriptive statistics of sexual violence victim-survivors in the Crime Survey for England and Wales (CSEW) and Rape Crisis England & Wales (RCEW) datasets. [Dataset]. http://doi.org/10.1371/journal.pone.0301155.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0301155.t001
Dataset updated
Jan 14, 2025
Dataset provided by
PLOS ONE
Authors
Estela Capelas Barbosa; Niels Blom; Annie Bunce
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Descriptive statistics of sexual violence victim-survivors in the Crime Survey for England and Wales (CSEW) and Rape Crisis England & Wales (RCEW) datasets.
BWFLnet + data
data.mendeley.com
Updated Apr 24, 2019
Cite
Alexander Waldron (2019). BWFLnet + data [Dataset]. http://doi.org/10.17632/srt4vr5k38.1
Explore at:
Unique identifier
https://doi.org/10.17632/srt4vr5k38.1
Dataset updated
Apr 24, 2019
Authors
Alexander Waldron
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is supplementary data to "Parameter Estimation for Water Distribution Networks with Multiple Head Loss Formulae" in the ASCE Journal of Water Resources Planning and Management. Any use of this dataset must credit the authors.

BWFLnet is an operational network in Bristol, UK, operated by Bristol Water. The data provided is the product of a long-term research partnership between Bristol Water and Infrasense Labs at Imperial College London. All data provided is genuine recorded data with locations and names anonymised. The authors hope that the publication of this dataset can be a useful contribution for hydraulic model calibration as well as wider research purposes in the water distribution sector.
Data from: Peer-to-Peer Data Mining, Privacy Issues, and Games
s.cnmilf.com
data.staging.idas-ds1.appdat.jsc.nasa.gov
+3more
Updated Apr 10, 2025
Cite
Dashlink (2025). Peer-to-Peer Data Mining, Privacy Issues, and Games [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/peer-to-peer-data-mining-privacy-issues-and-games
Explore at:
Dataset updated
Apr 10, 2025
Dataset provided by
Dashlink
Description
Peer-to-Peer (P2P) networks are gaining increasing popularity in many distributed applications such as file-sharing, network storage, web caching, searching and indexing of relevant documents and P2P network-threat analysis. Many of these applications require scalable analysis of data over a P2P network. This paper starts by offering a brief overview of distributed data mining applications and algorithms for P2P environments. Next it discusses some of the privacy concerns with P2P data mining and points out the problems of existing privacy-preserving multi-party data mining techniques. It further points out that most of the nice assumptions of these existing privacy preserving techniques fall apart in real-life applications of privacy-preserving distributed data mining (PPDM). The paper offers a more realistic formulation of the PPDM problem as a multi-party game and points out some recent results.
sorted_ES
data.mendeley.com
Updated Nov 2, 2018
Cite
Qinzhuo Liao (2018). sorted_ES [Dataset]. http://doi.org/10.17632/nzkrxf74d4.1
Explore at:
Unique identifier
https://doi.org/10.17632/nzkrxf74d4.1
Dataset updated
Nov 2, 2018
Authors
Qinzhuo Liao
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Matlab codes for examples
Development Economics Data Group - Score on action when a problem arose |...
gimi9.com
Updated May 7, 2025
Cite
(2025). Development Economics Data Group - Score on action when a problem arose | gimi9.com [Dataset]. https://gimi9.com/dataset/worldbank_wb_es_t_mgmt2/
Explore at:
Dataset updated
May 7, 2025
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Score on Action When a Problem Arises represents a measurement of how establishments respond to issues during the production process, encompassing actions taken to rectify problems and prevent future occurrences.
Data and source code for "Automating Intention Mining"
researchdata.smu.edu.sg
zip
Updated Jun 4, 2023
Cite
Qiao HUANG; Xin XIA; David LO; Gail C. MURPHY (2023). Data and source code for "Automating Intention Mining" [Dataset]. http://doi.org/10.25440/smu.21261408.v1
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.25440/smu.21261408.v1
Dataset updated
Jun 4, 2023
Dataset provided by
SMU Research Data Repository (RDR)
Authors
Qiao HUANG; Xin XIA; David LO; Gail C. MURPHY
License
http://rightsstatements.org/vocab/InC/1.0/
Description
The dataset and source code for the paper "Automating Intention Mining".

The code is based on dennybritz's implementation of Yoon Kim's paper Convolutional Neural Networks for Sentence Classification.

By default, the code uses TensorFlow 0.12. Other versions of TensorFlow may report errors because some APIs are incompatible.

Running 'online_prediction.py' lets you input any sentence and check the classification result produced by a pre-trained CNN model. The model uses all sentences of the four GitHub projects as training data.

Running 'play.py' produces the evaluation result of cross-project prediction. Please check the code for more details of the configuration. By default, it uses the four GitHub projects as training data to predict the sentences in the DECA dataset; in this setting, the categories 'aspect evaluation' and 'others' are dropped because the DECA dataset does not contain them.
Italy: privacy concerns regarding personal data on the internet, by issue
statista.com
Updated Jul 11, 2025
Cite
Statista (2025). Italy: privacy concerns regarding personal data on the internet, by issue [Dataset]. https://www.statista.com/statistics/830088/privacy-concerns-regarding-personal-data-on-the-internet-in-italy/
Explore at:
Dataset updated
Jul 11, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jul 2016 - Aug 2016
Area covered
Italy
Description
This statistic shows the results of a 2016 survey in Italy on the share of individuals expressing privacy concerns about their personal data on the internet. During the survey period, **** percent of respondents reported that using the internet exposes everyone to being tracked, while **** percent stated that privacy was not a real problem.
Replication Data for: Exact algorithms for a parallel machine scheduling...
portal.odissei.nl
dataverse.nl
Updated Mar 26, 2025
Cite
Giulia Caselli; Maxence Delorme; Manuel Iori; Carlo Alberto Magni (2025). Replication Data for: Exact algorithms for a parallel machine scheduling problem with workforce and contiguity constraints [Dataset]. http://doi.org/10.34894/LME3DH
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.34894/LME3DH
Dataset updated
Mar 26, 2025
Dataset provided by
ODISSEI Portal
Authors
Giulia Caselli; Maxence Delorme; Manuel Iori; Carlo Alberto Magni
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This repository contains the instances used in the paper "Exact algorithms for a parallel machine scheduling problem with workforce and contiguity constraints" by Giulia Caselli, Maxence Delorme, Manuel Iori, and Carlo Alberto Magni.
Research Data for Gaming and Problem-Solving: Enhancing Critical Thinking
research-l8qya.lolm.eu.org
scholar-lgztf.lolm.eu.org
csv, json
Updated Jul 18, 2025
Cite
Dr. Daniel Hall (2025). Research Data for Gaming and Problem-Solving: Enhancing Critical Thinking [Dataset]. https://research-l8qya.lolm.eu.org/en/key=xgr3f6604130/
Explore at:
Available download formats: json, csv
Dataset updated
Jul 18, 2025
Authors
Dr. Daniel Hall
Variables measured
Variable A, Variable B, Variable C, Correlation Index, Statistical Significance
Description
Complete dataset used in the research study on Gaming and Problem-Solving: Enhancing Critical Thinking by Dr. Daniel Hall
Data from: A Distributed Approach to System-Level Prognostics
catalog.data.gov
datasets.ai
+2more
Updated Apr 11, 2025
Cite
Dashlink (2025). A Distributed Approach to System-Level Prognostics [Dataset]. https://catalog.data.gov/dataset/a-distributed-approach-to-system-level-prognostics
Explore at:
Dataset updated
Apr 11, 2025
Dataset provided by
Dashlink
Description
Prognostics, which deals with predicting remaining useful life of components, subsystems, and systems, is a key technology for systems health management that leads to improved safety and reliability with reduced costs. The prognostics problem is often approached from a component-centric view. However, in most cases, it is not specifically component lifetimes that are important, but, rather, the lifetimes of the systems in which these components reside. The system-level prognostics problem can be quite difficult due to the increased scale and scope of the prognostics problem and the relative lack of scalability and efficiency of typical prognostics approaches. In order to address these issues, we develop a distributed solution to the system-level prognostics problem, based on the concept of structural model decomposition. The system model is decomposed into independent submodels. Independent local prognostics subproblems are then formed based on these local submodels, resulting in a scalable, efficient, and flexible distributed approach to the system-level prognostics problem. We provide a formulation of the system-level prognostics problem and demonstrate the approach on a four-wheeled rover simulation testbed. The results show that the system-level prognostics problem can be accurately and efficiently solved in a distributed fashion.
