100+ datasets found
  1. Data from: Data Science Problems Dataset

    • paperswithcode.com
    + more versions
    Cite
    Shubham Chandel; Colin B. Clement; Guillermo Serrato; Neel Sundaresan, Data Science Problems Dataset [Dataset]. https://paperswithcode.com/dataset/data-science-problems
    Explore at:
    Authors
    Shubham Chandel; Colin B. Clement; Guillermo Serrato; Neel Sundaresan
    Description

    Evaluate a natural language code generation model on real data science pedagogical notebooks! Data Science Problems (DSP) includes well-posed data science problems in Markdown along with unit tests to verify correctness and a Docker environment for reproducible execution. About 1/3 of notebooks in this benchmark also include data dependencies, so this benchmark not only can test a model's ability to chain together complex tasks, but also evaluate the solutions on real data! See our paper Training and Evaluating a Jupyter Notebook Data Science Assistant for more details about state of the art results and other properties of the dataset.

  2. Data from: Problem Dataset

    • universe.roboflow.com
    zip
    Updated Dec 23, 2024
    + more versions
    Cite
    jin (2024). Problem Dataset [Dataset]. https://universe.roboflow.com/jin-cthqm/problem-tqqcx
    Explore at:
    zip (available download formats)
    Dataset updated
    Dec 23, 2024
    Dataset authored and provided by
    jin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Problem Bounding Boxes
    Description

    Problem

    ## Overview
    
    Problem is a dataset for object detection tasks - it contains Problem annotations for 2,923 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  3. Replication Data for: Problem Importance across Time and Space: Updating the...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Sep 24, 2024
    Cite
    Williams, Laron (2024). Replication Data for: Problem Importance across Time and Space: Updating the 'Most Important Problem Dataset' [Dataset]. http://doi.org/10.7910/DVN/NDMOFT
    Explore at:
    Dataset updated
    Sep 24, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Williams, Laron
    Description

    This page contains the files necessary to reproduce all the empirical analysis found in the Journal of Elections, Public Opinion and Parties article.

  4. data set for open-loop solution for a stochastic problem

    • ieee-dataport.org
    Updated Apr 12, 2023
    + more versions
    Cite
    Andres Frederic (2023). data set for open-loop solution for a stochastic problem [Dataset]. https://ieee-dataport.org/documents/data-set-open-loop-solution-stochastic-problem-0
    Explore at:
    Dataset updated
    Apr 12, 2023
    Authors
    Andres Frederic
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The focus of this dataset is to provide an open-loop solution for a stochastic problem with imperfect state information and chance constraints adjusted by an optimal gain.

  5. Data from: Error-Level-Controlled Synthetic Forecasts for Renewable...

    • catalog.data.gov
    • data.openei.org
    • +1 more
    Updated Nov 30, 2023
    Cite
    National Renewable Energy Laboratory (NREL) (2023). Error-Level-Controlled Synthetic Forecasts for Renewable Generation [Dataset]. https://catalog.data.gov/dataset/error-level-controlled-synthetic-forecasts-for-renewable-generation
    Explore at:
    Dataset updated
    Nov 30, 2023
    Dataset provided by
    National Renewable Energy Laboratory (NREL)
    Description

    Renewable energy resources, including solar and wind energy, play a significant role in sustainable energy systems. However, the inherent uncertainty and intermittency of renewable generation pose challenges to the safe and efficient operation of power systems. Recognizing the importance of short-term (hours ahead) renewable generation forecasting in power systems operation, it becomes crucial to address the potential inaccuracies in these forecasts. To systematically evaluate the performance of controllers in the presence of imperfect forecasts, we generate synthetic forecasts using actual renewable generation profiles (one from solar and one from wind). These synthetic forecasts incorporate different levels of statistical error, allowing us to control and manipulate the accuracy of the predictions. The primary objective is to employ synthetic forecasts with controlled yet realistic error levels to systematically investigate how controllers adapt to variations in forecast accuracy, providing valuable insights into their robustness and effectiveness under real-world conditions.
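
The description does not say how the statistical error is injected. As a rough illustration only, the sketch below assumes additive Gaussian noise rescaled so that the forecast's root-mean-square error matches a chosen level exactly. The names `synthetic_forecast` and `rmse`, the noise model, and the sample profile are all hypothetical and are not taken from the NREL dataset.

```python
import math
import random


def rmse(actual, forecast):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def synthetic_forecast(actual, target_rmse, seed=0):
    """Return actual plus Gaussian noise rescaled to hit target_rmse.

    Hypothetical helper: the noise model and the exact rescaling are
    assumptions made for illustration, not the published procedure.
    """
    rng = random.Random(seed)
    # Draw noise from a zero-mean, unit-variance Gaussian.
    noise = [rng.gauss(0.0, 1.0) for _ in actual]
    # Rescale so the realized RMS of the noise equals the requested level.
    noise_rms = math.sqrt(sum(e * e for e in noise) / len(noise))
    scale = target_rmse / noise_rms
    return [a + scale * e for a, e in zip(actual, noise)]


# Made-up generation profile (MW), for illustration only.
actual = [0.0, 10.0, 25.0, 40.0, 30.0, 12.0, 0.0]
for level in (1.0, 5.0, 10.0):
    forecast = synthetic_forecast(actual, level, seed=42)
    print(level, round(rmse(actual, forecast), 6))
```

The sketch does not clip forecasts to physical limits (for example, non-negative output), since clipping would change the realized error level; a real generator would have to handle that trade-off.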

  6. Reduced Order Models Chapter - N.C. Clementi PhD Thesis (problem data set)

    • data.niaid.nih.gov
    • zenodo.org
    Updated Feb 24, 2021
    Cite
    Natalia C. Clementi (2021). Reduced Order Models Chapter - N.C. Clementi PhD Thesis (problem data set) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4558104
    Explore at:
    Dataset updated
    Feb 24, 2021
    Dataset authored and provided by
    Natalia C. Clementi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Problem folders including all the input files necessary to reproduce the computations of the results related to the Reduced Order Models Chapter of N.C. Clementi PhD Thesis.

  7. Housing Maintenance Code Complaints and Problems

    • data.cityofnewyork.us
    • s.cnmilf.com
    • +1 more
    application/rdfxml +5
    Updated Jul 14, 2025
    Cite
    Department of Housing Preservation & Development (HPD) (2025). Housing Maintenance Code Complaints and Problems [Dataset]. https://data.cityofnewyork.us/Housing-Development/Housing-Maintenance-Code-Complaints-and-Problems/ygpa-z7cr
    Explore at:
    xml, tsv, csv, application/rssxml, application/rdfxml, json (available download formats)
    Dataset updated
    Jul 14, 2025
    Dataset provided by
    New York City Department of Housing Preservation and Development
    Authors
    Department of Housing Preservation & Development (HPD)
    Description

    The Department of Housing Preservation and Development (HPD) records complaints that are made by the public for conditions which violate the New York City Housing Maintenance Code (HMC) or the New York State Multiple Dwelling Law (MDL).

  8. United States SBOI: sa: Most Pressing Problem: A Year Ago: Others

    • ceicdata.com
    Updated Mar 21, 2021
    Cite
    CEICdata.com (2021). United States SBOI: sa: Most Pressing Problem: A Year Ago: Others [Dataset]. https://www.ceicdata.com/en/united-states/nfib-index-of-small-business-optimism/sboi-sa-most-pressing-problem-a-year-ago-others
    Explore at:
    Dataset updated
    Mar 21, 2021
    Dataset provided by
    CEIC Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Mar 1, 2024 - Feb 1, 2025
    Area covered
    United States
    Variables measured
    Business Confidence Survey
    Description

    United States SBOI: sa: Most Pressing Problem: A Year Ago: Others was reported at 5.000 % in Mar 2025, a decrease from 6.000 % in Feb 2025. The series is updated monthly, with a median of 7.000 % from Jan 2014 to Mar 2025 across 131 observations. It reached an all-time high of 11.000 % in May 2023 and a record low of 3.000 % in Jul 2024. The series remains active in CEIC and is reported by the National Federation of Independent Business. It is categorized under Global Database’s United States – Table US.S042: NFIB Index of Small Business Optimism. [COVID-19-IMPACT]

  9. Data from: Randomly generated problems for the complexity resolution problem...

    • dataverse.harvard.edu
    • dataone.org
    Updated Apr 6, 2020
    Cite
    Mohamed Ossama Hassan; Antoine Saucier; Soumaya Yacout; Francois Soumis (2020). Randomly generated problems for the complexity resolution problem in a multi sector planning context [Dataset]. http://doi.org/10.7910/DVN/II5JZG
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 6, 2020
    Dataset provided by
    Harvard Dataverse
    Authors
    Mohamed Ossama Hassan; Antoine Saucier; Soumaya Yacout; Francois Soumis
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    All the randomly generated problems in this data set involve a number A of aircraft passing through a square multi-sector area (MSA) of side 600 km. This MSA is composed of four square adjacent sectors of side 300 km. The aircraft use four different flight levels that belong to the same MSA. The aircraft trajectories are randomly generated in such a way that all aircraft are either flying from bottom to upper MSA borders, or from left to right borders. Taking the origin at the bottom left corner of the MSA, the distance between the first waypoint and the origin is randomly generated using the continuous uniform distribution U[75 km, 595 km]. Each trajectory is composed of three waypoints located on the MSA edges. The first waypoint is located on either the bottom or the left MSA border. The other two waypoints are generated randomly along the opposing sector borders using a uniform distribution. The cruise speeds of the aircraft are randomly generated using the continuous uniform distribution U[458 knots, 506 knots]. The time at which the aircraft enters the MSA follows the continuous uniform distribution U[20 min, 90 min]. The flight level used for each trajectory is randomly generated using a discrete uniform distribution U{1, K}. A constant flight level is used by 90% of the aircraft. The others undergo one flight level change at the internal boundary. For these aircraft, the second flight level is randomly generated using U{1, K} while excluding the first sector flight level.
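
The sampling rules in this description can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' generator: the names `Trajectory`, `generate_trajectory`, and `generate_problem` are hypothetical, and several details the description leaves open are filled in with labeled assumptions (both flight directions equally likely, the second and third waypoints uniform over the full 600 km border, K = 4 levels by default, and each aircraft independently keeping its level with probability 0.9).

```python
import random
from dataclasses import dataclass

MSA_SIDE_KM = 600.0     # side of the square multi-sector area
SECTOR_SIDE_KM = 300.0  # side of each of the four sectors


@dataclass
class Trajectory:
    waypoints: list        # three (x, y) points in km, origin at bottom-left
    speed_knots: float
    entry_time_min: float
    flight_levels: tuple   # (level before, level after the internal boundary)


def generate_trajectory(rng, num_levels=4):
    """Sample one aircraft trajectory following the stated distributions."""
    first = rng.uniform(75.0, 595.0)  # distance of first waypoint from origin
    # Assumption: later waypoints are uniform over the whole border length.
    mid = rng.uniform(0.0, MSA_SIDE_KM)
    last = rng.uniform(0.0, MSA_SIDE_KM)
    if rng.random() < 0.5:  # assumption: directions are equally likely
        # Bottom-to-top flight: enter on the bottom border.
        waypoints = [(first, 0.0), (mid, SECTOR_SIDE_KM), (last, MSA_SIDE_KM)]
    else:
        # Left-to-right flight: enter on the left border.
        waypoints = [(0.0, first), (SECTOR_SIDE_KM, mid), (MSA_SIDE_KM, last)]

    level_1 = rng.randint(1, num_levels)  # discrete uniform U{1, K}
    level_2 = level_1
    # About 10% of aircraft change level once, to a different level.
    if num_levels > 1 and rng.random() >= 0.9:
        level_2 = rng.choice([k for k in range(1, num_levels + 1) if k != level_1])

    return Trajectory(
        waypoints=waypoints,
        speed_knots=rng.uniform(458.0, 506.0),
        entry_time_min=rng.uniform(20.0, 90.0),
        flight_levels=(level_1, level_2),
    )


def generate_problem(num_aircraft, num_levels=4, seed=0):
    """Generate one problem instance with num_aircraft trajectories."""
    rng = random.Random(seed)
    return [generate_trajectory(rng, num_levels) for _ in range(num_aircraft)]


problem = generate_problem(num_aircraft=20, seed=1)
print(len(problem), problem[0].flight_levels)
```

Passing a seed keeps instances reproducible, which matters when the same randomly generated problems are reused to compare solution methods.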

  10. Baseline Data on Students Problem-Solving (PS) Skills: Physics Concept and...

    • figshare.com
    xlsx
    Updated Nov 25, 2024
    Cite
    Theophile Musengimana (2024). Baseline Data on Students Problem-Solving (PS) Skills: Physics Concept and PS Steps Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.27901977.v1
    Explore at:
    xlsx (available download formats)
    Dataset updated
    Nov 25, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Theophile Musengimana
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset has two files. One file contains the students' marks for each question (physics concept). The other contains an analysis of how students followed each of the seven PS steps.

  11. Descriptive statistics of sexual violence victim-survivors in the Crime...

    • plos.figshare.com
    xls
    Updated Jan 14, 2025
    Cite
    Estela Capelas Barbosa; Niels Blom; Annie Bunce (2025). Descriptive statistics of sexual violence victim-survivors in the Crime Survey for England and Wales (CSEW) and Rape Crisis England & Wales (RCEW) datasets. [Dataset]. http://doi.org/10.1371/journal.pone.0301155.t001
    Explore at:
    xls (available download formats)
    Dataset updated
    Jan 14, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Estela Capelas Barbosa; Niels Blom; Annie Bunce
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Descriptive statistics of sexual violence victim-survivors in the Crime Survey for England and Wales (CSEW) and Rape Crisis England & Wales (RCEW) datasets.

  12. BWFLnet + data

    • data.mendeley.com
    Updated Apr 24, 2019
    + more versions
    Cite
    Alexander Waldron (2019). BWFLnet + data [Dataset]. http://doi.org/10.17632/srt4vr5k38.1
    Explore at:
    Dataset updated
    Apr 24, 2019
    Authors
    Alexander Waldron
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is supplementary data to "Parameter Estimation for Water Distribution Networks with Multiple Head Loss Formulae" in ASCE Journal of Water Resources and Planning Management. Any use of this dataset must credit the authors.

    BWFLnet is an operational network in Bristol, UK, operated by Bristol Water. The data provided is the product of a long-term research partnership between Bristol Water and Infrasense Labs at Imperial College London. All data provided is genuine recorded data with locations and names anonymised. The authors hope that the publication of this dataset can be a useful contribution for hydraulic model calibration as well as wider research purposes in the water distribution sector.

  13. Data from: Peer-to-Peer Data Mining, Privacy Issues, and Games

    • s.cnmilf.com
    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • +3 more
    Updated Apr 10, 2025
    + more versions
    Cite
    Dashlink (2025). Peer-to-Peer Data Mining, Privacy Issues, and Games [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/peer-to-peer-data-mining-privacy-issues-and-games
    Explore at:
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    Peer-to-Peer (P2P) networks are gaining increasing popularity in many distributed applications such as file-sharing, network storage, web caching, searching and indexing of relevant documents and P2P network-threat analysis. Many of these applications require scalable analysis of data over a P2P network. This paper starts by offering a brief overview of distributed data mining applications and algorithms for P2P environments. Next it discusses some of the privacy concerns with P2P data mining and points out the problems of existing privacy-preserving multi-party data mining techniques. It further points out that most of the nice assumptions of these existing privacy preserving techniques fall apart in real-life applications of privacy-preserving distributed data mining (PPDM). The paper offers a more realistic formulation of the PPDM problem as a multi-party game and points out some recent results.

  14. sorted_ES

    • data.mendeley.com
    Updated Nov 2, 2018
    Cite
    Qinzhuo Liao (2018). sorted_ES [Dataset]. http://doi.org/10.17632/nzkrxf74d4.1
    Explore at:
    Dataset updated
    Nov 2, 2018
    Authors
    Qinzhuo Liao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Matlab codes for examples

  15. Development Economics Data Group - Score on action when a problem arose |...

    • gimi9.com
    Updated May 7, 2025
    + more versions
    Cite
    (2025). Development Economics Data Group - Score on action when a problem arose | gimi9.com [Dataset]. https://gimi9.com/dataset/worldbank_wb_es_t_mgmt2/
    Explore at:
    Dataset updated
    May 7, 2025
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Score on Action When a Problem Arises represents a measurement of how establishments respond to issues during the production process, encompassing actions taken to rectify problems and prevent future occurrences.

  16. Data and source code for "Automating Intention Mining"

    • researchdata.smu.edu.sg
    zip
    Updated Jun 4, 2023
    Cite
    Qiao HUANG; Xin XIA; David LO; Gail C. MURPHY (2023). Data and source code for "Automating Intention Mining" [Dataset]. http://doi.org/10.25440/smu.21261408.v1
    Explore at:
    zip (available download formats)
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    SMU Research Data Repository (RDR)
    Authors
    Qiao HUANG; Xin XIA; David LO; Gail C. MURPHY
    License

    http://rightsstatements.org/vocab/InC/1.0/

    Description

    The dataset and source code for the paper "Automating Intention Mining".

    The code is based on dennybritz's implementation of Yoon Kim's paper Convolutional Neural Networks for Sentence Classification.

    By default, the code uses TensorFlow 0.12. Other versions of TensorFlow may report errors because some APIs are incompatible.

    Running 'online_prediction.py' lets you input any sentence and check the classification result produced by a pre-trained CNN model. The model uses all sentences of the four GitHub projects as training data.

    Running 'play.py' gives the evaluation result of cross-project prediction. Please check the code for more details of the configuration. By default, it uses the four GitHub projects as training data to predict the sentences in the DECA dataset; in this setting, the categories 'aspect evaluation' and 'others' are dropped because the DECA dataset does not contain them.

  17. Italy: privacy concerns regarding personal data on the internet, by issue

    • statista.com
    Updated Jul 11, 2025
    Cite
    Statista (2025). Italy: privacy concerns regarding personal data on the internet, by issue [Dataset]. https://www.statista.com/statistics/830088/privacy-concerns-regarding-personal-data-on-the-internet-in-italy/
    Explore at:
    Dataset updated
    Jul 11, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jul 2016 - Aug 2016
    Area covered
    Italy
    Description

    This statistic displays the results of a survey on the share of individuals expressing privacy concerns regarding their personal data on the internet in Italy in 2016. During the survey period, **** percent of the respondents reported that using the internet exposes everyone to being tracked and followed, while **** percent stated that privacy was not a real problem.

  18. Replication Data for: Exact algorithms for a parallel machine scheduling...

    • portal.odissei.nl
    • dataverse.nl
    Updated Mar 26, 2025
    Cite
    Giulia Caselli; Maxence Delorme; Manuel Iori; Carlo Alberto Magni (2025). Replication Data for: Exact algorithms for a parallel machine scheduling problem with workforce and contiguity constraints [Dataset]. http://doi.org/10.34894/LME3DH
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 26, 2025
    Dataset provided by
    ODISSEI Portal
    Authors
    Giulia Caselli; Maxence Delorme; Manuel Iori; Carlo Alberto Magni
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This repository contains the instances used in the paper "Exact algorithms for a parallel machine scheduling problem with workforce and contiguity constraints" by Giulia Caselli, Maxence Delorme, Manuel Iori, and Carlo Alberto Magni.

  19. Research Data for Gaming and Problem-Solving: Enhancing Critical Thinking

    • research-l8qya.lolm.eu.org
    • scholar-lgztf.lolm.eu.org
    csv, json
    Updated Jul 18, 2025
    Cite
    Dr. Daniel Hall (2025). Research Data for Gaming and Problem-Solving: Enhancing Critical Thinking [Dataset]. https://research-l8qya.lolm.eu.org/en/key=xgr3f6604130/
    Explore at:
    json, csv (available download formats)
    Dataset updated
    Jul 18, 2025
    Authors
    Dr. Daniel Hall
    Variables measured
    Variable A, Variable B, Variable C, Correlation Index, Statistical Significance
    Description

    Complete dataset used in the research study on Gaming and Problem-Solving: Enhancing Critical Thinking by Dr. Daniel Hall

  20. Data from: A Distributed Approach to System-Level Prognostics

    • catalog.data.gov
    • datasets.ai
    • +2 more
    Updated Apr 11, 2025
    Cite
    Dashlink (2025). A Distributed Approach to System-Level Prognostics [Dataset]. https://catalog.data.gov/dataset/a-distributed-approach-to-system-level-prognostics
    Explore at:
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Description

    Prognostics, which deals with predicting remaining useful life of components, subsystems, and systems, is a key technology for systems health management that leads to improved safety and reliability with reduced costs. The prognostics problem is often approached from a component-centric view. However, in most cases, it is not specifically component lifetimes that are important, but, rather, the lifetimes of the systems in which these components reside. The system-level prognostics problem can be quite difficult due to the increased scale and scope of the prognostics problem and the relative lack of scalability and efficiency of typical prognostics approaches. In order to address these issues, we develop a distributed solution to the system-level prognostics problem, based on the concept of structural model decomposition. The system model is decomposed into independent submodels. Independent local prognostics subproblems are then formed based on these local submodels, resulting in a scalable, efficient, and flexible distributed approach to the system-level prognostics problem. We provide a formulation of the system-level prognostics problem and demonstrate the approach on a four-wheeled rover simulation testbed. The results show that the system-level prognostics problem can be accurately and efficiently solved in a distributed fashion.
