100+ datasets found

SIAM 2007 Text Mining Competition dataset
catalog.data.gov
data.nasa.gov
+2more
Updated Apr 11, 2025
Cite
Dashlink (2025). SIAM 2007 Text Mining Competition dataset [Dataset]. https://catalog.data.gov/dataset/siam-2007-text-mining-competition-dataset
Explore at:
Dataset updated
Apr 11, 2025
Dataset provided by
Dashlink
Description
Subject Area: Text Mining
Description: This is the dataset used for the SIAM 2007 Text Mining competition. This competition focused on developing text mining algorithms for document classification. The documents in question were aviation safety reports that documented one or more problems that occurred during certain flights. The goal was to label the documents with respect to the types of problems that were described. This is a subset of the Aviation Safety Reporting System (ASRS) dataset, which is publicly available.
How Data Was Acquired: The data for this competition came from human-generated reports on incidents that occurred during a flight.
Sample Rates, Parameter Description, and Format: There is one document per incident. The datasets are in raw text format. All documents for each set are contained in a single file. Each row in this file corresponds to a single document. Each line begins with the document number, and a tilde separates the document number from the text itself.
Anomalies/Faults: This is a document category classification problem.
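The file layout described above (one document per row, document number first, a tilde before the text) could be read with a few lines of Python. This is only a sketch under stated assumptions: the function name `parse_documents` and the two sample rows are made up for illustration and are not taken from the dataset.

```python
from typing import Iterable, Iterator, Tuple


def parse_documents(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (document_number, text) pairs from tilde-delimited rows."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank rows
        # Split on the first tilde only, so tildes inside the report text survive.
        doc_id, sep, text = line.partition("~")
        if not sep:
            raise ValueError(f"no tilde separator in row: {line[:40]!r}")
        yield doc_id.strip(), text


# Hypothetical rows, invented for this example.
sample = [
    "1~PILOT REPORTED A RUNWAY INCURSION.\n",
    "2~ENGINE WARNING ~ RETURNED TO GATE.\n",
]
docs = list(parse_documents(sample))
```

Splitting on the first tilde only is a design choice: it keeps the report text intact if a tilde happens to appear inside it.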
ARPA-E Grid Optimization (GO) Competition Challenge 1
catalog.data.gov
data.openei.org
+1more
Updated Sep 30, 2024
Cite
Pacific Northwest National Laboratory (2024). ARPA-E Grid Optimization (GO) Competition Challenge 1 [Dataset]. https://catalog.data.gov/dataset/arpa-e-grid-optimization-go-competition-challenge-1
Explore at:
Dataset updated
Sep 30, 2024
Dataset provided by
Pacific Northwest National Laboratory
Description
The ARPA-E Grid Optimization (GO) Competition Challenge 1, from 2018 to 2019, focused on the basic Security Constrained AC Optimal Power Flow problem (SCOPF) for a single time period. The Challenge utilized sets of unique datasets generated by the ARPA-E GRID DATA program. Each dataset consisted of a collection of power system network models of different sizes with associated operating scenarios (snapshots in time defining instantaneous power demand, renewable generation, generator and line availability, etc.). The datasets were of two types: Real-Time, which included starting-point information, and Online, which did not. Week-Ahead data is also provided for some cases but was not used in the Competition. Although most datasets were synthetic and generated by GRID DATA, a few came from industry and were only used in the Final Event. All synthetic Input Data and Team Results for the GO Competition Challenge 1 for the Sandbox, Trial Events 1 to 3, and the Final Event, along with problem, format, scoring and rules descriptions, are available here. Data for industry scenarios will not be made public.
Challenge 1, a minimization problem, required two computational steps. Solver 1 or Code 1 solved the base SCOPF problem under a strict wall clock time limit, as would be the case in industry, and reported the base case operating point as output, which was used to compute the Objective Function value that served as the scenario score. The feasibility of the solution was established by Solver 2 or Code 2, which solved the power flow problem for all contingencies based on the results from Solver 1. This is not normally done in industry, so the time limits were relaxed. In fact, there were no time limits for Trial Event 1. This proved to be a mistake, with some codes running for more than 90 hours, and a time limit of 2 seconds per contingency was imposed for all other events.
Entrants were free to use their own Solver 2 or use an open-source version provided by the Competition. Containers, such as Docker, were considered to improve the portability of codes, but none that could reliably support a multi-node parallel computing environment, e.g., MPI, could be found. For more information on the competition and challenge see the "GO Competition Challenge 1 Information" and "GO Competition Challenge 1 Additional Information" resources below.
M4 Dataset
paperswithcode.com
Updated Feb 7, 2021
Cite
Makridakis (2021). M4 Dataset [Dataset]. https://paperswithcode.com/dataset/m4
Explore at:
Dataset updated
Feb 7, 2021
Authors
Makridakis
Description
The M4 dataset is a collection of 100,000 time series used for the fourth edition of the Makridakis forecasting Competition. The M4 dataset consists of time series of yearly, quarterly, monthly and other (weekly, daily and hourly) data, which are divided into training and test sets. The minimum numbers of observations in the training set are 13 for yearly, 16 for quarterly, 42 for monthly, 80 for weekly, 93 for daily and 700 for hourly series. The participants were asked to produce the following numbers of forecasts beyond the available data that they had been given: six for yearly, eight for quarterly, 18 for monthly series, 13 for weekly series and 14 and 48 forecasts respectively for the daily and hourly ones.
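The horizons and minimum training lengths listed above fit naturally into two lookup tables. The sketch below uses them to hold out the last points of a series in the same proportions. The published M4 files are already divided into training and test sets, so this is only an illustration for applying the same split to one's own series; the names `HORIZONS`, `MIN_TRAIN` and `split_series` are assumptions of this example.

```python
# Forecast horizons and minimum training lengths as stated in the description.
HORIZONS = {"yearly": 6, "quarterly": 8, "monthly": 18,
            "weekly": 13, "daily": 14, "hourly": 48}
MIN_TRAIN = {"yearly": 13, "quarterly": 16, "monthly": 42,
             "weekly": 80, "daily": 93, "hourly": 700}


def split_series(values, frequency):
    """Hold out the last `horizon` observations as the test part."""
    horizon = HORIZONS[frequency]
    train, test = values[:-horizon], values[-horizon:]
    if len(train) < MIN_TRAIN[frequency]:
        raise ValueError(
            f"{frequency} series needs at least {MIN_TRAIN[frequency]} "
            f"training observations, got {len(train)}"
        )
    return train, test


# A made-up yearly series of 30 points: 24 for training, 6 held out.
train, test = split_series(list(range(30)), "yearly")
```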

The M4 dataset was created by selecting a random sample of 100,000 time series from the ForeDeCk database. The selected series were then scaled to prevent negative observations and values lower than 10, thus avoiding possible problems when calculating various error measures. The scaling was performed by simply adding a constant to the series so that their minimum value was equal to 10 (29 occurrences across the whole dataset). In addition, any information that could possibly lead to the identification of the original series was removed so as to ensure the objectivity of the results. This included the starting dates of the series, which did not become available to the participants until the M4 had ended.
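The scaling step described above, adding a constant so that the minimum of a series becomes 10, can be written in a few lines. This is a sketch of that description, not the organizers' code; the function name `shift_to_minimum` is hypothetical, and series whose minimum is already at least 10 are assumed to be left unchanged.

```python
def shift_to_minimum(values, floor=10.0):
    """Add a constant so the smallest value equals `floor`, if it is below it."""
    lowest = min(values)
    if lowest >= floor:
        return list(values)  # nothing to do, series is already high enough
    offset = floor - lowest
    return [v + offset for v in values]


shifted = shift_to_minimum([-5.0, 0.0, 20.0])   # minimum moves from -5 to 10
unchanged = shift_to_minimum([12.0, 30.0])      # minimum is already >= 10
```

Adding a constant keeps the differences between observations intact, but it changes ratios, which matters for percentage-based error measures.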
Taekwondo Competition Areas in Oregon, United States - 2 Verified Listings...
poidata.io
csv, excel, json
Updated Jul 13, 2025
Cite
Poidata.io (2025). Taekwondo Competition Areas in Oregon, United States - 2 Verified Listings Database [Dataset]. https://www.poidata.io/report/taekwondo-competition-area/united-states/oregon
Explore at:
Available download formats: csv, excel, json
Dataset updated
Jul 13, 2025
Dataset provided by
Poidata.io
Area covered
Oregon, United States
Description
Comprehensive dataset of 2 Taekwondo competition areas in Oregon, United States as of July, 2025. Includes verified contact information (email, phone), geocoded addresses, customer ratings, reviews, business categories, and operational details. Perfect for market research, lead generation, competitive analysis, and business intelligence. Download a complimentary sample to evaluate data quality and completeness.
Santa Fe Time Series Competition Data Set B
physionet.org
search.datacite.org
Updated Jan 6, 2000
Cite
(2000). Santa Fe Time Series Competition Data Set B [Dataset]. http://doi.org/10.13026/C20W2T
Explore at:
Unique identifier
https://doi.org/10.13026/C20W2T
Dataset updated
Jan 6, 2000
License
Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
Description
This is a multivariate data set recorded from a patient in the sleep laboratory of the Beth Israel Hospital (now the Beth Israel Deaconess Medical Center) in Boston, Massachusetts. This data set was extracted from record slp60 of the MIT-BIH Polysomnographic Database, and it was submitted to the Santa Fe Time Series Competition in 1991 by our group. The data are presented in text form and have been split into two sequential parts. Each line contains simultaneous samples of three parameters. The first column is the heart rate, the second is the chest volume (respiration force), and the third is the blood oxygen concentration (measured by ear oximetry). The sampling frequency for each measurement is 2 Hz, i.e., the time interval between measurements in successive rows is 0.5 seconds.
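Given the layout above (three whitespace-separated columns, one row every 0.5 seconds), a reader might attach a time stamp to each row as follows. This is a sketch only: the `Sample` field names and the two example rows are invented here, and the units of each column are not stated in the description, so none are assumed.

```python
from typing import Iterable, List, NamedTuple

SAMPLE_INTERVAL_S = 0.5  # 2 Hz sampling, as stated in the description


class Sample(NamedTuple):
    time_s: float
    heart_rate: float
    chest_volume: float
    blood_oxygen: float


def parse_samples(lines: Iterable[str], start_index: int = 0) -> List[Sample]:
    """Parse three-column rows and derive each row's time from its position."""
    samples = []
    row = start_index
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 3:
            raise ValueError(f"expected 3 columns, got {len(fields)}: {line!r}")
        hr, chest, oxygen = (float(f) for f in fields)
        samples.append(Sample(row * SAMPLE_INTERVAL_S, hr, chest, oxygen))
        row += 1
    return samples


# Made-up rows for illustration; not values from the record.
samples = parse_samples(["70.0 8000 7500", "71.5 8100 7510"])
```

Because the data are split into two sequential parts, `start_index` lets the second part continue the time axis where the first one ended.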
Data from: Datasets and Supporting Materials for the IPIN 2023 Competition...
zenodo.org
producciocientifica.uv.es
+1more
zip
Updated Jul 9, 2024
Cite
Joaquín Torres-Sospedra; Antonino Crivello; Maximilian Stahlke; Francesco Potortì; Miguel Ortiz; Ziyou Li; Antoni Pérez-Navarro; Antonio R. Jiménez (2024). Datasets and Supporting Materials for the IPIN 2023 Competition Track 3 (Smartphone-based, off-site) [Dataset]. http://doi.org/10.5281/zenodo.8362205
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.8362205
Dataset updated
Jul 9, 2024
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Joaquín Torres-Sospedra; Antonino Crivello; Maximilian Stahlke; Francesco Potortì; Miguel Ortiz; Ziyou Li; Antoni Pérez-Navarro; Antonio R. Jiménez
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Sep 28, 2023
Description
This package contains the datasets and supplementary materials used in the IPIN 2023 Competition.
Contents
Track-3_TA-2023.pdf: Technical annexe describing the competition (Version 2)
01 Logfiles: This folder contains a subfolder with the 54 training trials, a subfolder with the 4 testing trials (validation), and a subfolder with the 2 blind scoring trials (test) as provided to competitors.
02 Supplementary_Materials: This folder contains the Matlab/octave parser, the raster maps, the files for the Matlab tools and the trajectory visualization.
03 Evaluation: This folder contains the scripts we used to calculate the competition metric, the 75th percentile on the 69 evaluation points. It requires the Matlab Mapping Toolbox. We also provide the ground truth as 2 CSV files. It contains samples of reported estimations and the corresponding results.
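The competition metric named above is the 75th percentile of the errors over the evaluation points. The official scripts are MATLAB and are the authority on how it is computed. The Python sketch below only illustrates the idea for plain 2D horizontal errors and ignores anything else the technical annexe may define. The function names and the five sample points are assumptions of this example.

```python
import math
import statistics


def horizontal_errors(estimates, truth):
    """Euclidean distance between each estimated and ground-truth (x, y) point."""
    return [math.hypot(ex - tx, ey - ty)
            for (ex, ey), (tx, ty) in zip(estimates, truth)]


def third_quartile(errors):
    """75th percentile using linear interpolation over the sorted errors."""
    # quantiles(n=4) returns the three quartile cut points; index 2 is Q3.
    return statistics.quantiles(errors, n=4, method="inclusive")[2]


# Made-up points: errors are 1, 2, 3, 4 and 5 metres from the origin.
estimates = [(1, 0), (0, 2), (3, 0), (0, 4), (5, 0)]
truth = [(0, 0)] * 5
score = third_quartile(horizontal_errors(estimates, truth))
```

Different tools interpolate percentiles slightly differently, so a score computed this way may not match the MATLAB result to the last digit.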
We provide additional information on the competition at: https://evaal.aaloa.org/2023/call-for-competition
Citation Policy
Please cite the following works when using the datasets included in this package:
Torres-Sospedra, J.; et al. Datasets and Supporting Materials for the IPIN 2023
Competition Track 3 (Smartphone-based, off-site), Zenodo 2023
http://dx.doi.org/10.5281/zenodo.8362205
Check the updated citation policy at: http://dx.doi.org/10.5281/zenodo.8362205
Contact
For any further questions about the database and this competition track, please contact:
Joaquín Torres-Sospedra
Centro ALGORITMI,
Universidade do Minho, Portugal
info@jtorr.es - jtorres@algoritmi.uminho.pt

Antonio R. Jiménez
Centre of Automation and Robotics (CAR)-CSIC/UPM, Spain
antonio.jimenez@csic.es
Antoni Pérez-Navarro
Faculty of Computer Sciences, Multimedia and Telecommunication, Universitat Oberta de Catalunya, Barcelona, Spain
aperezn@uoc.edu
Acknowledgements
We thank Maximilian Stahlke and Christopher Mutschler at Fraunhofer IIS, as well as Miguel Ortiz and Ziyou Li at Université Gustave Eiffel, for their invaluable support in collecting the datasets. Last but certainly not least, we thank Antonino Crivello and Francesco Potortì for their huge effort in georeferencing the competition venue and evaluation points.
We extend our appreciation to the staff at the Museum for Industrial Culture (Museum Industriekultur) for their unwavering patience and invaluable support throughout our collection days.
We are also grateful to Francesco Potortì, the ISTI-CNR team (Paolo, Michele & Filippo), and the Fraunhofer IIS team (Chris, Tobi, Max, ...) for their invaluable commitment to organizing and promoting the IPIN competition.
This work and competition belong to the IPIN 2023 Conference in Nuremberg (Germany).
Parts of this work received financial support from the following projects and grants:
ORIENTATE (H2020-MSCA-IF-2020, Grant Agreement 101023072)
GeoLibero (from CYTED)
INDRI (MICINN, ref. PID2021-122642OB-C42, PID2021-122642OB-C43, PID2021-122642OB-C44, MCIU/AEI/FEDER UE)
MICROCEBUS (MICINN, ref. RTI2018-095168-B-C55, MCIU/AEI/FEDER UE)
TARSIUS (TIN2015-71564-C4-2-R, MINECO/FEDER)
SmartLoc (CSIC-PIE Ref. 201450E011)
LORIS (TIN2012-38080-C04-04)
Data from: EU Merger Control Database: 1990-2014
search.gesis.org
datacatalogue.cessda.eu
+2more
Updated Apr 13, 2024
Cite
Duso, Tomaso (2024). EU Merger Control Database: 1990-2014 [Dataset]. https://search.gesis.org/research_data/SDN-10.25652-diw_data_S0019_1
Explore at:
Dataset updated
Apr 13, 2024
Dataset provided by
Deutsches Institut für Wirtschaftsforschung e.V. (DIW Berlin)
GESIS search
Authors
Duso, Tomaso
License
https://www.gesis.org/en/institute/data-usage-terms
Area covered
European Union
Description
We collected data on almost the complete population of the merger control decisions by the Directorate-General for Competition (DG COMP) of the European Commission. We started the data collection with the first year of common European merger control, 1990, and included all years up to 2014. This amounts to 25 years of data on European merger control. With regard to the scope of the decisions, we collected data in all cases where a legal decision document exists. This includes all cases settled in the first phase of an investigation (Art. 6(1)(a), 6(1)(b), 6(1)(c) and 6(2)) and all cases decided in the second phase of an investigation (Art. 8(1), 8(2), and 8(3)). Note that this also includes all cases settled under a ‘simplified procedure’, provided that a legal decision document exists. Furthermore, we also intended to collect data on cases that were either referred back to member states by DG COMP or aborted by the merging parties. While we have collected some data on such cases, data on these cases is not always available. Therefore, we cannot guarantee that the final dataset covers all of these cases. The level of observation is not a particular merger case but a particular product/geographic market combination concerned by a merger. In total, the final dataset contains 5,196 DG COMP merger decisions. For each of these decisions, we record a number of observations equal to the number of product/geographic markets identified in the specific transaction. Hence, the total dataset contains 31,451 observations.
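Because the unit of observation is a product/geographic market rather than a merger case, counting decisions means grouping rows by case. The sketch below shows this with invented rows. The field names `case_id`, `product_market` and `geographic_market` are hypothetical and do not come from the database's codebook. The average at the end uses only the two totals quoted above.

```python
from collections import Counter

# Hypothetical rows: one per product/geographic market concerned by a merger.
observations = [
    {"case_id": "A", "product_market": "p1", "geographic_market": "EEA"},
    {"case_id": "A", "product_market": "p2", "geographic_market": "EEA"},
    {"case_id": "A", "product_market": "p2", "geographic_market": "national"},
    {"case_id": "B", "product_market": "p9", "geographic_market": "EEA"},
]

# Number of market-level observations recorded for each decision.
markets_per_decision = Counter(row["case_id"] for row in observations)
n_decisions = len(markets_per_decision)

# Average implied by the totals in the description: 31,451 rows, 5,196 decisions.
average_markets = 31451 / 5196
```

With the quoted totals, each decision contributes about 6.05 market observations on average, so statistics computed per row weight multi-market mergers more heavily than statistics computed per decision.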
pii-comp
kaggle.com
zip
Updated Apr 18, 2024
Cite
Devin Anzelmo (2024). pii-comp [Dataset]. https://www.kaggle.com/datasets/devinanzelmo/pii-comp
Explore at:
Available download formats: zip (0 bytes)
Dataset updated
Apr 18, 2024
Authors
Devin Anzelmo
Description
Models and external data for the 3rd-place efficiency solution in the https://www.kaggle.com/competitions/pii-detection-removal-from-educational-data competition.

See https://www.kaggle.com/code/devinanzelmo/piidd-efficiency-3rd-process-external-data for links to external data and processing code

See https://www.kaggle.com/code/devinanzelmo/piidd-efficiency-3rd-train for training code that generated models.

See https://www.kaggle.com/code/devinanzelmo/piidd-efficiency-3rd-inference for inference code
Data and Code for: Competition and Defaults in Online Search
openicpsr.org
delimited, spss
Updated Sep 16, 2024
Cite
Francesco Decarolis; Muxin Li; Filippo Alberto Paternollo (2024). Data and Code for: Competition and Defaults in Online Search [Dataset]. http://doi.org/10.3886/E209142V1
Explore at:
Available download formats: spss, delimited
Unique identifier
https://doi.org/10.3886/E209142V1
Dataset updated
Sep 16, 2024
Dataset provided by
American Economic Association
Authors
Francesco Decarolis; Muxin Li; Filippo Alberto Paternollo
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 2009 - Dec 31, 2022
Area covered
World
Description
Dataset and code for the paper Competition and Defaults in Online Search. This paper offers the first systematic quantitative assessment of default-option interventions designed to mitigate Google’s search dominance. By analyzing interventions in the European Economic Area, Russia, and Turkey, we find that, across all three cases, changes to default settings effectively reduced Google’s market share. The causal impact amounts to less than 1 percentage point in the EEA and over 10 percentage points in Russia and Turkey. Differences arise from intervention nuances, including the size of the targeted users’ group, local market characteristics, and remedy designs. We discuss the complexity of assessing the interventions’ impact on welfare deriving from quality responses.
DXC'11 Industrial Track Competition Data
s.cnmilf.com
gimi9.com
+5more
Updated Apr 11, 2025
Cite
Dashlink (2025). DXC'11 Industrial Track Competition Data [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/dxc11-industrial-track-competition-data
Explore at:
Dataset updated
Apr 11, 2025
Dataset provided by
Dashlink
Description
Competition data, including nominal and faulty scenarios, for Diagnostic Problem I of the Third International Diagnostic Competition. Three file formats are provided, tab-delimited .txt files, Matlab .mat files, and tab-delimited .scn files. The scenario (.scn) files are read by the DXC framework. See the DXC'11 Industrial Track Sample Data resource page for additional documentation, including system catalogs and schematics. There were no DA entries for Diagnostic Problem II so we are withholding the data for use in a future Diagnostic Competition.
Global Database Management System Market Segment Outlook, Market Assessment,...
the-market.us
csv, pdf
Updated Aug 27, 2019
Cite
(2019). Global Database Management System Market Segment Outlook, Market Assessment, Competition Scenario, Trends and Forecast 2020-2029 [Dataset]. https://the-market.us/report/database-management-system-market/
Explore at:
Available download formats: pdf, csv
Dataset updated
Aug 27, 2019
License
https://the-market.us/privacy-policy/
Time period covered
2016 - 2022
Area covered
Global
Description
Global Database Management System Market
The global database management system market is estimated to be valued at US$ XX.X million in 2019. The report provides qualitative as well as quantitative analysis in terms of market dynamics, competition scenarios, opportunity analysis, market growth, etc. for the forecast period up to 2029. The global database management system market is segmented on the basis of type, application, and geography.
In 2019, the North America market is valued at US$ XX.X million with an estimated market share of X.X%; it is expected to reach US$ XX.X million and X.X% in 2029, with a CAGR of X.X% from 2020 to 2029.
PlaygroundS4E06|OriginalData
kaggle.com
Updated Jun 1, 2024
Cite
Ravi Ramakrishnan (2024). PlaygroundS4E06|OriginalData [Dataset]. https://www.kaggle.com/datasets/ravi20076/playgrounds4e06originaldata
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jun 1, 2024
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Ravi Ramakrishnan
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
This data was downloaded from the link shared on the data page of the PlaygroundS4E06 episode. We added an id column for consistency with the competition data and uploaded the result here.
Please feel free to use this dataset as part of your pipeline.

Key links:
1. Competition: https://www.kaggle.com/competitions/playground-series-s4e6
2. Data page: https://www.kaggle.com/competitions/playground-series-s4e6/data
3. Original dataset link: https://archive.ics.uci.edu/dataset/697/predict+students+dropout+and+academic+success

This is a .csv file. Please use pandas.read_csv() or polars.scan_csv() to read in the file
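For readers who want to see the id-column step without installing anything, here is a dependency-free sketch using Python's built-in `csv` module. With pandas, the reading step would be the `pandas.read_csv()` call mentioned above. The column names `age` and `target`, the two sample rows and the starting id of 0 are all assumptions made for this example.

```python
import csv
import io


def add_id_column(rows, start=0):
    """Return new row dictionaries with a sequential integer `id` placed first."""
    return [{"id": i, **row} for i, row in enumerate(rows, start=start)]


# In-memory stand-in for the CSV file; contents are invented.
raw = io.StringIO("age,target\n19,Graduate\n21,Dropout\n")
rows = list(csv.DictReader(raw))
with_id = add_id_column(rows)
```

Note that `csv.DictReader` returns every field as a string, whereas `pandas.read_csv()` infers numeric types.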

Best regards!
Taekwondo Competition Areas in Wisconsin, United States - 1 Verified...
poidata.io
csv, excel, json
Updated Jul 13, 2025
Cite
Poidata.io (2025). Taekwondo Competition Areas in Wisconsin, United States - 1 Verified Listings Database [Dataset]. https://www.poidata.io/report/taekwondo-competition-area/united-states/wisconsin
Explore at:
Available download formats: json, csv, excel
Dataset updated
Jul 13, 2025
Dataset provided by
Poidata.io
Area covered
Wisconsin, United States
Description
Comprehensive dataset of 1 Taekwondo competition area in Wisconsin, United States as of July, 2025. Includes verified contact information (email, phone), geocoded addresses, customer ratings, reviews, business categories, and operational details. Perfect for market research, lead generation, competitive analysis, and business intelligence. Download a complimentary sample to evaluate data quality and completeness.
National Legal Database Creative Teaching Competition-Title of Award-winning...
data.gov.tw
csv
Cite
Department of Information Management, National Legal Database Creative Teaching Competition-Title of Award-winning Lesson Plans [Dataset]. https://data.gov.tw/en/datasets/108263
Explore at:
Available download formats: csv
Dataset authored and provided by
Department of Information Management
License
https://data.gov.tw/license
Description
Lists the titles of the award-winning lesson plans in the National Legal Database Creative Teaching Competition.
Innovation and competition: An unstable relationship (replication data)
journaldata.zbw.eu
stata data, stata do +1
Updated Dec 7, 2022
Cite
Juan A. Correa (2022). Innovation and competition: An unstable relationship (replication data) [Dataset]. http://doi.org/10.15456/jae.2022320.0724338878
Explore at:
Available download formats: txt (860), stata do (4833), txt (6634), stata data (96389), txt (328866)
Unique identifier
https://doi.org/10.15456/jae.2022320.0724338878
Dataset updated
Dec 7, 2022
Dataset provided by
ZBW - Leibniz Informationszentrum Wirtschaft
Authors
Juan A. Correa
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This paper analyzes how the establishment of the United States Court of Appeals for the Federal Circuit in 1982 has affected the relationship between innovation and competition. Using the same dataset as Aghion et al. (Competition and innovation: an inverted-u relationship. Quarterly Journal of Economics 2005; 120(2):701-728) I find a structural break in the early 1980s. Taking this break into consideration, the inverted-U empirical relationship between innovation and competition found by Aghion et al. does not hold. In fact, I find that there is a positive innovation-competition relationship during the period 1973-1982 and no relationship at all in the 1983-1994 period.
competition-data
kaggle.com
Updated Feb 25, 2024
Cite
Maha Kosksi (2024). competition-data [Dataset]. https://www.kaggle.com/datasets/mahakosksi/competition-data/versions/1
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Feb 25, 2024
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Maha Kosksi
Description
Dataset

This dataset was created by Maha Kosksi

Contents
Data from: Utah FORGE: Well Data for Student Competition
catalog.data.gov
gdr.openei.org
+2more
Updated Jan 20, 2025
Cite
Idaho National Laboratory (2025). Utah FORGE: Well Data for Student Competition [Dataset]. https://catalog.data.gov/dataset/utah-forge-well-data-for-student-competition-40483
Explore at:
Dataset updated
Jan 20, 2025
Dataset provided by
Idaho National Laboratory
Description
Well 58-32 (previously labeled MU-ESW1) was drilled near Milford, Utah during Phase 2B of the FORGE Project to confirm geothermal reservoir characteristics met requirements for the final FORGE site. Well Accord-1 was drilled decades ago for geothermal exploration purposes. While the conditions encountered in the well were not suitable for developing a conventional hydrothermal system, the information obtained suggested the region may be suitable for an enhanced geothermal system. Geophysical well logs were collected in both wells to obtain useful information regarding the nature of the subsurface materials. For the recent testing of 58-32, the Utah FORGE Project contracted with the well services company Schlumberger to collect the well logs.
M4 Forecasting Competition Dataset
kaggle.com
zip
Updated Mar 9, 2020
Cite
Sri Yogesh (2020). M4 Forecasting Competition Dataset [Dataset]. https://www.kaggle.com/yogesh94/m4-forecasting-competition-dataset
Explore at:
Available download formats: zip (83502902 bytes)
Dataset updated
Mar 9, 2020
Authors
Sri Yogesh
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
The M4 Forecasting Competition Dataset

The M4 competition is a continuation of the Makridakis Competitions for forecasting and was conducted in 2018. This competition includes the prediction of both Point Forecasts and Prediction Intervals.

More Details

A paper describing the competition and the various benchmarks and approaches was published in a special edition of the International Journal of Forecasting. It is available for open access and can be found here

Code for benchmarks

The code for various benchmarks on this dataset can be found at the following github repository

Source

The data is available at both the github link and the official website of MOFC
JANE STREET PREPROCESSED
kaggle.com
Updated Dec 22, 2020
Cite
Saurabh Shahane (2020). JANE STREET PREPROCESSED [Dataset]. https://www.kaggle.com/datasets/saurabhshahane/jane-street-preprocessed-train
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Dec 22, 2020
Dataset provided by
Kaggle
Authors
Saurabh Shahane
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Dataset

This dataset was created by Saurabh Shahane

Released under CC0: Public Domain

Contents
Training and validation data from the AI for Critical Mineral Assessment...
catalog.data.gov
data.usgs.gov
+1more
Updated Jul 6, 2024
Cite
U.S. Geological Survey (2024). Training and validation data from the AI for Critical Mineral Assessment Competition [Dataset]. https://catalog.data.gov/dataset/training-and-validation-data-from-the-ai-for-critical-mineral-assessment-competition
Explore at:
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey: http://www.usgs.gov/
Description
Extracting useful and accurate information from scanned geologic and other earth science maps is a time-consuming and laborious process involving manual human effort. To address this limitation, the USGS partnered with the Defense Advanced Research Projects Agency (DARPA) to run the AI for Critical Mineral Assessment Competition, soliciting innovative solutions for automatically georeferencing and extracting features from maps. The competition opened for registration in August 2022 and concluded in December 2022. Training and validation data from the competition are provided here, as well as competition details and baseline solutions. The data are derived from published sources and are provided to the public to support continued development of automated georeferencing and feature extraction tools. References for all maps are included with the data.

SIAM 2007 Text Mining Competition dataset: 25 scholarly articles cite this dataset (View in Google Scholar)
