The dataset used for the experiments in the paper, containing 12,000 molecules with 12 biological effects.
This dataset describes the degradation of an aircraft engine. It was used for the prognostics challenge competition at the International Conference on Prognostics and Health Management (PHM08). The challenge remains open for researchers to develop and compare their efforts against the winners of the 2008 challenge. The data sets consist of multiple multivariate time series. Each data set is further divided into training and test subsets. Each time series is from a different aircraft engine, i.e., the data can be considered to be from a fleet of engines of the same type. Each engine starts with a different degree of initial wear and manufacturing variation, which is unknown to the user. This wear and variation is considered normal, i.e., it is not considered a fault condition. Three operational settings have a substantial effect on engine performance; these settings are also included in the data. The data are contaminated with sensor noise.
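As a rough illustration of the "one time series per engine" structure described above, the following sketch groups records by engine and orders them in time. The row layout (an engine identifier, a cycle counter, and a list of measurements) is an assumption made for this example, not the documented file format, and the values are made up.

```python
from collections import defaultdict


def split_by_engine(rows):
    """Group (engine_id, cycle, values) rows into one time series per engine.

    The tuple layout is hypothetical; adapt it to the actual file columns.
    """
    series = defaultdict(list)
    for engine_id, cycle, values in rows:
        series[engine_id].append((cycle, values))
    # Sort each engine's records by cycle so every series is in time order.
    return {eid: sorted(recs, key=lambda r: r[0]) for eid, recs in series.items()}


# Made-up rows for illustration only; these are not values from the dataset.
rows = [
    (1, 2, [0.5, 0.1]),
    (2, 1, [0.4, 0.3]),
    (1, 1, [0.6, 0.2]),
]

fleet = split_by_engine(rows)
lengths = {eid: len(recs) for eid, recs in fleet.items()}
print(lengths)
```

Keeping each engine's series separate matters here because every engine starts from a different, unknown level of initial wear, so records from different engines should not be concatenated into a single sequence.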
This blog post was posted by Wes Barker on July 27, 2018. It was written by Steven Posnack, M.S., M.H.S., Dustin Charles and Wes Barker.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Sangria includes two main datasets: each contains Gaussian instrumental noise and simulated waveforms from 30 million Galactic white dwarf binaries, from 17 verification Galactic binaries, and from merging massive black-hole binaries with parameters derived from an astrophysical model. The first dataset includes the full specification used to generate it: source parameters, a description of instrumental noise with the corresponding power spectral density, LISA's orbit, etc. We also release noiseless data for each type of source, for waveform validation purposes. The second dataset is blinded: the level of instrumental noise and the number of sources of each type are not disclosed (except for the known parameters of the verification binaries).
See LDC website for more details.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This project analyzes the effectiveness of a strategic pilot program for the chips category in a retail environment. To drive growth, a retailer implemented targeted promotions in three trial stores (77, 86, and 88) from February to April 2019. This analysis measures the success of that trial by comparing performance against carefully selected control stores. Furthermore, the project delves into customer purchasing behavior to identify high-value segments and provide data-driven recommendations for a future national rollout.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository hosts the dataset used in the LSSTC AGN Data Challenge (DC) 2021 (PI: Gordon Richards). More information about the data challenge can be found in the DC GitHub repository @ https://github.com/RichardsGroup/AGN_DataChallenge.
Dataset Versions:
1.0: The initial dataset used in the DC, as well as the blinded dataset (ObjectTable_Blinded.parquet) that was used to evaluate submissions. Note that the image cutouts are not included here due to the large size, but the script used to generate those cutouts using SDSS archive services is included in the DC GitHub repository.
1.1: The same dataset as in v1.0 but with the following updates:
Uncovered the true coordinates of each source in the dataset
Added E(B-V) for every source using the SFD1998 dust map
Added spectrum source information (i.e., SDSS fiber, plate, mjd) if available.
Caveat:
The optical (grizY) and NIR photometry of sources in the XMM-LSS field is a product of the HSC/VISTA pixel-level joint processing initiative led by Raphael Shirley and Manda Banerji. Thus, it is an early prototype dataset and is still subject to testing and characterization.
Citation:
The DC dataset released here is a compilation of data from various sources. If you find the DC dataset useful for your research and would like to acknowledge it, please also reference the original sources of the data. Below is a list of publications that you should consider citing.
X-ray in XMM-LSS (XMM-SERVS): 2018MNRAS.478.2132C
UV Photometry (GALEX): 2017ApJS..230...24B
Optical Photometry (in the object/source tables):
DES: 2021ApJS..255...20A
SDSS Stripe 82 Coadd: 2014ApJ...794..120A
HSC DR2: 2019PASJ...71..114A
Optical Light Curves (in the ForcedSource table):
SDSS DR7: 2009ApJS..182..543A
SDSS II Supernova Survey: 2008AJ....135..338F
Astrometry (i.e., parallax, proper motion):
Gaia EDR3: 2021A&A...649A...1G
NOIRLab Source Catalog DR2: 2021AJ....161..192N
NIR in XMM-LSS (VISTA/VIDEO): 2013MNRAS.428.1281J
NIR in Stripe 82 (UKIDSS):
2006MNRAS.367..454H
2007MNRAS.379.1599L
2008MNRAS.384..637H
2009MNRAS.394..675H
Optical u-band in XMM-LSS (CFHTLS): 2012yCat.2317....0H
MIR in XMM-LSS (Spitzer DeepDrill): 2021MNRAS.501..892L
MIR in Stripe 82 (SpIES): 2016ApJS..225....1T
FIR (Herschel/HELP): 2019MNRAS.490..634S
Radio (FIRST): 1994ASPC...61..165B
HighZ QSOs:
2016ApJ...819...24W
2016ApJ...829...33Y
SDSS Spectroscopy:
SDSS DR16: 2020ApJS..249....3A
SDSS DR16 Quasar Catalog: 2020ApJS..250....8L
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
An Open Context "tables" dataset item.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Objective(s): Momentum for open access to research is growing. Funding agencies and publishers are increasingly requiring researchers to make their data and research outputs open and publicly available. However, this introduces many challenges, especially when managing confidential clinical data. The aim of this 1 hr virtual workshop is to provide participants with knowledge about what synthetic data is, methods to create synthetic data, and the 2023 Pediatric Sepsis Data Challenge.
Workshop Agenda:
1. Introduction - Speaker: Mark Ansermino, Director, Centre for International Child Health
2. "Leveraging Synthetic Data for an International Data Challenge" - Speaker: Charly Huxford, Research Assistant, Centre for International Child Health
3. "Methods in Synthetic Data Generation" - Speaker: Vuong Nguyen, Biostatistician, Centre for International Child Health and The HIPpy Lab
This workshop draws on work supported by the Digital Research Alliance of Canada.
Data Description: Presentation slides, Workshop Video, and Workshop Communication
Charly Huxford: Leveraging Synthetic Data for an International Data Challenge presentation and accompanying PowerPoint slides.
Vuong Nguyen: Methods in Synthetic Data Generation presentation and accompanying PowerPoint slides.
This workshop was developed as part of Dr. Ansermino's Data Champions Pilot Project supported by the Digital Research Alliance of Canada.
NOTE for restricted files: If you are not yet a CoLab member, please complete our membership application survey to gain access to restricted files within 2 business days. Some files may remain restricted to CoLab members. These files are deemed more sensitive by the file owner and are meant to be shared on a case-by-case basis. Please contact the CoLab coordinator on this page under "collaborate with the pediatric sepsis colab."
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Mauro Manclossi
Released under CC0: Public Domain
Camden Open Data Challenge
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
In collaboration with the IEEE Conference on Biomedical and Health Informatics (BHI) 2018 and the IEEE Conference on Body Sensor Networks (BSN), we are hosting a challenge to explore real clinical questions in critically ill patients using the MIMIC-III database. Participants in the challenge will be invited to present at the BHI & BSN Annual Conference in Las Vegas, USA (4-7 March 2018): https://bhi-bsn.embs.org/2018/
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The ARPA-E Grid Optimization (GO) Competition Challenge 1, from 2018 to 2019, focused on the basic Security Constrained AC Optimal Power Flow problem (SCOPF) for a single time period. The Challenge utilized sets of unique datasets generated by the ARPA-E GRID DATA program. Each dataset consisted of a collection of power system network models of different sizes with associated operating scenarios (snapshots in time defining instantaneous power demand, renewable generation, generator and line availability, etc.). The datasets were of two types: Real-Time, which included starting-point information, and Online, which did not. Week-Ahead data is also provided for some cases but was not used in the Competition. Although most datasets were synthetic and generated by GRID DATA, a few came from industry and were only used in the Final Event. All synthetic Input Data and Team Results for the GO Competition Challenge 1 for the Sandbox, Trial Events 1 to 3, and the Final Event, along with problem, format, scoring and rules descriptions, are available here. Data for industry scenarios will not be made public.
Challenge 1, a minimization problem, required two computational steps. Solver 1, or Code 1, solved the base SCOPF problem under a strict wall clock time limit, as would be the case in industry, and reported the base case operating point as output. That operating point was used to compute the Objective Function value, which served as the scenario score. The feasibility of the solution was determined by Solver 2, or Code 2, which solved the power flow problem for all contingencies based on the results from Solver 1. This is not normally done in industry, so the time limits were relaxed. In fact, there were no time limits for Trial Event 1. This proved to be a mistake, with some codes running for more than 90 hours, and a time limit of 2 seconds per contingency was imposed for all other events. Entrants were free to use their own Solver 2 or use an open-source version provided by the Competition.
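To make the Solver 2 limit concrete, here is a minimal sketch that assumes the 2 seconds per contingency were applied as a total wall clock budget proportional to the number of contingencies. That interpretation, the function names, and the contingency count in the example are assumptions for illustration; they are not taken from the Competition rules.

```python
def solver2_time_limit(n_contingencies, seconds_per_contingency=2.0):
    """Total Solver 2 budget in seconds, assuming a per-contingency allowance."""
    if n_contingencies < 0:
        raise ValueError("n_contingencies must be non-negative")
    return n_contingencies * seconds_per_contingency


def within_limit(elapsed_seconds, n_contingencies):
    """True if a run finished inside the assumed total budget."""
    return elapsed_seconds <= solver2_time_limit(n_contingencies)


# Hypothetical scenario with 1000 contingencies (not a figure from the source).
budget = solver2_time_limit(1000)          # 2000.0 seconds
ninety_hours = 90 * 3600                   # the runaway case from Trial Event 1
print(budget, within_limit(ninety_hours, 1000))
```

Under this assumption, a run of the length seen in Trial Event 1 would exceed the budget by more than two orders of magnitude for a scenario of this size.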
Containers, such as Docker, were considered to improve the portability of codes, but none that could reliably support a multi-node parallel computing environment, e.g., MPI, could be found.
For more information on the competition and challenge see the "GO Competition Challenge 1 Information" and "GO Competition Challenge 1 Additional Information" resources below.
Data for the SPHERE Challenge that will take place in conjunction with ECML-PKDD 2016.
Please cite: Niall Twomey, Tom Diethe, Meelis Kull, Hao Song, Massimo Camplani, Sion Hannuna, Xenofon Fafoutis, Ni Zhu, Pete Woznowski, Peter Flach, Ian Craddock: “The SPHERE Challenge: Activity Recognition with Multimodal Sensor Data”, 2016; arXiv:1603.00797.
BibTeX record:
@article{twomey2016sphere,
  title={The SPHERE Challenge: Activity Recognition with Multimodal Sensor Data},
  author={Twomey, Niall and Diethe, Tom and Kull, Meelis and Song, Hao and Camplani, Massimo and Hannuna, Sion and Fafoutis, Xenofon and Zhu, Ni and Woznowski, Pete and Flach, Peter and others},
  journal={arXiv preprint arXiv:1603.00797},
  year={2016}
}
http://arxiv.org/abs/1603.00797v2
Complete download (zip, 41.4 MiB)
When data and analytics leaders throughout Europe and the United States were asked what the top challenges were with using data to drive business value at their companies, ** percent indicated that the lack of analytical skills among employees was the top challenge as of 2021. Other challenges with using data included data democratization and organizational silos.
This dataset was created by Vibhanshu
Analysis of the projects proposed by the seven finalists to USDOT's Smart City Challenge, including challenge addressed, proposed project category, and project description. The times reported for the speed profiles run from 2:00 PM to 8:00 PM in increments of 10 minutes.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Full-sky diffuse gamma-ray model generated with DRAGON and GammaSky
pi0, brems, IC included
gas: ringModel, Ferriere
source term: Case&Bhattacharya, Ferriere
healpix resolution: 8
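For readers unfamiliar with HEALPix, the sketch below shows what a resolution of 8 implies for map size, assuming the value is the HEALPix resolution parameter (order) k, for which N_side = 2^k and N_pix = 12 N_side^2. That reading of "resolution" is an assumption; the function name is made up for this example, and no HEALPix library is required.

```python
import math


def healpix_geometry(order):
    """Return (nside, npix, approximate pixel size in degrees) for an order."""
    nside = 2 ** order
    npix = 12 * nside ** 2
    # All HEALPix pixels have equal area: the full sphere (4*pi sr) / npix.
    pixel_area_sr = 4 * math.pi / npix
    # Side length of a square with the same area, converted to degrees.
    pixel_size_deg = math.degrees(math.sqrt(pixel_area_sr))
    return nside, npix, pixel_size_deg


nside, npix, size_deg = healpix_geometry(8)
print(nside, npix, round(size_deg, 3))
```

Under this assumption the map has N_side = 256 and 786,432 pixels, each roughly 0.23 degrees across.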
This dataset was created by Anwar Beg Hajra
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Datasets for the Exoplanet imaging data challenge (https://exoplanet-imaging-challenge.github.io).
The PROSTATEx Challenge dataset contains prostate imaging data for prostate cancer assessment.