100+ datasets found
  1. HackerEarth Machine Learning challenge: Predict the price for Good Friday...

    • ieee-dataport.org
    Updated May 19, 2020
    Cite
    Siddharth Kekre (2020). HackerEarth Machine Learning challenge: Predict the price for Good Friday gifts [Dataset]. https://ieee-dataport.org/documents/hackerearth-machine-learning-challenge-predict-price-good-friday-gifts
    Explore at:
    Dataset updated
    May 19, 2020
    Authors
    Siddharth Kekre
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset consists of the following columns: Data description

  2. Data from: Dataset from the ATLAS Higgs Boson Machine Learning Challenge...

    • opendata.cern.ch
    • opendata-dev.cern.ch
    Updated 2014
    Cite
    ATLAS collaboration (2014). Dataset from the ATLAS Higgs Boson Machine Learning Challenge 2014 [Dataset]. http://doi.org/10.7483/OPENDATA.ATLAS.ZBP2.M5T8
    Explore at:
    Dataset updated
    2014
    Dataset provided by
    CERN Open Data Portal
    Authors
    ATLAS collaboration
    Description

    The dataset has been built from official ATLAS full-detector simulation, with "Higgs to tautau" events mixed with different backgrounds. The simulator has two parts. In the first, random proton-proton collisions are simulated based on the knowledge that we have accumulated on particle physics. It reproduces the random microscopic explosions resulting from the proton-proton collisions. In the second part, the resulting particles are tracked through a virtual model of the detector. The process yields simulated events with properties that mimic the statistical properties of the real events with additional information on what has happened during the collision, before particles are measured in the detector.

    The signal sample contains events in which Higgs bosons (with a fixed mass of 125 GeV) were produced. The background sample was generated by other known processes that can produce events with at least one electron or muon and a hadronic tau, mimicking the signal. For the sake of simplicity, only three background processes were retained for the Challenge. The first comes from the decay of the Z boson (with a mass of 91.2 GeV) into two taus. This decay produces events with a topology very similar to that produced by the decay of a Higgs. The second set contains events with a pair of top quarks, which can have a lepton and a hadronic tau among their decay products. The third set involves the decay of the W boson, where one electron or muon and a hadronic tau can appear simultaneously only through imperfections of the particle identification procedure.

    Due to the complexity of the simulation process, each simulated event has a weight that is proportional to the conditional density divided by the instrumental density used by the simulator (an importance-sampling flavour), and normalised for integrated luminosity such that, in any region, the sum of the weights of events falling in the region is an unbiased estimate of the expected number of events falling in the same region during a given fixed time interval. In our case, the weights correspond to the quantity of real data taken during the year 2012. The weights are an artifact of the way the simulation works and so they are not part of the input to the classifier. For the Challenge, weights have been provided in the training set so the AMS can be properly evaluated. Weights were not provided in the qualifying set since the weight distributions of the signal and background sets are very different and so they would give away the label immediately. However, in the opendata.cern.ch dataset, weights and labels have been provided for the complete dataset.

    The evaluation metric is the approximate median significance (AMS):

    \[ \text{AMS} = \sqrt{2\left((s+b+b_r) \log \left(1 + \frac{s}{b + b_r}\right)-s\right)}\]

    where

    • $s, b$: unnormalised true positive and false positive rates, respectively,
    • $b_r =10$ is the constant regularisation term,
    • $\log$ is the natural log.

    More precisely, let $(y_1, \ldots, y_n) \in \{\text{b},\text{s}\}^n$ be the vector of true test labels, let $(\hat{y}_1, \ldots, \hat{y}_n) \in \{\text{b},\text{s}\}^n$ be the vector of predicted (submitted) test labels, and let $(w_1, \ldots, w_n) \in {\mathbb{R}^+}^n$ be the vector of weights. Then

    \[ s = \sum_{i=1}^n w_i\mathbb{1}\{y_i = \text{s}\} \mathbb{1}\{\hat{y}_i = \text{s}\} \]

    and

    \[ b = \sum_{i=1}^n w_i\mathbb{1}\{y_i = \text{b}\} \mathbb{1}\{\hat{y}_i = \text{s}\}, \]

    where the indicator function $\mathbb{1}\{A\}$ is 1 if its argument $A$ is true and 0 otherwise.
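
    As a minimal sketch of the metric defined above, the following Python function computes the AMS from vectors of true labels, predicted labels, and weights. The function name `ams`, the string labels "s" and "b", and the toy inputs are assumptions made for illustration; they are not part of the Challenge's official tooling.

```python
import math

def ams(y_true, y_pred, weights, b_reg=10.0):
    """Approximate median significance for labels in {"s", "b"}."""
    s = 0.0  # weighted true positives: true signal predicted as signal
    b = 0.0  # weighted false positives: true background predicted as signal
    for y, y_hat, w in zip(y_true, y_pred, weights):
        if y_hat != "s":
            continue  # only events predicted as signal contribute
        if y == "s":
            s += w
        else:
            b += w
    # b_reg is the constant regularisation term b_r from the definition.
    return math.sqrt(2.0 * ((s + b + b_reg) * math.log(1.0 + s / (b + b_reg)) - s))

# Toy example with made-up weights (not real Challenge data).
y_true = ["s", "b", "s", "b"]
y_pred = ["s", "s", "b", "b"]
weights = [2.0, 5.0, 1.0, 3.0]
score = ams(y_true, y_pred, weights)
```

    In this toy example only the first two events are predicted as signal, so $s = 2$ and $b = 5$; the other two events do not enter the sums.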

    For more information on the statistical model and the derivation of the metric, see the documentation.

  3. 2018 Kaggle Machine Learning Challenge dataset

    • kaggle.com
    Updated Nov 28, 2021
    Cite
    Sreenanda Sai Dasari (2021). 2018 Kaggle Machine Learning Challenge dataset [Dataset]. https://www.kaggle.com/sreenandasaidasari/2021-kaggle-machine-learning-challenge/code
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 28, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sreenanda Sai Dasari
    Description

    Dataset

    This dataset was created by Sreenanda Sai Dasari

    Contents

  4. Machine learning challenges in companies 2018-2021

    • statista.com
    Updated Jul 9, 2025
    Cite
    Statista (2025). Machine learning challenges in companies 2018-2021 [Dataset]. https://www.statista.com/statistics/1111249/machine-learning-challenges/
    Explore at:
    Dataset updated
    Jul 9, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Nov 2020
    Area covered
    Worldwide
    Description

    According to a recent survey, ** percent of respondents reported experiencing issues with security and auditability requirements when deploying machine learning and artificial intelligence in 2021. Auditability is the degree to which a transaction can be traced from the originator to the approver and final disposition.

  5. Yahoo-Learning-to-Rank-Challenge

    • huggingface.co
    Updated Dec 15, 2024
    Cite
    Yahoo-Research (2024). Yahoo-Learning-to-Rank-Challenge [Dataset]. https://huggingface.co/datasets/YahooResearch/Yahoo-Learning-to-Rank-Challenge
    Explore at:
    Dataset updated
    Dec 15, 2024
    Dataset provided by
    Yahoo! (https://tw.yahoo.com/)
    Yahoo! Research
    Authors
    Yahoo-Research
    License

    https://choosealicense.com/licenses/other/

    Description

    Yahoo! Learning to Rank Challenge, version 1.0

    Machine learning has been successfully applied to web search ranking, and the goal of this dataset is to benchmark such machine learning algorithms. The dataset consists of features extracted from (query,url) pairs along with relevance judgments. The queries, URLs, and feature descriptions are not given; only the feature values are. There are two datasets in this distribution: a large one and a small one. Each dataset is divided in 3 sets:… See the full description on the dataset page: https://huggingface.co/datasets/YahooResearch/Yahoo-Learning-to-Rank-Challenge.

  6. Zimnat Insurance Recommendation Challenge

    • kaggle.com
    Updated Sep 14, 2020
    Cite
    Naman Jaswani (2020). Zimnat Insurance Recommendation Challenge [Dataset]. https://www.kaggle.com/namanj27/zindiml
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 14, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Naman Jaswani
    Description

    This dataset is taken from the recent competition on Zindi: Competition_page

  7. General machine learning challenges, as reported in the literature.

    • datasetcatalog.nlm.nih.gov
    Updated Jan 31, 2023
    Cite
    Blasimme, Alessandro; Puhan, Milo Alan; Amann, Julia; Vayena, Effy; Gille, Felix; Hubbs, Shannon; Landers, Constantin; Daniore, Paola; Nittas, Vasileios (2023). General machine learning challenges, as reported in the literature. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001021304
    Explore at:
    Dataset updated
    Jan 31, 2023
    Authors
    Blasimme, Alessandro; Puhan, Milo Alan; Amann, Julia; Vayena, Effy; Gille, Felix; Hubbs, Shannon; Landers, Constantin; Daniore, Paola; Nittas, Vasileios
    Description

    General machine learning challenges, as reported in the literature.

  8. Top marketing challenges of AI and ML usage in U.S. companies 2022

    • statista.com
    Updated Jul 3, 2025
    Cite
    Statista (2025). Top marketing challenges of AI and ML usage in U.S. companies 2022 [Dataset]. https://www.statista.com/statistics/1364821/ai-ml-usage-company-challenges-us/
    Explore at:
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jul 2022
    Area covered
    United States
    Description

    A survey carried out in July 2022 in the United States found that ** percent of marketing professionals who were using artificial intelligence (AI) and machine learning (ML) tools in their marketing programs said that their leading challenges were risk and governance issues. Another ** percent of professionals working with AI and ML stated that the AI and ML technologies were too difficult to use or deploy. An additional ** percent of U.S. marketing professionals shared that the costs of AI and ML were a challenge as well.

  9. HackerEarth Challenge : Adopt a Buddy

    • kaggle.com
    zip
    Updated Aug 14, 2020
    + more versions
    Cite
    Aadarsh Srivastava (2020). HackerEarth Challenge : Adopt a Buddy [Dataset]. https://www.kaggle.com/aadarsh168/hackerearth-challenge-adopt-a-buddy
    Explore at:
    Available download formats: zip (553282 bytes)
    Dataset updated
    Aug 14, 2020
    Authors
    Aadarsh Srivastava
    Description

    Problem statement

    Having a pet is one of life’s most fulfilling experiences. Your pets spoil you with their love, compassion, and loyalty. And dare anyone lay a finger on you in your pet’s presence, they are in for a lot of trouble. Thanks to social media, videos of clumsy and fussy (yet adorable) pets from across the globe entertain you all day long. Their love is pure and infinite. So, in return, all pets deserve a warm and loving family, indeed. And occasional boops, of course.

    Numerous organizations across the world provide shelter to all homeless animals until they are adopted into a new home. However, finding a loving family for them can be a daunting task at times. This International Homeless Animals Day, we present a Machine Learning challenge to you: Adopt a buddy.

    The brighter side of the pandemic is an increase in animal adoption and fostering. To ensure that their customers stay indoors, a leading pet adoption agency plans on creating a virtual-tour experience, showcasing all animals available in their shelter. To enable that, you have been tasked to build a Machine Learning model that determines the type and breed of the animal based on its physical attributes and other factors.

    Dataset

    The dataset consists of parameters such as a unique ID assigned to each animal that is up for adoption, the date on which they arrived at the shelter, their physical attributes such as color, length, and height, among other factors.

    The benefits of practicing this problem by using Machine Learning techniques are as follows:

    • This challenge will help you to actively enhance your knowledge of multi-label classification. It is one of the basic building blocks of Machine Learning.
    • We challenge you to build a predictive model that detects the type and breed of an animal based on its condition, appearance, and other factors.

    Prizes

    Considering these unprecedented times that the world is facing due to the Coronavirus pandemic, we wish to do our bit and contribute the prize money for the welfare of society.

    Overview

    Machine Learning is an application of Artificial Intelligence (AI) that provides systems with the ability to automatically learn and improve from experiences without being explicitly programmed. Machine Learning is a Science that determines patterns in data. These patterns provide a deeper meaning to problems. First, it helps you understand the problems better and then solve the same with elegance.

    Here is the new HackerEarth Machine Learning Challenge—Adopt a buddy

    This challenge is designed to help you improve your Machine Learning skills by competing and learning from fellow participants.

    Why should you participate?

    • To analyze and implement multiple algorithms, and determine which is more appropriate for a problem.
    • To get hands-on experience of Machine Learning problems.

    Who should participate?

    • Working professionals.
    • Data Science or Machine Learning enthusiasts.
    • College students (if you understand the basics of predictive modeling).

  10. Open Cities AI Challenge Dataset

    • cmr.earthdata.nasa.gov
    Updated Oct 10, 2023
    Cite
    (2023). Open Cities AI Challenge Dataset [Dataset]. http://doi.org/10.34911/rdnt.f94cxb
    Explore at:
    Dataset updated
    Oct 10, 2023
    Time period covered
    Jan 1, 2020 - Jan 1, 2023
    Area covered
    Description

    This dataset was developed as part of a challenge to segment building footprints from aerial imagery. The goal of the challenge was to accelerate the development of more accurate, relevant, and usable open-source AI models to support mapping for disaster risk management in African cities. The data consists of drone imagery from 10 different cities and regions across Africa.

  11. Fruits Recognition

    • kaggle.com
    Updated Jul 24, 2021
    Cite
    vaishnavi.khilari (2021). Fruits Recognition [Dataset]. https://www.kaggle.com/datasets/vaishnavikhilari/fruits-recognition
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 24, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    vaishnavi.khilari
    Description

    Context

    There's a story behind every dataset and here's your opportunity to share yours.

    Content

    What's inside is more than just rows and columns. Make it easy for others to get started by describing how you acquired the data and what time period it represents, too.

    Acknowledgements

    We wouldn't be here without the help of others. If you owe any attributions or thanks, include them here along with any citations of past research.

    Inspiration

    Your data will be in front of the world's largest data science community. What questions do you want to see answered?

  12. BACH Dataset : Grand Challenge on Breast Cancer Histology images

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 31, 2020
    Cite
    Aguiar, Paulo (2020). BACH Dataset : Grand Challenge on Breast Cancer Histology images [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3632034
    Explore at:
    Dataset updated
    Jan 31, 2020
    Dataset provided by
    Polónia, António
    Aguiar, Paulo
    Eloy, Catarina
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    i3S Annotated Datasets on Digital Pathology

    WELCOME

    In an effort to contribute and push forward the field of Digital Pathology, Ipatimup and INEB, two major research institutions in Portugal, have joined forces in the construction of histology datasets to support grand Challenges on automatic classification of tissue malignancy. The researchers/pathologists responsible for the datasets are:

    António Polónia (MD), Ipatimup/i3S

    Catarina Eloy (MD, PhD), Ipatimup/i3S

    Paulo Aguiar (PhD), INEB/i3S

    This specific page refers to the Grand Challenge on Breast Cancer Histology images, or BACH Challenge

    THE BACH CHALLENGE DATASET

    ICIAR 2018 - Grand Challenge on Breast Cancer Histology images [Challenge organized by Teresa Araújo, Guilherme Aresta, António Polónia, Catarina Eloy and Paulo Aguiar]

    For detailed information visit: https://iciar2018-challenge.grand-challenge.org/home/

    THIS DATASET IS PUBLICLY AVAILABLE UNDER A CREATIVE COMMONS CC BY-NC-ND LICENSE (ATTRIBUTION-NONCOMMERCIAL-NODERIVS). ESSENTIALLY, YOU ARE GRANTED ACCESS TO THE DATASET FOR USE IN YOUR RESEARCH AS LONG AS YOU CREDIT OUR WORK/PUBLICATIONS(*), BUT YOU CANNOT CHANGE THEM IN ANY WAY OR USE THEM COMMERCIALLY.

    (*) Aresta, Guilherme, et al. "BACH: Grand challenge on breast cancer histology images." Medical image analysis (2019).

    (*) Araújo, Teresa, et al. "Classification of breast cancer histology images using convolutional neural networks." PloS one 12.6 (2017): e0177544.

    (*) Fondón, Irene, et al. "Automatic classification of tissue malignancy for breast carcinoma diagnosis." Computers in biology and medicine 96 (2018): 41-51.

  13. HackerEarth Deep Learning Challenge - Dance Forms

    • kaggle.com
    Updated May 23, 2020
    Cite
    Krishna Chaitanya (2020). HackerEarth Deep Learning Challenge - Dance Forms [Dataset]. https://www.kaggle.com/krsna540/hackerearth-deep-learning-challenge-dance-forms/activity
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 23, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Krishna Chaitanya
    Description

    This dataset has been copied from HackerEarth with the sole intention of making it available on Kaggle, so that people can use TPUs to build models and improve their skills.

    Problem statement

    This International Dance Day, an event management company organized an evening of Indian classical dance performances to celebrate the rich, eloquent, and elegant art of dance. Post the event, the company planned to create a microsite to promote and raise awareness among the public about these dance forms. However, identifying them from images is a tough nut to crack.

    You have been appointed as a Machine Learning Engineer for this project. Build an image tagging Deep Learning model that can help the company classify these images into eight categories of Indian classical dance.

    Dataset

    The dataset consists of 364 images belonging to 8 categories, namely manipuri, bharatanatyam, odissi, kathakali, kathak, sattriya, kuchipudi, and mohiniyattam.

    The benefits of practicing this problem by using Machine Learning/Deep Learning techniques are as follows:

    • This challenge will encourage you to apply your Machine Learning skills to build models that classify images into multiple categories.
    • This challenge will help you actively enhance your knowledge of classification. It is one of the basic building blocks of Machine Learning and Deep Learning.
    • We challenge you to build a model that auto-tags images and classifies them into various categories of Indian classical dance forms.

  14. Kaggle Display Advertising Challenge dataset

    • datasetcatalog.nlm.nih.gov
    Updated Dec 24, 2017
    Cite
    Jiang, Zilong (2017). Kaggle Display Advertising Challenge dataset [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001796020
    Explore at:
    Dataset updated
    Dec 24, 2017
    Authors
    Jiang, Zilong
    Description

    Criteo Display Advertising Challenge dataset, provided by the Criteo company on the famous machine learning website Kaggle for advertising click-through rate (CTR) prediction.

  15. Image-related challenges of machine learning, as reported in the literature....

    • datasetcatalog.nlm.nih.gov
    Updated Jan 31, 2023
    Cite
    Nittas, Vasileios; Daniore, Paola; Vayena, Effy; Puhan, Milo Alan; Gille, Felix; Amann, Julia; Hubbs, Shannon; Landers, Constantin; Blasimme, Alessandro (2023). Image-related challenges of machine learning, as reported in the literature. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001021290
    Explore at:
    Dataset updated
    Jan 31, 2023
    Authors
    Nittas, Vasileios; Daniore, Paola; Vayena, Effy; Puhan, Milo Alan; Gille, Felix; Amann, Julia; Hubbs, Shannon; Landers, Constantin; Blasimme, Alessandro
    Description

    Image-related challenges of machine learning, as reported in the literature.

  16. Cadenza Challenge (CAD1): databases for the First Cadenza Challenge - Task1

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated Aug 9, 2024
    Cite
    Gerardo Roa Dabike; Trevor John Cox (2024). Cadenza Challenge (CAD1): databases for the First Cadenza Challenge - Task1 [Dataset]. http://doi.org/10.5281/zenodo.13285384
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Aug 9, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Gerardo Roa Dabike; Trevor John Cox
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Cadenza

    This is the training, validation and evaluation data for the First Cadenza Challenge - Task 1.

    The Cadenza Challenges are improving music production and processing for people with a hearing loss. According to The World Health Organization, 430 million people worldwide have a disabling hearing loss. Studies show that not being able to understand lyrics is an important problem to tackle for those with hearing loss. Consequently, this task is about improving the intelligibility of lyrics when listening to pop/rock over headphones. But this needs to be done without losing too much audio quality - you can't improve intelligibility just by turning off the rest of the band! We will be using one metric for intelligibility and another metric for audio quality, and giving you different targets to explore the balance between these metrics.

    Please see the Cadenza website for a full description of the data

  17. Trojan Detection Software Challenge - Round 2 Holdout Dataset

    • cloud.csiss.gmu.edu
    • data.amerigeoss.org
    gz, text
    Updated Oct 30, 2020
    + more versions
    Cite
    United States (2020). Trojan Detection Software Challenge - Round 2 Holdout Dataset [Dataset]. http://identifiers.org/ark:/88434/mds2-2322
    Explore at:
    Available download formats: gz, text
    Dataset updated
    Oct 30, 2020
    Dataset provided by
    United States
    License

    https://www.nist.gov/open/license

    Description

    The data being generated and disseminated is the holdout data used to evaluate trojan detection software solutions. This data, generated at NIST, consists of human-level AIs trained to perform a variety of tasks (image classification, natural language processing, etc.). A known percentage of these trained AI models have been poisoned with a known trigger that induces incorrect behavior. This data will be used to develop software solutions for detecting which trained AI models have been poisoned via embedded triggers. This dataset consists of 144 trained, human-level image classification AI models using a variety of architectures. The models were trained on synthetically created image data of non-real traffic signs superimposed on road background scenes. Half (50%) of the models have been poisoned with an embedded trigger that causes misclassification of the images when the trigger is present.

  18. Trojan Detection Software Challenge - Round 1 Test Dataset

    • cloud.csiss.gmu.edu
    • data.amerigeoss.org
    gz, text
    Updated Aug 31, 2020
    + more versions
    Cite
    United States (2020). Trojan Detection Software Challenge - Round 1 Test Dataset [Dataset]. http://identifiers.org/ark:/88434/mds2-2283
    Explore at:
    Available download formats: text, gz
    Dataset updated
    Aug 31, 2020
    Dataset provided by
    United States
    License

    https://www.nist.gov/open/license

    Description

    The data being generated and disseminated is the test data used to evaluate trojan detection software solutions. This data, generated at NIST, consists of human-level AIs trained to perform a variety of tasks (image classification, natural language processing, etc.). A known percentage of these trained AI models have been poisoned with a known trigger that induces incorrect behavior. This data will be used to develop software solutions for detecting which trained AI models have been poisoned via embedded triggers. This dataset consists of 1000 trained, human-level image classification AI models using the following architectures: Inception-v3, DenseNet-121, and ResNet50. The models were trained on synthetically created image data of non-real traffic signs superimposed on road background scenes. Half (50%) of the models have been poisoned with an embedded trigger that causes misclassification of the images when the trigger is present.

    Errata: This dataset had a software bug in the trigger embedding code that caused 2 models trained for this dataset to have a ground-truth value of 'poisoned' even though they did not contain any embedded triggers. These models should not be used.

    Models Without a Trigger Embedded: id-00000077, id-00000083
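
    A minimal sketch of how the two affected models could be excluded before evaluation. The helper `usable_models` and the sample list `all_model_ids` are hypothetical; only the two excluded IDs come from the errata above.

```python
# Model IDs flagged in the errata: labelled 'poisoned' but with no trigger embedded.
EXCLUDED_IDS = {"id-00000077", "id-00000083"}

def usable_models(model_ids):
    """Return model IDs in their original order, dropping the errata models."""
    return [m for m in model_ids if m not in EXCLUDED_IDS]

# Hypothetical sample of IDs, formatted like those listed in the errata.
all_model_ids = [f"id-{i:08d}" for i in range(75, 85)]
kept = usable_models(all_model_ids)
```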

  19. Synapse2015

    • figshare.com
    zip
    Updated May 30, 2023
    Cite
    Xiujian Hu (2023). Synapse2015 [Dataset]. http://doi.org/10.6084/m9.figshare.22012538.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Xiujian Hu
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A platform for end-to-end development of machine learning solutions in biomedical imaging. Challenges in medical image analysis became popular after the organization of the Grand Challenges for Medical Image Analysis at the MICCAI conference in 2007. Hosting challenge events quickly became commonplace for conferences: MICCAI, ISBI, and SPIE Medical Imaging, amongst others, have hosted challenge events. Leading journals such as IEEE Transactions on Medical Imaging and Medical Image Analysis have welcomed overview papers that described the results of individual challenges.

    Maintaining a challenge, so that new submissions are quickly processed upon submission, is a lot of work. Typically, a junior researcher at some institution is responsible for maintaining a challenge website, but at some point the researcher moves on and the site is no longer kept up-to-date.

    Grand Challenge was created in 2010 to make it easy for organizers of challenges to set up a website for a particular challenge. Its aim was to make all information on challenges in the domain of biomedical image analysis available in a single place. In 2012 we switched to the Django web framework, marking 2012 as our founding year.

  20. TrackML challenge Accuracy phase dataset (Tracking Machine Learning...

    • opendatalab.com
    zip
    Updated Aug 6, 2018
    Cite
    Bosch Center for Artificial Intelligence (2018). TrackML challenge Accuracy phase dataset (Tracking Machine Learning Challenge) [Dataset]. https://opendatalab.com/OpenDataLab/TrackML_challenge_Accuracy_phase_etc
    Explore at:
    Available download formats: zip (237421573480 bytes)
    Dataset updated
    Aug 6, 2018
    Dataset provided by
    IBM (http://ibm.com/)
    Bosch Center for Artificial Intelligence
    University of Lisbon
    Goethe University Frankfurt
    Norwegian University of Science and Technology
    University of Massachusetts
    University of California, Berkeley
    California Institute of Technology
    Geneva University
    Sorbonne University
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset comprises multiple independent events, where each event contains simulated measurements (essentially 3D points) of particles generated in a collision between proton bunches at the Large Hadron Collider at CERN. The goal of the tracking machine learning challenge is to group the recorded measurements, or hits, for each event into tracks: sets of hits that belong to the same initial particle. A solution must uniquely associate each hit to one track. The training dataset contains the recorded hits, their ground-truth counterparts and their association to particles, and the initial parameters of those particles. The test dataset contains only the recorded hits. The dataset was used for the Accuracy Phase of the Tracking Machine Learning challenge on Kaggle. See more details in the home page url.
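
    The requirement that every hit belongs to exactly one track can be checked with a short sketch like the one below. The representation of a solution as a list of `(hit_id, track_id)` pairs and the function name `check_unique_assignment` are assumptions made for illustration, not the official submission format or scoring code.

```python
from collections import Counter

def check_unique_assignment(hit_ids, assignments):
    """Check that every hit in hit_ids appears exactly once in assignments.

    assignments is a list of (hit_id, track_id) pairs; this layout is assumed
    for illustration and is not the official submission format.
    """
    counts = Counter(hit for hit, _ in assignments)
    missing = [h for h in hit_ids if counts[h] == 0]    # hits with no track
    duplicated = [h for h in hit_ids if counts[h] > 1]  # hits given several tracks
    return (not missing and not duplicated), missing, duplicated

# Valid toy solution: each of the three hits is assigned once.
ok, missing, duplicated = check_unique_assignment(
    [1, 2, 3], [(1, 10), (2, 10), (3, 11)]
)

# Invalid toy solution: hit 1 appears twice and hit 3 is never assigned.
bad_ok, bad_missing, bad_duplicated = check_unique_assignment(
    [1, 2, 3], [(1, 10), (1, 11), (2, 10)]
)
```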
