Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The provided dataset comprises 43 instances of temporal bone volume CT scans. The scans were performed on human cadaveric specimens with a resulting isotropic voxel size of 99 × 99 × 99 µm³. Voxel-wise image labels of the fluid space of the bony labyrinth, subdivided into the three semantic classes cochlear volume, vestibular volume and semicircular canal volume, are provided. In addition, each dataset contains JSON-like descriptor data defining the voxel coordinates of the anatomical landmarks: (1) apex of the cochlea, (2) oval window and (3) round window. The dataset can be used to train and evaluate machine learning models for automated inner ear analysis under the supervised learning paradigm.
Usage Notes
The datasets are stored in the HDF5 format developed by the HDF Group. We used, and therefore recommend, the h5py Python bindings to handle the datasets.
The flat-panel volume CT raw data, labels and landmarks are saved in the HDF5-internal file structure using the following groups and datasets:
raw/raw-0
label/label-0
landmark/landmark-0
landmark/landmark-1
landmark/landmark-2
The raw and label arrays can be read from the file, for example as numpy.ndarray, by indexing into an opened h5py file handle. Further metadata is contained in the attribute dictionaries of the raw and label datasets.
Landmark coordinate data is available as an attribute dict and contains the coordinate system (LPS or RAS), IJK voxel coordinates and label information. The helicotrema (cochlea top) is consistently saved in landmark 0, the oval window in landmark 1 and the round window in landmark 2. Read as a Python dictionary, exemplary landmark information for a dataset reads as follows:
{'coordsys': 'LPS', 'id': 1, 'ijk_position': array([181, 188, 100]), 'label': 'CochleaTop', 'orientation': array([-1., -0., -0., -0., -1., -0., 0., 0., 1.]), 'xyz_position': array([ 44.21109689, -139.38058589, -183.48249736])}
{'coordsys': 'LPS', 'id': 2, 'ijk_position': array([222, 182, 145]), 'label': 'OvalWindow', 'orientation': array([-1., -0., -0., -0., -1., -0., 0., 0., 1.]), 'xyz_position': array([ 48.27890112, -139.95991131, -179.04103763])}
{'coordsys': 'LPS', 'id': 3, 'ijk_position': array([223, 209, 147]), 'label': 'RoundWindow', 'orientation': array([-1., -0., -0., -0., -1., -0., 0., 0., 1.]), 'xyz_position': array([ 48.33120126, -137.27135678, -178.8665465 ])}
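A minimal sketch for reading the raw volume, label volume and landmark attributes with h5py, following the group and dataset names listed above (the file name below is a placeholder; the actual dataset file names may differ):

```python
import h5py

# Hypothetical file name; substitute an actual dataset file from the repository.
with h5py.File("temporal_bone_01.h5", "r") as f:
    raw = f["raw/raw-0"][...]          # CT volume as numpy.ndarray
    label = f["label/label-0"][...]    # voxel-wise semantic labels
    raw_meta = dict(f["raw/raw-0"].attrs)  # additional metadata

    # Landmark 0: cochlea top, 1: oval window, 2: round window
    for i in range(3):
        lm = dict(f[f"landmark/landmark-{i}"].attrs)
        print(lm["label"], lm["ijk_position"], lm["coordsys"])
```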
https://heidata.uni-heidelberg.de/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.11588/DATA/LWN9XE
This repository contains code for reproducing the experiments in Marasovic and Frank (2018).

Paper abstract: For over a decade, machine learning has been used to extract opinion-holder-target structures from text to answer the question "Who expressed what kind of sentiment towards what?". Recent neural approaches do not outperform the state-of-the-art feature-based models for Opinion Role Labeling (ORL). We suspect this is due to the scarcity of labeled training data and address this issue using different multi-task learning (MTL) techniques with a related task which has substantially more data, i.e. Semantic Role Labeling (SRL). We show that two MTL models improve significantly over the single-task model for labeling of both holders and targets, on the development and the test sets. We found that the vanilla MTL model, which makes predictions using only shared ORL and SRL features, performs the best. With deeper analysis, we determine what works and what might be done to make further improvements for ORL.

Data for ORL: Download the MPQA 2.0 corpus. Check mpqa2-pytools for example usage. Splits can be found in the datasplit folder.

Data for SRL: The data is provided by the CoNLL-2005 Shared Task, but the original words are from the Penn Treebank dataset, which is not publicly available.

How to train models?
python main.py --adv_coef 0.0 --model fs --exp_setup_id new --n_layers_orl 0 --begin_fold 0 --end_fold 4
python main.py --adv_coef 0.0 --model html --exp_setup_id new --n_layers_orl 1 --n_layers_shared 2 --begin_fold 0 --end_fold 4
python main.py --adv_coef 0.0 --model sp --exp_setup_id new --n_layers_orl 3 --begin_fold 0 --end_fold 4
python main.py --adv_coef 0.1 --model asp --exp_setup_id prior --n_layers_orl 3 --begin_fold 0 --end_fold 10
https://www.archivemarketresearch.com/privacy-policy
The AI Training Dataset Market size was valued at USD 2124.0 million in 2023 and is projected to reach USD 8593.38 million by 2032, exhibiting a CAGR of 22.1% during the forecast period. An AI training dataset is a collection of data used to train machine learning models. It typically includes labeled examples, where each data point has an associated output label or target value. The quality and quantity of this data are crucial for the model's performance. A well-curated dataset ensures the model learns relevant features and patterns, enabling it to generalize effectively to new, unseen data. Training datasets can encompass various data types, including text, images, audio, and structured data. The driving forces behind this growth include:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data entry contains annotated mouse data from the DeepLabCut Nature Neuroscience paper.
This data entry contains a public release of annotated mouse data from the DeepLabCut paper. The trail-tracking behavior is part of an investigation into odor guided navigation, where one or multiple wildtype (C57BL/6J) mice are running on a paper spool and following odor trails. These experiments were carried out by Alexander Mathis & Mackenzie Mathis in the Murthy lab at Harvard University.
Data was recorded by two different cameras (640×480 pixels with Point Grey Firefly (FMVU-03MTM-CS), and at approximately 1,700×1,200 pixels with Grasshopper 3 4.1MP Mono USB3 Vision (CMOSIS CMV4000-3E12)) at 30 Hz. The latter images were cropped around mice to generate images that are approximately 800×800.
Here we share 1,066 frames from multiple experimental sessions observing 7 different mice. Pranav Mamidanna labeled the snout, the tips of the left and right ears, and the base of the tail in the example images. The data is organized in the DeepLabCut 2.0 project structure, with images and annotations in the labeled-data folder. The names are pseudocodes indicating mouse id and session id, e.g. m4s1 = mouse 4, session 1.
Code for loading, visualizing & training deep neural networks available at https://github.com/DeepLabCut/DeepLabCut.
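As a hedged sketch of how such annotations can be inspected: DeepLabCut 2.0 projects usually store per-video annotations in labeled-data/<video>/CollectedData_<scorer>.h5 (and .csv) with a (scorer, bodypart, coordinate) column MultiIndex. The scorer name and exact file name below are placeholders, not confirmed for this release.

```python
import pandas as pd

# Hypothetical path: folder name follows the m4s1 convention described above,
# the scorer suffix is a placeholder.
df = pd.read_hdf("labeled-data/m4s1/CollectedData_Pranav.h5")

print(df.columns)  # MultiIndex of (scorer, bodypart, x/y coordinate)
print(df.head())   # one row of keypoint coordinates per labeled frame
```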
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The ranking of the algorithms.
Data Description
The DIPSER dataset is designed to assess student attention and emotion in in-person classroom settings, consisting of RGB camera data, smartwatch sensor data, and labeled attention and emotion metrics. It includes multiple camera angles per student to capture posture and facial expressions, complemented by smartwatch data for inertial and biometric metrics. Attention and emotion labels are derived from self-reports and expert evaluations. The dataset includes diverse demographic groups, with data collected in real-world classroom environments, facilitating the training of machine learning models for predicting attention and correlating it with emotional states.

Data Collection and Generation Procedures
The dataset was collected in a natural classroom environment at the University of Alicante, Spain. The recording setup consisted of six general cameras positioned to capture the overall classroom context and individual cameras placed at each student's desk. Additionally, smartwatches were used to collect biometric data, such as heart rate, accelerometer, and gyroscope readings.

Experimental Sessions
Nine distinct educational activities were designed to ensure a comprehensive range of engagement scenarios:
1. News Reading – Students read projected or device-displayed news.
2. Brainstorming Session – Idea generation for problem-solving.
3. Lecture – Passive listening to an instructor-led session.
4. Information Organization – Synthesizing information from different sources.
5. Lecture Test – Assessment of lecture content via mobile devices.
6. Individual Presentations – Students present their projects.
7. Knowledge Test – Conducted using Kahoot.
8. Robotics Experimentation – Hands-on session with robotics.
9. MTINY Activity Design – Development of educational activities with computational thinking.

Technical Specifications
RGB Cameras: Individual cameras recorded at 640×480 pixels, while context cameras captured at 1280×720 pixels.
Frame Rate: 9-10 FPS depending on the setup.
Smartwatch Sensors: Collected heart rate, accelerometer, gyroscope, rotation vector, and light sensor data at a frequency of 1–100 Hz.

Data Organization and Formats
The dataset follows a structured directory format:
/groupX/experimentY/subjectZ.zip
Each subject-specific folder contains:
images/ (individual facial images)
watch_sensors/ (sensor readings in JSON format)
labels/ (engagement & emotion annotations)
metadata/ (subject demographics & session details)
A loading sketch is given after this description.

Annotations and Labeling
Each data entry includes engagement levels (1-5) and emotional states (9 categories) based on both self-reported labels and evaluations by four independent experts. A custom annotation tool was developed to ensure consistency across evaluations.

Missing Data and Data Quality
Synchronization: A centralized server ensured time alignment across devices. Brightness changes were used to verify synchronization.
Completeness: No major missing data, except for occasional random frame drops due to embedded device performance.
Data Consistency: Uniform collection methodology across sessions, ensuring high reliability.

Data Processing Methods
To enhance usability, the dataset includes preprocessed bounding boxes for face, body, and hands, along with gaze estimation and head pose annotations. These were generated using YOLO, MediaPipe, and DeepFace.

File Formats and Accessibility
Images: Stored in standard JPEG format.
Sensor Data: Provided as structured JSON files.
Labels: Available as CSV files with timestamps.
The dataset is publicly available under the CC-BY license and can be accessed, along with the necessary processing scripts, via the DIPSER GitHub repository.

Potential Errors and Limitations
Due to camera angles, some student movements may be out of frame in collaborative sessions.
Lighting conditions vary slightly across experiments.
Sensor latency variations are minimal but exist due to embedded device constraints.

Citation
If you find this project helpful for your research, please cite our work using the following bibtex entry:

@misc{marquezcarpintero2025dipserdatasetinpersonstudent1,
  title={DIPSER: A Dataset for In-Person Student Engagement Recognition in the Wild},
  author={Luis Marquez-Carpintero and Sergio Suescun-Ferrandiz and Carolina Lorenzo Álvarez and Jorge Fernandez-Herrero and Diego Viejo and Rosabel Roig-Vila and Miguel Cazorla},
  year={2025},
  eprint={2502.20209},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2502.20209},
}

Usage and Reproducibility
Researchers can use standard tools such as OpenCV, TensorFlow, and PyTorch for analysis. The dataset supports research in machine learning, affective computing, and education analytics, offering a unique resource for engagement and attention studies in real-world classroom environments.
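The sketch below shows one hedged way to traverse the /groupX/experimentY/subjectZ.zip layout described above. The file names inside each per-subject archive (CSV labels, JSON sensor readings) are assumptions based on the description, not a confirmed layout; adjust the patterns to the actual contents.

```python
import json
import zipfile
from pathlib import Path

import pandas as pd

root = Path("DIPSER")  # hypothetical dataset root directory
for subject_zip in root.glob("group*/experiment*/subject*.zip"):
    with zipfile.ZipFile(subject_zip) as zf:
        names = zf.namelist()
        # Labels are described as CSV files with timestamps; sensors as JSON files.
        label_files = [n for n in names if "labels/" in n and n.endswith(".csv")]
        sensor_files = [n for n in names if "watch_sensors/" in n and n.endswith(".json")]
        if label_files:
            labels = pd.read_csv(zf.open(label_files[0]))
        if sensor_files:
            sensors = json.load(zf.open(sensor_files[0]))
        print(subject_zip.name, len(label_files), "label files,", len(sensor_files), "sensor files")
```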
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The code for the paper "Semi-supervised non-negative matrix factorization with structure preserving for image clustering". The paper constructs a new weighted label matrix and, from it, a label constraint regularizer that both utilizes the label information and maintains the intrinsic structure of NMF. Based on the label constraint regularizer, the basis images of the labeled data are extracted to monitor and modify the basis-image learning of all data through a basis regularizer. By incorporating the label constraint regularizer and the basis regularizer into NMF, a new semi-supervised NMF method is introduced. The proposed method is applied to image clustering, and experimental results demonstrate its effectiveness in comparison with state-of-the-art unsupervised and semi-supervised algorithms.
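For context only, the minimal NumPy sketch below shows the standard multiplicative-update NMF that such label-constrained variants build on; it is illustrative, does not implement the paper's label constraint or basis regularizers, and is not the authors' code.

```python
import numpy as np

def nmf(X, k, n_iter=200, eps=1e-10, seed=0):
    """Basic Lee-Seung multiplicative updates minimizing ||X - W @ H||_F^2."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k))
    H = rng.random((k, m))
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # update coefficients
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # update basis images
    return W, H

# Semi-supervised variants such as the one described above add regularization
# terms so that samples sharing a label share coefficients in H.
```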
https://www.wiseguyreports.com/pages/privacy-policy
BASE YEAR | 2024
HISTORICAL DATA | 2019 - 2024
REPORT COVERAGE | Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
MARKET SIZE 2023 | 2.83 (USD Billion)
MARKET SIZE 2024 | 3.38 (USD Billion)
MARKET SIZE 2032 | 14.02 (USD Billion)
SEGMENTS COVERED | Deployment Model, Organization Size, Industry Vertical, Data Type, Application, Regional
COUNTRIES COVERED | North America, Europe, APAC, South America, MEA
KEY MARKET DYNAMICS | Increasing data privacy regulations; growing need for data security and compliance; proliferation of unstructured data; rise of artificial intelligence and machine learning; adoption of cloud-based data storage
MARKET FORECAST UNITS | USD Billion
KEY COMPANIES PROFILED | Informatica, Oracle, Symantec, IBM, Splunk, Varonis Systems, Digital Guardian, STEALTHbits Technologies, Cybereason, Netskope, FireEye, Trustwave, Check Point Software Technologies
MARKET FORECAST PERIOD | 2024 - 2032
KEY MARKET OPPORTUNITIES | Increase in data breaches; growing adoption of cloud and SaaS solutions; need for data protection and compliance regulations; emergence of AI and ML technologies; growing focus on data privacy
COMPOUND ANNUAL GROWTH RATE (CAGR) | 19.46% (2024 - 2032)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A data set of ligands used to evaluate the CheckMyBlob method, described in the Kowiel et al. paper "Automatic recognition of ligands in electron density by machine learning methods".
This data set repeats the setup used in the study of Carolan & Lamzin titled "Automated identification of crystallographic ligands using sparse-density representations". It consists of ligands from X-ray diffraction experiments with 1.0–2.5 Å resolution. Adjacent PDB ligands were not connected. Ligands were labeled according to the PDB naming convention. The data set was limited to the 82 ligand types listed by Carolan & Lamzin. The resulting data set consists of 121,360 examples with ligand counts ranging from 42,622 examples for SO4 to 16 for SPO (spheroidene).
For machine learning (classification) purposes, the target attribute is: res_name.
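A hedged sketch of how this data set could be used for classification, assuming it is available as a CSV file with res_name as the target column; the file name below is a placeholder, and the choice of classifier is illustrative rather than the one used in the CheckMyBlob study.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

df = pd.read_csv("checkmyblob_ligands.csv")   # placeholder file name
y = df["res_name"]                            # ligand type, e.g. SO4, SPO, ...
X = df.drop(columns=["res_name"]).select_dtypes("number")

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("accuracy:", clf.score(X_test, y_test))
```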
These images and associated binary labels were collected from collaborators across multiple universities to serve as a diverse representation of biomedical images of vessel structures, for use in the training and validation of machine learning tools for vessel segmentation. The dataset contains images from a variety of imaging modalities, at different resolutions, using different sources of contrast and featuring different organs/pathologies. This data was used to train, test and validate a foundational model for 3D vessel segmentation, tUbeNet, which can be found on GitHub. The paper describing the training and validation of the model can be found here.

Filenames are structured as follows:
Data - [Modality][species Organ][resolution].tif
Labels - [Modality][species Organ][resolution]labels.tif
Sub-volumes of larger dataset - [Modality][species Organ]_subvolume[dimensions in pixels].tif

Manual labelling of blood vessels was carried out using Amira (2020.2, Thermo-Fisher, UK).

Training data:
opticalHREM_murineLiver_2.26x2.26x1.75um.tif: A high resolution episcopic microscopy (HREM) dataset, acquired in house by staining a healthy mouse liver with Eosin B and imaging it using a standard HREM protocol. NB: 25% of this image volume was withheld from training, for use as test data.
CT_murineTumour_20x20x20um.tif: X-ray microCT images of a microvascular cast, taken from a subcutaneous mouse model of colorectal cancer (acquired in house). NB: 25% of this image volume was withheld from training, for use as test data.
RSOM_murineTumour_20x20um.tif: Raster-Scanning Optoacoustic Mesoscopy (RSOM) data from a subcutaneous tumour model (provided by Emma Brown, Bohndiek Group, University of Cambridge). The image data has undergone filtering to reduce the background (Brown et al., 2019).
OCTA_humanRetina_24x24um.tif: Retinal angiography data obtained using Optical Coherence Tomography Angiography (OCT-A) (provided by Dr Ranjan Rajendram, Moorfields Eye Hospital).

Test data:
MRI_porcineLiver_0.9x0.9x5mm.tif: T1-weighted Balanced Turbo Field Echo Magnetic Resonance Imaging (MRI) data from a machine-perfused porcine liver, acquired in house.
MFHREM_murineTumourLectin_2.76x2.76x2.61um.tif: A subcutaneous colorectal tumour mouse model imaged in house using multi-fluorescence HREM, with Dylight 647 conjugated lectin staining the vasculature (Walsh et al., 2021). The image data has been processed using an asymmetric deconvolution algorithm described by Walsh et al., 2020. NB: A sub-volume of 480x480x640 voxels was manually labelled (MFHREM_murineTumourLectin_subvolume480x480x640.tif).
MFHREM_murineBrainLectin_0.85x0.85x0.86um.tif: An MF-HREM image of the cortex of a mouse brain, stained with Dylight-647 conjugated lectin, acquired in house (Walsh et al., 2021). The image data has been downsampled and processed using an asymmetric deconvolution algorithm described by Walsh et al., 2020. NB: A sub-volume of 1000x1000x99 voxels was manually labelled. This sub-volume is provided at full resolution and without preprocessing (MFHREM_murineBrainLectin_subvol_0.57x0.57x0.86um.tif).
2Photon_murineOlfactoryBulbLectin_0.2x0.46x5.2um.tif: Two-photon data of mouse olfactory bulb blood vessels, labelled with sulforhodamine 101, kindly provided by Yuxin Zhang at the Sensory Circuits and Neurotechnology Lab, the Francis Crick Institute (Bosch et al., 2022). NB: A sub-volume of 500x500x79 voxels was manually labelled (2Photon_murineOlfactoryBulbLectin_subvolume500x500x79.tif).
References:
Bosch, C., Ackels, T., Pacureanu, A., Zhang, Y., Peddie, C. J., Berning, M., Rzepka, N., Zdora, M. C., Whiteley, I., Storm, M., Bonnin, A., Rau, C., Margrie, T., Collinson, L., & Schaefer, A. T. (2022). Functional and multiscale 3D structural investigation of brain tissue through correlative in vivo physiology, synchrotron microtomography and volume electron microscopy. Nature Communications, 13(1), 1–16. https://doi.org/10.1038/s41467-022-30199-6
Brown, E., Brunker, J., & Bohndiek, S. E. (2019). Photoacoustic imaging as a tool to probe the tumour microenvironment. DMM Disease Models and Mechanisms, 12(7). https://doi.org/10.1242/DMM.039636
Walsh, C., Holroyd, N. A., Finnerty, E., Ryan, S. G., Sweeney, P. W., Shipley, R. J., & Walker-Samuel, S. (2021). Multifluorescence High-Resolution Episcopic Microscopy for 3D Imaging of Adult Murine Organs. Advanced Photonics Research, 2(10), 2100110. https://doi.org/10.1002/ADPR.202100110
Walsh, C., Holroyd, N., Shipley, R., & Walker-Samuel, S. (2020). Asymmetric Point Spread Function Estimation and Deconvolution for Serial-Sectioning Block-Face Imaging. Communications in Computer and Information Science, 1248 CCIS, 235–249. https://doi.org/10.1007/978-3-030-52791-4_19
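A minimal sketch for loading one of the image/label pairs listed above; the label file name follows the stated naming convention, and the availability of the tifffile package is an assumption.

```python
import numpy as np
import tifffile

# File names taken from the training-data listing and the [...]labels.tif convention.
image = tifffile.imread("opticalHREM_murineLiver_2.26x2.26x1.75um.tif")
labels = tifffile.imread("opticalHREM_murineLiver_2.26x2.26x1.75umlabels.tif")

print(image.shape, image.dtype)  # 3D volume
print(np.unique(labels))         # binary vessel labels
assert image.shape == labels.shape
```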
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Std Dev Diff is the standard deviation, over validation sections, of the metric differences relative to the default rotate-augmentation model, and the Avg Diff is the overall average of those differences. All models used a background weight of 1.0, all layers fully trainable, and a learning rate of 0.001.
http://inspire.ec.europa.eu/metadata-codelist/LimitationsOnPublicAccess/noLimitations
This dataset consists of sentences extracted from BGS memoirs, DECC/OGA onshore hydrocarbons well reports and Mineral Reconnaissance Programme (MRP) reports. The sentences have been annotated so that the dataset can be used as labelled training data for a Named Entity Recognition model and an Entity Relation Extraction model, both of which are Natural Language Processing (NLP) techniques that assist with extracting structured data from unstructured text. The entities of interest are rock formations, geological ages, rock types, physical properties and locations, with inter-relations such as overlies and observedIn. The entity labels for rock formations and geological ages in the BGS memoirs were extracted from earlier published work (https://github.com/BritishGeologicalSurvey/geo-ner-model, https://zenodo.org/records/4181488).

The data can be used to fine-tune a pre-trained large language model using transfer learning, to create a model that can be used in inference mode to create the labels automatically, thereby producing structured data useful for geological modelling and subsurface characterisation. The data is provided in JSONL(Relation) format, which is the export format of the doccano open-source text annotation software (https://doccano.github.io/doccano/) used to create the labels.

The source documents are already publicly available, but the MRP and DECC reports are only published as PDF images. These documents had to undergo OCR, which resulted in lower-quality text and therefore lower-quality training data. The majority of the labelled data is from the higher-quality BGS memoirs text. The dataset is a proof of concept. Minimal peer review of the labelling has been conducted, so this should not be treated as a gold-standard labelled dataset, and it is of insufficient volume to build a performant model. The development of this training data and the text processing scripts was supported by a grant from the UK Government Office for Technology Transfer (GOTT), Knowledge Asset Grant Fund Project 10083604.
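doccano's JSONL(Relation) export typically stores, per line, the sentence text together with entity spans and relations between them; exact field names can vary between doccano versions, so the snippet below is a hedged sketch (with a placeholder file name) rather than a guaranteed schema.

```python
import json

with open("annotations.jsonl", encoding="utf-8") as f:  # placeholder file name
    for line in f:
        record = json.loads(line)
        text = record["text"]
        # Typical doccano relation-export fields; adjust to the actual export.
        for ent in record.get("entities", []):
            print(ent.get("label"), text[ent["start_offset"]:ent["end_offset"]])
        for rel in record.get("relations", []):
            print(rel.get("type"), rel.get("from_id"), "->", rel.get("to_id"))
```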
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Optimized network structure.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Std Dev Diff is the standard deviation of the metric differences relative to the original model, and the Avg Diff is the average of those differences. The trained model used a background weight of 1.0, all layers trainable, a learning rate of 0.001, and rotation augmentation (Rotate).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
🔒 Collection of Privacy-Sensitive Conversations between Care Workers and Care Home Residents in a Residential Care Home 🔒
The dataset is useful to train and evaluate models to identify and classify privacy-sensitive parts of conversations from text, especially in the context of AI assistants and LLMs.
The provided data format is .jsonl, the JSON Lines text format, also called newline-delimited JSON. An example entry looks as follows.
{ "text": "CW: Have you ever been to Italy? CR: Oh, yes... many years ago.", "taxonomy": 0, "category": 0, "affected_speaker": 1, "language": "en", "locale": "US", "data_type": 1, "uid": 16, "split": "train" }
The data fields are:
- text: a string feature. The abbreviations of the speakers refer to the care worker (CW) and the care recipient (CR).
- taxonomy: a classification label, with possible values including informational (0), invasion (1), collection (2), processing (3), dissemination (4), physical (5), personal-space (6), territoriality (7), intrusion (8), obtrusion (9), contamination (10), modesty (11), psychological (12), interrogation (13), psychological-distance (14), social (15), association (16), crowding-isolation (17), public-gaze (18), solitude (19), intimacy (20), anonymity (21), reserve (22). The taxonomy is derived from Rueben et al. (2017). The classifications were manually labeled by an expert.
- category: a classification label, with possible values including personal-information (0), family (1), health (2), thoughts (3), values (4), acquaintance (5), appointment (6). The privacy category affected in the conversation. The classifications were manually labeled by an expert.
- affected_speaker: a classification label, with possible values including care-worker (0), care-recipient (1), other (2), both (3). The speaker whose privacy is impacted during the conversation. The classifications were manually labeled by an expert.
- language: a string feature. Language code as defined by ISO 639.
- locale: a string feature. Regional code as defined by ISO 3166-1 alpha-2.
- data_type: a classification label, with possible values including real (0), synthetic (1).
- uid: an int64 feature. A unique identifier within the dataset.
- split: a string feature. Either train, validation or test.

The dataset has 2 subsets:
- split: with a total of 95 examples split into train, validation and test (70%-15%-15%)
- unsplit: with a total of 95 examples in a single train split

name | train | validation | test
---|---|---|---
split | 66 | 14 | 15
unsplit | 95 | n/a | n/a

The files follow the naming convention subset-split-language.jsonl. The following files are contained in the dataset:
- split-train-en.jsonl
- split-validation-en.jsonl
- split-test-en.jsonl
- unsplit-train-en.jsonl
Recording audio of care workers and residents during care interactions, which include partial and full body washing, giving of medication, as well as wound care, is a highly privacy-sensitive use case. Therefore, a dataset was created which includes privacy-sensitive parts of conversations, synthesized from real-world data. This dataset serves as a basis for fine-tuning a local LLM to highlight and classify privacy-sensitive sections of transcripts created in care interactions, so that they can be masked to protect privacy.
The initial data was collected in the project Caring Robots of TU Wien in cooperation with Caritas Wien. One project track aims to use Large Language Models (LLMs) to support the documentation work of care workers, with LLM-generated summaries of audio recordings of interactions between care workers and care home residents. The initial data are the transcriptions of those care interactions.
The transcriptions were thoroughly reviewed, and sections containing privacy-sensitive information were identified and marked using qualitative data analysis software by two experts. Subsequently, the accessible portions of the interviews were translated from German to US English using the locally executed LLM icky/translate. In the next step, llama3.1:70b was used locally to synthesize the conversation segments. This process involved generating similar, yet distinct and new, conversations that are not linked to the original data. The dataset was split using the train_test_split function from scikit-learn (https://scikit-learn.org/1.5/modules/generated/sklearn.model_selection.train_test_split.html).
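A minimal sketch for reading one of the JSON Lines files listed above; the file name follows the stated naming convention and the field names follow the example entry.

```python
import json

examples = []
with open("split-train-en.jsonl", encoding="utf-8") as f:
    for line in f:
        examples.append(json.loads(line))

# Each entry carries the conversation text plus taxonomy/category/affected_speaker labels.
print(len(examples))
print(examples[0]["text"], examples[0]["taxonomy"], examples[0]["category"])
```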
https://www.cognitivemarketresearch.com/privacy-policy
According to Cognitive Market Research, the global Artificial Intelligence Chip market size will be USD 21584.2 million in 2024. It will expand at a compound annual growth rate (CAGR) of 39.50% from 2024 to 2031.
North America held the major market share for more than 40% of the global revenue with a market size of USD 8633.68 million in 2024 and will grow at a compound annual growth rate (CAGR) of 37.7% from 2024 to 2031.
Europe accounted for a market share of over 30% of the global revenue with a market size of USD 6475.26 million.
Asia Pacific held a market share of around 23% of the global revenue with a market size of USD 4964.37 million in 2024 and will grow at a compound annual growth rate (CAGR) of 41.5% from 2024 to 2031.
Latin America had a market share of more than 5% of the global revenue with a market size of USD 1079.21 million in 2024 and will grow at a compound annual growth rate (CAGR) of 38.9% from 2024 to 2031.
Middle East and Africa had a market share of around 2% of the global revenue and was estimated at a market size of USD 431.68 million in 2024 and will grow at a compound annual growth rate (CAGR) of 39.2% from 2024 to 2031.
The BFSI held the highest Artificial Intelligence Chip market revenue share in 2024.
Market Dynamics of Artificial Intelligence Chip Market
Key Drivers for Artificial Intelligence Chip Market
Rapid data growth and computational power demand to Increase the Demand Globally
A compute-intensive processor is critical for running AI algorithms: the faster the chip, the more quickly it can process the data necessary to construct an AI system. AI processors are primarily used in data centers and high-end servers, because end-user computers lack the power and time to manage such substantial workloads. AMD provides a series of EPYC processors covering cloud services, data analytics, and visualization. These offer an Ethernet bandwidth of 8-10 GB and a memory capacity of up to 4 TB, along with security capabilities, flexibility, and sophisticated I/O integration. Cloud computing, high-performance computing (HPC), and numerous other applications are optimally served by AMD EPYC processors.
Growing potential of AI-based healthcare tools to Propel Market Growth
AI improves emergency care monitoring, real-time patient data collecting, and preventative healthcare suggestions. Health and wellness services like mobile apps may track patients' movements using AI. With AI-based tools, in-home health monitoring and information access, personalized health management, and treatment devices like better hearing aids, visual assistive devices, and physical assistive devices like intelligent walkers can be implemented efficiently. Thus, AI-based solutions are being used to improve the physical, emotional, social, and mental health of the elderly globally. Future applications may combine ML, DL, and computer vision for posture detection and geriatric behavior learning.
Restraint Factor for the Artificial Intelligence Chip Market
Minimal organized data for AI system development to Limit the Sales
Training and building a full and powerful AI system needs data. Earlier, structured datasets were built through manual data entry. The growing digital footprint and technology trends like IoT and Industry 4.0 generate large amounts of data from wearable devices, smart homes, intelligent thermostats, connected cars, IP cameras, smart devices, manufacturing machines, industrial equipment, and other remotely connected devices. This unstructured data consists of text, audio, and images; without an organized internal structure, developers cannot extract relevant information from it. Training machine learning tools requires high-quality labelled data and skilled human trainers, and extracting and labelling unstructured data takes time and skill. Structured data is essential for AI system development, so companies are turning to semi-structured data to derive insights from groupings.
Impact of Covid-19 on the Artificial Intelligence Chip Market
The long-term impact of the initial outbreak has been beneficial, despite the disruptions to the supply chain and manufacturing delays. The pandemic has expedited the process of AI adoption in a variety of industries, such as healthcare, retail, and manufacturing. The demand for AI processors was driven by the heightened necessity for automation, remote monitoring, and data and analytics. ...
FSDnoisy18k is an audio dataset collected with the aim of fostering the investigation of label noise in sound event classification. It contains 42.5 hours of audio across 20 sound classes, including a small amount of manually-labeled data and a larger quantity of real-world noisy data.
Data curators
Eduardo Fonseca and Mercedes Collado
Contact
You are welcome to contact Eduardo Fonseca should you have any questions at eduardo.fonseca@upf.edu.
Citation
If you use this dataset or part of it, please cite the following ICASSP 2019 paper:
Eduardo Fonseca, Manoj Plakal, Daniel P. W. Ellis, Frederic Font, Xavier Favory, and Xavier Serra, “Learning Sound Event Classifiers from Web Audio with Noisy Labels”, arXiv preprint arXiv:1901.01189, 2019
You can also consider citing our ISMIR 2017 paper that describes the Freesound Annotator, which was used to gather the manual annotations included in FSDnoisy18k:
Eduardo Fonseca, Jordi Pons, Xavier Favory, Frederic Font, Dmitry Bogdanov, Andres Ferraro, Sergio Oramas, Alastair Porter, and Xavier Serra, “Freesound Datasets: A Platform for the Creation of Open Audio Datasets”, In Proceedings of the 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017
FSDnoisy18k description
What follows is a summary of the most basic aspects of FSDnoisy18k. For a complete description of FSDnoisy18k, make sure to check:
the FSDnoisy18k companion site: http://www.eduardofonseca.net/FSDnoisy18k/
the description provided in Section 2 of our ICASSP 2019 paper
FSDnoisy18k is an audio dataset collected with the aim of fostering the investigation of label noise in sound event classification. It contains 42.5 hours of audio across 20 sound classes, including a small amount of manually-labeled data and a larger quantity of real-world noisy data.
The source of audio content is Freesound, a sound sharing site created and maintained by the Music Technology Group, hosting over 400,000 clips uploaded by its community of users, who additionally provide some basic metadata (e.g., tags and title). The 20 classes of FSDnoisy18k are drawn from the AudioSet Ontology and are selected based on data availability as well as on their suitability to allow the study of label noise. The 20 classes are: "Acoustic guitar", "Bass guitar", "Clapping", "Coin (dropping)", "Crash cymbal", "Dishes, pots, and pans", "Engine", "Fart", "Fire", "Fireworks", "Glass", "Hi-hat", "Piano", "Rain", "Slam", "Squeak", "Tearing", "Walk, footsteps", "Wind", and "Writing". FSDnoisy18k was created with the Freesound Annotator, which is a platform for the collaborative creation of open audio datasets.
We defined a clean portion of the dataset consisting of correct and complete labels. The remaining portion is referred to as the noisy portion. Each clip in the dataset has a single ground truth label (singly-labeled data).
The clean portion of the data consists of audio clips whose labels are rated as present in the clip and predominant (almost all with full inter-annotator agreement), meaning that the label is correct and, in most cases, there is no additional acoustic material other than the labeled class. A few clips may contain some additional sound events, but they occur in the background and do not belong to any of the 20 target classes. This is more common for some classes that rarely occur alone, e.g., “Fire”, “Glass”, “Wind” or “Walk, footsteps”.
The noisy portion of the data consists of audio clips that received no human validation. In this case, they are categorized on the basis of the user-provided tags in Freesound. Hence, the noisy portion features a certain amount of label noise.
Code
We've released the code for our ICASSP 2019 paper at https://github.com/edufonseca/icassp19. The framework comprises all the basic stages: feature extraction, training, inference and evaluation. After loading the FSDnoisy18k dataset, log-mel energies are computed and a CNN baseline is trained and evaluated. The code also allows testing of four noise-robust loss functions. Please check our paper for more details.
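As a hedged illustration of the log-mel feature extraction step mentioned above (the file name and parameter values below are placeholders, not the settings used in the paper):

```python
import librosa

# Placeholder clip; FSDnoisy18k audio is 44.1 kHz mono PCM.
y, sr = librosa.load("example_clip.wav", sr=44100, mono=True)
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=2048, hop_length=512, n_mels=96)
log_mel = librosa.power_to_db(mel)  # log-mel energies used as CNN input
print(log_mel.shape)                # (n_mels, n_frames)
```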
Label noise characteristics
FSDnoisy18k features real label noise that is representative of audio data retrieved from the web, particularly from Freesound. The analysis of a per-class, random, 15% of the noisy portion of FSDnoisy18k revealed that roughly 40% of the analyzed labels are correct and complete, whereas 60% of the labels show some type of label noise. Please check the FSDnoisy18k companion site for a detailed characterization of the label noise in the dataset, including a taxonomy of label noise for singly-labeled data as well as a per-class description of the label noise.
FSDnoisy18k basic characteristics
The dataset most relevant characteristics are as follows:
FSDnoisy18k contains 18,532 audio clips (42.5h) unequally distributed in the 20 aforementioned classes drawn from the AudioSet Ontology.
The audio clips are provided as uncompressed PCM 16 bit, 44.1 kHz, mono audio files.
The audio clips are of variable length ranging from 300ms to 30s, and each clip has a single ground truth label (singly-labeled data).
The dataset is split into a test set and a train set. The test set is drawn entirely from the clean portion, while the remainder of data forms the train set.
The train set is composed of 17,585 clips (41.1h) unequally distributed among the 20 classes. It features a clean subset and a noisy subset. In terms of number of clips their proportion is 10%/90%, whereas in terms of duration the proportion is slightly more extreme (6%/94%). The per-class percentage of clean data within the train set is also imbalanced, ranging from 6.1% to 22.4%. The number of audio clips per class ranges from 51 to 170, and from 250 to 1000 in the clean and noisy subsets, respectively. Further, a noisy small subset is defined, which includes an amount of (noisy) data comparable (in terms of duration) to that of the clean subset.
The test set is composed of 947 clips (1.4h) that belong to the clean portion of the data. Its class distribution is similar to that of the clean subset of the train set. The number of per-class audio clips in the test set ranges from 30 to 72. The test set enables a multi-class classification problem.
FSDnoisy18k is an expandable dataset that features a per-class varying degree of types and amount of label noise. The dataset allows investigation of label noise as well as of other approaches, from semi-supervised learning (e.g., self-training) to learning with minimal supervision.
License
FSDnoisy18k has licenses at two different levels, as explained next. All sounds in Freesound are released under Creative Commons (CC) licenses, and each audio clip has its own license as defined by the audio clip uploader in Freesound. In particular, all Freesound clips included in FSDnoisy18k are released under either CC-BY or CC0. For attribution purposes and to facilitate attribution of these files to third parties, we include a relation of audio clips and their corresponding license in the LICENSE-INDIVIDUAL-CLIPS file downloaded with the dataset.
In addition, FSDnoisy18k as a whole is the result of a curation process and it has an additional license. FSDnoisy18k is released under CC-BY. This license is specified in the LICENSE-DATASET file downloaded with the dataset.
Files
FSDnoisy18k can be downloaded as a series of zip files with the following directory structure:
root
│
└───FSDnoisy18k.audio_train/ Audio clips in the train set
│
└───FSDnoisy18k.audio_test/ Audio clips in the test set
│
└───FSDnoisy18k.meta/ Files for evaluation setup
│ │
│ └───train.csv Data split and ground truth for the train set
│ │
│ └───test.csv Ground truth for the test set
│
└───FSDnoisy18k.doc/
│
└───README.md The dataset description file that you are reading
│
└───LICENSE-DATASET License of the FSDnoisy18k dataset as an entity
│
└───LICENSE-INDIVIDUAL-CLIPS.csv Licenses of the individual audio clips from Freesound
Each row (i.e. audio clip) of the train.csv file contains the following information:
fname: the file name
label: the audio classification label (ground truth)
aso_id: the id of the corresponding category as per the AudioSet Ontology
manually_verified: Boolean (1 or 0) flag to indicate whether the clip belongs to the clean portion (1), or to the noisy portion (0) of the train set
noisy_small: Boolean (1 or 0) flag to indicate whether the clip belongs to the noisy_small portion (1) of the train set
Each row (i.e. audio clip) of the test.csv file contains the following information:
fname: the file name
label: the audio classification label (ground truth)
aso_id: the id of the corresponding category as per the AudioSet Ontology
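A minimal sketch for loading the ground-truth files described above and selecting the clean and noisy_small subsets of the train set, using the paths from the directory structure:

```python
import pandas as pd

train = pd.read_csv("FSDnoisy18k.meta/train.csv")
test = pd.read_csv("FSDnoisy18k.meta/test.csv")

clean = train[train["manually_verified"] == 1]    # clean portion of the train set
noisy_small = train[train["noisy_small"] == 1]    # small noisy subset
print(len(train), len(clean), len(noisy_small), len(test))
print(train["label"].value_counts().head())
```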
Links
Source code for our preprint: https://github.com/edufonseca/icassp19
Freesound Annotator: https://annotator.freesound.org/
Freesound: https://freesound.org
Eduardo Fonseca's personal website: http://www.eduardofonseca.net/
Acknowledgments
This work is partially supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 688382 AudioCommons. Eduardo Fonseca is also sponsored by a Google Faculty Research Award 2017. We thank everyone who contributed to FSDnoisy18k with annotations.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Surface expressions derived from private in-hospital texts in Japanese (TEXT). For readability, we have used their English translations. These surface expressions, which are often symptom-like phrases, are categorized into parent diseases and labeled with the corresponding ICD codes. We further label each surface expression with its relevance to frequent adverse effects of anti-cancer drugs, such as stomatitis, peripheral neuropathy, and hand-foot syndrome. The original table has eight relevance labels containing binary values (1 if relevant and 0 otherwise). The samples were randomly selected from entries relevant to stomatitis or peripheral neuropathy. (XLSX)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains tile imagery from the OpenStreetMap project alongside label masks for buildings from OpenStreetMap. Besides the original clean label set, additional noisy label sets for random noise, removed and added buildings are provided.
The purpose of this dataset is to provide training data for analysing the impact of noisy labels on the performance of models for semantic segmentation in Earth observation.
The code for downloading and creating the datasets, as well as for performing some preliminary analyses, is also provided; however, access to a tile server from which OpenStreetMap tiles can be downloaded in sufficient amounts is required.
To reproduce the dataset and perform analysis on it, do the following:
Marmoset vocalization data. JammingAllData.mat contains a MATLAB structure array "SbjData" which stores the experimental data for the four subjects. Each row in "SbjData" is a subject. The first column is for low-frequency noise perturbation (and the baseline preceding it), whereas the second column is for high-frequency noise perturbation (and the baseline preceding it). "SbjData.Param" stores various parameters used in the analysis and the experiment info; for example, "exp_day" contains the day number for each experimental session. Perturbation sessions are labeled as "Jamming" in the data structures. "SbjData.Measure.FreqPrePostWin" stores the actual data, as described below for its several fields. "Jamming" and "Baseline" contain the original recorded data for the perturbation and baseline sessions, respectively. "JammingDT" and "BaselineDT" are the detrended data. Inside these matrices, Column 1 labels separate sessions; Column 2 is the call onset time (sec); Column 4 is the fundame...
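If the .mat file was saved in a pre-v7.3 format it can be read in Python with scipy.io.loadmat (a v7.3 file would instead need an HDF5 reader such as h5py). A hedged sketch; the exact indexing and attribute access depend on how the structure array was saved.

```python
from scipy.io import loadmat

data = loadmat("JammingAllData.mat", squeeze_me=True, struct_as_record=False)
sbj = data["SbjData"]            # subjects x 2 (low-/high-frequency perturbation columns)
first = sbj[0, 0]                # subject 1, low-frequency noise condition

print(first.Param.exp_day)       # day number per experimental session
print(first.Measure.FreqPrePostWin.Jamming.shape)  # perturbation-session data matrix
```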