100+ datasets found
  1. 90sclub-dataset

    • huggingface.co
    Updated Sep 30, 2025
    + more versions
    Cite
    Derrick Schultz (2025). 90sclub-dataset [Dataset]. https://huggingface.co/datasets/dvs/90sclub-dataset
    Explore at:
    Dataset updated
    Sep 30, 2025
    Authors
    Derrick Schultz
    Description

    dvs/90sclub-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  2. Medical Information Dataset

    • kaggle.com
    zip
    Updated Jul 15, 2025
    + more versions
    Cite
    Mohamadreza Momeni (2025). Medical Information Dataset [Dataset]. https://www.kaggle.com/datasets/imtkaggleteam/medical-information-dataset
    Explore at:
    Available download formats: zip (198508809 bytes)
    Dataset updated
    Jul 15, 2025
    Authors
    Mohamadreza Momeni
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    MID: Medicines Information Dataset

    Description

    Numerous studies on medicines are conducted every day. To address shortcomings of models for generating, predicting, and classifying medicines information, the authors introduce a large textual medicines information dataset, which they name 'MID'.

    • Value of the data
      - To the authors' knowledge, MID is the largest available and representative medicines information dataset for a wide variety of drugs. It includes the names of over 192k medicines, making it a comprehensive collection of pharmaceutical products.
      - Its size makes it robust for generating information about drugs, such as indications or interactions.
      - MID offers over 192k rows distributed across 44 therapeutic classes, making it robust for classifying drugs by therapeutic label.
      - MID provides accurate, authoritative, and trustworthy information on medicines for enhancing predictions and efficiencies in clinical trial management.
      - MID includes details such as drug name, information URL, salt composition, drug introduction, therapeutic uses, side effects, drug benefits, how to use the drug, how the drug works, quick tips, safety advice, chemical class, habit-forming status, therapeutic class, and action class. The dataset aims to provide a useful resource for medical researchers, healthcare professionals, drug manufacturers, data scientists, and enthusiasts interested in exploring the world of medicines and healthcare products.
      - In contrast with the few small available datasets, MID's size makes it a suitable corpus for implementing both classical and deep learning models.

    • MID.xlsx provides the raw data, including medicine information. The data were collected to accelerate work on medicines and save experimental effort by helping to predict, generate, or classify medicine information preclinically.

    • Therapeutic_class_counts.xlsx summarizes the distribution of medicines per therapeutic class.

  3. VLM-3R-DATA

    • huggingface.co
    Updated Jun 10, 2025
    Cite
    JIAN ZHANG (2025). VLM-3R-DATA [Dataset]. https://huggingface.co/datasets/Journey9ni/VLM-3R-DATA
    Explore at:
    Dataset updated
    Jun 10, 2025
    Authors
    JIAN ZHANG
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Journey9ni/VLM-3R-DATA dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. LUMID: Large-scale Unlabled Medical Imaging Dataset for Unsupervised and...

    • frdr-dfdr.ca
    Updated Dec 4, 2024
    Cite
    Osman, Islam; Gupta, Anubhav; Shehata, Mohamed S.; Braun, John W. (2024). LUMID: Large-scale Unlabled Medical Imaging Dataset for Unsupervised and Self-supervised Learning [Dataset]. http://doi.org/10.20383/103.01017
    Explore at:
    Dataset updated
    Dec 4, 2024
    Dataset provided by
    Federated Research Data Repository / dépôt fédéré de données de recherche
    Authors
    Osman, Islam; Gupta, Anubhav; Shehata, Mohamed S.; Braun, John W.
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    LUMID is a large-scale, unlabeled collection of over 2 million medical images spanning multiple imaging modalities, including CT scans, X-rays, MRIs, and more. This dataset has been meticulously curated from publicly available medical imaging repositories, addressing the critical challenge of limited scale in existing public datasets and the inaccessibility of high-quality private datasets. The primary motivation behind creating this dataset is to empower the medical imaging community with a resource suited for developing and training advanced deep learning models. By enabling the use of unsupervised and self-supervised learning approaches, this dataset facilitates the learning of rich, transferable representations that can significantly enhance performance across various medical imaging tasks, including classification, segmentation, and anomaly detection.

    Key Features:
    1) Diversity: Comprising images from multiple modalities and a wide range of medical imaging scenarios.
    2) Scalability: A dataset of unprecedented size, providing a robust foundation for training deep neural networks.
    3) Versatility: Specifically designed for unsupervised and self-supervised learning methods, fostering innovation in representation learning for medical imaging.
    4) Open Access: Built entirely from public datasets, ensuring transparency and reproducibility.

    This dataset is intended to serve as a cornerstone for advancing research in medical AI, fostering the development of models capable of generalizing across diverse imaging types and clinical conditions.

  5. Dataset for Brookfield, MO Census Bureau Demographics and Population...

    • neilsberg.com
    Updated Jul 24, 2024
    + more versions
    Cite
    Neilsberg Research (2024). Dataset for Brookfield, MO Census Bureau Demographics and Population Distribution Across Age // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b78397d2-5460-11ee-804b-3860777c1fe6/
    Explore at:
    Dataset updated
    Jul 24, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Brookfield, Missouri
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the Brookfield population by age. The dataset can be utilized to understand the age distribution and demographics of Brookfield.

    Content

    The dataset comprises the following three datasets:

    • Brookfield, MO Age Group Population Dataset: A complete breakdown of Brookfield age demographics from 0 to 85 years, distributed across 18 age groups
    • Brookfield, MO Age Cohorts Dataset: Children, Working Adults, and Seniors in Brookfield - Population and Percentage Analysis
    • Brookfield, MO Population Pyramid Dataset: Age Groups, Male and Female Population, and Total Population for Demographics Analysis

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are therefore subject to sampling variability and a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

  6. Imperial, CA Population Breakdown By Race (Excluding Ethnicity) Dataset:...

    • neilsberg.com
    csv, json
    Updated Feb 21, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Imperial, CA Population Breakdown By Race (Excluding Ethnicity) Dataset: Population Counts and Percentages for 7 Racial Categories as Identified by the US Census Bureau // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/757ac77d-ef82-11ef-9e71-3860777c1fe6/
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Feb 21, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Imperial, California
    Variables measured
    Asian Population, Black Population, White Population, Some other race Population, Two or more races Population, American Indian and Alaska Native Population, Asian Population as Percent of Total Population, Black Population as Percent of Total Population, White Population as Percent of Total Population, Native Hawaiian and Other Pacific Islander Population, and 4 more
    Measurement technique
    The data presented in this dataset are derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we first analyzed and categorized the data for each of the racial categories identified by the US Census Bureau. The population estimates used in this dataset pertain exclusively to the identified racial categories and do not rely on any ethnicity classification. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Imperial by race. It includes the population of Imperial across racial categories (excluding ethnicity) as identified by the Census Bureau. The dataset can be utilized to understand the population distribution of Imperial across relevant racial categories.

    Key observations

    The percent distribution of Imperial population by race (across all racial categories recognized by the U.S. Census Bureau): 37.29% are white, 1.03% are Black or African American, 0.71% are American Indian and Alaska Native, 3.90% are Asian, 27.85% are some other race and 29.23% are multiracial.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Racial categories include:

    • White
    • Black or African American
    • American Indian and Alaska Native
    • Asian
    • Native Hawaiian and Other Pacific Islander
    • Some other race
    • Two or more races (multiracial)

    Variables / Data Columns

    • Race: This column displays the racial categories (excluding ethnicity) for Imperial.
    • Population: This column shows the population of each racial category (excluding ethnicity) in Imperial.
    • % of Total Population: This column displays the percentage distribution of each race as a proportion of Imperial's total population. Please note that the sum of all percentages may not equal one due to rounding of values.
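
    The rounding caveat can be checked against the percentages quoted under "Key observations". The following is a minimal sketch and not part of the Neilsberg dataset: the names `quoted_percentages` and `total_percentage` are illustrative, and only the six categories for which a percentage is quoted are included.

    ```python
    from decimal import Decimal

    # Percentages as quoted under "Key observations". No figure is quoted there
    # for Native Hawaiian and Other Pacific Islander, so it is omitted here.
    quoted_percentages = {
        "White": Decimal("37.29"),
        "Black or African American": Decimal("1.03"),
        "American Indian and Alaska Native": Decimal("0.71"),
        "Asian": Decimal("3.90"),
        "Some other race": Decimal("27.85"),
        "Two or more races": Decimal("29.23"),
    }


    def total_percentage(percentages):
        """Add up the rounded percentages exactly, avoiding float error."""
        return sum(percentages.values(), Decimal("0"))


    total = total_percentage(quoted_percentages)
    print(total)  # 100.01
    ```

    The quoted figures add up to slightly more than 100, which is the kind of small discrepancy the rounding note describes.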

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are therefore subject to sampling variability and a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Imperial Population by Race & Ethnicity.

  7. US_Stocks_Financial_Indicators

    • openml.org
    Updated Dec 12, 2024
    Cite
    Nicolas Carbone (2024). US_Stocks_Financial_Indicators [Dataset]. https://www.openml.org/d/46527
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 12, 2024
    Authors
    Nicolas Carbone
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    200+ Financial Indicators of US Stocks (2018)

    Context

    The algorithmic trading space is buzzing with new strategies. Companies have spent billions on infrastructure and R&D to be able to jump ahead of the competition and beat the market. Finding value in stocks is an art that very few have mastered. Can a computer do that?

    Content

    This dataset contains 200+ financial indicators that are commonly found in the 10-K filings that each publicly traded company releases yearly, covering US stocks for 2018.

    Target Variables

    The dataset includes two class labels:

    1. PRICE VAR [%]: Lists the percent price variation for 2018
    2. class: Binary classification for each stock, where:
       - 1: Identifies stocks that a hypothetical trader should BUY
       - 0: Identifies stocks that a hypothetical trader should NOT BUY

    Important Notes

    • Some financial indicator values might be missing
    • Contains outliers with extreme values (possibly due to mistyping)
    • Price variations are calculated from the first trading day of 2018 to the last trading day of 2018
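
    The last note describes how the price variation is measured. A minimal sketch of that calculation follows. The function names and the sample prices are hypothetical, and the rule that maps a positive variation to class 1 is an assumption, since the description does not state how the class label is derived.

    ```python
    def price_variation_percent(first_price, last_price):
        """Percent change from the first to the last trading day of the year."""
        if first_price <= 0:
            raise ValueError("first_price must be positive")
        return (last_price - first_price) / first_price * 100.0


    def assumed_class_label(price_var_percent):
        """ASSUMPTION: 1 (BUY) when the variation is positive, otherwise 0."""
        return 1 if price_var_percent > 0 else 0


    # Hypothetical prices, for illustration only (not taken from the dataset).
    variation = price_variation_percent(first_price=40.0, last_price=50.0)
    label = assumed_class_label(variation)
    print(variation, label)  # 25.0 1
    ```

    Because the notes also mention missing values and extreme outliers, a real pipeline would probably need to filter or cap `PRICE VAR [%]` before using it.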
  8. Alzheimer's Disease Multiclass Images Dataset

    • kaggle.com
    zip
    Updated Jun 26, 2024
    Cite
    Aryan Singhal (2024). Alzheimer's Disease Multiclass Images Dataset [Dataset]. https://www.kaggle.com/datasets/aryansinghal10/alzheimers-multiclass-dataset-equal-and-augmented
    Explore at:
    Available download formats: zip (417170579 bytes)
    Dataset updated
    Jun 26, 2024
    Authors
    Aryan Singhal
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    The Alzheimer's Disease Multiclass Dataset contains approximately 44,000 MRI images categorized into four distinct classes based on the severity of Alzheimer's disease. This dataset is intended for use in machine learning model training and testing. All images are skull-stripped and clean of non-brain tissue.

    Dataset Structure

    The dataset is organized into the following four directories, each representing a different class of disease severity:

    • NonDemented: Contains 12,800 MRI images of subjects with no signs of dementia.
    • VeryMildDemented: Contains 11,200 MRI images of subjects with very mild symptoms of dementia.
    • MildDemented: Contains 10,000 MRI images of subjects with mild dementia.
    • ModerateDemented: Contains 10,000 MRI images of subjects with moderate dementia.

    Image Details

    • Total Number of Images: 44,000
    • Image Format: MRI scans as .JPG files
    • Image Usage: Suitable for training and testing machine learning models focused on classifying Alzheimer's disease stages.
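
    The directory layout and per-class counts above can be sanity-checked after download. The following is a minimal sketch, assuming the four class directories sit directly under a single root folder; the function names and that layout are assumptions, not something the dataset page confirms.

    ```python
    from pathlib import Path

    # Directory names and image counts as listed in the description.
    EXPECTED_COUNTS = {
        "NonDemented": 12_800,
        "VeryMildDemented": 11_200,
        "MildDemented": 10_000,
        "ModerateDemented": 10_000,
    }


    def count_images(root):
        """Count .jpg files (any letter case) in each class directory under root."""
        counts = {}
        for class_name in EXPECTED_COUNTS:
            class_dir = Path(root) / class_name
            if not class_dir.is_dir():
                counts[class_name] = 0  # a missing directory counts as empty
                continue
            counts[class_name] = sum(
                1
                for path in class_dir.iterdir()
                if path.is_file() and path.suffix.lower() == ".jpg"
            )
        return counts


    def mismatches(counts):
        """Return {class: (found, expected)} for classes that differ."""
        return {
            name: (counts.get(name, 0), expected)
            for name, expected in EXPECTED_COUNTS.items()
            if counts.get(name, 0) != expected
        }
    ```

    Calling `mismatches(count_images(root))` on the extracted archive returns an empty dictionary when every class matches the documented figures.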

    Disease Severity Classification

    The dataset follows a severity ranking system for Alzheimer's disease:

    • NonDemented: No dementia.
    • Very Mild Demented: Early signs of dementia, very mild symptoms.
    • Mild Demented: Clear signs of dementia, but still mild.
    • Moderate Demented: More pronounced symptoms of dementia, moderate severity.

    This dataset is an augmented and upsampled version of the dataset below: https://www.kaggle.com/datasets/uraninjo/augmented-alzheimer-mri-dataset-v2

    This dataset was upsampled as the original dataset had a large class imbalance.

  9. License Plates Dataset

    • universe.roboflow.com
    zip
    Updated Aug 4, 2023
    Cite
    Sezgin KOC (2023). License Plates Dataset [Dataset]. https://universe.roboflow.com/sezgin-koc-3z1r3/license-plates-kwudy/dataset/6
    Explore at:
    Available download formats: zip
    Dataset updated
    Aug 4, 2023
    Dataset authored and provided by
    Sezgin KOC
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Car License Plate Bounding Boxes
    Description

    Here are a few use cases for this project:

    1. Automatic License Plate Recognition (ALPR) System: Use the "License Plates" model to develop an ALPR system for traffic management, toll collection, and parking access control, making these processes more efficient and accurate.

    2. Stolen Vehicle Tracking and Recovery: Integrate the "License Plates" model into security and surveillance systems to identify and track stolen vehicles in real-time, helping law enforcement to locate and recover them more efficiently.

    3. Traffic Violation Detection: Combine the model with other computer vision and sensor technologies to detect traffic violations, such as speeding, illegal parking, or running red lights, and automatically generate citations based on license plate identification.

    4. Vehicle Data Collection and Analytics: Use the "License Plates" model for data collection and analytics on traffic patterns, vehicle types, and license plate distribution in specific areas. This information can be used to optimize urban planning, infrastructure development, and transportation policies.

    5. Enhanced Augmented Reality Navigation: Implement the "License Plates" model in augmented reality applications for drivers, allowing them to receive information about nearby vehicles, such as make and model, or routing assistance based on license plate detection and computations.

  10. Data from: SMARTBUY dataset

    • researchdata.se
    • resodate.org
    • +2more
    Updated Jan 29, 2021
    + more versions
    Cite
    Karl Andersson; Damianos Gavalas (2021). SMARTBUY dataset [Dataset]. http://doi.org/10.5878/cg82-h783
    Explore at:
    Available download formats: (181405)
    Dataset updated
    Jan 29, 2021
    Dataset provided by
    Luleå University of Technology
    Authors
    Karl Andersson; Damianos Gavalas
    Time period covered
    Sep 1, 2018 - Dec 31, 2018
    Area covered
    Greece
    Description

    The dataset represents a compilation of user interaction data generated by users who participated in the project's pilot activities in Patras, Greece. Data was generated by users in the SMARTBUY app and includes information about users, stores, product categories, professions, and events.

    The dataset comprises the following data:

    • users: user account data for the Patras pilot users
    • occupation: all possible occupations that the pilot users could choose from
    • stores: stores which participated in the Patras pilot
    • sel_products_cat: products uploaded to the SMARTBUY platform by retailers
    • events: geo-stamped and time-stamped descriptions of a user interaction event (for instance, "user_id 67 rated product_id 722 with rating 4 at location x1 at datetime y1", or "user_id 91 denoted product_id 78 as favorite at location x2 at datetime y2")
    • event_types: all possible event types captured by the SMARTBUY platform ('Product searches', 'Product views', 'Featured product', 'Products near you views', 'Product photos browsed', 'Product ratings', 'Clicks on Read More button to read product reviews', 'Clicks on Open map button', 'Clicks on Send this info by email button', 'Products denoted as Favorite')

    Privacy-sensitive information such as user names, retailer owner names and store names and keywords searched are anonymized.

  11. taco-datasets

    • huggingface.co
    Updated Nov 17, 2023
    + more versions
    Cite
    Secure and Assured Intelligent Learning (SAIL) Lab (2023). taco-datasets [Dataset]. https://huggingface.co/datasets/saillab/taco-datasets
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 17, 2023
    Dataset authored and provided by
    Secure and Assured Intelligent Learning (SAIL) Lab
    Description

    This repo consists of the datasets used for the TaCo paper. There are four datasets:

    • Multilingual Alpaca-52K GPT-4 dataset
    • Multilingual Dolly-15K GPT-4 dataset
    • TaCo dataset
    • Multilingual Vicuna Benchmark dataset

    We translated the first three datasets using Google Cloud Translation. The TaCo dataset is created by using the TaCo approach as described in our paper, combining the Alpaca-52K and Dolly-15K datasets. If you would like to create the TaCo dataset for a specific language, you can… See the full description on the dataset page: https://huggingface.co/datasets/saillab/taco-datasets.

  12. Windsor V1 Dataset

    • universe.roboflow.com
    zip
    Updated May 8, 2023
    Cite
    cocotoyolov1 (2023). Windsor V1 Dataset [Dataset]. https://universe.roboflow.com/cocotoyolov1/windsor-v1/dataset/1
    Explore at:
    Available download formats: zip
    Dataset updated
    May 8, 2023
    Dataset authored and provided by
    cocotoyolov1
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Road Defects Bounding Boxes
    Description

    Windsor V1

    ## Overview

    Windsor V1 is a dataset for object detection tasks. It contains Road Defects annotations for 211 images.

    ## Getting Started

    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

    ## License

    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  13. Atticus Open Contract Dataset (AOK) (beta)

    • data-staging.niaid.nih.gov
    • live.european-language-grid.eu
    • +3more
    Updated Mar 11, 2021
    Cite
    The Atticus Project (2021). Atticus Open Contract Dataset (AOK) (beta) [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_4064879
    Explore at:
    Dataset updated
    Mar 11, 2021
    Dataset authored and provided by
    The Atticus Project
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Atticus Open Contract Dataset (AOK)(beta) is a corpus of 5,000+ labels in 200 commercial legal contracts that have been manually labeled by legal experts to identify 40 types of clauses that are important during contract review in connection with corporate transactions, such as mergers and acquisitions, IPO, and corporate financing.

    AOK Dataset is curated and maintained by The Atticus Project, Inc., a non-profit organization, to support NLP research and development in legal contract review.

    If you download this dataset, we'd love to know more about you and your project! Please fill out this short form: https://forms.gle/h47GUENTTbBqH39m7.

    Check out our website at atticusprojectai.org.

    Update: The expanded 1.0 version of the dataset is available here https://zenodo.org/record/4595826

  14. SLF Evaluation Dataset

    • zenodo.org
    bin, csv
    Updated Jul 13, 2024
    Cite
    Ahmed Abotaleb (2024). SLF Evaluation Dataset [Dataset]. http://doi.org/10.5281/zenodo.12706833
    Explore at:
    Available download formats: bin, csv
    Dataset updated
    Jul 13, 2024
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Ahmed Abotaleb
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    This dataset was constructed from the test set split of the VoxCeleb 2 dataset (VoxCeleb). The VoxCeleb 2 test set contains 118 speakers, each appearing in several different videos. To develop this dataset, only one video per speaker was selected. A face image was extracted from the video, as well as a low-resolution face image (8x8). The age, gender, and ethnicity of the person in the face image were determined using the "DeepFace" library, a face recognition and facial attribute analysis library.

    This dataset can be used to evaluate speech2face, speech conditioned face generation and speech conditioned face super-resolution systems.

  15. Orange dataset table

    • figshare.com
    xlsx
    Updated Mar 4, 2022
    Cite
    Rui Simões (2022). Orange dataset table [Dataset]. http://doi.org/10.6084/m9.figshare.19146410.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Mar 4, 2022
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Rui Simões
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The complete dataset used in the analysis comprises 36 samples, each described by 11 numeric features and 1 target. The attributes considered were caspase 3/7 activity, Mitotracker red CMXRos area and intensity (3 h and 24 h incubations with both compounds), Mitosox oxidation (3 h incubation with the referred compounds) and oxidation rate, DCFDA fluorescence (3 h and 24 h incubations with either compound) and oxidation rate, and DQ BSA hydrolysis. The target of each instance corresponds to one of the 9 possible classes (4 samples per class): Control; 6.25, 12.5, 25 and 50 µM for 6-OHDA; and 0.03, 0.06, 0.125 and 0.25 µM for rotenone. The dataset is balanced and contains no missing values, and the data were standardized across features. The small number of samples prevented a full and robust statistical analysis of the results. Nevertheless, it allowed the identification of relevant hidden patterns and trends.

    Exploratory data analysis, information gain, hierarchical clustering, and supervised predictive modeling were performed using Orange Data Mining version 3.25.1 [41]. Hierarchical clustering was performed using the Euclidean distance metric and weighted linkage. Cluster maps were plotted to relate the features with higher mutual information (in rows) to instances (in columns), with the color of each cell representing the normalized level of a particular feature in a specific instance. The information is grouped both in rows and in columns by a two-way hierarchical clustering method using Euclidean distances and average linkage. Stratified cross-validation was used to train the supervised decision tree. A set of preliminary empirical experiments was performed to choose the best parameters for each algorithm, and we verified that, within moderate variations, there were no significant changes in the outcome. The following settings were adopted for the decision tree algorithm: minimum number of samples in leaves: 2; minimum number of samples required to split an internal node: 5; stop splitting when majority reaches: 95%; criterion: gain ratio. The performance of the supervised model was assessed using accuracy, precision, recall, F-measure, and area under the ROC curve (AUC) metrics.
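
    The decision tree settings above name gain ratio as the splitting criterion. As an illustration of that criterion only, here is a minimal sketch of its standard definition (information gain divided by split information). It is not Orange's implementation, and the function names and toy labels are hypothetical.

    ```python
    import math
    from collections import Counter


    def entropy(labels):
        """Shannon entropy, in bits, of a sequence of class labels."""
        total = len(labels)
        return -sum(
            (count / total) * math.log2(count / total)
            for count in Counter(labels).values()
        )


    def gain_ratio(labels, branches):
        """Gain ratio of splitting `labels` into `branches`.

        `branches` is a list of label lists, one per branch of the split.
        """
        total = len(labels)
        non_empty = [branch for branch in branches if branch]
        # Weighted entropy that remains after the split.
        remainder = sum(len(b) / total * entropy(b) for b in non_empty)
        info_gain = entropy(labels) - remainder
        # Split information penalizes splits into many small branches.
        split_info = -sum(
            (len(b) / total) * math.log2(len(b) / total) for b in non_empty
        )
        return info_gain / split_info if split_info > 0 else 0.0


    # Toy labels (hypothetical), not taken from the dataset.
    labels = ["control", "control", "rotenone", "rotenone"]
    perfect = gain_ratio(labels, [labels[:2], labels[2:]])
    useless = gain_ratio(labels, [labels[0::2], labels[1::2]])
    ```

    In this toy case the split that separates the two classes scores 1, while the split that leaves both classes mixed in each branch scores 0.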

  16. Dataset Group: Climate stations in Mecklenburg Vorpommern

    • wdc-climate.de
    Updated Nov 9, 2010
    Cite
    Kreienkamp, Frank; Enke, Wolfgang; Spekat, Arne (2010). Dataset Group: Climate stations in Mecklenburg Vorpommern [Dataset]. https://www.wdc-climate.de/ui/entry?acronym=WR2010_EH5_1_A1B_MV_KL
    Explore at:
    Dataset updated
    Nov 9, 2010
    Dataset provided by
    World Data Center for Climate (WDCC) at DKRZ
    Authors
    Kreienkamp, Frank; Enke, Wolfgang; Spekat, Arne
    Time period covered
    Jan 1, 1961 - Dec 31, 2100
    Area covered
    Description

    An English description is given below.

    In diesem Datensatz sind alle ( 15) Klimastationen des gewählten Bundeslandes abgelegt. Je Station 10 Realisierungen. 150 Dateien mit je 4'241'439 Byte. Datensatz ist zip-gepackt.

    Daten: (ASCII) Datasatz Kürzel : WR2010_EH5_1_A1B_MV_KL Datasatz Name : UBA-WETTREG ECHAM5/OM 20C + A1B Lauf 1 1961-2100 für das gewählte Bundesland, Klimastationen

    Dateistruktur Klimastation: (Kopfzeilen) Stationsname Breite Länge Höhe Typ ta.mo.jahr TX TM TN RR RF PP DD SD NN FF

    Stationslist: Stationsliste_MV_KL.txt Stationsnummer, Stationsname, Bundeslandkürzel, Breite, Länge, Stationshöhe,Typ

    Es gibt keine Jahre mit Schalttag. Die Ausfallkennung ist -999.0

    This data set is a pool of all (15) climate stations of the selected Federal State, specified in the entry_name. There are 10 realizations per station, giving 150 files of 4'292'439 bytes each. The dataset is zip-compressed.

    Data (ASCII):
    Dataset acronym: WR2010_EH5_1_A1B_MV_KL
    Dataset name: UBA-WETTREG ECHAM5/OM 20C + A1B Run 1 realization 1961-2100 for the selected Federal State - climate stations

    File structure of the climate station files (header): station name, latitude, longitude, height, type, ta.mo.jahr (day.month.year), TX TM TN RR RF PP DD SD NN FF

    Station list: Stationsliste_MV_KL.txt, with station number, station name, abbreviation of federal state, latitude, longitude, height above sea level, type

    There are no years with a leap day. Missing values are indicated with -999.0.
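
    The missing-value marker and the leap-day rule both matter when reading these files. The following is a minimal sketch of handling them. It assumes whitespace-separated fields with the date first in day.month.year form and one record per day; neither is confirmed by the description, and the sample line is hypothetical.

    ```python
    MISSING = -999.0  # missing-value marker stated in the description
    COLUMNS = ["TX", "TM", "TN", "RR", "RF", "PP", "DD", "SD", "NN", "FF"]


    def parse_record(line):
        """Parse one data line into (day, month, year, values).

        ASSUMPTION: fields are whitespace-separated and the date comes first
        in day.month.year form; check this against the real files.
        """
        fields = line.split()
        day, month, year = (int(part) for part in fields[0].split("."))
        numbers = [float(field) for field in fields[1:1 + len(COLUMNS)]]
        if len(numbers) != len(COLUMNS):
            raise ValueError(
                "expected %d values, got %d" % (len(COLUMNS), len(numbers))
            )
        # Replace the -999.0 marker with None so it cannot skew statistics.
        values = {
            name: (None if number == MISSING else number)
            for name, number in zip(COLUMNS, numbers)
        }
        return day, month, year, values


    def expected_days(first_year, last_year):
        """Number of days when every year has 365 days (no leap days)."""
        return (last_year - first_year + 1) * 365


    # Hypothetical line, for illustration only (not taken from the dataset).
    sample = "01.01.1961 2.5 0.1 -3.0 -999.0 85.0 1013.2 270.0 1.5 6.0 3.2"
    day, month, year, values = parse_record(sample)
    print(values["RR"], expected_days(1961, 2100))  # None 51100
    ```

    Under the one-record-per-day assumption, each file covering 1961-2100 would hold 51,100 records, which gives a quick completeness check.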

  17. LIMO EEG Dataset

    • find.data.gov.scot
    • dtechtive.com
    tar, txt, zip
    Updated Nov 17, 2016
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    University of Edinburgh, Centre for Clinical Brain Sciences (2016). LIMO EEG Dataset [Dataset]. http://doi.org/10.7488/ds/1556
    Explore at:
    Available download formats: tar (8588.288 MB), zip (307.1 MB), txt (0.0166 MB)
    Dataset updated
    Nov 17, 2016
    Dataset provided by
    University of Edinburgh, Centre for Clinical Brain Sciences
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data in support of the article "Experiential modulation of social dominance in a SYNGAP1 rat model of ASD" in the European Journal of Neuroscience.

    Advances in our understanding of developmental brain disorders such as autism spectrum disorders (ASD) are being achieved through human neurogenetics, for example by identifying de novo mutations in SYNGAP1 as one relatively common cause of ASD. A recently developed rat line lacking the calcium/lipid binding (C2) and GTPase activation protein (GAP) domain may further help in understanding the neurobiology of deficits seen in children with ASD. This study focused on social dominance in the tube test using Syngap+/D-GAP (rats heterozygous for the ), as alterations in social behaviour are a key facet of the human phenotype. Male animals of this line living together formed a stable intra-cage hierarchy, but when living with WT cage-mates they were submissive, modelling the social withdrawal seen in ASD, with detailed analysis of the specific behaviours shown in social interactions by dominant and submissive animals. A further suggestive observation was that when the Syngap+/D-GAP mutants that had been living together had dominance encounters with WT animals from other cages, the two higher-ranking Syngap+/D-GAP rats were dominant, whereas the two lower-ranking mutants showed the opposite pattern of being submissive. These findings confirm earlier observations with a rat model of Fragile-X indicating that although genotype may be a major determinant of intra-cage hierarchies, the experience of winning or losing can influence subsequent encounters with others. Our results highlight and model that, even with single-gene mutations, dominance phenotypes reflect an interaction between genotypic and environmental factors.

  18. Dublin, PA Population Breakdown By Race (Excluding Ethnicity) Dataset:...

    • neilsberg.com
    csv, json
    Updated Feb 21, 2025
    Cite
    Neilsberg Research (2025). Dublin, PA Population Breakdown By Race (Excluding Ethnicity) Dataset: Population Counts and Percentages for 7 Racial Categories as Identified by the US Census Bureau // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/756cb089-ef82-11ef-9e71-3860777c1fe6/
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Feb 21, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Pennsylvania, Dublin
    Variables measured
    Asian Population, Black Population, White Population, Some other race Population, Two or more races Population, American Indian and Alaska Native Population, Asian Population as Percent of Total Population, Black Population as Percent of Total Population, White Population as Percent of Total Population, Native Hawaiian and Other Pacific Islander Population, and 4 more
    Measurement technique
    The data presented in this dataset are derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we first analyzed and categorized the data for each of the racial categories identified by the U.S. Census Bureau. The population estimates used in this dataset pertain exclusively to the identified racial categories and do not rely on any ethnicity classification. For further information regarding these estimates, please contact us by email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Dublin by race. It includes the population of Dublin across racial categories (excluding ethnicity) as identified by the Census Bureau. The dataset can be utilized to understand the population distribution of Dublin across relevant racial categories.

    Key observations

    The percent distribution of Dublin population by race (across all racial categories recognized by the U.S. Census Bureau): 87.46% are white, 0.28% are Black or African American, 0.43% are Asian, 5.54% are some other race and 6.29% are multiracial.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Racial categories include:

    • White
    • Black or African American
    • American Indian and Alaska Native
    • Asian
    • Native Hawaiian and Other Pacific Islander
    • Some other race
    • Two or more races (multiracial)

    Variables / Data Columns

    • Race: This column displays the racial categories (excluding ethnicity) for Dublin.
    • Population: This column shows the population of each racial category (excluding ethnicity) in Dublin.
    • % of Total Population: This column displays each race's share of Dublin's total population as a percentage. Please note that the percentages may not sum to exactly 100% because the values are rounded.
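
The relationship between the Population and % of Total Population columns, and the rounding caveat, can be illustrated with a short Python sketch. The function name `percent_of_total`, the two-decimal rounding, and the counts used below are assumptions for illustration only; they are not Dublin figures and not necessarily how Neilsberg Research computes the column.

```python
def percent_of_total(counts, digits=2):
    """Return each category's share of the total as a rounded percentage.

    `counts` maps a category name to its population count.
    """
    total = sum(counts.values())
    if total == 0:
        # Avoid division by zero for an empty or all-zero table.
        return {name: 0.0 for name in counts}
    return {name: round(100 * n / total, digits) for name, n in counts.items()}


# Made-up counts chosen so that rounding is visible.
example_counts = {"Category A": 1, "Category B": 1, "Category C": 1}
shares = percent_of_total(example_counts)
rounded_sum = round(sum(shares.values()), 2)
```

With three equal categories each share rounds to 33.33, so the rounded shares add up to 99.99 rather than 100, which is the effect the note on the % of Total Population column describes.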

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are therefore subject to sampling variability and a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Dublin Population by Race & Ethnicity.

  19. Environment Agency Potential Evapotranspiration Dataset - Dataset -...

    • ckan.publishing.service.gov.uk
    Updated Nov 5, 2021
    Cite
    ckan.publishing.service.gov.uk (2021). Environment Agency Potential Evapotranspiration Dataset - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/environment-agency-potential-evapotranspiration-dataset
    Explore at:
    Dataset updated
    Nov 5, 2021
    Dataset provided by
    CKAN (https://ckan.org/)
    Description

    Potential Evapotranspiration (PET) is the amount of evaporation that would occur if there were an unlimited supply of water. This dataset represents Potential Evapotranspiration for well-watered grass and is available either with an Interception element (PETI) or without one (PET). The dataset has been calculated from homogenised climate station data, gridded to a 1 km grid. The dataset starts in 1961, is available on a water-day timestep (09:00 to 09:00 on the following day), and is updated frequently. Attribution statement: © Environment Agency copyright and/or database right 2024. All rights reserved.
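
The water-day timestep can be made concrete with a small Python sketch that maps a timestamp to the water day containing it. Labelling a water day by the calendar date on which it starts is an assumption here, as is the use of naive local timestamps; the description only states the 09:00 to 09:00 window. The function name `water_day` is hypothetical.

```python
from datetime import datetime, timedelta

WATER_DAY_START_HOUR = 9  # a water day runs 09:00 to 09:00 on the following day


def water_day(timestamp):
    """Return the calendar date on which the water day containing `timestamp` began.

    Assumes a naive local `datetime`; shifting back by nine hours moves
    anything before 09:00 onto the previous calendar date.
    """
    return (timestamp - timedelta(hours=WATER_DAY_START_HOUR)).date()
```

Under this convention a reading taken at 08:59 belongs to the water day that started the previous morning, while a reading taken at exactly 09:00 opens a new water day.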

  20. State Medicaid and CHIP Applications, Eligibility Determinations, and...

    • catalog.data.gov
    • data.virginia.gov
    • +12 more
    Cite
    Centers for Medicare & Medicaid Services, State Medicaid and CHIP Applications, Eligibility Determinations, and Enrollment Data [Dataset]. https://catalog.data.gov/dataset/state-medicaid-and-chip-applications-eligibility-determinations-and-enrollment-data-f1647
    Explore at:
    Dataset provided by
    Centers for Medicare & Medicaid Services
    Description

    All states (including the District of Columbia) are required to provide data to the Centers for Medicare & Medicaid Services (CMS) on a range of Medicaid and Children’s Health Insurance Program (CHIP) indicators related to key application, eligibility, enrollment, and call center processes. These data reflect enrollment activity for all populations receiving comprehensive Medicaid and CHIP benefits in all states, as well as state program performance. States submit these data via the Performance Indicator dataset. Further information about this dataset is available at: https://www.medicaid.gov/medicaid/national-medicaid-chip-program-information/medicaid-chip-enrollment-data/performance-indicator-technical-assistance/index.html.
