100+ datasets found
  1. alpr-vlm-instruct-dataset

    • huggingface.co
    Updated Feb 20, 2025
    + more versions
    Cite
    Hirai-Labs (2025). alpr-vlm-instruct-dataset [Dataset]. https://huggingface.co/datasets/Hirai-Labs/alpr-vlm-instruct-dataset
    Explore at:
    Dataset updated
    Feb 20, 2025
    Dataset authored and provided by
    Hirai-Labs
    Description

    Hirai-Labs/alpr-vlm-instruct-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  2. Lending Club Loan - Pre-Processed Dataset

    • kaggle.com
    zip
    Updated Jul 6, 2022
    Cite
    Gabriel Santello (2022). Lending Club Loan - Pre-Processed Dataset [Dataset]. https://www.kaggle.com/datasets/gabrielsantello/lending-club-loan-preprocessed-dataset
    Explore at:
    Available download formats: zip (28819588 bytes)
    Dataset updated
    Jul 6, 2022
    Authors
    Gabriel Santello
    Description

    A subset of the LendingClub DataSet obtained from Kaggle: https://www.kaggle.com/wordsforthewise/lending-club

    LendingClub is a US peer-to-peer lending company, headquartered in San Francisco, California. It was the first peer-to-peer lender to register its offerings as securities with the Securities and Exchange Commission (SEC), and to offer loan trading on a secondary market. LendingClub is the world's largest peer-to-peer lending platform.

  3. 90sclub-dataset

    • huggingface.co
    Updated Sep 30, 2025
    + more versions
    Cite
    Derrick Schultz (2025). 90sclub-dataset [Dataset]. https://huggingface.co/datasets/dvs/90sclub-dataset
    Explore at:
    Dataset updated
    Sep 30, 2025
    Authors
    Derrick Schultz
    Description

    dvs/90sclub-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. Income Distribution by Quintile: Mean Household Income in Savage, MN // 2025...

    • neilsberg.com
    csv, json
    Updated Mar 3, 2025
    Cite
    Neilsberg Research (2025). Income Distribution by Quintile: Mean Household Income in Savage, MN // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/savage-mn-median-household-income/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Mar 3, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Savage, Minnesota
    Variables measured
    Income Level, Mean Household Income
    Measurement technique
    The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. It delineates income distributions across income quintiles (mentioned above) following an initial analysis and categorization. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). For additional information about these estimations, please contact us via email at research@neilsberg.com
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset presents the mean household income for each of the five quintiles in Savage, MN, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.

    Key observations

    • Income disparities: The mean income of the lowest quintile (the 20% of households with the lowest income) is 36,907, while the mean income of the highest quintile (the 20% of households with the highest income) is 320,934. The top quintile therefore earns roughly 9 times as much as the lowest quintile.
    • Top 5%: The mean household income for the wealthiest population (top 5%) is 467,495, which is 145.67% of the highest-quintile mean (45.67% higher) and 1266.68% of the lowest-quintile mean (1166.68% higher).
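
    The ratios in the key observations can be checked directly from the three quoted means. The Python sketch below is illustrative only: the variable names and the helper `percent_of` are assumptions introduced here, not part of the Neilsberg dataset. Note that a value that is 145.67% of another is 45.67% higher than it.

```python
# Mean household incomes quoted in the key observations
# (2023 inflation-adjusted dollars).
lowest_quintile = 36_907
highest_quintile = 320_934
top_5_percent = 467_495


def percent_of(a, b):
    """Return a expressed as a percentage of b, rounded to two decimals.

    Hypothetical helper for this sketch only.
    """
    return round(a / b * 100, 2)


# How many times the highest-quintile mean exceeds the lowest-quintile mean.
top_to_bottom = highest_quintile / lowest_quintile

# Top 5% mean as a percentage of each quintile mean.
top5_vs_highest = percent_of(top_5_percent, highest_quintile)
top5_vs_lowest = percent_of(top_5_percent, lowest_quintile)

print(f"highest / lowest quintile: {top_to_bottom:.2f}x")
print(f"top 5% as % of highest quintile: {top5_vs_highest}%")
print(f"top 5% as % of lowest quintile: {top5_vs_lowest}%")
```

    Subtracting 100 from either percentage gives the "percent higher" figure.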
    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Income Levels:

    • Lowest Quintile
    • Second Quintile
    • Third Quintile
    • Fourth Quintile
    • Highest Quintile
    • Top 5 Percent

    Variables / Data Columns

    • Income Level: This column lists the income levels (as listed above).
    • Mean Household Income: Mean household income, in 2023 inflation-adjusted dollars for the specific income level.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability, and thus to a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Savage median household income.

  5. CSIRO Sentinel-1 SAR image dataset of oil- and non-oil features for machine...

    • data.csiro.au
    • researchdata.edu.au
    Updated Dec 15, 2022
    Cite
    David Blondeau-Patissier; Thomas Schroeder; Foivos Diakogiannis; Zhibin Li (2022). CSIRO Sentinel-1 SAR image dataset of oil- and non-oil features for machine learning ( Deep Learning ) [Dataset]. http://doi.org/10.25919/4v55-dn16
    Explore at:
    Dataset updated
    Dec 15, 2022
    Dataset provided by
    CSIRO: https://www.csiro.au/
    Authors
    David Blondeau-Patissier; Thomas Schroeder; Foivos Diakogiannis; Zhibin Li
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Time period covered
    May 1, 2015 - Aug 31, 2022
    Area covered
    Dataset funded by
    ESA
    CSIRO: https://www.csiro.au/
    Description

    What this collection is: A curated, binary-classified image dataset of grayscale (1 band) 400 x 400-pixel size, or image chips, in a JPEG format extracted from processed Sentinel-1 Synthetic Aperture Radar (SAR) satellite scenes acquired over various regions of the world, and featuring clear open ocean chips, look-alikes (wind or biogenic features) and oil slick chips.

    This binary dataset contains chips labelled as:

    • "0" for chips not containing any oil features (look-alikes or clean seas)
    • "1" for chips containing oil features

    This binary dataset is imbalanced, and biased towards "0" labelled chips (i.e., no oil features), which correspond to 66% of the dataset. Chips containing oil features, labelled "1", correspond to 34% of the dataset.

    Why: This dataset can be used for training, validation and/or testing of machine learning, including deep learning, algorithms for the detection of oil features in SAR imagery. Directly applicable for algorithm development for the European Space Agency Sentinel-1 SAR mission (https://sentinel.esa.int/web/sentinel/missions/sentinel-1 ), it may be suitable for the development of detection algorithms for other SAR satellite sensors.

    Overview of this dataset: The total number of chips (both classes) is N = 5,630.

    • Class 0: 3,725 chips
    • Class 1: 1,905 chips
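
    The class shares quoted above follow from the chip counts. The Python sketch below recomputes them and also derives inverse-frequency class weights, a common way to compensate for imbalance during training. The weighting scheme is an assumption added here for illustration; it is not described in the dataset's ReadMe.

```python
# Chip counts per class, as listed in the dataset overview.
counts = {0: 3_725, 1: 1_905}
total = sum(counts.values())

# Share of each class in the full dataset.
shares = {label: n / total for label, n in counts.items()}

# Inverse-frequency weights: total / (number_of_classes * class_count).
# Illustrative assumption only, not taken from the dataset documentation.
weights = {label: total / (len(counts) * n) for label, n in counts.items()}

for label in counts:
    print(f"class {label}: {shares[label]:.1%} of chips, weight {weights[label]:.3f}")
```

    With these weights, the minority class "1" (oil features) counts roughly twice as much per chip as class "0".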

    Further information and description is found in the ReadMe file provided (ReadMe_Sentinel1_SAR_OilNoOil_20221215.txt)

  6. Anabolic Steroids Dataset

    • kaggle.com
    zip
    Updated Dec 23, 2024
    Cite
    Kanchana1990 (2024). Anabolic Steroids Dataset [Dataset]. https://www.kaggle.com/datasets/kanchana1990/anabolic-steroids-dataset
    Explore at:
    Available download formats: zip (2487 bytes)
    Dataset updated
    Dec 23, 2024
    Authors
    Kanchana1990
    License

    Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
    License information was derived automatically

    Description

    Dataset Overview

    This dataset, titled "Anabolic Steroids", provides a meticulously curated compilation of nearly 50 steroids. It includes detailed information on their original names, common names, medicinal applications, abuse potential, side effects, historical context, and relative molecular mass (RMM). The dataset aims to serve as a resource for exploring the dual nature of anabolic steroids—both their therapeutic benefits and their misuse in sports and bodybuilding.

    Anabolic steroids are synthetic derivatives of testosterone that have been used for decades in medicine to treat conditions like anemia, muscle-wasting diseases, and hormone deficiencies. However, they are also widely abused for performance enhancement and aesthetic purposes. This dataset captures a comprehensive view of these compounds, making it valuable for researchers, educators, and data enthusiasts.

    Data Science Applications

    While this dataset is relatively small (approx 50 entries), it offers rich opportunities for exploratory analysis and domain-specific insights. Potential applications include:

    • Exploratory Data Analysis (EDA):

      • Analyze trends in medicinal vs. non-medicinal use.
      • Study correlations between molecular mass and reported side effects.
      • Visualize the historical development of anabolic steroids over time.
    • Domain-Specific Insights:

      • Examine the evolution of steroid formulations from the 1930s to the present.
      • Investigate patterns in therapeutic uses versus abuse potential.
    • Educational Use:

      • Serve as a teaching tool for understanding data cleaning, visualization, and analysis.
      • Provide insights into the pharmacological and chemical properties of anabolic steroids.

    Column Descriptors

    1. Original Name: The scientific or chemical name of the steroid compound (e.g., Testosterone).
    2. Common Name: The popular or brand name under which the steroid is marketed (e.g., Testoviron).
    3. Medicinal Use: Approved therapeutic applications of the steroid (e.g., treating anemia or hormone replacement therapy).
    4. Abused For: Non-medical uses often associated with performance enhancement or bodybuilding (e.g., bulking cycles, lean muscle retention).
    5. Side Effects: Documented adverse effects resulting from steroid use or abuse (e.g., liver toxicity, gynecomastia).
    6. History: A brief historical context about the steroid's development or usage (e.g., year introduced, medical approval status).
    7. Relative Molecular Mass (g/mol): The molar mass of the steroid compound, useful for chemical analysis.

    Ethically Mined Data

    This dataset has been ethically compiled from publicly available sources such as scientific journals, chemical databases, and educational websites. No proprietary or confidential information has been included. The data was aggregated to ensure accuracy and relevance while respecting intellectual property rights.

    Acknowledgements

    The following sources were instrumental in compiling this dataset:

    1. PubChem Database – For verifying chemical properties and molecular mass values.
    2. Wikipedia – For historical context and general information on anabolic steroids.
    3. NIST Chemistry WebBook – For accurate molecular mass values and chemical details.
    4. Scientific Journals – Referenced for medicinal uses, side effects documentation, and abuse patterns.
    5. DALL·E 3 by OpenAI – Used to generate illustrative images related to anabolic steroids to complement dataset visualizations.

    Discouraging Steroid Usage and Highlighting Harms

    The misuse of anabolic steroids poses significant health risks and ethical concerns. While anabolic steroids have legitimate medical applications, their abuse for performance enhancement or aesthetic purposes can lead to severe physical and psychological side effects. Common adverse effects include liver damage, cardiovascular strain, hormonal imbalances, infertility, aggression, and mental health issues such as depression. Prolonged misuse can also result in irreversible damage to vital organs and an increased risk of life-threatening conditions like heart attacks or strokes. Beyond individual health risks, steroid abuse undermines the integrity of sports and creates unfair advantages in competitive environments. It is crucial to prioritize natural methods of achieving fitness goals and seek professional guidance for any medical conditions requiring treatment.

    Notes for Kaggle Users

    This dataset is not intended for machine learning due to its small size but serves as an excellent resource for exploratory data analysis (EDA), visualization projects, and domain-specific research into anabolic steroids' pharmacology and societal impact.

  7. Dataset First Dataset

    • universe.roboflow.com
    zip
    Updated May 6, 2025
    + more versions
    Cite
    1st (2025). Dataset First Dataset [Dataset]. https://universe.roboflow.com/1st-spusr/dataset-first
    Explore at:
    Available download formats: zip
    Dataset updated
    May 6, 2025
    Dataset authored and provided by
    1st
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Dataset First Bounding Boxes
    Description

    Dataset First

    ## Overview
    
    Dataset First is a dataset for object detection tasks - it contains Dataset First annotations for 280 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  8. Pen Dataset

    • universe.roboflow.com
    zip
    Updated Feb 16, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Rathinam College of Arts and Sciences (2023). Pen Dataset [Dataset]. https://universe.roboflow.com/rathinam-college-of-arts-and-sciences/pen-dataset/dataset/2
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 16, 2023
    Dataset authored and provided by
    Rathinam College of Arts and Sciences
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Variables measured
    Pen Bounding Boxes
    Description

    Pen Dataset

    ## Overview
    
    Pen Dataset is a dataset for object detection tasks - it contains Pen annotations for 304 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [Public Domain license](https://creativecommons.org/publicdomain/zero/1.0/).
    
  9. Merced, CA Population Breakdown by Gender Dataset: Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Neilsberg Research (2025). Merced, CA Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b243e3c4-f25d-11ef-8c1b-3860777c1fe6/
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    California, Merced
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Merced by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Merced across both sexes and to determine which sex constitutes the majority.

    Key observations

    Females form a slight majority, at 50.64% of the total population. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the gender (Male / Female).
    • Population: This column shows the population of each gender in Merced.
    • % of Total Population: This column displays each gender's share of Merced's total population. Please note that the percentages may not sum to exactly 100% due to rounding.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability, and thus to a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Merced Population by Race & Ethnicity.

  10. Data from: SMARTBUY dataset

    • researchdata.se
    • gimi9.com
    Updated Jan 29, 2021
    + more versions
    Cite
    Karl Andersson; Damianos Gavalas (2021). SMARTBUY dataset [Dataset]. http://doi.org/10.5878/cg82-h783
    Explore at:
    Available download formats: (181405)
    Dataset updated
    Jan 29, 2021
    Dataset provided by
    Luleå University of Technology
    Authors
    Karl Andersson; Damianos Gavalas
    Time period covered
    Sep 1, 2018 - Dec 31, 2018
    Area covered
    Greece
    Description

    The dataset represents a compilation of user interaction data generated by users who participated in the project's pilot activities in Patras, Greece. Data was generated by users in the SMARTBUY app and includes information about users, stores, product categories, professions, and events.

    The dataset comprises the following data:

    • users: user account data for the Patras pilot users
    • occupation: all possible occupations that the pilot users could choose from
    • stores: stores which participated in the Patras pilot
    • sel_products_cat: products uploaded to the SMARTBUY platform by retailers
    • events: geo-stamped and time-stamped descriptions of user interaction events (for instance, "user_id 67 rated product_id 722 with rating 4 at location x1 at datetime y1", or "user_id 91 denoted product_id 78 as favorite at location x2 at datetime y2")
    • event_types: all possible event types captured by the SMARTBUY platform ('Product searches', 'Product views', 'Featured product', 'Products near you views', 'Product photos browsed', 'Product ratings', 'Clicks on Read More button to read product reviews', 'Clicks on Open map button', 'Clicks on Send this info by email button', 'Products denoted as Favorite')

    Privacy-sensitive information, such as user names, retailer owner names, store names, and searched keywords, is anonymized.

  11. 11 Original Dataset

    • universe.roboflow.com
    zip
    Updated May 9, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    caps (2025). 11 Original Dataset [Dataset]. https://universe.roboflow.com/caps-vmqdh/4-11-original/dataset/4
    Explore at:
    Available download formats: zip
    Dataset updated
    May 9, 2025
    Dataset authored and provided by
    caps
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Person Bounding Boxes
    Description

    11 Original

    ## Overview
    
    11 Original is a dataset for object detection tasks - it contains Person annotations for 225 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  12. Employee Attrition Classification Dataset

    • kaggle.com
    zip
    Updated Jun 11, 2024
    + more versions
    Cite
    Umair Zia (2024). Employee Attrition Classification Dataset [Dataset]. https://www.kaggle.com/datasets/stealthtechnologies/employee-attrition-dataset
    Explore at:
    Available download formats: zip (1802815 bytes)
    Dataset updated
    Jun 11, 2024
    Authors
    Umair Zia
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    The Synthetic Employee Attrition Dataset is a simulated dataset designed for the analysis and prediction of employee attrition. It contains detailed information about various aspects of an employee's profile, including demographics, job-related features, and personal circumstances.

    The dataset comprises 74,498 samples, split into training and testing sets to facilitate model development and evaluation. Each record includes a unique Employee ID and features that influence employee attrition. The goal is to understand the factors contributing to attrition and develop predictive models to identify at-risk employees.

    This dataset is ideal for HR analytics, machine learning model development, and demonstrating advanced data analysis techniques. It provides a comprehensive and realistic view of the factors affecting employee retention, making it a valuable resource for researchers and practitioners in the field of human resources and organizational development.

    FEATURES:

    • Employee ID: A unique identifier assigned to each employee.
    • Age: The age of the employee, ranging from 18 to 60 years.
    • Gender: The gender of the employee.
    • Years at Company: The number of years the employee has been working at the company.
    • Monthly Income: The monthly salary of the employee, in dollars.
    • Job Role: The department or role the employee works in, encoded into categories such as Finance, Healthcare, Technology, Education, and Media.
    • Work-Life Balance: The employee's perceived balance between work and personal life (Poor, Below Average, Good, Excellent).
    • Job Satisfaction: The employee's satisfaction with their job (Very Low, Low, Medium, High).
    • Performance Rating: The employee's performance rating (Low, Below Average, Average, High).
    • Number of Promotions: The total number of promotions the employee has received.
    • Distance from Home: The distance between the employee's home and workplace, in miles.
    • Education Level: The highest education level attained by the employee (High School, Associate Degree, Bachelor’s Degree, Master’s Degree, PhD).
    • Marital Status: The marital status of the employee (Divorced, Married, Single).
    • Job Level: The job level of the employee (Entry, Mid, Senior).
    • Company Size: The size of the company the employee works for (Small, Medium, Large).
    • Company Tenure: The total number of years the employee has been working in the industry.
    • Remote Work: Whether the employee works remotely (Yes or No).
    • Leadership Opportunities: Whether the employee has leadership opportunities (Yes or No).
    • Innovation Opportunities: Whether the employee has opportunities for innovation (Yes or No).
    • Company Reputation: The employee's perception of the company's reputation (Very Poor, Poor, Good, Excellent).
    • Employee Recognition: The level of recognition the employee receives (Very Low, Low, Medium, High).
    • Attrition: Whether the employee has left the company, encoded as 0 (stayed) and 1 (left).

  13. State Medicaid and CHIP Applications, Eligibility Determinations, and...

    • catalog.data.gov
    • data.virginia.gov
    • +12 more
    Updated Jan 31, 2026
    + more versions
    Cite
    Centers for Medicare & Medicaid Services (2026). State Medicaid and CHIP Applications, Eligibility Determinations, and Enrollment Data [Dataset]. https://catalog.data.gov/dataset/state-medicaid-and-chip-applications-eligibility-determinations-and-enrollment-data-f1647
    Explore at:
    Dataset updated
    Jan 31, 2026
    Dataset provided by
    Centers for Medicare & Medicaid Services
    Description

    All states (including the District of Columbia) are required to provide data to The Centers for Medicare & Medicaid Services (CMS) on a range of Medicaid and Children’s Health Insurance Program (CHIP) indicators related to key application, eligibility, enrollment and call center processes. These data reflect enrollment activity for all populations receiving comprehensive Medicaid and CHIP benefits in all states, as well as state program performance. States submit this data via the Performance Indicator dataset. Further information about this dataset is available at: https://www.medicaid.gov/medicaid/national-medicaid-chip-program-information/medicaid-chip-enrollment-data/performance-indicator-technical-assistance/index.html.

  14. WorldSense

    • huggingface.co
    Updated Feb 6, 2025
    Cite
    Jack Hong (2025). WorldSense [Dataset]. https://huggingface.co/datasets/honglyhly/WorldSense
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 6, 2025
    Authors
    Jack Hong
    Description

    WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs

    Jack Hong¹, Shilin Yan¹†, Jiayin Cai¹, Xiaolong Jiang¹, Yao Hu¹, Weidi Xie²‡

    †Project Leader
    ‡Corresponding Author
    

    ¹Xiaohongshu Inc. ²Shanghai Jiao Tong University

    [🏠 Project Page] [📖 arXiv Paper] [🤗 Dataset] [🏆 Leaderboard]

      🔥 News
    

    2025.02.07 🌟 We release WorldSense, the first benchmark for real-world omnimodal understanding of MLLMs.

      👀 WorldSense Overview
    

    we… See the full description on the dataset page: https://huggingface.co/datasets/honglyhly/WorldSense.

  15. Simulated Low-Dose CT Dataset - Dataset - LDM

    • service.tib.eu
    • resodate.org
    Updated Dec 16, 2024
    Cite
    (2024). Simulated Low-Dose CT Dataset - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/simulated-low-dose-ct-dataset
    Explore at:
    Dataset updated
    Dec 16, 2024
    Description

    A simulated low-dose CT dataset generated from normal-dose CT images to be used for training a deep neural network to remove noise from low-dose CT images.

  16. Data from: CABra: a novel large-sample dataset for Brazilian catchments

    • zenodo.org
    • data.niaid.nih.gov
    pdf, txt, zip
    Updated Jul 12, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Andre Almagro; Paulo Tarso Sanches Oliveira; Antonio Alves Meira Neto; Tirthankar Roy; Peter Troch (2024). CABra: a novel large-sample dataset for Brazilian catchments [Dataset]. http://doi.org/10.5281/zenodo.7612350
    Explore at:
    Available download formats: txt, zip, pdf
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Andre Almagro; Paulo Tarso Sanches Oliveira; Antonio Alves Meira Neto; Tirthankar Roy; Peter Troch
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Hydrometeorological time series and catchment attributes from the CABra dataset. The manuscript of "CABra: a novel large-sample dataset for Brazilian catchments" is under review in Hydrology and Earth System Sciences (HESS) journal.

    Here we present the Catchments Attributes for Brazil (CABra), which is a large-sample dataset for Brazilian catchments that includes long-term data (30 years) for 735 catchments in eight main catchment attribute classes (climate, streamflow, groundwater, geology, soil, topography, land-use and land-cover, and hydrologic disturbance). We have collected and synthesized data from multiple sources (ground stations, remote sensing, and gridded datasets). To prepare the dataset, we delineated all the catchments using the Multi-Error-Removed Improved-Terrain Digital Elevation Model and the coordinates of the streamflow stations provided by the Brazilian Water Agency (ANA), where only the stations with 30 years (1980-2010) of data and less than 10% of missing records were included. Catchment areas range from 9 to 4,800,000 km² and the mean daily streamflow varies from 0.02 to 9 mm day⁻¹. Several signatures and indices were calculated based on the climate and streamflow data. Additionally, our dataset includes boundary shapefiles, geographic coordinates, and drainage areas for each catchment, aside from more than 100 attributes within the attribute classes.

    Data can also be accessed at: thecabradataset.shinyapps.io/CABra

    * This version includes water demand in CABra catchments for 2020 and 2040 (projection).

  17. Synthetic Binary Dataset - Dataset - LDM

    • service.tib.eu
    Updated Dec 16, 2024
    Cite
    (2024). Synthetic Binary Dataset - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/synthetic-binary-dataset
    Dataset updated
    Dec 16, 2024
    Description

    A synthetic binary dataset of desired characteristics, comprising 3000 instances with 20 features.

  18. LLM Synthetic Dataset - Dataset - LDM

    • service.tib.eu
    • resodate.org
    Updated Dec 16, 2024
    Cite
    (2024). LLM Synthetic Dataset - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/llm-synthetic-dataset
    Dataset updated
    Dec 16, 2024
    Description

    The dataset is used to evaluate the performance of the α-RNN model on various time series tasks.

  19. COVID-19 Diagnostic Laboratory Testing (PCR Testing) Time Series

    • healthdata.gov
    • data.vi-vn.virginia.gov
    • +10 more
    Updated Jun 21, 2024
    Cite
    U.S. Department of Health & Human Services (2024). COVID-19 Diagnostic Laboratory Testing (PCR Testing) Time Series [Dataset]. https://healthdata.gov/dataset/COVID-19-Diagnostic-Laboratory-Testing-PCR-Testing/j8mb-icvb
    Available download formats: application/geo+json, csv, kmz, kml, xlsx, xml
    Dataset updated
    Jun 21, 2024
    Dataset provided by
    United States Department of Health and Human Services (http://www.hhs.gov/)
    Authors
    U.S. Department of Health & Human Services
    License

    Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
    License information was derived automatically

    Description

    After May 3, 2024, this dataset and webpage will no longer be updated because hospitals are no longer required to report data on COVID-19 hospital admissions, and hospital capacity and occupancy data, to HHS through CDC’s National Healthcare Safety Network. Data voluntarily reported to NHSN after May 1, 2024, will be available starting May 10, 2024, at COVID Data Tracker Hospitalizations.


    This time series dataset includes viral COVID-19 laboratory test [Polymerase chain reaction (PCR)] results from over 1,000 U.S. laboratories and testing locations including commercial and reference laboratories, public health laboratories, hospital laboratories, and other testing locations. Data are reported to state and jurisdictional health departments in accordance with applicable state or local law and in accordance with the Coronavirus Aid, Relief, and Economic Security (CARES) Act (CARES Act Section 18115).

    Data are provisional and subject to change.

    Data presented here is representative of diagnostic specimens being tested - not individual people - and excludes serology tests where possible. Data presented might not represent the most current counts for the most recent 3 days due to the time it takes to report testing information. The data may also not include results from all potential testing sites within the jurisdiction (e.g., non-laboratory or point of care test sites) and therefore reflect the majority, but not all, of COVID-19 testing being conducted in the United States.

    Sources: CDC COVID-19 Electronic Laboratory Reporting (CELR), Commercial Laboratories, State Public Health Labs, In-House Hospital Labs

    Data for each state is sourced from either data submitted directly by the state health department via COVID-19 electronic laboratory reporting (CELR), or a combination of commercial labs, public health labs, and in-house hospital labs. Data is taken from CELR for states that either submit line level data or submit aggregate counts which do not include serology tests.

  20. AIST dataset - Dataset - LDM

    • service.tib.eu
    • resodate.org
    Updated Dec 16, 2024
    Cite
    (2024). AIST dataset - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/aist-dataset
    Dataset updated
    Dec 16, 2024
    Description

    The AIST dataset consists of dance videos of 30 human subjects captured from 9 cameras.
