100+ datasets found
  1. Data from: Mental Health Services Children & Young People

    • kaggle.com
    Updated Jan 21, 2023
    Cite
    The Devastator (2023). Mental Health Services Children & Young People [Dataset]. https://www.kaggle.com/datasets/thedevastator/mental-health-services-children-young-people/discussion?sort=undefined
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 21, 2023
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    Description

    Mental Health Services Children & Young People

    Monthly Statistics on Referrals, Contacts and Care

    By data.world's Admin [source]

    About this dataset

    This dataset describes the mental health services provided to children and young people in England. It is drawn from the Mental Health Services Data Set (MHSDS) - Children & Young People and, for each reporting period, gives primary-level details, secondary-level descriptions, the number of open referrals to children's and young people's mental health services at the end of the reporting period, and the number of first attended contacts for referrals open in the reporting period for people aged 0-18. It also reports how many people aged 0 to 18 were in contact with mental health services at the time of reporting, how many referrals starting in the period were self-referrals, and more. These figures help track trends in service use so that care can be planned more effectively.

    How to use the dataset

    This guide will provide you with an overview of the data contained in this dataset as well as information on how to effectively use it for your own research or personal purposes. Let's get started!

    Overview of Data Fields

    • REPORTING_PERIOD: The month and year of the reporting period (Date)
    • BREAKDOWN: The type of breakdown of the data (String)
    • PRIMARY_LEVEL: The primary level of the data (String)
    • PRIMARY_LEVEL_DESCRIPTION: A description at the primary level of the data (String)
    • SECONDARY_LEVEL: The secondary level of the data (String)
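
    As a minimal sketch of how the fields above might be used, the snippet below filters rows by their BREAKDOWN value using only the Python standard library. The column names come from the field list; the sample rows, the values "England" and "Provider", and the helper name rows_for_breakdown are assumptions made for illustration and are not taken from the real file.

```python
import csv
import io

# Hypothetical sample rows. The column names follow the field list above,
# but every value here is invented for illustration only.
SAMPLE_CSV = """REPORTING_PERIOD,BREAKDOWN,PRIMARY_LEVEL,PRIMARY_LEVEL_DESCRIPTION,SECONDARY_LEVEL
Feb-17,England,England,England,NONE
Feb-17,Provider,AAA,Example Provider A,NONE
Feb-17,Provider,BBB,Example Provider B,NONE
"""


def rows_for_breakdown(csv_text, breakdown):
    """Return the rows whose BREAKDOWN column equals `breakdown`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["BREAKDOWN"] == breakdown]


providers = rows_for_breakdown(SAMPLE_CSV, "Provider")
for row in providers:
    # Print the code and description of each matching primary level.
    print(row["PRIMARY_LEVEL"], row["PRIMARY_LEVEL_DESCRIPTION"])
```

    To run this against the real file, the same DictReader could be opened on the downloaded CSV instead of the in-memory sample; the actual BREAKDOWN values should be checked in the data first.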

    Research Ideas

    • Evaluating the efficacy of existing mental health services for children and young people by examining changes in relationships between different aspects of service delivery (e.g. referral activity, hospital spell activity, etc).
    • Analysing geographical trends in mental health services to inform investment decisions and policies across different regions.
    • Identifying areas of high need among vulnerable or marginalised citizens, such as those aged 0-18 or those with particular genetic makeup, to better target resources and support those most in need of help

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: Dataset copyright by authors

    You are free to:
    • Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
    • Adapt - remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit - Provide a link to the license, and indicate if changes were made.
    • ShareAlike - You must distribute your contributions under the same license as the original.
    • Keep intact - all notices that refer to this license, including copyright notices.

    Columns

    File: mhsds-monthly-cyp-data-file-feb-fin-2017-1.csv

    | Column name | Description |
    |:--|:--|
    | REPORTING_PERIOD | The period of time for which the data was collected. (String) |
    | BREAKDOWN | The breakdown of the data by age group. (String) |
    | PRIMARY_LEVEL | The primary level of the data. (String) |
    | PRIMARY_LEVEL_DESCRIPTION ...

  2. ORBIT: A real-world few-shot dataset for teachable object recognition...

    • city.figshare.com
    bin
    Updated May 31, 2023
    Cite
    Daniela Massiceti; Lida Theodorou; Luisa Zintgraf; Matthew Tobias Harris; Simone Stumpf; Cecily Morrison; Edward Cutrell; Katja Hofmann (2023). ORBIT: A real-world few-shot dataset for teachable object recognition collected from people who are blind or low vision [Dataset]. http://doi.org/10.25383/city.14294597.v3
    Explore at:
    Available download formats: bin
    Dataset updated
    May 31, 2023
    Dataset provided by
    City, University of London
    Authors
    Daniela Massiceti; Lida Theodorou; Luisa Zintgraf; Matthew Tobias Harris; Simone Stumpf; Cecily Morrison; Edward Cutrell; Katja Hofmann
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Object recognition predominately still relies on many high-quality training examples per object category. In contrast, learning new objects from only a few examples could enable many impactful applications from robotics to user personalization. Most few-shot learning research, however, has been driven by benchmark datasets that lack the high variation that these applications will face when deployed in the real-world. To close this gap, we present the ORBIT dataset, grounded in a real-world application of teachable object recognizers for people who are blind/low vision. We provide a full, unfiltered dataset of 4,733 videos of 588 objects recorded by 97 people who are blind/low-vision on their mobile phones, and a benchmark dataset of 3,822 videos of 486 objects collected by 77 collectors. The code for loading the dataset, computing all benchmark metrics, and running the baseline models is available at https://github.com/microsoft/ORBIT-Dataset

    This version comprises several zip files:

    • train, validation, test: benchmark dataset, organised by collector, with raw videos split into static individual frames in jpg format at 30FPS
    • other: data not in the benchmark set, organised by collector, with raw videos split into static individual frames in jpg format at 30FPS (please note that the train, validation, test, and other files make up the unfiltered dataset)
    • *_224: as for the benchmark, but static individual frames are scaled down to 224 pixels.
    • *_unfiltered_videos: full unfiltered dataset, organised by collector, in mp4 format.

  3. Traffic Crashes - People

    • catalog.data.gov
    • data.cityofchicago.org
    Updated Aug 30, 2025
    + more versions
    Cite
    data.cityofchicago.org (2025). Traffic Crashes - People [Dataset]. https://catalog.data.gov/dataset/traffic-crashes-people
    Explore at:
    Dataset updated
    Aug 30, 2025
    Dataset provided by
    data.cityofchicago.org
    Description

    This data contains information about people involved in a crash and whether any injuries were sustained. This dataset should be used in combination with the traffic Crash and Vehicle datasets. Each record corresponds to an occupant in a vehicle listed in the Crash dataset. Some people involved in a crash may not have been occupants of a motor vehicle, but may have been pedestrians, bicyclists, or users of another non-motor-vehicle mode of transportation. Injuries are reported by the responding police officer. Fatalities that occur after the initial reports are typically updated in these records up to 30 days after the date of the crash. Person data can be linked with the Crash and Vehicle datasets using the “CRASH_RECORD_ID” field. A vehicle can have multiple occupants, so there is a one-to-many relationship between the Vehicle and Person datasets. A pedestrian, however, is a “unit” by itself and has a one-to-one relationship between the Vehicle and Person tables. The Chicago Police Department reports crashes on IL Traffic Crash Reporting form SR1050. The crash data published on the Chicago data portal mostly follows the data elements in the SR1050 form. The current version of the SR1050 instructions manual, with detailed information on each data element, is available here. Change 11/21/2023: We have removed the RD_NO (Chicago Police Department report number) for privacy reasons.
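
    The linkage described above can be sketched as a simple in-memory join on CRASH_RECORD_ID. This is only an illustration under stated assumptions: CRASH_RECORD_ID is the one field name taken from the description, while the other keys (crash_date, person_type), the sample values, and the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical miniature tables. Only CRASH_RECORD_ID is a field named in the
# description; every other key and value is invented for illustration.
crashes = [
    {"CRASH_RECORD_ID": "c1", "crash_date": "2024-01-05"},
    {"CRASH_RECORD_ID": "c2", "crash_date": "2024-01-06"},
]
people = [
    {"CRASH_RECORD_ID": "c1", "person_type": "DRIVER"},
    {"CRASH_RECORD_ID": "c1", "person_type": "PASSENGER"},
    {"CRASH_RECORD_ID": "c2", "person_type": "PEDESTRIAN"},
]


def group_people_by_crash(people_rows):
    """Index person records by the crash they belong to."""
    grouped = defaultdict(list)
    for person in people_rows:
        grouped[person["CRASH_RECORD_ID"]].append(person)
    return grouped


def join_crashes_to_people(crash_rows, people_rows):
    """Attach the list of involved people to each crash record."""
    grouped = group_people_by_crash(people_rows)
    return [
        {**crash, "people": grouped.get(crash["CRASH_RECORD_ID"], [])}
        for crash in crash_rows
    ]


joined = join_crashes_to_people(crashes, people)
for crash in joined:
    # One crash can carry several person records.
    print(crash["CRASH_RECORD_ID"], len(crash["people"]))
```

    Grouping first and then attaching keeps the join linear in the number of rows, which matters because one crash can have many person records.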

  4. Geonames - All Cities with a population > 1000

    • public.opendatasoft.com
    • data.smartidf.services
    • +2more
    csv, excel, geojson +1
    Updated Mar 10, 2024
    + more versions
    Cite
    (2024). Geonames - All Cities with a population > 1000 [Dataset]. https://public.opendatasoft.com/explore/dataset/geonames-all-cities-with-a-population-1000/
    Explore at:
    Available download formats: csv, json, geojson, excel
    Dataset updated
    Mar 10, 2024
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    All cities with a population > 1000 or seats of adm div (ca 80.000)

    Sources and Contributions

    • Sources: GeoNames aggregates over a hundred different data sources.
    • Ambassadors: GeoNames Ambassadors help in many countries.
    • Wiki: A wiki allows users to view the data, quickly fix errors, and add missing places.
    • Donations and Sponsoring: Costs for running GeoNames are covered by donations and sponsoring.

    Enrichment: add country name

  5. How Common is Your Birthday?

    • kaggle.com
    Updated Nov 23, 2022
    Cite
    The Devastator (2022). How Common is Your Birthday? [Dataset]. https://www.kaggle.com/datasets/thedevastator/us-births-how-common-is-your-birthday
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 23, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    Description

    US Births - How Common is Your Birthday?

    How popular is your birthday?

    By Andy Kriebel [source]

    About this dataset

    The file contains data on births in the United States from 1994 to 2014. The data includes the following columns:

    • year: The year of the observation. (Integer)
    • month: The month of the observation. (Integer)
    • date_of_month: The date of the observation. (Integer)
    • day_of_week: The day of the week of the observation. (Integer)
    • births: The number of births on the given day. (Integer)

    How to use the dataset

    The US Births dataset on Kaggle contains data on births in the United States from 1994 to 2014. The data is broken down by year, month, date of month, day of week, and births.

    This dataset can be used to answer questions about when people are born, how common certain birthdays are, and any trends over time. For example, you could use this dataset to find out which day of the week has the most births or which month has the most births
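
    A minimal sketch of the kind of question mentioned above (which day of the week has the most births) using only the standard library. The column names match the dataset; the sample counts and the helper name total_births_by are assumptions, and the source does not say which weekday each day_of_week integer stands for, so the result is left as an integer code.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample. Column names match the dataset; the counts are made up.
SAMPLE_CSV = """year,month,date_of_month,day_of_week,births
1994,1,1,6,8000
1994,1,2,7,7000
1994,1,3,1,10000
1994,1,8,6,8500
"""


def total_births_by(csv_text, column):
    """Sum the births column, grouped by the integer value of `column`."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[int(row[column])] += int(row["births"])
    return dict(totals)


by_weekday = total_births_by(SAMPLE_CSV, "day_of_week")
# The key with the largest total is the busiest day-of-week code.
busiest_day = max(by_weekday, key=by_weekday.get)
print(by_weekday, busiest_day)
```

    Passing "month" instead of "day_of_week" gives the per-month totals with the same helper.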

    Research Ideas

    • Determining on which day of the year and at what time of day most people are born, to help set staffing levels in maternity wards
    • Identifying trends in baby names over time
    • Predicting the number of births on a given day

    Acknowledgements

    This data set is a combined effort of the U.S. National Center for Health Statistics and the U.S. Social Security Administration, provided by FiveThirtyEight. It contains data on births in the United States from 1994 to 2014, with the following columns: year, month, date_of_month, day_of_week, births

    Thank you to FiveThirtyEight for providing this dataset!

    Data Source

    License

    License: Dataset copyright by authors

    You are free to:
    • Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
    • Adapt - remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit - Provide a link to the license, and indicate if changes were made.
    • ShareAlike - You must distribute your contributions under the same license as the original.
    • Keep intact - all notices that refer to this license, including copyright notices.

    Columns

    File: US_births_1994-2014.csv

    | Column name | Description |
    |:--|:--|
    | year | Year of the data. (Integer) |
    | month | Month of the data. (Integer) |
    | date_of_month | Day of the month of the data. (Integer) |
    | day_of_week | Day of the week of the data. (Integer) |
    | births | Number of births on the given day. (Integer) |

    Acknowledgements

    If you use this dataset in your research, please credit Andy Kriebel.

  6. 2.02 Customer Service (detail)

    • catalog.data.gov
    • open.tempe.gov
    • +4more
    Updated Aug 11, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    City of Tempe (2025). 2.02 Customer Service (detail) [Dataset]. https://catalog.data.gov/dataset/2-02-customer-service-detail-be51b
    Explore at:
    Dataset updated
    Aug 11, 2025
    Dataset provided by
    City of Tempe
    Description

    This dataset provides Customer Service Satisfaction results from the Annual Community Survey. The survey questions assess satisfaction with overall customer service for individuals who had contacted the city in the past year. For years where there are multiple questions related to overall customer service and treatment, the average of those responses is provided in the summary dataset, and the values for each question are provided in the detailed dataset. For years 2010-2014, respondents were first asked, "Have you contacted the city in the past year?". If they answered that they had contacted the city, then they were asked additional questions about their experience. The "number of respondents" field represents the number of people who answered yes to the contact question. Responses of "don't know" are not included in this dataset, but can be found in the dataset for the entire Community Survey. A survey was not completed for 2015 (99999 indicates no recorded data). Due to changes in the survey questions, this dataset was last updated in 2017 and may not be updated again. The performance measure dashboard is available at 2.02 Customer Service Satisfaction.

    Additional Information

    Source: Community Attitude Survey
    Contact: Wydale Holmes
    Contact E-Mail: Wydale_Holmes@tempe.gov
    Data Source Type: Excel and PDF
    Preparation Method: Extracted from Annual Community Survey results
    Publish Frequency: Annual
    Publish Method: Manual
    Data Dictionary
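
    The description notes that 99999 marks years with no recorded data and that the summary dataset averages multiple questions per year. A minimal sketch of that handling follows; the per-year values, the dictionary layout, and the helper name yearly_average are hypothetical, and only the 99999 sentinel comes from the description.

```python
NO_DATA = 99999  # sentinel the description uses for "no recorded data"

# Hypothetical per-question satisfaction values by year (invented numbers).
detail = {
    2014: [80.0, 84.0],
    2015: [NO_DATA],
    2016: [78.0],
}


def yearly_average(values):
    """Average the valid values, or return None if only sentinels remain."""
    valid = [v for v in values if v != NO_DATA]
    if not valid:
        return None
    return sum(valid) / len(valid)


summary = {year: yearly_average(vals) for year, vals in detail.items()}
print(summary)
```

    Dropping the sentinel before averaging avoids treating 99999 as a real score, which would otherwise distort any yearly mean.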

  7. Empathy dataset

    • zenodo.org
    • data.niaid.nih.gov
    bin, csv, html
    Updated Dec 18, 2024
    Cite
    Zenodo (2024). Empathy dataset [Dataset]. http://doi.org/10.5281/zenodo.7683907
    Explore at:
    Available download formats: bin, html, csv
    Dataset updated
    Dec 18, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    The database for this study (Briganti et al. 2018; the same for the Braun study analysis) was composed of 1973 French-speaking students in several universities or schools for higher education in the following fields: engineering (31%), medicine (18%), nursing school (16%), economic sciences (15%), physiotherapy (4%), psychology (11%), law school (4%) and dietetics (1%). The subjects were 17 to 25 years old (M = 19.6 years, SD = 1.6 years); 57% were female and 43% were male. Even though the full dataset was composed of 1973 participants, only 1270 answered the full questionnaire: missing data are handled using pairwise complete observations in estimating a Gaussian Graphical Model, meaning that all available information from every subject is used.
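
    To illustrate what "pairwise complete observations" means, the sketch below computes a Pearson correlation between two items using only the respondents who answered both. It is not the study's Gaussian Graphical Model estimation, only the pairwise-deletion idea applied to a single pair of items; the sample answers and the helper name pairwise_complete_corr are hypothetical, and None stands in for a missing answer.

```python
from math import sqrt


def pairwise_complete_corr(x, y):
    """Pearson correlation using only positions where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return None  # not enough complete pairs
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    if var_x == 0 or var_y == 0:
        return None  # correlation is undefined for a constant item
    return cov / sqrt(var_x * var_y)


# Hypothetical answers from five respondents; None marks a missing answer.
item_a = [0, 1, 2, None, 4]
item_b = [0, 2, 4, 3, None]
r = pairwise_complete_corr(item_a, item_b)
print(r)
```

    Respondents four and five each miss one item, so they are dropped for this pair only; they would still contribute to any other pair of items they answered in full.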

    The feature set is composed of 28 items meant to assess the four following components: fantasy, perspective taking, empathic concern and personal distress. In the questionnaire, the items are mixed; reversed items (items 3, 4, 7, 12, 13, 14, 15, 18, 19) are present. Items are scored from 0 to 4, where “0” means “Doesn’t describe me very well” and “4” means “Describes me very well”; reverse-scoring is calculated afterwards. The questionnaires were anonymized. The reanalysis of the database in this retrospective study was approved by the ethical committee of the Erasmus Hospital.
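
    The reverse-scoring step mentioned above can be sketched as follows. The list of reversed items and the 0 to 4 range come from the description; the formula 4 - score is the conventional way to reverse a 0-to-4 scale and is assumed here, as are the helper name reverse_score and the sample answers.

```python
# Reversed items as listed in the description.
REVERSED_ITEMS = {3, 4, 7, 12, 13, 14, 15, 18, 19}
MAX_SCORE = 4  # items are scored from 0 to 4


def reverse_score(responses):
    """Map item number -> score, flipping reversed items (assumed 4 - score)."""
    return {
        item: (MAX_SCORE - score if item in REVERSED_ITEMS else score)
        for item, score in responses.items()
    }


# Hypothetical raw answers for three items.
raw = {1: 3, 3: 0, 4: 4}
scored = reverse_score(raw)
print(scored)
```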

    Size: A dataset of size 1973*28

    Number of features: 28

    Ground truth: No

    Type of Graph: Mixed graph

    The following gives the description of the variables:

    | Feature | FeatureLabel | Domain | Item meaning from Davis 1980 |
    |:--|:--|:--|:--|
    | 001 | 1FS | Green | I daydream and fantasize, with some regularity, about things that might happen to me. |
    | 002 | 2EC | Purple | I often have tender, concerned feelings for people less fortunate than me. |
    | 003 | 3PT_R | Yellow | I sometimes find it difficult to see things from the “other guy’s” point of view. |
    | 004 | 4EC_R | Purple | Sometimes I don’t feel very sorry for other people when they are having problems. |
    | 005 | 5FS | Green | I really get involved with the feelings of the characters in a novel. |
    | 006 | 6PD | Red | In emergency situations, I feel apprehensive and ill-at-ease. |
    | 007 | 7FS_R | Green | I am usually objective when I watch a movie or play, and I don’t often get completely caught up in it. (Reversed) |
    | 008 | 8PT | Yellow | I try to look at everybody’s side of a disagreement before I make a decision. |
    | 009 | 9EC | Purple | When I see someone being taken advantage of, I feel kind of protective towards them. |
    | 010 | 10PD | Red | I sometimes feel helpless when I am in the middle of a very emotional situation. |
    | 011 | 11PT | Yellow | I sometimes try to understand my friends better by imagining how things look from their perspective |
    | 012 | 12FS_R | Green | Becoming extremely involved in a good book or movie is somewhat rare for me. (Reversed) |
    | 013 | 13PD_R | Red | When I see someone get hurt, I tend to remain calm. (Reversed) |
    | 014 | 14EC_R | Purple | Other people’s misfortunes do not usually disturb me a great deal. (Reversed) |
    | 015 | 15PT_R | Yellow | If I’m sure I’m right about something, I don’t waste much time listening to other people’s arguments. (Reversed) |
    | 016 | 16FS | Green | After seeing a play or movie, I have felt as though I were one of the characters. |
    | 017 | 17PD | Red | Being in a tense emotional situation scares me. |
    | 018 | 18EC_R | Purple | When I see someone being treated unfairly, I sometimes don’t feel very much pity for them. (Reversed) |
    | 019 | 19PD_R | Red | I am usually pretty effective in dealing with emergencies. (Reversed) |
    | 020 | 20FS | Green | I am often quite touched by things that I see happen. |
    | 021 | 21PT | Yellow | I believe that there are two sides to every question and try to look at them both. |
    | 022 | 22EC | Purple | I would describe myself as a pretty soft-hearted person. |
    | 023 | 23FS | Green | When I watch a good movie, I can very easily put myself in the place of a leading character. |
    | 024 | 24PD | Red | I tend to lose control during emergencies. |
    | 025 | 25PT | Yellow | When I’m upset at someone, I usually try to “put myself in his shoes” for a while. |
    | 026 | 26FS | Green | When I am reading an interesting story or novel, I imagine how I would feel if the events in the story were happening to me. |
    | 027 | 27PD | Red | When I see someone who badly needs help in an emergency, I go to pieces. |
    | 028 | 28PT | Yellow | Before criticizing somebody, I try to imagine how I would feel if I were in their place |

    More information about the dataset is contained in empathy_description.html file.

  8. Ghana Number Dataset

    • listtodata.com
    • jw.listtodata.com
    .csv, .xls, .txt
    Updated Jul 17, 2025
    Cite
    List to Data (2025). Ghana Number Dataset [Dataset]. https://listtodata.com/ghana-dataset
    Explore at:
    Available download formats: .csv, .xls, .txt
    Dataset updated
    Jul 17, 2025
    Authors
    List to Data
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2025 - Dec 31, 2025
    Area covered
    Ghana
    Variables measured
    phone numbers, Email Address, full name, Address, City, State, gender, age, income, ip address
    Description

    The Ghana number dataset has accurate numbers verified by our team. These client contact data belong to active users only. In fact, these things make it a valuable marketing resource. Whether your business is new or old, you can boost your reach and connect to a large audience with this database. Again, you will find many people who have an interest in your products and will accept from you. Moreover, the Ghana number dataset will help you make your brand more renowned. In other words, by becoming a known brand in the market, you can increase your brand value greatly. Similarly, many people will show interest in your products and services. However, the contacts on this mobile number list are active and real. Yet, you will benefit greatly if you purchase this cheap but valuable database. Ghana phone data can be a great solution for SMS and telemarketing. Anyone can use the contact leads here to reach different people in this area. Ghana phone data allows you to give product details with your messages to make them more appealing and reliable. Your product quality and content will catch the attention of the interested audience. This will create more traffic and you can reach sales from there. Likewise, the Ghana phone data is an opt-in and permission-based contact list. In addition, with an affordable yet fresh list like ours, your marketing will be more effective. People can now relate to your business more after you successfully use this tool. Thus, order the contact library now from List To Data to promote your goods and services everywhere inside the country. The Ghana phone number list is a massive database. Our team promises you sincere service and active support. In general, you can contact us anytime on our website if you face any problems with our list. Our support team will solve the problem for you, so you don’t have to worry about not getting your money's worth. Further, the Ghana phone number list will aid your business in many new ways. The benefits of SMS marketing are enormous, as we all know very well. Moreover, no one wants to miss out on such a huge and versatile audience in Ghana. Hence, purchasing this contact number package will be a gem for any business any day.

  9. PERU MIGRANT Study | Baseline and 5yr follow-up dataset

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    bin
    Updated May 30, 2023
    Cite
    J. Jaime Miranda; Antonio Bernabe-Ortiz; Rodrigo Carrillo Larco (2023). PERU MIGRANT Study | Baseline and 5yr follow-up dataset [Dataset]. http://doi.org/10.6084/m9.figshare.4832612.v4
    Explore at:
    Available download formats: bin
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    J. Jaime Miranda; Antonio Bernabe-Ortiz; Rodrigo Carrillo Larco
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Peru
    Description

    This is an update of a prior dataset publication containing baseline and 5-year follow-up data from the PERU MIGRANT Study (PEru's Rural to Urban MIGRANTs Study).

    The PERU MIGRANT Study was designed to investigate the magnitude of differences between rural-to-urban migrant and non-migrant groups in specific cardiovascular risk factors. Three groups were selected: i) Rural, people who have always lived in a rural environment; ii) Rural-urban, people who migrated from rural to urban areas; and iii) Urban, people who have always lived in an urban environment.

    PERU MIGRANT Study protocol, instruments and variables are described in full in:

    • Miranda JJ, Gilman RH, García HH, Smeeth L. The effect on cardiovascular risk factors of migration from rural to urban areas in Peru: PERU MIGRANT Study. BMC Cardiovasc Disord 2009;9:23.

    PERU MIGRANT Study baseline dataset is available at: https://figshare.com/articles/PERU_MIGRANT_Study_Baseline_dataset/3125005

    Main findings of the baseline study:

    • Miranda JJ, Gilman RH, Smeeth L. Differences in cardiovascular risk factors in rural, urban and rural-to-urban migrants in Peru. Heart 2011;97(10):787-96.

    Main findings of the 5-yr follow-up study:

    • Carrillo-Larco RM, Bernabé-Ortiz A, Pillay TD, Gilman RH, Sanchez JF, Poterico JA, Quispe R, Smeeth L, Miranda JJ. Obesity risk in rural, urban and rural-to-urban migrants: prospective results of the PERU MIGRANT study. Int J Obes (Lond) 2016;40(1):181-5.
    • Bernabe-Ortiz A, Sanchez JF, Carrillo-Larco RM, Gilman RH, Poterico JA, Quispe R, Smeeth L, Miranda JJ. Rural-to-urban migration and risk of hypertension: longitudinal results of the PERU MIGRANT study. J Hum Hypertens 2017;31(1):22-28.
    • Lazo-Porras M, Bernabe-Ortiz A, Málaga G, Gilman RH, Acuña-Villaorduña A, Cardenas-Montero D, Smeeth L, Miranda JJ. Low HDL cholesterol as a cardiovascular risk factor in rural, urban, and rural-urban migrants: PERU MIGRANT cohort study. Atherosclerosis 2016;246:36-43.
    • Burroughs Pena MS, Bernabé-Ortiz A, Carrillo-Larco RM, Sánchez JF, Quispe R, Pillay TD, Málaga G, Gilman RH, Smeeth L, Miranda JJ. Migration, urbanisation and mortality: 5-year longitudinal analysis of the PERU MIGRANT study. J Epidemiol Community Health 2015;69(7):715-8.

  10. The dataset contains PII

    • catalog.data.gov
    • gimi9.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). The dataset contains PII [Dataset]. https://catalog.data.gov/dataset/the-dataset-contains-pii
    Explore at:
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    These data are interview transcripts with individuals who are users of the Smoke Sense app. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: This data is available on request to approved individuals. Format: This data contains PII. These are interview transcripts. This dataset is associated with the following publication: Hano, M., L. Wei, B. Hubbell, and A. Rappold. Scaling Up: Citizen Science Engagement and Impacts Beyond the Individual. Citizen Science: Theory and Practice. Ubiquity Press, London, UK, 5(1): 1-13, (2020).

  11. Dataset #2: Experimental study

    • figshare.com
    docx
    Updated Jul 19, 2023
    Cite
    Adam Baimel (2023). Dataset #2: Experimental study [Dataset]. http://doi.org/10.6084/m9.figshare.23708766.v1
    Explore at:
    Available download formats: docx
    Dataset updated
    Jul 19, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Adam Baimel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Project Title: Add title here

    Project Team: Add contact information for research project team members

    Summary: Provide a descriptive summary of the nature of your research project and its aims/focal research questions.

    Relevant publications/outputs: When available, add links to the related publications/outputs from this data.

    Data availability statement: If your data is not linked on figshare directly, provide links to where it is being hosted here (i.e., Open Science Framework, Github, etc.). If your data is not going to be made publicly available, please provide details here as to the conditions under which interested individuals could gain access to the data and how to go about doing so.

    Data collection details: 1. When was your data collected? 2. How were your participants sampled/recruited?

    Sample information: How many and who are your participants? Demographic summaries are helpful additions to this section.

    Research Project Materials: What materials are necessary to fully reproduce the contents of your dataset? Include a list of all relevant materials (e.g., surveys, interview questions) with a brief description of what is included in each file that should be uploaded alongside your datasets.

    List of relevant datafile(s): If your project produces data that cannot be contained in a single file, list the names of each of the files here with a brief description of what parts of your research project each file is related to.

    Data codebook: What is in each column of your dataset? Provide variable names as they are encoded in your data files, verbatim question associated with each response, response options, details of any post-collection coding that has been done on the raw-response (and whether that's encoded in a separate column).

    Examples available at: https://www.thearda.com/data-archive?fid=PEWMU17 https://www.thearda.com/data-archive?fid=RELLAND14

  12. Statewide Death Profiles

    • data.chhs.ca.gov
    • data.ca.gov
    • +3more
    csv, zip
    Updated Aug 22, 2025
    Cite
    California Department of Public Health (2025). Statewide Death Profiles [Dataset]. https://data.chhs.ca.gov/dataset/statewide-death-profiles
    Explore at:
    Available download formats: csv(4689434), csv(16301), csv(5034), csv(463460), csv(2026589), csv(5401561), csv(164006), csv(200270), csv(419332), csv(406971), zip
    Dataset updated
    Aug 22, 2025
    Dataset authored and provided by
    California Department of Public Health (https://www.cdph.ca.gov/)
    Description

    This dataset contains counts of deaths for California as a whole based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all deaths that occurred during the time period. Deaths involving injuries from external or environmental forces, such as accidents, homicide and suicide, often require additional investigation that tends to delay certification of the cause and manner of death. This can result in significant under-reporting of these deaths in provisional data.

    The final data tables include both deaths that occurred in California regardless of the place of residence (by occurrence) and deaths to California residents (by residence), whereas the provisional data table only includes deaths that occurred in California regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by age, gender, race-ethnicity, and death place type. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.

    The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.

  13. COVID-19 Post-Vaccination Infection Data (ARCHIVED)

    • data.chhs.ca.gov
    • data.ca.gov
    • +4more
    csv, xlsx, zip
    Updated Aug 30, 2024
    Cite
    California Department of Public Health (2024). COVID-19 Post-Vaccination Infection Data (ARCHIVED) [Dataset]. https://data.chhs.ca.gov/dataset/covid-19-post-vaccination-infection-data
    Explore at:
    Available download formats: csv(38212), xlsx(11056), csv(90508), csv(78921), zip
    Dataset updated
    Aug 30, 2024
    Dataset authored and provided by
    California Department of Public Health (https://www.cdph.ca.gov/)
    Description

    Note: This dataset is no longer being updated due to the end of the COVID-19 Public Health Emergency.

    The California Department of Public Health (CDPH) is identifying the vaccination status of COVID-19 cases, hospitalizations, and deaths by analyzing the state immunization registry and the registry of confirmed COVID-19 cases. Post-vaccination cases are individuals who have a positive SARS-CoV-2 molecular test (e.g. PCR) at least 14 days after they have completed their primary vaccination series.

    Tracking cases of COVID-19 that occur after vaccination is important for monitoring the impact of immunization campaigns. While COVID-19 vaccines are safe and effective, some cases are still expected in persons who have been vaccinated, as no vaccine is 100% effective. For more information, please see https://www.cdph.ca.gov/Programs/CID/DCDC/Pages/COVID-19/Post-Vaccine-COVID19-Cases.aspx

    Post-vaccination infection data is updated monthly and includes data on cases, hospitalizations, and deaths among the unvaccinated and the vaccinated. Partially vaccinated individuals are excluded. To account for reporting and processing delays, there is at least a one-month lag in provided data (for example data published on 9/9/22 will include data through 7/31/22).

    Notes:

    • On September 9, 2022, the post-vaccination data was changed to compare unvaccinated persons with those who have completed at least a primary series, for persons age 5+. These data will be updated monthly (first Thursday of the month) and include at least a one-month lag.

    • On February 2, 2022, the post-vaccination data was changed to distinguish between persons vaccinated with a primary series only and persons vaccinated and boosted. The previous dataset has been uploaded as an archived table. Additionally, the lag on this data was extended to 14 days.

    • On November 29, 2021, the denominator for calculating vaccine coverage was changed from age 16+ to age 12+ to reflect new vaccine eligibility criteria. The previous dataset based on age 16+ denominators has been uploaded as an archived table.

  14. Africa - Population and Internet users statistics

    • kaggle.com
    Updated Dec 17, 2020
    Cite
    Ishmeet singh (2020). Africa - Population and Internet users statistics [Dataset]. https://www.kaggle.com/datasets/ishmeet/africa-population-and-internet-users-statistics
    Dataset updated
    Dec 17, 2020
    Dataset provided by
    Kaggle
    Authors
    Ishmeet singh
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Area covered
    Africa
    Description

    Context

    Africa - Population and Internet users statistics

    Acknowledgements

    Source: https://data.humdata.org/dataset/africa-population-and-internet-users-statistics
    Last updated at https://data.humdata.org/organization/openafrica: 2019-09-11

  15. PLACE OF BIRTH - DP02_DES_T - Dataset - CKAN

    • portal.tad3.org
    Updated Nov 18, 2024
    Cite
    (2024). PLACE OF BIRTH - DP02_DES_T - Dataset - CKAN [Dataset]. https://portal.tad3.org/dataset/place-of-birth-dp02_des_t
    Dataset updated
    Nov 18, 2024
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    SELECTED SOCIAL CHARACTERISTICS IN THE UNITED STATES
    PLACE OF BIRTH - DP02
    Universe - Total population
    Survey-Program - American Community Survey 5-year estimates
    Years - 2020, 2021, 2022

    People not reporting a place of birth were assigned the state or country of birth of another family member, or were allocated the response of another individual with similar characteristics. People born outside the United States were asked to report their place of birth according to current international boundaries. Since numerous changes in boundaries of foreign countries have occurred in the last century, some people may have reported their place of birth in terms of boundaries that existed at the time of their birth or emigration, or in accordance with their own national preference.

  16. SceneFake

    • zenodo.org
    zip
    Updated Feb 23, 2023
    Cite
    Jiangyan Yi; Chenglong Wang (2023). SceneFake [Dataset]. http://doi.org/10.5281/zenodo.7663324
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 23, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jiangyan Yi; Chenglong Wang
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Many datasets have been designed to further the development of fake audio detection. However, fake utterances in previous datasets are mostly generated by altering the timbre, prosody, linguistic content, or channel noise of the original audio. These datasets leave out a scenario in which the acoustic scene of the original audio is replaced with a forged one. It would pose a major threat to society if people misused such manipulated audio for malicious purposes, which motivates us to fill this gap. This paper proposes such a dataset for scene fake audio detection, named SceneFake, in which manipulated audio is generated by tampering only with the acoustic scene of a real utterance using speech enhancement technologies. The results show that scene fake utterances cannot be detected reliably by the baseline models trained on the ASVspoof 2019 dataset. When the models are trained on the SceneFake training set, they perform well on the seen test set but still perform poorly on the unseen test set.

    The SceneFake dataset is publicly available. The source code of baselines is available on GitHub https://github.com/ADDchallenge/SceneFake

    This data set is licensed with a CC BY-NC-ND 4.0 license.

  17. COVID-19 Vaccine Progress Dashboard Data

    • data.chhs.ca.gov
    • data.ca.gov
    • +5more
    csv, xlsx, zip
    Updated Sep 1, 2025
    Cite
    California Department of Public Health (2025). COVID-19 Vaccine Progress Dashboard Data [Dataset]. https://data.chhs.ca.gov/dataset/vaccine-progress-dashboard
    Explore at:
    Available download formats: csv(82754), csv(675610), csv(2447143), csv(83128924), csv(12877811), csv(26828), csv(724860), csv(303068812), csv(503270), xlsx(11870), csv(110928434), xlsx(11731), csv(6772350), xlsx(11249), csv(148732), zip, csv(7777694), csv(54906), xlsx(7708), csv(2641927), csv(188895), csv(638738), csv(111682), csv(18403068), xlsx(11534)
    Dataset updated
    Sep 1, 2025
    Dataset authored and provided by
    California Department of Public Health (https://www.cdph.ca.gov/)
    Description

    Note: In these datasets, a person is defined as up to date if they have received at least one dose of an updated COVID-19 vaccine. The Centers for Disease Control and Prevention (CDC) recommends that certain groups, including adults ages 65 years and older, receive additional doses.

    On 6/16/2023 CDPH replaced the booster measures with a new “Up to Date” measure based on CDC’s new recommendations, replacing the primary series, boosted, and bivalent booster metrics. The definition of “primary series complete” has not changed and is based on previous recommendations that CDC has since simplified. A person cannot complete their primary series with a single dose of an updated vaccine. Whereas the booster measures were calculated using the eligible population as the denominator, the new up to date measure uses the total estimated population. Please note that the rates for some groups may change since the up to date measure is calculated differently than the previous booster and bivalent measures.
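
The effect of the denominator change can be illustrated with a minimal sketch. The function name `coverage_rate` and every number below are hypothetical and chosen only for illustration; they are not CDPH figures, and the actual CDPH calculation may involve further steps not described here.

```python
def coverage_rate(people_counted: int, denominator: int) -> float:
    """Return the share of the denominator population that is counted."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return people_counted / denominator


# Hypothetical figures for one group (NOT real CDPH data).
people_counted = 300_000       # people meeting the measure
eligible_population = 800_000  # denominator used by the old booster measures
total_population = 1_000_000   # denominator used by the up to date measure

rate_eligible = coverage_rate(people_counted, eligible_population)
rate_total = coverage_rate(people_counted, total_population)

# The same count yields a lower rate against the larger denominator.
print(f"eligible-population denominator: {rate_eligible:.1%}")
print(f"total-population denominator:    {rate_total:.1%}")
```

Because the total estimated population is at least as large as the eligible population, the same count of people gives a rate that is equal or lower under the new denominator, which is one reason rates for some groups may shift.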

    This data is from the same source as the Vaccine Progress Dashboard at https://covid19.ca.gov/vaccination-progress-data/ which summarizes vaccination data at the county level by county of residence. Where county of residence was not reported in a vaccination record, the county of provider that vaccinated the resident is included. This applies to less than 1% of vaccination records. The sum of county-level vaccinations does not equal statewide total vaccinations due to out-of-state residents vaccinated in California.

    These data do not include doses administered by the following federal agencies who received vaccine allocated directly from CDC: Indian Health Service, Veterans Health Administration, Department of Defense, and the Federal Bureau of Prisons.

    Totals for the Vaccine Progress Dashboard and this dataset may not match, as the Dashboard totals doses by Report Date and this dataset totals doses by Administration Date. Dose numbers may also change for a particular Administration Date as data is updated.

    Previous updates:

    • On March 3, 2023, with the release of HPI 3.0 in 2022, the previous equity scores have been updated to reflect more recent community survey information. This change represents an improvement to the way CDPH monitors health equity by using the latest and most accurate community data available. The HPI uses a collection of data sources and indicators to calculate a measure of community conditions ranging from the most to the least healthy based on economic, housing, and environmental measures.

    • Starting on July 13, 2022, the denominator for calculating vaccine coverage has been changed from age 5+ to all ages to reflect new vaccine eligibility criteria. Previously the denominator was changed from age 16+ to age 12+ on May 18, 2021, then changed from age 12+ to age 5+ on November 10, 2021, to reflect previous changes in vaccine eligibility criteria. The previous datasets based on age 16+ and age 5+ denominators have been uploaded as archived tables.

    • Starting on May 29, 2021 the methodology for calculating on-hand inventory in the shipped/delivered/on-hand dataset has changed. Please see the accompanying data dictionary for details. In addition, this dataset is now down to the ZIP code level.

  18. Spanish Open Ended Question Answer Text Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Spanish Open Ended Question Answer Text Dataset [Dataset]. https://www.futurebeeai.com/dataset/prompt-response-dataset/spanish-open-ended-question-answer-text-dataset
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    What’s Included

    The Spanish Open-Ended Question Answering Dataset is a meticulously curated collection of comprehensive Question-Answer pairs. It serves as a valuable resource for training Large Language Models (LLMs) and Question-answering models in the Spanish language, advancing the field of artificial intelligence.

    Dataset Content:

    This QA dataset comprises a diverse set of open-ended questions paired with corresponding answers in Spanish. There is no context paragraph given to choose an answer from, and each question is answered without any predefined context content. The questions cover a broad range of topics, including science, history, technology, geography, literature, current affairs, and more.

    Each question is accompanied by an answer, providing valuable information and insights to enhance the language model training process. Both the questions and answers were manually curated by native Spanish speakers, with references taken from diverse sources such as books, news articles, websites, and other reliable references.

    This question-answer prompt completion dataset contains different types of prompts, including instruction type, continuation type, and in-context learning (zero-shot, few-shot) type. The dataset also contains questions and answers with different types of rich text, including tables, code, JSON, etc., with proper markdown.

    Question Diversity:

    To ensure diversity, this Q&A dataset includes questions with varying complexity levels, ranging from easy to medium and hard. Different types of questions, such as multiple-choice, direct, and true/false, are included. Additionally, questions are further classified into fact-based and opinion-based categories, creating a comprehensive variety. The QA dataset also contains questions with constraints and persona restrictions, which makes it even more useful for LLM training.

    Answer Formats:

    To accommodate varied learning experiences, the dataset incorporates different types of answer formats. These formats include single-word, short-phrase, single-sentence, and paragraph answers. Answers also contain text strings, numerical values, and date and time formats. Such diversity strengthens the language model's ability to generate coherent and contextually appropriate answers.

    Data Format and Annotation Details:

    This fully labeled Spanish Open Ended Question Answer Dataset is available in JSON and CSV formats. It includes annotation details such as id, language, domain, question_length, prompt_type, question_category, question_type, complexity, answer_type, rich_text.
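
As a rough illustration of how the annotation fields listed above might be used, the sketch below filters JSON records by annotation values. The record layout, the `question` and `answer` keys, the helper `filter_records`, and all sample values are assumptions made for this example; the actual file schema is not shown in this listing.

```python
import json

# Hypothetical records. Annotation field names follow the list above; the
# "question" and "answer" keys and every value are invented for illustration.
SAMPLE_JSON = """
[
  {"id": "q1", "language": "Spanish", "domain": "geography",
   "question": "¿Cuál es la capital de España?", "answer": "Madrid",
   "question_type": "direct", "complexity": "easy",
   "answer_type": "single-word", "rich_text": false},
  {"id": "q2", "language": "Spanish", "domain": "science",
   "question": "¿El agua hierve a 50 °C al nivel del mar?", "answer": "Falso",
   "question_type": "true/false", "complexity": "hard",
   "answer_type": "single-word", "rich_text": false},
  {"id": "q3", "language": "Spanish", "domain": "history",
   "question": "¿En qué año llegó Colón a América? a) 1492 b) 1592",
   "answer": "a) 1492",
   "question_type": "multiple-choice", "complexity": "easy",
   "answer_type": "short-phrase", "rich_text": false}
]
"""


def filter_records(records, **criteria):
    """Keep records whose fields equal every given keyword criterion."""
    return [
        record
        for record in records
        if all(record.get(key) == value for key, value in criteria.items())
    ]


records = json.loads(SAMPLE_JSON)
easy_direct = filter_records(records, complexity="easy", question_type="direct")
print([record["id"] for record in easy_direct])
```

Filtering on annotation fields in this way is one option for assembling a training subset, for example only easy, direct questions.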

    Quality and Accuracy:

    The dataset upholds the highest standards of quality and accuracy. Each question undergoes careful validation, and the corresponding answers are thoroughly verified. To prioritize inclusivity, the dataset incorporates questions and answers representing diverse perspectives and writing styles, ensuring it remains unbiased and avoids perpetuating discrimination.

    Both the questions and the answers in Spanish are free of spelling and grammatical errors. No copyrighted, toxic, or harmful content was used in building this dataset.

    Continuous Updates and Customization:

    The entire dataset was prepared with the assistance of human curators from the FutureBeeAI crowd community. Continuous efforts are made to add more assets to this dataset, ensuring its growth and relevance. Additionally, FutureBeeAI offers the ability to collect custom question-answer data tailored to specific needs, providing flexibility and customization options.

    License:

    The dataset, created by FutureBeeAI, is now ready for commercial use. Researchers, data scientists, and developers can use this fully labeled and ready-to-deploy Spanish Open Ended Question Answer Dataset to enhance the language understanding capabilities of their generative AI models, improve response generation, and explore new approaches to NLP question-answering tasks.

  19. World cities database

    • kaggle.com
    Updated May 25, 2025
    Cite
    Juanma Hernández (2025). World cities database [Dataset]. http://doi.org/10.34740/kaggle/dsv/11944536
    Dataset updated
    May 25, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Juanma Hernández
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The data is from:

    https://simplemaps.com/data/world-cities

    We're proud to offer a simple, accurate and up-to-date database of the world's cities and towns. We've built it from the ground up using authoritative sources such as the NGIA, US Geological Survey, US Census Bureau, and NASA.

    Our database is:

    • Up-to-date: It was last refreshed on May 11, 2025.
    • Comprehensive: Over 4 million unique cities and towns from every country in the world (about 48 thousand in basic database).
    • Accurate: Cleaned and aggregated from official sources. Includes latitude and longitude coordinates.
    • Simple: A single CSV file, concise field names, only one entry per city.
  20. Asthma Prevalence

    • data.ca.gov
    • data.chhs.ca.gov
    • +3more
    csv, pdf, zip
    Updated Aug 28, 2024
    Cite
    California Department of Public Health (2024). Asthma Prevalence [Dataset]. https://data.ca.gov/dataset/asthma-prevalence
    Explore at:
    Available download formats: pdf, csv, zip
    Dataset updated
    Aug 28, 2024
    Dataset authored and provided by
    California Department of Public Health (https://www.cdph.ca.gov/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the estimated percentage of Californians with asthma (asthma prevalence). Two types of asthma prevalence are included: 1) lifetime asthma prevalence describes the percentage of people who have ever been diagnosed with asthma by a health care provider, 2) current asthma prevalence describes the percentage of people who have ever been diagnosed with asthma by a health care provider AND report they still have asthma and/or had an asthma episode or attack within the past 12 months. The tables “Lifetime Asthma Prevalence by County” and “Current Asthma Prevalence by County” are derived from the California Health Interview Survey (CHIS) and include data stratified by county and age group (all ages, 0-17, 18+, 0-4, 5-17, 18-64, 65+) reported for 2-year periods. The table “Asthma Prevalence, Adults (18 and older)” is derived from the California Behavioral Risk Factor Surveillance System (BRFSS) and includes statewide data on adults reported by year.

The Devastator (2023). Mental Health Services Children & Young People [Dataset]. https://www.kaggle.com/datasets/thedevastator/mental-health-services-children-young-people/discussion?sort=undefined

Data from: Mental Health Services Children & Young People

Monthly Statistics on Referrals, Contacts and Care

Dataset updated
Jan 21, 2023
Dataset provided by
Kaggle
Authors
The Devastator
Description

By data.world's Admin [source]

About this dataset

This dataset provides information on the mental health services provided to children and young people in England. The data in the Mental Health Services Data Set (MHSDS) - Children & Young People covers several categories for a given reporting period, including primary level details, secondary level descriptions, the number of open referrals for children's and young people's mental health services at the end of the reporting period, and the number of first attended contacts for referrals open in the reporting period for those aged 0-18. It also shows how many people aged 0 to 18 are in contact with mental health services at the time of reporting, how many referrals starting during this time were self-referrals, and more. This information helps to track and understand trends in order to provide more effective care.

How to use the dataset

This guide will provide you with an overview of the data contained in this dataset as well as information on how to effectively use it for your own research or personal purposes. Let's get started!

Overview of Data Fields

  • REPORTING_PERIOD: The month and year of the reporting period (Date)
  • BREAKDOWN: The type of breakdown of the data (String)
  • PRIMARY_LEVEL: The primary level of the data (String)
  • PRIMARY_LEVEL_DESCRIPTION: A description at the primary level of the data (String)
  • SECONDARY_LEVEL: The secondary level of the data (String)
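
A minimal sketch of reading these fields with Python's standard `csv` module follows. The sample rows are hypothetical and only mimic the five column names listed above; to work with the real file, open it with `open(path, newline="")` instead of the in-memory sample. The helper name `rows_by_breakdown` is an assumption for this example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows: only the column names come from the field list
# above; every value is invented for illustration.
SAMPLE_CSV = """REPORTING_PERIOD,BREAKDOWN,PRIMARY_LEVEL,PRIMARY_LEVEL_DESCRIPTION,SECONDARY_LEVEL
2017-02,England,England,England,NONE
2017-02,Provider,P01,Example Provider A,NONE
2017-02,Provider,P02,Example Provider B,NONE
2017-02,CCG,C01,Example CCG,NONE
"""


def rows_by_breakdown(csv_file):
    """Group the rows of a CSV file object by their BREAKDOWN value."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row["BREAKDOWN"]].append(row)
    return dict(groups)


groups = rows_by_breakdown(io.StringIO(SAMPLE_CSV))
for breakdown, rows in groups.items():
    print(breakdown, len(rows))
```

Grouping by BREAKDOWN first is a reasonable starting point because the meaning of PRIMARY_LEVEL and SECONDARY_LEVEL depends on which breakdown a row belongs to.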

Research Ideas

  • Evaluating the efficacy of existing mental health services for children and young people by examining changes in the relationships between different aspects of service delivery (e.g. referral activity, hospital spell activity).
  • Analysing geographical trends in mental health services to inform investment decisions and policies across different regions.
  • Identifying areas of high need among vulnerable or marginalised citizens, such as those aged 0-18 or those with particular genetic makeup, to better target resources and support those most in need of help.

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

License: Dataset copyright by authors

You are free to:

  • Share: copy and redistribute the material in any medium or format for any purpose, even commercially.
  • Adapt: remix, transform, and build upon the material for any purpose, even commercially.

You must:

  • Give appropriate credit: provide a link to the license, and indicate if changes were made.
  • ShareAlike: distribute your contributions under the same license as the original.
  • Keep intact all notices that refer to this license, including copyright notices.

Columns

File: mhsds-monthly-cyp-data-file-feb-fin-2017-1.csv

| Column name | Description |
|:-----------------|:--------------------------------------------------------------|
| REPORTING_PERIOD | The period of time for which the data was collected. (String) |
| BREAKDOWN | The breakdown of the data by age group. (String) |
| PRIMARY_LEVEL | The primary level of the data. (String) |
| PRIMARY_LEVEL_DESCRIPTION ...
