100+ datasets found
  1. Fake Dataset for Practice

    • kaggle.com
    zip
    Updated Aug 21, 2023
    Cite
    Shuvo Kumar Basak-4004 (2023). Fake Dataset for Practice [Dataset]. https://www.kaggle.com/datasets/shuvokumarbasak4004/fake-dataset-for-practice
    Available download formats: zip (1515599 bytes)
    Dataset updated
    Aug 21, 2023
    Authors
    Shuvo Kumar Basak-4004
    Description

    Description: This dataset is created solely for the purpose of practice and learning. It contains entirely fake and fabricated information, including names, phone numbers, emails, cities, ages, and other attributes. None of the information in this dataset corresponds to real individuals or entities. It serves as a resource for those who are learning data manipulation, analysis, and machine learning techniques. Please note that the data is completely fictional and should not be treated as representing any real-world scenarios or individuals.

    Attributes:
    - phone_number: Fake phone numbers in various formats.
    - name: Fictitious names generated for practice purposes.
    - email: Imaginary email addresses created for the dataset.
    - city: Made-up city names to simulate geographical diversity.
    - age: Randomly generated ages for practice analysis.
    - sex: Simulated gender values (Male, Female).
    - married_status: Synthetic marital status information.
    - job: Fictional job titles for practicing data analysis.
    - income: Fake income values for learning data manipulation.
    - religion: Pretend religious affiliations for practice.
    - nationality: Simulated nationalities for practice purposes.

    Please be aware that this dataset is not based on real data and should be used exclusively for educational purposes.

  2. Data Cleaning Sample

    • borealisdata.ca
    • dataone.org
    Updated Jul 13, 2023
    Cite
    Rong Luo (2023). Data Cleaning Sample [Dataset]. http://doi.org/10.5683/SP3/ZCN177
    Dataset updated
    Jul 13, 2023
    Dataset provided by
    Borealis
    Authors
    Rong Luo
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Sample data for exercises in Further Adventures in Data Cleaning.

  3. Dirty Dataset to practice Data Cleaning

    • kaggle.com
    zip
    Updated May 20, 2024
    Cite
    Martin Kanju (2024). Dirty Dataset to practice Data Cleaning [Dataset]. https://www.kaggle.com/datasets/martinkanju/dirty-dataset-to-practice-data-cleaning
    Available download formats: zip (1235 bytes)
    Dataset updated
    May 20, 2024
    Authors
    Martin Kanju
    Description

    Dataset

    This dataset was created by Martin Kanju

    Released under Other (specified in description)

    Contents

  4. Sample Sales Dataset

    • cubig.ai
    zip
    Updated Jun 15, 2025
    Cite
    CUBIG (2025). Sample Sales Dataset [Dataset]. https://cubig.ai/store/products/477/sample-sales-dataset
    Available download formats: zip
    Dataset updated
    Jun 15, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data Introduction
    • The Sample Sales Data is a retail sales dataset of 2,823 orders and 25 columns covering order numbers, product information, quantity, unit price, sales, order date, order status, and customer and delivery information.

    2) Data Utilization
    (1) Characteristics of the Sample Sales Data:
    • The dataset consists of numerical (sales, quantity, unit price, etc.), categorical (product, country, city, customer name, transaction size, etc.), and date (order date) variables, with missing values in some columns (STATE, ADDRESSLINE2, POSTALCODE, etc.).
    (2) The Sample Sales Data can be used for:
    • Analysis of sales trends and performance by product: key variables such as order date, product line, and country can be used to visualize and analyze monthly and yearly sales trends, the proportion of sales by product line, and top sales by country and region.
    • Segmentation and marketing strategies: customer groups can be segmented by customer information, transaction size, and regional data, and the segments used to design targeted marketing and customized promotion strategies.
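
    The trend and proportion analyses described above could be sketched roughly as follows. This is only an illustration using the Python standard library: the column names ORDERDATE, PRODUCTLINE, COUNTRY, and SALES, the date format, and the four sample rows are all assumptions, not values taken from the dataset.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical rows: column names, date format, and values are assumptions
# made for illustration, not records from the actual dataset.
ROWS = [
    {"ORDERDATE": "2003-02-24", "PRODUCTLINE": "Motorcycles", "COUNTRY": "USA", "SALES": "2871.00"},
    {"ORDERDATE": "2003-02-27", "PRODUCTLINE": "Classic Cars", "COUNTRY": "France", "SALES": "1500.50"},
    {"ORDERDATE": "2003-03-05", "PRODUCTLINE": "Motorcycles", "COUNTRY": "USA", "SALES": "1000.00"},
    {"ORDERDATE": "2004-03-10", "PRODUCTLINE": "Classic Cars", "COUNTRY": "USA", "SALES": "499.50"},
]


def monthly_sales(rows):
    """Sum SALES per (year, month), sorted chronologically."""
    totals = defaultdict(float)
    for row in rows:
        order_date = datetime.strptime(row["ORDERDATE"], "%Y-%m-%d")
        totals[(order_date.year, order_date.month)] += float(row["SALES"])
    return dict(sorted(totals.items()))


def share_by_product_line(rows):
    """Return each product line's fraction of total sales."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["PRODUCTLINE"]] += float(row["SALES"])
    grand_total = sum(totals.values())
    return {line: value / grand_total for line, value in totals.items()}


if __name__ == "__main__":
    for (year, month), total in monthly_sales(ROWS).items():
        print(f"{year}-{month:02d}: {total:.2f}")
    for line, share in share_by_product_line(ROWS).items():
        print(f"{line}: {share:.1%}")
```

    With the real file, the rows would more likely come from csv.DictReader, and the date format would need to match whatever the ORDERDATE column actually contains.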

  5. Political Analysis Using R: Example Code and Data, Plus Data for Practice...

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Apr 28, 2020
    Cite
    Jamie Monogan (2020). Political Analysis Using R: Example Code and Data, Plus Data for Practice Problems [Dataset]. http://doi.org/10.7910/DVN/ARKOTI
    Dataset updated
    Apr 28, 2020
    Dataset provided by
    Harvard Dataverse
    Authors
    Jamie Monogan
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Each R script replicates all of the example code from one chapter from the book. All required data for each script are also uploaded, as are all data used in the practice problems at the end of each chapter. The data are drawn from a wide array of sources, so please cite the original work if you ever use any of these data sets for research purposes.

  6. New 1000 Sales Records Data 2

    • kaggle.com
    zip
    Updated Jan 12, 2023
    Cite
    Calvin Oko Mensah (2023). New 1000 Sales Records Data 2 [Dataset]. https://www.kaggle.com/datasets/calvinokomensah/new-1000-sales-records-data-2
    Available download formats: zip (49305 bytes)
    Dataset updated
    Jan 12, 2023
    Authors
    Calvin Oko Mensah
    Description

    This dataset was downloaded from excelbianalytics.com, where it was generated with random VBA logic. I recently performed an extensive exploratory data analysis on it and added new columns, namely Unit margin, Order year, Order month, Order weekday, and Order_Ship_Days, which I think can help with analysis of the data. I shared it because I thought it was a great dataset for newbies like myself to practice analytical processes on.
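
    A minimal sketch of how the five added columns might be derived from one record is shown below. The input field names (Order Date, Ship Date, Unit Price, Unit Cost), the definition of unit margin as price minus cost, and the sample values are assumptions; the description does not say how the author computed them.

```python
from datetime import date

# Fixed English names so the result does not depend on the system locale.
WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")

# Hypothetical record: field names and values are assumptions for illustration.
SAMPLE = {
    "Order Date": date(2023, 1, 12),
    "Ship Date": date(2023, 1, 20),
    "Unit Price": 9.33,
    "Unit Cost": 6.92,
}


def add_derived_columns(record):
    """Return a copy of the record with the five derived columns added."""
    order, ship = record["Order Date"], record["Ship Date"]
    out = dict(record)
    # Assumed definition: margin per unit = unit price minus unit cost.
    out["Unit margin"] = round(record["Unit Price"] - record["Unit Cost"], 2)
    out["Order year"] = order.year
    out["Order month"] = order.month
    out["Order weekday"] = WEEKDAYS[order.weekday()]  # weekday(): Monday == 0
    out["Order_Ship_Days"] = (ship - order).days
    return out


if __name__ == "__main__":
    print(add_derived_columns(SAMPLE))
```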

  7. 18 excel spreadsheets by species and year giving reproduction and growth...

    • catalog.data.gov
    • data.wu.ac.at
    Updated Aug 17, 2024
    Cite
    U.S. EPA Office of Research and Development (ORD) (2024). 18 excel spreadsheets by species and year giving reproduction and growth data. One excel spreadsheet of herbicide treatment chemistry. [Dataset]. https://catalog.data.gov/dataset/18-excel-spreadsheets-by-species-and-year-giving-reproduction-and-growth-data-one-excel-sp
    Dataset updated
    Aug 17, 2024
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    Excel spreadsheets by species (4 letter code is abbreviation for genus and species used in study, year 2010 or 2011 is year data collected, SH indicates data for Science Hub, date is date of file preparation). The data in a file are described in a read me file which is the first worksheet in each file. Each row in a species spreadsheet is for one plot (plant). The data themselves are in the data worksheet. One file includes a read me description of the columns in the data set for chemical analysis. In this file one row is an herbicide treatment and sample for chemical analysis (if taken). This dataset is associated with the following publication: Olszyk, D., T. Pfleeger, T. Shiroyama, M. Blakely-Smith, E. Lee, and M. Plocher. Plant reproduction is altered by simulated herbicide drift to constructed plant communities. ENVIRONMENTAL TOXICOLOGY AND CHEMISTRY. Society of Environmental Toxicology and Chemistry, Pensacola, FL, USA, 36(10): 2799-2813, (2017).

  8. Dataset #1: Cross-sectional survey data

    • figshare.com
    txt
    Updated Jul 19, 2023
    Cite
    Adam Baimel (2023). Dataset #1: Cross-sectional survey data [Dataset]. http://doi.org/10.6084/m9.figshare.23708730.v1
    Available download formats: txt
    Dataset updated
    Jul 19, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Adam Baimel
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    N.B. This is not real data. Only here for an example for project templates.

    Project Title: Add title here

    Project Team: Add contact information for research project team members

    Summary: Provide a descriptive summary of the nature of your research project and its aims/focal research questions.

    Relevant publications/outputs: When available, add links to the related publications/outputs from this data.

    Data availability statement: If your data is not linked on figshare directly, provide links to where it is being hosted here (i.e., Open Science Framework, Github, etc.). If your data is not going to be made publicly available, please provide details here as to the conditions under which interested individuals could gain access to the data and how to go about doing so.

    Data collection details: 1. When was your data collected? 2. How were your participants sampled/recruited?

    Sample information: How many and who are your participants? Demographic summaries are helpful additions to this section.

    Research Project Materials: What materials are necessary to fully reproduce the contents of your dataset? Include a list of all relevant materials (e.g., surveys, interview questions) with a brief description of what is included in each file that should be uploaded alongside your datasets.

    List of relevant datafile(s): If your project produces data that cannot be contained in a single file, list the names of each of the files here with a brief description of what parts of your research project each file is related to.

    Data codebook: What is in each column of your dataset? Provide variable names as they are encoded in your data files, verbatim question associated with each response, response options, details of any post-collection coding that has been done on the raw-response (and whether that's encoded in a separate column).

    Examples available at:
    https://www.thearda.com/data-archive?fid=PEWMU17
    https://www.thearda.com/data-archive?fid=RELLAND14

  9. Unlocking Data to Inform Public Health Policy and Practice: WP1 Mapping...

    • orda.shef.ac.uk
    • datasetcatalog.nlm.nih.gov
    xlsx
    Updated May 30, 2023
    Cite
    Mark Clowes; Anthea Sutton; Tony Stone; Matthew Franklin (2023). Unlocking Data to Inform Public Health Policy and Practice: WP1 Mapping Review Supplementary Excel S1 [Dataset]. http://doi.org/10.15131/shef.data.21222272.v1
    Available download formats: xlsx
    Dataset updated
    May 30, 2023
    Dataset provided by
    The University of Sheffield
    Authors
    Mark Clowes; Anthea Sutton; Tony Stone; Matthew Franklin
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    Unlocking Data to Inform Public Health Policy and Practice: WP1 Mapping Review Supplementary Excel S1
    The data extracted into Excel Tab "S1 Case studies (extracted)" represents information from 31 case studies as part of the "Unlocking Data to Inform Public Health Policy and Practice" project, Workpackage (WP) 1 Mapping Review. Details about the WP1 mapping review can be found in the "Unlocking Data to Inform Public Health Policy and Practice" project report, which can be found via this DOI link: https://doi.org/10.15131/shef.data.21221606

  10. Synthetic Data for an Imaginary Country, Sample, 2023 - World

    • microdata.worldbank.org
    • nada-demo.ihsn.org
    Updated Jul 7, 2023
    Cite
    Development Data Group, Data Analytics Unit (2023). Synthetic Data for an Imaginary Country, Sample, 2023 - World [Dataset]. https://microdata.worldbank.org/index.php/catalog/5906
    Dataset updated
    Jul 7, 2023
    Dataset authored and provided by
    Development Data Group, Data Analytics Unit
    Time period covered
    2023
    Area covered
    World
    Description

    Abstract

    The dataset is a relational dataset of 8,000 households, representing a sample of the population of an imaginary middle-income country. The dataset contains two data files: one with variables at the household level, the other one with variables at the individual level. It includes variables that are typically collected in population censuses (demography, education, occupation, dwelling characteristics, fertility, mortality, and migration) and in household surveys (household expenditure, anthropometric data for children, assets ownership). The data only includes ordinary households (no community households). The dataset was created using REaLTabFormer, a model that leverages deep learning methods. The dataset was created for the purpose of training and simulation and is not intended to be representative of any specific country.

    The full-population dataset (with about 10 million individuals) is also distributed as open data.

    Geographic coverage

    The dataset is a synthetic dataset for an imaginary country. It was created to represent the population of this country by province (equivalent to admin1) and by urban/rural areas of residence.

    Analysis unit

    Household, Individual

    Universe

    The dataset is a fully-synthetic dataset representative of the resident population of ordinary households for an imaginary middle-income country.

    Kind of data

    ssd

    Sampling procedure

    The sample size was set to 8,000 households. The fixed number of households to be selected from each enumeration area was set to 25. In a first stage, the number of enumeration areas to be selected in each stratum was calculated, proportional to the size of each stratum (stratification by geo_1 and urban/rural). Then 25 households were randomly selected within each enumeration area. The R script used to draw the sample is provided as an external resource.
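
    The two-stage procedure described above can be sketched as follows. This is not the R script distributed with the dataset: the largest-remainder rounding, the simple random selection of enumeration areas within a stratum, and all function and variable names are assumptions made for illustration.

```python
import random


def allocate_eas(stratum_sizes, total_households=8000, per_ea=25):
    """Stage 1: share out enumeration areas (EAs) proportionally to stratum size.

    8,000 households at 25 per EA gives 320 EAs. Fractional quotas are
    rounded with the largest-remainder method (an assumption).
    """
    n_eas = total_households // per_ea
    total = sum(stratum_sizes.values())
    quotas = {s: n_eas * size / total for s, size in stratum_sizes.items()}
    alloc = {s: int(q) for s, q in quotas.items()}
    leftover = n_eas - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda s: quotas[s] - alloc[s], reverse=True)
    for stratum in by_remainder[:leftover]:
        alloc[stratum] += 1
    return alloc


def draw_sample(frame, alloc, per_ea=25, seed=0):
    """Stage 2: pick the allocated EAs, then per_ea households inside each.

    frame maps stratum -> {ea_id: [household ids]}. EAs are drawn by simple
    random sampling here, which is an assumption.
    """
    rng = random.Random(seed)
    sample = []
    for stratum, eas in frame.items():
        for ea_id in rng.sample(sorted(eas), alloc[stratum]):
            sample.extend(rng.sample(eas[ea_id], per_ea))
    return sample


if __name__ == "__main__":
    # Hypothetical stratum sizes (province x urban/rural), in households.
    sizes = {"p1_urban": 5000, "p1_rural": 3000, "p2_urban": 1500, "p2_rural": 500}
    print(allocate_eas(sizes))
```

    The fixed take of 25 households per enumeration area is what ties the two stages together: the number of areas selected, not the number of households per area, carries the proportional allocation.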

    Mode of data collection

    other

    Research instrument

    The dataset is a synthetic dataset. Although the variables it contains are variables typically collected from sample surveys or population censuses, no questionnaire is available for this dataset. A "fake" questionnaire was however created for the sample dataset extracted from this dataset, to be used as training material.

    Cleaning operations

    The synthetic data generation process included a set of "validators" (consistency checks, based on which synthetic observations were assessed and rejected/replaced when needed). Also, some post-processing was applied to the data to result in the distributed data files.

    Response rate

    This is a synthetic dataset; the "response rate" is 100%.

  11. amazon-product-data-sample

    • huggingface.co
    Cite
    Iftach Arbel, amazon-product-data-sample [Dataset]. https://huggingface.co/datasets/iarbel/amazon-product-data-sample
    Authors
    Iftach Arbel
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    Dataset Card for "amazon-product-data-filter"

      Dataset Summary
    

    The Amazon Product Dataset contains product listing data from the Amazon US website. It can be used for various NLP and classification tasks, such as text generation, product type classification, attribute extraction, image recognition and more. NOTICE: This is a sample of the full Amazon Product Dataset, which contains 1K examples. Follow the link to gain access to the full dataset.

      Languages… See the full description on the dataset page: https://huggingface.co/datasets/iarbel/amazon-product-data-sample.
    
  12. Cleaning Biodiversity Data: A Botanical Example Using Excel or RStudio

    • qubeshub.org
    Updated Jul 16, 2020
    Cite
    Shelly Gaynor (2020). Cleaning Biodiversity Data: A Botanical Example Using Excel or RStudio [Dataset]. http://doi.org/10.25334/DRGD-F069
    Dataset updated
    Jul 16, 2020
    Dataset provided by
    QUBES
    Authors
    Shelly Gaynor
    Description

    Access and clean an open source herbarium dataset using Excel or RStudio.

  13. Cloud-based User Entity Behavior Analytics Log Data Set

    • data.niaid.nih.gov
    • zenodo.org
    Updated Oct 30, 2023
    Cite
    Landauer, Max; Skopik, Florian; Höld, Georg; Wurzenberger, Markus (2023). Cloud-based User Entity Behavior Analytics Log Data Set [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7119952
    Dataset updated
    Oct 30, 2023
    Dataset provided by
    AIT Austrian Institute of Technology
    Authors
    Landauer, Max; Skopik, Florian; Höld, Georg; Wurzenberger, Markus
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This repository contains the CLUE-LDS (CLoud-based User Entity behavior analytics Log Data Set). The data set contains log events from real users utilizing a cloud storage suitable for User Entity Behavior Analytics (UEBA). Events include logins, file accesses, link shares, config changes, etc. The data set contains around 50 million events generated by more than 5000 distinct users in more than five years (2017-07-07 to 2022-09-29 or 1910 days). The data set is complete except for 109 events missing on 2021-04-22, 2021-08-20, and 2021-09-05 due to database failure. The unpacked file size is around 14.5 GB. A detailed analysis of the data set is provided in [1]. The logs are provided in JSON format with the following attributes in the first level:

    - id: Unique log line identifier that starts at 1 and increases incrementally, e.g., 1.
    - time: Time stamp of the event in ISO format, e.g., 2021-01-01T00:00:02Z.
    - uid: Unique anonymized identifier for the user generating the event, e.g., old-pink-crane-sharedealer.
    - uidType: Specifier for uid, which is either the user name or IP address for logged out users.
    - type: The action carried out by the user, e.g., file_accessed.
    - params: Additional event parameters (e.g., paths, groups) stored in a nested dictionary.
    - isLocalIP: Optional flag for event origin, which is either internal (true) or external (false).
    - role: Optional user role: consulting, administration, management, sales, technical, or external.
    - location: Optional IP-based geolocation of event origin, including city, country, longitude, latitude, etc.

    In the following data sample, the first object depicts a successful user login (see type: login_successful) and the second object depicts a file access (see type: file_accessed) from a remote location:

    {"params": {"user": "intact-gray-marlin-trademarkagent"}, "type": "login_successful", "time": "2019-11-14T11:26:43Z", "uid": "intact-gray-marlin-trademarkagent", "id": 21567530, "uidType": "name"}

    {"isLocalIP": false, "params": {"path": "/proud-copper-orangutan-artexer/doubtful-plum-ptarmigan-merchant/insufficient-amaranth-earthworm-qualitycontroller/curious-silver-galliform-tradingstandards/incredible-indigo-octopus-printfinisher/wicked-bronze-sloth-claimsmanager/frantic-aquamarine-horse-cleric"}, "type": "file_accessed", "time": "2019-11-14T11:26:51Z", "uid": "graceful-olive-spoonbill-careersofficer", "id": 21567531, "location": {"countryCode": "AT", "countryName": "Austria", "region": "4", "city": "Gmunden", "latitude": 47.915, "longitude": 13.7959, "timezone": "Europe/Vienna", "postalCode": "4810", "metroCode": null, "regionName": "Upper Austria", "isInEuropeanUnion": true, "continent": "Europe", "accuracyRadius": 50}, "uidType": "ipaddress"}

    The data set was generated at the premises of Huemer Group, a midsize IT service provider located in Vienna, Austria. Huemer Group offers a range of Infrastructure-as-a-Service solutions for enterprises, including cloud computing and storage. In particular, their cloud storage solution called hBOX enables customers to upload their data, synchronize them with multiple devices, share files with others, create versions and backups of their documents, collaborate with team members in shared data spaces, and query the stored documents using search terms. The hBOX extends the open-source project Nextcloud with interfaces and functionalities tailored to the requirements of customers. The data set comprises only normal user behavior, but can be used to evaluate anomaly detection approaches by simulating account hijacking. We provide an implementation for identifying similar users, switching pairs of users to simulate changes of behavior patterns, and a sample detection approach in our GitHub repo.

    Acknowledgements: Partially funded by the FFG project DECEPT (873980). The authors thank Walter Huemer, Oskar Kruschitz, Kevin Truckenthanner, and Christian Aigner from Huemer Group for supporting the collection of the data set.

    If you use the dataset, please cite the following publication: [1] M. Landauer, F. Skopik, G. Höld, and M. Wurzenberger. "A User and Entity Behavior Analytics Log Data Set for Anomaly Detection in Cloud Computing". 2022 IEEE International Conference on Big Data - 6th International Workshop on Big Data Analytics for Cyber Intelligence and Defense (BDA4CID 2022), December 17-20, 2022, Osaka, Japan. IEEE.
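
    As a rough illustration of working with the first-level attributes listed above, the sketch below parses log events and filters them by type and origin. It assumes the file stores one JSON object per line, which the description does not state. The first sample line is the login event quoted above; the second is the file access event with its path and location abbreviated for brevity. The function names are hypothetical.

```python
import json
from collections import Counter
from datetime import datetime

# Sample events: the first is quoted from the description, the second is the
# file access event with "path" and "location" shortened for readability.
LOG_LINES = [
    '{"params": {"user": "intact-gray-marlin-trademarkagent"}, "type": "login_successful", "time": "2019-11-14T11:26:43Z", "uid": "intact-gray-marlin-trademarkagent", "id": 21567530, "uidType": "name"}',
    '{"isLocalIP": false, "params": {"path": "/proud-copper-orangutan-artexer"}, "type": "file_accessed", "time": "2019-11-14T11:26:51Z", "uid": "graceful-olive-spoonbill-careersofficer", "id": 21567531, "location": {"countryCode": "AT", "city": "Gmunden"}, "uidType": "ipaddress"}',
]


def parse_events(lines):
    """Yield one dict per non-empty line (assumes one JSON object per line)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_by_type(events):
    """Count events per action type, e.g. login_successful or file_accessed."""
    return Counter(event["type"] for event in events)


def external_events(events):
    """Keep events flagged as external; isLocalIP is optional, so use .get()."""
    return [event for event in events if event.get("isLocalIP") is False]


def event_time(event):
    """Parse the ISO time stamp format shown in the samples."""
    return datetime.strptime(event["time"], "%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    events = list(parse_events(LOG_LINES))
    print(count_by_type(events))
    print([event["id"] for event in external_events(events)])
```

    Because the unpacked file is around 14.5 GB, a generator such as parse_events is the safer pattern: passing it an open file handle streams events one at a time instead of loading everything into memory.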

  14. Easing into Excellent Excel Practices Learning Series / Série...

    • borealisdata.ca
    • search.dataone.org
    Updated Nov 15, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Julie Marcoux (2023). Easing into Excellent Excel Practices Learning Series / Série d'apprentissages en route vers des excellentes pratiques Excel [Dataset]. http://doi.org/10.5683/SP3/WZYO1F
    Dataset updated
    Nov 15, 2023
    Dataset provided by
    Borealis
    Authors
    Julie Marcoux
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    With a step-by-step approach, learn to prepare Excel files, data worksheets, and individual data columns for data analysis; practice conditional formatting and creating pivot tables/charts; go over basic principles of Research Data Management as they might apply to an Excel project.

  15. Snowplow Modeled Customer Data Sample

    • snowplow.io
    Cite
    Snowplow Analytics, Snowplow Modeled Customer Data Sample [Dataset]. https://snowplow.io/explore-snowplow-data-part-2
    Dataset authored and provided by
    Snowplow Analytics
    Time period covered
    Apr 1, 2020 - Apr 3, 2020
    Variables measured
    user_id, mkt_source, page_views, session_id, conversions, geo_country, device_class, mkt_campaign, session_length, time_engaged_in_s
    Description

    Example of modeled customer behavioral data showing user sessions, engagement metrics, and conversion data across multiple platforms and devices

  16. Investigating Data Assets, Management, and Planning at UF

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    Updated Jan 24, 2020
    Cite
    Smith, Plato (2020). Investigating Data Assets, Management, and Planning at UF [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1243282
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    University of Florida
    Authors
    Smith, Plato
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset represents a data assessment of select researchers across multiple communities of practice at the University of Florida as part of an IRB 201602303 study to investigate the data management practices, storage, and training needs of researchers. The study was conducted from January 3, 2017 to April 30, 2017. There were 159 starts, 156 informed consents, and 133 completes, for an 83% completion rate. However, Question 26, which contained PID, was deleted from this raw dataset.

  17. Blog | Innovative Care Models and Uses of Clinical Practice Data – the...

    • catalog.data.gov
    • data.virginia.gov
    Updated Mar 26, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    HHS Office of the Chief Data Officer (2025). Blog | Innovative Care Models and Uses of Clinical Practice Data – the future of Medicine [Dataset]. https://catalog.data.gov/dataset/blog-innovative-care-models-and-uses-of-clinical-practice-data-the-future-of-medicine
    Dataset updated
    Mar 26, 2025
    Dataset provided by
    United States Department of Health and Human Services (http://www.hhs.gov/)
    Description

    This blog post was posted on January 28, 2013.

  18. LinkedIn Data Analyst Course OrderData Dataset

    • kaggle.com
    zip
    Updated Nov 6, 2022
    Cite
    Shokolatte Tachikawa (2022). LinkedIn Data Analyst Course OrderData Dataset [Dataset]. https://www.kaggle.com/datasets/showeenyc/linkedin-data-analyst-course-orderdata-dataset
    Available download formats: zip (264482 bytes)
    Dataset updated
    Nov 6, 2022
    Authors
    Shokolatte Tachikawa
    Description

    Practice Dataset From LinkedIn Course >>> Learning Data Analytics: 1 Foundations By: Robin Hunt

    https://www.linkedin.com/learning/learning-data-analytics-1-foundations?contextUrn=urn%3Ali%3AlyndaLearningPath%3A5ec59c4a498e70845153bbc5

  19. Analysis Practice Data

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 8, 2023
    Cite
    Arshad, Abdul Rehman (2023). Analysis Practice Data [Dataset]. http://doi.org/10.7910/DVN/R1VIPU
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Arshad, Abdul Rehman
    Description

    This data set comes as a supplementary resource for my book on Biostatistics and SPSS. Readers are free to download this file and practice using SPSS as they go along reading the book.

  20. DOB sample data

    • data.cityofnewyork.us
    • data.wu.ac.at
    csv, xlsx, xml
    Updated Dec 2, 2025
    Cite
    Department of Buildings (DOB) (2025). DOB sample data [Dataset]. https://data.cityofnewyork.us/Housing-Development/DOB-sample-data/bkyx-e5n5
    Available download formats: xml, xlsx, csv
    Dataset updated
    Dec 2, 2025
    Authors
    Department of Buildings (DOB)
    Description

    A list of complaints received and associated data. Prior monthly reports are archived at DOB and are not available on NYC Open Data.
