87 datasets found
  1. CSV file used in statistical analyses

    • data.csiro.au
    • researchdata.edu.au
    Updated Oct 13, 2014
    Cite
    CSIRO (2014). CSV file used in statistical analyses [Dataset]. http://doi.org/10.4225/08/543B4B4CA92E6
    Dataset updated
    Oct 13, 2014
    Dataset authored and provided by
    CSIRO (http://www.csiro.au/)
    License

    https://research.csiro.au/dap/licences/csiro-data-licence/

    Time period covered
    Mar 14, 2008 - Jun 9, 2009
    Dataset funded by
    CSIRO (http://www.csiro.au/)
    Description

    A csv file containing the tidal frequencies used for statistical analyses in the paper "Estimating Freshwater Flows From Tidally-Affected Hydrographic Data" by Dan Pagendam and Don Percival.

  2. Adventure Works 2022 CSVs

    • kaggle.com
    zip
    Updated Nov 2, 2022
    Cite
    Algorismus (2022). Adventure Works 2022 CSVs [Dataset]. https://www.kaggle.com/datasets/algorismus/adventure-works-in-excel-tables
    Available download formats: zip (567646 bytes)
    Dataset updated
    Nov 2, 2022
    Authors
    Algorismus
    License

    http://www.gnu.org/licenses/lgpl-3.0.html

    Description

    Adventure Works 2022 dataset

    How was this dataset created?

    On the official website, the dataset is available through a SQL Server instance (localhost) and as CSVs used via Power BI Desktop running on a virtual lab (virtual machine). The first two steps of importing the data were executed in the virtual lab, and the resulting Power BI tables were copied into CSVs. Records were added up to the year 2022 as required.

    How might this dataset help you?

    This dataset is helpful if you want to work offline with Adventure Works data in Power BI Desktop to carry out the lab instructions from the training material on the official website. It is also useful for the Power BI Desktop Sales Analysis example from Microsoft's PL-300 learning path.

    How do you use this dataset?

    Download the CSV file(s) and import them into Power BI Desktop as tables. The CSVs are named after the tables created in the first two steps of importing data, as described in the PL-300 Microsoft Power BI Data Analyst exam lab.

  3. Data from: Sample CSV file

    • mygeodata.cloud
    Updated Jul 9, 2025
    Cite
    (2025). Sample CSV file [Dataset]. https://mygeodata.cloud/converter/asc-to-csv
    Dataset updated
    Jul 9, 2025
    Description

    Sample data in CSV - Comma Separated Values format available for download for testing purposes.

  4. train csv file

    • kaggle.com
    zip
    Updated May 5, 2018
    Cite
    Emmanuel Arias (2018). train csv file [Dataset]. https://www.kaggle.com/datasets/eamanu/train
    Available download formats: zip (33695 bytes)
    Dataset updated
    May 5, 2018
    Authors
    Emmanuel Arias
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Dataset

    This dataset was created by Emmanuel Arias

    Released under the Open Database License (database) and the Database Contents License (contents)


  5. Event Logs CSV

    • figshare.com
    rar
    Updated Dec 9, 2019
    Cite
    Dina Bayomie (2019). Event Logs CSV [Dataset]. http://doi.org/10.6084/m9.figshare.11342063.v1
    Available download formats: rar
    Dataset updated
    Dec 9, 2019
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Dina Bayomie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The event logs in CSV format. The dataset contains both correlated and uncorrelated logs.

  6. MOT testing data for Great Britain

    • s3.amazonaws.com
    • gov.uk
    Updated Mar 24, 2022
    Cite
    Driver and Vehicle Standards Agency (2022). MOT testing data for Great Britain [Dataset]. https://s3.amazonaws.com/thegovernmentsays-files/content/179/1797262.html
    Dataset updated
    Mar 24, 2022
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Driver and Vehicle Standards Agency
    Area covered
    United Kingdom, Great Britain
    Description

    About this data set

    This data set comes from data held by the Driver and Vehicle Standards Agency (DVSA).

    It is not classed as an ‘official statistic’. This means it’s not subject to scrutiny and assessment by the UK Statistics Authority.

    MOT test results by class

    The MOT test checks that your vehicle meets road safety and environmental standards. Different types of vehicles (for example, cars and motorcycles) fall into different ‘classes’.

    This data table shows the number of initial tests. It does not include abandoned tests, aborted tests, or retests.

    The initial fail rate is the rate for vehicles as they were brought for the MOT. The final fail rate excludes vehicles that pass the test after rectification of minor defects at the time of the test.

    This data table is updated every 3 months.


    Initial failures by defect category

    These tables give data for the following classes of vehicles:

    • class 1 and 2 vehicles - motorcycles
    • class 3 and 4 vehicles - cars and light vans up to 3,000kg
    • class 5 vehicles - private passenger vehicles with more than 12 seats
    • class 7 vehicles - goods vehicles between 3,000kg and 3,500kg gross vehicle weight

    All figures are for vehicles as they were brought in for the MOT.

    A failed test usually has multiple failure items.

    The percentage of tests is worked out as the number of tests with one or more failure items in the defect as a percentage of total tests.

    The percentage of defects is worked out as the total defects in the category as a percentage of total defects for all categories.

    The average defects per initial test failure is worked out as the total failure items as a percentage of total tests failed plus tests that passed after rectification of a minor defect at the time of the test.
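    The three derived figures above can be expressed directly in code. A short sketch with made-up counts, purely to make the definitions concrete (these are not real DVSA figures):

```python
# Made-up counts for one defect category, purely to make the definitions
# above concrete (not real DVSA figures).
total_tests = 1000
tests_with_defect = 250        # tests with >= 1 failure item in the category
defects_in_category = 400      # failure items recorded in the category
total_defects = 1600           # failure items across all categories
failed_or_rectified = 500      # tests failed + passed after rectification

pct_of_tests = 100 * tests_with_defect / total_tests           # 25.0
pct_of_defects = 100 * defects_in_category / total_defects     # 25.0
avg_defects_per_failure = total_defects / failed_or_rectified  # 3.2
print(pct_of_tests, pct_of_defects, avg_defects_per_failure)
```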

    These data tables are updated every 3 months.


    MOT class 3 and 4 vehicles: initial failures by defect category

  7. Amazon ML Challenge 25 Dataset

    • kaggle.com
    zip
    Updated Oct 10, 2025
    Cite
    AlienXc137 (2025). Amazon ML Challenge 25 Dataset [Dataset]. https://www.kaggle.com/datasets/alienxc137/amazonml25
    Available download formats: zip (49977798 bytes)
    Dataset updated
    Oct 10, 2025
    Authors
    AlienXc137
    Description

    Amazon ML Challenge 2025 Problem Statement

    Smart Product Pricing Challenge

    In e-commerce, determining the optimal price point for products is crucial for marketplace success and customer satisfaction. Your challenge is to develop an ML solution that analyzes product details and predicts the price of the product. The relationship between product attributes and pricing is complex, with factors like brand, specifications, and product quantity directly influencing pricing. Your task is to build a model that can analyze these product details holistically and suggest an optimal price.

    Data Description: The dataset consists of the following columns:

    1. sample_id: A unique identifier for the input sample
    2. catalog_content: Text field containing the title, product description, and an Item Pack Quantity (IPQ), concatenated
    3. image_link: Public URL where the product image is available for download. Example link - https://m.media-amazon.com/images/I/71XfHPR36-L.jpg. To download images, use the download_images function from src/utils.py. See sample code in src/test.ipynb.
    4. price: Price of the product (target variable - only available in training data)

    Dataset Details:

    • Training Dataset: 75k products with complete product details and prices
    • Test Set: 75k products for final evaluation

    Output Format: The output file should be a CSV with 2 columns:

    sample_id: The unique identifier of the data sample. Note the ID should match the test record sample_id.

    price: A float value representing the predicted price of the product.

    Note: Make sure to output a prediction for all sample IDs. If the output file has fewer or more samples than test.csv, it won't be evaluated.
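    Producing a correctly shaped output file is only a few lines once predictions exist. A sketch (the sample IDs and prices here are dummies for illustration; in practice read the IDs from dataset/test.csv so each one is covered exactly once):

```python
import csv
import io

# Dummy sample IDs and predicted prices for illustration; in practice the
# IDs come from dataset/test.csv and the prices from your model.
sample_ids = [101, 102, 103]
predictions = [19.99, 5.49, 120.0]   # predicted prices: positive floats

out = io.StringIO()                  # stands in for open("test_out.csv", "w")
writer = csv.writer(out)
writer.writerow(["sample_id", "price"])
for sid, price in zip(sample_ids, predictions):
    writer.writerow([sid, f"{price:.2f}"])

print(out.getvalue().splitlines()[0])  # sample_id,price
```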

    File Descriptions:

    Source files

    1. src/utils.py: Contains helper functions for downloading images from the image_link. You may need to retry a few times to download all images due to possible throttling issues.

    2. sample_code.py: Sample dummy code that can generate an output file in the given format. Usage of this file is optional.

    Dataset files

    dataset/train.csv: Training file with labels (price).

    dataset/test.csv: Test file without output labels (price). Generate predictions using your model/solution on this file's data and format the output file to match sample_test_out.csv.

    dataset/sample_test.csv: Sample test input file.

    dataset/sample_test_out.csv: Sample outputs for sample_test.csv. The output for test.csv must be formatted in the exact same way. Note: the predictions in this file might not be correct.

    Constraints:

    You will be provided with a sample output file. Format your output to match the sample output file exactly.

    Predicted prices must be positive float values.

    Final model should be a MIT/Apache 2.0 License model and up to 8 Billion parameters.

    Evaluation Criteria:

    Submissions are evaluated using Symmetric Mean Absolute Percentage Error (SMAPE): A statistical measure that expresses the relative difference between predicted and actual values as a percentage, while treating positive and negative errors equally.

    Formula:

    SMAPE = (1/n) * Σ |predicted_price - actual_price| / ((|actual_price| + |predicted_price|) / 2)

    Example: if actual price = $100 and predicted price = $120, then SMAPE = |100 - 120| / ((|100| + |120|) / 2) * 100% = 18.18%

    Note: SMAPE is bounded between 0% and 200%. Lower values indicate better performance.

    Leaderboard Information:
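    For concreteness, the metric can be sketched in a few lines of Python (an illustration, not the official scoring code):

```python
def smape(actual, predicted):
    """Symmetric Mean Absolute Percentage Error, in percent (bounded 0-200)."""
    assert len(actual) == len(predicted) and actual
    total = 0.0
    for a, p in zip(actual, predicted):
        # Each term treats over- and under-prediction symmetrically.
        total += abs(p - a) / ((abs(a) + abs(p)) / 2)
    return 100.0 * total / len(actual)

# Worked example from the description: actual $100, predicted $120.
print(round(smape([100.0], [120.0]), 2))  # 18.18
```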

    Public Leaderboard: During the challenge, rankings will be based on 25K samples from the test set to provide real-time feedback on your model's performance. Final Rankings: The final decision will be based on performance on the complete 75K test set along with provided documentation of the proposed approach by the teams.

    Submission Requirements:

    Upload a test_out.csv file in the Portal with the exact same formatting as sample_test_out.csv

    All participating teams must also provide a 1-page document describing:

    • Methodology used
    • Model architecture/algorithms selected
    • Feature engineering techniques applied
    • Any other relevant information about the approach
    • Note: A sample template for this documentation is provided in Documentation_template.md

    Academic Integrity and Fair Play

    ⚠ STRICTLY PROHIBITED: External Price Lookup

    Participants are STRICTLY NOT ALLOWED to obtain prices from the internet, external databases, or any sources outside the provided dataset. This includes but is not limited to:

    • Web scraping product prices from e-commerce websites
    • Using APIs to fetch current market prices
    • Manual price lookup from online sources
    • Using any external pricing databases or services

    Enforcement:

    All submitted approaches, methodologies, and code pipelines will be thoroughly reviewed and verified. Any evidence of external price lookup or data augmentation from internet sources will result in immediate d...

  8. 1000 Empirical Time series

    • figshare.com
    • bridges.monash.edu
    png
    Updated May 30, 2023
    Cite
    Ben Fulcher (2023). 1000 Empirical Time series [Dataset]. http://doi.org/10.6084/m9.figshare.5436136.v10
    Available download formats: png
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Ben Fulcher
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A diverse selection of 1000 empirical time series, along with results of an hctsa feature extraction, using v1.06 of hctsa and Matlab 2019b, computed on a server at The University of Sydney.

    The results of the computation are in the hctsa file, HCTSA_Empirical1000.mat, for use in Matlab with v1.06 of hctsa. The same data is also provided in .csv format: hctsa_datamatrix.csv (results of feature computation), with information about rows (time series) in hctsa_timeseries-info.csv, information about columns (features) in hctsa_features.csv (and the corresponding hctsa code used to compute each feature in hctsa_masterfeatures.csv). The data of the individual time series (each line a time series, as described in hctsa_timeseries-info.csv) is in hctsa_timeseries-data.csv. These .csv files were produced by running >> OutputToCSV(HCTSA_Empirical1000.mat,true,true); in hctsa.

    The input file, INP_Empirical1000.mat, is for use with hctsa and contains the time-series data and metadata for the 1000 time series. For example, massive feature extraction from these data on the user's machine, using hctsa, can proceed as >> TS_Init('INP_Empirical1000.mat');

    Some visualizations of the dataset are in CarpetPlot.png (first 1000 samples of all time series as a carpet (color) plot) and 150TS-250samples.png (conventional time-series plots of the first 250 samples of a sample of 150 time series from the dataset). More visualizations can be performed by the user using TS_PlotTimeSeries from the hctsa package.

    See links in references for more comprehensive documentation on performing methodological comparison using this dataset, and on how to download and use v1.06 of hctsa.

  9. CIFAR-10 Python in CSV

    • kaggle.com
    zip
    Updated Jun 22, 2021
    Cite
    fedesoriano (2021). CIFAR-10 Python in CSV [Dataset]. https://www.kaggle.com/fedesoriano/cifar10-python-in-csv
    Available download formats: zip (218807675 bytes)
    Dataset updated
    Jun 22, 2021
    Authors
    fedesoriano
    Description

    Context

    The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. The classes are completely mutually exclusive. There are 50000 training images and 10000 test images.

    The batches.meta file contains the label names of each class.

    The dataset was originally divided into 5 training batches with 10000 images per batch. The original dataset can be found here: https://www.cs.toronto.edu/~kriz/cifar.html. This dataset contains all the training data and test data in the same CSV file, so it is easier to load.
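    If you work from this CSV rather than the pickled batches, loading reduces to one pandas call. A minimal sketch using an inline two-row stand-in (the real upload's file name and exact column names, e.g. a "label" column plus 3072 pixel columns, are assumptions to verify after download):

```python
import io

import pandas as pd

# Two-row stand-in for the real file; for the actual data you would call
# pd.read_csv("train.csv") on the downloaded CSV instead.
demo = io.StringIO("label,pixel0,pixel1\n6,59,43\n9,154,126\n")
df = pd.read_csv(demo)

X = df.drop(columns=["label"]).to_numpy()  # pixel values, one row per image
y = df["label"].to_numpy()                 # class indices 0-9
print(X.shape, y.shape)  # (2, 2) (2,)
```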

    Content

    Here is the list of the 10 classes in the CIFAR-10:

    Classes:

    • 0: airplane
    • 1: automobile
    • 2: bird
    • 3: cat
    • 4: deer
    • 5: dog
    • 6: frog
    • 7: horse
    • 8: ship
    • 9: truck

    Acknowledgements

    • Learning Multiple Layers of Features from Tiny Images, Alex Krizhevsky, 2009. Link

    How to load the batches.meta file (Python)

    The function used to open the file:

        def unpickle(file):
            import pickle
            with open(file, 'rb') as fo:
                dict = pickle.load(fo, encoding='bytes')
            return dict

    Example of how to read the file:

        metadata_path = './cifar-10-python/batches.meta'  # change this path
        metadata = unpickle(metadata_path)

  10. Human Resources.csv

    • figshare.com
    csv
    Updated Apr 11, 2025
    Cite
    anurag pardiash (2025). Human Resources.csv [Dataset]. http://doi.org/10.6084/m9.figshare.28780886.v1
    Available download formats: csv
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    anurag pardiash
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset, titled Human Resources.csv, contains anonymized employee data collected for internal HR analysis and research purposes. It includes fields such as employee ID, department, gender, age, job role, and employment status. The data can be used for workforce trend analysis, HR benchmarking, diversity studies, and training models in human resource analytics. The file is provided in CSV format (3.05 MB) and adheres to general data privacy standards, with no personally identifiable information (PII). Last updated: April 11, 2025. Uploaded by Anurag Pardiash.

  11. Full oral and gene database (csv format)

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    application/gzip
    Updated May 22, 2019
    Cite
    Braden Tierney (2019). Full oral and gene database (csv format) [Dataset]. http://doi.org/10.6084/m9.figshare.8001362.v1
    Available download formats: application/gzip
    Dataset updated
    May 22, 2019
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Braden Tierney
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is our complete database in csv format (with gene names, IDs, annotations, lengths, cluster sizes, and taxonomic classifications) that can be queried on our website. The difference is that it does not have the sequences – those can be downloaded in other files on figshare. This file, as well as those, can be parsed and linked by the gene identifier. We recommend downloading this database and parsing it yourself if you attempt to run a query that is too large for our servers to handle.
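    Linking this table to the separately downloaded sequence files by gene identifier is a straightforward join. A sketch over synthetic stand-ins (the column names, including "gene_id", are assumptions for illustration):

```python
import io

import pandas as pd

# Stand-ins for the annotation database and a sequence file, both keyed by
# a gene identifier column (names assumed for illustration).
db = io.StringIO("gene_id,annotation,length\ng1,kinase,900\ng2,unknown,300\n")
seqs = io.StringIO("gene_id,sequence\ng1,ATGGCT\ng2,ATGAAA\n")

# Join the two tables on the shared gene identifier.
merged = pd.read_csv(db).merge(pd.read_csv(seqs), on="gene_id")
print(list(merged.columns))  # ['gene_id', 'annotation', 'length', 'sequence']
```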

  12. Sample WKT file

    • mygeodata.cloud
    Updated Jul 11, 2025
    Cite
    (2025). Sample WKT file [Dataset]. https://mygeodata.cloud/converter/asc-to-wkt
    Dataset updated
    Jul 11, 2025
    Description

    Sample data in WKT - Well-known text (.csv + WKT column) format available for download for testing purposes.

  13. can-train-and-test

    • data.dtu.dk
    zip
    Updated Dec 15, 2023
    Cite
    Brooke Elizabeth Kidmose (2023). can-train-and-test [Dataset]. http://doi.org/10.11583/DTU.24805533.v1
    Available download formats: zip
    Dataset updated
    Dec 15, 2023
    Dataset provided by
    Technical University of Denmark
    Authors
    Brooke Elizabeth Kidmose
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository provides controller area network (CAN) datasets for the training and testing of machine learning schemes. The datasets are derived from the can-dataset and can-ml repositories.

    The repository contains CAN traffic for the 2017 Subaru Forester, the 2016 Chevrolet Silverado, the 2011 Chevrolet Traverse, and the 2011 Chevrolet Impala. For each vehicle, there are samples of attack-free traffic (that is, normal traffic) as well as samples of various types of attacks. The samples are stored in comma-separated values (CSV) format. All of the samples are labeled; attack frames are assigned "1," while attack-free frames are designated "0."

    The repository has been curated into four sub-datasets, dubbed "set_01," "set_02," "set_03," and "set_04." For each sub-dataset, there are five subsets: one training subset and four testing subsets. Each subset contains both attack-free and attack data.

    Training/testing subsets:

    • train_01: Train the model
    • test_01_known_vehicle_known_attack: Test the model against a known vehicle (seen in training) and known attacks (seen in training)
    • test_02_unknown_vehicle_known_attack: Test the model against an unknown vehicle (not seen in training) and known attacks (seen in training)
    • test_03_known_vehicle_unknown_attack: Test the model against a known vehicle (seen in training) and unknown attacks (not seen in training)
    • test_04_unknown_vehicle_unknown_attack: Test the model against an unknown vehicle (not seen in training) and unknown attacks (not seen in training)

    The known/unknown attacks are identified by the file names (e.g., DoS, fuzzing, etc.). The known/unknown vehicles are as follows:

    • set_01: known vehicle: Chevrolet Impala; unknown vehicle: Chevrolet Silverado
    • set_02: known vehicle: Chevrolet Traverse; unknown vehicle: Subaru Forester
    • set_03: known vehicle: Chevrolet Silverado; unknown vehicle: Subaru Forester
    • set_04: known vehicle: Subaru Forester; unknown vehicle: Chevrolet Traverse
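    Because every frame carries a 0/1 label, separating attack from attack-free traffic is a simple filter once a subset is loaded. A sketch with a synthetic stand-in (the column names used here, including "label", are assumptions; the repository documents the actual schema):

```python
import io

import pandas as pd

# Stand-in for e.g. a file under set_01/train_01/; the column names
# ("timestamp", "id", "label") are assumptions for illustration.
demo = io.StringIO(
    "timestamp,id,label\n"
    "0.001,0x0C1,0\n"
    "0.002,0x0C1,1\n"
    "0.003,0x1A0,0\n"
)
df = pd.read_csv(demo)

attacks = df[df["label"] == 1]   # frames labeled "1" are attack frames
normal = df[df["label"] == 0]    # "0" marks attack-free traffic
print(len(attacks), len(normal))  # 1 2
```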

  14. ECG in High Intensity Exercise Dataset

    • zenodo.org
    • opendatalab.com
    zip
    Updated Dec 26, 2021
    Cite
    Elisabetta De Giovanni; Tomas Teijeiro; David Meier; Grégoire Millet; David Atienza (2021). ECG in High Intensity Exercise Dataset [Dataset]. http://doi.org/10.5281/zenodo.5727800
    Available download formats: zip
    Dataset updated
    Dec 26, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Elisabetta De Giovanni; Tomas Teijeiro; David Meier; Grégoire Millet; David Atienza
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The data presented here was extracted from a larger dataset collected through a collaboration between the Embedded Systems Laboratory (ESL) of the Swiss Federal Institute of Technology in Lausanne (EPFL), Switzerland and the Institute of Sports Sciences of the University of Lausanne (ISSUL). In this dataset, we report the extracted segments used for an analysis of R peak detection algorithms during high intensity exercise.

    Protocol of the experiments
    The protocol of the experiment was the following.

    • 22 subjects performing a cardio-pulmonary maximal exercise test on a cycle ergometer, using a gas mask. A single-lead electrocardiogram (ECG) was measured using the BIOPAC system.
    • An initial 3 min of rest were recorded.
    • After this baseline, the subjects started cycling at a power of 60W or 90W depending on their fitness level.
    • Then, the power of the cycle ergometer was increased by 30W every 3 min till exhaustion (in terms of maximum oxygen uptake or VO2max).
    • Finally, physiology experts assessed the so-called ventilatory thresholds and the VO2max based on the pulmonary data (volume of oxygen and CO2).

    Description of the extracted dataset

    The characteristics of the dataset are the following:

    • We report only 20 out of 22 subjects that were used for the analysis, because for two subjects the signals were too corrupted or not complete. Specifically, subjects 5 and 12 were discarded.
    • The ECG signal was sampled at 500 Hz and then downsampled to 250 Hz. The original ECG signals were measured at a maximum of 10 mV. Then, they were scaled down by a factor of 1000; hence the data is represented in uV.
    • For each subject, 5 segments of 20 s were extracted from the ECG recordings and chosen based on different phases of the maximal exercise test (i.e., before and after the so-called second ventilatory threshold or VT2, before and in the middle of VO2max, and during the recovery after exhaustion) to represent different intensities of physical activity.

    seg1 --> [VT2-50,VT2-30]
    seg2 --> [VT2+60,VT2+80]
    seg3 --> [VO2max-50,VO2max-30]
    seg4 --> [VO2max-10,VO2max+10]
    seg5 --> [VO2max+60,VO2max+80]

    • The R peak locations were manually annotated in all segments and reviewed by a physician of the Lausanne University Hospital, CHUV. Only segment 5 of subject 9 could not be annotated since there was a problem with the input signal. So, the total number of segments extracted were 20 * 5 - 1 = 99.

    Format of the extracted dataset

    The dataset is divided in two main folders:

    • The folder `ecg_segments/` contains the ECG signals saved in two formats, `.csv` and `.mat`. This folder includes both raw (`ecg_raw`) and processed (`ecg`) signals. The processing consists of a morphological filtering and a relative energy non filtering method to enhance the R peaks. The `.csv` files contain only the signal, while the `.mat` files include the signal, the time vector within the maximal stress test, the sampling frequency and the unit of the signal amplitude (uV, as we mentioned before).
    • The folder `manual_annotations/` contains the sample indices of the annotated R peaks in `.csv` format. The annotation was done on the processed signals.
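    Reading one segment together with its annotations then takes two plain CSV reads. A sketch with synthetic stand-in data (only the folder layout above comes from the dataset; the per-file naming and single-column layout are assumptions):

```python
import csv
import io

# Stand-ins for a processed segment under ecg_segments/ (signal values in uV,
# one per line) and its file under manual_annotations/ (sample indices of the
# manually annotated R peaks). Layout assumed for illustration.
ecg_csv = io.StringIO("12.5\n13.1\n55.0\n14.2\n12.9\n")
ann_csv = io.StringIO("2\n")

signal = [float(row[0]) for row in csv.reader(ecg_csv)]
r_peaks = [int(row[0]) for row in csv.reader(ann_csv)]

# At fs = 250 Hz, an R-peak sample index divided by 250 gives its time in s.
print([signal[i] for i in r_peaks])  # [55.0]
```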
  15. Clotho-AQA dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Apr 22, 2022
    Cite
    Samuel Lipping; Parthasaarathy Sudarsanam; Konstantinos Drossos; Tuomas Virtanen (2022). Clotho-AQA dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6473206
    Dataset updated
    Apr 22, 2022
    Dataset provided by
    Audio Research Group, Tampere University
    Authors
    Samuel Lipping; Parthasaarathy Sudarsanam; Konstantinos Drossos; Tuomas Virtanen
    Description

    Clotho-AQA is an audio question-answering dataset consisting of 1991 audio samples taken from Clotho dataset [1]. Each audio sample has 6 associated questions collected through crowdsourcing. For each question, the answers are provided by three different annotators making a total of 35,838 question-answer pairs. For each audio sample, 4 questions are designed to be answered with 'yes' or 'no', while the remaining two questions are designed to be answered in a single word. More details about the data collection process and data splitting process can be found in our following paper.

    S. Lipping, P. Sudarsanam, K. Drossos, T. Virtanen, ‘Clotho-AQA: A Crowdsourced Dataset for Audio Question Answering.’ The paper is available online at arXiv:2204.09634.

    If you use the Clotho-AQA dataset, please cite the paper mentioned above. A sample baseline model for the Clotho-AQA dataset can be found at github.com/partha2409/AquaNet.

    To use the dataset,

    • Download and extract ‘audio_files.zip’. This contains all the 1991 audio samples in the dataset.

    • Download ‘clotho_aqa_train.csv’, ‘clotho_aqa_val.csv’, and ‘clotho_aqa_test.csv’. These files contain the train, validation, and test splits, respectively. They contain the audio file name, questions, answers, and confidence scores provided by the annotators.
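    Once downloaded, the split files load directly with pandas. A sketch over a synthetic stand-in (the column headers used here are assumptions; check the real CSV headers after download):

```python
import io

import pandas as pd

# Stand-in for clotho_aqa_train.csv: audio file name, question, answer,
# and annotator confidence (exact header names are assumed).
demo = io.StringIO(
    "file_name,question,answer,confidence\n"
    "rain.wav,Is it raining?,yes,3\n"
    "rain.wav,What animal is heard?,bird,2\n"
)
train = pd.read_csv(demo)

# Yes/no questions can be separated from single-word questions by answer.
yes_no = train[train["answer"].isin(["yes", "no"])]
print(len(train), len(yes_no))  # 2 1
```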

    License:

    The audio files in the archive ‘audio_files.zip’ are under the corresponding licenses (mostly CreativeCommons with attribution) of Freesound [2] platform, mentioned explicitly in the CSV file ’clotho_aqa_metadata.csv’ for each of the audio files. That is, each audio file in the archive is listed in the CSV file with meta-data. The meta-data for each file are:

    • File name

    • Keywords

    • URL for the original audio file

    • Start and ending samples for the excerpt that is used in the Clotho dataset

    • Uploader/user in the Freesound platform (manufacturer)

    • Link to the license of the file.

    The questions and answers in the files:

    • clotho_aqa_train.csv

    • clotho_aqa_val.csv

    • clotho_aqa_test.csv

    are under the MIT license, described in the LICENSE file.

    References:

    [1] K. Drossos, S. Lipping and T. Virtanen, "Clotho: An Audio Captioning Dataset," IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020, pp. 736- 740, doi: 10.1109/ICASSP40776.2020.9052990.

    [2] Frederic Font, Gerard Roma, and Xavier Serra. 2013. Freesound technical demo. In Proceedings of the 21st ACM international conference on Multimedia (MM '13). ACM, New York, NY, USA, 411-412. DOI: https://doi.org/10.1145/2502081.2502245

  16. Sephora Makeup Dataset – Free Beauty Product CSV

    • crawlfeeds.com
    csv, zip
    Updated Dec 2, 2025
    Cite
    Crawl Feeds (2025). Sephora Makeup Dataset – Free Beauty Product CSV [Dataset]. https://crawlfeeds.com/datasets/sephora-sample-dataset
    Available download formats: zip, csv
    Dataset updated
    Dec 2, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Looking for a free dataset of cosmetic products? The Sephora Makeup Products Sample Dataset provides a ready-to-use CSV of beauty product data containing 340 verified Sephora makeup product records. It includes details like product name, brand, price, ingredients, availability, user reviews count, and images - perfect for e-commerce research, market analysis, price tracking, or building machine-learning and recommendation systems for the beauty industry.

    Key Features

    • Complete Product Metadata: Each record includes URL, product name, brand, price, SKU, ingredients, product description, usage instructions, review count, image links, availability status, and more.
    • CSV Format, Ready to Use: Download instantly; no scraping or data cleaning required.
    • Ideal for Beauty-Tech & ML Projects: Useful for price comparison tools, recommendation engines, product cataloging, trend analysis, sentiment analysis based on reviews/ratings.
    • Free Sample Access: This sample comes at zero cost (USD $0.0) — an excellent starting point for analysts, developers, or researchers.

    This dataset is perfect for market research, price tracking, sentiment analysis, and AI-based recommendation systems. Whether you're an e-commerce retailer, a data analyst, or a machine learning professional, this dataset provides valuable insights into the beauty industry.

    Explore the Beauty and Cosmetics Data Collection and elevate your data-driven strategies today!

    Who Can Use This Dataset?

    • E-commerce analysts/retailers analyzing cosmetic product catalogs and pricing.
    • Data scientists / ML engineers building recommendation engines or product-based machine-learning models.
    • Market researchers & beauty industry analysts tracking brand/product trends, availability, and consumer preferences.
    • Students/hobby developers exploring beauty-tech projects, demo analyses, or building portfolios with real-world data.

    Why This Sephora Dataset?

    • Skip the hassle: no need for manual scraping or dealing with anti-scraping restrictions.
    • Clean, structured data - ready for immediate integration with tools or pipelines.
    • Free and accessible: great for testing, proof-of-concept or small-scale analysis.
    • Beauty industry focus: concentrated on makeup and cosmetics products - ideal for niche analyses or applications.
  17. Pan-cancer Aberrant Pathway Activity Analysis (PAPAA)

    • zenodo.org
    application/gzip, csv +1
    Updated Dec 5, 2020
    Cite
    DANIEL BLANKENBERG; DANIEL BLANKENBERG; VIJAY NAGAMPALLI; VIJAY NAGAMPALLI (2020). Pan-cancer Aberrant Pathway Activity Analysis (PAPAA) [Dataset]. http://doi.org/10.5281/zenodo.3630647
    Explore at:
    application/gzip, tsv, csv (available download formats)
    Dataset updated
    Dec 5, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    DANIEL BLANKENBERG; DANIEL BLANKENBERG; VIJAY NAGAMPALLI; VIJAY NAGAMPALLI
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Information about the dataset files:

    1) pancan_rnaseq_freeze.tsv.gz: Publicly available gene expression data for the TCGA Pan-cancer dataset. The file PanCanAtlas EBPlusPlusAdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.tsv was processed using the script process_sample_freeze.py by Gregory Way et al., as described in the data processing and initialization steps at https://github.com/greenelab/pancancer/. [http://api.gdc.cancer.gov/data/3586c0da-64d0-4b74-a449-5ff4d9136611] [https://doi.org/10.1016/j.celrep.2018.03.046]

    2) pancan_mutation_freeze.tsv.gz: Publicly available mutational information for the TCGA Pan-cancer dataset. The file mc3.v0.2.8.PUBLIC.maf.gz was processed using the script process_sample_freeze.py by Gregory Way et al., as described in the data processing and initialization steps at https://github.com/greenelab/pancancer/. [http://api.gdc.cancer.gov/data/1c8cfe5f-e52d-41ba-94da-f15ea1337efc] [https://doi.org/10.1016/j.celrep.2018.03.046]

    3) pancan_GISTIC_threshold.tsv.gz: Publicly available gene-level copy number information for the TCGA Pan-cancer dataset, processed using the script process_copynumber.py by Gregory Way et al., as described in the data processing and initialization steps at https://github.com/greenelab/pancancer/. The files copy_number_loss_status.tsv.gz and copy_number_gain_status.tsv.gz generated from this data are used as inputs in our Galaxy pipeline. [https://xenabrowser.net/datapages/?cohort=TCGA%20Pan-Cancer%20(PANCAN)&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443] [https://doi.org/10.1016/j.celrep.2018.03.046]

    4) mutation_burden_freeze.tsv.gz: Publicly available mutational information for the TCGA Pan-cancer dataset. The file mc3.v0.2.8.PUBLIC.maf.gz was processed using the script process_sample_freeze.py by Gregory Way et al., as described in the data processing and initialization steps at https://github.com/greenelab/pancancer/. [https://github.com/greenelab/pancancer/] [http://api.gdc.cancer.gov/data/1c8cfe5f-e52d-41ba-94da-f15ea1337efc] [https://doi.org/10.1016/j.celrep.2018.03.046]

    5) sample_freeze.tsv or sample_freeze_version4_modify.tsv: These files list the frozen samples as determined by the TCGA PanCancer Atlas consortium, along with raw RNAseq and mutation data. These were previously determined and included for all downstream analysis. All other datasets were processed and subset according to the frozen samples. [https://github.com/greenelab/pancancer/]

    6) vogelstein_cancergenes.tsv: Compendium of oncogenes (OGs) and tumor suppressor genes (TSGs) used for the analysis. [https://github.com/greenelab/pancancer/]

    7) CCLE_DepMap_18Q1_maf_20180207.txt.gz: Publicly available mutational data for CCLE cell lines from the Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://depmap.org/portal/download/api/download/external?file_name=ccle%2FCCLE_DepMap_18Q1_maf_20180207.txt]

    8) ccle_rnaseq_genes_rpkm_20180929.gct.gz: Publicly available expression data for 1019 cell lines (RPKM) from the Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://depmap.org/portal/download/api/download/external?file_name=ccle%2Fccle_2019%2FCCLE_RNAseq_genes_rpkm_20180929.gct.gz]

    9) CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct: Publicly available merged mutational and copy number alterations, including gene amplifications and deletions, for the CCLE cell lines. The data are represented in binary format and provided by the Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://data.broadinstitute.org/ccle_legacy_data/binary_calls_for_copy_number_and_mutation_data/CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct]

    10) GDSC_cell_lines_EXP_CCLE_names.csv.gz: Publicly available RMA-normalized expression data for Genomics of Drug Sensitivity in Cancer (GDSC) cell lines. The file gdsc_cell_line_RMA_proc_basalExp.csv was downloaded and subset to the 389 cell lines common to CCLE and GDSC. All GDSC cell line names were replaced with CCLE cell line names for further processing. [https://www.cancerrxgene.org/gdsc1000/GDSC1000_WebResources//Data/preprocessed/Cell_line_RMA_proc_basalExp.txt.zip]

    11) GDSC_CCLE_common_mut_cnv_binary.csv.gz: A subset of merged mutational and copy number alterations, including gene amplifications and deletions, for cell lines common to GDSC and CCLE. This file was generated using CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct and a list of common cell lines.

    12) gdsc1_ccle_pharm_fitted_dose_data.txt.gz: Pharmacological data for GDSC1 cell lines. [ftp://ftp.sanger.ac.uk/pub/project/cancerrxgene/releases/current_release/GDSC1_fitted_dose_response_15Oct19.xlsx]

    13) gdsc2_ccle_pharm_fitted_dose_data.txt.gz: Pharmacological data for GDSC2 cell lines. [ftp://ftp.sanger.ac.uk/pub/project/cancerrxgene/releases/current_release/GDSC2_fitted_dose_response_15Oct19.xlsx]

    14) compounds.csv: List of pharmacological compounds tested in our analysis.

    15) tcga_dictonary.tsv: list of cancer types used in the analysis.

    16) seg_based_scores.tsv: Measurement of total copy number burden, Percent of genome altered by copy number alterations. This file was used as part of the Pancancer analysis by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [https://github.com/greenelab/pancancer/]

    17) sign.csv: File with original values assigned for tumor [1] or normal [-1] status for the given external samples (GSE69822).

    18) vlog_trans.csv: Variance-stabilized, log-transformed expression values for the given external samples (GSE69822).

    19) path_genes.csv: File with the list of ERK/RAS/PI3K pathway genes used in the analysis.

  18. Magic, Memory, and Curiosity (MMC) fMRI Dataset

    • openneuro.org
    Updated May 1, 2023
    Cite
    Stefanie Meliss; Cristina Pascua-Martin; Jeremy Skipper; Kou Murayama (2023). Magic, Memory, and Curiosity (MMC) fMRI Dataset [Dataset]. http://doi.org/10.18112/openneuro.ds004182.v1.0.1
    Explore at:
    Dataset updated
    May 1, 2023
    Dataset provided by
    OpenNeuro (https://openneuro.org/)
    Authors
    Stefanie Meliss; Cristina Pascua-Martin; Jeremy Skipper; Kou Murayama
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Overview

    • The Magic, Memory, Curiosity (MMC) dataset contains data from 50 healthy human adults incidentally encoding 36 videos of magic tricks inside the MRI scanner across three runs.
    • Before and after incidental learning, a 10-min resting-state scan was acquired.
    • The MMC dataset includes a contextual incentive manipulation, curiosity ratings for the magic tricks, as well as incidental memory performance tested a week later using a surprise cued recall and recognition test.
    • Working memory and constructs potentially relevant in the context of motivated learning (e.g., need for cognition, fear of failure) were additionally assessed.

    Stimuli

    The stimuli used here were short videos of magic tricks taken from a validated stimulus set (MagicCATs, Ozono et al., 2021) specifically created for use in fMRI studies. All final stimuli are available upon request. The request procedure is outlined in the Open Science Framework repository associated with the MagicCATs stimulus set (https://osf.io/ad6uc/).

    Participant responses

    Participants’ responses to demographic questions, questionnaires, and performance in the working memory assessment as well as both tasks are available in comma-separated value (CSV) files. Demographic (MMC_demographics.csv), raw questionnaire (MMC_raw_quest_data.csv) and other score data (MMC_scores.csv) as well as other information (MMC_other_information.csv) are structured as one line per participant with questions and/or scores as columns. Explicit wordings and naming of variables can be found in the supplementary information. Participant scan summaries (MMC_scan_subj_sum.csv) contain descriptives of brain coverage, TSNR, and framewise displacement (one row per participant) averaged first within acquisitions and then within participants. Participants’ responses and reaction times in the magic trick watching and memory task (MMC_experimental_data.csv) are stored as one row per trial per participant.
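
    As a rough illustration of working with these one-row-per-participant files, the sketch below joins two of them on a shared participant-ID column. The column names and values here are invented placeholders, not the documented variable names (those are listed in the supplementary information):

```python
import io
import pandas as pd

# Stand-ins for MMC_demographics.csv and MMC_scores.csv; the
# "participant_id" column name and all values are assumptions
# made for illustration only.
demographics = pd.read_csv(io.StringIO(
    "participant_id,age\nsub-001,24\nsub-002,31\n"))
scores = pd.read_csv(io.StringIO(
    "participant_id,need_for_cognition\nsub-001,62\nsub-002,55\n"))

# Each file holds one row per participant, so an inner join on the
# ID column yields one combined row per participant.
merged = demographics.merge(scores, on="participant_id", how="inner")
print(merged.shape)  # (2, 3)
```

    The same pattern extends to the trial-level file (MMC_experimental_data.csv), which instead holds one row per trial per participant.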

    Preprocessing

    Data was preprocessed using the AFNI (version 21.2.03) software suite. As a first step, the EPI timeseries were distortion-corrected along the encoding axis (P>>A) using the phase difference map (‘epi_b0_correct.py’). The resulting distortion-corrected EPIs were then processed separately for each task, but scans from the same task were processed together. The same blocks were applied to both task and resting-state distortion-corrected EPI data using afni_proc.py (see below): despiking, slice-timing and head-motion correction, intrasubject alignment between anatomy and EPI, intersubject registration to MNI, masking, smoothing, scaling, and denoising. For more details, please refer to the data descriptor (LINK) or the Github repository (https://github.com/stefaniemeliss/MMC_dataset).

    afni_proc.py -subj_id "${subjstr}" \
      -blocks despike tshift align tlrc volreg mask blur scale regress \
      -radial_correlate_blocks tcat volreg \
      -copy_anat $derivindir/$anatSS \
      -anat_has_skull no \
      -anat_follower anat_w_skull anat $derivindir/$anatUAC \
      -anat_follower_ROI aaseg anat $sswindir/$fsparc \
      -anat_follower_ROI aeseg epi $sswindir/$fsparc \
      -anat_follower_ROI FSvent epi $sswindir/$fsvent \
      -anat_follower_ROI FSWMe epi $sswindir/$fswm \
      -anat_follower_ROI FSGMe epi $sswindir/$fsgm \
      -anat_follower_erode FSvent FSWMe \
      -dsets $epi_dpattern \
      -outlier_polort $POLORT \
      -tcat_remove_first_trs 0 \
      -tshift_opts_ts -tpattern altplus \
      -align_opts_aea -cost lpc+ZZ -giant_move -check_flip \
      -align_epi_strip_method 3dSkullStrip \
      -tlrc_base MNI152_2009_template_SSW.nii.gz \
      -tlrc_NL_warp \
      -tlrc_NL_warped_dsets $sswindir/$anatQQ $sswindir/$matrix $sswindir/$warp \
      -volreg_base_ind 1 $min_out_first_run \
      -volreg_post_vr_allin yes \
      -volreg_pvra_base_index MIN_OUTLIER \
      -volreg_align_e2a \
      -volreg_tlrc_warp \
      -volreg_no_extent_mask \
      -mask_dilate 8 \
      -mask_epi_anat yes \
      -blur_to_fwhm -blur_size 8 \
      -regress_motion_per_run \
      -regress_ROI_PC FSvent 3 \
      -regress_ROI_PC_per_run FSvent \
      -regress_make_corr_vols aeseg FSvent \
      -regress_anaticor_fast \
      -regress_anaticor_label FSWMe \
      -regress_censor_motion 0.3 \
      -regress_censor_outliers 0.1 \
      -regress_apply_mot_types demean deriv \
      -regress_est_blur_epits \
      -regress_est_blur_errts \
      -regress_run_clustsim no \
      -regress_polort 2 \
      -regress_bandpass 0.01 1 \
      -html_review_style pythonic
    

    Derivatives

    The anat folder contains derivatives associated with the anatomical scan. The skull-stripped image created using @SSwarper is available in original and ICBM 2009c Nonlinear Asymmetric Template space as sub-[group][ID]_space-[space]_desc-skullstripped_T1w.nii.gz together with the corresponding affine matrix (sub-[group][ID]_aff12.1D) and incremental warp (sub-[group][ID]_warp.nii.gz). The outputs generated using @SUMA_Make_Spec_FS (defaced anatomical image, whole brain and tissue masks, as well as FreeSurfer discrete segmentations based on the Desikan-Killiany cortical atlas and the Destrieux cortical atlas) are also available as sub-[group][ID]_space-orig_desc-surfvol_T1w.nii.gz, sub-[group][ID]_space-orig_label-[label]_mask.nii.gz, and sub-[group][ID]_space-orig_desc-[atlas]_dseg.nii.gz, respectively.

    The func folder contains derivatives associated with the functional scans. To enhance re-usability, the fully preprocessed and denoised files are shared as sub-[group][ID]_task-[task]_desc-fullpreproc_bold.nii.gz. Additionally, partially preprocessed files (distortion corrected, despiked, slice-timing/head-motion corrected, aligned to anatomy and template space) are uploaded as sub-[group][ID]_task-[task]_run-[1-3]_desc-MNIaligned_bold.nii.gz together with slightly dilated brain mask in EPI resolution and template space where white matter and lateral ventricle were removed (sub-[group][ID]_task-[task]_space-MNI152NLin2009cAsym_label-dilatedGM_mask.nii.gz) as well as tissue masks in EPI resolution and template space (sub-[group][ID]_task-[task]_space-MNI152NLin2009cAsym_label-[tissue]_mask.nii.gz).

    The regressors folder contains nuisance regressors stemming from the output of the full afni_proc.py preprocessing pipeline. They are provided as space-delimited text values where each row represents one volume concatenated across all runs for each task separately. Those estimates that are provided per run contain the data for the volumes of one run and zeros for the volumes of other runs. This allows them to be regressed out separately for each run. The motion estimates show rotation (degree counterclockwise) in roll, pitch, and yaw and displacement (mm) in superior, left, and posterior direction. In addition to the motion parameters with respect to the base volume (sub-[group][ID]_task-[task]_label-mot_regressor.1D), motion derivatives (sub-[group][ID]_task-[task]_run[1-3]_label-motderiv_regressor.1D) and demeaned motion parameters (sub-[group][ID]_task-[task]_run[1-3]_label-motdemean_regressor.1D) are also available for each run separately. The sub-[group][ID]_task-[task]_run[1-3]_label-ventriclePC_regressor.1D files contain time course of the first three PCs of the lateral ventricle per run. Additionally, outlier fractions for each volume are provided (sub-[group][ID]_task-[task]_label-outlierfrac_regressor.1D) and sub-[group][ID]_task-[task]_label-censorTRs_regressor.1D shows which volumes were censored because motion or outlier fraction exceeded the limits specified. The voxelwise time course of local WM regressors created using fast ANATICOR is shared as sub-[group][ID]_task-[task]_label-localWM_regressor.nii.gz.
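
    Because the regressor files are plain space-delimited text with one row per volume, they load directly with numpy. A minimal sketch, assuming a six-column motion file; the values below are invented:

```python
import io
import numpy as np

# Stand-in for a sub-*_label-mot_regressor.1D file: space-delimited,
# one row per volume, six columns (roll, pitch, yaw, and the three
# displacement directions). The numbers are made up for illustration.
one_d_text = ("0.01 0.02 0.00 0.10 0.05 0.00\n"
              "0.02 0.01 0.01 0.11 0.04 0.01\n")

# np.loadtxt also accepts a file path directly, and skips lines
# beginning with '#' by default.
motion = np.loadtxt(io.StringIO(one_d_text))
print(motion.shape)  # (2, 6)
```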

  19. Sample WKB file

    • mygeodata.cloud
    Updated Sep 11, 2018
    Cite
    (2018). Sample WKB file [Dataset]. https://mygeodata.cloud/converter/dwg-to-wkb
    Explore at:
    Dataset updated
    Sep 11, 2018
    Description

    Sample data in CSV - Comma Separated Values format available for download for testing purposes.

  20. Vehicle licensing statistics data files

    • gov.uk
    • s3.amazonaws.com
    Updated Oct 15, 2025
    Cite
    Department for Transport (2025). Vehicle licensing statistics data files [Dataset]. https://www.gov.uk/government/statistical-data-sets/vehicle-licensing-statistics-data-files
    Explore at:
    Dataset updated
    Oct 15, 2025
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Department for Transport
    Description

    We welcome any feedback on the structure of our data files, their usability, or any suggestions for improvements; please contact the vehicle statistics team.

    The Department for Transport is committed to continuously improving the quality and transparency of our outputs, in line with the Code of Practice for Statistics. In line with this, we have recently concluded a planned review of the processes and methodologies used in the production of vehicle licensing statistics data. The review sought to identify and introduce further improvements and efficiencies in the coding technologies we use to produce our data. As part of that work, we identified several historical errors across the published data tables, affecting different historical periods. These errors are the result of mistakes in past production processes that we have now identified, corrected, and taken steps to eliminate going forward.

    Most of the revisions to our published figures are small, typically changing values by no more than 1% to 3%. The key revisions are:

    Licensed Vehicles (2014 Q3 to 2016 Q3)

    We found that some unlicensed vehicles during this period were mistakenly counted as licensed. This caused a slight overstatement, about 0.54% on average, in the number of licensed vehicles during this period.

    3.5 - 4.25 tonnes Zero Emission Vehicles (ZEVs) Classification

    Since 2023, ZEVs weighing between 3.5 and 4.25 tonnes have been classified as light goods vehicles (LGVs) instead of heavy goods vehicles (HGVs). We have now applied this change to earlier data and corrected an error in table VEH0150. As a result, the number of newly registered HGVs has been reduced by:

    • 3.1% in 2024

    • 2.3% in 2023

    • 1.4% in 2022

    Table VEH0156 (2018 to 2023)

    Table VEH0156, which reports average CO₂ emissions for newly registered vehicles, has been updated for the years 2018 to 2023. Most changes are minor (under 3%), but the e-NEDC measure saw a larger correction, up to 15.8%, due to a calculation error. Changes to the other measures (WLTP and Reported) were less notable, except for April 2020, when COVID-19 led to very few new registrations and therefore greater volatility in the resulting percentages.

    Neither these specific revisions nor any of the others introduced have had a material impact on the statistics overall, the direction of trends, or the key messages they previously conveyed.

    Specific details of each revision have been included in the relevant data table notes to ensure transparency and clarity. Users are advised to review these notes as part of their regular use of the data to ensure their analysis accounts for these changes.

    If you have questions regarding any of these changes, please contact the Vehicle statistics team.

    Data tables containing aggregated information about vehicles in the UK are also available.

    How to use CSV files

    CSV files can be used either as a spreadsheet (using Microsoft Excel or similar spreadsheet packages) or digitally using software packages and languages (for example, R or Python).

    When opened in a spreadsheet, the file will have no formatting, but it can still be explored like our publication tables. Due to their size, older software might not be able to open an entire file.
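
    For digital use, a hedged sketch of reading one of these files with Python and pandas; the in-memory sample below is a placeholder, not the real published schema, and pd.read_csv accepts a path to a downloaded file in exactly the same way:

```python
import io
import pandas as pd

# Tiny invented stand-in for a published CSV; the column names and
# values are assumptions for illustration only.
csv_text = (
    "BodyType,Make,Fuel,2024 Q1,2024 Q2\n"
    "Cars,FORD,Petrol,1200,1250\n"
    "Cars,FORD,Diesel,800,790\n"
)

# For a downloaded file: df = pd.read_csv("df_VEH0120_GB.csv")
df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)             # (2, 5)
print(df["Fuel"].tolist())  # ['Petrol', 'Diesel']
```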

    Download data files

    Make and model by quarter

    df_VEH0120_GB (https://assets.publishing.service.gov.uk/media/68ed0c52f159f887526bbda6/df_VEH0120_GB.csv): Vehicles at the end of the quarter by licence status, body type, make, generic model and model: Great Britain (CSV, 59.8 MB)

    Scope: All registered vehicles in Great Britain; from 1994 Quarter 4 (end December)

    Schema: BodyType, Make, GenModel, Model, Fuel, LicenceStatus, [number of vehicles; 1 column per quarter]
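
    With one column per quarter, the schema above is wide; for time-series analysis it is often easier to reshape it to one row per quarter. A sketch under the assumption that the quarter columns follow the identifier columns; the rows below are invented, not real data:

```python
import pandas as pd

# Invented miniature of the documented column layout; the real file
# has many more rows and one vehicle-count column per quarter.
wide = pd.DataFrame({
    "BodyType": ["Cars", "Cars"],
    "Make": ["FORD", "VAUXHALL"],
    "GenModel": ["FIESTA", "CORSA"],
    "Model": ["FIESTA ZETEC", "CORSA SE"],
    "Fuel": ["Petrol", "Petrol"],
    "LicenceStatus": ["Licensed", "Licensed"],
    "2024 Q1": [1200, 900],
    "2024 Q2": [1250, 880],
})

# melt turns each quarter column into a (Quarter, Vehicles) pair.
id_cols = ["BodyType", "Make", "GenModel", "Model", "Fuel", "LicenceStatus"]
long_df = wide.melt(id_vars=id_cols, var_name="Quarter", value_name="Vehicles")
print(len(long_df))  # 2 rows x 2 quarters = 4
```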

    df_VEH0120_UK: https://assets.publishing.service.gov.uk/media/68ed0c2
