100+ datasets found
  1. train csv file

    • kaggle.com
    zip
    Updated May 5, 2018
    Cite
    Emmanuel Arias (2018). train csv file [Dataset]. https://www.kaggle.com/datasets/eamanu/train
    Available download formats: zip (33695 bytes)
    Dataset updated
    May 5, 2018
    Authors
    Emmanuel Arias
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Dataset

    This dataset was created by Emmanuel Arias

    Released under Database: Open Database, Contents: Database Contents

    Contents

  2. Example of a csv file exported from the database.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Oct 24, 2014
    Cite
    Caselle, Jennifer E.; Iles, Alison; Tinker, Martin T.; Black, August; Novak, Mark; Carr, Mark H.; Malone, Dan; Beas-Luna, Rodrigo; Hoban, Michael (2014). Example of a csv file exported from the database. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001227183
    Dataset updated
    Oct 24, 2014
    Authors
    Caselle, Jennifer E.; Iles, Alison; Tinker, Martin T.; Black, August; Novak, Mark; Carr, Mark H.; Malone, Dan; Beas-Luna, Rodrigo; Hoban, Michael
    Description

    Example of a csv file exported from the database.

  3. Database with raw data (CSV file).

    • figshare.com
    txt
    Updated Jun 3, 2018
    Cite
    Bartosz Symonides (2018). Database with raw data (CSV file). [Dataset]. http://doi.org/10.6084/m9.figshare.6411002.v1
    Available download formats: txt
    Dataset updated
    Jun 3, 2018
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Bartosz Symonides
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Survival after open versus endovascular repair of abdominal aortic aneurysm. Polish population analysis. (in press)

  4. Synthetic Person Records: 10K to 10M Records

    • kaggle.com
    zip
    Updated Oct 26, 2025
    Cite
    Swain (2025). Synthetic Person Records: 10K to 10M Records [Dataset]. https://www.kaggle.com/datasets/swainproject/synthetic-data-person
    Available download formats: zip (913881690 bytes)
    Dataset updated
    Oct 26, 2025
    Authors
    Swain
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains 7 pre-generated CSV files with realistic synthetic person records, ranging from 10,000 to 10,000,000 records. Perfect for development, testing, prototyping, and data analysis workflows without privacy concerns.

    What's Included

    Each CSV file contains complete demographic information:

    • person_id: Unique identifier
    • firstname, lastname: Realistic names (international)
    • gender, age: Demographics
    • street, streetnumber, address_unit, postalcode, city: Complete addresses
    • phone: Realistic phone numbers
    • email: Valid email addresses

    File Sizes

    • 10K records
    • 100K records
    • 500K records
    • 1M records
    • 2M records
    • 5M records
    • 10M records

    Why This Dataset?

    ✓ No privacy concerns: completely synthetic data
    ✓ Perfect for database testing and imports
    ✓ Ideal for ML model training and prototyping
    ✓ Ready-to-use CSV format
    ✓ Multiple sizes for different use cases

    Use Cases

    • Database development and testing
    • Data pipeline validation
    • ETL workflow testing
    • Machine learning prototyping
    • API testing with realistic data
    • Load testing and performance benchmarks

    License: CC BY 4.0 (Please attribute to Swain / Swainlabs when sharing)

  5. Download CSV DB

    • maclookup.app
    json
    Updated Nov 20, 2025
    Cite
    (2025). Download CSV DB [Dataset]. https://maclookup.app/downloads/csv-database
    Available download formats: json
    Dataset updated
    Nov 20, 2025
    Description

    Free, daily updated MAC prefix and vendor CSV database. Download now for accurate device identification.

  6. Full oral and gene database (csv format)

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    application/gzip
    Updated May 22, 2019
    Cite
    Braden Tierney (2019). Full oral and gene database (csv format) [Dataset]. http://doi.org/10.6084/m9.figshare.8001362.v1
    Available download formats: application/gzip
    Dataset updated
    May 22, 2019
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Braden Tierney
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is our complete database in CSV format (with gene names, IDs, annotations, lengths, cluster sizes, and taxonomic classifications) that can be queried on our website. The difference is that it does not include the sequences; those can be downloaded in other files on figshare. This file, as well as those, can be parsed and linked by the gene identifier. We recommend downloading this database and parsing it yourself if you attempt to run a query that is too large for our servers to handle.

  7. Board Games Dataset in CSV

    • kaggle.com
    zip
    Updated Jan 11, 2024
    Cite
    Sean B Coates (2024). Board Games Dataset in CSV [Dataset]. https://www.kaggle.com/datasets/seanbcoates/board-games-dataset-in-csv
    Available download formats: zip (43036908 bytes)
    Dataset updated
    Jan 11, 2024
    Authors
    Sean B Coates
    Description

    I don't know SQLite, I use PostgreSQL. I needed to work with this dataset in PGAdmin, so I converted Gabriele Baldassarre's Board Games Dataset (https://www.kaggle.com/datasets/gabrio/board-games-dataset/data) .sqlite files into .csv UTF-8 format to create my own database in PGAdmin. I uploaded them here to make it easier for anyone else that wants to do the same.

    The board_games.csv file likely contains all the information you are looking for.
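    The conversion described above (SQLite tables to UTF-8 CSV files) can be sketched in Python with the standard library. This is only an illustration under assumptions, not the uploader's actual procedure: the function name export_tables_to_csv and the output layout (one <table>.csv per table) are hypothetical.

```python
import csv
import sqlite3
from pathlib import Path


def export_tables_to_csv(db_path, out_dir):
    """Write every user table of a SQLite file to <out_dir>/<table>.csv (UTF-8)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    con = sqlite3.connect(db_path)
    try:
        names = [row[0] for row in con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )]
        # Skip SQLite's internal bookkeeping tables.
        tables = [name for name in names if not name.startswith("sqlite_")]
        for table in tables:
            # Quote the identifier so unusual table names still work.
            quoted = '"' + table.replace('"', '""') + '"'
            cursor = con.execute(f"SELECT * FROM {quoted}")
            header = [column[0] for column in cursor.description]
            target = out_dir / f"{table}.csv"
            with open(target, "w", newline="", encoding="utf-8") as handle:
                writer = csv.writer(handle)
                writer.writerow(header)   # column names first
                writer.writerows(cursor)  # then all rows of the table
            written.append(target)
    finally:
        con.close()
    return written
```

    Each resulting file starts with a header row, which most PostgreSQL import tools (including the import dialog in pgAdmin) can be told to skip or use as column names.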

  8. A CSV file of our study database, which we used for the analyses in this...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Jun 3, 2024
    Cite
    Mik, Egbert G.; Juffermans, Nicole P.; van der Bom, Johanna G.; Wille, Maarten E.; Tsonaka, Roula; Baysan, Meryem; Bergsma, Jule E.; Arbous, Sesmu M.; Broere, Mark (2024). A CSV file of our study database, which we used for the analyses in this manuscript. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001357370
    Dataset updated
    Jun 3, 2024
    Authors
    Mik, Egbert G.; Juffermans, Nicole P.; van der Bom, Johanna G.; Wille, Maarten E.; Tsonaka, Roula; Baysan, Meryem; Bergsma, Jule E.; Arbous, Sesmu M.; Broere, Mark
    Description

    A CSV file of our study database, which we used for the analyses in this manuscript.

  9. Dog Food Data Extracted from Chewy (USA) - 4,500 Records in CSV Format

    • crawlfeeds.com
    csv, zip
    Updated Apr 22, 2025
    Cite
    Crawl Feeds (2025). Dog Food Data Extracted from Chewy (USA) - 4,500 Records in CSV Format [Dataset]. https://crawlfeeds.com/datasets/dog-food-data-extracted-from-chewy-usa-4-500-records-in-csv-format
    Available download formats: zip, csv
    Dataset updated
    Apr 22, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    The Dog Food Data Extracted from Chewy (USA) dataset contains 4,500 detailed records of dog food products sourced from one of the leading pet supply platforms in the United States, Chewy. This dataset is ideal for businesses, researchers, and data analysts who want to explore and analyze the dog food market, including product offerings, pricing strategies, brand diversity, and customer preferences within the USA.

    The dataset includes essential information such as product names, brands, prices, ingredient details, product descriptions, weight options, and availability. Organized in a CSV format for easy integration into analytics tools, this dataset provides valuable insights for those looking to study the pet food market, develop marketing strategies, or train machine learning models.

    Key Features:

    • Record Count: 4,500 dog food product records.
    • Data Fields: Product names, brands, prices, descriptions, ingredients, etc. More fields are listed under the data points section.
    • Format: CSV, easy to import into databases and data analysis tools.
    • Source: Extracted from Chewy’s official USA platform.
    • Geography: Focused on the USA dog food market.

    Use Cases:

    • Market Research: Analyze trends and preferences in the USA dog food market, including popular brands, price ranges, and product availability.
    • E-commerce Analysis: Understand how Chewy presents and prices dog food products, helping businesses compare their own product offerings.
    • Competitor Analysis: Compare different brands and products to develop competitive strategies for dog food businesses.
    • Machine Learning Models: Use the dataset for machine learning tasks such as product recommendation systems, demand forecasting, and price optimization.

  10. File S1 - Mynodbcsv: Lightweight Zero-Config Database Solution for Handling...

    • figshare.com
    zip
    Updated May 31, 2023
    Cite
    Stanisław Adaszewski (2023). File S1 - Mynodbcsv: Lightweight Zero-Config Database Solution for Handling Very Large CSV Files [Dataset]. http://doi.org/10.1371/journal.pone.0103319.s001
    Available download formats: zip
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Stanisław Adaszewski
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Set of Python scripts to generate data for benchmarks: equivalents of ADNI_Clin_6800_geno.csv, PTDEMOG.csv, MicroarrayExpression_fixed.csv and Probes.csv files, the dummy.csv, dummy2.csv and the microbenchmark CSV files. (ZIP)

  11. Adventure Works 2022 CSVs

    • kaggle.com
    zip
    Updated Nov 2, 2022
    Cite
    Algorismus (2022). Adventure Works 2022 CSVs [Dataset]. https://www.kaggle.com/datasets/algorismus/adventure-works-in-excel-tables
    Available download formats: zip (567646 bytes)
    Dataset updated
    Nov 2, 2022
    Authors
    Algorismus
    License

    http://www.gnu.org/licenses/lgpl-3.0.html

    Description

    Adventure Works 2022 dataset

    How was this dataset created?

    On the official website, the dataset is available through SQL Server (localhost) and as CSVs to be used via Power BI Desktop running in the Virtual Lab (virtual machine). The first two steps of importing data were executed in the virtual lab, and the resulting Power BI tables were then copied into CSVs. Records were added up to the year 2022 as required.

    How may this dataset help you?

    This dataset is helpful if you want to work offline with Adventure Works data in Power BI Desktop in order to follow the lab instructions in the training material on the official website. It is also useful if you want to work through the Power BI Desktop Sales Analysis example from the Microsoft PL-300 learning material.

    How to use this dataset?

    Download the CSV file(s) and import them into Power BI Desktop as tables. The CSVs are named after the tables created in the first two steps of importing data, as described in the PL-300 Microsoft Power BI Data Analyst exam lab.

  12. arXiv publications dataset with simulated citation relationships

    • figshare.com
    txt
    Updated Jun 5, 2023
    Cite
    Jacek Miecznikowski; Dominik Tomaszuk (2023). arXiv publications dataset with simulated citation relationships [Dataset]. http://doi.org/10.6084/m9.figshare.6449756.v1
    Available download formats: txt
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jacek Miecznikowski; Dominik Tomaszuk
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    arXiv publications dataset with simulated citation relationships

    https://github.com/jacekmiecznikowski/neo4index

    The app evaluates scientific research impact using author-level metrics (h-index and more).

    This collection contains data acquired from arXiv.org via the OAI2 protocol. arXiv does not provide citation metadata, so this data was pseudo-randomly simulated.

    We evaluated scientific research impact using six popular author-level metrics:

    • h-index
    • m quotient
    • e-index
    • m-index
    • r-index
    • ar-index

    Source

    https://arxiv.org/help/bulk_data (downloaded: 2018-03-23; over 1.3 million publications)

    Files

    • arxiv_bulk_metadata_2018-03-23.tar.gz - file downloaded using oai-harvester; contains metadata of all arXiv publications to date.
    • categories.csv - file contains categories from arXiv with category-subcategory division
    • publications.csv - file contains information about articles like: id, title, abstract, url, categories and date
    • authors.csv - file contains authors data like first name, last name and id of published article
    • citations.csv - file contains simulated relationships between all publications using arxivCite
    • indices.csv - file contains 6 author-level metrics calculated on database using neo4index

    Statistics

    Metric       Average              Median   Mode
    h-index      3.5836524733724495   1.0      1.0
    m quotient   0.5831426366846965   0.4167   1.0
    e-index      7.9260187734579075   5.3852   0.0
    m-index      29.436844659143155   17.0     0.0
    r-index      8.931101630575293    5.831    0.0
    ar-index     3.5439082808721025   2.7928   0.0
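    For readers unfamiliar with the first metric listed above, the following minimal Python sketch computes an h-index from a list of per-paper citation counts. It follows the standard textbook definition (the largest h such that at least h papers have at least h citations each) and is not taken from neo4index; the function name h_index is an assumption for illustration.

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Five papers with these citation counts give h = 4:
# four papers have at least 4 citations, but not five with at least 5.
print(h_index([10, 8, 5, 4, 3]))
```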

  13. HWRT database of handwritten symbols

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 24, 2020
    Cite
    Thoma, Martin (2020). HWRT database of handwritten symbols [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_50022
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Karlsruhe Institute of Technology
    Authors
    Thoma, Martin
    License

    Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
    License information was derived automatically

    Description

    The HWRT database of handwritten symbols contains on-line data of handwritten symbols such as all alphanumeric characters, arrows, Greek characters and mathematical symbols like the integral symbol.

    The database can be downloaded in the form of bzip2-compressed tar files. Each tar file contains:

    symbols.csv: A CSV file with the columns symbol_id, latex, training_samples, test_samples. The symbol id is an integer, the column latex contains the LaTeX code of the symbol, and the columns training_samples and test_samples contain integers with the number of labeled data.

    train-data.csv: A CSV file with the columns symbol_id, user_id, user_agent and data.

    test-data.csv: A CSV file with the columns symbol_id, user_id, user_agent and data.

    All CSV files use ";" as delimiter and "'" as quotechar. The data is given in YAML format as a list of lists of dictionaries. Each dictionary has the keys "x", "y" and "time". (x,y) are coordinates and time is the UNIX time.
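    Because these files use a non-default delimiter and quote character, a reader has to be configured explicitly. The sketch below shows one way to do that with Python's csv module. The sample text is a hypothetical two-record stand-in written in the layout described above, not data from the real archive, and read_symbols is an illustrative name.

```python
import csv
import io

# Hypothetical sample in the described layout; not taken from the real files.
sample = (
    "symbol_id;latex;training_samples;test_samples\n"
    "1;'\\alpha';10;2\n"
    "2;'a;b';3;1\n"
)


def read_symbols(handle):
    """Parse a symbols.csv-style file: ';' separates fields, "'" quotes them."""
    reader = csv.DictReader(handle, delimiter=";", quotechar="'")
    return list(reader)


rows = read_symbols(io.StringIO(sample))
for row in rows:
    print(row["symbol_id"], row["latex"], int(row["training_samples"]))
```

    The second sample record shows why the quote character matters: its latex field contains a ";" that would otherwise be read as a field separator.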

    About 90% of the data was made available by Daniel Kirsch via github.com/kirel/detexify-data. Thank you very much, Daniel!

  14. Database and Model weights for Clinical Decision Support System for...

    • zenodo.org
    bin, csv
    Updated Mar 5, 2025
    Cite
    Daniel Arias-Garzón (2025). Database and Model weights for Clinical Decision Support System for Discharge Diagnosis Recommendations Project v1.0.0 [Dataset]. http://doi.org/10.5281/zenodo.14969314
    Available download formats: bin, csv
    Dataset updated
    Mar 5, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Daniel Arias-Garzón
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains:

    • An SQL script in MySQL for creating the database of the project on GitHub, Clinical Decision Support System for Discharge Diagnosis Recommendations.
    • Data from each table in CSV files, with "ind" at the beginning.
    • Weights for the Negations and Uncertainties Model (Bert.h5).
  15. MIT-BIH annotation CSV file

    • kaggle.com
    zip
    Updated Nov 18, 2020
    Cite
    Sadia Khan (2020). MIT-BIH annotation CSV file [Dataset]. https://www.kaggle.com/sadiakhanesha/mitbih-annotation-csv-file
    Available download formats: zip (857583 bytes)
    Dataset updated
    Nov 18, 2020
    Authors
    Sadia Khan
    Description

    Dataset

    This dataset was created by Sadia Khan

    Contents

  16. Data from: LifeSnaps: a 4-month multi-modal dataset capturing unobtrusive...

    • zenodo.org
    • data.europa.eu
    zip
    Updated Oct 20, 2022
    Cite
    Sofia Yfantidou; Christina Karagianni; Stefanos Efstathiou; Athena Vakali; Joao Palotti; Dimitrios Panteleimon Giakatos; Thomas Marchioro; Andrei Kazlouski; Elena Ferrari; Šarūnas Girdzijauskas (2022). LifeSnaps: a 4-month multi-modal dataset capturing unobtrusive snapshots of our lives in the wild [Dataset]. http://doi.org/10.5281/zenodo.6832242
    Available download formats: zip
    Dataset updated
    Oct 20, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sofia Yfantidou; Christina Karagianni; Stefanos Efstathiou; Athena Vakali; Joao Palotti; Dimitrios Panteleimon Giakatos; Thomas Marchioro; Andrei Kazlouski; Elena Ferrari; Šarūnas Girdzijauskas
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    LifeSnaps Dataset Documentation

    Ubiquitous self-tracking technologies have penetrated various aspects of our lives, from physical and mental health monitoring to fitness and entertainment. Yet, limited data exist on the association between in the wild large-scale physical activity patterns, sleep, stress, and overall health, and behavioral patterns and psychological measurements due to challenges in collecting and releasing such datasets, such as waning user engagement, privacy considerations, and diversity in data modalities. In this paper, we present the LifeSnaps dataset, a multi-modal, longitudinal, and geographically-distributed dataset, containing a plethora of anthropological data, collected unobtrusively for the total course of more than 4 months by n=71 participants, under the European H2020 RAIS project. LifeSnaps contains more than 35 different data types from second to daily granularity, totaling more than 71M rows of data. The participants contributed their data through numerous validated surveys, real-time ecological momentary assessments, and a Fitbit Sense smartwatch, and consented to make these data available openly to empower future research. We envision that releasing this large-scale dataset of multi-modal real-world data, will open novel research opportunities and potential applications in the fields of medical digital innovations, data privacy and valorization, mental and physical well-being, psychology and behavioral sciences, machine learning, and human-computer interaction.

    The following instructions will get you started with the LifeSnaps dataset and are complementary to the original publication.

    Data Import: Reading CSV

    For ease of use, we provide CSV files containing Fitbit, SEMA, and survey data at daily and/or hourly granularity. You can read the files via any programming language. For example, in Python, you can read the files into a Pandas DataFrame with the pandas.read_csv() command.

    Data Import: Setting up a MongoDB (Recommended)

    To take full advantage of the LifeSnaps dataset, we recommend that you use the raw, complete data via importing the LifeSnaps MongoDB database.

    To do so, open the terminal/command prompt and run the following command for each collection in the DB. Ensure you have MongoDB Database Tools installed from here.

    For the Fitbit data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c fitbit 

    For the SEMA data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c sema 

    For surveys data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c surveys 

    If you have access control enabled, then you will need to add the --username and --password parameters to the above commands.
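    As a rough illustration of the note on access control, the Python sketch below assembles the argument list for one mongorestore call and appends --username and --password only when credentials are given. The commands above are cut off before the path to each collection's dump file, so dump_path and the file name used in the example are hypothetical placeholders, as is the helper name mongorestore_args. The list is only built here, not executed.

```python
def mongorestore_args(collection, dump_path, username=None, password=None,
                      host="localhost:27017", database="rais_anonymized"):
    """Build the mongorestore argument list for one collection."""
    args = ["mongorestore", "--host", host, "-d", database, "-c", collection]
    # Credentials are only needed when access control is enabled.
    if username is not None:
        args += ["--username", username]
    if password is not None:
        args += ["--password", password]
    # dump_path is a placeholder: the real file location is not shown above.
    args.append(dump_path)
    return args


# Example with made-up credentials and a made-up dump file name.
print(" ".join(mongorestore_args("fitbit", "path/to/fitbit.bson",
                                 username="reader", password="secret")))
```

    The resulting list can be passed to subprocess.run once the real dump file locations are known.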

    Data Availability

    The MongoDB database contains three collections, fitbit, sema, and surveys, containing the Fitbit, SEMA3, and survey data, respectively. Similarly, the CSV files contain related information to these collections. Each document in any collection follows the format shown below:

    {
      _id: 
  17. Plant Species.csv

    • figshare.com
    txt
    Updated May 31, 2023
    Cite
    Malka Halgamuge (2023). Plant Species.csv [Dataset]. http://doi.org/10.6084/m9.figshare.4793326.v1
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Malka Halgamuge
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A dataset of plant species in Glenelg Shire, Australia.

  18. Our World In Data - Dataset - waterdata

    • wbwaterdata.org
    Updated Jul 12, 2020
    Cite
    (2020). Our World In Data - Dataset - waterdata [Dataset]. https://wbwaterdata.org/dataset/our-world-in-data
    Dataset updated
    Jul 12, 2020
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This database collates 3552 development indicators from different studies with data by country and year, including single-year and multiple-year time series. The data is presented as charts. It can be downloaded from the linked project pages/references for each set, and the data behind each graph is available as a CSV file as well as a visual download of the graph (both available via the download link under each chart).

  19. The dataset of the Global Collections survey of natural history collections

    • data.niaid.nih.gov
    Updated Jul 16, 2024
    Cite
    Woodburn, Matt; Corrigan, Robert J.; Drew, Nicholas; Meyer, Cailin; Smith, Vincent S.; Vincent, Sarah (2024). The dataset of the Global Collections survey of natural history collections [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6985398
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Natural History Museum, London
    Smithsonian National Museum of Natural History
    Authors
    Woodburn, Matt; Corrigan, Robert J.; Drew, Nicholas; Meyer, Cailin; Smith, Vincent S.; Vincent, Sarah
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    From 2016 to 2018, we surveyed the world’s largest natural history museum collections to begin mapping this globally distributed scientific infrastructure. The resulting dataset includes 73 institutions across the globe. It has:

    Basic institution data for the 73 contributing institutions, including estimated total collection sizes, geographic locations (to the city) and latitude/longitude, and Research Organization Registry (ROR) identifiers where available.

    Resourcing information, covering the numbers of research, collections and volunteer staff in each institution.

    Indicators of the presence and size of collections within each institution broken down into a grid of 19 collection disciplines and 16 geographic regions.

    Measures of the depth and breadth of individual researcher experience across the same disciplines and geographic regions.

    This dataset contains the data (raw and processed) collected for the survey, and specifications for the schema used to store the data. It includes:

    A diagram of the MySQL database schema.

    A SQL dump of the MySQL database schema, excluding the data.

    A SQL dump of the MySQL database schema with all data. This may be imported into an instance of MySQL Server to create a complete reconstruction of the database.

    Raw data from each database table in CSV format.

    A set of more human-readable views of the data in CSV format. These correspond to the database tables, but foreign keys are replaced with values from the linked tables to make the data easier to read and analyse.

    A text file containing the definitions of the size categories used in the collection_unit table.
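    The relationship between the raw table exports and the human-readable views in the list above can be sketched as a simple lookup join. The Python example below is an assumption-laden illustration, not the survey's own tooling: the column names (institution_id, name, unit_id, size_category), the sample rows and the helper readable_view are all hypothetical.

```python
import csv
import io

# Hypothetical miniature tables; the real schema is documented in the dataset.
institutions_csv = "institution_id,name\n1,Museum A\n2,Museum B\n"
units_csv = "unit_id,institution_id,size_category\n10,1,3\n11,2,1\n"


def readable_view(child_rows, parent_rows, key, label):
    """Replace the foreign key `key` in each child row with the parent's `label`."""
    lookup = {row[key]: row[label] for row in parent_rows}
    view = []
    for row in child_rows:
        resolved = dict(row)
        # Drop the key column and add the looked-up value in its place.
        resolved[label] = lookup.get(resolved.pop(key), "")
        view.append(resolved)
    return view


institutions = list(csv.DictReader(io.StringIO(institutions_csv)))
units = list(csv.DictReader(io.StringIO(units_csv)))
view = readable_view(units, institutions, key="institution_id", label="name")
for row in view:
    print(row)
```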

    The global collections data may also be accessed at https://rebrand.ly/global-collections. This is a preliminary dashboard, constructed and published using Microsoft Power BI, that enables the exploration of the data through a set of visualisations and filters. The dashboard consists of three pages:

    Institutional profile: Enables the selection of a specific institution and provides summary information on the institution and its location, staffing, total collection size, collection breakdown and researcher expertise.

    Overall heatmap: Supports an interactive exploration of the global picture, including a heatmap of collection distribution across the discipline and geographic categories, and visualisations that demonstrate the relative breadth of collections across institutions and correlations between collection size and breadth. Various filters allow the focus to be refined to specific regions and collection sizes.

    Browse: Provides some alternative methods of filtering and visualising the global dataset to look at patterns in the distribution and size of different types of collections across the global view.

  20. All trait data - Datasets - OpenData.eol.org

    • opendata.eol.org
    Updated Jun 5, 2019
    Cite
    eol.org (2019). All trait data - Datasets - OpenData.eol.org [Dataset]. https://opendata.eol.org/dataset/all-trait-data-large
    Dataset updated
    Jun 5, 2019
    Dataset provided by
    Encyclopedia of Life (http://eol.org/)
    Description

    This zip archive contains all of the trait records in EOL's graph database. It contains five .csv files: pages.csv listing taxa and their names, traits.csv with trait records, metadata.csv with auxiliary records referred to by trait records, inferred.csv (see below) and terms.csv listing all of the relationship URIs in the database. For a description of the schema, see https://github.com/EOL/eol_website/blob/master/doc/trait-schema.md

    inferred.csv lists additional taxa to which a trait record applies by taxonomic inference, in addition to the ancestral taxon to which it is attached. For instance, the record describing locomotion=flight for Aves is also inferred to apply to most of the descendants of Aves, except for any flightless subclades that are excluded from the inference pattern. All the trait records referred to in the second column of the inferred file have full records available in the traits file.

    THIS RESOURCE IS UPDATED MONTHLY. It is not archived regularly. Please save your download if you want to be able to refer to it at a later date.
