100+ datasets found
  1. ARC-AGI-CSV-DATA

    • kaggle.com
    zip
    Updated Jun 15, 2024
    Cite
    PUN (2024). ARC-AGI-CSV-DATA [Dataset]. https://www.kaggle.com/datasets/pshikk/arc-agi-csv-data
    Explore at:
    zip (296979 bytes)
    Dataset updated
    Jun 15, 2024
    Authors
    PUN
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    CSV-formatted dataset for the ARC-AGI challenge; the original dataset was distributed in JSON format.

    Both files are formatted as Id, Input, Output. Id contains the task id together with a train or test label and the example's position within the task. Input contains the input; Output contains the output.
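
    A minimal sketch of parsing this layout with Python's built-in csv module; the sample rows, and the assumption that Id joins the task id, split label, and position with underscores, are illustrative:

```python
import csv
import io

# Hypothetical rows in the Id, Input, Output layout described above.
# Id is assumed to join task id, split label, and position with underscores.
sample = io.StringIO(
    "Id,Input,Output\n"
    '007bbfb7_train_0,"[[0,7],[7,7]]","[[0,0],[7,7]]"\n'
    '007bbfb7_test_0,"[[7,0],[0,7]]","[[7,7],[0,0]]"\n'
)

rows = list(csv.DictReader(sample))
# Split the composite Id back into its three assumed parts.
task_id, split, position = rows[0]["Id"].rsplit("_", 2)
print(task_id, split, position)
```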

  2. CSV_File

    • kaggle.com
    zip
    Updated Jan 14, 2024
    + more versions
    Cite
    Ahmad Fakhar (2024). CSV_File [Dataset]. https://www.kaggle.com/datasets/ahmadfakhar/csv-file
    Explore at:
    zip (5386219 bytes)
    Dataset updated
    Jan 14, 2024
    Authors
    Ahmad Fakhar
    Description

    Dataset

    This dataset was created by Ahmad Fakhar

    Contents

  3. Vehicle licensing statistics data files

    • s3.amazonaws.com
    • gov.uk
    Updated May 24, 2022
    Cite
    Department for Transport (2022). Vehicle licensing statistics data files [Dataset]. https://s3.amazonaws.com/thegovernmentsays-files/content/181/1811927.html
    Explore at:
    Dataset updated
    May 24, 2022
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Department for Transport
    Description

    The following datafiles contain detailed information about vehicles in the UK, which would be too large to use as structured tables. They are provided as simple CSV text files that should be easier to use digitally.

    We welcome any feedback on the structure of our new datafiles, their usability, or any suggestions for improvements; please contact vehicles statistics.

    How to use CSV files

    CSV files can be used either as a spreadsheet (using Microsoft Excel or similar spreadsheet packages) or digitally using software packages and languages (for example, R or Python).

    When used as a spreadsheet, there will be no formatting, but the file can still be explored like our publication tables. Due to their size, older software might not be able to open the entire file.
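
    For digital use, reading a file row by row keeps memory use low even for the larger datafiles. A minimal Python sketch using only the standard library; the inline sample stands in for a downloaded datafile:

```python
import csv
import io

# Stand-in for a downloaded datafile such as df_VEH0120_GB.csv
# (the real files carry one count column per quarter).
data = io.StringIO(
    "BodyType,Make,Model,2021Q4,2022Q1\n"
    "Cars,FORD,FIESTA,1500,1480\n"
    "Cars,VAUXHALL,CORSA,1200,1190\n"
)

totals = {}
for row in csv.DictReader(data):  # streams one row at a time
    totals[row["Make"]] = totals.get(row["Make"], 0) + int(row["2022Q1"])
print(totals)
```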

    Download data files

    Make and model by quarter

    df_VEH0120_GB: Vehicles at the end of the quarter by licence status, body type, make, generic model and model: Great Britain (CSV, 37.6 MB). Download: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1077520/df_VEH0120_GB.csv

    Scope: All registered vehicles in Great Britain; from 1994 Quarter 4 (end December)

    Schema: BodyType, Make, GenModel, Model, LicenceStatus, [number of vehicles; one column per quarter]

    df_VEH0120_UK: Vehicles at the end of the quarter by licence status, body type, make, generic model and model: United Kingdom (CSV, 20.8 MB). Download: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1077521/df_VEH0120_UK.csv

    Scope: All registered vehicles in the United Kingdom; from 2014 Quarter 3 (end September)

    Schema: BodyType, Make, GenModel, Model, LicenceStatus, [number of vehicles; one column per quarter]

    df_VEH0160_GB: Vehicles registered for the first time by body type, make, generic model and model: Great Britain (CSV, 17.1 MB). Download: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1077522/df_VEH0160_GB.csv

    Scope: All vehicles registered for the first time in Great Britain; from 2001 Quarter 1 (January to March)

    Schema: BodyType, Make, GenModel, Model, [number of vehicles; one column per quarter]

    df_VEH0160_UK: Vehicles registered for the first time by body type, make, generic model and model: United Kingdom (CSV, 4.93 MB). Download: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1077523/df_VEH0160_UK.csv

    Scope: All vehicles registered for the first time in the United Kingdom; from 2014 Quarter 3 (July to September)

    Schema: BodyType, Make, GenModel, Model, [number of vehicles; one column per quarter]
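
    Since the quarterly files are wide (one count column per quarter), reshaping them to long format can simplify analysis. A sketch with hypothetical values; the real files carry many more quarter columns:

```python
import csv
import io

# Stand-in for a quarterly file following the schema described above.
wide = io.StringIO(
    "BodyType,Make,GenModel,Model,LicenceStatus,2021Q4,2022Q1\n"
    "Cars,FORD,FIESTA,FIESTA ZETEC,Licensed,900,880\n"
)

long_rows = []
keys = ("BodyType", "Make", "GenModel", "Model", "LicenceStatus")
for row in csv.DictReader(wide):
    base = {k: row[k] for k in keys}
    for col, value in row.items():
        if col not in keys:  # remaining columns are quarters
            long_rows.append({**base, "Quarter": col, "Vehicles": int(value)})
print(long_rows)
```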

    Make and model by age

    df_VEH0124: Vehicles at the end of the quarter by licence status, body type, make, generic model, model, year of first use and year of manufacture: United Kingdom (CSV, 28.2 MB). Download: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1077524/df_VEH0124.csv

    Scope: All licensed vehicles in the United Kingdom; 2021 Quarter 4 (end December) only

    Schema: BodyType, Make, GenModel, Model, YearFirstUsed, YearManufacture, Licensed (number of vehicles), SORN (number of vehicles)

    Make and model by engine size

    df_VEH0220: …

  4. Data from: LifeSnaps: a 4-month multi-modal dataset capturing unobtrusive...

    • zenodo.org
    • data.europa.eu
    zip
    Updated Oct 20, 2022
    + more versions
    Cite
    Sofia Yfantidou; Christina Karagianni; Stefanos Efstathiou; Athena Vakali; Joao Palotti; Dimitrios Panteleimon Giakatos; Thomas Marchioro; Andrei Kazlouski; Elena Ferrari; Šarūnas Girdzijauskas (2022). LifeSnaps: a 4-month multi-modal dataset capturing unobtrusive snapshots of our lives in the wild [Dataset]. http://doi.org/10.5281/zenodo.6832242
    Explore at:
    zip
    Dataset updated
    Oct 20, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sofia Yfantidou; Christina Karagianni; Stefanos Efstathiou; Athena Vakali; Joao Palotti; Dimitrios Panteleimon Giakatos; Thomas Marchioro; Andrei Kazlouski; Elena Ferrari; Šarūnas Girdzijauskas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    LifeSnaps Dataset Documentation

    Ubiquitous self-tracking technologies have penetrated various aspects of our lives, from physical and mental health monitoring to fitness and entertainment. Yet, limited data exist on the association between in-the-wild large-scale physical activity patterns, sleep, stress, and overall health, and behavioral patterns and psychological measurements due to challenges in collecting and releasing such datasets, such as waning user engagement, privacy considerations, and diversity in data modalities. In this paper, we present the LifeSnaps dataset, a multi-modal, longitudinal, and geographically-distributed dataset, containing a plethora of anthropological data, collected unobtrusively for the total course of more than 4 months by n=71 participants, under the European H2020 RAIS project. LifeSnaps contains more than 35 different data types from second to daily granularity, totaling more than 71M rows of data. The participants contributed their data through numerous validated surveys, real-time ecological momentary assessments, and a Fitbit Sense smartwatch, and consented to make these data available openly to empower future research. We envision that releasing this large-scale dataset of multi-modal real-world data will open novel research opportunities and potential applications in the fields of medical digital innovations, data privacy and valorization, mental and physical well-being, psychology and behavioral sciences, machine learning, and human-computer interaction.

    The following instructions will get you started with the LifeSnaps dataset and are complementary to the original publication.

    Data Import: Reading CSV

    For ease of use, we provide CSV files containing Fitbit, SEMA, and survey data at daily and/or hourly granularity. You can read the files via any programming language. For example, in Python, you can read the files into a Pandas DataFrame with the pandas.read_csv() command.
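
    A minimal sketch of the pandas route mentioned above; the columns and rows here are placeholders, not the dataset's actual schema:

```python
import io

import pandas as pd

# Stand-in for one of the daily-granularity CSV exports; the column
# names are illustrative, not the dataset's actual schema.
csv_text = io.StringIO(
    "id,date,steps\n"
    "p01,2021-05-24,8421\n"
    "p01,2021-05-25,10233\n"
)

# pandas.read_csv accepts a path or any file-like object.
df = pd.read_csv(csv_text, parse_dates=["date"])
print(df.shape)
```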

    Data Import: Setting up a MongoDB (Recommended)

    To take full advantage of the LifeSnaps dataset, we recommend that you use the raw, complete data via importing the LifeSnaps MongoDB database.

    To do so, open the terminal/command prompt and run the following command for each collection in the DB. Ensure you have the MongoDB Database Tools (which provide mongorestore) installed.

    For the Fitbit data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c fitbit <path/to/fitbit.bson>

    For the SEMA data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c sema <path/to/sema.bson>

    For surveys data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c surveys <path/to/surveys.bson>

    If you have access control enabled, then you will need to add the --username and --password parameters to the above commands.

    Data Availability

    The MongoDB database contains three collections, fitbit, sema, and surveys, containing the Fitbit, SEMA3, and survey data, respectively. Similarly, the CSV files contain related information to these collections. Each document in any collection follows the format shown below:

    {
      _id: 
  5. Data from: Tidal Daily Discharge and Quality Assurance Data Supporting an...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Nov 18, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Tidal Daily Discharge and Quality Assurance Data Supporting an Assessment of Water Quality and Discharge in the Herring River, Wellfleet, Massachusetts, November 2015–September 2017 [Dataset]. https://catalog.data.gov/dataset/tidal-daily-discharge-and-quality-assurance-data-supporting-an-assessment-of-water-quality
    Explore at:
    Dataset updated
    Nov 18, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Wellfleet, Herring River, Massachusetts
    Description

    This data release provides data in support of an assessment of water quality and discharge in the Herring River at the Chequessett Neck Road dike in Wellfleet, Massachusetts, from November 2015 to September 2017. The assessment was a cooperative project among the U.S. Geological Survey, National Park Service, Cape Cod National Seashore, and the Friends of Herring River to characterize environmental conditions prior to a future removal of the dike. It is described in the U.S. Geological Survey (USGS) Scientific Investigations Report "Assessment of Water Quality and Discharge in the Herring River, Wellfleet, Massachusetts, November 2015 – September 2017."

    This data release is structured as a set of comma-separated values (CSV) files, each of which contains information on the data source (or laboratory used for analysis), USGS site identification (ID) number, beginning date and time of observation or sampling, ending date and time of observation or sampling, and data such as flow rate and analytical results. The CSV files include calculated tidal daily flows (Flood_Tide_Tidal_Day.csv and Ebb_Tide_Tidal_Day.csv) that were used in Huntington and others (2020) for estimation of nutrient loads. Tidal daily flows are the estimated mean daily discharges for two consecutive flood and ebb tide cycles (average duration: 24 hours, 48 minutes). The associated date is the day on which most of the flow occurred.

    CSV files contain quality assurance data for water-quality samples including blanks (Blanks.csv), replicates (Replicates.csv), standard reference materials (Standard_Reference_Material.csv), and atmospheric ammonium contamination (NH4_Atmospheric_Contamination.csv). One CSV file (EWI_vs_ISCO.csv) contains data comparing composite samples collected by an automatic sampler (ISCO) at a fixed point with depth-integrated samples collected at equal width increments (EWI). One CSV file (Cross_Section_Field_Parameters.csv) contains field parameter data (specific conductance, temperature, pH, and dissolved oxygen) collected at a fixed location and data collected along the cross sections at variable water depths and horizontal distances across the openings of the culverts at the Chequessett Neck Road dike. One CSV file (LOADEST_Bias_Statistics.csv) contains data that include estimated natural log of load, model residuals, Z-scores, and seasonal model residuals for winter (December, January, and February); spring (March, April, and May); summer (June, July, and August); and fall (September, October, and November).

    The data release also includes a data dictionary (Data_Dictionary.csv) that provides detailed descriptions of each field in each CSV file, including: data filename; laboratory or data source; U.S. Geological Survey site ID numbers; data types; constituent (analyte) U.S. Geological Survey parameter codes; descriptions of parameters; units; methods; minimum reporting limits; limits of quantitation, if appropriate; method reference citations; and minimum, maximum, median, and average values for each analyte. The data release also includes an abbreviations file (Abbreviations.pdf) that defines all the abbreviations in the data dictionary and CSV files. Note that the USGS site ID includes a leading zero (011058798) and some of the parameter codes contain leading zeros, so care must be taken when opening and subsequently saving these files in other formats where leading zeros may be dropped.

  6. Types, open citations, closed citations, publishers, and participation...

    • data-staging.niaid.nih.gov
    • data.niaid.nih.gov
    • +1more
    Updated Jan 24, 2020
    Cite
    Heibi, Ivan; Peroni, Silvio; Shotton, David (2020). Types, open citations, closed citations, publishers, and participation reports of Crossref entities [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_2558257
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Digital Humanities Advanced Research Centre, Department of Computer Science and Engineering, University of Bologna, Bologna, Italy
    Oxford e-Research Centre, University of Oxford, Oxford, United Kingdom
    Authors
    Heibi, Ivan; Peroni, Silvio; Shotton, David
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This publication contains several datasets that have been used in the paper "Crowdsourcing open citations with CROCI – An analysis of the current status of open citations, and a proposal" submitted to the 17th International Conference on Scientometrics and Bibliometrics (ISSI 2019), available at https://opencitations.wordpress.com/2019/02/07/crowdsourcing-open-citations-with-croci/.

    Additional information about the analyses described in the paper, including the code and the data we have used to compute all the figures, is available as a Jupyter notebook at https://github.com/sosgang/pushing-open-citations-issi2019/blob/master/script/croci_nb.ipynb. The datasets contain the following information.

    non_open.zip: it is a zipped CSV file (~5 GB unzipped) containing the numbers of open citations and closed citations received by the entities in the Crossref dump used in our computation, dated October 2018. All the entity types retrieved from Crossref were aligned to one of the following five categories: journal, book, proceedings, dataset, other. The open CC0 citation data we used came from the CSV dump of the most recent release of COCI, dated 12 November 2018. The number of closed citations was calculated by subtracting the number of open citations to each entity available within COCI from the value “is-referenced-by-count” available in the Crossref metadata for that particular cited entity, which reports all the DOI-to-DOI citation links that point to the cited entity from within the whole Crossref database (including those present in the Crossref ‘closed’ dataset).

    The columns of the CSV file are the following ones:

    doi: the DOI of the publication in Crossref;

    type: the type of the publication as indicated in Crossref;

    cited_by: the number of open citations received by the publication according to COCI;

    non_open: the number of closed citations received by the publication according to Crossref + COCI.
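
    The closed-citation calculation described above is a straightforward subtraction, sketched here with hypothetical counts:

```python
# Closed citations per entity, as described above: Crossref's
# "is-referenced-by-count" minus the open citations found in COCI.
def closed_citations(is_referenced_by_count, coci_open_citations):
    return is_referenced_by_count - coci_open_citations


# Hypothetical entity: 120 citing DOIs in Crossref, 75 of them open in COCI.
print(closed_citations(120, 75))  # 45
```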

    croci_types.csv: it is a CSV file that contains the numbers of open citations and closed citations received by the entities in the Crossref dump used in our computation, as collected in the previous CSV file, aligned in five classes depending on the entity types retrieved from Crossref: journal (Crossref types: journal-article, journal-issue, journal-volume, journal), book (Crossref types: book, book-chapter, book-section, monograph, book track, book-part, book-set, reference-book, dissertation, book series, edited book), proceedings (Crossref types: proceedings-article, proceedings, proceedings-series), dataset (Crossref types: dataset), other (Crossref types: other, report, peer review, reference-entry, component, report-series, standard, posted-content, standard-series).

    The columns of the CSV file are the following ones:

    type: the type of publication, one of "journal", "book", "proceedings", "dataset", "other";

    label: the label assigned to the type for visualisation purposes;

    coci_open_cit: the number of open citations received by the publication type according to COCI;

    crossref_close_cit: the number of closed citations received by the publication according to Crossref + COCI.

    publishers_cits.csv: it is a CSV file that contains the top twenty publishers that received the greatest number of open citations. The columns of the CSV file are the following ones:

    publisher: the name of the publisher;

    doi_prefix: the list of DOI prefixes used by the publisher;

    coci_open_cit: the number of open citations received by the publications of the publisher according to COCI;

    crossref_close_cit: the number of closed citations received by the publications of the publishers according to Crossref + COCI;

    total_cit: the total number of citations received by the publications of the publisher (= coci_open_cit + crossref_close_cit).

    20publishers_cr.csv: it is a CSV file that contains the numbers of the contributions to open citations made by the twenty publishers introduced in the previous CSV file as of 24 January 2018, according to the data available through the Crossref API. The counts listed in this file refer to the number of publications for which each publisher has submitted metadata to Crossref that include the publication’s reference list. The categories 'closed', 'limited' and 'open' refer to publications for which the reference lists are not visible to anyone outside the Crossref Cited-by membership, are visible only to them and to Crossref Metadata Plus members, or are visible to all, respectively. In addition, the file also records the total number of publications for which the publisher has submitted metadata to Crossref, whether or not those metadata include the reference lists of those publications.

    The columns of the CSV file are the following ones:

    publisher: the name of the publisher;

    open: the number of publications in Crossref with 'open' visibility for their reference lists;

    limited: the number of publications in Crossref with 'limited' visibility for their reference lists;

    closed: the number of publications in Crossref with 'closed' visibility for their reference lists;

    overall_deposited: the overall number of publications for which the publisher has submitted metadata to Crossref.

  7. Rainfall, Volumetric Soil-Water Content, Video, and Geophone Data from the...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Nov 13, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Rainfall, Volumetric Soil-Water Content, Video, and Geophone Data from the Hermits Peak-Calf Canyon Fire Burn Area, New Mexico, June 2022 to June 2024 [Dataset]. https://catalog.data.gov/dataset/rainfall-volumetric-soil-water-content-video-and-geophone-data-from-the-hermits-peak-calf-
    Explore at:
    Dataset updated
    Nov 13, 2025
    Dataset provided by
    U.S. Geological Survey
    Area covered
    Calf Canyon, Hermit Peak
    Description

    Precipitation, volumetric soil-water content, videos, and geophone data characterizing postfire debris flows were collected at the 2022 Hermit’s Peak Calf-Canyon Fire in New Mexico. This dataset contains data from June 22, 2022, to June 26, 2024. The data were obtained from a station located at 35° 42’ 28.86” N, 105° 27’ 18.03” W (geographic coordinate system). Each data type is described below.

    Raw Rainfall Data: Rainfall data, Rainfall.csv, are contained in a comma separated value (.csv) file. The data are continuous and sampled at 1-minute intervals. The columns in the csv file are TIMESTAMP (UTC), RainSlowInt (the depth of rain in each minute [mm]), CumRain (cumulative rainfall since the beginning of the record [mm]), and VWC# (volumetric water content [V/V]) at three depths (1 = 10 cm, 2 = 30 cm, and 3 = 50 cm). VWC values outside of the range of 0 to 0.5 represent sensor malfunctions and were replaced with -99999.

    Storm Record: We summarized the rainfall, volumetric soil-water content, and geophone data based on rainstorms. We defined a storm as rain for a duration >= 5 minutes or with an accumulation > 2.54 mm. Each storm was then assigned a storm ID starting at 0. The storm record data, StormRecord.csv, provides peak rainfall intensities and times and volumetric soil-water content information for each storm.
    The columns from left to right are: ID; StormStart [yyyy-mm-dd hh:mm:ss-tz]; StormStop [yyyy-mm-dd hh:mm:ss-tz]; StormDepth [mm]; StormDuration [h]; I-5, I-10, I-15, I-30, and I-60 [mm h-1] (the peak 5-, 10-, 15-, 30-, and 60-minute rainfall intensities); I-5, I-10, I-15, I-30, and I-60 time [yyyy-mm-dd hh:mm:ss-tz] (UTC, the times of the corresponding peak rainfall intensities); VWC (volumetric water content [V/V] at three depths (1 = 10 cm, 2 = 30 cm, 3 = 50 cm) at the start of the storm, at the time of the peak 15-minute rainfall intensity, and at the end of the storm); Velocity [m s-1] of the flow; and Event (qualitative observation of the type of flow from video footage). VWC values outside of the range of 0 to 0.5 represent sensor malfunctions and were replaced with -99999. Velocity was only calculated for flows with a noticeable surge, as the rest of the signal is not sufficient for a cross-correlation, and Event was only filled for storms with quality video data. Values of -99999 were assigned for these columns for all other storms.

    Geophone Data: Geophone data, GeophoneData.zip, are contained in comma separated value (.csv) files labeled by ‘storm’ and the corresponding storm ID in the storm record, and labeled IDa and IDb if the geophone stopped recording for more than an hour during the storm. The data were recorded at two geophones sampled at 50 Hz, one 11.5 m upstream from the station and one 9.75 m downstream from the station. Geophones were triggered to record when 1.6 mm of rain was detected during a period of 10 minutes, and they continued to record for 30 minutes past the last timestamp when this criterion was met. The columns in each csv file are TIMESTAMP [UTC], GeophoneUp_mV (the upstream geophone [mV]), and GeophoneDn_mV (the downstream geophone [mV]).
    Note that there are occasional missed samples, when data points are 0.04 s or more apart, where the data logger did not record due to geophone malfunction.

    Videos: The videos stormID_mmdd.mp4 (or .mov) are organized by storm ID, where one folder contains data for one storm. Within the folder for each storm, videos are labeled by the timestamp in UTC of the end of the video as IMGPhhmm. Some videos in the early mornings or late evenings, or in very intense rainfall, have had brightness and contrast adjustments in Adobe Premiere Pro for better video quality and are in MP4 format. All raw videos are in MOV format. The camera triggered when a minimum of 1.6 mm of rain fell in a 10-minute interval, and it recorded in 16-minute video clips until 30 minutes had passed since the last trigger. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.
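
    When parsing Rainfall.csv, the -99999 sentinel marking VWC sensor malfunctions is best converted to a missing value before analysis. A sketch with made-up rows; the exact VWC column names (VWC1, VWC2, VWC3) are an assumption:

```python
import csv
import io
import math

# Stand-in rows following the Rainfall.csv columns described above;
# the VWC column names are assumed, not taken from the actual file.
data = io.StringIO(
    "TIMESTAMP,RainSlowInt,CumRain,VWC1,VWC2,VWC3\n"
    "2022-06-22 00:00:00,0.0,0.0,0.21,0.25,0.30\n"
    "2022-06-22 00:01:00,0.5,0.5,-99999,0.25,0.30\n"
)

rows = []
for row in csv.DictReader(data):
    for key in ("VWC1", "VWC2", "VWC3"):
        value = float(row[key])
        # Values outside 0-0.5 mark sensor malfunctions (stored as -99999).
        row[key] = value if 0 <= value <= 0.5 else math.nan
    rows.append(row)
print(rows[1]["VWC1"])
```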

  8. USDA White Mountain National Forest Volume 1 (2014 - 2024) | gimi9.com

    • gimi9.com
    + more versions
    Cite
    USDA White Mountain National Forest Volume 1 (2014 - 2024) | gimi9.com [Dataset]. https://gimi9.com/dataset/data-gov_usda-white-mountain-national-forest-volume-1-2014-2024
    Explore at:
    Area covered
    White Mountains
    Description

    This volume's release consists of 325099 media files captured by autonomous wildlife monitoring devices under the project USDA White Mountain National Forest. The attached files listed below include several CSV files that provide information about the data release. The file "media.csv" provides the metadata about the media, such as filename and date/time of capture. The actual media files are housed within folders under the volume's "child items" as compressed files. A critical CSV file is "dictionary.csv", which describes each CSV file, including field names, data types, descriptions, and the relationship of each field to fields in other CSV files. Some of the media files may have been "tagged" or "annotated" by either humans or by machine learning models, identifying wildlife targets within the media. If so, this information is stored in "annotations.csv" and "modeloutputs.csv", respectively. To protect privacy, all personally identifiable information (PII) has been removed, locations have been "blurred" by bounding boxes, and media featuring sensitive taxa or humans have been omitted. To enhance data reuse, the sbRehydrate() function in the AMMonitor R package will download files and re-create the original AMMonitor project (database + media files). See source code at https://code.usgs.gov/vtcfwru/ammonitor.

  9. Massachusetts Wildlife Monitoring Project (2022 - 2024) | gimi9.com

    • gimi9.com
    + more versions
    Cite
    Massachusetts Wildlife Monitoring Project (2022 - 2024) | gimi9.com [Dataset]. https://gimi9.com/dataset/data-gov_massachusetts-wildlife-monitoring-project-2022-2024/
    Explore at:
    Area covered
    Massachusetts
    Description

    This volume's release consists of 143321 media files captured by autonomous wildlife monitoring devices under the project Massachusetts Wildlife Monitoring Project. The attached files listed below include several CSV files that provide information about the data release. The file "media.csv" provides the metadata about the media, such as filename and date/time of capture. The actual media files are housed within folders under the volume's "child items" as compressed files. A critical CSV file is "dictionary.csv", which describes each CSV file, including field names, data types, descriptions, and the relationship of each field to fields in other CSV files. Some of the media files may have been "tagged" or "annotated" by either humans or by machine learning models, identifying wildlife targets within the media. If so, this information is stored in "annotations.csv" and "modeloutputs.csv", respectively. To protect privacy, all personally identifiable information (PII) has been removed, locations have been "blurred" by bounding boxes, and media featuring sensitive taxa or humans have been omitted. To enhance data reuse, the sbRehydrate() function in the AMMonitor R package will download files and re-create the original AMMonitor project (database + media files). See source code at https://code.usgs.gov/vtcfwru/ammonitor.

  10. Anion Data for the East River Watershed, Colorado (2014-2022)

    • knb.ecoinformatics.org
    • data.nceas.ucsb.edu
    • +5more
    Updated Feb 1, 2023
    Cite
    Kenneth Williams; Curtis Beutler; Wendy Brown; Alexander Newman; Dylan O'Ryan; Roelof Versteeg (2023). Anion Data for the East River Watershed, Colorado (2014-2022) [Dataset]. http://doi.org/10.15485/1668054
    Explore at:
    Dataset updated
    Feb 1, 2023
    Dataset provided by
    ESS-DIVE
    Authors
    Kenneth Williams; Curtis Beutler; Wendy Brown; Alexander Newman; Dylan O'Ryan; Roelof Versteeg
    Time period covered
    May 2, 2014 - Mar 14, 2022
    Area covered
    Description

    The anion data for the East River Watershed, Colorado, consists of fluoride, chloride, sulfate, nitrate, and phosphate concentrations collected at multiple long-term monitoring sites that include stream, groundwater, and spring sampling locations. These locations represent important and/or unique end-member locations for which solute concentrations can be diagnostic of the connection between terrestrial and aquatic systems. Such locations include drainages underlain entirely or largely by shale bedrock, land cover dominated by conifers, aspens, or meadows, and drainages impacted by historic mining activity and the presence of naturally mineralized rock. Developing a long-term record of solute concentrations from a diversity of environments is a critical component of quantifying the impacts of both climate change and discrete climate perturbations, such as drought, forest mortality, and wildfire, on the riverine export of multiple anionic species. Such data may be combined with stream gauging stations co-located at each monitoring site to directly quantify the seasonal and annual mass flux of these anionic species out of the watershed. This data package contains (1) a zip file (anion_data_2014-2022.zip) containing a total of 345 data files of anion data from across the Lawrence Berkeley National Laboratory (LBNL) Watershed Function Scientific Focus Area (SFA), which is reported in .csv files per location; (2) a file-level metadata (flmd.csv) file that lists each file contained in the dataset with associated metadata; and (3) a data dictionary (dd.csv) file that contains terms/column_headers used throughout the files along with a definition, units, and data type.
    Update on 6/10/2022: versioned updates to this dataset were made along with these changes: (1) updated anion data for all locations up to 2021-12-31, (2) removal of units from column headers in datafiles, (3) added a row underneath the headers to contain the units of the variables, (4) restructuring of units to comply with CSV reporting format requirements, and (5) addition of the file-level metadata (flmd.csv) and data dictionary (dd.csv) files to comply with the File-Level Metadata Reporting Format.

    Update on 2022-09-09: Updates were made to the reporting format specific files (file-level metadata and data dictionary) to correct swapped file names, add additional details to the metadata descriptions in both files, add a header_row column to enable parsing, and add a version number and date to the file names (v2_20220909_flmd.csv and v2_20220909_dd.csv).

    Update on 2022-12-20: Updates were made to both the data files and the reporting format specific files. Conversion issues affecting ER-PLM locations in the anion data were resolved. Additionally, the flmd and dd files were updated to reflect the updated versions of these files. Available data were added up until 2022-03-14.
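
    Because the updated data files carry a units row beneath the column headers, a reader needs to capture or skip that row. A sketch; the columns and units shown are illustrative, not the dataset's actual headers:

```python
import csv
import io

# Stand-in for a per-location anion file with a units row under the header.
data = io.StringIO(
    "DateTime,Chloride,Sulfate\n"
    "YYYY-MM-DD,mg/L,mg/L\n"
    "2021-06-01,1.8,12.4\n"
)

reader = csv.reader(data)
header = next(reader)
units = dict(zip(header, next(reader)))  # capture the units row
records = [dict(zip(header, row)) for row in reader]
print(units["Sulfate"], records[0]["Chloride"])
```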

  11. List of File Extensions and Descriptions

    • kaggle.com
    zip
    Updated Jul 14, 2024
    Cite
    Luis Vinatea (2024). List of File Extensions and Descriptions [Dataset]. https://www.kaggle.com/datasets/luisvinateabarberena/file-extensions
    Explore at:
    zip (5666 bytes). Available download formats
    Dataset updated
    Jul 14, 2024
    Authors
    Luis Vinatea
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Introduction

    This dataset provides a comprehensive list of 567 file extensions along with their descriptions, meticulously scraped from a Wikipedia page. It serves as a valuable resource for developers, researchers, and anyone interested in understanding various file types and their purposes.

    Content

    The dataset contains the following columns:
    - File Extension: The extension of the file (e.g., .txt, .jpg).
    - Description: A brief description of what the file extension is used for.

    Usage

    This dataset can be used for various purposes, including:
    - Building applications that need to recognize and handle different file types.
    - Educating and training individuals on file extensions and their uses.
    - Conducting research on file formats and their prevalence in different domains.

    Keywords

    File Extensions, Data Description, CSV, Web Scraping, Beautiful Soup, Wikipedia, Data Analysis, Development, Research

    File Extensions CSV Preview

    File Extension | Description
    .txt | Plain text file
    .jpg | JPEG image file
    .pdf | Portable Document Format file
    .doc | Microsoft Word document file
    .xlsx | Microsoft Excel spreadsheet file
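A natural use of this dataset is a lookup table from extension to description. The sketch below builds one from a small illustrative excerpt; the column names match the dataset's documented schema, but the rows here are just the preview values.

```python
import csv
import io

# Illustrative excerpt using the dataset's two documented columns.
sample = """File Extension,Description
.txt,Plain text file
.jpg,JPEG image file
.pdf,Portable Document Format file
"""

def load_extension_map(text):
    """Map lowercase file extensions to their descriptions."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["File Extension"].lower(): row["Description"] for row in reader}

ext_map = load_extension_map(sample)
print(ext_map[".pdf"])  # Portable Document Format file
```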
  12. DIAMAS survey on Institutional Publishing - aggregated data

    • data-staging.niaid.nih.gov
    • nde-dev.biothings.io
    • +3more
    Updated Mar 13, 2025
    Cite
    Kramer, Bianca; Ross, George (2025). DIAMAS survey on Institutional Publishing - aggregated data [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_10590502
    Explore at:
    Dataset updated
    Mar 13, 2025
    Dataset provided by
    Sesame Open Science
    Jisc
    Authors
    Kramer, Bianca; Ross, George
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The DIAMAS project investigates Institutional Publishing Service Providers (IPSP) in the broadest sense, with a special focus on those publishing initiatives that do not charge fees to authors or readers. To collect information on Institutional Publishing in the ERA, a survey was conducted among IPSPs between March and May 2024. This dataset contains aggregated data from the 685 valid responses to the DIAMAS survey on Institutional Publishing.

    The dataset supplements D2.3 Final IPSP landscape Report Institutional Publishing in the ERA: results from the DIAMAS survey.

    The data

    Basic aggregate tabular data

    Full individual survey responses are not being shared, to prevent the easy identification of respondents (in line with conditions set out in the survey questionnaire). This dataset contains full tables with aggregate data for all questions from the survey, with the exception of free-text responses, from all 685 survey respondents. This includes, per question, overall totals and percentages for the answers given, as well as the breakdown by both IPSP types: institutional publishers (IPs) and service providers (SPs). Tables at country level have not been shared, as cell values often turned out to be too low to prevent potential identification of respondents. The data is available in csv and docx formats, with csv files grouped and packaged into ZIP files. Metadata describing data type, question type, and question response rate is available in csv format. The R code used to generate the aggregate tables is made available as well.

    Files included in this dataset

    survey_questions_data_description.csv - metadata describing data type, question type, as well as question response rate per survey question.

    tables_raw_all.zip - raw tables (csv format) with aggregated data per question for all respondents, with the exception of free-text responses. Questions with multiple answers have a table for each answer option. Zip file contains 180 csv files.

    tables_raw_IP.zip - as tables_raw_all.zip, for responses from institutional publishers (IP) only. Zip file contains 180 csv files.

    tables_raw_SP.zip - as tables_raw_all.zip, for responses from service providers (SP) only. Zip file contains 170 csv files.

    tables_formatted_all.docx - formatted tables (docx format) with aggregated data per question for all respondents, with the exception of free-text responses. Questions with multiple answers have a table for each answer option.

    tables_formatted_IP.docx - as tables_formatted_all.docx, for responses from institutional publishers (IP) only.

    tables_formatted_SP.docx - as tables_formatted_all.docx, for responses from service providers (SP) only.

    DIAMAS_Tables_single.R - R script used to generate raw tables with aggregated data for all single response questions

    DIAMAS_Tables_multiple.R - R script used to generate raw tables with aggregated data for all multiple response questions

    DIAMAS_Tables_layout.R - R script used to generate document with formatted tables from raw tables with aggregated data

    DIAMAS Survey on Institutional Publishing - data availability statement (pdf)

    All data are made available under a CC0 license.
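Since the raw aggregate tables are CSV files packaged inside ZIP archives (e.g., tables_raw_all.zip with 180 files), they can be read without unpacking to disk. The sketch below builds a tiny in-memory stand-in archive; the file name and columns (answer/count/percent) are illustrative assumptions, not the survey's actual schema.

```python
import csv
import io
import zipfile

# Build a tiny in-memory stand-in for tables_raw_all.zip.
# "Q1.csv" and its columns are hypothetical placeholders.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("Q1.csv", "answer,count,percent\nYes,400,58.4\nNo,285,41.6\n")

def read_tables(zip_bytes):
    """Read every CSV inside a zip archive into a dict of row lists."""
    tables = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            if name.endswith(".csv"):
                text = zf.read(name).decode("utf-8")
                tables[name] = list(csv.DictReader(io.StringIO(text)))
    return tables

tables = read_tables(buf.getvalue())
print(tables["Q1.csv"][0]["count"])  # 400
```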

  13. Data from: Maine Department of Inland Fisheries and Wildlife Moose Project -...

    • catalog.data.gov
    • data.usgs.gov
    • +2more
    Updated Nov 26, 2025
    Cite
    U.S. Geological Survey (2025). Maine Department of Inland Fisheries and Wildlife Moose Project - Volume 2 (2021 - 2024) [Dataset]. https://catalog.data.gov/dataset/maine-department-of-inland-fisheries-and-wildlife-moose-project-volume-2-2021-2024
    Explore at:
    Dataset updated
    Nov 26, 2025
    Dataset provided by
    U.S. Geological Survey
    Area covered
    Maine
    Description

    This volume's release consists of 320,104 media files captured by autonomous wildlife monitoring devices under the project, Maine Department of Inland Fisheries and Wildlife. The attached files listed below include several CSV files that provide information about the data release. The file "media.csv" provides the metadata about the media, such as filename and date/time of capture. The actual media files are housed within folders under the volume's "child items" as compressed files. A critical CSV file is "dictionary.csv", which describes each CSV file, including field names, data types, descriptions, and the relationship of each field to fields in other CSV files. Some of the media files may have been "tagged" or "annotated" by either humans or by machine learning models, identifying wildlife targets within the media. If so, this information is stored in "annotations.csv" and "modeloutputs.csv", respectively. To protect privacy, all personally identifiable information (PII) has been removed, locations have been "blurred" by bounding boxes, and media featuring sensitive taxa or humans have been omitted. To enhance data reuse, the sbRehydrate() function in the AMMonitor R package will download files and re-create the original AMMonitor project (database + media files). See source code at https://code.usgs.gov/vtcfwru/ammonitor.
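The media.csv / annotations.csv relationship described above is a simple key join. This is a minimal sketch with hypothetical rows and a hypothetical key column (mediaID); the real field names and relationships are defined in dictionary.csv.

```python
import csv
import io

# Illustrative rows; the real column names are defined in dictionary.csv.
media_csv = """mediaID,filename,timestamp
m1,IMG_0001.JPG,2022-05-01T06:12
m2,IMG_0002.JPG,2022-05-01T06:15
"""
annotations_csv = """mediaID,label
m1,moose
m2,white-tailed deer
"""

def index_by(rows_text, key):
    """Index CSV rows by the value of one key column."""
    return {r[key]: r for r in csv.DictReader(io.StringIO(rows_text))}

media = index_by(media_csv, "mediaID")
labels = {k: v["label"] for k, v in index_by(annotations_csv, "mediaID").items()}

# Join: filename -> annotated label, for media that have annotations.
tagged = {media[k]["filename"]: labels[k] for k in media if k in labels}
print(tagged["IMG_0001.JPG"])  # moose
```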

  14. Data from: Indiana Dunes National Park Volume 1 (2019)

    • catalog.data.gov
    • data.usgs.gov
    Updated Nov 21, 2025
    Cite
    U.S. Geological Survey (2025). Indiana Dunes National Park Volume 1 (2019) [Dataset]. https://catalog.data.gov/dataset/indiana-dunes-national-park-volume-1-2019
    Explore at:
    Dataset updated
    Nov 21, 2025
    Dataset provided by
    United States Geological Survey: http://www.usgs.gov/
    Area covered
    Indiana
    Description

    This volume's release consists of 26,141 media files captured by autonomous wildlife monitoring devices under the project, Indiana Dunes National Park. The attached files listed below include several CSV files that provide information about the data release. The file "media.csv" provides the metadata about the media, such as filename and date/time of capture. The actual media files are housed within folders under the volume's "child items" as compressed files. A critical CSV file is "dictionary.csv", which describes each CSV file, including field names, data types, descriptions, and the relationship of each field to fields in other CSV files. Some of the media files may have been "tagged" or "annotated" by either humans or by machine learning models, identifying wildlife targets within the media. If so, this information is stored in "annotations.csv" and "modeloutputs.csv", respectively. To protect privacy, all personally identifiable information (PII) has been removed, locations have been "blurred" by bounding boxes, and media featuring sensitive taxa or humans have been omitted. To enhance data reuse, the sbRehydrate() function in the AMMonitor R package will download files and re-create the original AMMonitor project (database + media files). See source code at https://code.usgs.gov/vtcfwru/ammonitor.

  15. Data from: Maine Department of Inland Fisheries and Wildlife Volume 1 (2022...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Nov 26, 2025
    Cite
    U.S. Geological Survey (2025). Maine Department of Inland Fisheries and Wildlife Volume 1 (2022 - 2023) [Dataset]. https://catalog.data.gov/dataset/maine-department-of-inland-fisheries-and-wildlife-volume-1-2022-2023
    Explore at:
    Dataset updated
    Nov 26, 2025
    Dataset provided by
    U.S. Geological Survey
    Area covered
    Maine
    Description

    This volume's release consists of 64,642 media files captured by autonomous wildlife monitoring devices under the project, Maine Department of Inland Fisheries and Wildlife. The attached files listed below include several CSV files that provide information about the data release. The file "media.csv" provides the metadata about the media, such as filename and date/time of capture. The actual media files are housed within folders under the volume's "child items" as compressed files. A critical CSV file is "dictionary.csv", which describes each CSV file, including field names, data types, descriptions, and the relationship of each field to fields in other CSV files. Some of the media files may have been "tagged" or "annotated" by either humans or by machine learning models, identifying wildlife targets within the media. If so, this information is stored in "annotations.csv" and "modeloutputs.csv", respectively. To protect privacy, all personally identifiable information (PII) has been removed, locations have been "blurred" by bounding boxes, and media featuring sensitive taxa or humans have been omitted. To enhance data reuse, the sbRehydrate() function in the AMMonitor R package will download files and re-create the original AMMonitor project (database + media files). See source code at https://code.usgs.gov/vtcfwru/ammonitor.

  16. New Hampshire Fish and Game Department Volume 1 (2014 - 2024)

    • catalog.data.gov
    • data.usgs.gov
    • +2more
    Updated Nov 27, 2025
    Cite
    U.S. Geological Survey (2025). New Hampshire Fish and Game Department Volume 1 (2014 - 2024) [Dataset]. https://catalog.data.gov/dataset/new-hampshire-fish-and-game-department-volume-1-2014-2024
    Explore at:
    Dataset updated
    Nov 27, 2025
    Dataset provided by
    U.S. Geological Survey
    Area covered
    New Hampshire
    Description

    This volume's release consists of 463,615 media files captured by autonomous wildlife monitoring devices under the project, New Hampshire Fish and Game Department. The attached files listed below include several CSV files that provide information about the data release. The file "media.csv" provides the metadata about the media, such as filename and date/time of capture. The actual media files are housed within folders under the volume's "child items" as compressed files. A critical CSV file is "dictionary.csv", which describes each CSV file, including field names, data types, descriptions, and the relationship of each field to fields in other CSV files. Some of the media files may have been "tagged" or "annotated" by either humans or by machine learning models, identifying wildlife targets within the media. If so, this information is stored in "annotations.csv" and "modeloutputs.csv", respectively. To protect privacy, all personally identifiable information (PII) has been removed, locations have been "blurred" by bounding boxes, and media featuring sensitive taxa or humans have been omitted. To enhance data reuse, the sbRehydrate() function in the AMMonitor R package will download files and re-create the original AMMonitor project (database + media files). See source code at https://code.usgs.gov/vtcfwru/ammonitor.

  17. NGEE Arctic Phase 4 Plant Functional Type Framework for Pan-Arctic...

    • search.dataone.org
    Updated Aug 25, 2025
    Cite
    Verity Salmon; Amy Breen; Alistair Rogers; Kim Ely; Jitendra Kumar; Benjamin Sulman; Colleen Iversen (2025). NGEE Arctic Phase 4 Plant Functional Type Framework for Pan-Arctic Vegetation [Dataset]. http://doi.org/10.15485/2529470
    Explore at:
    Dataset updated
    Aug 25, 2025
    Dataset provided by
    ESS-DIVE
    Authors
    Verity Salmon; Amy Breen; Alistair Rogers; Kim Ely; Jitendra Kumar; Benjamin Sulman; Colleen Iversen
    Time period covered
    Jan 1, 1978 - Jan 1, 2025
    Area covered
    Arctic
    Description

    The NGEE-Arctic research team identified a common set of hierarchical plant functional types (PFTs) for pan-arctic vegetation that we will use across our research activities. Interdisciplinary work within a large team requires agreement regarding levels of functional organization so that knowledge, data, and technologies can be shared and combined effectively. The team has identified plant functional types as a crucial area where such interoperability is needed. PFTs are used to represent plant pools and fluxes within models, summarize observational data, and map vegetation across the landscape. Within each of these applications, varying levels of PFT specificity are needed according to the specific scientific research goal, computational limitations, and data availability. By agreeing on a specific hierarchical framework for grouping variables in our vegetation data, we ensure the resulting research products will be robust, flexible, and scalable. In this document, we lay out the agreed-upon PFT framework with definitions and references to existing literature. Table 1, included in the "NGA700_Phase4PFTFramework_about*" file, outlines the relationship between NGEE-Arctic Phase 4, Tier 1 PFTs and the PFTs used within prominent arctic literature as well as publications by the NGEE-Arctic team during phases 1-3. This dataset consists of a table detailing a hierarchical PFT framework that spans 4 tiers, with the most granular PFTs listed in tier 1 and the most general PFTs in tier 4. The PFTs within each tier have a single column in the dataset where the PFTs are named and a separate column where the characteristics used to define each PFT are listed. Grey fill of the cells indicates where a given PFT starts to “lose” tier 1 details as you look from left to right.
Note that the Excel file has merged cells to indicate grouping of PFTs across the tiers; it will not translate into a delimited file type (.csv, .txt, etc.) without modification, so the hierarchical PFT framework table is available in three different file formats: 1) NGA700_Phase4PTS.xlsx, which maintains the merged cells and grey fill; 2) NGA700_Phase4PTS.csv, in which merged cells are split and grey fill is removed; 3) NGA700_Phase4PTS.pdf, an image of the table with merged cells and grey fill. A metadata document is included as a *.pdf, and file-level metadata and a data dictionary are included as *.csv files.

  18. Renewable Energy Generation Amount (kWh) by Renewable Energy Type |...

    • data.gov.hk
    Updated Nov 24, 2025
    Cite
    data.gov.hk (2025). Renewable Energy Generation Amount (kWh) by Renewable Energy Type | DATA.GOV.HK [Dataset]. https://data.gov.hk/en-data/dataset/hkelectric-cs_cbd-renewable-energy-generation-by-renewable-energy-type
    Explore at:
    Dataset updated
    Nov 24, 2025
    Dataset provided by
    data.gov.hk
    Description

    Provides the renewable energy generation amounts by renewable energy system type. The CSV file contains the renewable energy generation amounts from solar photovoltaic systems and wind power systems respectively.

  19. online review.csv

    • kaggle.com
    zip
    Updated Jun 22, 2024
    Cite
    Farha Kousar (2024). online review.csv [Dataset]. https://www.kaggle.com/datasets/farhakouser/online-review-csv
    Explore at:
    zip (1747813 bytes). Available download formats
    Dataset updated
    Jun 22, 2024
    Authors
    Farha Kousar
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    The /kaggle/input/online-review-csv/online_review.csv file contains customer reviews from Flipkart. It includes the following columns:

    - review_id: Unique identifier for each review.
    - product_id: Unique identifier for each product.
    - user_id: Unique identifier for each user.
    - rating: Star rating (1 to 5) given by the user.
    - title: Summary of the review.
    - review_text: Detailed feedback from the user.
    - review_date: Date the review was submitted.
    - verified_purchase: Indicates if the purchase was verified (true/false).
    - helpful_votes: Number of users who found the review helpful.
    - reviewer_name: Name or alias of the reviewer.

    Uses:
    - Sentiment Analysis: Understand customer sentiments.
    - Product Improvement: Identify areas for product enhancement.
    - Market Research: Analyze customer preferences.
    - Recommendation Systems: Improve recommendation algorithms.

    This dataset is ideal for practicing data analysis and machine learning techniques.
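A simple first analysis of these columns is an average rating, optionally restricted to verified purchases. The sketch below uses a few hypothetical rows that follow the column list above; only a subset of columns is shown.

```python
import csv
import io

# Hypothetical rows following the documented columns (subset shown).
sample = """review_id,rating,verified_purchase,helpful_votes
r1,5,true,10
r2,2,false,3
r3,4,true,7
"""

def mean_rating(text, verified_only=False):
    """Average star rating, optionally over verified purchases only."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if verified_only:
        rows = [r for r in rows if r["verified_purchase"] == "true"]
    return sum(int(r["rating"]) for r in rows) / len(rows)

print(round(mean_rating(sample), 2))            # 3.67
print(mean_rating(sample, verified_only=True))  # 4.5
```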

  20. Exploration Gap Assessment (FY13 Update) geochemistry_data.csv

    • data.wu.ac.at
    csv
    Updated Mar 6, 2018
    Cite
    HarvestMaster (2018). Exploration Gap Assessment (FY13 Update) geochemistry_data.csv [Dataset]. https://data.wu.ac.at/schema/geothermaldata_org/NmNlMzQ4ZjctYWY5Zi00ZTBjLWFjNTItYjgxODQzMjY4ODE0
    Explore at:
    csv. Available download formats
    Dataset updated
    Mar 6, 2018
    Dataset provided by
    HarvestMaster
    Description

    This submission contains an update to the previous Exploration Gap Assessment funded in 2012, which identified high-potential hydrothermal areas where critical data are needed (a gap analysis of exploration data).

    The uploaded data are contained in two data files for each data category: a shape (SHP) file containing the grid, and a data file (CSV) containing the individual layers that intersected with the grid. This CSV can be joined with the map to retrieve a list of datasets that are available at any given site. A grid of the contiguous U.S. was created with 88,000 10-km by 10-km grid cells, and each cell was populated with the status of data availability corresponding to five data types:

    1. well data
    2. geologic maps
    3. fault maps
    4. geochemistry data
    5. geophysical data

    This file is the raw table of intersected services for the geochemistry gap assessment.

    The attributes in the CSV include:

    1. grid_id : The id of the grid cell that the data intersects with
    2. title: This represents the name of the WFS service that intersected with this grid cell
    3. abstract: This represents the description of the WFS service that intersected with this grid cell
    4. gap_type: This represents the category of data availability that these data fall within. As the current processing is pulling data from NGDS, this category universally represents data that are available in the NGDS and are ready for acquisition for analytic purposes.
    5. proprietary_type: Whether the data are considered proprietary
    6. service_type: The type of service
    7. base_url: The service URL
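Because each CSV row pairs a grid_id with one intersecting service, grouping rows by grid_id yields the list of datasets available at any cell, which can then be joined to the SHP grid on grid_id in GIS software. This is a minimal sketch with hypothetical rows using the attribute names listed above (subset shown).

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows matching the attribute list above (subset of columns).
sample = """grid_id,title,gap_type,proprietary_type
101,State geochemistry WFS,available in NGDS,public
101,Well logs WFS,available in NGDS,public
102,Fault map WFS,available in NGDS,public
"""

def datasets_per_cell(text):
    """Group intersecting service titles by grid cell id."""
    cells = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        cells[row["grid_id"]].append(row["title"])
    return cells

cells = datasets_per_cell(sample)
print(len(cells["101"]))  # 2
```

The resulting per-cell lists correspond to the join of this CSV against the SHP grid's grid_id field.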