MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
CSV-formatted dataset for the ARC-AGI challenge, provided because the original dataset was in JSON format.
Both files are formatted as Id, Input, Output. Id contains the task id and a train or test label, together with the example's position within the task. Input contains the input and Output contains the output.
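As a quick illustration of this layout, a minimal Python sketch (the file name train.csv is hypothetical; substitute the actual CSV from the dataset):
import pandas as pd

# Read one of the CSV files; columns are Id, Input, Output as described above.
df = pd.read_csv("train.csv")  # hypothetical file name
print(df.columns.tolist())
print(df.head())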
This dataset was created by Ahmad Fakhar.
The following datafiles contain detailed information about vehicles in the UK, which would be too large to use as structured tables. They are provided as simple CSV text files that should be easier to use digitally.
Data tables containing aggregated information about vehicles in the UK are also available.
We welcome any feedback on the structure of our new datafiles, their usability, or any suggestions for improvements; please contact vehicles statistics.
CSV files can be used either as a spreadsheet (using Microsoft Excel or similar spreadsheet packages) or digitally using software packages and languages (for example, R or Python).
When used as a spreadsheet, there will be no formatting, but the file can still be explored like our publication tables. Due to their size, older software might not be able to open the entire file.
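For example, a minimal Python sketch for reading one of the larger files in manageable chunks (it assumes the file has been downloaded locally as df_VEH0120_GB.csv; BodyType is one of the schema columns listed below):
import pandas as pd

# Read the large quarterly-stock file in chunks to limit memory use.
chunks = pd.read_csv("df_VEH0120_GB.csv", chunksize=100_000)

# Example: count rows per BodyType across the whole file.
counts = None
for chunk in chunks:
    c = chunk["BodyType"].value_counts()
    counts = c if counts is None else counts.add(c, fill_value=0)
print(counts)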
df_VEH0120_GB: Vehicles at the end of the quarter by licence status, body type, make, generic model and model: Great Britain (CSV, 37.6 MB) - https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1077520/df_VEH0120_GB.csv
Scope: All registered vehicles in Great Britain; from 1994 Quarter 4 (end December)
Schema: BodyType, Make, GenModel, Model, LicenceStatus, [number of vehicles; one column per quarter]
df_VEH0120_UK: Vehicles at the end of the quarter by licence status, body type, make, generic model and model: United Kingdom (CSV, 20.8 MB) - https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1077521/df_VEH0120_UK.csv
Scope: All registered vehicles in the United Kingdom; from 2014 Quarter 3 (end September)
Schema: BodyType, Make, GenModel, Model, LicenceStatus, [number of vehicles; one column per quarter]
df_VEH0160_GB: Vehicles registered for the first time by body type, make, generic model and model: Great Britain (CSV, 17.1 MB) - https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1077522/df_VEH0160_GB.csv
Scope: All vehicles registered for the first time in Great Britain; from 2001 Quarter 1 (January to March)
Schema: BodyType, Make, GenModel, Model, [number of vehicles; one column per quarter]
df_VEH0160_UK: Vehicles registered for the first time by body type, make, generic model and model: United Kingdom (CSV, 4.93 MB) - https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1077523/df_VEH0160_UK.csv
Scope: All vehicles registered for the first time in the United Kingdom; from 2014 Quarter 3 (July to September)
Schema: BodyType, Make, GenModel, Model, [number of vehicles; one column per quarter]
df_VEH0124: Vehicles at the end of the quarter by licence status, body type, make, generic model, model, year of first use and year of manufacture: United Kingdom (CSV, 28.2 MB) - https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1077524/df_VEH0124.csv
Scope: All licensed vehicles in the United Kingdom; 2021 Quarter 4 (end December) only
Schema: BodyType, Make, GenModel, Model, YearFirstUsed, YearManufacture, Licensed (number of vehicles), SORN (number of vehicles)
df_VEH0220:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
LifeSnaps Dataset Documentation
Ubiquitous self-tracking technologies have penetrated various aspects of our lives, from physical and mental health monitoring to fitness and entertainment. Yet, limited data exist on the association between in-the-wild large-scale physical activity patterns, sleep, stress, and overall health, and behavioral patterns and psychological measurements, due to challenges in collecting and releasing such datasets, such as waning user engagement, privacy considerations, and diversity in data modalities. In this paper, we present the LifeSnaps dataset, a multi-modal, longitudinal, and geographically-distributed dataset, containing a plethora of anthropological data, collected unobtrusively for the total course of more than 4 months by n=71 participants, under the European H2020 RAIS project. LifeSnaps contains more than 35 different data types from second to daily granularity, totaling more than 71M rows of data. The participants contributed their data through numerous validated surveys, real-time ecological momentary assessments, and a Fitbit Sense smartwatch, and consented to make these data available openly to empower future research. We envision that releasing this large-scale dataset of multi-modal real-world data will open novel research opportunities and potential applications in the fields of medical digital innovations, data privacy and valorization, mental and physical well-being, psychology and behavioral sciences, machine learning, and human-computer interaction.
The following instructions will get you started with the LifeSnaps dataset and are complementary to the original publication.
Data Import: Reading CSV
For ease of use, we provide CSV files containing Fitbit, SEMA, and survey data at daily and/or hourly granularity. You can read the files via any programming language. For example, in Python, you can read the files into a Pandas DataFrame with the pandas.read_csv() command.
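For instance, a minimal sketch (the file name daily_fitbit.csv is hypothetical; substitute the CSV file you downloaded):
import pandas as pd

# Load one of the daily-granularity CSV files into a DataFrame.
daily = pd.read_csv("daily_fitbit.csv")  # hypothetical file name
print(daily.shape)
print(daily.head())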
Data Import: Setting up a MongoDB (Recommended)
To take full advantage of the LifeSnaps dataset, we recommend that you use the raw, complete data via importing the LifeSnaps MongoDB database.
To do so, open the terminal/command prompt and run the following command for each collection in the DB. Ensure you have the MongoDB Database Tools installed (available from the MongoDB website).
For the Fitbit data, run the following:
mongorestore --host localhost:27017 -d rais_anonymized -c fitbit
For the SEMA data, run the following:
mongorestore --host localhost:27017 -d rais_anonymized -c sema
For surveys data, run the following:
mongorestore --host localhost:27017 -d rais_anonymized -c surveys
If you have access control enabled, then you will need to add the --username and --password parameters to the above commands.
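For example (the user name, password, and authentication database below are placeholders):
mongorestore --host localhost:27017 --username <user> --password <password> --authenticationDatabase admin -d rais_anonymized -c fitbit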
Data Availability
The MongoDB database contains three collections, fitbit, sema, and surveys, containing the Fitbit, SEMA3, and survey data, respectively. Similarly, the CSV files contain information related to these collections. Each document in any collection follows the format shown below:
{
_id:
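To inspect documents in the restored database, a minimal pymongo sketch (assuming MongoDB is running locally and the collections were restored as above):
from pymongo import MongoClient

# Connect to the locally restored database and inspect one Fitbit document.
client = MongoClient("mongodb://localhost:27017/")
db = client["rais_anonymized"]
print(db["fitbit"].find_one())
print(db["fitbit"].count_documents({}))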
This data release provides data in support of an assessment of water quality and discharge in the Herring River at the Chequessett Neck Road dike in Wellfleet, Massachusetts, from November 2015 to September 2017. The assessment was a cooperative project among the U.S. Geological Survey, National Park Service, Cape Cod National Seashore, and the Friends of Herring River to characterize environmental conditions prior to a future removal of the dike. It is described in U.S. Geological Survey (USGS) Scientific Investigations Report "Assessment of Water Quality and Discharge in the Herring River, Wellfleet, Massachusetts, November 2015 – September 2017." This data release is structured as a set of comma-separated values (CSV) files, each of which contains information on data source (or laboratory used for analysis), USGS site identification (ID) number, beginning date and time of observation or sampling, ending date and time of observation or sampling, and data such as flow rate and analytical results. The CSV files include calculated tidal daily flows (Flood_Tide_Tidal_Day.csv and Ebb_Tide_Tidal_Day.csv) that were used in Huntington and others (2020) for estimation of nutrient loads. Tidal daily flows are the estimated mean daily discharges for two consecutive flood and ebb tide cycles (average duration: 24 hours, 48 minutes). The associated date is the day on which most of the flow occurred. CSV files contain quality assurance data for water-quality samples including blanks (Blanks.csv), replicates (Replicates.csv), standard reference materials (Standard_Reference_Material.csv), and atmospheric ammonium contamination (NH4_Atmospheric_Contamination.csv). One CSV file (EWI_vs_ISCO.csv) contains data comparing composite samples collected by an automatic sampler (ISCO) at a fixed point with depth-integrated samples collected at equal width increments (EWI). One CSV file (Cross_Section_Field_Parameters.csv) contains field parameter data (specific conductance, temperature, pH, and dissolved oxygen) collected at a fixed location and data collected along the cross sections at variable water depths and horizontal distances across the openings of the culverts at the Chequessett Neck Road dike. One CSV file (LOADEST_Bias_Statistics.csv) contains data that include estimated natural log of load, model residuals, Z-scores, and seasonal model residuals for winter (December, January, and February); spring (March, April and May); summer (June, July and August); and fall (September, October, and November). The data release also includes a data dictionary (Data_Dictionary.csv) that provides detailed descriptions of each field in each CSV file, including: data filename; laboratory or data source; U.S. Geological Survey site ID numbers; data types; constituent (analyte) U.S. Geological Survey parameter codes; descriptions of parameters; units; methods; minimum reporting limits; limits of quantitation, if appropriate; method reference citations; and minimum, maximum, median, and average values for each analyte. The data release also includes an abbreviations file (Abbreviations.pdf) that defines all the abbreviations in the data dictionary and CSV files. Note that the USGS site ID includes a leading zero (011058798) and some of the parameter codes contain leading zeros, so care must be taken in opening and subsequently saving these files in other formats where leading zeros may be dropped.
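Because of the leading zeros noted above, it can help to read the identifier columns as text rather than numbers. A minimal pandas sketch (Blanks.csv is one of the files named above; reading everything as strings is the simplest way to preserve leading zeros):
import pandas as pd

# Read all columns as strings so site IDs such as 011058798 and parameter codes keep their leading zeros.
qa_blanks = pd.read_csv("Blanks.csv", dtype=str)
print(qa_blanks.head())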
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This publication contains several datasets that have been used in the paper "Crowdsourcing open citations with CROCI – An analysis of the current status of open citations, and a proposal" submitted to the 17th International Conference on Scientometrics and Bibliometrics (ISSI 2019), available at https://opencitations.wordpress.com/2019/02/07/crowdsourcing-open-citations-with-croci/.
Additional information about the analyses described in the paper, including the code and the data we have used to compute all the figures, is available as a Jupyter notebook at https://github.com/sosgang/pushing-open-citations-issi2019/blob/master/script/croci_nb.ipynb. The datasets contain the following information.
non_open.zip: it is a zipped (~5 GB unzipped) CSV file containing the numbers of open citations and closed citations received by the entities in the Crossref dump used in our computation, dated October 2018. All the entity types retrieved from Crossref were aligned to one of the following five categories: journal, book, proceedings, dataset, other. The open CC0 citation data we used came from the CSV dump of the most recent release of COCI, dated 12 November 2018. The number of closed citations was calculated by subtracting the number of open citations to each entity available within COCI from the value "is-referenced-by-count" available in the Crossref metadata for that particular cited entity, which reports all the DOI-to-DOI citation links that point to the cited entity from within the whole Crossref database (including those present in the Crossref 'closed' dataset).
The columns of the CSV file are the following ones:
doi: the DOI of the publication in Crossref;
type: the type of the publication as indicated in Crossref;
cited_by: the number of open citations received by the publication according to COCI;
non_open: the number of closed citations received by the publication according to Crossref + COCI.
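As an illustration of how these columns can be used, a minimal sketch computing the share of open citations per entity type (the name of the unzipped CSV file is an assumption):
import pandas as pd

# Columns: doi, type, cited_by (open citations), non_open (closed citations).
df = pd.read_csv("non_open.csv")  # assumed name of the unzipped file

per_type = df.groupby("type")[["cited_by", "non_open"]].sum()
per_type["open_share"] = per_type["cited_by"] / (per_type["cited_by"] + per_type["non_open"])
print(per_type)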
croci_types.csv: it is a CSV file that contains the numbers of open citations and closed citations received by the entities in the Crossref dump used in our computation, as collected in the previous CSV file, aligned in five classes depending on the entity types retrieved from Crossref: journal (Crossref types: journal-article, journal-issue, journal-volume, journal), book (Crossref types: book, book-chapter, book-section, monograph, book track, book-part, book-set, reference-book, dissertation, book series, edited book), proceedings (Crossref types: proceedings-article, proceedings, proceedings-series), dataset (Crossref types: dataset), other (Crossref types: other, report, peer review, reference-entry, component, report-series, standard, posted-content, standard-series).
The columns of the CSV file are the following ones:
type: the type of publication, one of "journal", "book", "proceedings", "dataset", "other";
label: the label assigned to the type for visualisation purposes;
coci_open_cit: the number of open citations received by the publication type according to COCI;
crossref_close_cit: the number of closed citations received by the publication type according to Crossref + COCI.
publishers_cits.csv: it is a CSV file that contains the top twenty publishers that received the greatest number of open citations. The columns of the CSV file are the following ones:
publisher: the name of the publisher;
doi_prefix: the list of DOI prefixes used by the publisher;
coci_open_cit: the number of open citations received by the publications of the publisher according to COCI;
crossref_close_cit: the number of closed citations received by the publications of the publishers according to Crossref + COCI;
total_cit: the total number of citations received by the publications of the publisher (= coci_open_cit + crossref_close_cit).
20publishers_cr.csv: it is a CSV file that contains the numbers of the contributions to open citations made by the twenty publishers introduced in the previous CSV file as of 24 January 2018, according to the data available through the Crossref API. The counts listed in this file refer to the number of publications for which each publisher has submitted metadata to Crossref that include the publication's reference list. The categories 'closed', 'limited' and 'open' refer to publications for which the reference lists are not visible to anyone outside the Crossref Cited-by membership, are visible only to them and to Crossref Metadata Plus members, or are visible to all, respectively. In addition, the file also records the total number of publications for which the publisher has submitted metadata to Crossref, whether or not those metadata include the reference lists of those publications.
The columns of the CSV file are the following ones:
publisher: the name of the publisher;
open: the number of publications in Crossref with 'open' visibility for their reference lists;
limited: the number of publications in Crossref with 'limited' visibility for their reference lists;
closed: the number of publications in Crossref with 'closed' visibility for their reference lists;
overall_deposited: the overall number of publications for which the publisher has submitted metadata to Crossref.
Precipitation, volumetric soil-water content, videos, and geophone data characterizing postfire debris flows were collected at the 2022 Hermit's Peak Calf-Canyon Fire in New Mexico. This dataset contains data from June 22, 2022, to June 26, 2024. The data were obtained from a station located at 35° 42' 28.86" N, 105° 27' 18.03" W (geographic coordinate system). Each data type is described below.
Raw Rainfall Data: Rainfall data, Rainfall.csv, are contained in a comma separated value (.csv) file. The data are continuous and sampled at 1-minute intervals. The columns in the csv file are TIMESTAMP(UTC), RainSlowInt (the depth of rain in each minute [mm]), CumRain (cumulative rainfall since the beginning of the record [mm]), and VWC# (volumetric water content [V/V]) at three depths (1 = 10 cm, 2 = 30 cm, and 3 = 50 cm). VWC values outside of the range of 0 to 0.5 represent sensor malfunctions and were replaced with -99999.
Storm Record: We summarized the rainfall, volumetric soil-water content, and geophone data based on rainstorms. We defined a storm as rain for a duration >= 5 minutes or with an accumulation > 2.54 mm. Each storm was then assigned a storm ID starting at 0. The storm record data, StormRecord.csv, provides peak rainfall intensities and times and volumetric soil-water content information for each storm. The columns from left to right provide the information as follows: ID, StormStart [yyyy-mm-dd hh:mm:ss-tz], StormStop [yyyy-mm-dd hh:mm:ss-tz], StormDepth [mm], StormDuration [h], I-5 [mm h-1], I-10 [mm h-1], I-15 [mm h-1], I-30 [mm h-1], I-60 [mm h-1], I-5 time [yyyy-mm-dd hh:mm:ss-tz], I-10 time [yyyy-mm-dd hh:mm:ss-tz], I-15 time [yyyy-mm-dd hh:mm:ss-tz] ([UTC], the time of the peak 15-minute rainfall intensity), I-30 time [yyyy-mm-dd hh:mm:ss-tz] ([UTC], the time of the peak 30-minute rainfall intensity), I-60 time [yyyy-mm-dd hh:mm:ss-tz] ([UTC], the time of the peak 60-minute rainfall intensity), VWC (volumetric water content [V/V] at three depths (1 = 10 cm, 2 = 30 cm, 3 = 50 cm) at the start of the storm, the time of the peak 15-minute rainfall intensity, and the end of the storm), Velocity [m s-1] of the flow, and Event (qualitative observation of type of flow from video footage). VWC values outside of the range of 0 to 0.5 represent sensor malfunctions and were replaced with -99999. Velocity was only calculated for flows with a noticeable surge as the rest of the signal is not sufficient for a cross-correlation, and Event was only filled for storms with quality video data. Values of -99999 were assigned for these columns for all other storms.
Geophone Data: Geophone data, GeophoneData.zip, are contained in comma separated value (.csv) files labeled by 'storm' and the corresponding storm ID in the storm record and labeled IDa and IDb if the geophone stopped recording for more than an hour during the storm. The data were recorded by two geophones sampled at 50 Hz, one 11.5 m upstream from the station and one 9.75 m downstream from the station. Geophones were triggered to record when 1.6 mm of rain was detected during a period of 10 minutes, and they continued to record for 30 minutes past the last timestamp when this criterion was met. The columns in each csv file are TIMESTAMP [UTC], GeophoneUp_mV (the upstream geophone [mV]), GeophoneDn_mV (the downstream geophone [mV]). Note that there are occasional missed samples when the data logger did not record due to geophone malfunction when data points are 0.04 s or more apart.
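As an example of working with the raw rainfall file, a minimal sketch that parses the timestamp column and masks the -99999 sensor-malfunction values (column names follow the description above; exact header spellings may differ in the file):
import pandas as pd

# Read the 1-minute rainfall record and treat -99999 as missing data.
rain = pd.read_csv("Rainfall.csv", na_values=[-99999])
rain["TIMESTAMP(UTC)"] = pd.to_datetime(rain["TIMESTAMP(UTC)"])
print(rain[["RainSlowInt", "CumRain"]].describe())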
Videos: The videos stormID_mmdd.mp4 (or .mov) are organized by storm ID where one folder contains data for one storm. Within folders for each storm, videos are labeled by the timestamp in UTC of the end of the video as IMGPhhmm. Some videos in the early mornings or late evenings, or in very intense rainfall, have had brightness and contrast adjustments in Adobe Premiere Pro for better video quality and are in MP4 format. All raw videos are in MOV format. The camera triggered when a minimum of 1.6 mm of rain fell in a 10-minute interval and it recorded in 16-minute video clips until it was 30 minutes since the last trigger. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.
This volume's release consists of 325099 media files captured by autonomous wildlife monitoring devices under the project, USDA White Mountain National Forest. The attached files listed below include several CSV files that provide information about the data release. The file "media.csv" provides the metadata about the media, such as filename and date/time of capture. The actual media files are housed within folders under the volume's "child items" as compressed files. A critical CSV file is "dictionary.csv", which describes each CSV file, including field names, data types, descriptions, and the relationship of each field to fields in other CSV files. Some of the media files may have been "tagged" or "annotated" by either humans or by machine learning models, identifying wildlife targets within the media. If so, this information is stored in "annotations.csv" and "modeloutputs.csv", respectively. To protect privacy, all personally identifiable information (PII) has been removed, locations have been "blurred" by bounding boxes, and media featuring sensitive taxa or humans have been omitted. To enhance data reuse, the sbRehydrate() function in the AMMonitor R package will download files and re-create the original AMMonitor project (database + media files). See source code at https://code.usgs.gov/vtcfwru/ammonitor.
This volume's release consists of 143321 media files captured by autonomous wildlife monitoring devices under the project, Massachusetts Wildlife Monitoring Project. The attached files listed below include several CSV files that provide information about the data release. The file "media.csv" provides the metadata about the media, such as filename and date/time of capture. The actual media files are housed within folders under the volume's "child items" as compressed files. A critical CSV file is "dictionary.csv", which describes each CSV file, including field names, data types, descriptions, and the relationship of each field to fields in other CSV files. Some of the media files may have been "tagged" or "annotated" by either humans or by machine learning models, identifying wildlife targets within the media. If so, this information is stored in "annotations.csv" and "modeloutputs.csv", respectively. To protect privacy, all personally identifiable information (PII) has been removed, locations have been "blurred" by bounding boxes, and media featuring sensitive taxa or humans have been omitted. To enhance data reuse, the sbRehydrate() function in the AMMonitor R package will download files and re-create the original AMMonitor project (database + media files). See source code at https://code.usgs.gov/vtcfwru/ammonitor.
The anion data for the East River Watershed, Colorado, consists of fluoride, chloride, sulfate, nitrate, and phosphate concentrations collected at multiple, long-term monitoring sites that include stream, groundwater, and spring sampling locations. These locations represent important and/or unique end-member locations for which solute concentrations can be diagnostic of the connection between terrestrial and aquatic systems. Such locations include drainages underlain entirely or largely by shale bedrock, land cover dominated by conifers, aspens, or meadows, and drainages impacted by historic mining activity and the presence of naturally mineralized rock. Developing a long-term record of solute concentrations from a diversity of environments is a critical component of quantifying the impacts of both climate change and discrete climate perturbations, such as drought, forest mortality, and wildfire, on the riverine export of multiple anionic species. Such data may be combined with stream gauging stations co-located at each monitoring site to directly quantify the seasonal and annual mass flux of these anionic species out of the watershed. This data package contains (1) a zip file (anion_data_2014-2022.zip) containing a total of 345 data files of anion data from across the Lawrence Berkeley National Laboratory (LBNL) Watershed Function Scientific Focus Area (SFA), which is reported in .csv files per location; (2) a file-level metadata (flmd.csv) file that lists each file contained in the dataset with associated metadata; and (3) a data dictionary (dd.csv) file that contains terms/column_headers used throughout the files along with a definition, units, and data type. Update on 6/10/2022: versioned updates to this dataset were made along with these changes: (1) updated anion data for all locations up to 2021-12-31, (2) removal of units from column headers in datafiles, (3) added row underneath headers to contain units of variables, (4) restructure of units to comply with CSV reporting format requirements, and (5) the file-level metadata (flmd.csv) and data dictionary (dd.csv) files were added to comply with the File-Level Metadata Reporting Format. Update on 2022-09-09: Updates were made to reporting format specific files (file-level metadata and data dictionary) to correct swapped file names, add additional details on metadata descriptions on both files, add a header_row column to enable parsing, and add version number and date to file names (v2_20220909_flmd.csv and v2_20220909_dd.csv). Update on 2022-12-20: Updates were made to both the data files and reporting format specific files. Conversion issues affecting ER-PLM locations for anion data were resolved for the data files. Additionally, the flmd and dd files were updated to reflect the updated versions of these files. Available data were added up until 2022-03-14.
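Because the data files carry a units row directly beneath the column headers (per the 6/10/2022 update), one simple approach is to skip that row when loading a per-location file. A minimal sketch (the file name is hypothetical; any anion CSV extracted from anion_data_2014-2022.zip applies):
import pandas as pd

# Skip the units row that sits directly below the column headers.
anions = pd.read_csv("anion_example_location.csv", skiprows=[1])  # hypothetical file name
print(anions.head())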
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset provides a comprehensive list of 567 file extensions along with their descriptions, meticulously scraped from a Wikipedia page. It serves as a valuable resource for developers, researchers, and anyone interested in understanding various file types and their purposes.
The dataset contains the following columns:
- File Extension: The extension of the file (e.g., .txt, .jpg).
- Description: A brief description of what the file extension is used for.
This dataset can be used for various purposes, including:
- Building applications that need to recognize and handle different file types.
- Educating and training individuals on file extensions and their uses.
- Conducting research on file formats and their prevalence in different domains.
Keywords: File Extensions, Data Description, CSV, Web Scraping, Beautiful Soup, Wikipedia, Data Analysis, Development, Research
| File Extension | Description |
|---|---|
| .txt | Plain text file |
| .jpg | JPEG image file |
| .pdf | Portable Document Format file |
| .doc | Microsoft Word document file |
| .xlsx | Microsoft Excel spreadsheet file |
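A minimal lookup sketch using the two columns above (the file name file_extensions.csv is hypothetical):
import pandas as pd

# Build a simple extension-to-description lookup from the dataset.
ext = pd.read_csv("file_extensions.csv")  # hypothetical file name
lookup = dict(zip(ext["File Extension"], ext["Description"]))
print(lookup.get(".txt"))  # e.g. "Plain text file"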
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The DIAMAS project investigates Institutional Publishing Service Providers (IPSP) in the broadest sense, with a special focus on those publishing initiatives that do not charge fees to authors or readers. To collect information on Institutional Publishing in the ERA, a survey was conducted among IPSPs between March and May 2024. This dataset contains aggregated data from the 685 valid responses to the DIAMAS survey on Institutional Publishing.
The dataset supplements D2.3, the final IPSP landscape report, "Institutional Publishing in the ERA: results from the DIAMAS survey".
The data
Basic aggregate tabular data
Full individual survey responses are not being shared to prevent the easy identification of respondents (in line with conditions set out in the survey questionnaire). This dataset contains full tables with aggregate data for all questions from the survey, with the exception of free-text responses, from all 685 survey respondents. This includes, per question, overall totals and percentages for the answers given as well as the breakdown by both IPSP types: institutional publishers (IPs) and service providers (SPs). Tables at country level have not been shared, as cell values often turned out to be too low to prevent potential identification of respondents. The data is available in csv and docx formats, with csv files grouped and packaged into ZIP files. Metadata describing data type, question type, as well as question response rate, is available in csv format. The R code used to generate the aggregate tables is made available as well.
Files included in this dataset
survey_questions_data_description.csv - metadata describing data type, question type, as well as question response rate per survey question.
tables_raw_all.zip - raw tables (csv format) with aggregated data per question for all respondents, with the exception of free-text responses. Questions with multiple answers have a table for each answer option. Zip file contains 180 csv files.
tables_raw_IP.zip - as tables_raw_all.zip, for responses from institutional publishers (IP) only. Zip file contains 180 csv files.
tables_raw_SP.zip - as tables_raw_all.zip, for responses from service providers (SP) only. Zip file contains 170 csv files.
tables_formatted_all.docx - formatted tables (docx format) with aggregated data per question for all respondents, with the exception of free-text responses. Questions with multiple answers have a table for each answer option.
tables_formatted_IP.docx - as tables_formatted_all.docx, for responses from institutional publishers (IP) only.
tables_formatted_SP.docx - as tables_formatted_all.docx, for responses from service providers (SP) only.
DIAMAS_Tables_single.R - R script used to generate raw tables with aggregated data for all single response questions
DIAMAS_Tables_multiple.R - R script used to generate raw tables with aggregated data for all multiple response questions
DIAMAS_Tables_layout.R - R script used to generate document with formatted tables from raw tables with aggregated data
DIAMAS Survey on Institutional Publishing - data availability statement (pdf)
All data are made available under a CC0 license.
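To work with the zipped raw tables, the CSV files can be read directly from the ZIP archives. A minimal Python sketch (which table is read first depends on the archive's internal ordering):
import zipfile
import pandas as pd

# List and read the aggregated tables packaged in tables_raw_all.zip.
with zipfile.ZipFile("tables_raw_all.zip") as zf:
    names = zf.namelist()
    print(len(names), "files in the archive")
    with zf.open(names[0]) as f:
        table = pd.read_csv(f)
print(table.head())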
This volume's release consists of 320104 media files captured by autonomous wildlife monitoring devices under the project, Maine Department of Inland Fisheries and Wildlife. The attached files listed below include several CSV files that provide information about the data release. The file "media.csv" provides the metadata about the media, such as filename and date/time of capture. The actual media files are housed within folders under the volume's "child items" as compressed files. A critical CSV file is "dictionary.csv", which describes each CSV file, including field names, data types, descriptions, and the relationship of each field to fields in other CSV files. Some of the media files may have been "tagged" or "annotated" by either humans or by machine learning models, identifying wildlife targets within the media. If so, this information is stored in "annotations.csv" and "modeloutputs.csv", respectively. To protect privacy, all personally identifiable information (PII) has been removed, locations have been "blurred" by bounding boxes, and media featuring sensitive taxa or humans have been omitted. To enhance data reuse, the sbRehydrate() function in the AMMonitor R package will download files and re-create the original AMMonitor project (database + media files). See source code at https://code.usgs.gov/vtcfwru/ammonitor.
This volume's release consists of 26141 media files captured by autonomous wildlife monitoring devices under the project, Indiana Dunes National Park. The attached files listed below include several CSV files that provide information about the data release. The file "media.csv" provides the metadata about the media, such as filename and date/time of capture. The actual media files are housed within folders under the volume's "child items" as compressed files. A critical CSV file is "dictionary.csv", which describes each CSV file, including field names, data types, descriptions, and the relationship of each field to fields in other CSV files. Some of the media files may have been "tagged" or "annotated" by either humans or by machine learning models, identifying wildlife targets within the media. If so, this information is stored in "annotations.csv" and "modeloutputs.csv", respectively. To protect privacy, all personally identifiable information (PII) has been removed, locations have been "blurred" by bounding boxes, and media featuring sensitive taxa or humans have been omitted. To enhance data reuse, the sbRehydrate() function in the AMMonitor R package will download files and re-create the original AMMonitor project (database + media files). See source code at https://code.usgs.gov/vtcfwru/ammonitor.
This volume's release consists of 64642 media files captured by autonomous wildlife monitoring devices under the project, Maine Department of Inland Fisheries and Wildlife. The attached files listed below include several CSV files that provide information about the data release. The file "media.csv" provides the metadata about the media, such as filename and date/time of capture. The actual media files are housed within folders under the volume's "child items" as compressed files. A critical CSV file is "dictionary.csv", which describes each CSV file, including field names, data types, descriptions, and the relationship of each field to fields in other CSV files. Some of the media files may have been "tagged" or "annotated" by either humans or by machine learning models, identifying wildlife targets within the media. If so, this information is stored in "annotations.csv" and "modeloutputs.csv", respectively. To protect privacy, all personally identifiable information (PII) has been removed, locations have been "blurred" by bounding boxes, and media featuring sensitive taxa or humans have been omitted. To enhance data reuse, the sbRehydrate() function in the AMMonitor R package will download files and re-create the original AMMonitor project (database + media files). See source code at https://code.usgs.gov/vtcfwru/ammonitor.
This volume's release consists of 463615 media files captured by autonomous wildlife monitoring devices under the project, New Hampshire Fish and Game Department. The attached files listed below include several CSV files that provide information about the data release. The file "media.csv" provides the metadata about the media, such as filename and date/time of capture. The actual media files are housed within folders under the volume's "child items" as compressed files. A critical CSV file is "dictionary.csv", which describes each CSV file, including field names, data types, descriptions, and the relationship of each field to fields in other CSV files. Some of the media files may have been "tagged" or "annotated" by either humans or by machine learning models, identifying wildlife targets within the media. If so, this information is stored in "annotations.csv" and "modeloutputs.csv", respectively. To protect privacy, all personally identifiable information (PII) has been removed, locations have been "blurred" by bounding boxes, and media featuring sensitive taxa or humans have been omitted. To enhance data reuse, the sbRehydrate() function in the AMMonitor R package will download files and re-create the original AMMonitor project (database + media files). See source code at https://code.usgs.gov/vtcfwru/ammonitor.
The NGEE-Arctic research team identified a common set of hierarchical plant functional types (PFTs) for pan-arctic vegetation that we will use across our research activities. Interdisciplinary work within a large team requires agreement regarding levels of functional organization so that knowledge, data, and technologies can be shared and combined effectively. The team has identified plant functional types as a crucial area where such interoperability is needed. PFTs are used to represent plant pools and fluxes within models, summarize observational data, and map vegetation across the landscape. Within each of these applications, varying levels of PFT specificity are needed according to the specific scientific research goal, computational limitations, and data availability. By agreeing on a specific hierarchical framework for grouping variables in our vegetation data, we ensure the resulting research products will be robust, flexible, and scalable. In this document, we lay out the agreed-upon PFT framework with definitions and references to existing literature. Table 1, included in the "NGA700_Phase4PFTFramework_about*" file, outlines the relationship between NGEE-Arctic Phase 4, Tier 1 PFTs and the PFTs used within prominent arctic literature as well as publications by the NGEE-Arctic team during phases 1-3. This dataset consists of a table detailing a hierarchical PFT framework that spans 4 tiers, with the most granular PFTs listed in tier 1 and the most general PFTs in tier 4. The PFTs within each tier have a single column in the dataset where the PFTs are named and a separate column where the characteristics used to define each PFT are listed. Grey fill of the cells is used to indicate where a given PFT starts to "lose" tier 1 details as you look from left to right. Note that the Excel file has merged cells to indicate the grouping of PFTs across the tiers; it will not translate into a delimited filetype (.csv, .txt, etc.) without modification, so the hierarchical PFT framework table is available in three different file formats: 1) NGA700_Phase4PTS.xlsx – maintains the merged cells and grey fill; 2) NGA700_Phase4PTS.csv – merged cells are split, and grey fill is removed; 3) NGA700_Phase4PTS.pdf – image of the table with merged cells and grey fill. Metadata document included as a *.pdf and file-level metadata and data dictionary as *.csv files.
Provides the renewable energy generation amounts by renewable energy system type. The CSV file contains the renewable energy generation amounts from solar photovoltaic systems and wind power systems, respectively.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
The /kaggle/input/online-review-csv/online_review.csv file contains customer reviews from Flipkart. It includes the following columns:
review_id: Unique identifier for each review.
product_id: Unique identifier for each product.
user_id: Unique identifier for each user.
rating: Star rating (1 to 5) given by the user.
title: Summary of the review.
review_text: Detailed feedback from the user.
review_date: Date the review was submitted.
verified_purchase: Indicates if the purchase was verified (true/false).
helpful_votes: Number of users who found the review helpful.
reviewer_name: Name or alias of the reviewer.
Uses:
Sentiment Analysis: Understand customer sentiments.
Product Improvement: Identify areas for product enhancement.
Market Research: Analyze customer preferences.
Recommendation Systems: Improve recommendation algorithms.
This dataset is ideal for practicing data analysis and machine learning techniques.
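As a starting point for the sentiment-analysis use case above, a minimal sketch that loads the file and derives a coarse sentiment label from the star rating (the labeling thresholds are an assumption for illustration):
import pandas as pd

reviews = pd.read_csv("/kaggle/input/online-review-csv/online_review.csv")

# Coarse sentiment from the 1-5 star rating (assumed thresholds).
def label(rating):
    if rating >= 4:
        return "positive"
    if rating == 3:
        return "neutral"
    return "negative"

reviews["sentiment"] = reviews["rating"].apply(label)
print(reviews["sentiment"].value_counts())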
This submission contains an update to the previous Exploration Gap Assessment, funded in 2012, which identified high-potential hydrothermal areas where critical data are needed (a gap analysis on exploration data).
The uploaded data are contained in two data files for each data category: A shape (SHP) file containing the grid, and a data file (CSV) containing the individual layers that intersected with the grid. This CSV can be joined with the map to retrieve a list of datasets that are available at any given site. A grid of the contiguous U.S. was created with 88,000 10-km by 10-km grid cells, and each cell was populated with the status of data availability corresponding to five data types:
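One way to perform the join described above is with geopandas and pandas; a minimal sketch, noting that the file names and the shared key column are assumptions (a common grid-cell identifier is assumed here as grid_id):
import geopandas as gpd
import pandas as pd

# Load the 10-km grid and the layer-availability table, then join on the shared cell id.
grid = gpd.read_file("exploration_grid.shp")            # hypothetical file name
layers = pd.read_csv("exploration_layers.csv")          # hypothetical file name
joined = grid.merge(layers, on="grid_id", how="left")   # "grid_id" is an assumed key column
print(joined.head())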
The attributes in the CSV include: