Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Following the procedure in the Jupyter notebook, users can create SUMMA input from *.csv files. Users who want to create new SUMMA input can prepare it in CSV format, then simulate SUMMA with pySUMMA and plot the SUMMA output in various ways.
The notebook follows these steps:
1. Create SUMMA input from *.csv files
2. Run the SUMMA model using pySUMMA
3. Plot the SUMMA output: time-series plots, 2D plots (heatmap, Hovmöller), water-balance variable calculation and plotting, and spatial plotting with a shapefile
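As a rough illustration of step 2, the sketch below runs a configured SUMMA simulation through pySUMMA and plots one output variable. It is a minimal sketch assuming the pysumma Simulation API; the executable path, file manager path, and output variable name are hypothetical placeholders, not taken from the notebook.

```python
# Minimal sketch (not from the notebook): run SUMMA via pysumma and plot one output.
# The paths and the output variable name are hypothetical placeholders.
import pysumma as ps
import matplotlib.pyplot as plt

executable = '/usr/local/bin/summa.exe'        # hypothetical SUMMA executable
file_manager = './settings/file_manager.txt'   # hypothetical file manager built from the *.csv inputs

sim = ps.Simulation(executable, file_manager)
sim.run('local')                               # run SUMMA on the local machine

# sim.output is an xarray Dataset; plot a time series of one variable (name assumed)
sim.output['scalarTotalET'].plot()
plt.show()
```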
The ESS-DIVE reporting format for Comma-separated Values (CSV) file structure is based on a combination of existing guidelines and recommendations, including some from the Earth Science community, with valuable input from the Environmental Systems Science (ESS) community. The CSV reporting format is designed to promote interoperability and machine-readability of CSV data files while also facilitating the collection of some file-level metadata content. Tabular data in the form of rows and columns should be archived in its simplest form, and we recommend submitting these tabular data following the ESS-DIVE reporting format for generic comma-separated values (CSV) text files. In general, the CSV file format is more likely to remain accessible to future systems than a proprietary format, and CSV files are preferred because they are easier to exchange between different programs, increasing the interoperability of a data file. Defining the reporting format and providing guidelines for how to structure CSV files, and some of the field content within them, increases the machine-readability of the data files for extracting, compiling, and comparing data across files and systems.
Data package files are provided in .csv, .png, and .md formats. Open the .csv files with, e.g., Microsoft Excel, LibreOffice, or Google Sheets. Open the .md files by downloading them and using a text editor (e.g., Notepad or TextEdit). Open the .png files in, e.g., a web browser, photo viewer/editor, or Google Drive.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
_p_SUSPHIRE/_I_T31_mealybug/_S_P4_Pcitri_IsoSeq/_A_02_cDNAcupcake-dry/
Trends in nutrient fluxes and streamflow for selected tributaries in the Lake Erie watershed were calculated using monitoring data at 10 locations. Trends in flow-normalized nutrient fluxes were determined by applying a weighted regression approach called WRTDS (Weighted Regression on Time, Discharge, and Season). Site information and streamflow and water-quality records are contained in 3 zipped files named as follows: INFO (site information), Daily (daily streamflow records), and Sample (water-quality records). The INFO, Daily (flow), and Sample files contain the input data, by water-quality parameter and by site as .csv files, used to run the trend analyses. These files were generated by the R (version 3.1.2) software package EGRET - Exploration and Graphics for RivEr Trends (version 2.5.1) (Hirsch and De Cicco, 2015), and can be used directly as input to run graphical procedures and WRTDS trend analyses with the EGRET R software. The .csv files are identified according to water-quality parameter (TP, SRP, TN, NO23, and TKN) and site reference number (e.g., TPfiles.1.INFO.csv, SRPfiles.1.INFO.csv, TPfiles.2.INFO.csv, etc.). Water-quality parameter abbreviations and site reference numbers are defined in the file "Site-summary_table.csv" on the landing page, where there is also a site-location map ("Site_map.pdf"). Parameter information details, including abbreviation definitions, appear in the abstract on the landing page. SRP data records were available at only 6 of the 10 trend sites, which are identified in the file "Site-summary_table.csv" (see landing page) as monitored by the organization NCWQR (National Center for Water Quality Research). The SRP sites are: RAIS, MAUW, SAND, HONE, ROCK, and CUYA. The model-input dataset is presented in 3 parts: (1) INFO.zip (site information), (2) Daily.zip (daily streamflow records), and (3) Sample.zip (water-quality records). Reference: Hirsch, R.M., and De Cicco, L.A., 2015 (revised), User Guide to Exploration and Graphics for RivEr Trends (EGRET) and dataRetrieval: R Packages for Hydrologic Data, Version 2.0: U.S. Geological Survey Techniques and Methods, 4-A10, Reston, VA, 93 p. (at: http://dx.doi.org/10.3133/tm4A10).
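To make the file-naming convention concrete, here is a small, hypothetical Python sketch that assembles the INFO, Daily, and Sample file names for one parameter/site pair and loads them with pandas, after the three zip archives have been extracted into the working directory. It assumes the Daily and Sample files follow the same pattern as the INFO examples above, and no internal column layouts are assumed.

```python
# Sketch: load the EGRET input tables for one water-quality parameter and site,
# following the "<PARAM>files.<site>.<TYPE>.csv" naming pattern described above.
import pandas as pd

def load_site_files(param, site):
    """Read the INFO, Daily, and Sample tables for one parameter/site pair."""
    return {kind: pd.read_csv(f'{param}files.{site}.{kind}.csv')
            for kind in ('INFO', 'Daily', 'Sample')}

tp_site1 = load_site_files('TP', 1)   # total phosphorus, site reference number 1
print(tp_site1['INFO'].head())
print(tp_site1['Daily'].head())
```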
Meteorological information related to the Naples metropolitan area
This dataset contains (a) a script "R_met_integrated_for_modeling.R", and (b) associated input CSV files: 3 CSV files per location to create a 5-variable integrated meteorological dataset file (air temperature, precipitation, wind speed, relative humidity, and solar radiation) for 19 meteorological stations and 1 location within Trail Creek from the modeling team within the East River Community Observatory as part of the Watershed Function Scientific Focus Area (SFA). As meteorological forcings varied across the watershed, a high-frequency database is needed to ensure consistency in the data analysis and modeling. We evaluated several data sources, including gridded meteorological products and field data from meteorological stations, and determined that our modeling efforts required multiple data sources to meet all their needs. As output, this dataset contains (c) a single CSV data file (*_1981-2022.csv) for each location (20 CSV output files total) containing hourly time series data for 1981 to 2022 and (d) five PNG files of time series and density plots for each variable per location (100 PNG files). Detailed location metadata is contained within the Integrated_Met_Database_Locations.csv file for each point location included within this dataset, obtained from Varadharajan et al., 2023, doi:10.15485/1660962. This dataset also includes (e) a file-level metadata (flmd.csv) file that lists each file contained in the dataset with associated metadata and (f) a data dictionary (dd.csv) file that contains column/row headers used throughout the files along with a definition, units, and data type. Review the (g) ReadMe_Integrated_Met_Database.pdf file for additional details on the script, methods, and structure of the dataset. The script integrates the Northwest Alliance for Computational Science and Engineering's PRISM gridded data product, the National Oceanic and Atmospheric Administration's NCEP-NCAR Reanalysis 1 gridded data product (through the RNCEP R package, Kemp et al., doi:10.32614/CRAN.package.RNCEP), and analytical-based calculations. Further, the script downscales the input data to hourly frequency, which is necessary for the modeling efforts.
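As an illustration only, the sketch below loads the location metadata file and one location's hourly 1981-2022 output file with pandas; the station file name and the assumption that the first column holds the timestamp are hypothetical.

```python
# Sketch: read the location metadata and one hourly output file, then take a quick
# daily-mean view. The station file name and timestamp-column position are assumed.
import pandas as pd

locations = pd.read_csv('Integrated_Met_Database_Locations.csv')  # location metadata file named above
print(locations.head())

met = pd.read_csv('ExampleStation_1981-2022.csv',   # hypothetical *_1981-2022.csv file
                  parse_dates=[0], index_col=0)      # assume column 0 is the hourly timestamp

print(met.resample('D').mean().head())               # aggregate hourly values to daily means
```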
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This folder contains the input data for the WaterTAP3 model that was used for the eight NAWI (National Alliance for Water Innovation) source water baselines studies published in the Environmental Science and Technology special issue: Technology Baselines and Innovation Priorities for Water Treatment and Supply. There are also eight other separate DAMS submissions, one per source water, that include the model results for the published studies. In this data submission, all model inputs across the eight baselines are included. The data structure and content are described in a README.txt file. For more details on how to use the data in WaterTAP3 please refer to the model documentation and GitHub site found at "WaterTAP3 Github" linked in the submission resources.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Corresponding peer-reviewed publication
This dataset corresponds to all the RAPID input and output files that were used in the study reported in:
When making use of any of the files in this dataset, please cite both the aforementioned article and the dataset herein.
Time format
The times reported in this description all follow the ISO 8601 format. For example, 2000-01-01T16:00-06:00 represents 4:00 PM (16:00) on Jan 1st 2000 (2000-01-01), Central Standard Time (-06:00). Additionally, when time ranges with inner time steps are reported, the first time corresponds to the beginning of the first time step, and the second time corresponds to the end of the last time step. For example, the 3-hourly time range from 2000-01-01T03:00+00:00 to 2000-01-01T09:00+00:00 contains two 3-hourly time steps. The first one starts at 3:00 AM and finishes at 6:00 AM on Jan 1st 2000, Universal Time; the second one starts at 6:00 AM and finishes at 9:00 AM on Jan 1st 2000, Universal Time.
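To make the convention concrete, this small Python sketch (standard library only, not part of the dataset) parses the example timestamps above and enumerates the two 3-hourly steps in the example range.

```python
# Parse the ISO 8601 examples above and list the 3-hourly steps in the example range.
from datetime import datetime, timedelta

t = datetime.fromisoformat('2000-01-01T16:00-06:00')   # 4:00 PM Central Standard Time
print(t.utcoffset())                                    # UTC offset of -06:00

start = datetime.fromisoformat('2000-01-01T03:00+00:00')
end = datetime.fromisoformat('2000-01-01T09:00+00:00')
step = timedelta(hours=3)

s = start
while s < end:
    print(s.isoformat(), '->', (s + step).isoformat())  # 03:00-06:00 and 06:00-09:00 UTC
    s += step
```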
Data sources
The following sources were used to produce files in this dataset:
Software
The following software were used to produce files in this dataset:
Study domain
The files in this dataset correspond to one study domain:
Description of files
All files below were prepared by Cédric H. David, using the data sources and software mentioned above.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
A diverse selection of 1000 empirical time series, along with results of an hctsa feature extraction, using v1.06 of hctsa and Matlab 2019b, computed on a server at The University of Sydney.
The results of the computation are in the hctsa file HCTSA_Empirical1000.mat, for use in Matlab with v1.06 of hctsa. The same data are also provided in .csv format: hctsa_datamatrix.csv (results of feature computation), with information about rows (time series) in hctsa_timeseries-info.csv, information about columns (features) in hctsa_features.csv (and corresponding hctsa code used to compute each feature in hctsa_masterfeatures.csv), and the data of the individual time series (each line a time series, as described in hctsa_timeseries-info.csv) in hctsa_timeseries-data.csv. These .csv files were produced by running >> OutputToCSV(HCTSA_Empirical1000.mat,true,true); in hctsa.
The input file, INP_Empirical1000.mat, is for use with hctsa and contains the time-series data and metadata for the 1000 time series. For example, massive feature extraction from these data on the user's machine, using hctsa, can proceed as >> TS_Init('INP_Empirical1000.mat');
Some visualizations of the dataset are in CarpetPlot.png (first 1000 samples of all time series as a carpet (color) plot) and 150TS-250samples.png (conventional time-series plots of the first 250 samples of a sample of 150 time series from the dataset). More visualizations can be performed by the user using TS_PlotTimeSeries from the hctsa package.
See links in references for more comprehensive documentation on performing methodological comparison using this dataset, and on how to download and use v1.06 of hctsa.
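For users working outside Matlab, a minimal pandas sketch (not part of the dataset) that loads the exported CSVs is shown below; whether hctsa_datamatrix.csv includes a header row is an assumption to check, and no internal column names are assumed.

```python
# Sketch: load the exported hctsa CSVs in Python and inspect their shapes.
import pandas as pd

data = pd.read_csv('hctsa_datamatrix.csv', header=None)   # feature matrix; header assumed absent
ts_info = pd.read_csv('hctsa_timeseries-info.csv')        # one row per time series
feat_info = pd.read_csv('hctsa_features.csv')             # one row per feature

print(data.shape)        # expect 1000 rows (time series) by number of features
print(ts_info.head())
print(feat_info.head())
```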
https://crawlfeeds.com/privacy_policy
The Dog Food Data Extracted from Chewy (USA) dataset contains 4,500 detailed records of dog food products sourced from one of the leading pet supply platforms in the United States, Chewy. This dataset is ideal for businesses, researchers, and data analysts who want to explore and analyze the dog food market, including product offerings, pricing strategies, brand diversity, and customer preferences within the USA.
The dataset includes essential information such as product names, brands, prices, ingredient details, product descriptions, weight options, and availability. Organized in a CSV format for easy integration into analytics tools, this dataset provides valuable insights for those looking to study the pet food market, develop marketing strategies, or train machine learning models.
Key Features:
CSV exports and imports of company export-import records. Follow the Eximpedia platform for HS codes, importer-exporter records, and customs shipment details.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Project: Human Resources Analysis - Human_Resources.csv
Description:
The dataset, named "Human_Resources.csv", is a comprehensive collection of employee records from a fictional company. Each row represents an individual employee, and the columns represent various features associated with that employee.
The dataset is rich, highlighting features like 'Age', 'MonthlyIncome', 'Attrition', 'BusinessTravel', 'DailyRate', 'Department', 'EducationField', 'JobSatisfaction', and many more. The main focus is the 'Attrition' variable, which indicates whether an employee left the company or not.
Employee data were sourced from various departments, encompassing a diverse array of job roles and levels. Each employee's record provides an in-depth look into their background, job specifics, and satisfaction levels.
The dataset further includes specific indicators and parameters that were considered during employee performance assessments, offering a granular look into the complexities of each employee's experience.
For privacy reasons, certain personal details and specific identifiers have been anonymized or fictionalized. Instead of names or direct identifiers, each entry is associated with a unique 'EmployeeNumber', ensuring data privacy while retaining data integrity.
The employee records were subjected to rigorous examination, encompassing both manual assessments and automated checks. The end result of this examination, specifically whether an employee left the company or not, is clearly indicated for each record.
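As a quick, hypothetical illustration of how the file can be explored, the pandas sketch below loads Human_Resources.csv and summarizes attrition by department, using column names listed in the description above; the 'Yes'/'No' coding of 'Attrition' is an assumption.

```python
# Sketch: load Human_Resources.csv and summarize attrition using columns named above.
import pandas as pd

hr = pd.read_csv('Human_Resources.csv')

# Overall attrition rate (assumes 'Attrition' is coded as 'Yes'/'No')
print(f"Overall attrition rate: {(hr['Attrition'] == 'Yes').mean():.1%}")

# Attrition rate and median monthly income by department
by_dept = hr.groupby('Department').agg(
    attrition_rate=('Attrition', lambda s: (s == 'Yes').mean()),
    median_income=('MonthlyIncome', 'median'),
)
print(by_dept)
```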
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
The dataset consists of 94,663 samples representing the training and test set. It includes the following columns: Timestamp of query (‘Time’), Cryptocurrency name (‘Cryptocurrency’), Rate (‘Rate’), Trading Volume (‘Volume’), Number of tweets (‘NumTweets’), Mean positive VADER Score (‘Positive’), Mean negative VADER Score (‘Negative’), Mean compound VADER Score (‘Compound’) and Mean neutral VADER Score (‘Neutral’). (ZIP)
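As an illustration only, the sketch below reads the table after the ZIP archive has been extracted and aggregates the VADER scores per cryptocurrency; the extracted CSV file name is a placeholder, while the column names are those listed above.

```python
# Sketch: summarize sentiment and activity per cryptocurrency using the columns above.
import pandas as pd

df = pd.read_csv('crypto_sentiment.csv', parse_dates=['Time'])   # hypothetical extracted file name

summary = df.groupby('Cryptocurrency').agg(
    mean_compound=('Compound', 'mean'),
    mean_rate=('Rate', 'mean'),
    total_tweets=('NumTweets', 'sum'),
)
print(summary.sort_values('mean_compound', ascending=False))
```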
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
LifeSnaps Dataset Documentation
Ubiquitous self-tracking technologies have penetrated various aspects of our lives, from physical and mental health monitoring to fitness and entertainment. Yet, limited data exist on the association between in-the-wild large-scale physical activity patterns, sleep, stress, and overall health, and behavioral patterns and psychological measurements, due to challenges in collecting and releasing such datasets, such as waning user engagement, privacy considerations, and diversity in data modalities. In this paper, we present the LifeSnaps dataset, a multi-modal, longitudinal, and geographically distributed dataset containing a plethora of anthropological data, collected unobtrusively over a total course of more than 4 months by n=71 participants under the European H2020 RAIS project. LifeSnaps contains more than 35 different data types, from second-level to daily granularity, totaling more than 71M rows of data. The participants contributed their data through numerous validated surveys, real-time ecological momentary assessments, and a Fitbit Sense smartwatch, and consented to make these data openly available to empower future research. We envision that releasing this large-scale dataset of multi-modal real-world data will open novel research opportunities and potential applications in the fields of medical digital innovations, data privacy and valorization, mental and physical well-being, psychology and behavioral sciences, machine learning, and human-computer interaction.
The following instructions will get you started with the LifeSnaps dataset and are complementary to the original publication.
Data Import: Reading CSV
For ease of use, we provide CSV files containing Fitbit, SEMA, and survey data at daily and/or hourly granularity. You can read the files via any programming language. For example, in Python, you can read the files into a Pandas DataFrame with the pandas.read_csv() command.
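A minimal example of the pandas.read_csv() usage mentioned above; the file name is a placeholder for one of the provided daily or hourly CSV files.

```python
# Read one of the provided CSV files into a DataFrame and take a first look.
import pandas as pd

daily = pd.read_csv('lifesnaps_daily.csv')   # placeholder name for a daily-granularity CSV
print(daily.shape)
print(daily.columns.tolist())
print(daily.head())
```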
Data Import: Setting up a MongoDB (Recommended)
To take full advantage of the LifeSnaps dataset, we recommend that you use the raw, complete data via importing the LifeSnaps MongoDB database.
To do so, open the terminal/command prompt and run the following command for each collection in the DB. Ensure you have the MongoDB Database Tools installed.
For the Fitbit data, run the following:
mongorestore --host localhost:27017 -d rais_anonymized -c fitbit
For the SEMA data, run the following:
mongorestore --host localhost:27017 -d rais_anonymized -c sema
For surveys data, run the following:
mongorestore --host localhost:27017 -d rais_anonymized -c surveys
If you have access control enabled, then you will need to add the --username and --password parameters to the above commands.
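Once the collections are restored, they can also be queried from Python with pymongo; the sketch below is a minimal illustration using the database and collection names given above (add username/password arguments if access control is enabled).

```python
# Sketch: connect to the restored database and inspect the Fitbit collection.
from pymongo import MongoClient

client = MongoClient('localhost', 27017)
db = client['rais_anonymized']

print(db.list_collection_names())         # expect: fitbit, sema, surveys
print(db['fitbit'].count_documents({}))   # number of Fitbit documents
print(db['fitbit'].find_one())            # inspect the structure of one document
```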
Data Availability
The MongoDB database contains three collections, fitbit, sema, and surveys, containing the Fitbit, SEMA3, and survey data, respectively. Similarly, the CSV files contain information related to these collections. Each document in any collection follows the format shown below:
{
_id:
Attribution 1.0 (CC BY 1.0) https://creativecommons.org/licenses/by/1.0/
This is a Wikidata 2015 NTriple dump in which the delimiter is changed to ','. The file is used in subsetting experiment via Radlog.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Input data (viscosities, in .csv format) for the modelling workflow, collected from literature, see associated publication for details.
Appendix.pdf
Tags-topics.md
Stack-exchange-query.md
RQ1/
    LDA_input/
        combined-so-quora-mallet-metadata.csv
        topic-input.mallet
    LDA_output/
        Mallet/
            output_csv/
                docs-in-topics.csv
                topic-words.csv
                topics-in-docs.csv
                topics-metadata.csv
            output_html/
                all_topics.html
                Docs/
                Topics/
RQ2/
    datasource_rawdata/
        quora.csv
        stackoverflow.csv
    manual_analysis_output/
        stackoverflow_quora_taxonomy.xlsx

## Contents of the Replication Package

---

- Appendix.pdf - appendix of the paper containing supplementary tables
- Tags-topics.md - tags selected from Stack Overflow and topics selected from Quora for the study (RQ1 & RQ2)
- Stack-exchange-query.md - the query interface used to extract the posts from the Stack Exchange explorer
- RQ1/ - contains the data used to answer RQ1
  - LDA_input/ - input data used for the LDA analysis
    - combined-so-quora-mallet-metadata.csv - Stack Overflow and Quora questions used to perform the LDA analysis
    - topic-input.mallet - input file to the MALLET tool
  - LDA_output/
    - Mallet/ - contains the LDA output generated by the MALLET tool
      - output_csv/
        - docs-in-topics.csv - documents per topic
        - topic-words.csv - most relevant topic words
        - topics-in-docs.csv - topic probability per document
        - topics-metadata.csv - metadata per document and topic probability
      - output_html/ - browsable results of the MALLET output
        - all_topics.html
        - Docs/
        - Topics/
- RQ2/ - contains the data used to answer RQ2
  - datasource_rawdata/ - contains the raw data for each source
    - quora.csv - contains the processed Quora dataset (e.g., HTML tags removed); to know more about the preprocessing steps, please refer to the reproducibility section in the paper. The data were preprocessed using the Makar tool.
    - stackoverflow.csv - contains the processed Stack Overflow dataset; to know more about the preprocessing steps, please refer to the reproducibility section in the paper. The data were preprocessed using the Makar tool.
  - manual_analysis_output/
    - stackoverflow_quora_taxonomy.xlsx - contains the classified dataset of Stack Overflow and Quora and a description of the taxonomy.
      - Taxonomy - contains the description of the first-dimension and second-dimension categories; second-dimension categories are further divided into levels, separated by the | symbol.
      - stackoverflow-posts - the questions are labelled relevant or irrelevant and categorized into the first-dimension and second-dimension categories.
      - quora-posts - the questions are labelled relevant or irrelevant and categorized into the first-dimension and second-dimension categories.

---
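The column layouts of the MALLET CSV exports are not documented in this package description, so the hypothetical Python sketch below simply loads and previews them from the paths listed above.

```python
# Sketch: preview the MALLET output CSVs; their exact column layouts are not assumed.
import pandas as pd

base = 'RQ1/LDA_output/Mallet/output_csv'
for name in ('docs-in-topics.csv', 'topic-words.csv',
             'topics-in-docs.csv', 'topics-metadata.csv'):
    df = pd.read_csv(f'{base}/{name}')
    print(name, df.shape)
    print(df.head(3))
```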