Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the accompanying dataset to the following paper https://www.nature.com/articles/s41597-023-01975-w
Caravan is an open community dataset of meteorological forcing data, catchment attributes, and discharge data for catchments around the world. Additionally, Caravan provides code to derive meteorological forcing data and catchment attributes from the same data sources in the cloud, making it easy for anyone to extend Caravan to new catchments. The vision of Caravan is to provide the foundation for a truly global open source community resource that will grow over time.
If you use Caravan in your research, please cite not only Caravan itself but also the source datasets, in recognition of the work that went into creating them and that made Caravan possible in the first place.
All current development and additional community extensions can be found at https://github.com/kratzert/Caravan
IMPORTANT: Due to size limitations for individual repositories, the netCDF version and the CSV version of Caravan (since Version 1.6) are split into two different repositories. You can find the netCDF version at https://zenodo.org/records/14673536
Change Log:
As discussed in http://bit.ly/wardpost, the City of Chicago changed to a new ward map on 5/18/2015, affecting some datasets. This ZIP file contains CSV exports from 5/15/2015 of all datasets except Crimes - 2001 to present. Due to size limitations, that CSV is at https://data.cityofchicago.org/d/5wdx-rdkp. These CSV files contain the final or close-to-final versions of the datasets with the previous ("2003") ward values.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Contains CSV data of cell features used for the analysis in the publication: "A novel MYH9 variant leads to atypical Epstein-Fechtner syndrome by altering non-muscle myosin IIA mediated contractile processes". These CSV files contain all relevant cell features per patient and cell type. Files should be titled: for controls, + + .csv; for patients, + + + + .csv. Metadata containing sex and age is also available in the files "controls_metadata.csv" and "patients_metadata.csv". Summary statistics are also included in this public dataset: for controls, "controls_summary_statistics.csv"; for patients, "patients_summary_statistics.csv". The summary statistics files were created using publicly available code: https://github.com/SaraKaliman/dc-data-novel-MYH9-variant/blob/main/Step1_summary_statistics.ipynb. The group analysis included the t-test, the U-test and the effect size for the t-test, and can be found in the file "summary_statistical_group_analysis.csv". The main figure in the article and the statistical analysis were produced using publicly available code: https://github.com/SaraKaliman/dc-data-novel-MYH9-variant/blob/main/Step2_group_comparison.ipynb. Single scalar .rtdc files are included only due to the limitation of DCOR datasets to .rtdc files.
We provide MATLAB binary files (.mat) and comma separated values files of data collected from a pilot study of a plug load management system that allows for the metering and control of individual electrical plug loads. The study included 15 power strips, each containing 4 channels (receptacles), which wirelessly transmitted power consumption data approximately once per second to 3 bridges. The bridges were connected to a building local area network which relayed data to a cloud-based service. Data were archived once per minute with the minimum, mean, and maximum power draw over each one minute interval recorded. The uncontrolled portion of the testing spanned approximately five weeks and established a baseline energy consumption. The controlled portion of the testing employed schedule-based rules for turning off selected loads during non-business hours; it also modified the energy saver policies for certain devices. Three folders are provided: “matFilesAllChOneDate” provides a MAT-file for each date, each file has all channels; “matFilesOneChAllDates” provides a MAT-file for each channel, each file has all dates; “csvFiles” provides comma separated values files for each date (note that because of data export size limitations, there are 10 csv files for each date). Each folder has the same data; there is no practical difference in content, only the way in which it is organized.
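The one-minute archiving step described above can be sketched with pandas (a hypothetical reconstruction using synthetic readings, not the pilot system's actual code):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for one channel's ~1 Hz power readings (watts)
rng = np.random.default_rng(0)
seconds = pd.date_range("2024-01-01 09:00", periods=180, freq="s")
power = pd.Series(40 + rng.normal(0, 2, len(seconds)), index=seconds)

# Archive once per minute, recording the minimum, mean, and maximum
# power draw over each one-minute interval, as in the dataset
archived = power.resample("1min").agg(["min", "mean", "max"])
print(archived.shape)  # (3, 3): three minutes, three statistics
```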
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the dataset used for the publication "Coddora: CO2-based Occupancy Detection model
trained via DOmain RAndomization". The goal is to provide training data for occupancy detection.
The dataset contains one million days of data, including 10 occupied days for each of 100,000 randomized room models (50,000 rooms with office activity and 50,000 with meeting room activity). Data were generated in EnergyPlus simulations according to the methodology described in the paper.
When using the dataset, please cite:
Manuel Weber, Farzan Banihashemi, Davor Stjelja, Peter Mandl, Ruben Mayer, and Hans-Arno Jacobsen. 2024. Coddora: CO2-Based Occupancy Detection Model Trained via Domain Randomization. In International Joint Conference on Neural Networks (IJCNN). June 30 - July 5, 2024, Yokohama, Japan.
The following files are provided:
1. dataset_office_rooms.h5 (provided as zip file)
2. dataset_meeting_rooms.h5 (provided as zip file)
3. simulated_occupancy_office_rooms.csv
4. simulated_occupancy_meeting_rooms.csv
Please use an archiving tool such as 7-Zip to unzip the HDF5 files.
Both HDF5 files contain two datasets with the following keys:
1. "data": contains the simulated indoor climate and occupancy data
2. "metadata": contains the metadata that were used for each simulation
The csv files contain the time series of occupancy that were used for the simulations.
Data includes the following fields:
Datetime: day of the year (may be relevant due to seasonal differences) and time of the day
Zone Air CO2 Concentration: CO2 level in ppm
Zone Mean Air Temperature: temperature in °C
Zone Air Relative Humidity: relative humidity in %
Occupancy: level of occupancy relative to the maximum capacity of the room (in the range [0-1])
Ventilation: fraction of window opening in the range [0.01, 1]
SimID: foreign key to reference the room properties the simulation was based on
BinaryOccupancy: 0 or 1 denoting absence or presence (for binary classification)
Example row:
Datetime | Zone Air CO2 Concentration | Zone Mean Air Temperature | Zone Air Relative Humidity | Occupancy | Ventilation | simID | BinaryOccupancy |
---|---|---|---|---|---|---|---|
10/09 11:21:00 | 1084.5624647371608 | 24.545635909907148 | 41.18393114737054 | 0.7 | 0.0 | 99 | 1 |
Metadata includes the following fields.
Underscores denote that the field was not selected during randomization but calculated from the other values.
width: room width in m
length: room length in m
height: room height in m
infiltration: infiltration per exterior area in m³/m²s
outdoor_co2: co2 concentration in the outdoor air in ppm (set to a random value between [300, 500])
orientation: angle between the room's facade orientation and the north direction in degrees
maxOccupants: room occupation limit, i.e. the maximum number of occupants
_floorArea: floor area in m² (calculated from room dimensions)
_volume: room volume in m³ (calculated from room dimensions)
_exteriorSurfaceArea: surface area of the facade wall (calculated from room dimensions)
_winToFloorRatio: ratio between total window area and floor area (calculated from room model)
firstDayUsedOfOccupancySequence: selected starting day in the sequence of occupancy data for rooms with the respective maxOccupants value
simID: unique identifier of the simulation to relate between simulation metadata and resulting simulated data
Example row:
width | length | height | infiltration | outdoor_co2 | orientation | maxOccupants | _floorArea | _volume | _exteriorSurfaceArea | _winToFloorRatio | firstDayOfUsedOccupancySequence | simID |
---|---|---|---|---|---|---|---|---|---|---|---|---|
5.481 | 5.190 | 3.264 | 0.000214 | 438.0 | 316.0 | 4.0 | 28.446 | 92.849 | 16.940 | 0.216 | 192 | 0 |
The occupancy data provided through the separate csv files contain the data from the upfront occupancy simulations that the climate simulation was based on. For each level of considered room occupancy limit (maxOccupants), the datasets provide minute values of occupancy throughout 1000 days.
Datetime, Date, Timestamp: fictive time of simulated occupancy record (sequences are in 1-minute resolution)
Occupants: number of present occupants
Occupancy: binary occupancy state (0=unoccupied, 1=occupied)
WindowState: binary state of ventilation (0=windows closed, 1=room is ventilated)
maxOccupants: maximum number of occupants considered for the simulated sequence
WindowOpeningFraction: fractional extent to which windows are opened, within the interval [0.01, 1]
Example row:
Datetime | Date | Timestamp | Occupants | Occupancy | WindowState | maxOccupants | WindowOpeningFraction |
---|---|---|---|---|---|---|---|
2023-01-01 00:00:00 | 2023-01-01 | 1.672531e+09 | 0 | 0 | 0 | 1 | 0.0 |
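Judging from the example row, the Timestamp column appears to be the Unix epoch time (seconds, UTC) of the Datetime; this is an inference from the values, not stated in the description:

```python
from datetime import datetime, timezone

# 2023-01-01 00:00:00 UTC as Unix epoch seconds
ts = datetime(2023, 1, 1, 0, 0, 0, tzinfo=timezone.utc).timestamp()
print(ts)           # 1672531200.0
print(f"{ts:.6e}")  # 1.672531e+09, as in the example row
```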
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data presented here were used to produce the following paper:
Archibald, Twine, Mthabini, Stevens (2021) Browsing is a strong filter for savanna tree seedlings in their first growing season. J. Ecology.
The project under which these data were collected is: Mechanisms Controlling Species Limits in a Changing World. NRF/SASSCAL Grant number 118588
For information on the data or analysis please contact Sally Archibald: sally.archibald@wits.ac.za
Description of file(s):
File 1: cleanedData_forAnalysis.csv (required to run the R code: "finalAnalysis_PostClipResponses_Feb2021_requires_cleanData_forAnalysis_.R")
The data represent monthly survival and growth data for ~740 seedlings from 10 species under various levels of clipping.
The data consist of one .csv file with the following column names:
treatment Clipping treatment (1 - 5 months clip plus control unclipped)
plot_rep One of three randomised plots per treatment
matrix_no Where in the plot the individual was placed
species_code First three letters of the genus name and first three letters of the species name; uniquely identifies the species
species Full species name
sample_period Classification of sampling period into time since clip
status Alive or Dead
standing.height Vertical height above ground (in mm)
height.mm Length of the longest branch (in mm)
total.branch.length Total length of all the branches (in mm)
stemdiam.mm Basal stem diameter (in mm)
maxSpineLength.mm Length of the longest spine
postclipStemNo Number of resprouting stems (only recorded AFTER clipping)
date.clipped Date clipped
date.measured Date measured
date.germinated Date germinated
Age.of.plant Date measured - Date germinated
newtreat Treatment as a numeric variable, with 8 being the control plot (for plotting purposes)
File 2: Herbivory_SurvivalEndofSeason_march2017.csv (required to run the R code: "FinalAnalysisResultsSurvival_requires_Herbivory_SurvivalEndofSeason_march2017.R")
The data consist of one .csv file with the following column names:
treatment Clipping treatment (1 - 5 months clip plus control unclipped)
plot_rep One of three randomised plots per treatment
matrix_no Where in the plot the individual was placed
species_code First three letters of the genus name and first three letters of the species name; uniquely identifies the species
species Full species name
sample_period Classification of sampling period into time since clip
status Alive or Dead
standing.height Vertical height above ground (in mm)
height.mm Length of the longest branch (in mm)
total.branch.length Total length of all the branches (in mm)
stemdiam.mm Basal stem diameter (in mm)
maxSpineLength.mm Length of the longest spine
postclipStemNo Number of resprouting stems (only recorded AFTER clipping)
date.clipped Date clipped
date.measured Date measured
date.germinated Date germinated
Age.of.plant Date measured - Date germinated
newtreat Treatment as a numeric variable, with 8 being the control plot (for plotting purposes)
genus Genus
MAR Mean Annual Rainfall for that species distribution (mm)
rainclass High/medium/low
File 3: allModelParameters_byAge.csv (required to run the R code: "FinalModelSeedlingSurvival_June2021_.R")
Consists of a .csv file with the following column headings
Age.of.plant Age in days
species_code Species
pred_SD_mm Predicted stem diameter in mm
pred_SD_up Top 75th quantile of stem diameter in mm
pred_SD_low Bottom 25th quantile of stem diameter in mm
treatdate Date when clipped
pred_surv Predicted survival probability
pred_surv_low Predicted 25th quantile survival probability
pred_surv_high Predicted 75th quantile survival probability
species_code Species code
Bite.probability Daily probability of being eaten
max_bite_diam_duiker_mm Maximum bite diameter of a duiker for this species
duiker_sd Standard deviation of bite diameter for a duiker for this species
max_bite_diameter_kudu_mm Maximum bite diameter of a kudu for this species
kudu_sd Standard deviation of bite diameter for a kudu for this species
mean_bite_diam_duiker_mm Mean bite diameter of a duiker for this species
duiker_mean_sd Standard deviation of mean bite diameter for a duiker
mean_bite_diameter_kudu_mm Mean bite diameter of a kudu for this species
kudu_mean_sd Standard deviation of mean bite diameter for a kudu
genus Genus
rainclass Low/med/high
File 4: EatProbParameters_June2020.csv (required to run the R code: "FinalModelSeedlingSurvival_June2021_.R")
Consists of a .csv file with the following column headings
shtspec species name
species_code species code
genus genus
rainclass low/medium/high
seed mass Mass of seed (g per 1000 seeds)
Surv_intercept coefficient of the model predicting survival from age of clip for this species
Surv_slope coefficient of the model predicting survival from age of clip for this species
GR_intercept coefficient of the model predicting stem diameter from seedling age for this species
GR_slope coefficient of the model predicting stem diameter from seedling age for this species
species_code species code
max_bite_diam_duiker_mm Maximum bite diameter of a duiker for this species
duiker_sd standard deviation of bite diameter for a duiker for this species
max_bite_diameter_kudu_mm Maximum bite diameter of a kudu for this species
kudu_sd standard deviation of bite diameter for a kudu for this species
mean_bite_diam_duiker_mm mean etc
duiker_mean_sd standard deviation etc
mean_bite_diameter_kudu_mm mean etc
kudu_mean_sd standard deviation etc
AgeAtEscape_duiker[t] age of plant when its stem diameter is larger than a mean duiker bite
AgeAtEscape_duiker_min[t] age of plant when its stem diameter is larger than a min duiker bite
AgeAtEscape_duiker_max[t] age of plant when its stem diameter is larger than a max duiker bite
AgeAtEscape_kudu[t] age of plant when its stem diameter is larger than a mean kudu bite
AgeAtEscape_kudu_min[t] age of plant when its stem diameter is larger than a min kudu bite
AgeAtEscape_kudu_max[t] age of plant when its stem diameter is larger than a max kudu bite
As part of a review of the Solar Planning Exemptions set out in the Planning and Development Regulations 2001, the Department, in conjunction with relevant statutory stakeholders (namely the Irish Aviation Authority (IAA), the Department of Defence and the HSE), considered the impact of glint and/or glare from solar panels on aviation receptors. Having regard to this potential impact, Solar Safeguarding Zones (SSZs) were designated around certain airports (5 km zone), aerodromes/military barracks (3 km zone) and emergency helipads (3 km zone) in order to provide appropriate safeguards in close proximity to aviation sites.
43 SSZs were introduced, within which a rooftop limit on solar panels continues to apply.
The geographical areas of the Solar Safeguarding Zones are delineated and defined by statute in Schedule 1 (a map or maps of the areas) and Schedule 2 (a list of the townlands in question / a description of the areas) of the Planning and Development (Solar Safeguarding Zone) Regulations 2022 (S.I. No. 492 of 2022).
The maps are also available to view in more detail on a non-statutory basis on myplan.ie
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
City population size is a crucial measure when trying to understand urban life. Many socio-economic indicators scale superlinearly with city size, whilst some infrastructure indicators scale sublinearly with city size. However, the impact of size also extends beyond the city’s limits. Here, we analyse the scaling behaviour of cities beyond their boundaries by considering the emergence and growth of nearby cities. Based on an urban network from African continental cities, we construct an algorithm to create the region of influence of cities. The number of cities and the population within a region of influence are then analysed in the context of urban scaling. Our results are compared against a random permutation of the network, showing that the observed scaling power of cities to enhance the emergence and growth of cities is not the result of randomness. By altering the radius of influence of cities, we observe three regimes. Large cities tend to be surrounded by many small towns for small distances. For medium distances (above 114 km), large cities are surrounded by many other cities containing large populations. Large cities boost urban emergence and growth (even more than 190 km away), but their scaling power decays with distance.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
****** UPDATE 05/15/2025
We have increased the number of basins with observational data from 3,166 to 5,188.
In addition, we have added water level data alongside streamflow measurements.
The hourly streamflow and water level data for a total of 5,188 USGS gauges are stored in individual NetCDF files and packaged together in the Hourly.7z archive. The dataset covers the period from 1980-01-01 00:00:00 to 2024-12-31 23:00:00. Missing values are indicated by NaN.
DOI: https://doi.org/10.5281/zenodo.15413207
****** UPDATE 05/01/2025
ERA5-Land forcings can be downloaded here: https://doi.org/10.5281/zenodo.15264814
******
The current version of the CAMELSH dataset contains data for 9,008 basins. Because the total data volume in the repository is approximately 57 GB, which exceeds Zenodo's size limit, we split it across two links. The first link (https://doi.org/10.5281/zenodo.15066778) contains data on attributes, shapefiles, and time series data for the first set of basins. The second link (https://doi.org/10.5281/zenodo.14889025) contains forcing (time series) data for the remaining basins. All data are compressed in 7zip format. After extraction, the dataset is organized into the following subfolders:
• The attributes folder contains 28 CSV (comma-separated values) files that store basin attributes, all with names beginning with "attributes_", plus one Excel file. Of these, the 'attributes_nldas2_climate.csv' file contains nine climate attributes (Table 2) derived from NLDAS-2 data. The 'attributes_hydroATLAS.csv' file includes 195 basin attributes derived from the HydroATLAS dataset. 26 files with names starting with 'attributes_gageii_' contain a total of 439 basin attributes extracted from the GAGES-II dataset; the name of each file represents a distinct group of attributes, as described in Table S.1. The remaining file, named 'Var_description_gageii.xlsx', provides explanatory details regarding the variable names included in the 26 CSV files, with information similar to that presented in Table S.1. The first column in all CSV files, labeled 'STAID', contains the identification (ID) names of the stream gauges. These IDs are assigned by the USGS and are sourced from the original GAGES-II dataset.
• The shapefiles folder contains two sets of shapefiles for the catchment boundary. The first set, CAMELSH_shapefile.shp, is derived from the original GAGES-II dataset and is used to obtain the corresponding climate forcing data for each catchment. The second set, CAMELSH_shapefile_hydroATLAS.shp, includes catchment boundaries derived from the HydroATLAS dataset. Each polygon in both shapefiles contains a field named GAGE_ID, which represents the ID of the stream gauges.
• The timeseries (7zip) file contains a compressed archive (7zip) that includes time series data for 3,166 basins with observed streamflow data. Within this 7zip file, there are a total of 3,166 NetCDF files, each corresponding to a specific basin. The name of each NetCDF file matches the stream gauge ID. Each file contains an hourly time series from 1980-01-01 00:00:00 to 2024-12-31 23:00:00 for streamflow (denoted as "Streamflow" in the NetCDF file) and 11 climate variables (see Table 1). The streamflow data series includes missing values, which are represented as "NaN". All meteorological forcing data and streamflow records have been standardized to the +0 UTC time zone.
• The timeseries_nonobs (7zip) file contains time series data for the remaining 5,842 basins. The structure of each NetCDF file is similar to the one described above.
• The info.csv file, located in the main directory of the dataset, contains basic information for 9,008 stream stations. This includes the stream gauge ID, the total number of observed hourly data points over 45 years (from 1980 to 2024), and the number of observed hourly data points for each year from 1980 to 2024. Stations with and without observed data are distinguished by the value in the second column, where stations without observed streamflow data have a corresponding value of 0.
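The per-year observed-hour counts tabulated in info.csv could be reproduced along the following lines (a sketch with synthetic data standing in for a real basin's series, which would normally be read from its NetCDF file with xarray or netCDF4):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for one gauge's hourly streamflow (NaN = missing),
# covering two years of the 1980-2024 record
idx = pd.date_range("1980-01-01 00:00", "1981-12-31 23:00", freq="h")
rng = np.random.default_rng(1)
flow = pd.Series(rng.gamma(2.0, 5.0, len(idx)), index=idx)
flow[flow < 2.0] = np.nan  # mark some hours as missing

# Per-year counts of observed (non-NaN) hourly data points, as in info.csv
per_year = flow.notna().groupby(flow.index.year).sum()
total_observed = int(per_year.sum())
print(per_year)
```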
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The dataset consists of three files: a file with behaviour data (events.csv), a file with item properties (item_properties.csv) and a file describing the category tree (category_tree.csv). The data has been collected from a real-world e-commerce website. It is raw data, i.e. without any content transformations; however, all values are hashed due to confidentiality concerns. The purpose of publishing is to motivate research in the field of recommender systems with implicit feedback.
The behaviour data, i.e. events like clicks, add-to-carts and transactions, represent interactions collected over a period of 4.5 months. A visitor can perform three types of events, namely “view”, “addtocart” or “transaction”. In total there are 2 756 101 events, including 2 664 312 views, 69 332 add-to-carts and 22 457 transactions, produced by 1 407 580 unique visitors. For about 90% of events, corresponding properties can be found in the “item_properties.csv” file.
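As a quick arithmetic check, the per-type counts above sum exactly to the stated total:

```python
# Event counts from the description above
views, addtocarts, transactions = 2_664_312, 69_332, 22_457
total_events = views + addtocarts + transactions
print(total_events)  # 2756101, the stated total
```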
For example:
The file with item properties (item_properties.csv) includes 20 275 902 rows, i.e. different properties, describing 417 053 unique items. The file is divided into two parts due to file size limitations. Since the property of an item can vary in time (e.g., price changes over time), every row in the file has a corresponding timestamp. In other words, the file consists of concatenated snapshots for every week in the file with the behaviour data. However, if a property of an item is constant over the observed period, only a single snapshot value will be present in the file. For example, suppose we have three properties for a single item and 4 weekly snapshots, like below:
timestamp,itemid,property,value
1439694000000,1,100,1000
1439695000000,1,100,1000
1439696000000,1,100,1000
1439697000000,1,100,1000
1439694000000,1,200,1000
1439695000000,1,200,1100
1439696000000,1,200,1200
1439697000000,1,200,1300
1439694000000,1,300,1000
1439695000000,1,300,1000
1439696000000,1,300,1100
1439697000000,1,300,1100
After the snapshot merge it would look like:
1439694000000,1,100,1000
1439694000000,1,200,1000
1439695000000,1,200,1100
1439696000000,1,200,1200
1439697000000,1,200,1300
1439694000000,1,300,1000
1439696000000,1,300,1100
Here property=100 is constant over time, property=200 has a different value in every snapshot, and property=300 changed once.
The item properties file contains a timestamp column because all properties are time dependent and may change over time, e.g. price, category, etc. Initially, this file consisted of snapshots for every week in the events file and contained over 200 million rows. We have merged consecutive constant property values, changing it from snapshot form to change-log form; constant values therefore appear only once in the file. This reduced the number of rows roughly tenfold.
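The snapshot-to-change-log merge illustrated above can be sketched as follows (a minimal reimplementation for the example rows, not the code actually used to produce the file):

```python
from itertools import groupby

# The weekly snapshot rows from the example: (timestamp, itemid, property, value)
rows = [
    (1439694000000, 1, 100, 1000), (1439695000000, 1, 100, 1000),
    (1439696000000, 1, 100, 1000), (1439697000000, 1, 100, 1000),
    (1439694000000, 1, 200, 1000), (1439695000000, 1, 200, 1100),
    (1439696000000, 1, 200, 1200), (1439697000000, 1, 200, 1300),
    (1439694000000, 1, 300, 1000), (1439695000000, 1, 300, 1000),
    (1439696000000, 1, 300, 1100), (1439697000000, 1, 300, 1100),
]

def to_change_log(snapshots):
    """Keep a row only when its value differs from the previous snapshot
    of the same (itemid, property) pair."""
    merged = []
    for _, grp in groupby(sorted(snapshots, key=lambda r: (r[1], r[2], r[0])),
                          key=lambda r: (r[1], r[2])):
        last = object()  # sentinel that never equals a real value
        for ts, item, prop, val in grp:
            if val != last:
                merged.append((ts, item, prop, val))
                last = val
    return merged

merged = to_change_log(rows)
print(len(merged))  # 7 rows, matching the merged example above
```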
All values in the “item_properties.csv” file excluding the "categoryid" and "available" properties were hashed. The value of the "categoryid" property contains the item category identifier. The value of the "available" property contains the availability of the item, i.e. 1 means the item was available, otherwise 0. All numerical values were marked with an "n" char at the beginning and have three digits of precision after the decimal point, e.g., "5" will become "n5.000" and "-3.67584" will become "n-3.675". All words in text values were normalized (stemming procedure: https://en.wikipedia.org/wiki/Stemming) and hashed, and numbers were processed as above, e.g. the text "Hello world 2017!" will become "24214 44214 n2017.000".
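The numeric encoding can be sketched as below; note that the "-3.67584" to "n-3.675" example suggests truncation rather than rounding, which is an inference from the example, not documented behaviour:

```python
import math

def encode_number(s):
    """Prefix with 'n' and keep three digits after the decimal point.
    Truncation (not rounding) is inferred from "-3.67584" -> "n-3.675"."""
    truncated = math.trunc(float(s) * 1000) / 1000
    return f"n{truncated:.3f}"

print(encode_number("5"))         # n5.000
print(encode_number("-3.67584"))  # n-3.675
print(encode_number("2017"))      # n2017.000
```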
The category tree file has 1669 rows. Every row in the file specifies a child categoryId and the corresponding parent. For example:
Retail Rocket (retailrocket.io) helps web shoppers make better shopping decisions by providing personalized real-time recommendations through multiple channels, with over 100MM unique monthly users and 1000+ retail partners around the world.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
PatagoniaMet v1.0 (PMET from here on) is a new dataset for Western Patagonia that consists of two datasets: i) PMET-obs, a compilation of quality-controlled ground-based hydrometeorological data, and ii) PMET-sim, a daily gridded product of precipitation, and maximum and minimum temperature. PMET-obs was developed using a 4-step quality control process applied to 523 hydro-meteorological time series (precipitation, air temperature, potential evaporation, streamflow and lake level stations) obtained from eight institutions in Chile and Argentina. Based on this dataset and currently available uncorrected gridded products (in this case ERA5), PMET-sim was developed using statistical bias correction procedures (i.e. quantile mapping), spatial regression models (random forest) and hydrological methods (Budyko framework). Details are given below.
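Empirical quantile mapping, one of the bias-correction procedures mentioned above, can be sketched as follows (a generic illustration with synthetic data, not the actual PMET-sim implementation):

```python
import numpy as np

def quantile_map(model_clim, obs_clim, model_values):
    """Map each model value to the observed value with the same
    non-exceedance probability (empirical quantile mapping)."""
    m_sorted, o_sorted = np.sort(model_clim), np.sort(obs_clim)
    m_probs = np.linspace(0.0, 1.0, m_sorted.size)
    o_probs = np.linspace(0.0, 1.0, o_sorted.size)
    p = np.interp(model_values, m_sorted, m_probs)  # model CDF
    return np.interp(p, o_probs, o_sorted)          # inverse observed CDF

# A model climatology with a constant +2 bias is pulled back onto the
# observed distribution
obs = np.arange(100.0)
model = obs + 2.0
corrected = quantile_map(model, obs, np.array([52.0]))
print(corrected)  # ~[50.]
```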
The streamflow metadata file (Q_PMETobs_version_metadata.csv) contains more than just the location data. Following current guidelines for hydrological datasets, the upstream area corresponding to each stream gauge was delimited (.shp file in Basins_PMETobs_version.zip), and several climatic and geographic attributes were derived. The details of the attributes can be found in the README file. For the basins that were part of the hydrological modelling (and that achieved a Kling-Gupta efficiency greater than 0.5), the file Q_PMETobs_version_water_balance.csv is attached, which contains the water balance for each basin estimated for the period 1985-2019.
Citation: Aguayo, R., León-Muñoz, J., Aguayo, M., Baez-Villanueva, O., Fernandez, A. Zambrano-Bigiarini, M., and Jacques-Coper, M. (2023) PatagoniaMet: A multi-source hydrometeorological dataset for Western Patagonia. Sci Data 11, 6 (2024). https://doi.org/10.1038/s41597-023-02828-2
Code repository: https://github.com/rodaguayo/PatagoniaMet
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/48QC7B
This replication dataset includes code and data to replicate the paper "Communication networks do not predict success in attempts at peer production". The data included are of three types:

1. A zipped tar file of compressed XML files of edits made to wikis. This includes the full text of every revision made to the 1430 wikis that were part of our analysis as of early 2010 (different wikis were collected at different times). Note: Due to the Dataverse's file size limit, this file is in two parts - wiki_com_networks-wiki_dump.tar.xz.partaa and wiki_com_networks-wiki_dump.tar.xz.partab. To combine them run: cat wiki_com_networks-wiki_dump.tar.xz.part* > wiki_com_networks-wiki_dump.tar.xz

2. A zipped tar file of the wikiq TSV files with metadata about each edit, created using the wikiq parser (https://code.communitydata.science/mediawiki_dump_tools.git). Those wishing to convert the XML files into TSV files can use the wikiq parser.

3. Summary CSV files with data about the communication network and activity levels for each wiki, in other words, the data used for the analyses in the paper. Code for converting the TSV files into these summary CSV files is included.

A more detailed description of how to replicate the figures and analyses from the paper is given in the README file included with the code.
Temperature data in degrees Celsius, collected approximately every 30 minutes since 2017 by 19 sensors distributed on Garibaldi Street in Lyon and elsewhere in the Metropolis (see the sensor location data). These sensors were installed by the Métropole de Lyon as part of the European Biotope project (https://www.grandlyon.com/metropole/affaires-europeennes/biotope). They run on battery and some no longer transmit data (see inactive sensors in the sensor location data). Note: data transmission is expected to stop by the end of 2024, depending on the battery level of each sensor and the scheduled shutdown of the LoRa data transmission network.

Downloading this large dataset (more than 1,500,000 records) in CSV format requires a specific query to limit the file size so that the result can be opened. For example:
— data for sensor N°70b3d580a0100648 can be downloaded via: https://download.data.grandlyon.com/ws/timeseries/biotope.temperature/all.csv?field=deveui&value=70b3d580a0100648&maxfeatures=-1
— 1,000,000 records starting from the 700,000th element can be downloaded in CSV via: https://download.data.grandlyon.com/ws/timeseries/biotope.temperature/all.csv?maxfeatures=1000000&start=700000

Refer to the platform documentation for more details (filtering by sensor number, date range, etc.): https://rdata-grandlyon.readthedocs.io/fr/latest/
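The example download URLs above can be built programmatically; this small helper sketch uses only the query parameters shown in the description:

```python
from urllib.parse import urlencode

BASE = "https://download.data.grandlyon.com/ws/timeseries/biotope.temperature/all.csv"

def sensor_url(deveui):
    """All readings for one sensor; maxfeatures=-1 removes the row limit."""
    return f"{BASE}?{urlencode({'field': 'deveui', 'value': deveui, 'maxfeatures': -1})}"

def page_url(start, count):
    """One page of `count` records starting at offset `start`."""
    return f"{BASE}?{urlencode({'maxfeatures': count, 'start': start})}"

print(sensor_url("70b3d580a0100648"))
print(page_url(700_000, 1_000_000))
```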
This data provides results from the California Environmental Data Exchange Network (CEDEN) for field and lab chemistry analyses. The data set contains two provisionally assigned values (“DataQuality” and “DataQualityIndicator”) to help users interpret the data quality metadata provided with the associated result.
Due to file size limitations, the data has been split into individual resources by year. The entire dataset can also be downloaded in bulk using the zip files on this page (in csv format or parquet format), and developers can also use the API associated with each year's dataset to access the data.
Users who want to manually download more specific subsets of the data can also use the CEDEN Query Tool, which provides access to the same data presented here, but allows for interactive data filtering.
NOTE: Some of the field and lab chemistry data that has been submitted to CEDEN since 2020 has not been loaded into the CEDEN database. That data is not included in this data set (and is also not available via the CEDEN query tool described above), but is available as a supplemental data set available here: Surface Water - Chemistry Results - CEDEN Augmentation. For consistency, many of the conditions applied to the data in this dataset and in the CEDEN query tool are also applied to that supplemental dataset (e.g., no rejected data or replicates are included), but that supplemental data is provisional and may not reflect all of the QA/QC controls applied to the regular CEDEN data available here.
Open Government Licence - Canada 2.0https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
PURPOSE: To provide a permanent repository of key data series necessary to build a range-wide American eel stock assessment. DESCRIPTION: This collection presents data associated with the following report: Cairns, D.K. 2020. Landings, abundance indicators, and biological data for a potential range-wide American eel stock assessment. Canadian Data Report of Fisheries and Aquatic Science. No. 1311: v + 180 pp. Much of the data collection is from the Atlantic Provinces of Canada, particularly the Southern Gulf of St. Lawrence. The collection also includes data from elsewhere in the American eel's range in Canada, and also the United States and the Caribbean Basin. Files in the collection are as follows. Cairns2020_AnnexA_ReportTables.xlsx: This Excel file (file size 756 kb) contains all 37 tables in Cairns (2020) exactly as they appear in the report. Cairns2020_AnnexB_EelLengthsAgesEfishingRecords.xlsx: This Excel file (file size 3.1 mb) contains 20,047 records of American eel lengths and other biological data from the Canadian Atlantic Provinces, 1983-2017. Records include weights of 8,915 eels and ages of 2,212 eels. Records of 3,224 electrofishing sessions in the Miramichi River, New Brunswick, 1952-2019, and records of 2,590 electrofishing sessions in the Restigouche River, New Brunswick, 1972-2019 are included. Cairns2020_AnnexC_EelLengthsAgesDataDefinitions.csv: This .csv file (file size 4 kb) gives data definitions in English and French for the table of eel lengths and other biological data that is contained in Cairns2020_AnnexB_EelLengthsAgesEfishingRecords.xlsx and in Cairns2020_AnnexD_EelLengthsAges.csv. Cairns2020_AnnexD_EelLengthsAges.csv: This file (file size 2.0 mb) presents in .csv format the table of eel lengths and other biological data that is also presented in Cairns2020_AnnexB_EelLengthsAgesEfishingRecords.xlsx. 
Cairns2020_AnnexE_EelEFishingDataDefinitions.csv: This .csv file (file size 2 kb) gives data definitions in English and French for the table of eel electrofishing data that is contained in Cairns2020_AnnexB_EelLengthsAgesEfishingRecords.xlsx and in Cairns2020_AnnexD_EelLengthsAges.csv. Cairns2020_AnnexF_EelEFishing.csv: This file (file size 314 kb) presents in .csv format the table of eel electrofishing data that is also presented in Cairns2020_AnnexB_EelLengthsAgesEfishingRecords.xlsx. Cairns2020_AnnexG_OtolithImageMetadata.csv: This .csv file (file size 2 kb) provides metadata for the collection of eel otolith images. Files with names starting with EelOtos . . . . : These .tif, .jpg, and .bmp image files are in zipped format with a summed size of 5.3 gb. The files give magnified photos of 1,838 eel otoliths that have been prepared for age reading. Samples are from the Atlantic Provinces of Canada. Individual otolith codes in Cairns2020_AnnexB_EelLengthsAgesEfishingRecords.xlsx and in Cairns2020_AnnexC_EelLengthsAgesDataDefinitions.csv match the codes embedded in otolith image filenames. PARAMETERS COLLECTED: American eel landings, number caught, and effort of commercial and research fishing gear. American eel lengths, ages, sex and other biological data and sampling locations. NOTES ON QUALITY CONTROL: All keypunched records of landings, densities, and other data were verified against original sources. Landings and abundance indices were reviewed in a Department of Fisheries and Oceans scientific workshop and corrected as necessary. Length and age data were examined by length-weight and length age plots and implausible records were discarded. PHYSICAL SAMPLE DETAILS: No physical samples SAMPLING METHODS: Landings are from government fisheries agencies. Abundance indices are from commercial fyke, spear, and trap catch per unit effort, and from research ladder counts and electrofishing records. 
Mean elver lengths are compiled from published literature. Sex ratios are compiled from published literature. Locations of biological and genetic sampling are compiled from published literature. American eel lengths are total length of live specimens. Ages are from otolith annulus readings. Electrofishing records are from backpack electrofishing surveys in wadeable waters. USE LIMITATION: To ensure scientific integrity and appropriate use of the data, we encourage you to contact the data custodian.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For further information, see article.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Based on length frequency data of miter squid (Uroteuthis chinensis) collected in the northeastern South China Sea in 1975–1977, 1997–1999, and 2018–2019, asymptotic length, optimal length at first capture, relative mortality, and relative biomass of the stock were estimated using length-based Bayesian biomass estimation (LBB). The LBB-estimated asymptotic length for 2018–2019 was smaller than for the earlier periods. Optimal lengths at first capture for the latter two periods far exceeded average lengths in catches because of a major increase in fishing intensity. Between 1975 and 1977, relative total mortality (Z/K) was low, but it increased in the latter two periods, while relative natural mortality (M/K) showed a downward trend. Relative biomasses (B/B0 and B/Bmsy) indicated that the stock was close to unexploited between 1975 and 1977, but they declined to 6% and 4% in the later periods, corresponding to the growth in fishing horsepower. Indeed, by 2018, fishing horsepower had increased to nearly four times the optimal level. The analysis suggests that the stock of miter squid has been overfished since the mid-1980s and is now under heavy fishing pressure. To recover the stock, it is imperative to reduce fishing intensity and enforce size-at-first-capture regulations.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Oceanic diel vertical migration (DVM) constitutes the daily movement of various mesopelagic organisms migrating vertically from depth to feed in shallower waters and return to deeper water during the day. Accurate classification of taxa that participate in DVM remains non-trivial, and there can be discrepancies between methods. DEEPEND consortium (www.deependconsortium.org) scientists have been characterizing the diversity and trophic structure of pelagic communities in the northern Gulf of Mexico (nGoM). Profiling has included scientific echosounders to provide accurate and quantitative estimates of organismal density and timing as well as quantitative net sampling of micronekton. The use of environmental DNA (eDNA) can detect uncultured microbial taxa and the remnants that larger organisms leave behind in the environment. eDNA offers the potential to increase understanding of the DVM and the organisms that participate. Here we used real-time shipboard echosounder data to direct the sampling of eDNA in seawater at various time-points during the ascending and descending DVM. This approach allowed the observation of shifts in eDNA profiles concurrent with the movement of organisms in the DVM as measured by acoustic sensors. Seawater eDNA was sequenced using a high-throughput metabarcoding approach. Additionally, fine-scale acoustic data using an autonomous multifrequency echosounder was collected simultaneously with the eDNA samples and changes in organism density in the water column were compared with changes in eDNA profiles. Our results show distinct shifts in eukaryotic taxa such as copepods, cnidarians, and tunicates, over short timeframes during the DVM. These shifts in eDNA track changes in the depth of sound scattering layers (SSLs) of organisms and the density of organisms around the CTD during eDNA sampling. 
Dominant taxa in eDNA samples were mostly smaller organisms that may be below the size limit for acoustic detection, while taxa such as teleost fish were much less abundant in eDNA data than in acoustic data. Overall, these data suggest that eDNA may be a powerful new tool for understanding the dynamics and composition of the DVM, yet challenges remain in reconciling differences among sampling methodologies.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Body size is a trait of fundamental ecological and evolutionary importance that often differs between males and females (sexual size dimorphism; SSD). The island rule predicts that small-bodied species tend to evolve larger following a release from interspecific competition and predation in insular environments. According to Rensch’s rule, male body size relative to female body size increases with increasing mean body size. This allometric body size – SSD scaling is explained by male-driven body size evolution. These ecogeographical rules are rarely tested within species, and have not been addressed in a cave–surface context, even though caves represent insular environments (small and isolated, with simple communities). By analyzing six cave and nine surface populations of the widespread, primarily surface-dwelling freshwater isopod Asellus aquaticus with male-biased SSD, we tested whether cave populations evolved larger body size and higher SSD than the surface populations. We found extensive between-population variation in body size (maximum divergence being 74%) and SSD (males being 15%–50% larger than females). However, habitat type did not explain the body size and SSD variation, and we could not reject isometry in the male–female body size relationship. Hence, we found no support for the island rule or Rensch’s rule. We conclude that local selective forces stemming from environmental factors other than island vs. mainland or the general surface vs. cave characteristics are responsible for the reported population variation.
Open Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Correction - 8 October 2020 An error has been found involving Figure 2.9: Projected population change, council area, mid-2018 to mid-2028. The ‘percentage change’ value for Scotland was entered in error and has now been corrected (from 4.4% to 1.8%). The council figures are unaffected. Corrections have been made to the ‘All Sections’ and ‘Population’ data tables (Excel and CSV) files. Maximum file size: 3 MB