13 datasets found

e
Data from: Superconductor-ferromagnet hybrids for non-reciprocal electronics...
ekoizpen-zientifikoa.ehu.eus
data.europa.eu
Updated 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Zhuoran Geng; Hijano, Alberto; Ilic, Stefan; Ilyn, Maxim; Maasilta, Ilari J.; Monfardini, Alessandro; Spies, Maria; Strambini, Elia; Virtanen, Pauli; Calvo, Martino; Gonzales-Orellana, Carmen; Helenius, Ari P.; Khorshidian, Sara; Clodoaldo I. Levartoski De Araujo; Levy-Bertrand, Florence; Rogero, Celia; Giazotto, Francesco; F. Sebastian Bergeret; Heikkilä, Tero T.; Zhuoran Geng; Hijano, Alberto; Ilic, Stefan; Ilyn, Maxim; Maasilta, Ilari J.; Monfardini, Alessandro; Spies, Maria; Strambini, Elia; Virtanen, Pauli; Calvo, Martino; Gonzales-Orellana, Carmen; Helenius, Ari P.; Khorshidian, Sara; Clodoaldo I. Levartoski De Araujo; Levy-Bertrand, Florence; Rogero, Celia; Giazotto, Francesco; F. Sebastian Bergeret; Heikkilä, Tero T. (2023). Superconductor-ferromagnet hybrids for non-reciprocal electronics and detectors [Dataset]. https://ekoizpen-zientifikoa.ehu.eus/documentos/668fc45cb9e7c03b01bdb087?lang=de
Explore at:
Dataset updated
2023
Authors
Zhuoran Geng; Hijano, Alberto; Ilic, Stefan; Ilyn, Maxim; Maasilta, Ilari J.; Monfardini, Alessandro; Spies, Maria; Strambini, Elia; Virtanen, Pauli; Calvo, Martino; Gonzales-Orellana, Carmen; Helenius, Ari P.; Khorshidian, Sara; Clodoaldo I. Levartoski De Araujo; Levy-Bertrand, Florence; Rogero, Celia; Giazotto, Francesco; F. Sebastian Bergeret; Heikkilä, Tero T.; Zhuoran Geng; Hijano, Alberto; Ilic, Stefan; Ilyn, Maxim; Maasilta, Ilari J.; Monfardini, Alessandro; Spies, Maria; Strambini, Elia; Virtanen, Pauli; Calvo, Martino; Gonzales-Orellana, Carmen; Helenius, Ari P.; Khorshidian, Sara; Clodoaldo I. Levartoski De Araujo; Levy-Bertrand, Florence; Rogero, Celia; Giazotto, Francesco; F. Sebastian Bergeret; Heikkilä, Tero T.
Description
Data for the manuscript "Superconductor-ferromagnet hybrids for non-reciprocal electronics and detectors", submitted to Superconductor Science and Technology, arXiv:2302.12732. This archive contains the data for all plots of numerical data in the manuscript. ## Fig. 4
Data of Fig. 4 in the WDX (Wolfram Data Exchange) format (unzip to extract the files). Contains critical exchange fields and critical thicknesses as functions of the temperature. Can be opened with Wolfram Mathematica with the command: Import[FileNameJoin[{NotebookDirectory[],"filename.wdx"}]] ## Fig. 5
Data of Fig. 5 in the WDX (Wolfram Data Exchange) format (unzip to extract the files). Contains theoretically calculated I(V) curves and the rectification coefficient R of N/FI/S junctions. Can be opened with Wolfram Mathematica with the command Import[FileNameJoin[{NotebookDirectory[],"filename.wdx"}]]. ## Fig. 7a
Data of Fig. 7a in the ascii format. Contains G in uS as a function of B in mT and V in mV. ## Fig. 7c
Data of Fig. 7c in the ascii format. Contains G in uS as a function of B in mT and V in mV. ## Fig. 7e
Data of Fig. 7e in the ascii format. Contains G in uS as a function of B in mT and V in mV. The plots 7b, d, and f are taken from the plots a, c and e as indicated in the caption of the figure. ## Fig. 8
Data of Fig. 8 in the ascii format. Contains G in uS as a function V in mV for several values of B in mT. ## Fig. 8 inset
Data of Fig. 8 inset in the ascii format. Contains G_0/G_N as a function of B in mT. ## Fig9a_b First raw Magnetic field values in T, first column voltage drop in V,
rest of the columns differential conductance in S ## Fig9b_FIT First raw Magnetic field values in T, first column voltage drop in V,
rest of the columns differential conductance in S ## Fig9c First raw Magnetic field values in T, first column voltage drop in V,
rest of the columns R (real number) ## Fig9c inset First raw Magnetic field values in T, odd columns voltage drop in V,
even columns injected current in A ## Fog9d Foist column magnetic field in T, second column conductance ration (real
number), sample name in the file name. ## Fig. 12
Data of Fig. 12 in the ascii format. Contains energy resolution as functions of temperature and tunnel resistance with current and voltage readout. ## Fig. 13
Data of Fig. 13 in the ascii format. Contains energy resolution as functions of (a) exchange field, (b) polarization, (c) dynes, and (d) absorber volume with different amplifier noises. ## Fig. 14
Data of Fig. 14 in the ascii format. Contains detector pulse current as functions of (a) temperature change (b) time with different detector parameters.
## Fig. 17
Data of Fig. 17 in the ascii format. Contains dIdV curves as function of the voltage for different THz illumination frequency and polarization. ## Fig. 18
Data of Fig. 18 in the ascii format. Contains the current flowing throughout the junction as function time (arbitrary units) for ON and OFF illumination at 150 GHz for InPol and CrossPol polarization. ## Fig. 21
Data of Fig. 21c in the ascii format. Contains the magnitude of readout line S43 as frequency.
Data of Fig. 21d in the ascii format. Contains the magnitude of iKID line S21 as frequency.
g
Water Temperature of Lakes in the Conterminous U.S. Using the Landsat 8...
gimi9.com
data.usgs.gov
+2more
Updated Feb 22, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2025). Water Temperature of Lakes in the Conterminous U.S. Using the Landsat 8 Analysis Ready Dataset Raster Images from 2013-2023 [Dataset]. https://gimi9.com/dataset/data-gov_water-temperature-of-lakes-in-the-conterminous-u-s-using-the-landsat-8-analysis-ready-2013
Explore at:
Dataset updated
Feb 22, 2025
Area covered
Contiguous United States
Description
This data release contains lake and reservoir water surface temperature summary statistics calculated from Landsat 8 Analysis Ready Dataset (ARD) images available within the Conterminous United States (CONUS) from 2013-2023. All zip files within this data release contain nested directories using .parquet files to store the data. The file example_script_for_using_parquet.R contains example code for using the R arrow package (Richardson and others, 2024) to open and query the nested .parquet files. Limitations with this dataset include: - All biases inherent to the Landsat Surface Temperature product are retained in this dataset which can produce unrealistically high or low estimates of water temperature. This is observed to happen, for example, in cases with partial cloud coverage over a waterbody. - Some waterbodies are split between multiple Landsat Analysis Ready Data tiles or orbit footprints. In these cases, multiple waterbody-wide statistics may be reported - one for each data tile. The deepest point values will be extracted and reported for tile covering the deepest point. A total of 947 waterbodies are split between multiple tiles (see the multiple_tiles = “yes” column of site_id_tile_hv_crosswalk.csv). - Temperature data were not extracted from satellite images with more than 90% cloud cover. - Temperature data represents skin temperature at the water surface and may differ from temperature observations from below the water surface. Potential methods for addressing limitations with this dataset: - Identifying and removing unrealistic temperature estimates: - Calculate total percentage of cloud pixels over a given waterbody as: percent_cloud_pixels = wb_dswe9_pixels/(wb_dswe9_pixels + wb_dswe1_pixels), and filter percent_cloud_pixels by a desired percentage of cloud coverage. - Remove lakes with a limited number of water pixel values available (wb_dswe1_pixels < 10) - Filter waterbodies where the deepest point is identified as water (dp_dswe = 1) - Handling waterbodies split between multiple tiles: - These waterbodies can be identified using the "site_id_tile_hv_crosswalk.csv" file (column multiple_tiles = “yes”). A user could combine sections of the same waterbody by spatially weighting the values using the number of water pixels available within each section (wb_dswe1_pixels). This should be done with caution, as some sections of the waterbody may have data available on different dates. All zip files within this data release contain nested directories using .parquet files to store the data. The example_script_for_using_parquet.R contains example code for using the R arrow package to open and query the nested .parquet files. - "year_byscene=XXXX.zip" – includes temperature summary statistics for individual waterbodies and the deepest points (the furthest point from land within a waterbody) within each waterbody by the scene_date (when the satellite passed over). Individual waterbodies are identified by the National Hydrography Dataset (NHD) permanent_identifier included within the site_id column. Some of the .parquet files with the _byscene datasets may only include one dummy row of data (identified by tile_hv="000-000"). This happens when no tabular data is extracted from the raster images because of clouds obscuring the image, a tile that covers mostly ocean with a very small amount of land, or other possible. An example file path for this dataset follows: year_byscene=2023/tile_hv=002-001/part-0.parquet -"year=XXXX.zip" – includes the summary statistics for individual waterbodies and the deepest points within each waterbody by the year (dataset=annual), month (year=0, dataset=monthly), and year-month (dataset=yrmon). The year_byscene=XXXX is used as input for generating these summary tables that aggregates temperature data by year, month, and year-month. Aggregated data is not available for the following tiles: 001-004, 001-010, 002-012, 028-013, and 029-012, because these tiles primarily cover ocean with limited land, and no output data were generated. An example file path for this dataset follows: year=2023/dataset=lakes_annual/tile_hv=002-001/part-0.parquet - "example_script_for_using_parquet.R" – This script includes code to download zip files directly from ScienceBase, identify HUC04 basins within desired landsat ARD grid tile, download NHDplus High Resolution data for visualizing, using the R arrow package to compile .parquet files in nested directories, and create example static and interactive maps. - "nhd_HUC04s_ingrid.csv" – This cross-walk file identifies the HUC04 watersheds within each Landsat ARD Tile grid. -"site_id_tile_hv_crosswalk.csv" - This cross-walk file identifies the site_id (nhdhr_{permanent_identifier}) within each Landsat ARD Tile grid. This file also includes a column (multiple_tiles) to identify site_id's that fall within multiple Landsat ARD Tile grids. - "lst_grid.png" – a map of the Landsat grid tiles labelled by the horizontal – vertical ID.
d
Young and older adult vowel categorization responses
datadryad.org
zip
Updated Mar 14, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Mishaela DiNino (2024). Young and older adult vowel categorization responses [Dataset]. http://doi.org/10.5061/dryad.brv15dvh0
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.5061/dryad.brv15dvh0
Dataset updated
Mar 14, 2024
Dataset provided by
Dryad
Authors
Mishaela DiNino
Time period covered
Feb 20, 2024
Description
Young and older adult vowel categorization responses

https://doi.org/10.5061/dryad.brv15dvh0

On each trial, participants heard a stimulus and clicked a box on the computer screen to indicate whether they heard "SET" or "SAT." Responses of "SET" are coded as 0 and responses of "SAT" are coded as 1. The continuum steps, from 1-7, for duration and spectral quality cues of the stimulus on each trial are named "DurationStep" and "SpectralStep," respectively. Group (young or older adult) and listening condition (quiet or noise) information are provided for each row of the dataset.
o
Uniform Crime Reporting (UCR) Program Data: Arrests by Age, Sex, and Race,...
openicpsr.org
search.datacite.org
Updated Aug 16, 2018
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jacob Kaplan (2018). Uniform Crime Reporting (UCR) Program Data: Arrests by Age, Sex, and Race, 1980-2016 [Dataset]. http://doi.org/10.3886/E102263V5
Explore at:
Unique identifier
https://doi.org/10.3886/E102263V5
Dataset updated
Aug 16, 2018
Dataset provided by
University of Pennsylvania
Authors
Jacob Kaplan
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
1980 - 2016
Area covered
United States
Description
Version 5 release notes:
Removes support for SPSS and Excel data.
Changes the crimes that are stored in each file. There are more files now with fewer crimes per file. The files and their included crimes have been updated below.
Adds in agencies that report 0 months of the year.
Adds a column that indicates the number of months reported. This is generated summing up the number of unique months an agency reports data for. Note that this indicates the number of months an agency reported arrests for ANY crime. They may not necessarily report every crime every month. Agencies that did not report a crime with have a value of NA for every arrest column for that crime.
Removes data on runaways.
Version 4 release notes:
Changes column names from "poss_coke" and "sale_coke" to "poss_heroin_coke" and "sale_heroin_coke" to clearly indicate that these column includes the sale of heroin as well as similar opiates such as morphine, codeine, and opium. Also changes column names for the narcotic columns to indicate that they are only for synthetic narcotics.
Version 3 release notes:
Add data for 2016.
Order rows by year (descending) and ORI.
Version 2 release notes:
Fix bug where Philadelphia Police Department had incorrect FIPS county code.

The Arrests by Age, Sex, and Race data is an FBI data set that is part of the annual Uniform Crime Reporting (UCR) Program data. This data contains highly granular data on the number of people arrested for a variety of crimes (see below for a full list of included crimes). The data sets here combine data from the years 1980-2015 into a single file. These files are quite large and may take some time to load.

All the data was downloaded from NACJD as ASCII+SPSS Setup files and read into R using the package asciiSetupReader. All work to clean the data and save it in various file formats was also done in R. For the R code used to clean this data, see here. https://github.com/jacobkap/crime_data. If you have any questions, comments, or suggestions please contact me at jkkaplan6@gmail.com.

I did not make any changes to the data other than the following. When an arrest column has a value of "None/not reported", I change that value to zero. This makes the (possible incorrect) assumption that these values represent zero crimes reported. The original data does not have a value when the agency reports zero arrests other than "None/not reported." In other words, this data does not differentiate between real zeros and missing values. Some agencies also incorrectly report the following numbers of arrests which I change to NA: 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 99999, 99998.

To reduce file size and make the data more manageable, all of the data is aggregated yearly. All of the data is in agency-year units such that every row indicates an agency in a given year. Columns are crime-arrest category units. For example, If you choose the data set that includes murder, you would have rows for each agency-year and columns with the number of people arrests for murder. The ASR data breaks down arrests by age and gender (e.g. Male aged 15, Male aged 18). They also provide the number of adults or juveniles arrested by race. Because most agencies and years do not report the arrestee's ethnicity (Hispanic or not Hispanic) or juvenile outcomes (e.g. referred to adult court, referred to welfare agency), I do not include these columns.

To make it easier to merge with other data, I merged this data with the Law Enforcement Agency Identifiers Crosswalk (LEAIC) data. The data from the LEAIC add FIPS (state, county, and place) and agency type/subtype. Please note that some of the FIPS codes have leading zeros and if you open it in Excel it will automatically delete those leading zeros.

I created 9 arrest categories myself. The categories are:
Total Male Juvenile
Total Female Juvenile
Total Male Adult
Total Female Adult
Total Ma
CTD Data from the Rectangular Midwater Trawl collected during the BROKE-West...
data.aad.gov.au
researchdata.edu.au
+1more
Updated Aug 24, 2010
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
GORTON, REBECCA (2010). CTD Data from the Rectangular Midwater Trawl collected during the BROKE-West voyage of the Aurora Australis, 2006 [Dataset]. http://doi.org/10.4225/15/5988073792cf2
Explore at:
Unique identifier
https://doi.org/10.4225/15/5988073792cf2
Dataset updated
Aug 24, 2010
Dataset provided by
Australian Antarctic Divisionhttps://www.antarctica.gov.au/
Australian Antarctic Data Centre
Authors
GORTON, REBECCA
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 17, 2006 - Feb 28, 2006
Area covered
Description
The CTD data were acquired when the RMT instrument was in the water.

Data Acquisition:

There is a FSI CTD sensor housed in a fibreglass box that is attached to the top bar of the RMT. The RMT software running in the aft control room establishes a Telnet connection to the aft control terminal server which connects to the CTD sensor using various hardware connections. Included are the calibration data for the CTD sensor that were used for the duration of the voyage.

The RMT software receives packet of CTD data and every second the most recent CTD data are written out to a data file. Additional information about the motor is also logged with the CTD data.

Data are only written to the data file when the net is in the water. The net in and out of water status is determined by the conductivity value. The net is deemed to be in the water when the conductivity averaged over a 10 second period is greater than 0. When the average value is less than 0 the net is deemed to be out of the water. New data files were automatically created for each trawl.

Data Processing:

Adjustment of the net open time.

If the net did not open when first attempted then the net was 'jerked' open. This meant the winch operator adjusted the winch control so that it was at maximum speed and then turned it on for a very short time. This had the effect of dropping the net a short distance very quickly. This dislodges the net hook from its cradle and the net opens. The scientist responsible for the trawl would have noted the time in the trawl log book that the winch operator turned on the winch to jerk the net.

The data files will have started the 'net open' counter 10 seconds after the user clicks the 'Net Open' button. If this time did not match the time written in the trawl log book by the scientist, then the net open time in the CSV file was adjusted. The value in the 'Net Open Time' column will increment from the time the net started to open to the time that the net started to close.

The pressure was also plotted to ensure that the time written down in the log book was correct. When the net opens there is a visible change in the CTD pressure value received. The net 'flies' up as the drag in the water increases as the net opens. If the time noted was incorrect then the scientist responsible for the log book, So Kawaguchi, was notifed of the problem and the data file was not adjusted.

Removing unused columns from the original log files.

The original log files that were produced by the RMT software were trimmed to remove any columns that did not pertain to the CTD data. These columns include the motor information and the ITI data. The ITI data gives information about the distance from the net to the ship but was not working for the duration of the BROKE-West voyage. This trimming was completed using a purpose built java application. This java class is part of the NOODLES source code.

Dataset Format:

The dataset is in a zip format. There is a .CSV file for each trawl, 125 in total. There were 51 Routine trawls and 74 Target Trawls. The file naming convention is as follows:

[Routine/Target]NNN-rmt-2006-MM-DD.csv

Where,

NNN is the trawl number from 001 to 124. MM is the month, 01 or 02 DD is the day of the month.

Also included in the zip file are the calibration files for each of the CTD sensors and the current documentation on the RMT software.

Each CSV file contains the following columns:

Date (UTC) Time (UTC) Ship Latitude (decimal degrees) Ship Longitude (decimal degrees) Conductivity (mS/cm) Temperature (Deg C) Pressure (DBar) Salinity (PSU) Sound Velocity (m/s) Fluorometer (ug/L chlA) Net Open Time (mm:ss) If the net is not open this value will be 0, else the number of minutes and seconds since the net opened will be displayed.

When the user clicks the 'Net Open' button there is a delay of 10 seconds before the net starts to open. The value displayed in the 'Net Open Time' column starts incrementing once this 10 seconds delay has passed. Similarly when the user clicks the 'Net Close' button there is a delay of 6 seconds before the net starts to close. Thus the counter stops once this 6 seconds has passed.

Acronyms Used:

CTD: Conductivity, Temperature, Pressure RMT: Rectangular Midwater Trawl CSV: Comma seperated value FSI: Falmouth Scientific Inc ITI: Intelligent Trawl Interface

This work was completed as part of ASAC projects 2655 and 2679 (ASAC_2655, ASAC_2679).
Population and GDP/GNI/CO2 emissions (2019, raw data)
figshare.com
txt
Updated Feb 23, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Liang Zhao (2023). Population and GDP/GNI/CO2 emissions (2019, raw data) [Dataset]. http://doi.org/10.6084/m9.figshare.22085060.v6
Explore at:
txtAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.22085060.v6
Dataset updated
Feb 23, 2023
Dataset provided by
Figsharehttp://figshare.com/
figshare
Authors
Liang Zhao
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Original dataset The original year-2019 dataset was downloaded from the World Bank Databank by the following approach on July 23, 2022.

Database: "World Development Indicators" Country: 266 (all available) Series: "CO2 emissions (kt)", "GDP (current US$)", "GNI, Atlas method (current US$)", and "Population, total" Time: 1960, 1970, 1980, 1990, 2000, 2010, 2017, 2018, 2019, 2020, 2021 Layout: Custom -> Time: Column, Country: Row, Series: Column Download options: Excel

Preprocessing

With libreoffice,

remove non-country entries (lines after Zimbabwe), shorten column names for easy processing: Country Name -> Country, Country Code -> Code, "XXXX ... GNI ..." -> GNI_1990, etc (notice '_', not '-', for R), remove unnesssary rows after line Zimbabwe.
Intermediate data for TE calculation
zenodo.org
bin, csv
Updated May 9, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Yue Liu; Yue Liu (2025). Intermediate data for TE calculation [Dataset]. http://doi.org/10.5281/zenodo.10373032
Explore at:
csv, binAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.10373032
Dataset updated
May 9, 2025
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Yue Liu; Yue Liu
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset includes intermediate data from RiboBase that generates translation efficiency (TE). The code to generate the files can be found at https://github.com/CenikLab/TE_model.

We uploaded demo HeLa .ribo files, but due to the large storage requirements of the full dataset, I recommend contacting Dr. Can Cenik directly to request access to the complete version of RiboBase if you need the original data.

The detailed explanation for each file:

human_flatten_ribo_clr.rda: ribosome profiling clr normalized data with GEO GSM ids in columns and genes in rows in human.

human_flatten_rna_clr.rda: matched RNA-seq clr normalized data with GEO GSM ids in columns and genes in rows in human.

human_flatten_te_clr.rda: TE clr data with GEO GSM ids in columns and genes in rows in human.

human_TE_cellline_all_plain.csv: TE clr data with genes in rows and cell lines in rows in human.

human_RNA_rho_new.rda: matched RNA-seq proportional similarity data as genes by genes matrix in human.

human_TE_rho.rda: TE proportional similarity data as genes by genes matrix in human.

mouse_flatten_ribo_clr.rda: ribosome profiling clr normalized data with GEO GSM ids in columns and genes in rows in mouse.

mouse_flatten_rna_clr.rda: matched RNA-seq clr normalized data with GEO GSM ids in columns and genes in rows in mouse.

mouse_flatten_te_clr.rda: TE clr data with GEO GSM ids in columns and genes in rows in mouse.

mouse_TE_cellline_all_plain.csv: TE clr data with genes in rows and cell lines in rows in mouse.

mouse_RNA_rho_new.rda: matched RNA-seq proportional similarity data as genes by genes matrix in mouse.

mouse_TE_rho.rda: TE proportional similarity data as genes by genes matrix in mouse.

All the data was passed quality control. There are 1054 mouse samples and 835 mouse samples:
* coverage > 0.1 X
* CDS percentage > 70%
* R2 between RNA and RIBO >= 0.188 (remove outliers)

All ribosome profiling data here is non-dedup winsorizing data paired with RNA-seq dedup data without winsorizing (even though it names as flatten, it just the same format of the naming)

####code
If you need to read rda data please use load("rdaname.rda") with R

If you need to calculate proportional similarity from clr data:
library(propr)
human_TE_homo_rho <- propr:::lr2rho(as.matrix(clr_data))
rownames(human_TE_homo_rho) <- colnames(human_TE_homo_rho) <- rownames(clr_data)
E
[Water Column Data - CTD] - Water column data from CTD casts along the East...
erddap.bco-dmo.org
Updated Apr 22, 2019
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
BCO-DMO (2019). [Water Column Data - CTD] - Water column data from CTD casts along the East Siberian Arctic Shelf on R/V Oden during 2011 (ESAS Water Column Methane project) (The East Siberian Arctic Shelf as a Source of Atmospheric Methane: First Approach to Quantitative Assessment) [Dataset]. https://erddap.bco-dmo.org/erddap/info/bcodmo_dataset_660543/index.html
Explore at:
Dataset updated
Apr 22, 2019
Dataset provided by
Biological and Chemical Oceanographic Data Management Office (BCO-DMO)
Authors
BCO-DMO
License
https://www.bco-dmo.org/dataset/660543/licensehttps://www.bco-dmo.org/dataset/660543/license
Area covered

Variables measured
Cl, Si, pH, CH4, DOC, DON, DOP, MOX, NH4, NO2, and 13 more
Description
Water column data from CTD casts along the East Siberian Arctic Shelf on R/V Oden during 2011 (ESAS Water Column Methane project) access_formats=.htmlTable,.csv,.json,.mat,.nc,.tsv,.esriCsv,.geoJson acquisition_description=Acquisition methods are described in the following publication: Orcut, B. et al. 2005

Core sectioning, porewater\u00a0collection\u00a0and analysis

At each sampling site, sediment sub-samples were collected for porewater analyses and at selected depths for microbial rate assays (AOM, anaerobic oxidation of methane oxidation; methanogenesis (MOG) from bicarbonate and acetate). Sediment was expelled from core liner using a hydraulic extruder under anoxic conditions. The depth intervals for extrusion varied. At each depth interval, a sub-sample was collected into a cut-off syringe for dissolved methane concentration quantification. Another 5 mL\u00a0sub- sample\u00a0was collected into pre-weighed and pre-combusted glass vial for determination of porosity (determined by the change in weight after drying at 80 degrees celsius to a constant weight). The remaining material was used for porewater extraction. Sample fixation and\u00a0analyses\u00a0for dissolved constituents followed the methods of Joye et al. (2010).\u00a0

Microbial Activity Measurements\u00a0

To determine AOM and MOG rates, 8 to 12 sub-samples (5 cm3) were collected from a core by manual insertion of a glass tube. For AOM, 100 uL of dissolved\u00a014CH4\u00a0tracer (about 2,000,000 DPM as gas) was injected into each core. Samples were incubated for 36 to 48 hours at\u00a0in situ\u00a0temperature.\u00a0 Following incubation, samples were transferred to 20 mL glass vials containing 2 mL of 2M NaOH (which served to arrest biological activity and fix\u00a014CO2\u00a0as\u00a014C-HCO3-).\u00a0 Each vial was sealed with a\u00a0teflon-lined screw cap, vortexed to mix the sample and base, and immediately frozen. Time zero samples were fixed immediately after radiotracer injection. The specific activity of the tracer substrate (14CH4) was determined by injecting 50 uL directly into scintillation cocktail (Scintiverse BD) followed by liquid scintillation counting. The accumulation of 14C product (14CO2) was determined by acid digestion following the method of Joye et al. (2010).\u00a0 The AOM rate was calculated using equation 1:

AOM Rate = [CH4] x alphaCH4 /t x (a-14CO2/a-14CH4)\u00a0\u00a0 \u00a0\u00a0\u00a0 \u00a0\u00a0\u00a0 \u00a0(Eq. 1)

Here, the AOM Rate is expressed as nmol CH4 oxidized per cm3 sediment per day (nmol\u00a0cm-3 d-1), [CH4] is the methane concentration (uM), alphaCH4 is the isotope fractionation factor for AOM (1.06; (ALPERIN and REEBURGH, 1988)), t is the incubation time (d), a-14CO2 is the activity of the product pool, and a-14CH4 is the activity of the substrate pool. If methane concentration was not available, the turnover time of the 14CH4 tracer is presented.

Rates of bicarbonate-based-methanogenesis and acetoclastic methanogenesis were determined by incubating samples in gas-tight, closed-tube vessels without headspace, to prevent the loss of gaseous 14CH4 product during sample manipulation. These sample tubes were sealed using custom-designed plungers (black Hungate stoppers with the lip removed containing a plastic \u201ctail\u201d that was run through the stopper) were inserted at the base of the tube; the sediment was then pushed via the plunger to the top of the tube until a small amount protruded through the tube opening. A butyl rubber septa\u00a0was\u00a0then eased into the tube opening to displace sediment in contact with the atmosphere and close the tube, which was then sealed with\u00a0a open-top\u00a0screw cap.\u00a0 The rubber materials used in these assays were boiled in 1N NaOH for 1 hour, followed by several rinses in boiling milliQ, to leach potentially toxic substances. \u00a0 \u00a0

A volume of radiotracer solution (100 uL of 14C-HCO3- tracer (~1 x 107\u00a0dpm\u00a0in slightly alkaline milliQ\u00a0water) or 1,2-14C-CH3COO- tracer (~5 x 107\u00a0dpm\u00a0in slightly alkaline milliQ\u00a0water)) was injected into each sample. Samples were incubated as described above and then 2 ml of 2N NaOH was injected through the top stopper into each sample to terminate biological activity (time zero samples were fixed prior to tracer injection).\u00a0 Samples were mixed to evenly distribute NaOH through the sample.\u00a0 Production of 14CH4 was quantified by stripping methane from the tubes with an air carrier, converting the 14CH4 to 14CO2 in a combustion furnace, and subsequent trapping of the 14CO2 in NaOH as carbonate (CRAGG et al., 1990; CRILL and MARTENS, 1986).\u00a0\u00a0Activity\u00a0of 14CO2 was measured subsequently by liquid scintillation counting.\u00a0

The rates of Bi-MOG and Ac-MOG rates were calculated using equations 2 and 3, respectively:

Bi-MOG Rate = [HCO3-] x alphaHCO3/t x\u00a0 (a-14CH4/a-H14CO3-) \u00a0 \u00a0 (Eq. 2)

Ac-MOG Rate = [CH3COO-] x alphaCH3COO-/t\u00a0 x\u00a0 (a-14CH4/a-14CH314COO-) \u00a0 \u00a0 (Eq. 3)

Both rates are expressed as nmol HCO3- or CH3COO-, respectively, reduced cm-3 d-1, alphaHCO3\u00a0and alphaCH3COO- are the isotope fractionation factors for MOG (assumed to be 1.06). [HCO3-] and [CH3COO-] are the\u00a0pore\u00a0water bicarbonate (mM) and acetate (uM) concentrations, respectively, t is incubation time (d), a-14CH4 is the activity of the product pool, and a-H14CO3 and a-14CH314COO are the activities of the substrate pools. If samples for substrate concentration determination were not available, the substrate turnover constant instead of the rate is presented.

For water column methane oxidation rate assays, triplicate 20 mL of live water (in addition to one 20 mL sample which was killed with ethanol (750 uL of pure EtOH) before tracer addition) were transferred from the CTD into serum vials. Samples were amended with 2 x 10^6 DPM of 3H-labeled-methane tracer and incubated for 24 to 72 hours (linearity of activity was tested and confirmed). After incubation, samples were fixed with ethanol, as above, and a sub-sample to determine total sample activity (3H-methane + 3H-water) was collected. Next, the sample was purged with nitrogen to remove the 3H-methane tracer and a sub-sample was amended with scintillation fluid and counted on a shipboard scintillation counter to determine the activity of tracer in the product of 3H-methane oxidation, 3H-water. The methane oxidation rate was calculated as:

MOX Rate = [methane concentration in nM] x alphaCH4/t\u00a0 x\u00a0 (a-3H- H2O/a-3H-CH4-) \u00a0 \u00a0 (Eq. 3) awards_0_award_nid=651766 awards_0_award_number=PLR-1023444 awards_0_data_url=http://www.nsf.gov/awardsearch/showAward?AWD_ID=1023444 awards_0_funder_name=NSF Division of Polar Programs awards_0_funding_acronym=NSF PLR awards_0_funding_source_nid=490497 awards_0_program_manager=Henrietta N Edmonds awards_0_program_manager_nid=51517 cdm_data_type=Other comment=Water Column Data S. Joye and V. Samarkin, PIs Version 4 October 2016 Conventions=COARDS, CF-1.6, ACDD-1.3 data_source=extract_data_as_tsv version 2.3 19 Dec 2019 defaultDataQuery=&time<now doi=10.1575/1912/bco-dmo.660543.1 Easternmost_Easting=178.9479 geospatial_lat_max=77.3829 geospatial_lat_min=65.0835 geospatial_lat_units=degrees_north geospatial_lon_max=178.9479 geospatial_lon_min=125.0406 geospatial_lon_units=degrees_east geospatial_vertical_max=651.0 geospatial_vertical_min=10.0 geospatial_vertical_positive=down geospatial_vertical_units=m infoUrl=https://www.bco-dmo.org/dataset/660543 institution=BCO-DMO instruments_0_acronym=CTD instruments_0_dataset_instrument_description=Used to collect water column samples instruments_0_dataset_instrument_nid=660553 instruments_0_description=The Conductivity, Temperature, Depth (CTD) unit is an integrated instrument package designed to measure the conductivity, temperature, and pressure (depth) of the water column. The instrument is lowered via cable through the water column and permits scientists observe the physical properties in real time via a conducting cable connecting the CTD to a deck unit and computer on the ship. The CTD is often configured with additional optional sensors including fluorometers, transmissometers and/or radiometers. It is often combined with a Rosette of water sampling bottles (e.g. Niskin, GO-FLO) for collecting discrete water samples during the cast. This instrument designation is used when specific make and model are not known. instruments_0_instrument_external_identifier=https://vocab.nerc.ac.uk/collection/L05/current/130/ instruments_0_instrument_name=CTD profiler instruments_0_instrument_nid=417 instruments_0_supplied_name=CTD keywords_vocabulary=GCMD Science Keywords metadata_source=https://www.bco-dmo.org/api/dataset/660543 Northernmost_Northing=77.3829 param_mapping={'660543': {'lat': 'master - latitude', 'depth_max': 'flag - depth', 'lon': 'master - longitude'}} parameter_source=https://www.bco-dmo.org/mapserver/dataset/660543/parameters people_0_affiliation=University of Georgia people_0_affiliation_acronym=UGA people_0_person_name=Samantha B. Joye people_0_person_nid=51421 people_0_role=Principal Investigator people_0_role_type=originator people_1_affiliation=University of Georgia people_1_affiliation_acronym=UGA people_1_person_name=Vladimir Samarkin people_1_person_nid=641543 people_1_role=Co-Principal Investigator people_1_role_type=originator people_2_affiliation=University of Georgia people_2_affiliation_acronym=UGA people_2_person_name=Samantha B. Joye people_2_person_nid=51421 people_2_role=Contact people_2_role_type=related people_3_affiliation=Woods Hole Oceanographic Institution people_3_affiliation_acronym=WHOI BCO-DMO people_3_person_name=Hannah Ake people_3_person_nid=650173 people_3_role=BCO-DMO Data Manager people_3_role_type=related project=ESAS Water Column Methane projects_0_acronym=ESAS Water Column Methane projects_0_description=We propose to study methane (CH4)
o
Data from: HarDWR - Harmonized Water Rights Records
osti.gov
Updated Oct 31, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Caccese, Robert; Fisher-Vanden, Karen; Fowler, Lara; Grogan, Danielle; Lammers, Richard; Lisk, Matthew; Olmstead, Sheila; Peklak, Darrah; Zheng, Jiameng; Zuidema, Shan (2024). HarDWR - Harmonized Water Rights Records [Dataset]. https://www.osti.gov/dataexplorer/biblio/dataset/2475306
Explore at:
Dataset updated
Oct 31, 2024
Dataset provided by
USDOE Office of Science (SC), Biological and Environmental Research (BER)
MultiSector Dynamics - Living, Intuitive, Value-adding, Environment
Authors
Caccese, Robert; Fisher-Vanden, Karen; Fowler, Lara; Grogan, Danielle; Lammers, Richard; Lisk, Matthew; Olmstead, Sheila; Peklak, Darrah; Zheng, Jiameng; Zuidema, Shan
Description
A dataset within the Harmonized Database of Western U.S. Water Rights (HarDWR). For a detailed description of the database, please see the meta-record v2.0. Changelog v2.0 - Recalculated based on data sourced from WestDAAT - Changed using a Site ID column to identify unique records to using aa combination of Site ID and Allocation ID - Removed the Water Management Area (WMA) column from the harmonized records. The replacement is a separate file which stores the relationship between allocations and WMAs. This allows for allocations to contribute to water right amounts to multiple WMAs during the subsequent cumulative process. - Added a column describing a water rights legal status - Added "Unspecified" was a water source category - Added an acre-foot (AF) column - Added a column for the classification of the right's owner v1.02 - Added a .RData file to the dataset as a convenience for anyone exploring our code. This is an internal file, and the one referenced in analysis scripts as the data objects are already in R data objects. v1.01 - Updated the names of each file with an ID number less than 3 digits to include leading 0s v1.0 - Initial public release Description Heremore » we present an updated database of Western U.S. water right records. This database provides consistent unique identifiers for each water right record, and a consistent categorization scheme that puts each water right record into one of seven broad use categories. These data were instrumental in conducting a study of the multi-sector dynamics of inter-sectoral water allocation changes though water markets (Grogan et al., in review). Specifically, the data were formatted for use as input to a process-based hydrologic model, Water Balance Model (WBM), with a water rights module (Grogan et al., in review). While this specific study motivated the development of the database presented here, water management in the U.S. West is a rich area of study (e.g., Anderson and Woosly, 2005; Tidwell, 2014; Null and Prudencio, 2016; Carney et al., 2021) so releasing this database publicly with documentation and usage notes will enable other researchers to do further work on water management in the U.S. West. We produced the water rights database presented here in four main steps: (1) data collection, (2) data quality control, (3) data harmonization, and (4) generation of cumulative water rights curves. Each of steps (1)-(3) had to be completed in order to produce (4), the final product that was used in the modeling exercise in Grogan et al. (in review). All data in each step is associated with a spatial unit called a Water Management Area (WMA), which is the unit of water right administration utilized by the state in which the right came from. Steps (2) and (3) required use to make assumptions and interpretation, and to remove records from the raw data collection. We describe each of these assumptions and interpretations below so that other researchers can choose to implement alternative assumptions an interpretation as fits their research aims. Motivation for Changing Data Sources The most significant change has been a switch from collecting the raw water rights directly from each state to using the water rights records presented in WestDAAT, a product of the Water Data Exchange (WaDE) Program under the Western States Water Council (WSWC). One of the main reasons for this is that each state of interest is a member of the WSWC, meaning that WaDE is partially funded by these states, as well as many universities. As WestDAAT is also a database with consistent categorization, it has allowed us to spend less time on data collection and quality control and more time on answering research questions. This has included records from water right sources we had previously not known about when creating v1.0 of this database. The only major downside to utilizing the WestDAAT records as our raw data is that further updates are tied to when WestDAAT is updated, as some states update their public water right records daily. However, as our focus is on cumulative water amounts at the regional scale, it is unlikely most records updates would have a significant effect on our results. The structure of WestDAAT led to several important changes to how HarWR is formatted. The most significant change is that WaDE has calculated a field known as SiteUUID, which is a unique identifier for the Point of Diversion (POD), or where the water is drawn from. This separate from AllocationNativeID, which is the identifier for the allocation of water, or the amount of water associated with the water right. It should be noted that it is possible for a single site to have multiple allocations associated with it and for an allocation to be able to be extracted from multiple sites. The site-allocation structure has allowed us to adapt a more consistent, and hopefully more realistic, approach in organizing the water right records than we had with HarDWR v1.0. This was incredibly helpful as the raw data from many states had multiple water uses within a single field within a single row of their raw data, and it was not always clear if the first water use was the most important, or simply first alphabetically. WestDAAT has already addressed this data quality issue. Furthermore, with v1.0, when there were multiple records with the same water right ID, we selected the largest volume or flow amount and disregarded the rest. As WestDAAT was already a common structure for disparate data formats, we were better able to identify sites with multiple allocations and, perhaps more importantly, allocations with multiple sites. This is particularly helpful when an allocation has sites which cross WMA boundaries, instead of just assigning the full water amount to a single WMA we are now able to divide the amount of water between the number of relevant WMAs. As it is now possible to identify allocations with water used in multiple WMAs, it is no longer practical to store this information within a single column. Instead the stAllocationToWMATab.csv file was created, which is an allocation by WMA matrix containing the percent Place of Use area overlap with each WMA. We then use this percentage to divide the allocation's flow amount between the given WMAs during the cumulation process to hopefully provide more realistic totals of water use in each area. However, not every state provides areas of water use, so like HarDWR v1.0, a hierarchical decision tree was used to assign each allocation to a WMA. First, if a WMA could be identified based on the allocation ID, then that WMA was used; typically, when available, this applied to the entire state and no further steps were needed. Second was the spatial analysis of Place of Use to WMAs. Third was a spatial analysis of the POD locations to WMAs, with the assumption that allocation's POD is within the WMA it should belong to; if an allocation still had multiple WMAs based on its POD locations, then the allocation's flow amount would be divided equally between all WMAs. The fourth, and final, process was to include water allocations which spatially fell outside of the state WMA boundaries. This could be due to several reasons, such as coordinate errors / imprecision in the POD location, imprecision in the WMA boundaries, or rights attached with features, such as a reservoir, which crosses state boundaries. To include these records, we decided for any POD which was within one kilometer of the state's edge would be assigned to the nearest WMA. Other Changes WestDAAT has Allowed In addition to a more nuanced and consistent method of assigning water right's data to WMAs, there are other benefits gained from using the WestDAAT dataset. Among those is a consistent categorization of a water right's legal status. In HarDWR v1.0, legal status was effectively ignored, which led to many valid concerns about the quality of the database related to the amounts of water the rights allowed to be claimed. The main issue was that rights with legal status' such as "application withdrawn", "non-active", or "cancelled" were included within HarDWR v1.0. These, and other water rights status' which were deemed to not be in use have been removed from this version of the database. Another major change has been the addition of the "unspecified water source category. This is water that can come from either surface water or groundwater, or the source of which is unknown. The addition of this source category brings the total number of categories to three. Due to reviewer feedback, we decided to add the acre-foot (AF) column so that the data may be more applicable to a wider audience. We added the ownerClassification column so that the data may be more applicable to a wider audience. File Descriptions The dataset is a series of various files organized by state sub-directories. In addition, each file begins with the state's name, in case the file is separate from its sub-directory for some reason. After the state name is the text which describes the contents of the file. Here is each file described in detail. Note that st is a placeholder for the state's name. stFullRecords_HarmonizedRights.csv: A file of the complete water records for each state. The column headers for each of this type of file are: state - The name of the state to which the allocations belong to. FIPS - The two digit numeric state ID code. siteID - The site location ID for POD locations. A site may have multiple allocations, which are the actual amount of water which can be drawn. In a simplified hypothetical, a farm stead may have an allocation for "irrigation" and an allocation for "domestic" water use, but the water is drawn from the same pumping equipment. It should be noted that many of the site ID appear to have been added by WaDE, and therefore may not be recognized by a given state's water rights database. allocationID - The allocation ID for the water right. For most states this is the water right ID, and what is
HyG: A hydraulic geometry dataset derived from historical stream gage...
zenodo.org
data.niaid.nih.gov
csv
Updated Feb 26, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Thomas L. Enzminger; J. Toby Minear; Ben Livneh; Thomas L. Enzminger; J. Toby Minear; Ben Livneh (2024). HyG: A hydraulic geometry dataset derived from historical stream gage measurements across the conterminous United States [Dataset]. http://doi.org/10.5281/zenodo.10425392
Explore at:
csvAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.10425392
Dataset updated
Feb 26, 2024
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Thomas L. Enzminger; J. Toby Minear; Ben Livneh; Thomas L. Enzminger; J. Toby Minear; Ben Livneh
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States, Contiguous United States
Description
Regional- and continental-scale models predicting variations in the magnitude and timing of streamflow are important tools for forecasting water availability as well as flood inundation extent and associated damages. Such models must define the geometry of stream channels through which flow is routed. These channel parameters, such as width, depth, and hydraulic resistance, exhibit substantial variability in natural systems. While hydraulic geometry relationships have been extensively studied in the United States, they remain unquantified for thousands of stream reaches across the country. Consequently, large-scale hydraulic models frequently take simplistic approaches to channel geometry parameterization. Over-simplification of channel geometries directly impacts the accuracy of streamflow estimates, with knock-on effects for water resource and hazard prediction.

Here, we present a hydraulic geometry dataset derived from long-term measurements at U.S. Geological Survey (USGS) stream gages across the conterminous United States (CONUS). This dataset includes (a) at-a-station hydraulic geometry parameters following the methods of Leopold and Maddock (1953), (b) at-a-station Manning's n calculated from the Manning equation, (c) daily discharge percentiles, and (d) downstream hydraulic geometry regionalization parameters based on HUC4 (Hydrologic Unit Code 4). This dataset is referenced in Heldmyer et al. (2022); further details and implications for CONUS-scale hydrologic modeling are available in that article (https://doi.org/10.5194/hess-26-6121-2022).

At-a-station Hydraulic Geometry

We calculated hydraulic geometry parameters using historical USGS field measurements at individual station locations. Leopold and Maddock (1953) derived the following power law relationships:

$w={aQ^b}$

$d=cQ^f$

$v=kQ^m$

where Q is discharge, w is width, d is depth, v is velocity, and a, b, c, f, k, and m are at-a-station hydraulic geometry (AHG) parameters. We downloaded the complete record of USGS field measurements from the USGS NWIS portal (https://waterdata.usgs.gov/nwis/measurements). This raw dataset includes 4,051,682 individual measurements from a total of 66,841 stream gages within CONUS. Quantities of interest in AHG derivations are Q, w, d, and v. USGS field measurements do not include d--we therefore calculated d using d=A/w, where A is measured channel area. We applied the following quality control (QC) procedures in order to ensure the robustness of AHG parameters derived from the field data:

We considered only measurements which reported Q, v, w and A.

For each gage, we excluded measurements older than the most recent five years, so as to minimize the effects of long-term channel evolution on observed hydraulic geometry relationships.

We excluded gages for which measured Q disagreed with the product of measured velocity and measured area by more than 5%. Gages for which $ Q eq vA$ are often tidally influenced and therefore may not conform to expected channel geometry relationships.

Q, v, w, and d from field measurements at each gage were log-transformed. We performed robust linear regressions on the relationships between log(Q) and log(w), log(v), and log(d). AHG parameters were derived from the regressed explanatory variables.

We applied an iterative outlier detection procedure to the linear regression residuals. Values of log-transformed w, v, and d residuals falling outside a three median absolute deviation (MAD) envelope were excluded. Regression coefficients were recalculated and the outlier detection procedure was reapplied until no new outliers were detected.

Gages for which one or more regression had p-values >0.05 were excluded, as the relationships between log-transformed Q and w, v, or d lacked statistical significance.

Gages were omitted if regressed AHG parameters did not fulfill two additional relationships derived by Leopold and Maddock: $b+f+m=1{\displaystyle \pm }0.1$ and $a{\displaystyle \times }c{\displaystyle \times }k=1{\displaystyle \pm }0.1$.

If the number of field measurements for a given gage was less than 10, either initially or after individual measurements were removed via steps 1-4, the gage was excluded from further analysis.

Application of the QC procedures described above removed 55,328 stream gages, many of which were short-term campaign gages at which very few field measurements had been recorded. We derived AHG parameters for the remaining 11,513 gages which passed our QC.

At-a-station Manning's n

We calculated hydraulic resistance at each gage location by solving Manning's equation for Manning's n, given by

$n = {{R^{2/3}S^{1/2}} \over v}$

where v is velocity, R is hydraulic radius and S is longitudinal slope. We used smoothed reach-scale longitudinal slopes from the NHDPlusv2 (National Hydrography Dataset Plus, version 2) ElevSlope data product. We note that NHDPlusv2 contains a minimum slope constraint of 10^-5 m/m--no reach may have a slope less than this value. Furthermore, NHDPlusv2 lacks slope values for certain reaches. As such, we could not calculate Manning's n for every gage, and some Manning's n values we report may be inaccurate due to the NHDPlusv2 minimum slope constraint. We report two Manning's n values, both of which take stream depth as an approximation for R. The first takes the median stream depth and velocity measurements from the USGS's database of manual flow measurements for each gage. The second uses stream depth and velocity calculated for a 50th percentile discharge (Q₅₀; see below). Approximating R as stream depth is an assumption which is generally considered valid if the width-to-depth ratio of the stream is greater than 10—which was the case for the vast majority of field measurements. Thus, we report two Manning's n values for each gage, which are each intended to approximately represent median flow conditions.

Daily discharge percentiles

We downloaded full daily discharge records from 16,947 USGS stream gages through the NWIS online portal. The data includes records from both operational and retired gages. Records for operational gages were truncated at the end of the 2018 water year (September 30, 2018) in order to avoid use of preliminary data. To ensure the robustness of daily discharge percentiles, we applied the following QC:

For a given gage, we removed blocks of missing discharge values longer than 6 months. These long blocks of missing data generally correspond to intervals in which a gage was temporarily decommissioned for maintenance.

A gage was omitted from further analysis if its discharge record was less than 10 years (3,652 days) long, and/or less than 90% complete (>10% missing values after removal of long blocks in step 1.

We calculated discharge percentiles for each of the 10,871 gages which passed QC. Discharge percentiles were calculated at increments of 1% between Q₁ and Q₅, increments of 5% (e.g. Q₁₀, Q₁₅, Q₂₀, etc.) between Q₅ and Q₉₅, increments of 1% between Q₉₅ and Q₉₉, and increments of 0.1% between Q₉₉ and Q₁₀₀ in order to provide higher resolution at the lowest and highest flows, which occur much less frequently.

HG Regionalization

We regionalized AHG parameters from gage locations to all stream reaches in the conterminous United States. This downstream hydraulic geometry regionalization was performed using all gages with AHG parameters in each HUC4, as opposed to traditional downstream hydraulic geometry--which involves interpolation of parameters of interest to ungaged reaches on individual streams. We performed linear regressions on log-transformed drainage area and Q at a number of flow percentiles as follows:

$log(Q_i) = \beta_1log(DA) + \beta_0$

where Q_i is streamflow at percentile i, DA is drainage area and $\beta_1$ and $\beta_0$ are regression parameters. We report $\beta_1$, $\beta_0$ , and the r² value of the regression relationship for Q percentiles Q₁₀, Q₂₅, Q₅₀, Q₇₅, Q₉₀, Q₉₅, Q₉₉, and Q_99.9. Further discussion and additional analysis of HG regionalization are presented in Heldmyer et al. (2022).

Dataset description

We present the HyG dataset in a comma-separated value (csv) format. Each row corresponds to a different USGS stream gage. Information in the dataset includes gage ID (column 1), gage location in latitude and longitude (columns 2-3), gage drainage area (from USGS; column 4), longitudinal slope of the gage's stream reach (from NHDPlusv2; column 5), AHG parameters derived from field measurements (columns 6-11), Manning's n calculated from median measured flow conditions (column 12), Manning's n calculated from Q50 (column 13), Q percentiles (columns 14-51), HG regionalization parameters and r² values (columns 52-75), and geospatial information for the HUC4 in which the gage is located (from USGS; columns 76-87). Users are advised to exercise caution when opening the dataset. Certain software, including Microsoft Excel and Python, may drop the leading zeros in USGS gage IDs and HUC4 IDs if these columns are not explicitly imported as strings.

Errata

In version 1, drainage area was mistakenly reported in cubic meters but labeled in cubic kilometers. This error has been corrected in version 2.
Data from: Bike Sharing Dataset
kaggle.com
Updated Sep 10, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ram Vishnu R (2024). Bike Sharing Dataset [Dataset]. https://www.kaggle.com/datasets/ramvishnur/bike-sharing-dataset
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Sep 10, 2024
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Ram Vishnu R
Description
Problem Statement:

A bike-sharing system is a service in which bikes are made available for shared use to individuals on a short term basis for a price or free. Many bike share systems allow people to borrow a bike from a "dock" which is usually computer-controlled wherein the user enters the payment information, and the system unlocks it. This bike can then be returned to another dock belonging to the same system.

A US bike-sharing provider BoomBikes has recently suffered considerable dip in their revenue due to the Corona pandemic. The company is finding it very difficult to sustain in the current market scenario. So, it has decided to come up with a mindful business plan to be able to accelerate its revenue.

In such an attempt, BoomBikes aspires to understand the demand for shared bikes among the people. They have planned this to prepare themselves to cater to the people's needs once the situation gets better all around and stand out from other service providers and make huge profits.

They have contracted a consulting company to understand the factors on which the demand for these shared bikes depends. Specifically, they want to understand the factors affecting the demand for these shared bikes in the American market. The company wants to know:

Which variables are significant in predicting the demand for shared bikes.

How well those variables describe the bike demands

Based on various meteorological surveys and people's styles, the service provider firm has gathered a large dataset on daily bike demands across the American market based on some factors.

Business Goal:

You are required to model the demand for shared bikes with the available independent variables. It will be used by the management to understand how exactly the demands vary with different features. They can accordingly manipulate the business strategy to meet the demand levels and meet the customer's expectations. Further, the model will be a good way for management to understand the demand dynamics of a new market.

Data Preparation:

You can observe in the dataset that some of the variables like 'weathersit' and 'season' have values as 1, 2, 3, 4 which have specific labels associated with them (as can be seen in the data dictionary). These numeric values associated with the labels may indicate that there is some order to them - which is actually not the case (Check the data dictionary and think why). So, it is advisable to convert such feature values into categorical string values before proceeding with model building. Please refer the data dictionary to get a better understanding of all the independent variables.

You might notice the column 'yr' with two values 0 and 1 indicating the years 2018 and 2019 respectively. At the first instinct, you might think it is a good idea to drop this column as it only has two values so it might not be a value-add to the model. But in reality, since these bike-sharing systems are slowly gaining popularity, the demand for these bikes is increasing every year proving that the column 'yr' might be a good variable for prediction. So think twice before dropping it.

Model Building:

In the dataset provided, you will notice that there are three columns named 'casual', 'registered', and 'cnt'. The variable 'casual' indicates the number casual users who have made a rental. The variable 'registered' on the other hand shows the total number of registered users who have made a booking on a given day. Finally, the 'cnt' variable indicates the total number of bike rentals, including both casual and registered. The model should be built taking this 'cnt' as the target variable.

Model Evaluation:

When you're done with model building and residual analysis and have made predictions on the test set, just make sure you use the following two lines of code to calculate the R-squared score on the test set. python from sklearn.metrics import r2_score r2_score(y_test, y_pred) - where y_test is the test data set for the target variable, and y_pred is the variable containing the predicted values of the target variable on the test set. - Please perform this step as the R-squared score on the test set holds as a benchmark for your model.
Maricopa County Assessor "Fast Food" Search Query
kaggle.com
Updated Sep 21, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
FoxbatCS (2021). Maricopa County Assessor "Fast Food" Search Query [Dataset]. https://www.kaggle.com/foxbatcs/maricopa-county-assessor-fast-food-search-query/code
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Sep 21, 2021
Dataset provided by
Kagglehttp://kaggle.com/
Authors
FoxbatCS
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Area covered
Maricopa County
Description
SOURCE

This data was obtained from the Maricopa County Assessor under the search "Fast Food". The query has approximately 1342 results, with only 1000 returned due MCA Data Policies.

DATA CLEANING

Due to some Subdivision Name values posessing unescaped commas that interfered with Pandas' ability to properly align the columns, some manual cleaning in Libre Office was performed by me.

Aside from a handful of Null values, the data is fairly clean and requires little from Pandas.

NULL VALUES

Here are the sums and percentage of NULLS in the dataframe. Interestingly, there are 17 NULLS that do not have any physical addresses. This amounts to 1.7% of values for the Address, City, and Zip, and are all corresponding rows for those missing values.

I have looked into a couple of these on the Maricopa County Assessor's GIS Portal, and they do not appear to have any assigned physical addresses. This is a good avenue of exploration for EDA. Possibly an error that could be corrected, or some obscure legal reason, but interesting nonetheless.

Additionally, there are 391 NULLS in Subdivision Name accounting for 39.1%. This is a feature that I am interested in exploring to determine if there are any predominant groups. It could also generate a list of Entities that can be searched later to see if the dataset can be enriched beyond it's initial 1,000 record limit.

There are 348 NULLS in the MCR column. This is the definition according to the MCA Glossary

MCR (MARICOPA COUNTY RECORDER NUMBER) Often associated with recorded plat maps.

This seems to be an uninteresting nominal value, so I will drop this columns.

While Property Type and Rental have no NULLS, 100% of those values are Fast Food Restaurant and N (for No), and therefore offer no useful information, and will be dropped.

I will leave the S/T/R column, although it also seems to be uninteresting nominal values, I am curious if there are predominent groups, and since it also has no NULLS, might be useful for further data enrichment.
f
NTU Dataset
figshare.com
7z
Updated May 31, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Satyajit Neogi (2023). NTU Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.7890764.v2
Explore at:
7zAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.7890764.v2
Dataset updated
May 31, 2023
Dataset provided by
figshare
Authors
Satyajit Neogi
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
**************** NTU Dataset ReadMe file *******************Please consider the latest version.Attached files contain our data collected inside Nanyang Technological University Campus for pedestrian intention prediction. The dataset is particularly designed to capture spontaneous vehicle influences on pedestrian crossing/not-crossing intention. We utilize this dataset in our paper "Context Model for Pedestrian Intention Prediction using Factored Latent-Dynamic Conditional Random Fields" submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence.The dataset consists of 35 crossing and 35 stopping* (not-crossing) scenarios. The image sequences are in 'Image_sequences' folder. 'stopping_instants.csv' and 'crossing_instants.csv' files provide the stopping and crossing instants respectively, utilized for labeling the data and providing ground-truth for evaluation. Camera1 and Camera2 images are synchronized. Two cameras were used to capture the whole scene of interest.We provide pedestrian and vehicle bounding boxes obtained from [1]. The occlusions and mis-detections are linearly interpolated. All necessary detections are stored in 'Object_detector_pedestrians_vehicles' folder. Each column within the csv files ('car_bndbox_..') corresponds to a unique tracked car within each image sequence. Each of the pedestrian csv files ('ped_bndbox_..') contains only one column, as we consider each pedestrian in the scene separately. Additional details:* [xmin xmax ymin ymax] = left right top down* Dataset frequency: 15 fps.* Camera parameters (in pixels): f = 1135, principal point = (960, 540).Additionally, we provide semantic segmentation output [2] and our depth parameters. As the data were collected in two phases, there are two files in each folder, highlighting the sequences in each phase.Crossing sequences 1-28 and stopping sequences 1-24 were collected in Phase 1, while crossing sequences 29-35 and stopping sequences 25-35 were collected in Phase 2.We obtained the optical flow from [3]. Our model (FLDCRF and LSTM) codes are available in 'Models' folder.If you use our dataset in your research, please cite our paper:"S. Neogi, M. Hoy, W. Chaoqun, J. Dauwels, 'Context Based Pedestrian Intention Prediction Using Factored Latent Dynamic Conditional Random Fields', IEEE SSCI-2017."Please email us if you have any questions:1. Satyajit Neogi, PhD Student, Nanyang Technological University @ satyajit001@e.ntu.edu.sg 2. Justin Dauwels, Associate Professor, Nanyang Technological University @ jdauwels@ntu.edu.sgOur other group members include:3. Dr. Michael Hoy, @ mch.hoy@gmail.com4. Dr. Kang Dang, @ kangdang@gmail.com5. Ms. Lakshmi Prasanna Kachireddy, 6. Mr. Mok Bo Chuan Lance,7. Dr. Hang Yu, @ fhlyhv@gmail.comReferences:1. S. Ren, K. He, R. Girshick, J. Sun, Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", NIPS 2015.2. A. Kendall, V. Badrinarayanan, R. Cipolla,Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding", BMVC 2017.3. C. Liu. ``Beyond Pixels: Exploring New Representations and Applications for Motion Analysis". Doctoral Thesis. Massachusetts Institute of Technology. May 2009.* Please note, we had to remove sequence Stopping-33 for privacy reasons.
Not seeing a result you expected?
Learn how you can add new datasets to our index.

Facebook

Twitter

Click to copy link

Link copied

Cite

Zhuoran Geng; Hijano, Alberto; Ilic, Stefan; Ilyn, Maxim; Maasilta, Ilari J.; Monfardini, Alessandro; Spies, Maria; Strambini, Elia; Virtanen, Pauli; Calvo, Martino; Gonzales-Orellana, Carmen; Helenius, Ari P.; Khorshidian, Sara; Clodoaldo I. Levartoski De Araujo; Levy-Bertrand, Florence; Rogero, Celia; Giazotto, Francesco; F. Sebastian Bergeret; Heikkilä, Tero T.; Zhuoran Geng; Hijano, Alberto; Ilic, Stefan; Ilyn, Maxim; Maasilta, Ilari J.; Monfardini, Alessandro; Spies, Maria; Strambini, Elia; Virtanen, Pauli; Calvo, Martino; Gonzales-Orellana, Carmen; Helenius, Ari P.; Khorshidian, Sara; Clodoaldo I. Levartoski De Araujo; Levy-Bertrand, Florence; Rogero, Celia; Giazotto, Francesco; F. Sebastian Bergeret; Heikkilä, Tero T. (2023). Superconductor-ferromagnet hybrids for non-reciprocal electronics and detectors [Dataset]. https://ekoizpen-zientifikoa.ehu.eus/documentos/668fc45cb9e7c03b01bdb087?lang=de

Data from: Superconductor-ferromagnet hybrids for non-reciprocal electronics and detectors

Explore at:

Dataset updated

2023

Authors

Description

Data for the manuscript "Superconductor-ferromagnet hybrids for non-reciprocal electronics and detectors", submitted to Superconductor Science and Technology, arXiv:2302.12732. This archive contains the data for all plots of numerical data in the manuscript. ## Fig. 4
Data of Fig. 4 in the WDX (Wolfram Data Exchange) format (unzip to extract the files). Contains critical exchange fields and critical thicknesses as functions of the temperature. Can be opened with Wolfram Mathematica with the command: Import[FileNameJoin[{NotebookDirectory[],"filename.wdx"}]] ## Fig. 5
Data of Fig. 5 in the WDX (Wolfram Data Exchange) format (unzip to extract the files). Contains theoretically calculated I(V) curves and the rectification coefficient R of N/FI/S junctions. Can be opened with Wolfram Mathematica with the command Import[FileNameJoin[{NotebookDirectory[],"filename.wdx"}]]. ## Fig. 7a
Data of Fig. 7a in the ascii format. Contains G in uS as a function of B in mT and V in mV. ## Fig. 7c
Data of Fig. 7c in the ascii format. Contains G in uS as a function of B in mT and V in mV. ## Fig. 7e
Data of Fig. 7e in the ascii format. Contains G in uS as a function of B in mT and V in mV. The plots 7b, d, and f are taken from the plots a, c and e as indicated in the caption of the figure. ## Fig. 8
Data of Fig. 8 in the ascii format. Contains G in uS as a function V in mV for several values of B in mT. ## Fig. 8 inset
Data of Fig. 8 inset in the ascii format. Contains G_0/G_N as a function of B in mT. ## Fig9a_b First raw Magnetic field values in T, first column voltage drop in V,
rest of the columns differential conductance in S ## Fig9b_FIT First raw Magnetic field values in T, first column voltage drop in V,
rest of the columns differential conductance in S ## Fig9c First raw Magnetic field values in T, first column voltage drop in V,
rest of the columns R (real number) ## Fig9c inset First raw Magnetic field values in T, odd columns voltage drop in V,
even columns injected current in A ## Fog9d Foist column magnetic field in T, second column conductance ration (real
number), sample name in the file name. ## Fig. 12
Data of Fig. 12 in the ascii format. Contains energy resolution as functions of temperature and tunnel resistance with current and voltage readout. ## Fig. 13
Data of Fig. 13 in the ascii format. Contains energy resolution as functions of (a) exchange field, (b) polarization, (c) dynes, and (d) absorber volume with different amplifier noises. ## Fig. 14
Data of Fig. 14 in the ascii format. Contains detector pulse current as functions of (a) temperature change (b) time with different detector parameters.
## Fig. 17
Data of Fig. 17 in the ascii format. Contains dIdV curves as function of the voltage for different THz illumination frequency and polarization. ## Fig. 18
Data of Fig. 18 in the ascii format. Contains the current flowing throughout the junction as function time (arbitrary units) for ON and OFF illumination at 150 GHz for InPol and CrossPol polarization. ## Fig. 21
Data of Fig. 21c in the ascii format. Contains the magnitude of readout line S43 as frequency.
Data of Fig. 21d in the ascii format. Contains the magnitude of iKID line S21 as frequency.

Clear search

Close search

Google apps

Main menu

Data from: Superconductor-ferromagnet hybrids for non-reciprocal electronics...

Water Temperature of Lakes in the Conterminous U.S. Using the Landsat 8...

Young and older adult vowel categorization responses

Young and older adult vowel categorization responses

Uniform Crime Reporting (UCR) Program Data: Arrests by Age, Sex, and Race,...

CTD Data from the Rectangular Midwater Trawl collected during the BROKE-West...

Population and GDP/GNI/CO2 emissions (2019, raw data)

Intermediate data for TE calculation

[Water Column Data - CTD] - Water column data from CTD casts along the East...

Data from: HarDWR - Harmonized Water Rights Records

HyG: A hydraulic geometry dataset derived from historical stream gage...

Data from: Bike Sharing Dataset

Problem Statement:

Business Goal:

Data Preparation:

Model Building:

Model Evaluation:

Maricopa County Assessor "Fast Food" Search Query

SOURCE

DATA CLEANING

NULL VALUES

NTU Dataset

Data from: Superconductor-ferromagnet hybrids for non-reciprocal electronics and detectors