This data release includes water-quality data collected at up to thirteen locations along the Merrimack River and Merrimack River Estuary in Massachusetts. In this study, conducted by the U.S. Geological Survey (USGS) in cooperation with the Massachusetts Department of Environmental Protection, discrete samples were collected, and continuous monitoring was completed from June to September 2020. The data include results of measured field properties (water temperature, specific conductivity, pH, dissolved oxygen) and laboratory concentrations of nitrogen and phosphorus species, total carbon, pheophytin-a, and chlorophyll-a. These data were collected to assess selected (mainly nutrients) water-quality conditions in the Merrimack River and Merrimack River Estuary at the thirteen locations and identify areas where more water-quality monitoring is needed. The discrete samples and continuous-monitoring data are also available in the USGS National Water Information System at https://waterdata.usgs.gov/nwis. This data release consists of (1) Table of the discrete water-quality data collected (Merrimack_DiscreteWQ_Data.csv); (2) Statistical summaries including the minimum, median, and maximum of the discrete water-quality data collected (Merrimack_DiscreteWQ_Statistical_Data.original.csv); (3) Statistical summaries including the minimum, median, and maximum of the continuous water-quality data collected (Merrimack_ContinuousWQ_Statistical_Data.csv); (4) Table of vertical profile data (Merrimack_VerticalWQ_Profiles_Data.csv); (5) Table of continuous monitor deployment location and dates (Merrimack_ContinuousWQ_Deployment_Dates.csv); (6) Time-series plots of continuous water-quality data (Continuous_QW_Plots_All.zip); (7) Vertical profile plots (Vertical Profiles_QW_Plots.zip).
This dataset includes discrete sample and profile data collected from DISCOVERY in the Indian Ocean, South Atlantic Ocean and Southern Oceans (> 60 degrees South) from 1993-02-06 to 1993-03-18. These data include CHLOROFLUOROCARBON-11 (CFC-11), CHLOROFLUOROCARBON-113 (CFC-113), CHLOROFLUOROCARBON-12 (CFC-12), Carbon tetrachloride (CCL4), DISSOLVED OXYGEN, Delta Oxygen-18, HYDROSTATIC PRESSURE, NITRATE, SALINITY, WATER TEMPERATURE, phosphate and silicate. The instruments used to collect these data include CTD and bottle. These data were collected by Robert R. Dickson of Fisheries Laboratory - Lowestoft as part of the WOCE_S04_74DI19930206 dataset. CDIAC associated the following cruise ID(s) with this dataset: WOCE_S04_1993. The World Ocean Circulation Experiment (WOCE) was a major component of the World Climate Research Program with the overall goal of better understanding the ocean's role in climate and climatic changes resulting from both natural and anthropogenic causes. The CO2 survey took advantage of the sampling opportunities provided by the WOCE Hydrographic Program (WHP) cruises during this period between 1990 and 1998. The final collection covers approximately 23,000 stations from 94 WOCE cruises.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Mutual information (MI) is a powerful method for detecting relationships between data sets. There are accurate methods for estimating MI that avoid problems with “binning” when both data sets are discrete or when both data sets are continuous. We present an accurate, non-binning MI estimator for the case of one discrete data set and one continuous data set. This case applies when measuring, for example, the relationship between base sequence and gene expression level, or the effect of a cancer drug on patient survival time. We also show how our method can be adapted to calculate the Jensen–Shannon divergence of two or more data sets.
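To make the idea of a non-binning estimator concrete, here is a minimal sketch of one nearest-neighbor formulation for a discrete label paired with a continuous value. The abstract does not give the formula, so the specific estimator below (digamma terms over neighbor counts), the function names, and the default `k=3` are assumptions for illustration, not the authors' published code. The result is in nats.

```python
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def mi_discrete_continuous(labels, values, k=3):
    """Nearest-neighbor MI estimate between discrete labels and 1-D values.

    For each point: d is the distance to its k-th nearest neighbor among
    points with the same label, and m is the number of other points of any
    label within distance d. No histogram bins are involved.
    """
    n = len(values)
    if n != len(labels):
        raise ValueError("labels and values must have the same length")
    total = 0.0
    for i in range(n):
        same_label = sorted(
            abs(values[i] - values[j])
            for j in range(n)
            if j != i and labels[j] == labels[i]
        )
        if len(same_label) < k:
            raise ValueError("every label needs more than k points")
        d = same_label[k - 1]
        m = sum(1 for j in range(n) if j != i and abs(values[i] - values[j]) <= d)
        n_x = len(same_label) + 1  # points sharing this label, including i
        total += digamma_int(k) - digamma_int(m) - digamma_int(n_x)
    return digamma_int(n) + total / n


if __name__ == "__main__":
    rng = random.Random(1)
    demo_labels = [0] * 50 + [1] * 50
    # Hypothetical data: the label shifts the mean of the continuous value.
    demo_values = [rng.gauss(3.0 * lab, 1.0) for lab in demo_labels]
    print(mi_discrete_continuous(demo_labels, demo_values))
```

When the two label groups do not overlap at all, every `m` equals `k` and the estimate reduces to psi(N) minus the average psi(N_x), which approaches ln 2 for two equally sized groups. The double loop is quadratic in the number of points; a tree-based neighbor search would be the usual choice for large data sets.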
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is an example of a distribution of 20 correlated Bernoulli random variables.
Discrete sample data from manual field collection and laboratory analyses taken since 2020. It contains water quality, sediment, biological, air, and soil samples from monitoring locations across the Lower Brazos Subregion of Texas, Hydrologic Unit Code (HUC) 1207.
Discrete sample data from manual field collection and laboratory analyses taken since 2010. It contains water quality, sediment, biological, air, and soil samples from monitoring locations across the Lower Canadian Subregion of Texas, Hydrologic Unit Code (HUC) 1109.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Usage note: Time zone information was not received by BCO-DMO so it is unknown if the dates and times in this dataset are local or UTC. Please contact the dataset PI if you have questions about this.
This dataset includes chemical, discrete sample, physical and profile data collected from SONNE in the North Atlantic Ocean and South Atlantic Ocean from 2000-11-28 to 2000-12-27 and retrieved during cruise 06BE152. These data include CHLOROFLUOROCARBON-11 (CFC-11), CHLOROFLUOROCARBON-12 (CFC-12), DELTA HELIUM-3, DISSOLVED OXYGEN, HELIUM, HYDROSTATIC PRESSURE, NEON, NITRATE, NITRITE, PHOSPHATE, Potential temperature (theta), SALINITY, SILICATE, Tritium (Hydrogen isotope) and WATER TEMPERATURE. The instruments used to collect these data include CTD and bottle. These data were collected by Monika Rhein of Universität Bremen as part of the CARINA/06BE20001128 (06BE152, WOCE AR04) dataset. The CARINA (CARbon dioxide IN the Atlantic Ocean) data synthesis project is an international collaborative effort of the EU IP CARBOOCEAN and U.S. partners. It has produced a merged, internally consistent dataset of open ocean subsurface measurements for biogeochemical investigations, in particular, studies involving the carbon system. The original focus area was the North Atlantic Ocean, but over time the geographic extent expanded and CARINA now includes data from the entire Atlantic, the Arctic Ocean, and the Southern Ocean.
The goal of this study was to develop a suite of inter-related water quality monitoring approaches capable of modeling and estimating the spatial and temporal gradients of particulate and dissolved total mercury (THg) concentration, and of particulate and dissolved methyl mercury (MeHg) concentration, in surface waters across the Sacramento / San Joaquin River Delta (SSJRD). This suite of monitoring approaches included: a) data collection at fixed continuous monitoring stations (CMS) outfitted with in-situ sensors, b) spatial mapping using boat-mounted flow-through sensors, and c) satellite-based remote sensing. The focus of this specific Child Page is to present all field and laboratory-based data associated with discrete surface water samples collected as part of the CMS and boat mapping components of the study. The data provided in the table herein constitute a collection of field-based and laboratory-based measurements that coincide with the timestamps of samples collected at 33 sites across the Delta. Laboratory-based measurements presented herein were conducted by the U.S. Geological Survey (USGS) Organic Matter Research Laboratory (OMRL) in Sacramento, CA, the USGS Earth System Processes Division (ESPD) microbial biogeochemistry laboratory in Menlo Park, CA, the USGS Reston Stable Isotope Laboratory (RSIL) in Reston, VA, and the USGS National Water Quality Laboratory (NWQL) in Denver, CO. The machine-readable (comma separated value, *.csv) file presented herein includes laboratory-based measurements for discrete samples collected from 33 established field sites (sampled repeatedly). In addition, field-based sensor data from continuous measurement platforms (CMS locations or the mapping boat flow-through system) are also included in this discrete sample dataset; the field sensor measurements were selected to be both spatially and temporally coincident with the physically discrete water sample collected for laboratory analysis.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset shows the amount of water used by a company in southern China from 2016 to 2017.
The United States Geological Survey (USGS) is studying the effects of climate change on ocean acidification within the Gulf of Mexico, dealing specifically with the effect of ocean acidification on marine organisms and habitats. To investigate this, the USGS participated in cruises on the West Florida Shelf and northern Gulf of Mexico regions aboard the research vessel (R/V) Weatherbird II or Bellows, ships of opportunity led by Dr. Kendra Daly of the University of South Florida (USF), in July and August 2013. Cruises left from and returned to Saint Petersburg, Florida, but followed different routes. The USGS collected geochemical data pertaining to pH, dissolved inorganic carbon (DIC), total carbon dioxide (TCO2), and total alkalinity (TA) in discrete samples at various depths from predetermined stations. Discrete surface samples were also taken while in transit during both cruises.
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
This data set consists of chemical and stable isotope data obtained through the analysis of discrete water samples collected from 14 fixed sampling locations in the northern Sacramento-San Joaquin Delta at roughly monthly intervals between April 2011 and November 2012.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We use individual-level health facility choice data from urban Senegal to estimate consumer preferences for facility characteristics related to maternal health services. We find that consumers consider a large number of quality-related facility characteristics, as well as travel costs, when making their health facility choice. In contrast to the typical assumption in the literature, our findings indicate that individuals frequently bypass the facility nearest their home. In light of this, we show that the mismeasured data commonly used in the literature produce biased preference estimates; most notably, the literature likely overestimates consumer distaste for travel.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This paper demonstrates the flexibility of a general approach for the analysis of discrete time competing risks data that can accommodate complex data structures, different time scales for different causes, and nonstandard sampling schemes. The data may involve a single data source where all individuals contribute to analyses of both cause-specific hazard functions, overlapping datasets where some individuals contribute to the analysis of the cause-specific hazard function of only one cause while other individuals contribute to analyses of both cause-specific hazard functions, or separate data sources where each individual contributes to the analysis of the cause-specific hazard function of only a single cause. The approach is modularized into estimation and prediction. For the estimation step, the parameters and the variance-covariance matrix can be estimated using widely available software. The prediction step utilizes a generic program with plug-in estimates from the estimation step. The approach is illustrated with three prognostic models for stage IV male oral cancer using different data structures. The first model uses only men with stage IV oral cancer from population-based registry data. The second model strategically extends the cohort to improve the efficiency of the estimates. The third model improves the accuracy for those with a lower risk of other causes of death, by bringing in an independent data source collected under a complex sampling design with additional other-cause covariates. These analyses represent novel extensions of existing methodology, broadly applicable for the development of prognostic models capturing both the cancer and non-cancer aspects of a patient's health.
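The prediction step described above turns estimated cause-specific hazards into cumulative incidence. The following is a minimal sketch under a standard discrete-time assumption: each hazard is the conditional probability of that event type in an interval given survival to the start of the interval, and event types are mutually exclusive within an interval. The function name, the dictionary layout, and the hazard values in the usage note are hypothetical, not taken from the paper.

```python
def cumulative_incidence(hazards_by_cause):
    """Cumulative incidence per cause from discrete-time cause-specific hazards.

    hazards_by_cause maps a cause name to a list of per-interval hazards;
    all lists must have the same length. Returns (cif, survival), where
    cif[cause][t] is the probability of that event by the end of interval t
    and survival is the probability of no event after the last interval.
    """
    causes = list(hazards_by_cause)
    lengths = {len(hazards_by_cause[c]) for c in causes}
    if len(lengths) != 1:
        raise ValueError("all causes need hazards for the same intervals")
    n_intervals = lengths.pop()

    surviving = 1.0  # probability of being event-free at interval start
    running = {c: 0.0 for c in causes}
    cif = {c: [] for c in causes}
    for t in range(n_intervals):
        total_hazard = sum(hazards_by_cause[c][t] for c in causes)
        if total_hazard > 1.0:
            raise ValueError("hazards in one interval cannot sum above 1")
        for c in causes:
            # Event of type c in interval t requires surviving all earlier ones.
            running[c] += surviving * hazards_by_cause[c][t]
            cif[c].append(running[c])
        surviving *= 1.0 - total_hazard
    return cif, surviving
```

For example, with hypothetical hazards `{"cancer": [0.1, 0.1], "other": [0.2, 0.2]}` the cumulative incidences after two intervals are 0.17 and 0.34, and the event-free probability is 0.49, so the three quantities sum to 1.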
This dataset includes chemical, discrete sample, physical and profile data collected from METEOR in the North Atlantic Ocean and South Atlantic Ocean from 1990-10-04 to 1990-10-27 and retrieved during cruise WOCE_AE4E_06MT19901004. These data include CHLOROFLUOROCARBON-11 (CFC-11), CHLOROFLUOROCARBON-12 (CFC-12), DELTA CARBON-14, DELTA HELIUM-3, DISSOLVED OXYGEN, HELIUM, HYDROSTATIC PRESSURE, Potential temperature (theta), SALINITY, Tritium (Hydrogen isotope) and WATER TEMPERATURE. The instruments used to collect these data include CTD and bottle. These data were collected by Monika Rhein of Leibniz Institute of Marine Sciences (IFM-GEOMAR) and Friedrich A. Schott of IfMK as part of the WOCE_AE4E_06MT19901004 dataset. The International CLIVAR Global Ocean Carbon and Repeat Hydrography Program carries out a systematic and global re-occupation of select WOCE/JGOFS hydrographic sections to quantify changes in storage and transport of heat, fresh water, carbon dioxide (CO2), and related parameters.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Example of bias resulting from fitting discrete data with continuous maximum likelihood solutions.
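One way such a bias can arise is sketched below. The entry does not say which distribution is involved, so the pairing used here is an assumption chosen for illustration: integer data with P(k) proportional to exp(-rate * k) for k = 0, 1, 2, ... are fitted once with the continuous exponential maximum-likelihood formula and once with the matching discrete one. All function names are hypothetical.

```python
import math
import random


def continuous_mle_rate(samples):
    """Exponential-distribution MLE of the rate: 1 / sample mean."""
    return len(samples) / sum(samples)


def discrete_mle_rate(samples):
    """MLE of the rate when P(k) = (1 - exp(-rate)) * exp(-rate * k).

    Setting the derivative of the log-likelihood to zero gives
    mean = 1 / (exp(rate) - 1), hence rate = ln(1 + 1 / mean).
    """
    mean = sum(samples) / len(samples)
    return math.log(1.0 + 1.0 / mean)


def sample_discrete(rate, n, rng):
    """Draw n integers by flooring exponential draws (a geometric variable)."""
    return [int(rng.expovariate(rate)) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(42)
    data = sample_discrete(0.5, 20000, rng)
    print("continuous formula:", continuous_mle_rate(data))
    print("discrete formula:  ", discrete_mle_rate(data))
```

In this setup the continuous formula converges to exp(rate) - 1 rather than to the rate itself, so for a true rate of 0.5 it settles near 0.65 no matter how much data is collected, while the discrete formula recovers 0.5.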
https://spdx.org/licenses/CC0-1.0.html
The primary article (cited below under "Related works") introduces social work researchers to discrete choice experiments (DCEs) for studying stakeholder preferences. The article includes an online supplement with a worked example demonstrating DCE design and analysis with realistic simulated data. The worked example focuses on caregivers' priorities in choosing treatment for children with attention deficit hyperactivity disorder. This dataset includes the scripts (and, in some cases, Excel files) that we used to identify appropriate experimental designs, simulate population and sample data, estimate sample size requirements for the multinomial logit (MNL, also known as conditional logit) and random parameter logit (RPL) models, estimate parameters using the MNL and RPL models, and analyze attribute importance, willingness to pay, and predicted uptake. It also includes the associated data files (experimental designs, data generation parameters, simulated population data and parameters, simulated choice data, MNL and RPL results, RPL sample size simulation results, and willingness-to-pay results) and images. The data could easily be analyzed using other software, and the code could easily be adapted to analyze other data. Because this dataset contains only simulated data, we are not aware of any legal or ethical considerations.
Methods: In the worked example, we used simulated data to examine caregiver preferences for 7 treatment attributes (medication administration, therapy location, school accommodation, caregiver behavior training, provider communication, provider specialty, and monthly out-of-pocket costs) identified by dosReis and colleagues in a previous DCE. We employed an orthogonal design with 1 continuous variable (cost) and 12 dummy-coded variables (representing the levels of the remaining attributes, which were categorical).
Using the parameter estimates published by dosReis et al., with slight adaptations, we simulated utility values for a population of 100,000 people, then selected a sample of 500 for analysis. Relying on random utility theory, we used the mlogit package in R to estimate the MNL and RPL models, using 5,000 Halton draws for simulated maximum likelihood estimation of the RPL model. In addition to estimating the utility parameters, we measured the relative importance of each attribute, estimated caregivers’ willingness to pay (WTP) for differences in attributes (e.g., how much they would be willing to pay for their child to see one type of provider versus another) with bootstrapped 95% confidence intervals, and predicted the uptake of three treatment packages with different sets of attributes. This submission includes both the simulated source data and the processed results. The online supplement of the primary article describes the methods in greater detail.
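Two of the quantities mentioned above have short closed forms under the MNL model: predicted uptake is a softmax over the utilities of the competing packages, and willingness to pay for an attribute is the negative ratio of its coefficient to the cost coefficient. The sketch below is in Python rather than the R scripts in the dataset, and every coefficient and attribute name in it is a made-up placeholder, not a value from dosReis et al. or from the simulated results.

```python
import math


def mnl_probabilities(utilities):
    """MNL choice probabilities: softmax over alternative utilities."""
    top = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp(u - top) for u in utilities]
    norm = sum(weights)
    return [w / norm for w in weights]


def willingness_to_pay(beta_attribute, beta_cost):
    """WTP for a one-unit attribute change: -beta_attribute / beta_cost."""
    if beta_cost == 0:
        raise ValueError("cost coefficient must be non-zero")
    return -beta_attribute / beta_cost


def utility(betas, attributes):
    """Linear utility: sum of coefficient times attribute level."""
    return sum(betas[name] * level for name, level in attributes.items())


if __name__ == "__main__":
    # Hypothetical coefficients: one dummy-coded attribute plus monthly cost.
    betas = {"specialist_provider": 0.8, "monthly_cost": -0.02}
    packages = [
        {"specialist_provider": 1, "monthly_cost": 50},
        {"specialist_provider": 0, "monthly_cost": 20},
        {"specialist_provider": 0, "monthly_cost": 0},
    ]
    shares = mnl_probabilities([utility(betas, p) for p in packages])
    print("predicted uptake:", shares)
    print("WTP for specialist:", willingness_to_pay(0.8, -0.02))
```

With these placeholder values the WTP works out to about 40 cost units per month. Bootstrapped confidence intervals of the kind described above would repeat the estimation on resampled data and recompute this ratio each time.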
Discrete sample data from manual field collection and laboratory analyses taken since 2000. It contains water quality, sediment, biological, air, and soil samples from monitoring locations across the Lower Pecos Subregion of Texas, Hydrologic Unit Code (HUC) 1307.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Ocean Observatories Initiative (OOI) is a long-term NSF-funded program that deploys autonomous sensors on both moored and mobile platforms at multiple locations, including the Global Irminger Sea Array (60.46°N, 38.44°W) (Trowbridge et al., 2019). The OOI program conducts yearly turn-around cruises to the Irminger Sea Array to recover and redeploy moorings and gliders deployed year-round at this site. During these cruises the OOI program routinely conducts Conductivity Temperature Depth (CTD) casts and collects water samples from Niskin bottles on the CTD rosette for discrete sample analysis. These turn-around cruise data are critical for validation and calibration of the data from sensors deployed year-round and also provide a valuable dataset in and of themselves (Palevsky et al., 2023).
For this project, our team participated in two of the yearly turn-around cruises to the OOI Irminger Sea Array (AR30-03, 4-24 June 2018, and AR35-05, 2-25 August 2019) and collected supplementary samples from CTD casts. These samples support efforts to produce high-quality data products from OOI’s biogeochemical sensors, which in turn enable analysis of scientific questions about the ocean’s biological carbon pump and other carbon cycling processes (Palevsky and Nicholson, 2018; Palevsky et al., 2023).
The supplemental data provided here were collected in coordination with data collected by the OOI program. The complete collection of shipboard data and cruise documentation from these cruises is available from an OOI-managed document storage system called Alfresco (see related publications), following the path: OOI > Global Irminger Sea Array > Cruise Data > {Cruise ID}. For more information on OOI data access options and recommendations for use of cruise data to calibrate OOI biogeochemical sensors, see the OOI Biogeochemical Sensor Data Best Practices and User Guide (Palevsky et al., 2023).
Full cruise data from the OOI Irminger Sea cruises for which discrete sample data are presented here can be accessed via the Rolling Deck to Repository (R2R, see deployments) and OOI’s Alfresco data management server (see related publications):
Irminger Sea 5 cruise, June 2018, AR30-03. Alfresco path: Global Irminger Sea Array > Cruise Data > Irminger_Sea-05_AR30-03_2018-06-05
Irminger Sea 6 cruise, August 2019, AR35-05. Alfresco path: Global Irminger Sea Array > Cruise Data > Irminger_Sea-06_AR35-05_2019-08-02
The McRaven 2022 datasets (see related datasets) provide salinity-calibrated Conductivity Temperature Depth (CTD) data from the OOI Irminger Sea cruises for which discrete sample data are presented here.