81 datasets found

Poisson Distribution - Discrete Data
kaggle.com
zip
Updated Jan 30, 2025
Cite
Alberto Marini (2025). Poisson Distribution - Discrete Data [Dataset]. https://www.kaggle.com/datasets/albertomarini88/poisson-process
Explore at:
Available download formats: zip (36993 bytes)
Dataset updated
Jan 30, 2025
Authors
Alberto Marini
License
https://www.usa.gov/government-works/
Description
The Poisson Process file concerns the solution of an exercise from the fourth module of the Statistics and Applied Data Analysis Specialization course at the University of Colorado Boulder that I took. In these notes, I intend to explain the most important steps.
Secondary data analysis using Understanding Society Data
eprints.soton.ac.uk
harmonydata.ac.uk
+1 more
Updated May 17, 2024
Cite
Durrant, Gabriele (2024). Secondary data analysis using Understanding Society Data [Dataset]. http://doi.org/10.5255/UKDA-SN-852046
Explore at:
Unique identifier
https://doi.org/10.5255/UKDA-SN-852046
Dataset updated
May 17, 2024
Dataset provided by
UK Data Archive (http://data-archive.ac.uk/)
Authors
Durrant, Gabriele
Description
We analysed the Understanding Society Data from Waves 1 and 2 in our project to explore the uses of paradata in cross-sectional and longitudinal surveys with the aim of gaining knowledge that leads to improvement in field process management and responsive survey designs.
Additional file 2: of DRfit: a Java tool for the analysis of discrete data...
springernature.figshare.com
figshare.com
txt
Updated Jun 1, 2023
Cite
Andreas Hofmann; Sarah Preston; Megan Cross; H. Herath; Anne Simon; Robin Gasser (2023). Additional file 2: of DRfit: a Java tool for the analysis of discrete data from multi-well plate assays [Dataset]. http://doi.org/10.6084/m9.figshare.8164121.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.8164121.v1
Dataset updated
Jun 1, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Andreas Hofmann; Sarah Preston; Megan Cross; H. Herath; Anne Simon; Robin Gasser
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Example session file. (DRFIT 46 kb)
Data from: Investment and Interest Rate Policy: A Discrete Time Analysis
clevelandfed.org
Updated Dec 20, 2003
Cite
Federal Reserve Bank of Cleveland (2003). Investment and Interest Rate Policy: A Discrete Time Analysis [Dataset]. https://www.clevelandfed.org/publications/working-paper/2003/wp-0320-investment-and-interest
Explore at:
Dataset updated
Dec 20, 2003
Dataset authored and provided by
Federal Reserve Bank of Cleveland
Description
This paper analyzes the restrictions necessary to ensure that the interest rate policy rule used by the central bank does not introduce local real indeterminacy into the economy. It conducts the analysis in a Calvo-style sticky price model. A key innovation is to add investment spending to the analysis. In this environment, local real indeterminacy is much more likely. In particular, all forward-looking interest rate rules are subject to real indeterminacy.
Data from: Discrete System Analysis
paper.erudition.co.in
html
Updated Aug 25, 2021
Cite
Einetic (2021). Discrete System Analysis [Dataset]. https://paper.erudition.co.in/1/btech-in-electronics-and-instrumentation-engineering/7/digital-control-system
Explore at:
Available download formats: html
Dataset updated
Aug 25, 2021
Dataset authored and provided by
Einetic
License
https://paper.erudition.co.in/terms
Description
Question paper solutions for the chapter Discrete System Analysis of Digital Control System, 7th Semester, Applied Electronics and Instrumentation Engineering
Coastal Ocean Data Analysis Product in North America (CODAP-NA, Version...
catalog.data.gov
Updated Nov 1, 2025
+ more versions
Cite
(Point of Contact) (2025). Coastal Ocean Data Analysis Product in North America (CODAP-NA, Version 2021) (NCEI Accession 0219960) [Dataset]. https://catalog.data.gov/dataset/coastal-ocean-data-analysis-product-in-north-america-codap-na-version-2021-ncei-accession-021991
Explore at:
Dataset updated
Nov 1, 2025
Dataset provided by
(Point of Contact)
Area covered
North America
Description
This data package contains an internally consistent data product for discrete inorganic carbon, oxygen, and nutrients on the U.S. North American ocean margins, i.e., Coastal Ocean Data Analysis Product (CODAP-NA). It is created by compiling, quality controlling (QC), and synthesizing two decades of discrete measurements of inorganic carbon, oxygen, and nutrient chemistry data from North America’s U.S. coastal oceans. Due to the lack of deep-water sampling (>1500m), cross-over analyses were not conducted like the open ocean data products. Instead, only core data sets from laboratories with known quality assurance are included. Internal consistency checks and outlier detections are used to quality control the data. We worked closely with the investigators who collected and measured these data during the QC process. This version of the CODAP-NA is composed of 3,391 oceanographic profiles from 61 research cruises covering all continental shelves in North America (U.S. west coast, U.S. east coast, Gulf of Mexico, and Alaska coast). Data for 14 variables (temperature; salinity; dissolved oxygen concentration; dissolved inorganic carbon concentration; total alkalinity; pH on the Total Scale; carbonate ion concentration; fugacity of carbon dioxide; and concentrations of silicate, phosphate, nitrate, nitrite, nitrate plus nitrite, and ammonium) have been subjected to extensive quality control. Funding for this work comes from the National Oceanic and Atmospheric Administration (NOAA) Ocean Acidification Program (OAP, Project #: OAP 1903-1903).
Data from: Data sets for the analysis of decomposition error in...
service.tib.eu
Updated Nov 28, 2024
+ more versions
Cite
(2024). Data sets for the analysis of decomposition error in discrete-time open tandem queues [Dataset]. https://service.tib.eu/ldmservice/dataset/rdr-doi-10-35097-1342
Explore at:
Dataset updated
Nov 28, 2024
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract: This data repository contains raw data for the analysis of decomposition error in discrete-time open tandem queues. The data is formatted for the computation and validation of point and interval estimates for decomposition error as well as for the analysis of decomposition error in bottleneck queues.
Technical remarks: This data repository contains two folders. The first, "01 Equal Traffic Intensities", holds raw data for the analysis of decomposition error in tandem queues with equal traffic intensities and contains a training data file and a test data file. The second, "02 Bottleneck Analyses", holds raw data for the analysis of decomposition error in tandem queues with bottlenecks and contains three files: a data set with downstream bottleneck queues, a data set with upstream bottleneck queues, and a data set with similar traffic intensities.
Using a discrete mathematics approach, distinct BPS/IC phenotypes and...
dataverse.harvard.edu
Updated May 2, 2024
Cite
Nobuo Okui (2024). Using a discrete mathematics approach, distinct BPS/IC phenotypes and personalized treatment targets are revealed. [Dataset]. http://doi.org/10.7910/DVN/CEWVPA
Explore at:
Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Unique identifier
https://doi.org/10.7910/DVN/CEWVPA
Dataset updated
May 2, 2024
Dataset provided by
Harvard Dataverse
Authors
Nobuo Okui
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This study identified subgroups of bladder pain syndrome/interstitial cystitis (BPS/IC) patients and potential treatment targets by combining validated questionnaires and patient diaries with discrete mathematical techniques. Hierarchical clustering of questionnaire data revealed three distinct patient groups. Analysis of patient diaries, employing natural language processing—a form of discrete data analysis—found keywords capturing emotional and psychological experiences, complementing the questionnaire results. Integration of questionnaire and diary data visualized the relationships between symptoms and treatment targets through a network graph. This personalized approach, akin to solving the traveling salesman problem in discrete mathematics, was validated through case studies, demonstrating its utility in guiding targeted interventions. The study emphasizes the significant potential of discrete mathematics-based data integration and visualization for personalized management of this complex condition.
Data from: Global Ocean Data Analysis Project, Version 2 (GLODAPv2) (NCEI...
catalog.data.gov
s.cnmilf.com
Updated Nov 1, 2025
+ more versions
Cite
(Point of Contact) (2025). Global Ocean Data Analysis Project, Version 2 (GLODAPv2) (NCEI Accession 0162565) [Dataset]. https://catalog.data.gov/dataset/global-ocean-data-analysis-project-version-2-glodapv2-ncei-accession-0162565
Explore at:
Dataset updated
Nov 1, 2025
Dataset provided by
(Point of Contact)
Description
This data product is composed of data from 724 scientific cruises covering the global ocean. It includes data assembled during the previous interior ocean data synthesis efforts GLODAPv1.1 (Global Ocean Data Analysis Project version 1.1) in 2004, CARINA (CARbon IN the Atlantic) in 2009/2010, and PACIFICA (PACIFic ocean Interior CArbon) in 2013, as well as data from an additional 168 cruises. This dataset includes discrete bottle measurements of salinity, oxygen, nitrate, silicate, phosphate, dissolved inorganic carbon, total alkalinity, pH, CFC-11, CFC-12, CFC-113, and CCl4, carbon isotopes and chlorophyll. These data have been subjected to extensive primary and secondary quality control which included systematic evaluation of bias, and adjustments have been applied to remove significant biases, respecting occurrences of any known or likely time trends or variations.
Discrete Semiconductor Market - Size, Share & Growth Analysis
mordorintelligence.com
pdf,excel,csv,ppt
Updated Sep 9, 2025
Cite
Mordor Intelligence (2025). Discrete Semiconductor Market - Size, Share & Growth Analysis [Dataset]. https://www.mordorintelligence.com/industry-reports/discrete-semiconductor-market
Explore at:
Available download formats: pdf, excel, csv, ppt
Dataset updated
Sep 9, 2025
Dataset authored and provided by
Mordor Intelligence
License
https://www.mordorintelligence.com/privacy-policy
Time period covered
2019 - 2030
Area covered
Global
Description
The Discrete Semiconductor Market Report is Segmented by Device Type (Diode, Small-Signal Transistor, and More), End-User Vertical (Automotive, Consumer Electronics, and More), Material (Silicon, Silicon-Carbide, Gallium-Nitride), Power Rating (Low-Power, Mid-Power, High-Power), and Geography (North America, South America, Europe, Asia-Pacific, Middle East, and Africa). The Market Forecasts are Provided in Terms of Value (USD).
Discrete and high-frequency chloride (Cl) and specific conductance (SC) data...
data.usgs.gov
s.cnmilf.com
+1 more
Updated Nov 19, 2021
+ more versions
Cite
Rosemary Fanelli; Andrew Sekellick; Joel Moore (2021). Discrete and high-frequency chloride (Cl) and specific conductance (SC) data sets and Cl-SC regression equations used for analysis of 93 USGS water quality monitoring stations in the eastern United States [Dataset]. http://doi.org/10.5066/P9YN2QST
Explore at:
Unique identifier
https://doi.org/10.5066/P9YN2QST
Dataset updated
Nov 19, 2021
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Authors
Rosemary Fanelli; Andrew Sekellick; Joel Moore
License
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
Time period covered
Oct 1, 1953 - Sep 30, 2018
Area covered
United States
Description
High frequency estimated chloride (Cl) and observed specific conductance (SC) data sets, along with response variables derived from those data sets, were used in an analysis to quantify the extent to which deicer applications in winter affect water quality in 93 U.S. Geological Survey water quality monitoring stations across the eastern United States. The analysis was documented in the following publication: Moore, J., R. Fanelli, and A. Sekellick. In review. High-frequency data reveal deicing salts drive elevated conductivity and chloride along with pervasive and frequent exceedances of the EPA aquatic life criteria for chloride in urban streams. Submitted to Environmental Science and Technology. This data release contains five child items: 1) Input datasets of discrete specific conductance (SC) and chloride (Cl) observations used to develop regression models describing the relationship between chloride and SC 2) The predicted chloride concentrations generated by applying the s ...
Data for: Reducing Sample Size Requirements by Extending Discrete Choice...
opendata.eawag.ch
opendata-stage.eawag.ch
Updated Oct 10, 2023
Cite
(2023). Data for: Reducing Sample Size Requirements by Extending Discrete Choice Experiments to Indifference Elicitation - Package - ERIC [Dataset]. https://opendata.eawag.ch/dataset/data-for-sriwastava-et-al-2023-reducing-sample-size-requirements
Explore at:
Dataset updated
Oct 10, 2023
Description
Discrete choice (DC) methods provide a convenient approach for preference elicitation, and they lead to unbiased estimates of preference model parameters if the parameterization of the value function allows for a good description of the preferences. On the other hand, indifference elicitation (IE) was suggested decades ago as a direct trade-off estimator for preference elicitation in decision analysis, but it has not found widespread application in statistical analysis frameworks in the way discrete choice methods have. We develop a hierarchical, probabilistic model for IE that allows us to do Bayesian inference similar to DC methods. A case study with synthetically generated data allows us to investigate potential bias and to estimate parameter uncertainty over a wide range of numbers of replies and elicitation uncertainties for both DC and IE. Through an empirical case study with laboratory-scale choice and indifference experiments, we investigate the feasibility of the approach and the excess time needed for indifference replies. Our results demonstrate (i) the absence of bias of the suggested methodology, (ii) a reduction in the uncertainty of estimated parameters by about a factor of three or a reduction of the required number of replies to achieve a similar accuracy as with DC by about a factor of ten, (iii) the feasibility of the approach, and (iv) a median increase in time needed for indifference reply of about a factor of three. If the set of respondents is small, the higher elicitation effort may be worthwhile to achieve reasonable accuracy in the estimated value function parameters.
Lending Club Loan Data Analysis - Deep Learning
kaggle.com
zip
Updated Aug 9, 2023
+ more versions
Cite
Deependra Verma (2023). Lending Club Loan Data Analysis - Deep Learning [Dataset]. https://www.kaggle.com/datasets/deependraverma13/lending-club-loan-data-analysis-deep-learning/discussion
Explore at:
Available download formats: zip (219646 bytes)
Dataset updated
Aug 9, 2023
Authors
Deependra Verma
Description
Create a model that predicts whether or not a loan will default, using the historical data.

Problem Statement:

For companies like Lending Club, correctly predicting whether or not a loan will default is very important. In this project, using the historical data from 2007 to 2015, you have to build a deep learning model to predict the chance of default for future loans. As you will see later, this dataset is highly imbalanced and includes many features, which makes the problem more challenging.

Domain: Finance

Analysis to be done: Perform data preprocessing and build a deep learning prediction model.

Content:

Dataset columns and definition:

credit.policy: 1 if the customer meets the credit underwriting criteria of LendingClub.com, and 0 otherwise.

purpose: The purpose of the loan (takes values "credit_card", "debt_consolidation", "educational", "major_purchase", "small_business", and "all_other").

int.rate: The interest rate of the loan, as a proportion (a rate of 11% would be stored as 0.11). Borrowers judged by LendingClub.com to be more risky are assigned higher interest rates.

installment: The monthly installments owed by the borrower if the loan is funded.

log.annual.inc: The natural log of the self-reported annual income of the borrower.

dti: The debt-to-income ratio of the borrower (amount of debt divided by annual income).

fico: The FICO credit score of the borrower.

days.with.cr.line: The number of days the borrower has had a credit line.

revol.bal: The borrower's revolving balance (amount unpaid at the end of the credit card billing cycle).

revol.util: The borrower's revolving line utilization rate (the amount of the credit line used relative to total credit available).

inq.last.6mths: The borrower's number of inquiries by creditors in the last 6 months.

delinq.2yrs: The number of times the borrower had been 30+ days past due on a payment in the past 2 years.

pub.rec: The borrower's number of derogatory public records (bankruptcy filings, tax liens, or judgments).

Steps to perform:

Perform exploratory data analysis and feature engineering. Follow up with a deep learning model to predict whether or not the loan will default, using the historical data.

Tasks:

Feature Transformation

Transform categorical values into numerical values (discrete)

Exploratory data analysis of different factors of the dataset.

Additional Feature Engineering

You will check the correlation between features and drop those features that are strongly correlated with one another.

This will help reduce the number of features and leave you with the most relevant ones.

Modeling

After applying EDA and feature engineering, you are ready to build the predictive models.

In this part, you will create a deep learning model using Keras with a TensorFlow backend.
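
The feature transformation and correlation-based feature dropping steps listed above can be sketched in plain Python. This is only an illustration under stated assumptions: the four rows are made-up values (not taken from the dataset), only a few of the described columns are used, the helper names `one_hot`, `pearson` and `drop_correlated` are hypothetical, and the 0.9 threshold and the greedy keep-first rule are choices made here, not something the task description specifies.

```python
import math

# Hypothetical rows that reuse column names from the description above.
# The values are invented for illustration only.
rows = [
    {"purpose": "credit_card", "int.rate": 0.11, "installment": 320.0, "fico": 720},
    {"purpose": "debt_consolidation", "int.rate": 0.14, "installment": 410.0, "fico": 690},
    {"purpose": "educational", "int.rate": 0.09, "installment": 250.0, "fico": 750},
    {"purpose": "credit_card", "int.rate": 0.16, "installment": 480.0, "fico": 660},
]


def one_hot(rows, column):
    """Replace a categorical column with 0/1 indicator columns."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = 1 if row[column] == cat else 0
        encoded.append(new_row)
    return encoded


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def drop_correlated(rows, threshold=0.9):
    """Greedy filter: keep a feature only if its absolute correlation with
    every feature kept so far is below the (assumed) threshold."""
    kept = []
    for name in rows[0].keys():
        column = [row[name] for row in rows]
        if all(abs(pearson(column, [row[k] for row in rows])) < threshold for k in kept):
            kept.append(name)
    return [{k: row[k] for k in kept} for row in rows], kept


encoded = one_hot(rows, "purpose")
reduced, kept = drop_correlated(encoded, threshold=0.9)
print(kept)
```

In this toy data `installment` and `fico` move almost in lockstep with `int.rate`, so the greedy filter keeps `int.rate` and drops the other two. On the real dataset the outcome depends on the actual correlations and on the threshold chosen, and the reduced feature table would then be passed to the Keras model.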
Table_2_The value of generalized linear mixed models for data analysis in...
figshare.com
frontiersin.figshare.com
docx
Updated Jun 25, 2024
+ more versions
Cite
Laurence V. Madden; Peter S. Ojiambo (2024). Table_2_The value of generalized linear mixed models for data analysis in the plant sciences.docx [Dataset]. http://doi.org/10.3389/fhort.2024.1423462.s002
Explore at:
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fhort.2024.1423462.s002
Dataset updated
Jun 25, 2024
Dataset provided by
Frontiers
Authors
Laurence V. Madden; Peter S. Ojiambo
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Modern data analysis typically involves the fitting of a statistical model to data, which includes estimating the model parameters and their precision (standard errors) and testing hypotheses based on the parameter estimates. Linear mixed models (LMMs) fitted through likelihood methods have been the foundation for data analysis for well over a quarter of a century. These models allow the researcher to simultaneously consider fixed (e.g., treatment) and random (e.g., block and location) effects on the response variables and account for the correlation of observations, when it is assumed that the response variable has a normal distribution. Analysis of variance (ANOVA), which was developed about a century ago, can be considered a special case of the use of an LMM. A wide diversity of experimental and treatment designs, as well as correlations of the response variable, can be handled using these types of models. Many response variables are not normally distributed, of course, such as discrete variables that may or may not be expressed as a percentage (e.g., counts of insects or diseased plants) and continuous variables with asymmetrical distributions (e.g., survival time). As expansions of LMMs, generalized linear mixed models (GLMMs) can be used to analyze the data arising from several non-normal statistical distributions, including the discrete binomial, Poisson, and negative binomial, as well as the continuous gamma and beta. A GLMM allows the data analyst to better match the model to the data rather than to force the data to match a specific model. The increase in computer memory and processing speed, together with the development of user-friendly software and the progress in statistical theory and methodology, has made it practical for non-statisticians to use GLMMs since the late 2000s. 
The switch from LMMs to GLMMs is deceptive, however, as there are several major issues that must be thought about or judged when using a GLMM, which are mostly resolved for routine analyses with LMMs. These include the consideration of conditional versus marginal distributions and means, overdispersion (for discrete data), the model-fitting method [e.g., maximum likelihood (integral approximation), restricted pseudo-likelihood, and quasi-likelihood], and the choice of link function to relate the mean to the fixed and random effects. The issues are explained conceptually with different model formulations and subsequently with an example involving the percentage of diseased plants in a field study with wheat, as well as with simulated data, starting with a LMM and transitioning to a GLMM. A brief synopsis of the published GLMM-based analyses in the plant agricultural literature is presented to give readers a sense of the range of applications of this approach to data analysis.
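
For orientation, the model structure this description refers to can be written compactly. The symbols below follow common GLMM notation and are an assumption here, since the description itself introduces none: $y$ is the response, $X\beta$ the fixed effects, $Zu$ the random effects with covariance matrix $G$, and $g$ the link function. The logit link is shown only as one common choice for a binomial response such as a proportion of diseased plants; the description does not say which link the authors use.

```latex
\[
\begin{aligned}
\mu &= \mathrm{E}[\,y \mid u\,], \\
g(\mu) &= X\beta + Zu, \qquad u \sim \mathcal{N}(0, G), \\
g(\mu) &= \log\frac{\mu}{1-\mu} \quad \text{(logit link, binomial response)}.
\end{aligned}
\]
```

With the identity link $g(\mu) = \mu$ and a normally distributed response, this reduces to the LMM, which is why the description treats GLMMs as expansions of LMMs. The mean $\mu$ here is conditional on the random effects, which is the conditional-versus-marginal distinction the description raises.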
Genomic Results Discrete Data Integration Market Research Report 2033
growthmarketreports.com
csv, pdf, pptx
Updated Oct 7, 2025
Cite
Growth Market Reports (2025). Genomic Results Discrete Data Integration Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/genomic-results-discrete-data-integration-market
Explore at:
Available download formats: csv, pptx, pdf
Dataset updated
Oct 7, 2025
Dataset authored and provided by
Growth Market Reports
Time period covered
2024 - 2032
Area covered
Global
Description
Genomic Results Discrete Data Integration Market Outlook

As per our latest research, the global Genomic Results Discrete Data Integration market size reached USD 1.45 billion in 2024, demonstrating robust momentum driven by the increasing adoption of precision medicine and advanced data analytics in genomics. The market is projected to expand at a CAGR of 13.2% during the forecast period, reaching an estimated USD 4.14 billion by 2033. This impressive growth trajectory is fueled by the convergence of high-throughput sequencing technologies, the rising demand for integrated healthcare data, and the need for actionable insights from complex genomic datasets.
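
The quoted start value, end value and CAGR can be cross-checked with the standard compound-growth formula. The sketch below is a minimal illustration, not the publisher's method: the function names `project` and `implied_cagr` are hypothetical, and because the description does not say how many compounding years the forecast period covers, both 8 and 9 years are tried as assumptions.

```python
def project(start_value, cagr, years):
    """Value after compounding `start_value` at rate `cagr` for `years` years."""
    return start_value * (1 + cagr) ** years


def implied_cagr(start_value, end_value, years):
    """Constant annual growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the description above, in USD billions.
start, end, rate = 1.45, 4.14, 0.132

# The number of compounding years is an assumption; try both readings.
for years in (8, 9):
    projected = project(start, rate, years)
    implied = implied_cagr(start, end, years)
    print(f"{years} years: projected {projected:.2f}, implied CAGR {implied * 100:.1f}%")
```

Any gap between the projected and quoted end values may come from rounding or from a different base year, so the check is a plausibility test rather than a correction of the report.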

A primary growth factor in the Genomic Results Discrete Data Integration market is the exponential rise in genomic data generation, propelled by advancements in next-generation sequencing (NGS) and other high-throughput technologies. As the cost of sequencing continues to decline, the volume of raw genomic data produced by research laboratories, clinical settings, and biopharmaceutical companies has surged. However, the true value of this data is only realized when disparate datasets—spanning genomics, transcriptomics, proteomics, and metabolomics—are seamlessly integrated and analyzed. The integration of discrete genomic results enables researchers and clinicians to uncover complex biological relationships, identify novel biomarkers, and support the development of targeted therapies, thus driving widespread adoption of data integration platforms and solutions.

Another significant driver is the increasing focus on personalized medicine, which relies heavily on the integration of multi-omics data to tailor medical treatments to individual patients. Healthcare providers and pharmaceutical companies are leveraging integrated genomic data to stratify patient populations, predict disease susceptibility, and optimize therapeutic interventions. This shift toward data-driven healthcare is further supported by regulatory agencies encouraging the use of real-world evidence and integrated datasets for drug approval and post-market surveillance. Consequently, the demand for robust, scalable, and interoperable data integration solutions is surging, as stakeholders seek to harness the full potential of genomic and related datasets for clinical and research applications.

Furthermore, the Genomic Results Discrete Data Integration market benefits from technological innovations in artificial intelligence (AI), machine learning (ML), and cloud computing. These technologies facilitate the efficient aggregation, harmonization, and analysis of massive and heterogeneous datasets, overcoming traditional barriers to data integration such as data silos, format inconsistencies, and security concerns. The adoption of AI-driven analytics and cloud-based integration platforms is accelerating, enabling real-time data sharing, collaborative research, and scalable storage solutions. These advancements are not only enhancing the accuracy and speed of data interpretation but also democratizing access to integrated genomic insights across diverse healthcare and research environments.

From a regional perspective, North America continues to dominate the Genomic Results Discrete Data Integration market, accounting for the largest share in 2024, followed by Europe and Asia Pacific. The region’s leadership is attributed to its advanced healthcare infrastructure, significant investments in genomics research, and the presence of leading biopharmaceutical and technology companies. Meanwhile, Asia Pacific is emerging as the fastest-growing region, propelled by expanding genomic research initiatives, increasing healthcare expenditure, and government support for precision medicine. Europe also demonstrates steady growth, driven by collaborative research projects and strong regulatory frameworks supporting data integration. Latin America and Middle East & Africa represent nascent but promising markets, with growing awareness and gradual adoption of integrated genomic solutions.

Component Analysis

The Com
Data from: Select elements of concern in surface water of three hydrologic...
catalog.data.gov
Updated Sep 12, 2025
+ more versions
Cite
U.S. Geological Survey (2025). Select elements of concern in surface water of three hydrologic basins (Delaware River, Illinois River and Upper Colorado River) - Data screening for the development of spatial and temporal models [Dataset]. https://catalog.data.gov/dataset/select-elements-of-concern-in-surface-water-of-three-hydrologic-basins-delaware-river-illi
Explore at:
Dataset updated
Sep 12, 2025
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Area covered
Delaware River, Colorado River
Description
This data release is focused on the analysis of surface water concentration data associated with 12 elements of concern from three hydrologic basins. Data is analyzed with respect to: a) reporting limits, b) the extent of censored data, c) co-location with USGS real-time sensor data, and d) median concentrations at the catchment spatial scale. The Proxies Project (under the Water Quality Program of the USGS Water Mission Area) is a multi-year effort designed to develop rapid and cost-effective approaches for monitoring and risk assessment of a range of aquatic contaminants in riverine surface waters at multiple spatial scales. One component of this project is focused on 12 Elements of Concern (EoC; Al, As, Cd, Cr, Cu, Fe, Hg, Mn, Pb, Se, U and Zn) in three primary hydrologic basins: Delaware River Basin (DRB), the Illinois River Basin (ILRB) and the Upper Colorado (UCOL) River Basin (USGS, 2023). Two modeling approaches being explored as part of the Proxies Project rely on the analysis of previously published EoC concentration data retrieved from the multi-agency supported Water Quality Portal (www.waterqualitydata.us/). This basin-specific retrieved data, covering the 1900-2022 timeframe, was subsequently screened, harmonized and published as part of an earlier USGS Data Release (Marvin-DiPasquale and others, 2022). The two distinct modeling approaches that leverage this previously published data are: a) machine learning statistical analysis of EoC concentration distributions as a function of geospatial attributes; and b) time series analysis in support of estimating EoC concentrations in (near)real-time at a sub-set of USGS real-time stations using discharge in combination with a range of deployed in-situ sensors. Prior to the final stages of model development, there were several data analysis steps required to further define which elements and aquatic fractions (i.e., filtered, unfiltered, and particulate) best lend themselves to further model exploration and development. These intermediate data analyses include: a) an analysis of the change in detection quantitation limits, by element and methods over time (DR_Table_1); b) an analysis of data censoring, by study basin, element, and fraction (DR_Table_2); c) a calculation of median EoC concentrations at the National Hydrography Dataset Plus (NHDPlus) catchment spatial scale (DR_Table_3); d) an analysis of the percentage of censored median EoC concentration values by study basin, element, and fraction (DR_Table_4); e) decision tree analysis associated with the geospatial machine learning modeling approach, by study basin, element and fraction (DR_Table_5); f) discrete EoC concentration data merged with continuous discharge and in-situ sensor data at USGS real-time stations, by station ID, element and fraction (DR_Table_6); and g) an analysis of the total number of observations and the percentage of censored EoC data associated with the merged discrete EoC and continuous discharge and sensor data retrieved from USGS real-time stations, by station ID, element, and fraction (DR_Table_7). The current data release documents the results of these data analyses. The associated seven data tables presented herein are provided in machine-readable comma separated value (*.csv) format and are more fully described in the associated meta-data.
REFERENCES
Marvin-DiPasquale, M.C., Sullivan, S.L., Platt, L. R., Gorsky, A., Agee, J.L., McCleskey, B.R., Kakouros, E., Walton-Day, K., Runkel, R. L., Morriss, M. C., Wakefield, B. F., and Bergamaschi, B., 2022, Concentration Data for 12 Elements of Concern Used in the Development of Surrogate Models for Estimating Elemental Concentrations in Surface Water of Three Hydrologic Basins (Delaware River, Illinois River and Upper Colorado River): U.S. Geological Survey data release, https://doi.org/10.5066/P9L06M3G.
USGS, 2023, Proxies Project, U.S. Geological Survey webpage, accessed 3/11/2025, https://www.usgs.gov/mission-areas/water-resources/science/proxies-project
Genomic Results Discrete Data Integration Market Research Report 2033
dataintelo.com
csv, pdf, pptx
Updated Oct 1, 2025
Cite
Dataintelo (2025). Genomic Results Discrete Data Integration Market Research Report 2033 [Dataset]. https://dataintelo.com/report/genomic-results-discrete-data-integration-market
Explore at:
Available download formats: csv, pptx, pdf
Dataset updated
Oct 1, 2025
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Genomic Results Discrete Data Integration Market Outlook

According to our latest research, the global Genomic Results Discrete Data Integration market size reached USD 2.18 billion in 2024, reflecting a robust expansion driven by the rapid adoption of precision medicine and the increasing integration of multi-omics data in healthcare and research. The market is projected to grow at a CAGR of 13.6% from 2025 to 2033, reaching an estimated USD 6.47 billion by 2033. This remarkable growth is primarily fueled by technological advancements in bioinformatics, an upsurge in clinical applications of genomics, and a growing demand for actionable insights from complex biological datasets.
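
The growth figures above follow the usual compound annual growth rate (CAGR) relation, end = start × (1 + rate)^years. The Python sketch below only illustrates that arithmetic with the numbers quoted in the paragraph. The function names are hypothetical, and treating 2024 as the base year with nine compounding periods to 2033 is an assumption, since the report does not state its convention.

```python
def project_value(start, rate, years):
    """Compound a starting value forward at a fixed annual rate."""
    return start * (1.0 + rate) ** years


def implied_cagr(start, end, years):
    """Annual growth rate that takes `start` to `end` in `years` periods."""
    return (end / start) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    start_2024 = 2.18   # USD billion, as quoted for 2024
    end_2033 = 6.47     # USD billion, as quoted for 2033
    quoted_rate = 0.136 # 13.6% CAGR, as quoted
    years = 9           # ASSUMPTION: 2024 -> 2033 counted as nine periods

    projected = project_value(start_2024, quoted_rate, years)
    rate = implied_cagr(start_2024, end_2033, years)
    print(f"Projected 2033 value at the quoted rate: {projected:.2f} billion USD")
    print(f"Rate implied by the quoted endpoints: {rate:.1%}")
```

The computed values need not match the quoted figures exactly, because the result depends on rounding and on which year is taken as the base; changing `years` to 8 shows how sensitive the estimate is to that choice.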

One of the primary growth factors propelling the Genomic Results Discrete Data Integration market is the exponential increase in genomic data generated by next-generation sequencing (NGS) technologies. As the cost of sequencing continues to decrease, the volume of genomic, transcriptomic, proteomic, and metabolomic data being produced is rising dramatically. This surge necessitates advanced data integration solutions capable of transforming raw, heterogeneous datasets into structured, clinically relevant information. The ability to harmonize and standardize disparate data sources is crucial for supporting clinical diagnostics, personalized medicine, and drug discovery, all of which rely on robust data integration platforms to drive informed decisions and improve patient outcomes.

Another significant driver is the growing emphasis on personalized medicine and targeted therapeutics. Healthcare providers and pharmaceutical companies are increasingly leveraging discrete data integration platforms to correlate genomic variants with phenotypic outcomes, enabling more precise disease stratification and individualized treatment strategies. The integration of multi-omics data not only enhances the understanding of disease mechanisms but also accelerates the identification of novel therapeutic targets. This trend is further reinforced by regulatory agencies and reimbursement bodies that are placing greater value on the clinical utility of integrated genomic data, thereby incentivizing investments in advanced integration technologies.

Furthermore, the adoption of cloud-based solutions and artificial intelligence (AI) in genomic data integration is revolutionizing the market landscape. Cloud platforms offer scalable storage, computational power, and collaborative environments, making it feasible for institutions of all sizes to process and analyze vast datasets efficiently. AI-driven analytics are enhancing the extraction of actionable insights from integrated data, supporting applications across clinical diagnostics, research, and drug development. The convergence of these technologies is not only improving the speed and accuracy of data interpretation but also expanding the accessibility of genomic insights to a broader range of end-users, including hospitals, research institutes, and biotechnology companies.

Regionally, North America dominated the Genomic Results Discrete Data Integration market in 2024, accounting for the largest revenue share due to its advanced healthcare infrastructure, high adoption of precision medicine, and significant investments in genomics research. Europe followed closely, driven by strong government support and collaborative research initiatives. The Asia Pacific region is emerging as a high-growth market, propelled by increasing healthcare expenditure, expanding genomics research capabilities, and rising awareness of personalized medicine. Latin America and the Middle East & Africa are also witnessing gradual adoption, supported by international collaborations and capacity-building efforts. The regional outlook remains optimistic, with all major regions expected to contribute significantly to the market’s overall expansion through 2033.

Component Analysis

The Genomic Results Discrete Data Integration market by component is segmented into software, hardware, and services, each playing a pivotal role in enabling seamless integration and interpretation of complex biological data. Software solutions represent the largest share, driven by the need for sophisticated algorithms that can harmonize, standardize, and analyze multi-omics datasets. These platforms facilitate data interoperability, support regulatory compliance, and enable advanced analytics, making them indispensable for both clinical and research applications. Key sof
Data from: Global Ocean Data Analysis Project, Version 2 (GLODAPv2) (NCEI...
data.cnra.ca.gov
access.earthdata.nasa.gov
+2more
xml
Updated May 9, 2019
Cite
Ocean Data Partners (2019). Global Ocean Data Analysis Project, Version 2 (GLODAPv2) (NCEI Accession 0162565) [Dataset]. https://data.cnra.ca.gov/dataset/global-ocean-data-analysis-project-version-2-glodapv2-ncei-accession-0162565
Explore at:
Available download formats: xml
Dataset updated
May 9, 2019
Dataset authored and provided by
Ocean Data Partners
Description
This data product is composed of data from 724 scientific cruises covering the global ocean. It includes data assembled during the previous interior ocean data synthesis efforts GLODAPv1.1 (Global Ocean Data Analysis Project version 1.1) in 2004, CARINA (CARbon IN the Atlantic) in 2009/2010, and PACIFICA (PACIFic ocean Interior CArbon) in 2013, as well as data from an additional 168 cruises. NCEI Accession 0162565 includes discrete bottle measurements of salinity, oxygen, nitrate, silicate, phosphate, dissolved inorganic carbon, total alkalinity, pH, CFC-11, CFC-12, CFC-113, and CCl4, carbon isotopes and chlorophyll. These data have been subjected to extensive primary and secondary quality control which included systematic evaluation of bias, and adjustments have been applied to remove significant biases, respecting occurrences of any known or likely time trends or variations.
Discrete Tone Image Dataset
kaggle.com
zip
Updated Aug 22, 2021
Cite
Akash Patel (2021). Discrete Tone Image Dataset [Dataset]. https://www.kaggle.com/imakash3011/discrete-tone-image-dataset
Explore at:
Available download formats: zip (26300605 bytes)
Dataset updated
Aug 22, 2021
Authors
Akash Patel
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

Data Set Information:

This dataset contains 71 images in total, comprising 11 types of images together with their distorted versions. Each image has its own distinct discrete-tone properties.

Content

Attribute Information:

Types of Images:
1. System Generated DTI by setting distinct pixel values
2. Discrete Pixel Logo
3. Business Charts
4. Bi-Level
5. Part of Discrete Information from a Continuous Image

Colorspace models:
1. RGB
2. Grayscale
3. Binary

Distortion Types:
1. JPEG
2. Gaussian White Noise (GWN)
3. Salt and Pepper noise (SP)
4. Multiplicative Speckle Noise (MSN)
5. Poisson Noise (PN)
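
The four noise-based distortion types listed above can be sketched in plain Python as follows. This is an illustrative sketch only: the dataset description does not state how the distorted versions were produced, so the function names, the noise parameters (`sigma`, `amount`), and the tiny test image are all assumptions. JPEG compression is omitted because it needs an image codec rather than a noise model.

```python
import math
import random


def clip8(x):
    """Round a value and clamp it to the 8-bit range 0..255."""
    return max(0, min(255, int(round(x))))


def gaussian_white_noise(img, sigma, rng):
    """Additive noise: pixel + N(0, sigma^2)."""
    return [[clip8(p + rng.gauss(0.0, sigma)) for p in row] for row in img]


def salt_and_pepper(img, amount, rng):
    """Set a fraction `amount` of pixels to 0 (pepper) or 255 (salt)."""
    out = []
    for row in img:
        new_row = []
        for p in row:
            u = rng.random()
            if u < amount / 2:
                new_row.append(0)
            elif u < amount:
                new_row.append(255)
            else:
                new_row.append(p)
        out.append(new_row)
    return out


def speckle(img, sigma, rng):
    """Multiplicative noise: pixel * (1 + N(0, sigma^2))."""
    return [[clip8(p * (1.0 + rng.gauss(0.0, sigma))) for p in row] for row in img]


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def poisson_noise(img, rng):
    """Replace each pixel by a Poisson draw whose mean is the pixel value."""
    return [[clip8(poisson_sample(p, rng)) for p in row] for row in img]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    # Hypothetical 4x4 discrete-tone image with four gray levels.
    image = [[0, 0, 85, 85], [0, 0, 85, 85], [170, 170, 255, 255], [170, 170, 255, 255]]
    for name, noisy in [
        ("GWN", gaussian_white_noise(image, 10.0, rng)),
        ("SP", salt_and_pepper(image, 0.2, rng)),
        ("MSN", speckle(image, 0.1, rng)),
        ("PN", poisson_noise(image, rng)),
    ]:
        print(name, noisy)
```

For real images an array library would normally replace the nested lists; the list-based form is used here only to keep the sketch free of dependencies.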

Target

Use this dataset for analysis purposes.

Acknowledgements

Source:

Creator:

J. Uthayakumar, Research Scholar, Department of Computer Science, Pondicherry University, India. Contact: +91 9677583754. Email: uthayresearchscholar '@' gmail.com

Guided by:

Dr. T. Vengattaraman, Assistant Professor, Department of Computer Science, Pondicherry University, India. Email: vengattaramant '@' gmail.com

Dr. P. Dhavachelvan, Professor, Department of Computer Science, Pondicherry University, India. Email: dhavachelvan '@' gmail.com

Inspiration

Keep sharing knowledge.
Extended Dataset Generated by the OEIS Integer Sequence A377045: Number of...
data.mendeley.com
figshare.com
Updated Nov 13, 2024
Cite
Paul F Marrero Romero (2024). Extended Dataset Generated by the OEIS Integer Sequence A377045: Number of Partitions of Cuban Primes. [Dataset]. http://doi.org/10.17632/st8j3c3fp9.1
Explore at:
Unique identifier
https://doi.org/10.17632/st8j3c3fp9.1
Dataset updated
Nov 13, 2024
Authors
Paul F Marrero Romero
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This integer sequence was registered and published in the On-Line Encyclopedia of Integer Sequences (OEIS.org) database on October 14, 2024, under the OEIS code A377045.

This sequence can be expressed with two general formulas that use the following sequences:

1) A000041: a(n) is the number of partitions of n (the partition numbers).

2) A002407: Cuban primes: primes which are the difference of two consecutive cubes.

3) A121259: Numbers k such that (3*k^2 + 1)/4 is prime.

The two aforementioned general formulas are as follows:

a(n) = A000041(A002407(n)). (1)

a(n) = A000041((3*A121259(n)^2 + 1) / 4). (2)

Some interesting properties of this sequence are:

◼ Number of partitions of prime numbers that are the difference of two consecutive cubes.

◼ Number of partitions of primes p such that p=(3*n^2 + 1) / 4 for some integer n (A121259).

◼ a(13) ≈ 1.49910 × 10^43.

◼ The last known integer n in A121259 is 341 and corresponds to a(60) ≈ 1.59114 × 10^323.

The numerical data shown in this dataset was generated by the following Mathematica program:

PartitionsP[Select[Table[(3 k^2 + 1)/4, {k, 500}], PrimeQ]]

The previous program was built with Mathematica v13.3.0.
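
As a cross-check on formula (2), the same computation can be sketched in Python. This is a minimal sketch and not the author's program: the function names are hypothetical, the partition numbers are computed with Euler's pentagonal number recurrence, and primality is tested by simple trial division, which is adequate only for the small range used here.

```python
def partition_numbers(limit):
    """Return a list p with p[n] = number of partitions of n, for 0 <= n <= limit.

    Uses Euler's pentagonal number recurrence:
    p(n) = sum_{k>=1} (-1)^(k+1) * [p(n - k(3k-1)/2) + p(n - k(3k+1)/2)].
    """
    p = [0] * (limit + 1)
    p[0] = 1
    for n in range(1, limit + 1):
        total = 0
        k = 1
        while True:
            g1 = k * (3 * k - 1) // 2
            if g1 > n:
                break
            sign = 1 if k % 2 == 1 else -1
            total += sign * p[n - g1]
            g2 = k * (3 * k + 1) // 2
            if g2 <= n:
                total += sign * p[n - g2]
            k += 1
        p[n] = total
    return p


def is_prime(m):
    """Trial-division primality test (fine for small m only)."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True


def cuban_primes(max_k):
    """Primes of the form (3*k^2 + 1)/4 for odd k <= max_k.

    The expression is an integer only for odd k, so even k are skipped.
    """
    values = ((3 * k * k + 1) // 4 for k in range(1, max_k + 1, 2))
    return [v for v in values if is_prime(v)]


def partitions_of_cuban_primes(max_k):
    """Terms a(n) = A000041(A002407(n)) reachable with k <= max_k."""
    primes = cuban_primes(max_k)
    if not primes:
        return []
    p = partition_numbers(max(primes))
    return [p[q] for q in primes]


if __name__ == "__main__":
    print(cuban_primes(15))
    print(partitions_of_cuban_primes(15))
```

Raising `max_k` to 500 mirrors the range of the Mathematica `Table` call, at a noticeably higher running time in pure Python.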

Note: More mathematical details, graphics and technical information can be found in the notebook (.nb) & pdf files provided in this data pack.


Poisson Distribution - Discrete Data

Poisson Distribution models the count of discrete events over time or space
