100+ datasets found
  1. Standardize Data

    • figshare.com
    csv
    Updated Jul 17, 2025
    Cite
    Zekun Lu (2025). Standardize Data [Dataset]. http://doi.org/10.6084/m9.figshare.29590574.v1
    Available download formats: csv
    Dataset updated
    Jul 17, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Zekun Lu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Standardize Data

  2. Port Call Data Standardization Services Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). Port Call Data Standardization Services Market Research Report 2033 [Dataset]. https://dataintelo.com/report/port-call-data-standardization-services-market
    Available download formats: pptx, pdf, csv
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Port Call Data Standardization Services Market Outlook



    According to our latest research, the global Port Call Data Standardization Services market size reached USD 1.92 billion in 2024, propelled by the increasing need for operational efficiency and digital transformation across the maritime sector. The market is anticipated to expand at a robust CAGR of 14.1% during the forecast period, reaching approximately USD 5.13 billion by 2033. This growth is primarily driven by the rising adoption of advanced data management solutions, regulatory mandates for data accuracy, and the growing complexity of port operations worldwide.



    One of the primary growth factors for the Port Call Data Standardization Services market is the escalating demand for real-time, accurate, and standardized data across global maritime operations. As ports and shipping lines increasingly digitize their workflows, the importance of harmonizing data formats and ensuring interoperability between disparate systems has become critical. Efficient data standardization enables seamless communication among stakeholders, reduces operational bottlenecks, and enhances decision-making capabilities. Additionally, the emergence of smart ports and the integration of IoT devices have further amplified the volume and complexity of data, necessitating robust standardization services to maintain data integrity and streamline port call processes.



    Another significant driver is the stringent regulatory environment governing maritime operations. International bodies such as the International Maritime Organization (IMO) and regional authorities are mandating higher standards for data transparency, security, and reporting. These regulations compel port authorities, shipping companies, and logistics providers to invest in comprehensive data standardization services to ensure compliance and avoid costly penalties. Moreover, the growing focus on sustainability and environmental compliance demands accurate tracking and reporting of vessel movements, emissions, and cargo handling, further fueling the need for reliable data standardization solutions.



    Technological advancements and the proliferation of cloud-based solutions are also catalyzing the expansion of the Port Call Data Standardization Services market. Cloud-based platforms offer scalability, flexibility, and cost-effectiveness, enabling maritime stakeholders to manage and standardize vast datasets efficiently. The integration of artificial intelligence (AI) and machine learning (ML) into data standardization processes is enhancing data cleansing, validation, and mapping capabilities, resulting in improved data quality and actionable insights. As digital transformation accelerates across the maritime sector, the adoption of advanced data standardization services is set to surge, driving sustained market growth through 2033.



    From a regional perspective, Asia Pacific continues to dominate the Port Call Data Standardization Services market, accounting for the largest share in 2024, followed by Europe and North America. The presence of major transshipment hubs, rapid port infrastructure development, and government initiatives to modernize maritime operations are key factors supporting market expansion in this region. Meanwhile, North America and Europe are witnessing significant investments in digitalization and compliance-focused solutions, driven by stringent regulatory frameworks and the need to enhance supply chain resilience. Emerging economies in Latin America and the Middle East & Africa are also progressively adopting data standardization services, albeit at a slower pace, as they modernize their port infrastructure and integrate into global trade networks.



    Service Type Analysis



    The Service Type segment in the Port Call Data Standardization Services market encompasses various specialized offerings, including data cleansing, data integration, data validation, data mapping, and other related services. Among these, data cleansing remains a foundational component, ensuring that port call data is accurate, free from duplicates, and devoid of inconsistencies. As maritime operations generate vast volumes of data from multiple sources, the risk of errors and redundancies increases significantly. Data cleansing services play a crucial role in maintaining data quality, which is essential for operational efficiency, compliance, and informed decision-making. The increasing complexity of global shipping routes and the proliferation of digital documentation have further…

  3. Data from: The ODD protocol: A review and first update

    • qubeshub.org
    Updated Oct 25, 2018
    Cite
    Volker Grimm; Uta Berger; Donald DeAngelis; J. Polhill; Jarl Giske; Steven Railsback (2018). The ODD protocol: A review and first update [Dataset]. http://doi.org/10.25334/Q49428
    Dataset updated
    Oct 25, 2018
    Dataset provided by
    QUBES
    Authors
    Volker Grimm; Uta Berger; Donald DeAngelis; J. Polhill; Jarl Giske; Steven Railsback
    Description

    The ‘ODD’ (Overview, Design concepts, and Details) protocol was published in 2006 to standardize the published descriptions of individual-based and agent-based models (ABMs).

  4. Sea Surface Temperature (SST) Standard Deviation of Long-term Mean, 2000-2013 - Hawaii

    • catalog.data.gov
    • data.ioos.us
    • +2more
    Updated Jan 27, 2025
    + more versions
    Cite
    National Center for Ecological Analysis and Synthesis (NCEAS) (Point of Contact) (2025). Sea Surface Temperature (SST) Standard Deviation of Long-term Mean, 2000-2013 - Hawaii [Dataset]. https://catalog.data.gov/dataset/sea-surface-temperature-sst-standard-deviation-of-long-term-mean-2000-2013-hawaii
    Dataset updated
    Jan 27, 2025
    Dataset provided by
    National Center for Ecological Analysis and Synthesis (NCEAS) (Point of Contact)
    Area covered
    Hawaii
    Description

    Sea surface temperature (SST) plays an important role in a number of ecological processes and can vary over a wide range of time scales, from daily to decadal changes. SST influences primary production, species migration patterns, and coral health. If temperatures are anomalously warm for extended periods of time, drastic changes in the surrounding ecosystem can result, including harmful effects such as coral bleaching. This layer represents the standard deviation of SST (degrees Celsius) of the weekly time series from 2000-2013. Three SST datasets were combined to provide continuous coverage from 1985-2013. The concatenation applies bias adjustment derived from linear regression to the overlap periods of datasets, with the final representation matching the 0.05-degree (~5-km) near real-time SST product. First, a weekly composite, gap-filled SST dataset from the NOAA Pathfinder v5.2 SST 1/24-degree (~4-km), daily dataset (a NOAA Climate Data Record) for each location was produced following Heron et al. (2010) for January 1985 to December 2012. Next, weekly composite SST data from the NOAA/NESDIS/STAR Blended SST 0.1-degree (~11-km), daily dataset was produced for February 2009 to October 2013. Finally, a weekly composite SST dataset from the NOAA/NESDIS/STAR Blended SST 0.05-degree (~5-km), daily dataset was produced for March 2012 to December 2013. The standard deviation of the long-term mean SST was calculated by taking the standard deviation over all weekly data from 2000-2013 for each pixel.
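    The per-pixel statistic and the regression-based bias adjustment described above are simple to express in code. Below is a minimal sketch, assuming the weekly composites are already loaded as a (weeks, lat, lon) array; the array shapes, slice indices, and the toy "older product" are placeholders, not values from the actual files.

    ```python
    import numpy as np

    # Toy stand-in for the weekly composite SST time series described above,
    # shaped (weeks, lat, lon); real data would come from the Pathfinder and
    # Blended products named in the description.
    rng = np.random.default_rng(0)
    weekly_sst = 25 + rng.standard_normal((728, 50, 50))

    # The layer itself: per-pixel standard deviation of the weekly series.
    sst_std = np.nanstd(weekly_sst, axis=0)

    # Bias adjustment between two products, fitted by linear regression on an
    # overlap period (older product regressed onto the reference), as outlined.
    overlap_ref = weekly_sst[600:650, 0, 0]          # hypothetical reference series
    overlap_old = 0.98 * overlap_ref - 0.1           # hypothetical older product
    slope, intercept = np.polyfit(overlap_old, overlap_ref, 1)
    adjusted_old = slope * overlap_old + intercept   # mapped onto the reference
    ```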

  5. Cloud EHR Data Normalization Platforms Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). Cloud EHR Data Normalization Platforms Market Research Report 2033 [Dataset]. https://dataintelo.com/report/cloud-ehr-data-normalization-platforms-market
    Available download formats: pptx, pdf, csv
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Cloud EHR Data Normalization Platforms Market Outlook



    According to our latest research, the global Cloud EHR Data Normalization Platforms market size in 2024 reached USD 1.2 billion, reflecting robust adoption across healthcare sectors worldwide. The market is experiencing a strong growth trajectory, with a compound annual growth rate (CAGR) of 16.5% projected from 2025 to 2033. By the end of 2033, the market is expected to attain a value of approximately USD 4.3 billion. This expansion is primarily fueled by the rising demand for integrated healthcare data systems, the proliferation of electronic health records (EHRs), and the critical need for seamless interoperability between disparate healthcare IT systems.




    One of the principal growth factors driving the Cloud EHR Data Normalization Platforms market is the global healthcare sector's increasing focus on digitization and interoperability. As healthcare organizations strive to improve patient outcomes and operational efficiencies, the adoption of cloud-based EHR data normalization solutions has become essential. These platforms enable the harmonization of heterogeneous data sources, ensuring that clinical, administrative, and financial data are standardized across multiple systems. This standardization is critical for supporting advanced analytics, clinical decision support, and population health management initiatives. Moreover, the growing adoption of value-based care models is compelling healthcare providers to invest in technologies that facilitate accurate data aggregation and reporting, further propelling market growth.




    Another significant growth catalyst is the rapid advancement in cloud computing technologies and the increasing availability of scalable, secure cloud infrastructure. Cloud EHR data normalization platforms leverage these technological advancements to offer healthcare organizations flexible deployment options, robust data security, and real-time access to normalized datasets. The scalability of cloud platforms allows healthcare providers to efficiently manage large volumes of data generated from diverse sources, including EHRs, laboratory systems, imaging centers, and wearable devices. Additionally, the integration of artificial intelligence and machine learning algorithms into these platforms enhances their ability to map, clean, and standardize data with greater accuracy and speed, resulting in improved clinical and operational insights.




    Regulatory and compliance requirements are also playing a pivotal role in shaping the growth trajectory of the Cloud EHR Data Normalization Platforms market. Governments and regulatory bodies across major regions are mandating the adoption of interoperable health IT systems to improve patient safety, data privacy, and care coordination. Initiatives such as the 21st Century Cures Act in the United States and similar regulations in Europe and Asia Pacific are driving healthcare organizations to implement advanced data normalization solutions. These platforms help ensure compliance with data standards such as HL7, FHIR, and SNOMED CT, thereby reducing the risk of data silos and enhancing the continuity of care. As a result, the market is witnessing increased investments from both public and private stakeholders aiming to modernize healthcare IT infrastructure.




    From a regional perspective, North America holds the largest share of the Cloud EHR Data Normalization Platforms market, driven by the presence of advanced healthcare infrastructure, high EHR adoption rates, and supportive regulatory frameworks. Europe follows closely, with significant investments in health IT modernization and interoperability initiatives. The Asia Pacific region is emerging as a high-growth market due to rising healthcare expenditures, expanding digital health initiatives, and increasing awareness about the benefits of data normalization. Latin America and the Middle East & Africa are also witnessing gradual adoption, supported by ongoing healthcare reforms and investments in digital health technologies. Collectively, these regional dynamics underscore the global momentum toward interoperable, cloud-based healthcare data ecosystems.



    Component Analysis



    The Cloud EHR Data Normalization Platforms market is segmented by component into software and services, each playing a distinct and critical role in driving the market's growth. Software solutions form the technological backbone of the market, enabling healthcare organizations to automate…

  6. Methods for normalizing microbiome data: an ecological perspective

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Oct 30, 2018
    Cite
    Donald T. McKnight; Roger Huerlimann; Deborah S. Bower; Lin Schwarzkopf; Ross A. Alford; Kyall R. Zenger (2018). Methods for normalizing microbiome data: an ecological perspective [Dataset]. http://doi.org/10.5061/dryad.tn8qs35
    Available download formats: zip
    Dataset updated
    Oct 30, 2018
    Dataset provided by
    University of New England
    James Cook University
    Authors
    Donald T. McKnight; Roger Huerlimann; Deborah S. Bower; Lin Schwarzkopf; Ross A. Alford; Kyall R. Zenger
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description
    1. Microbiome sequencing data often need to be normalized due to differences in read depths, and recommendations for microbiome analyses generally warn against using proportions or rarefying to normalize data, instead advocating alternatives such as upper quartile, CSS, edgeR-TMM, or DESeq-VS. Those recommendations are, however, based on studies that focused on differential abundance testing and variance standardization rather than community-level comparisons (i.e., beta diversity). Also, standardizing the within-sample variance across samples may suppress differences in species evenness, potentially distorting community-level patterns. Furthermore, the recommended methods use log transformations, which we expect to exaggerate the importance of differences among rare OTUs while suppressing the importance of differences among common OTUs.
    2. We tested these theoretical predictions via simulations and a real-world data set.
    3. Proportions and rarefying produced more accurate comparisons among communities and were the only methods that fully normalized read depths across samples. Additionally, upper quartile, CSS, edgeR-TMM, and DESeq-VS often masked differences among communities when common OTUs differed, and they produced false positives when rare OTUs differed.
    4. Based on our simulations, normalizing via proportions may be superior to other commonly used methods for comparing ecological communities.
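    For readers unfamiliar with the two normalizations the abstract favors, here is a minimal sketch of proportion normalization and rarefying on a toy OTU count table (samples x OTUs); the table is randomly generated, not drawn from the Dryad archive.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    counts = rng.integers(0, 50, size=(6, 200))   # toy OTU table: samples x OTUs

    # Proportions: divide each sample's counts by its total read depth.
    proportions = counts / counts.sum(axis=1, keepdims=True)

    # Rarefying: randomly subsample every sample down to the smallest read depth.
    depth = counts.sum(axis=1).min()

    def rarefy(sample, depth, rng):
        reads = np.repeat(np.arange(sample.size), sample)      # one entry per read
        picked = rng.choice(reads, size=depth, replace=False)  # draw w/o replacement
        return np.bincount(picked, minlength=sample.size)

    rarefied = np.vstack([rarefy(s, depth, rng) for s in counts])
    ```
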
  7. HHS Metadata Standard

    • healthdata.gov
    • data.virginia.gov
    • +1more
    csv, xlsx, xml
    Updated Jul 18, 2025
    Cite
    (2025). HHS Metadata Standard [Dataset]. https://healthdata.gov/HHS/HHS-Metadata-Standard/9g3v-hy22
    Available download formats: xml, csv, xlsx
    Dataset updated
    Jul 18, 2025
    License

    https://www.usa.gov/government-works

    Description

    HHS Metadata Standard: Version 1.0, published in July 2025, serves as the authoritative framework for defining HHS metadata—data about data—fields and attributes. Aligned with the Evidence Act and HealthData.gov, this standard establishes clear guidelines for metadata collection and public sharing across all data assets created, collected, managed, or maintained by HHS. It outlines required metadata fields for HHS datasets, ensuring consistency, interoperability, and discoverability in HHS data governance.

  8. UNI-CEN Standardized Census Data Table - Census Tract (CT) - 1976 - Long Format (DTA)

    • search.dataone.org
    Updated Dec 28, 2023
    + more versions
    Cite
    UNI-CEN Project (2023). UNI-CEN Standardized Census Data Table - Census Tract (CT) - 1976 - Long Format (DTA) (Version 2023-03) [Dataset]. http://doi.org/10.5683/SP3/RWMP14
    Dataset updated
    Dec 28, 2023
    Dataset provided by
    Borealis
    Authors
    UNI-CEN Project
    Time period covered
    Jan 1, 1976
    Description

    UNI-CEN Standardized Census Data Tables contain Census data that have been reformatted into a common table format with standardized variable names and codes. The data are provided in two tabular formats for different use cases. "Long" tables are suitable for use in statistical environments, while "wide" tables are commonly used in GIS environments. The long tables are provided in Stata Binary (dta) format, which is readable by all statistics software. The wide tables are provided in comma-separated values (csv) and dBase 3 (dbf) formats with codebooks. The wide tables are easily joined to the UNI-CEN Digital Boundary Files. For the csv files, a .csvt file is provided to ensure that column data formats are correctly formatted when importing into QGIS. A schema.ini file does the same when importing into ArcGIS environments. As the DBF file format supports a maximum of 250 columns, tables with a larger number of variables are divided into multiple DBF files. For more information about file sources, the methods used to create them, and how to use them, consult the documentation at https://borealisdata.ca/dataverse/unicen_docs. For more information about the project, visit https://observatory.uwo.ca/unicen.
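    As an illustration of how the two formats relate, the sketch below reads a long table with pandas and pivots it to wide form for GIS-style use. The file name and column names (geo_uid, variable, value) are assumptions for illustration, not the dataset's documented schema.

    ```python
    import pandas as pd

    # Hypothetical file name; a long table carries one row per (geography, variable).
    long_df = pd.read_stata("unicen_ct_1976_long.dta")

    # Pivot long -> wide: one row per geography, one column per variable.
    wide_df = long_df.pivot(index="geo_uid", columns="variable", values="value")

    # The wide form is what you would join to the UNI-CEN Digital Boundary Files.
    wide_df.to_csv("unicen_ct_1976_wide.csv")
    ```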

  9. Development Standard Variance

    • catalog.data.gov
    • data.montgomerycountymd.gov
    • +2more
    Updated Nov 22, 2025
    Cite
    data.montgomerycountymd.gov (2025). Development Standard Variance [Dataset]. https://catalog.data.gov/dataset/development-standard-variance
    Dataset updated
    Nov 22, 2025
    Dataset provided by
    data.montgomerycountymd.gov
    Description

    A variance is required when an applicant has submitted a proposed project to the Department of Permitting Services and it is determined that the construction, alteration or extension does not conform to the development standards (in the zoning ordinance) for the zone in which the subject property is located. A variance may be required in any zone and includes accessory structures as well as primary buildings or dwellings. Update frequency: daily.

  10. Data from: Development of Data Dictionary for neonatal intensive care unit: advancement towards a better critical care unit

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    • +1more
    zip
    Updated Dec 27, 2020
    Cite
    Harpreet Singh; Ravneet Kaur; Satish Saluja; Su Cho; Avneet Kaur; Ashish Pandey; Shubham Gupta; Ritu Das; Praveen Kumar; Jonathan Palma; Gautam Yadav; Yao Sun (2020). Development of Data Dictionary for neonatal intensive care unit: advancement towards a better critical care unit [Dataset]. http://doi.org/10.5061/dryad.zkh18936f
    Available download formats: zip
    Dataset updated
    Dec 27, 2020
    Dataset provided by
    KLKH
    CHIL
    UCSF Benioff Children's Hospital
    Apollo Cradle For Women & Children
    Sir Ganga Ram Hospital
    Post Graduate Institute of Medical Education and Research
    Lucile Packard Children's Hospital
    Indraprastha Institute of Information Technology Delhi
    Ewha Womans University
    Authors
    Harpreet Singh; Ravneet Kaur; Satish Saluja; Su Cho; Avneet Kaur; Ashish Pandey; Shubham Gupta; Ritu Das; Praveen Kumar; Jonathan Palma; Gautam Yadav; Yao Sun
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Background: Critical care units (CCUs) with wide use of various monitoring devices generate massive data. To utilize the valuable information of these devices; data are collected and stored using systems like Clinical Information System (CIS), Laboratory Information Management System (LIMS), etc. These systems are proprietary in nature, allow limited access to their database and have vendor specific clinical implementation. In this study we focus on developing an open source web-based meta-data repository for CCU representing stay of patient with relevant details.

    Methods: After developing the web-based open source repository, we analyzed prospective data from two sites for four months for data quality dimensions (completeness, timeliness, validity, accuracy and consistency), morbidity and clinical outcomes. We used a regression model to highlight the significance of practice variations linked with various quality indicators.

    Results: A data dictionary (DD) with 1447 fields (90.39% categorical and 9.6% text fields) is presented to cover the clinical workflow of the NICU. The overall quality of 1795 patient-days of data with respect to standard quality dimensions is 87%. The data exhibit 82% completeness, 97% accuracy, 91% timeliness and 94% validity in terms of representing CCU processes, but score only 67% in terms of consistency. Furthermore, quality indicators and practice variations are strongly correlated (p-value < 0.05).

    Conclusion: This study documents DD for standardized data collection in CCU. This provides robust data and insights for audit purposes and pathways for CCU to target practice improvements leading to specific quality improvements.

  11. Simulation Data Set

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Simulation Data Set [Dataset]. https://catalog.data.gov/dataset/simulation-data-set
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR), as in the actual application to the true NC birth records data. The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way, while the medians and IQRs are not given. This further protects the identifiability of the spatial locations used in the analysis.

    This dataset is not publicly accessible because EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). It contains information about human research subjects, and because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual-level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. File format: R workspace file, "Simulated_Dataset.RData".

    Metadata (including data dictionary):
    • y: Vector of binary responses (1: adverse outcome, 0: control)
    • x: Matrix of covariates; one row for each simulated individual
    • z: Matrix of standardized pollution exposures
    • n: Number of simulated individuals
    • m: Number of exposure time periods (e.g., weeks of pregnancy)
    • p: Number of columns in the covariate design matrix
    • alpha_true: Vector of "true" critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

    Code: We provide R statistical software code ("CWVS_LMC.txt") to fit the linear model of coregionalization (LMC) version of the Critical Window Variable Selection (CWVS) method developed in the manuscript, and R code ("Results_Summary.txt") to summarize and plot the estimated critical windows and posterior marginal inclusion probabilities. Once the "Simulated_Dataset.RData" workspace has been loaded into R, "CWVS_LMC.txt" can be used to identify/estimate critical windows of susceptibility and posterior marginal inclusion probabilities; once it has completed, "Results_Summary.txt" summarizes and plots them (similar to the plots shown in the manuscript).

    Required R packages:
    • For "CWVS_LMC.txt" - msm (sampling from the truncated normal distribution), mnormt (sampling from the multivariate normal distribution), BayesLogit (sampling from the Polya-Gamma distribution)
    • For "Results_Summary.txt" - plotrix (plotting the posterior means and credible intervals)

    Reproducibility: The data and code can be used to identify/estimate critical windows from one of the actual simulated datasets generated under setting E4 from the presented simulation study. To replicate: load the "Simulated_Dataset.RData" workspace, run the code contained in "CWVS_LMC.txt", and once it is complete, run "Results_Summary.txt".

    Data: The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

    Availability: Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This also allows the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics and requires an appropriate data use agreement.

    This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, Oxford, UK, 1-30, (2019).
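    The weekly median/IQR standardization described above is straightforward to apply to new data. A minimal NumPy sketch, with a randomly generated exposure matrix standing in for the real (restricted-access) exposures:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    exposures = rng.gamma(2.0, 5.0, size=(500, 37))   # toy: individuals x weeks

    # Standardize each week separately: subtract that week's median exposure
    # and divide by that week's interquartile range, as the description states.
    med = np.median(exposures, axis=0)
    q75, q25 = np.percentile(exposures, [75, 25], axis=0)
    z = (exposures - med) / (q75 - q25)   # analogous to the provided z matrix
    ```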

  12. Data from: FLiPPR: A Processor for Limited Proteolysis (LiP) Mass Spectrometry Data Sets Built on FragPipe

    • acs.figshare.com
    xlsx
    Updated May 24, 2024
    + more versions
    Cite
    Edgar Manriquez-Sandoval; Joy Brewer; Gabriela Lule; Samanta Lopez; Stephen D. Fried (2024). FLiPPR: A Processor for Limited Proteolysis (LiP) Mass Spectrometry Data Sets Built on FragPipe [Dataset]. http://doi.org/10.1021/acs.jproteome.3c00887.s003
    Available download formats: xlsx
    Dataset updated
    May 24, 2024
    Dataset provided by
    ACS Publications
    Authors
    Edgar Manriquez-Sandoval; Joy Brewer; Gabriela Lule; Samanta Lopez; Stephen D. Fried
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Here, we present FLiPPR, or FragPipe LiP (limited proteolysis) Processor, a tool that facilitates the analysis of data from limited proteolysis mass spectrometry (LiP-MS) experiments following primary search and quantification in FragPipe. LiP-MS has emerged as a method that can provide proteome-wide information on protein structure and has been applied to a range of biological and biophysical questions. Although LiP-MS can be carried out with standard laboratory reagents and mass spectrometers, analyzing the data can be slow and poses unique challenges compared to typical quantitative proteomics workflows. To address this, we leverage FragPipe and then process its output in FLiPPR. FLiPPR formalizes a specific data imputation heuristic that carefully uses missing data in LiP-MS experiments to report on the most significant structural changes. Moreover, FLiPPR introduces a data merging scheme and a protein-centric multiple hypothesis correction scheme, enabling processed LiP-MS data sets to be more robust and less redundant. These improvements strengthen statistical trends when previously published data are reanalyzed with the FragPipe/FLiPPR workflow. We hope that FLiPPR will lower the barrier for more users to adopt LiP-MS, standardize statistical procedures for LiP-MS data analysis, and systematize output to facilitate eventual larger-scale integration of LiP-MS data.
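    The description does not spell out FLiPPR's protein-centric multiple hypothesis correction, so the sketch below shows only the general idea: applying a Benjamini-Hochberg adjustment within each protein's set of peptide-level p-values. This is a generic stand-in with a hypothetical data frame, not FLiPPR's actual procedure.

    ```python
    import numpy as np
    import pandas as pd

    # Hypothetical peptide-level results: protein ID plus raw p-value per peptide.
    df = pd.DataFrame({
        "protein": ["P1"] * 4 + ["P2"] * 3,
        "pvalue": [0.001, 0.04, 0.20, 0.55, 0.003, 0.01, 0.30],
    })

    def bh(p):
        """Benjamini-Hochberg adjusted p-values for one group of tests."""
        p = np.asarray(p, dtype=float)
        order = np.argsort(p)
        adj = p[order] * len(p) / (np.arange(len(p)) + 1)
        adj = np.minimum.accumulate(adj[::-1])[::-1]   # enforce monotonicity
        out = np.empty_like(adj)
        out[order] = np.clip(adj, 0, 1)
        return out

    # Protein-centric: correct within each protein rather than across the table.
    df["p_adj"] = df.groupby("protein")["pvalue"].transform(bh)
    ```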

  13. pooled standard data

    • funginet.hki-jena.de
    zip
    Updated May 3, 2021
    Cite
    Daniela Albrecht-Eckardt (2021). pooled standard data [Dataset]. https://funginet.hki-jena.de/data_files/161
    Available download formats: zip (134 MB)
    Dataset updated
    May 3, 2021
    Authors
    Daniela Albrecht-Eckardt
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Description not specified.

  14. Data from: A concentration-based approach to data classification for choropleth mapping

    • figshare.com
    • tandf.figshare.com
    txt
    Updated May 31, 2023
    Cite
    Robert G. Cromley; Shuowei Zhang; Natalia Vorotyntseva (2023). A concentration-based approach to data classification for choropleth mapping [Dataset]. http://doi.org/10.6084/m9.figshare.1456086.v2
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Robert G. Cromley; Shuowei Zhang; Natalia Vorotyntseva
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The choropleth map is a device used for the display of socioeconomic data associated with an areal partition of geographic space. Cartographers emphasize the need to standardize any raw count data by an area-based total before displaying the data in a choropleth map. The standardization process converts the raw data from an absolute measure into a relative measure. However, there is recognition that the standardizing process does not enable the map reader to distinguish between low–low and high–high numerator/denominator differences. This research uses concentration-based classification schemes using Lorenz curves to address some of these issues. A test data set of nonwhite birth rate by county in North Carolina is used to demonstrate how this approach differs from traditional mean–variance-based systems such as the Jenks’ optimal classification scheme.
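    To make the idea concrete, here is one way a concentration-based scheme can derive class breaks from a Lorenz curve over numerator/denominator pairs. This is a sketch of the general approach with randomly generated county data, not the authors' exact algorithm.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    denom = rng.integers(500, 5000, size=100)              # e.g., births per county
    num = rng.binomial(denom, rng.uniform(0.1, 0.5, 100))  # e.g., nonwhite births

    # Sort counties by rate and accumulate the numerator share: the Lorenz curve.
    rate = num / denom
    order = np.argsort(rate)
    cum_share = np.cumsum(num[order]) / num.sum()

    # Place class breaks where the curve crosses equal concentration intervals,
    # so each class holds roughly the same share of the numerator total.
    k = 5
    idx = np.searchsorted(cum_share, np.linspace(0, 1, k + 1)[1:-1])
    breaks = rate[order][idx]   # k - 1 break points -> k classes for the map
    ```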

  15. QoG Standard Dataset - The QoG Time-Series Dataset

    • researchdata.se
    • demo.researchdata.se
    Updated Aug 6, 2024
    + more versions
    Cite
    Jan Teorell; Aksel Sundström; Sören Holmberg; Bo Rothstein; Natalia Alvarado Pachon; Cem Mert Dalli (2024). QoG Standard Dataset - The QoG Time-Series Dataset [Dataset]. http://doi.org/10.18157/QoGStdJan22
    Dataset updated
    Aug 6, 2024
    Dataset provided by
    University of Gothenburg
    Authors
    Jan Teorell; Aksel Sundström; Sören Holmberg; Bo Rothstein; Natalia Alvarado Pachon; Cem Mert Dalli
    Time period covered
    1946 - 2021
    Description

    The QoG Institute is an independent research institute within the Department of Political Science at the University of Gothenburg. Overall, 30 researchers conduct and promote research on the causes, consequences and nature of Good Governance and the Quality of Government - that is, trustworthy, reliable, impartial, uncorrupted and competent government institutions.

    The main objective of our research is to address the theoretical and empirical problem of how political institutions of high quality can be created and maintained. A second objective is to study the effects of Quality of Government on a number of policy areas, such as health, the environment, social policy, and poverty.

    The QoG Standard Dataset is our largest dataset, consisting of more than 2,000 variables from sources related to the Quality of Government.

    In the QoG Standard CS dataset, data from and around 2018 is included. Data from 2018 is prioritized, however, if no data is available for a country for 2018, data for 2019 is included. If no data exists for 2019, data for 2017 is included, and so on up to a maximum of +/- 3 years.
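    The fallback rule described above is effectively a nearest-year search that prefers later years. A minimal sketch, assuming a per-country mapping from year to value (the data structure and values are hypothetical):

    ```python
    # Prefer 2018; otherwise try +1, -1, +2, -2, ... out to +/- 3 years,
    # matching the order described above (2019 before 2017, and so on).
    def pick_value(by_year, target=2018, max_offset=3):
        for off in range(max_offset + 1):
            candidates = [target] if off == 0 else [target + off, target - off]
            for year in candidates:
                if by_year.get(year) is not None:
                    return year, by_year[year]
        return None, None

    sweden = {2017: 0.93, 2019: 0.95}   # hypothetical country series
    print(pick_value(sweden))           # -> (2019, 0.95): 2019 beats 2017
    ```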

    In the QoG Standard TS dataset, data from 1946 to 2021 is included and the unit of analysis is country-year (e.g., Sweden-1946, Sweden-1947, etc.).

    Historical countries are in most cases denoted with a to-date (e.g., Ethiopia (-1992)) and a from-date (e.g., Ethiopia (1993-)).

  16. Data from: 2024 Standard Scenarios: A U.S. Electricity Sector Outlook

    • catalog.data.gov
    • data.openei.org
    • +1more
    Updated Jan 23, 2025
    + more versions
    Cite
    National Renewable Energy Laboratory (NREL) (2025). 2024 Standard Scenarios: A U.S. Electricity Sector Outlook [Dataset]. https://catalog.data.gov/dataset/2024-standard-scenarios-a-u-s-electricity-sector-outlook
    Dataset updated
    Jan 23, 2025
    Dataset provided by
    National Renewable Energy Laboratory (NREL)
    Area covered
    United States
    Description

    This data corresponds to the 2024 Standard Scenarios report, which contains a suite of forward-looking scenarios of the possible evolution of the U.S. electricity sector through 2050. These files contain modeled projections of the future. Although we strive to capture relevant phenomena as comprehensively as possible, the models used to create this data are unavoidably imperfect, and the future is highly uncertain. Consequently, this data should not be the sole basis for making decisions. In addition to drawing from multiple scenarios within this set, we encourage analysts to also draw on projections from other sources, to benefit from diverse analytical frameworks and perspectives when forming their conclusions about the future of the power sector. For further discussions about the limitations of the models underlying this data, see section 1.4 of the "ReEDS Documentation" linked below. For scenario descriptions, input assumptions, and metric definitions for the data in these files, see the "2024 Standard Scenarios Report" linked below.

  17. Government data standard education training course materials

    • data.gov.tw
    pdf
    Updated Oct 31, 2025
    Cite
    Ministry of Digital Affairs (2025). Government data standard education training course materials [Dataset]. https://data.gov.tw/en/datasets/175308
    Available download formats: pdf
    Dataset updated
    Oct 31, 2025
    Dataset authored and provided by
    Ministry of Digital Affairs
    License

    https://data.gov.tw/license

    Description

    This dataset contains the teaching materials used in the data standard-related education and training courses organized by the Digital Development Department, covering topics such as data field standardization and the formulation of domain data standards. The teaching materials include presentation files, teaching cases, and reference manuals for the study and application of data quality improvement by government personnel.

  18. Data from: Monthly precipitation data from a network of standard gauges at the Jornada Experimental Range (Jornada Basin LTER) in southern New Mexico, January 1916 - ongoing

    • catalog.data.gov
    • datasetcatalog.nlm.nih.gov
    • +5more
    Updated Apr 21, 2025
    + more versions
    Cite
    Agricultural Research Service (2025). Monthly precipitation data from a network of standard gauges at the Jornada Experimental Range (Jornada Basin LTER) in southern New Mexico, January 1916 - ongoing [Dataset]. https://catalog.data.gov/dataset/monthly-precipitation-data-from-a-network-of-standard-gauges-at-the-jornada-experimental-r-f331c
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    This ongoing dataset contains monthly precipitation measurements from a network of standard can rain gauges at the Jornada Experimental Range in Dona Ana County, New Mexico, USA. Precipitation physically collects within the gauges during the month and is manually measured with a graduated cylinder at the end of each month. The network is maintained by USDA Agricultural Research Service personnel. The dataset includes 39 different locations, but only 29 of them are current. Other precipitation data exist for this area, including event-based tipping-bucket data with timestamps, but they do not go as far back in time as this dataset. Resources in this dataset: web page with information and links to data files for download, https://portal.edirepository.org/nis/mapbrowse?scope=knb-lter-jrn&identifier=210380001

  19. United States CSI: Expected Inflation: Next 5 Yrs: Standard Deviation

    • ceicdata.com
    Cite
    CEICdata.com, United States CSI: Expected Inflation: Next 5 Yrs: Standard Deviation [Dataset]. https://www.ceicdata.com/en/united-states/consumer-sentiment-index-unemployment-interest-rates-prices-and-government-expectations/csi-expected-inflation-next-5-yrs-standard-deviation
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Apr 1, 2017 - Mar 1, 2018
    Area covered
    United States
    Description

    United States CSI: Expected Inflation: Next 5 Yrs: Standard Deviation data was reported at 2.500 % in May 2018. This stayed constant from the previous number of 2.500 % for Apr 2018. United States CSI: Expected Inflation: Next 5 Yrs: Standard Deviation data is updated monthly, averaging 3.200 % from Feb 1979 (Median) to May 2018, with 380 observations. The data reached an all-time high of 10.900 % in Feb 1980 and a record low of 2.200 % in Apr 1999. United States CSI: Expected Inflation: Next 5 Yrs: Standard Deviation data remains in active status in CEIC and is reported by the University of Michigan. The data is categorized under Global Database's USA – Table US.H030: Consumer Sentiment Index: Unemployment, Interest Rates, Prices and Government Expectations. The questions were: 'What about the outlook for prices over the next 5 to 10 years? Do you think prices will be higher, about the same, or lower, 5 to 10 years from now?' and 'By about what percent per year do you expect prices to go up or down, on the average, during the next 5 to 10 years?'

  20. US Age-Standardized Stroke Mortality Rates

    • kaggle.com
    zip
    Updated Jan 12, 2023
    Cite
    The Devastator (2023). US Age-Standardized Stroke Mortality Rates [Dataset]. https://www.kaggle.com/datasets/thedevastator/us-age-standardized-stroke-mortality-rates-2013
    Available download formats: zip (894,260 bytes)
    Dataset updated
    Jan 12, 2023
    Authors
    The Devastator
    Area covered
    United States
    Description

    US Age-Standardized Stroke Mortality Rates (2013-15) by State/County/Gender/Race

    Investigating Variations in Rates

    By US Open Data Portal, data.gov [source]

    About this dataset

    This dataset contains the age-standardized stroke mortality rate in the United States from 2013 to 2015, by state/territory, county, gender and race/ethnicity. The data source is the highly respected National Vital Statistics System. The rates are reported as a 3-year average and have been age-standardized; county rates are additionally spatially smoothed. The interactive map of heart disease and stroke produced from this dataset shows geographic disparities in stroke mortality across America at the county, state/territory and national scales. Using the adjustable filter settings, you can explore demographic detail such as gender (Male/Female) or race/ethnicity (e.g., Non-Hispanic White) and identify where stroke mortality remains highest, informing public health action plans. Updated regularly since 2020-02-26.


    How to use the dataset

    The US Age-Standardized Stroke Mortality Rates (2013-2015) by State/County/Gender/Race dataset provides valuable insights into stroke mortality rates among adults aged 35 and over in the USA between 2013 and 2015. The dataset contains age-standardized data from the National Vital Statistics System at the state, county, gender, and race level. Use this guide to learn how best to use it for your purposes.

    Understand the Data

    This dataset provides information about stroke mortality rates among adult Americans aged 35+. The data are collected from 2013 to 2015 as three-year averages. Even though it is possible to view county-level data, spatial smoothing techniques have been applied. The following columns are provided:
    - Year – the year of the data collection
    - LocationAbbr – the abbreviation of the location where the data was collected
    - LocationDesc – a description of this location
    - GeographicLevel – the geographic level of granularity at which these numbers are recorded
    - DataSource – the source of these statistics
    - Class – the class or group into which these stats fall
    - Topic – the overall topic on which we have stats
    - Data_Value – the age-standardized value associated with each row
    - Data_Value_Unit – the units associated with each value
    - Stratification1 – the first stratification defined for a given row
    - Stratification2 – the second stratification defined for a given row

    Additionally, several other footnote fields, such as 'Data_Value_Type', 'Data_Value_Footnote_Symbol', 'StratificationCategory1' and 'StratificationCategory2', may be present.

    Exploring Correlations

    Now that you understand what the individual columns mean, you can analyze correlations within different categories using standard statistical methods such as linear regression or boxplots. To compare different regions, use the LocationAbbr column together with coarser geographic levels such as State or Region. Alternatively, for comparisons across genders, refer to the column labelled Stratification1 along with the desired values within that column, as in the sketch below.
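    A minimal pandas sketch of such a comparison, assuming the column names listed above; the file name, the GeographicLevel value "State", and the gender coding inside Stratification1 are assumptions about how the rows are labelled:

    ```python
    import pandas as pd

    df = pd.read_csv("csv-1.csv")   # the file listed under Columns below

    # Average stroke mortality per state, split by the gender stratum.
    state = df[df["GeographicLevel"] == "State"]
    by_gender = (state.groupby(["LocationAbbr", "Stratification1"])["Data_Value"]
                      .mean()
                      .unstack("Stratification1"))
    print(by_gender.head())   # one row per state, one column per stratum
    ```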

    Research Ideas

    • Creating a visualization to show the relationship between stroke mortality and specific variations in race/ethnicity, gender, and geography.
    • Comparing two or more states based on their average stroke mortality rate over time.
    • Building a predictive model that disregards temporal biases to anticipate further changes in stroke mortality for certain communities or entire states across the US.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors and the data source (US Open Data Portal, data.gov).

    License

    Unknown License - Please check the dataset description for more information.

    Columns

    File: csv-1.csv (column names and descriptions as listed under "Understand the Data" above)
