22 datasets found
  1. Normalizing Service Report

    • marketreportanalytics.com
    doc, pdf, ppt
    Updated Apr 2, 2025
    Cite
    Market Report Analytics (2025). Normalizing Service Report [Dataset]. https://www.marketreportanalytics.com/reports/normalizing-service-53022
    Available download formats: pdf, ppt, doc
    Dataset updated
    Apr 2, 2025
    Dataset authored and provided by
    Market Report Analytics
    License

    https://www.marketreportanalytics.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global Normalizing Service market is experiencing robust growth, driven by increasing demand for [Insert specific drivers based on your knowledge of the Normalizing Service market, e.g., improved data quality, enhanced data analysis capabilities, rising adoption of cloud-based solutions, stringent data governance regulations]. The market is segmented by application [Insert specific applications, e.g., healthcare, finance, manufacturing] and type [Insert specific types of Normalizing Services, e.g., data cleansing, data transformation, data integration]. While precise market sizing data is unavailable, based on industry trends and comparable markets with similar growth trajectories, a reasonable estimate for the 2025 market size could be placed in the range of $500-750 million USD, with a Compound Annual Growth Rate (CAGR) of approximately 15-20% projected from 2025 to 2033. This growth is expected to be fueled by the continued expansion of big data analytics and the rising need for data standardization across diverse industries. However, challenges such as data security concerns, integration complexities, and high initial investment costs can act as potential restraints on market expansion. Regional analysis suggests a strong presence across North America and Europe, driven by early adoption and robust technological infrastructure. Asia-Pacific is poised for significant growth in the coming years due to increasing digitalization and expanding data centers. The market is highly competitive, with a mix of established players and emerging technology companies vying for market share. Successful players will need to differentiate their offerings through specialized solutions, strategic partnerships, and a focus on addressing specific industry needs. Future growth will depend on advancements in AI and machine learning technologies, further integration with cloud platforms, and the development of user-friendly, scalable solutions.

  2. Data from: LVMED: Dataset of Latvian text normalisation samples for the...

    • repository.clarin.lv
    Updated May 30, 2023
    Cite
    Viesturs Jūlijs Lasmanis; Normunds Grūzītis (2023). LVMED: Dataset of Latvian text normalisation samples for the medical domain [Dataset]. https://repository.clarin.lv/repository/xmlui/handle/20.500.12574/85
    Dataset updated
    May 30, 2023
    Authors
    Viesturs Jūlijs Lasmanis; Normunds Grūzītis
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    The CSV dataset contains sentence pairs for a text-to-text transformation task: given a sentence that contains 0..n abbreviations, rewrite (normalize) the sentence in full words (word forms).

    Training dataset: 64,665 sentence pairs. Validation dataset: 7,185 sentence pairs. Testing dataset: 7,984 sentence pairs.

    All sentences are extracted from a public web corpus (https://korpuss.lv/id/Tīmeklis2020) and contain at least one medical term.

  3. THINGS-data: MRI transformation between spaces

    • plus.figshare.com
    zip
    Updated May 21, 2024
    Cite
    Martin Hebart; Oliver Contier; Lina Teichmann; Adam Rockter; Charles Zheng; Alexis Kidder; Anna Corriveau; Maryam Vaziri-Pashkam; Chris Baker (2024). THINGS-data: MRI transformation between spaces [Dataset]. http://doi.org/10.25452/figshare.plus.25868785.v1
    Available download formats: zip
    Dataset updated
    May 21, 2024
    Dataset provided by
    Figshare+
    Authors
    Martin Hebart; Oliver Contier; Lina Teichmann; Adam Rockter; Charles Zheng; Alexis Kidder; Anna Corriveau; Maryam Vaziri-Pashkam; Chris Baker
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset contains files produced by fMRIPrep that make it possible to transform the fMRI data between different spaces. For instance, results obtained in a subject's individual anatomical space can be transformed into the MNI standard space, allowing comparison of results between subjects or even with other datasets. Part of THINGS-data: A multimodal collection of large-scale datasets for investigating object representations in brain and behavior. See related materials in Collection at: https://doi.org/10.25452/figshare.plus.c.6161151

  4. Change Management (Normalized)

    • dataverse.harvard.edu
    Updated May 6, 2025
    Cite
    Diomar Anez; Dimar Anez (2025). Change Management (Normalized) [Dataset]. http://doi.org/10.7910/DVN/J5KRBS
    Dataset updated
    May 6, 2025
    Dataset provided by
    Harvard Dataverse
    Authors
    Diomar Anez; Dimar Anez
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset provides processed and normalized/standardized indices for the management tool 'Change Management' (often encompassing Change Management Programs). Derived from five distinct raw data sources, these indices are specifically designed for comparative longitudinal analysis, enabling the examination of trends and relationships across different empirical domains (web search, literature, academic publishing, and executive adoption). The data presented here represent transformed versions of the original source data, aimed at achieving metric comparability. Users requiring the unprocessed source data should consult the corresponding Change Management dataset in the Management Tool Source Data (Raw Extracts) Dataverse.

    Data Files and Processing Methodologies:

    Google Trends File (Prefix: GT_): Normalized Relative Search Interest (RSI)
    Input Data: Native monthly RSI values from Google Trends (Jan 2004 - Jan 2025) for the query "change management programs" + "change management" + "change management business".
    Processing: None. Utilizes the original base-100 normalized Google Trends index.
    Output Metric: Monthly Normalized RSI (Base 100).
    Frequency: Monthly.

    Google Books Ngram Viewer File (Prefix: GB_): Normalized Relative Frequency
    Input Data: Annual relative frequency values from Google Books Ngram Viewer (1950-2022, English corpus, no smoothing) for the query Change Management Programs + Change Management.
    Processing: Annual relative frequency series normalized (peak year = 100).
    Output Metric: Annual Normalized Relative Frequency Index (Base 100).
    Frequency: Annual.

    Crossref.org File (Prefix: CR_): Normalized Relative Publication Share Index
    Input Data: Absolute monthly publication counts matching Change Management-related keywords [("change management programs" OR ...) AND (...) - see raw data for full query] in titles/abstracts (1950-2025), alongside total monthly Crossref publications. Deduplicated via DOIs.
    Processing: Monthly relative share calculated (Change Mgmt Count / Total Count). Monthly relative share series normalized (peak month's share = 100).
    Output Metric: Monthly Normalized Relative Publication Share Index (Base 100).
    Frequency: Monthly.

    Bain & Co. Survey - Usability File (Prefix: BU_): Normalized Usability Index
    Input Data: Original usability percentages (%) from Bain surveys for specific years: Change Management Programs (2002, 2004, 2010, 2012, 2014, 2017, 2022).
    Processing: Normalization: Original usability percentages normalized relative to its historical peak (Max % = 100).
    Output Metric: Biennial Estimated Normalized Usability Index (Base 100 relative to historical peak).
    Frequency: Biennial (Approx.).

    Bain & Co. Survey - Satisfaction File (Prefix: BS_): Standardized Satisfaction Index
    Input Data: Original average satisfaction scores (1-5 scale) from Bain surveys for specific years: Change Management Programs (2002-2022).
    Processing: Standardization (Z-scores): Using Z = (X - 3.0) / 0.891609. Index Scale Transformation: Index = 50 + (Z * 22).
    Output Metric: Biennial Standardized Satisfaction Index (Center=50, Range ≈ [1,100]).
    Frequency: Biennial (Approx.).

    File Naming Convention: Files generally follow the pattern: PREFIX_Tool_Processed.csv or similar, where the PREFIX indicates the data source (GT_, GB_, CR_, BU_, BS_). Consult the parent Dataverse description (Management Tool Comparative Indices) for general context and the methodological disclaimer. For original extraction details (specific keywords, URLs, etc.), refer to the corresponding Change Management dataset in the Raw Extracts Dataverse. Comprehensive project documentation provides full details on all processing steps.
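
    The two processing rules quoted in this description (peak = 100 normalization and the Z-score based satisfaction index) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the sample usability values are hypothetical, and only the constants 3.0, 0.891609, 50, and 22 come from the description.

```python
def peak_normalize(values):
    """Rescale a series so that its maximum equals 100 (peak = 100)."""
    peak = max(values)
    if peak == 0:
        raise ValueError("cannot normalize a series whose peak is 0")
    return [100.0 * v / peak for v in values]


def satisfaction_index(score, center=3.0, sd=0.891609, scale=22.0):
    """Apply Z = (X - 3.0) / 0.891609, then Index = 50 + (Z * 22)."""
    z = (score - center) / sd
    return 50.0 + z * scale


# Hypothetical usability percentages, for illustration only.
usability = [40.0, 55.0, 80.0, 64.0]
print(peak_normalize(usability))   # [50.0, 68.75, 100.0, 80.0]
print(satisfaction_index(3.0))     # 50.0 (a score of 3.0 sits at the centre)
```

    With these constants, the scale endpoints 1 and 5 map to roughly 0.65 and 99.35, which is consistent with the stated index range.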

  5. Table_1_Overview of data preprocessing for machine learning applications in...

    • frontiersin.figshare.com
    • figshare.com
    xlsx
    Updated Oct 6, 2023
    Cite
    Eliana Ibrahimi; Marta B. Lopes; Xhilda Dhamo; Andrea Simeon; Rajesh Shigdel; Karel Hron; Blaž Stres; Domenica D’Elia; Magali Berland; Laura Judith Marcos-Zambrano (2023). Table_1_Overview of data preprocessing for machine learning applications in human microbiome research.XLSX [Dataset]. http://doi.org/10.3389/fmicb.2023.1250909.s002
    Available download formats: xlsx
    Dataset updated
    Oct 6, 2023
    Dataset provided by
    Frontiers
    Authors
    Eliana Ibrahimi; Marta B. Lopes; Xhilda Dhamo; Andrea Simeon; Rajesh Shigdel; Karel Hron; Blaž Stres; Domenica D’Elia; Magali Berland; Laura Judith Marcos-Zambrano
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Although metagenomic sequencing is now the preferred technique to study microbiome-host interactions, analyzing and interpreting microbiome sequencing data presents challenges primarily attributed to the statistical specificities of the data (e.g., sparse, over-dispersed, compositional, inter-variable dependency). This mini review explores preprocessing and transformation methods applied in recent human microbiome studies to address microbiome data analysis challenges. Our results indicate a limited adoption of transformation methods targeting the statistical characteristics of microbiome sequencing data. Instead, there is a prevalent usage of relative and normalization-based transformations that do not specifically account for the specific attributes of microbiome data. The information on preprocessing and transformations applied to the data before analysis was incomplete or missing in many publications, leading to reproducibility concerns, comparability issues, and questionable results. We hope this mini review will provide researchers and newcomers to the field of human microbiome research with an up-to-date point of reference for various data transformation tools and assist them in choosing the most suitable transformation method based on their research questions, objectives, and data characteristics.

  6. text-stats

    • huggingface.co
    Updated Dec 14, 2024
    Cite
    Alan Tseng (2024). text-stats [Dataset]. https://huggingface.co/datasets/agentlans/text-stats
    Dataset updated
    Dec 14, 2024
    Authors
    Alan Tseng
    Description

    Text statistics

    This dataset is a combination of the following datasets:

    • agentlans/text-quality-v2
    • agentlans/readability
    • agentlans/twitter-sentiment-meta-analysis

    The main purpose is to collect the large data into one place for easy training and evaluation.

    Data Preparation and Transformation

    Quality Score Normalization

    The dataset was enhanced with additional columns, and quality scores (n = 909 533) were normalized using Ordered Quantile… See the full description on the dataset page: https://huggingface.co/datasets/agentlans/text-stats.
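
    The description above is truncated, so the exact procedure is not stated here. Assuming it refers to ordered quantile (rank-based inverse normal) normalization, one common formulation maps each value's rank to a standard normal quantile. The sketch below is a hypothetical illustration of that formulation, not the dataset author's code; the function name and the (rank - 0.5) / n convention are assumptions.

```python
from statistics import NormalDist


def ordered_quantile_normalize(values):
    """Map each value to the standard normal quantile of (rank - 0.5) / n.

    Tied values receive the average of their ranks, so they map to the
    same output. This is an assumed formulation, for illustration only.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    normal = NormalDist()  # mean 0, standard deviation 1
    return [normal.inv_cdf((r - 0.5) / n) for r in ranks]


# Hypothetical quality scores, for illustration only.
print(ordered_quantile_normalize([3.0, 1.0, 2.0]))
```

    The transform keeps the ordering of the scores and forces their distribution towards a standard normal shape, whatever the original scale was.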

  7. Supplementary file including normalized data sets to reproduce the analyses...

    • data.ub.uni-muenchen.de
    Updated Nov 29, 2019
    Cite
    (2019). Supplementary file including normalized data sets to reproduce the analyses presented in the paper "Use of pre-transformation to cope with extreme values in important candidate features" by Boulesteix, Guillemot & Sauerbrei (Biometrical Journal, 2011) [Dataset]. http://doi.org/10.5282/ubm/data.39
    Dataset updated
    Nov 29, 2019
    Description

    The zip-file contains supplementary files (normalized data sets and R-codes) to reproduce the analyses presented in the paper "Use of pre-transformation to cope with extreme values in important candidate features" by Boulesteix, Guillemot & Sauerbrei (Biometrical Journal, 2011). The raw data (CEL-files) are publicly available and described in the following papers:

    - Ancona et al, 2006. On the statistical assessment of classifiers using DNA microarray data. BMC Bioinformatics 7, 387.
    - Miller et al, 2005. An expression signature for p53 status in human breast cancer predicts mutation status, transcriptional effects, and patient survival. Proceedings of the National Academy of Sciences 102, 13550–13555.
    - Minn et al, 2005. Genes that mediate breast cancer metastasis to lung. Nature 436, 518–524.
    - Pawitan et al, 2005. Gene expression profiling spares early breast cancer patients from adjuvant therapy: derived and validated in two population-based cohorts. Breast Cancer Research 7, R953–964.
    - Scherzer et al, 2007. Molecular markers of early Parkinson's disease based on gene expression in blood. Proceedings of the National Academy of Sciences 104, 955–960.
    - Singh et al, 2002. Gene expression correlates of clinical prostate cancer behavior. Cancer Cell 1, 203–209.
    - Sotiriou et al, 2006. Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis. Journal of the National Cancer Institute 98, 262–272.
    - Tang et al, 2009. Gene-expression profiling of peripheral blood mononuclear cells in sepsis. Critical Care Medicine 37, 882–888.
    - Wang et al, 2005. Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet 365, 671–679.
    - Irizarry, 2003. Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res 31 (4), e15.
    - Irizarry et al, 2006. Comparison of Affymetrix GeneChip expression measures. Bioinformatics 22 (7), 789–794.

  8. Brightfield images of rank-based transformation (RBT)

    • data.mendeley.com
    Updated Apr 17, 2025
    Cite
    Torbjörn Nordling (2025). Brightfield images of rank-based transformation (RBT) [Dataset]. http://doi.org/10.17632/r8n2kp2m8g.1
    Dataset updated
    Apr 17, 2025
    Authors
    Torbjörn Nordling
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    The Rank-Based Transformation (RBT) method combines histogram expansion and normalization to enhance image contrast. It works by ranking each pixel's intensity in ascending order, and then redistributing these ranks evenly across a specified intensity range. Pixels with identical intensities share the same rank, ensuring consistent mapping. This process produces an output image with uniformly spaced intensity levels, improving contrast while preserving relative intensity relationships.

    For more details, please check the included README.md or visit the GitHub for RBT algorithm (https://github.com/nordlinglab/RBT).
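
    The ranking-and-redistribution idea described above can be illustrated with a short sketch. This is one reading of the description (identical intensities share a rank, and ranks are spread evenly over the output range), not the code from the RBT repository; the function name, the flat-list image representation, and the default 0 to 255 range are assumptions.

```python
def rank_based_transform(pixels, out_min=0, out_max=255):
    """Spread the distinct intensity levels evenly over [out_min, out_max].

    `pixels` is a flat list of intensities. Equal intensities share one
    rank, so they always map to the same output value.
    """
    levels = sorted(set(pixels))          # distinct intensities, ascending
    if len(levels) == 1:
        return [out_min for _ in pixels]  # a flat image has no contrast to expand
    step = (out_max - out_min) / (len(levels) - 1)
    mapping = {value: round(out_min + rank * step)
               for rank, value in enumerate(levels)}
    return [mapping[p] for p in pixels]


# Hypothetical low-contrast pixel values, mapped onto a 0-100 range.
print(rank_based_transform([10, 10, 12, 50], 0, 100))   # [0, 0, 50, 100]
```

    The output levels are evenly spaced regardless of how tightly the input intensities were clustered, while the ordering of intensities is preserved.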

  9. Binary classification using a confusion matrix.

    • plos.figshare.com
    xls
    Updated Dec 6, 2024
    Cite
    Chantha Wongoutong (2024). Binary classification using a confusion matrix. [Dataset]. http://doi.org/10.1371/journal.pone.0310839.t002
    Available download formats: xls
    Dataset updated
    Dec 6, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Chantha Wongoutong
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Despite the popularity of k-means clustering, feature scaling before applying it can be an essential yet often neglected step. In this study, feature scaling via five methods (Z-score standardization, Min-Max normalization, Percentile transformation, Maximum absolute scaling, and RobustScaler) was compared with using the raw (i.e., non-scaled) data to analyze datasets having features with different or the same units via k-means clustering. The results of an experimental study show that, for features with different units, scaling them before k-means clustering provided better accuracy, precision, recall, and F-score values than using the raw data. Meanwhile, when features in the dataset had the same unit, scaling them beforehand provided results similar to those obtained with the raw data. Thus, scaling the features beforehand is a very important step for datasets with different units, as it improves the clustering results and accuracy. Of the five feature-scaling methods applied to the dataset with different units, Z-score standardization and Percentile transformation provided similar performances that were superior to those of the other methods and of the raw data. Although Maximum absolute scaling performed slightly better than the other scaling methods and the raw data when the dataset contained features with the same unit, the improvement was not significant.
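
    Four of the scaling methods named above have simple closed forms, sketched below for a single feature column. This is a minimal illustration using textbook definitions, not the study's code: the function names are hypothetical, the Z-score uses the population standard deviation, and the robust variant assumes median centring with interquartile-range scaling. Percentile transformation is omitted.

```python
from statistics import mean, median, pstdev, quantiles


def z_score(xs):
    """Centre on the mean and divide by the population standard deviation."""
    mu, sd = mean(xs), pstdev(xs)
    return [(x - mu) / sd for x in xs]


def min_max(xs):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def max_abs(xs):
    """Divide by the largest absolute value, giving values in [-1, 1]."""
    m = max(abs(x) for x in xs)
    return [x / m for x in xs]


def robust_scale(xs):
    """Centre on the median and divide by the interquartile range."""
    q1, _, q3 = quantiles(xs, n=4, method="inclusive")
    med = median(xs)
    return [(x - med) / (q3 - q1) for x in xs]


# A hypothetical feature column; constant columns would divide by zero.
feature = [1, 2, 3, 4, 5]
print(min_max(feature))        # [0.0, 0.25, 0.5, 0.75, 1.0]
print(robust_scale(feature))   # [-1.0, -0.5, 0.0, 0.5, 1.0]
```

    Distance-based methods such as k-means are sensitive to these choices because a feature measured in large units would otherwise dominate the Euclidean distance.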

  10. Using a Machine Learning Regression Approach to Predict the Aroma...

    • zenodo.org
    zip
    Updated Jan 25, 2024
    Cite
    Marvin Anker; Christine Borsum; Christian Krupitzer; Youfeng Zhang; Yanyan Zhang (2024). Using a Machine Learning Regression Approach to Predict the Aroma Partitioning in Diary Matrices - Accompanying material [Dataset]. http://doi.org/10.5281/zenodo.10566439
    Available download formats: zip
    Dataset updated
    Jan 25, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Marvin Anker; Christine Borsum; Christian Krupitzer; Youfeng Zhang; Yanyan Zhang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These files are accompanying material for our submission "Using a Machine Learning Regression Approach to Predict the Aroma Partitioning in Diary Matrices" to MDPI Processes:

    Aroma partitioning in food is a challenging area of research due to the contribution of several physical and chemical factors that affect the binding and release of aroma in food matrices. The partition coefficient, expressed as the Kmg value, describes how aroma compounds distribute themselves between a matrix and a gas phase, such as between different components of a food matrix and air. This study introduces a regression approach to predict the Kmg value of aroma compounds with a wide range of physicochemical properties in dairy matrices representing products of different compositions and/or processing. The approach consists of data cleaning, grouping based on the temperature of Kmg analysis, pre-processing (log transformation and normalization), and, finally, the development and evaluation of prediction models with regression methods. We compared regression analysis with linear regression (LR) to five machine-learning-based regression algorithms: Random Forest Regressor (RFR), Gradient Boosting Regression (GBR), Extreme Gradient Boosting (XGBoost, XGB), Support Vector Regression (SVR), and Artificial Neural Network Regression (NNR). Explainable AI (XAI) was used to calculate feature importance and therefore identify the features that mainly contribute to the prediction. The top three features that were identified are log P, specific gravity, and molecular weight. For the prediction of the Kmg in dairy matrices, R² scores of up to 0.99 were reached. For 37.0 °C, which resembles the temperature of the mouth, RFR delivered the best results, and, at the lower temperature of 7.0 °C, typical for a household fridge, XGB performed best. The results from the models work as a proof of concept and show the applicability of a data-driven approach with machine learning to predict the Kmg value of aroma compounds in different dairy matrices.

    We provided two folders with the results and scripts, described by documentation:

    • Results_Aroma_Regression (Feature_Importance, Histogram_Plots)
    • Scripts_Aroma_Regression (Pipeline.py, regr_simple.py, regr_preprocessed.py)

  11. The performance results for k-means clustering and testing the hypothesis...

    • plos.figshare.com
    xls
    Updated Dec 6, 2024
    Cite
    Chantha Wongoutong (2024). The performance results for k-means clustering and testing the hypothesis for homogeneity between the true grouped data and feature scaling on datasets containing features with different units. [Dataset]. http://doi.org/10.1371/journal.pone.0310839.t003
    Available download formats: xls
    Dataset updated
    Dec 6, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Chantha Wongoutong
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The performance results for k-means clustering and testing the hypothesis for homogeneity between the true grouped data and feature scaling on datasets containing features with different units.

  12. Additional file 6 of Benchmarking differential expression analysis tools for...

    • springernature.figshare.com
    txt
    Updated Jun 1, 2023
    Cite
    Thomas P. Quinn; Tamsyn M. Crowley; Mark F. Richardson (2023). Additional file 6 of Benchmarking differential expression analysis tools for RNA-Seq: normalization-based vs. log-ratio transformation-based methods [Dataset]. http://doi.org/10.6084/m9.figshare.6838784.v1
    Available download formats: txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    figshare
    Authors
    Thomas P. Quinn; Tamsyn M. Crowley; Mark F. Richardson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Simulated data benchmark performance. This table contains the precision and recall estimates for several methods as applied to the low and high variance simulated data. This table is used to make figures. (CSV 2945 kb)
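
    For reference, precision and recall of the kind reported in such benchmark tables are computed from counts of true positives, false positives, and false negatives. The sketch below uses the standard definitions; the function names and the example counts are hypothetical and are not taken from the benchmark.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from confusion-matrix counts.

    precision = TP / (TP + FP); recall = TP / (TP + FN).
    An empty denominator yields 0.0 rather than raising an error.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 correct calls, 2 false calls, 8 missed features.
p, r = precision_recall(tp=8, fp=2, fn=8)
print(p, r)   # 0.8 0.5
```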

  13. Additional file 4 of Benchmarking differential expression analysis tools for...

    • springernature.figshare.com
    txt
    Updated Jun 1, 2023
    Cite
    Thomas P. Quinn; Tamsyn M. Crowley; Mark F. Richardson (2023). Additional file 4 of Benchmarking differential expression analysis tools for RNA-Seq: normalization-based vs. log-ratio transformation-based methods [Dataset]. http://doi.org/10.6084/m9.figshare.6838757.v1
    Available download formats: txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Thomas P. Quinn; Tamsyn M. Crowley; Mark F. Richardson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Group labels for Rollins et al. data. This table contains the group labels for all samples from the Rollins et al. cane toad data set. (CSV 1 kb)

  14. Additional file 3 of Benchmarking differential expression analysis tools for...

    • springernature.figshare.com
    txt
    Updated Jun 1, 2023
    Cite
    Thomas P. Quinn; Tamsyn M. Crowley; Mark F. Richardson (2023). Additional file 3 of Benchmarking differential expression analysis tools for RNA-Seq: normalization-based vs. log-ratio transformation-based methods [Dataset]. http://doi.org/10.6084/m9.figshare.6838751.v1
    Explore at:
    txtAvailable download formats
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Thomas P. Quinn; Tamsyn M. Crowley; Mark F. Richardson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Estimated relative abundance for Rollins et al. data (by stsl method). This table contains the relative transcript abundances as estimated using the stsl procedure. (CSV 4579 kb)

  15. The AUC and time performance results of the model using raw data in...

    • plos.figshare.com
    xls
    Updated Jan 29, 2025
    Cite
    Mwenge Mulenga; Arutchelvan Rajamanikam; Suresh Kumar; Saharuddin bin Muhammad; Subha Bhassu; Chandramathi Samudid; Aznul Qalid Md Sabri; Manjeevan Seera; Christopher Ifeanyi Eke (2025). The AUC and time performance results of the model using raw data in comparison with combinations of chained normalization, rank transformation, and feature selection methods. [Dataset]. http://doi.org/10.1371/journal.pone.0316493.t004
    Available download formats: xls
    Dataset updated
    Jan 29, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Mwenge Mulenga; Arutchelvan Rajamanikam; Suresh Kumar; Saharuddin bin Muhammad; Subha Bhassu; Chandramathi Samudid; Aznul Qalid Md Sabri; Manjeevan Seera; Christopher Ifeanyi Eke
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The AUC and time performance results of the model using raw data in comparison with combinations of chained normalization, rank transformation, and feature selection methods.

  16. Data from: Cross-Normalization of MALDI Mass Spectrometry Imaging Data...

    • acs.figshare.com
    • figshare.com
    xlsx
    Updated Jun 5, 2023
    Cite
    Tobias Boskamp; Rita Casadonte; Lena Hauberg-Lotte; Sören Deininger; Jörg Kriegsmann; Peter Maass (2023). Cross-Normalization of MALDI Mass Spectrometry Imaging Data Improves Site-to-Site Reproducibility [Dataset]. http://doi.org/10.1021/acs.analchem.1c01792.s002
    Available download formats: xlsx
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    ACS Publications
    Authors
    Tobias Boskamp; Rita Casadonte; Lena Hauberg-Lotte; Sören Deininger; Jörg Kriegsmann; Peter Maass
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Matrix-assisted laser desorption/ionization mass spectrometry imaging (MALDI MSI) is an established tool for the investigation of formalin-fixed paraffin-embedded (FFPE) tissue samples and shows a high potential for applications in clinical research and histopathological tissue classification. However, the applicability of this method to serial clinical and pharmacological studies is often hampered by inevitable technical variation and limited reproducibility. We present a novel spectral cross-normalization algorithm that differs from the existing normalization methods in two aspects: (a) it is based on estimating the full statistical distribution of spectral intensities and (b) it involves applying a non-linear, mass-dependent intensity transformation to align this distribution with a reference distribution. This method is combined with a model-driven resampling step that is specifically designed for data from MALDI imaging of tryptic peptides. This method was performed on two sets of tissue samples: a single human teratoma sample and a collection of five tissue microarrays (TMAs) of breast and ovarian tumor tissue samples (N = 241 patients). The MALDI MSI data was acquired in two labs using multiple protocols, allowing us to investigate different inter-lab and cross-protocol scenarios, thus covering a wide range of technical variations. Our results suggest that the proposed cross-normalization significantly reduces such batch effects not only in inter-sample and inter-lab comparisons but also in cross-protocol scenarios. This demonstrates the feasibility of cross-normalization and joint data analysis even under conditions where preparation and acquisition protocols themselves are subject to variation.

  17. The results of proteomic LC-MS analysis of M. tuberculosis H37Rv in dormancy...

    • plos.figshare.com
    xlsx
    Updated Jun 16, 2023
    Cite
    Vadim Nikitushkin; Margarita Shleeva; Dmitry Loginov; Filip Dyčka F.; Jan Sterba; Arseny Kaprelyants (2023). The results of proteomic LC-MS analysis of M. tuberculosis H37Rv in dormancy and activity. [Dataset]. http://doi.org/10.1371/journal.pone.0269847.s001
    Available download formats: xlsx
    Dataset updated
    Jun 16, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Vadim Nikitushkin; Margarita Shleeva; Dmitry Loginov; Filip Dyčka F.; Jan Sterba; Arseny Kaprelyants
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The supplementary table contains: the raw intensities of the detected and annotated proteins (“Raw LC-MS data”); the results of log2 transformation and quantile normalization of the data (“Normalization” section); the results of differential analysis (“Results of differential analysis”); a reference list of antigenic proteins from published studies of sera from human TB patients (“Antigenic proteins” sheet); and a list of dormancy enzymes that are potential intracellular prodrug activators (“Enzymatic proteins” sheet). (XLSX)
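
The “Normalization” step named above (log2 transformation followed by quantile normalization) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `log2_quantile_normalize` and the toy matrix are hypothetical, rows are assumed to be proteins and columns samples, all intensities are assumed to be strictly positive, and ties and missing values are not handled.

```python
import numpy as np

def log2_quantile_normalize(raw):
    """Log2-transform a proteins x samples matrix, then quantile-normalize its columns."""
    logged = np.log2(np.asarray(raw, dtype=float))
    # Rank of each value within its own column (0-based; ties are broken by position).
    ranks = np.argsort(np.argsort(logged, axis=0), axis=0)
    # Reference distribution: mean of the column-wise sorted values, one value per rank.
    reference = np.sort(logged, axis=0).mean(axis=1)
    # Replace every value by the reference value at its rank.
    return reference[ranks]

# Hypothetical toy intensities: 3 proteins (rows) x 3 samples (columns).
toy = np.array([[2.0, 4.0, 8.0],
                [8.0, 16.0, 4.0],
                [32.0, 64.0, 16.0]])
normalized = log2_quantile_normalize(toy)
```

After this step every column contains the same set of values, so the samples share one intensity distribution while each protein keeps its within-sample rank.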

  18. Data_Sheet_1_Spatial normalization for voxel-based lesion symptom mapping:...

    • frontiersin.figshare.com
    pdf
    Updated Jan 17, 2024
    Cite
    Daniel Jühling; Deepthi Rajashekar; Bastian Cheng; Claus Christian Hilgetag; Nils Daniel Forkert; Rene Werner (2024). Data_Sheet_1_Spatial normalization for voxel-based lesion symptom mapping: impact of registration approaches.PDF [Dataset]. http://doi.org/10.3389/fnins.2024.1296357.s001
    Available download formats: pdf
    Dataset updated
    Jan 17, 2024
    Dataset provided by
    Frontiers
    Authors
    Daniel Jühling; Deepthi Rajashekar; Bastian Cheng; Claus Christian Hilgetag; Nils Daniel Forkert; Rene Werner
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Voxel-based lesion symptom mapping (VLSM) assesses the relation of lesion location at a voxel level with a specific clinical or functional outcome measure at a population level. Spatial normalization, that is, mapping the patient images into an atlas coordinate system, is an essential pre-processing step of VLSM. However, no consensus exists on the optimal registration approach to compute the transformation nor are downstream effects on VLSM statistics explored. In this work, we evaluate four registration approaches commonly used in VLSM pipelines: affine (AR), nonlinear (NLR), nonlinear with cost function masking (CFM), and enantiomorphic registration (ENR). The evaluation is based on a standard VLSM scenario: the analysis of statistical relations of brain voxels and regions in imaging data acquired early after stroke onset with follow-up modified Rankin Scale (mRS) values.

    Materials and methods: Fluid-attenuated inversion recovery (FLAIR) MRI data from 122 acute ischemic stroke patients acquired between 2 and 3 days after stroke onset and corresponding lesion segmentations, and 30 days mRS values from a European multicenter stroke imaging study (I-KNOW) were available and used in this study. The relation of the voxel location with follow-up mRS was assessed by uni- as well as multi-variate statistical testing based on the lesion segmentations registered using the four different methods (AR, NLR, CFM, ENR; implementation based on the ANTs toolkit).

    Results: The brain areas evaluated as important for follow-up mRS were largely consistent across the registration approaches. However, NLR, CFM, and ENR led to distortions in the patient images after the corresponding nonlinear transformations were applied. In addition, local structures (for instance the lateral ventricles) and adjacent brain areas remained insufficiently aligned with corresponding atlas structures even after nonlinear registration.

    Conclusions: For VLSM study designs and imaging data similar to the present work, an additional benefit of nonlinear registration variants for spatial normalization seems questionable. Related distortions in the normalized images lead to uncertainties in the VLSM analyses and may offset the theoretical benefits of nonlinear registration.

  19. Comparison of between normalized LISA and the equivalent transformation...

    • plos.figshare.com
    xls
    Updated May 22, 2024
    Cite
    Yanguang Chen (2024). Comparison of between normalized LISA and the equivalent transformation results of Anselin’s second set of LISA definitions. [Dataset]. http://doi.org/10.1371/journal.pone.0303456.t009
    Available download formats: xls
    Dataset updated
    May 22, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Yanguang Chen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparison of between normalized LISA and the equivalent transformation results of Anselin’s second set of LISA definitions.

  20. Additional file 1 of Sparse feature selection for classification and...

    • springernature.figshare.com
    xlsx
    Updated Jun 1, 2023
    Cite
    Mehmet Eren Ahsen; Todd Boren; Nitin Singh; Burook Misganaw; David Mutch; Kathleen Moore; Floor Backes; Carolyn McCourt; Jayanthi Lea; David Miller; Michael White; Mathukumalli Vidyasagar (2023). Additional file 1 of Sparse feature selection for classification and prediction of metastasis in endometrial cancer [Dataset]. http://doi.org/10.6084/m9.figshare.c.3727570_D1.v1
    Available download formats: xlsx
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    figshare
    Authors
    Mehmet Eren Ahsen; Todd Boren; Nitin Singh; Burook Misganaw; David Mutch; Kathleen Moore; Floor Backes; Carolyn McCourt; Jayanthi Lea; David Miller; Michael White; Mathukumalli Vidyasagar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    List and description of supplemental tables.

    Table S1. This table contains the measurements of 1428 micro-RNAs for 94 samples. The rows correspond to the features (miRNA) and the columns correspond to the samples. The samples consist of 47 lymph node-positive and 47 lymph node-negative samples. 43.75% of the entries in this sheet are NaN. It contains measurements for 213 miRNAs of 86 samples. Out of those 86 samples, 43 are lymph node-positive, and the remaining 43 are lymph node-negative. A sample whose label has the term IB or IC belongs to a lymph node-negative patient, whereas a sample with a label containing IIIC belongs to a lymph node-positive patient. Lymph node-positive or -negative status was determined empirically during primary staging.

    Table S2. This table contains a subset of the raw data, used for training the classifier. This data was obtained by removing four patients from each class, and 1,215 features. It contains measurements for 213 miRNAs of 86 samples. Out of those 86 samples, 43 are lymph node-positive, and the remaining 43 are lymph node-negative.

    Table S3. This table contains the normalized version of the training data. The following procedure is used for normalization: 1) From each entry of the i-th row vector (i-th feature vector), we subtract the mean value m_i of the i-th row vector computed over all the 86 samples. 2) Multiply each entry of the i-th row vector by a scale factor s_i so that the resulting vector has Euclidean norm equal to the square root of 86.

    Table S4. The lone star algorithm selected 18 final features. This sheet contains the 20 best classifiers based on these eighteen features, sorted with respect to accuracy. The sensitivity, specificity and accuracy figures (columns T, U and V) are based on the classification of the 86 samples in the training data by the corresponding classifier.

    Table S5. This table shows the classifier obtained by taking the average of the classifiers in Sheet 4. In particular, we average the numbers in each column of the 20 classifiers given in Sheet 4 (20 best classifiers) (Columns A-S).

    Table S6. This sheet contains clinical information about the independent cohort of 28 patients who were used to validate the classifier. Out of these, 9 are lymph node-positive and 19 are lymph node-negative.

    Table S7. This sheet contains the raw microRNA measurements on the 28 test data samples.

    Table S8. This is the transformed version of the test data. We apply the same transformation as we did for the training data, as described on Sheet 3. For each of the 18 features (miRNAs), we subtract the original mean value m_i from each entry and multiply each entry by the constant s_i. The calculation of m_i and s_i is as in Additional file 1, Table S3.

    Table S9. This sheet contains the discriminant values of the classifier on the test data. In column D an entry of 1 means that the sample is correctly classified.

    Table S10. This sheet contains the number of overlaps between our 23-gene signature and the pathways in the KEGG database. The q-value is obtained from the Fisher exact test after the Benjamini-Hochberg multiple testing correction and quantifies the statistical significance of the overlap between the gene list and a set of genes in a particular pathway. (1170 KB XLSX)
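
The normalization described for Tables S3 and S8 (centre each feature row with its training mean m_i, rescale it with s_i so the training row has Euclidean norm equal to the square root of the number of samples, then reuse the same m_i and s_i on the test data) can be sketched as below. This is an illustrative sketch only, not the authors' code: the function names and the toy matrices are hypothetical, and it assumes no feature row is constant (which would give a zero norm).

```python
import numpy as np

def fit_row_scaling(train):
    """Return per-feature means m and scale factors s from a features x samples matrix."""
    train = np.asarray(train, dtype=float)
    n_samples = train.shape[1]
    m = train.mean(axis=1, keepdims=True)              # m_i: mean of row i
    norms = np.linalg.norm(train - m, axis=1, keepdims=True)
    s = np.sqrt(n_samples) / norms                     # s_i: centred row gets norm sqrt(n)
    return m, s

def apply_row_scaling(data, m, s):
    """Apply previously fitted m and s to any matrix with the same feature rows."""
    return (np.asarray(data, dtype=float) - m) * s

# Hypothetical toy data: 2 features x 4 training samples, 2 features x 1 test sample.
train = np.array([[1.0, 2.0, 3.0, 4.0],
                  [10.0, 10.0, 20.0, 20.0]])
test = np.array([[2.5],
                 [15.0]])

m, s = fit_row_scaling(train)
train_norm = apply_row_scaling(train, m, s)
test_norm = apply_row_scaling(test, m, s)   # reuses the training m_i and s_i
```

Because the squared norm of each centred training row equals the number of samples, every training feature ends up with mean 0 and population standard deviation 1. Reusing the training m_i and s_i on the test set keeps the independent cohort from influencing the scaling.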

