Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comparisons of estimates of normalizing parameter.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A two-tailed Mann-Whitney U test was used for calculating p-values. The statistical power was calculated for a Student's t-test using statistical parameters of the log2-transformed expression data. Sample size (n) is the number per group needed to obtain a power of at least 0.8. Nppb: group 2 vs. group 3; Vcam1: group 1 vs. group 3.
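For illustration, a sample size of this kind can be computed with a standard power routine; the sketch below uses statsmodels and a hypothetical effect size, since the actual group means and standard deviations come from the log2-transformed expression data.

```python
from statsmodels.stats.power import TTestIndPower

# Hypothetical Cohen's d; in the dataset it is derived from the log2-transformed
# expression values of the two groups being compared (e.g. Nppb, group 2 vs. group 3).
effect_size = 1.2

n_per_group = TTestIndPower().solve_power(
    effect_size=effect_size, alpha=0.05, power=0.8, alternative='two-sided')
print(round(n_per_group))  # sample size (n) per group needed for power >= 0.8
```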
https://www.elsevier.com/about/policies/open-access-licenses/elsevier-user-license/cpc-license
Abstract: It is shown that whenever the multiplicative normalization of a fitting function is not known, least-squares fitting by χ² minimization can be performed with one parameter fewer than usual by converting the normalization parameter into a function of the remaining parameters and the data.
Title of program: FITM1
Catalogue Id: AEYG_v1_0
Nature of problem: Least-squares minimization when one of the free parameters is the multiplicative normalization of the fitting function.
Versions of this program held in the CPC repository in Mendeley Data: AEYG_v1_0; FITM1; 10.1016/j.cpc.2015.09.021
This program has been imported from the CPC Program Library held at Queen's University Belfast (1969-2018).
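The underlying idea can be sketched in a few lines: for a fixed shape function, setting the derivative of χ² with respect to the multiplicative normalization N to zero gives N in closed form, so the minimizer only ever sees the remaining parameters. This is a minimal illustration of that reduction, not the FITM1 program itself.

```python
import numpy as np

def chi2_with_analytic_norm(theta, x, y, sigma, shape):
    """Chi-square with the multiplicative normalization eliminated analytically.

    shape(x, theta) is the fitting function up to an unknown constant N.
    Solving d(chi2)/dN = 0 gives N = sum(w*y*f) / sum(w*f*f) with w = 1/sigma^2,
    so N never appears as a free parameter in the minimization.
    """
    f = shape(x, theta)
    w = 1.0 / sigma**2
    n_opt = np.sum(w * y * f) / np.sum(w * f * f)   # closed-form normalization
    return np.sum(w * (y - n_opt * f)**2), n_opt
```

Minimizing the first return value over theta alone (for example with scipy.optimize.minimize) then reproduces the one-parameter-fewer fit described in the abstract.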
Background
The Infinium EPIC array measures the methylation status of > 850,000 CpG sites. The EPIC BeadChip uses a two-array design: Infinium Type I and Type II probes. These probe types exhibit different technical characteristics which may confound analyses. Numerous normalization and pre-processing methods have been developed to reduce probe type bias as well as other issues such as background and dye bias.
Methods
This study evaluates the performance of various normalization methods using 16 replicated samples and three metrics: absolute beta-value difference, overlap of non-replicated CpGs between replicate pairs, and effect on beta-value distributions. Additionally, we carried out Pearson's correlation and intraclass correlation coefficient (ICC) analyses using both raw and SeSAMe 2 normalized data.
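As an illustration only (not the authors' pipeline), the per-probe absolute beta-value difference and the correlation for one replicate pair could be computed along these lines, assuming a probes-by-samples table of beta values:

```python
import pandas as pd
from scipy.stats import pearsonr

def replicate_metrics(betas: pd.DataFrame, sample_a: str, sample_b: str):
    """Per-probe absolute beta-value difference and Pearson r for one replicate pair.

    betas: DataFrame indexed by probe ID with one column per sample
    (raw or SeSAMe 2 normalized beta values); column names are hypothetical.
    """
    pair = betas[[sample_a, sample_b]].dropna()
    abs_diff = (pair[sample_a] - pair[sample_b]).abs()   # one value per probe
    r, _ = pearsonr(pair[sample_a], pair[sample_b])
    return abs_diff, r
```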
Results
The method we define as SeSAMe 2, which consists of the application of the regular SeSAMe pipeline with an additional round of QC, pOOBAH masking, was found to be the b...,
Study Participants and Samples
Whole blood samples were obtained from the Health, Well-being and Aging (Saúde, Bem-estar e Envelhecimento, SABE) study cohort. SABE is a cohort of census-drawn elderly from the city of São Paulo, Brazil, followed up every five years since the year 2000, with DNA first collected in 2010. Samples from 24 elderly adults were collected at two time points, for a total of 48 samples. The first time point is the 2010 collection wave, performed from 2010 to 2012, and the second time point was set in 2020 within a COVID-19 monitoring project (9±0.71 years apart). The 24 individuals (13 men and 11 women) were 67.41±5.52 years of age (mean ± standard deviation) at time point one and 76.41±6.17 at time point two.
All individuals enrolled in the SABE cohort provided written consent, and the ethics protocols were approved by local and national institutional review boards (COEP/FSP/USP OF.COEP/23/10, CONEP 2044/2014, CEP HIAE 1263-10, University o...). We provide data in an Excel file, with the absolute differences in beta values between replicate samples for each probe provided in separate tabs for the raw data and the different normalization methods.
Agreement among different normalization methods.
This data set presents polarimetric observations of comet 1P/Halley reported as normalized Stokes parameters. All data were contributed by a single observer, A. Dollfus, working at the European Southern Observatory. This revised version of the data set includes updated documentation and data formats.
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This dataset is about: Normalized geophysical parameters and factor analyses of sediment core AND-2A. Please consult parent dataset @ https://doi.org/10.1594/PANGAEA.834632 for more information.
Real-time functional magnetic resonance imaging (rtfMRI) is a recently emerged technique that demands fast data processing within a single repetition time (TR), such as a TR of 2 seconds. Data preprocessing in rtfMRI has rarely involved spatial normalization, which cannot be accomplished in a short time period. However, spatial normalization may be critical for accurate functional localization in a stereotactic space and is an essential procedure for some emerging applications of rtfMRI. In this study, we introduced an online spatial normalization method that adopts a novel affine registration (AFR) procedure based on principal axes registration (PA) and Gauss-Newton optimization (GN) using a self-adaptive β parameter, termed PA-GN(β) AFR, together with nonlinear registration (NLR) based on the discrete cosine transform (DCT). In AFR, PA provides an appropriate initial estimate for GN to induce the rapid convergence of GN. In addition, the β parameter, which relies on the change rate of cost functio...
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
All data and figures in the article were simulated with Matlab. The numerical method combines the classical spectral method in the temporal domain with an adaptive step-size Runge-Kutta method in the spatial domain. We provide all Matlab data and related graph source files for Figures 1-6 of the article. The dataset is divided into two files: the 'data' file contains the Matlab data for Figures 1-6, and the other file contains the Matlab source files for Figures 1-6. In the 'data' file, six files named fig1 (2, 3, 4, 5, 6) were created in the order of the figures, corresponding to the data of Figures 1-6; the six files fig1 (2, 3, 4, 5, 6) are named the same way in the other file, again in figure order, corresponding to the source files of Figures 1-6. The normalization parameters are as follows.
Figure 1: plane wave amplitude A0=1, perturbation frequency Omega=1.5, phase difference between perturbation signal and pump phi0=0.5pi, and perturbation amplitudes (from top to bottom) delta=0.01, 0.1, and 0.25.
Figure 2: plane wave amplitude A0=1, perturbation frequency Omega=2, phase difference phi0=0.5pi, and perturbation amplitudes (from top to bottom) delta=0.01, 0.25, and 0.5.
Figure 3: plane wave amplitude A0=1, with phase differences phi0=0.5pi and 0.3pi, respectively.
Figure 4: plane wave amplitude A0=1, perturbation frequency Omega=1.5, perturbation amplitude delta=0.1, and phase differences between the perturbation signal and the pump (from top to bottom) phi0=0.1pi, 0.3pi, and 0.5pi.
Figure 5: plane wave amplitude A0=1, with perturbation amplitudes delta=0.1 and 0.01, respectively.
Figure 6: plane wave amplitude A0=1, perturbation frequency Omega=2.2, perturbation amplitude delta=0.25, and phase differences between the perturbation signal and the pump (from top to bottom) phi0=0.1pi, 0.3pi, and 0.5pi.
--This data has been withdrawn by the author.-- The corrected dataset can be found here: https://doi.org/10.20387/bonares-2657-1NP3
The data has been withdrawn for the following reason: "The published R-script was revised in the course of an external review process." The data is no longer available for free reuse and will only be released by the data centre if there is a reasonable interest.
Summary: Scaling with ranked subsampling (SRS) is an algorithm for the normalization of species count data in ecology. So far, SRS has successfully been applied to microbial community data. The normalization by SRS reduces the number of reads in each sample in such a way that (i) the total count equals Cmin, (ii) each removed OTU is less or equally abundant than any preserved OTU, and (iii) the relative frequencies of OTUs remaining in the sample after normalization are as close as possible to the frequencies in the original sample.

The algorithm consists of two steps. In the first step, the counts for all OTUs are divided by a scaling factor chosen in such a way that the sum of the scaled counts (Cscaled, with integer or non-integer values) equals Cmin. The relative frequencies of all OTUs remain unchanged. In the second step, the non-integer count values are converted into integers by an algorithm that we dub ranked subsampling. The scaled count Cscaled for each OTU is split into the integer part Cint, obtained by truncating the digits after the decimal separator (Cint = floor(Cscaled)), and the fractional part Cfrac (Cfrac = Cscaled - Cint). Since ΣCint ≤ Cmin, an additional ∆C = Cmin - ΣCint reads have to be added to the library to reach the total count of Cmin. This is achieved as follows. OTUs are ranked in descending order of their Cfrac values, which lie in the open interval (0, 1). Beginning with the OTU of the highest rank, a single count per OTU is added to the normalized library until the total number of added counts reaches ∆C and the sum of all counts in the normalized library equals Cmin.

For example, if ∆C = 5 and the seven top Cfrac values are 0.96, 0.96, 0.88, 0.55, 0.55, 0.55, and 0.55, the following counts are added: a single count for each OTU with Cfrac of 0.96; a single count for the OTU with Cfrac of 0.88; and a single count each for two OTUs among those with Cfrac of 0.55. When the lowest Cfrac involved in picking the ∆C counts is shared by several OTUs, the OTUs used for adding a single count to the library are selected in the order of their Cint values. This selection minimizes the effect of normalization on the relative frequencies of OTUs. OTUs with identical Cfrac as well as Cint are sampled randomly without replacement.
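A minimal NumPy sketch of the procedure described above (illustrative, not the authors' implementation); tie handling follows the descending-Cint interpretation, with random selection only when both Cfrac and Cint tie:

```python
import numpy as np

def srs_normalize(counts, c_min, seed=None):
    """Scaling with ranked subsampling (SRS) for one sample of OTU counts."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)

    # Step 1: scale so the summed (non-integer) counts equal Cmin.
    scaled = counts * c_min / counts.sum()

    # Step 2: ranked subsampling of the fractional parts.
    c_int = np.floor(scaled).astype(int)          # integer parts
    c_frac = scaled - c_int                       # fractional parts
    delta = int(c_min - c_int.sum())              # counts still to be added

    # Rank by descending Cfrac, break ties by descending Cint, then randomly.
    order = np.lexsort((rng.random(counts.size), -c_int, -c_frac))

    normalized = c_int.copy()
    normalized[order[:delta]] += 1                # add one count to top-ranked OTUs
    return normalized                             # sums exactly to Cmin
```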
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides processed and normalized/standardized indices for the management tool group 'Supply Chain Management' (SCM), including related concepts like Supply Chain Integration. Derived from five distinct raw data sources, these indices are specifically designed for comparative longitudinal analysis, enabling the examination of trends and relationships across different empirical domains (web search, literature, academic publishing, and executive adoption). The data presented here represent transformed versions of the original source data, aimed at achieving metric comparability. Users requiring the unprocessed source data should consult the corresponding SCM dataset in the Management Tool Source Data (Raw Extracts) Dataverse.

Data Files and Processing Methodologies:

Google Trends File (Prefix: GT_): Normalized Relative Search Interest (RSI). Input Data: Native monthly RSI values from Google Trends (Jan 2004 - Jan 2025) for the query "supply chain management" + "supply chain logistics" + "supply chain". Processing: None. The dataset utilizes the original Google Trends index, which is base-100 normalized against the peak search interest for the specified terms and period. Output Metric: Monthly Normalized RSI (Base 100). Frequency: Monthly.

Google Books Ngram Viewer File (Prefix: GB_): Normalized Relative Frequency. Input Data: Annual relative frequency values from Google Books Ngram Viewer (1950-2022, English corpus, no smoothing) for the query Supply Chain Management + Supply Chain Integration + Supply Chain. Processing: The annual relative frequency series was normalized by setting the year with the maximum value to 100 and scaling all other values (years) proportionally. Output Metric: Annual Normalized Relative Frequency Index (Base 100). Frequency: Annual.

Crossref.org File (Prefix: CR_): Normalized Relative Publication Share Index. Input Data: Absolute monthly publication counts matching SCM-related keywords [("supply chain management" OR ...) AND ("management" OR ...) - see raw data for full query] in titles/abstracts (1950-2025), alongside total monthly publication counts in Crossref. Data deduplicated via DOIs. Processing: For each month, the relative share of SCM-related publications (SCM Count / Total Crossref Count for that month) was calculated. This monthly relative share series was then normalized by setting the month with the maximum relative share to 100 and scaling all other months proportionally. Output Metric: Monthly Normalized Relative Publication Share Index (Base 100). Frequency: Monthly.

Bain & Co. Survey - Usability File (Prefix: BU_): Normalized Usability Index. Input Data: Original usability percentages (%) from Bain surveys for specific years: Supply Chain Integration (1999, 2000, 2002); Supply Chain Management (2004, 2006, 2008, 2010, 2012, 2014, 2017, 2022). Processing: Semantic Grouping: Data points for "Supply Chain Integration" and "Supply Chain Management" were treated as a single conceptual series for SCM. Normalization: The combined series of original usability percentages was normalized relative to its own highest observed historical value across all included years (Max % = 100). Output Metric: Biennial Estimated Normalized Usability Index (Base 100 relative to historical peak). Frequency: Biennial (Approx.).

Bain & Co. Survey - Satisfaction File (Prefix: BS_): Standardized Satisfaction Index. Input Data: Original average satisfaction scores (1-5 scale) from Bain surveys for specific years: Supply Chain Integration (1999, 2000, 2002); Supply Chain Management (2004, 2006, 2008, 2010, 2012, 2014, 2017, 2022). Processing: Semantic Grouping: Data points for "Supply Chain Integration" and "Supply Chain Management" were treated as a single conceptual series for SCM. Standardization (Z-scores): Original scores (X) were standardized using Z = (X - μ) / σ, with μ = 3.0 and σ ≈ 0.891609. Index Scale Transformation: Z-scores were transformed via Index = 50 + (Z * 22). Output Metric: Biennial Standardized Satisfaction Index (Center = 50, Range ≈ [1,100]). Frequency: Biennial (Approx.).

File Naming Convention: Files generally follow the pattern PREFIX_Tool_Processed.csv or similar, where the PREFIX indicates the data source (GT_, GB_, CR_, BU_, BS_). Consult the parent Dataverse description (Management Tool Comparative Indices) for general context and the methodological disclaimer. For original extraction details (specific keywords, URLs, etc.), refer to the corresponding SCM dataset in the Raw Extracts Dataverse. Comprehensive project documentation provides full details on all processing steps.
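For reference, the standardization and index transformation described for the Bain satisfaction series amounts to the following (the score passed in is an example value only; μ, σ, and the scaling constants are those stated above):

```python
MU, SIGMA = 3.0, 0.891609           # standardization parameters stated above

def satisfaction_index(score: float) -> float:
    """Map an original 1-5 satisfaction score to the standardized index."""
    z = (score - MU) / SIGMA        # Z-score
    return 50.0 + z * 22.0          # index centered at 50

# Example: satisfaction_index(3.0) == 50.0; satisfaction_index(3.9) ~= 72.2
```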
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Cloud and Sky Image Tensors for Classification
This dataset is designed for those interested in cloud classification projects. Due to licensing restrictions, the raw images cannot be shared publicly. However, the transformed tensors provided here are optimized for image classification tasks and are typically all you need for such projects.
Tensor Specifications
These tensors are preprocessed and normalized for use with ResNet models. The normalization parameters… See the full description on the dataset page: https://huggingface.co/datasets/jcamier/cloud_sky_vis.
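If the provided tensors still require normalization on load, the usual torchvision pattern applies; the mean and standard deviation below are the common ImageNet placeholders, not values taken from this dataset (see the dataset page for the actual parameters):

```python
import torch
from torchvision import transforms

# Placeholder statistics (standard ImageNet values), not this dataset's own parameters.
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

image_tensor = torch.rand(3, 224, 224)               # stand-in for one provided image tensor
model_input = normalize(image_tensor).unsqueeze(0)   # (1, 3, 224, 224), ready for a ResNet
```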
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Dr. Kevin Bronson provides a second year of nitrogen and water management in wheat agricultural research dataset for compute. Ten irrigation treatments from a linear sprinkler were combined with nitrogen treatments. This dataset includes notation of field events and operations, an intermediate analysis mega-table of correlated and calculated parameters, including laboratory analysis results generated during the experimentation, plus high resolution plot level intermediate data tables of SAS process output, as well as the complete raw data sensor records and logger outputs.
This proximal terrestrial high-throughput plant phenotyping dataset exemplifies our early tri-metric field method, in which geo-referenced 5 Hz crop canopy height, temperature, and spectral signature are recorded coincidently to indicate plant health status. In this development period, our Proximal Sensing Cart Mark1 (PSCM1) platform suspends a single cluster of sensors on a dual sliding vertical placement armature.
Experimental design and operational details of research conducted are contained in related published articles, however further description of the measured data signals as well as germane commentary is herein offered.
The primary component of this dataset is the Holland Scientific (HS) CropCircle ACS-470 reflectance values, which, as derived here, consist of raw active optical band-pass values digitized onboard the sensor. Data are delivered as sequential serialized text output including the associated GPS information. Typically this is a production agriculture support technology, enabling efficient precision application of nitrogen fertilizer. We used this optical reflectance sensor technology to investigate plant agronomic biology, as the ACS-470 is not only rugged and reliable but also illumination-active and filter-customizable.
Individualized ACS-470 sensor detector behavior, and its subsequent influence on index calculation, can be understood through analysis of white-panel and other known-target measurements. When a sensor is held 120 cm from a titanium-dioxide white-painted panel, a normalized unity value of 1.0 is set for each detector. To generate this dataset we used a Holland Scientific SC-1 device and set the 1.0 unity value (field normalize) on each sensor individually, before each data collection, and without using any channel gain boost. The SC-1 field normalization device allows a communications connection to a Windows machine, where company-provided sensor control software enables the necessary sensor normalization routine and a real-time view of streaming sensor data.
This type of active proximal multi-spectral reflectance data may be perceived as inherently "noisy"; however, basic analytical description consistently resolves a biological patterning, and more advanced statistical analysis is suggested to achieve discovery. Sources of polychromatic reflectance are inherent in the environment and can be influenced by surface features like wax or water, or the presence of crystal mineralization; varying bi-directional reflectance in the proximal space is a model reality, and directed energy emission reflection sampling is expected to support physical understanding of the underlying passive environmental system.
Soil in view of the sensor does decrease the raw detection amplitude of the target color returned and can add a soil reflection signal component. Yet that return accurately represents a largely two-dimensional cover-and-intensity signal of the target material present within each view. It does not, however, represent a reflection of the plant material alone, because additional features can be in view. Expect NDVI values greater than 0.1 when sensing plants, saturating around 0.8 rather than the 0.9 typical of passive NDVI.
The active signal does not transmit enough energy to penetrate deep into the canopy, perhaps to an LAI of 2.1 or less, compared with what a solar-induced passive reflectance sensor would encounter. However, the focus of our active sensor scan is on the uppermost expanded canopy leaves, which are positioned to intercept the major solar energy. Active energy sensors are easier to direct, and in our capture method we target a consistent sensor height of 1 m above the average canopy height, maintaining a rig travel speed of around 1.5 mph, with sensors parallel to earth ground in a nadir view.
We consider these CropCircle raw detector returns to be more "instant" in generation, and less filtered electronically onboard the "black-box" device, than other reflectance products that report vegetation indices as averages of multiple detector samples in time.
It is known through internal sensor performance tracking across our entire location inventory that sensor body temperature change affects raw detector returns in minor, undescribed, yet apparently consistent ways.
Holland Scientific 5 Hz CropCircle active optical reflectance ACS-470 sensors, recorded on the GeoScout digital proprietary serial data logger, have a stable output format as defined by firmware version. Fifteen collection events are presented.
Different numbers of CSV data files were generated based on field operations, and there were a few short-duration instances where the GPS signal was lost. Multiple raw data files, when present, including white panel measurements before or after field collections, were combined into one file, with the inclusion of the null value placeholder -9999. Two CropCircle sensors, numbered 2 and 3, were used, supplying data in a lined format, where variables are repeated for each sensor. This created a discrete data row for each individual sensor measurement instance.
We offer six high-throughput single-pixel spectral colors, recorded at 530, 590, 670, 730, 780, and 800 nm. The filtered band-pass was 10 nm, except for the NIR, which was set to 20 nm and supplied an increased signal (along with increased noise).
Dual, or tandem, paired-sensor CropCircle usage enables additional vegetation index calculations, such as those listed below (a computation sketch follows the list):
DATT = (r800-r730)/(r800-r670)
DATTA = (r800-r730)/(r800-r590)
MTCI = (r800-r730)/(r730-r670)
CIRE = (r800/r730)-1
CI = (r800/r590)-1
CCCI = NDRE/NDVIR800
PRI = (r590-r530)/(r590+r530)
CI800 = ((r800/r590)-1)
CI780 = ((r780/r590)-1)
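A small sketch of these index calculations from the six field-normalized band reflectances; the NDVI and NDRE forms used inside CCCI are the usual normalized-difference definitions with the 800 nm NIR band, which is an assumption here:

```python
def vegetation_indices(r530, r590, r670, r730, r780, r800):
    """Paired-sensor vegetation indices from CropCircle ACS-470 reflectances."""
    ndvi = (r800 - r670) / (r800 + r670)   # assumed standard NDVI (800/670 nm)
    ndre = (r800 - r730) / (r800 + r730)   # assumed standard red-edge NDVI
    return {
        "DATT":  (r800 - r730) / (r800 - r670),
        "DATTA": (r800 - r730) / (r800 - r590),
        "MTCI":  (r800 - r730) / (r730 - r670),
        "CIRE":  (r800 / r730) - 1.0,
        "CI":    (r800 / r590) - 1.0,
        "CCCI":  ndre / ndvi,
        "PRI":   (r590 - r530) / (r590 + r530),
        "CI800": (r800 / r590) - 1.0,
        "CI780": (r780 / r590) - 1.0,
    }
```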
The Campbell Scientific (CS) environmental data recordings of small-range (0 to 5 V) voltage sensor signals are accurate and, by design, largely shielded from thermally induced electronic influence and other such factors. They were used as descriptively recommended by the company. High-precision clock timing and a recorded confluence of custom metrics give the Campbell Scientific raw data signal acquisitions high research value generally, and they have delivered baseline metrics in our plant phenotyping program. Raw electrical sensor signal captures were recorded at the maximum digital resolution and could be re-processed in whole, while the subsequent onboard calculated metrics were often data-typed at a lower memory precision and served our research analysis.
Improved Campbell Scientific data at 5 Hz are presented for nine collection events, where thermal, ultrasonic displacement, and additional GPS metrics were recorded. Ultrasonic height metrics generated by the Honeywell sensor and present in this dataset represent successful phenotypic recordings. The Honeywell ultrasonic displacement sensor has worked well in this application because of its 180 kHz signal frequency, which ranges a 2 m space. Air temperature is still a developing metric; a thermocouple wire junction (TC) placed in free air with a solar shade produced a low-confidence passive ambient air temperature.
Campbell Scientific logger-derived data output is structured in a column format, with multiple sensor data values present in each data row. One data row represents one program output cycle recording across the sensing array, as there was no onboard logger data averaging or down-sampling. Campbell Scientific data is first recorded in binary format onboard the data logger and then, upon data retrieval, converted to ASCII text via the PC-based LoggerNet CardConvert application. Here, our full CS raw data output, which includes a four-line header structure, was truncated to a typical single-row header of variable names. The -9999 placeholder value was inserted for null instances.
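Either logger's delimited output can be read directly once the -9999 placeholders are mapped to missing values; a minimal pandas sketch, with a hypothetical file name and timestamp column:

```python
import pandas as pd

# File name and timestamp column name are hypothetical; the -9999 handling follows
# the placeholder convention described above.
cs_data = pd.read_csv("CS_collection_01.csv", na_values=[-9999, "-9999"])
cs_data["TIMESTAMP"] = pd.to_datetime(cs_data["TIMESTAMP"], errors="coerce")
```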
Canopy thermal data are available from three view vantages: a nadir sensor view, and views looking forward and backward down the plant row at a 30 degree angle off nadir. The high-confidence Apogee Instruments SI-111 type infrared radiometer (non-contact thermometer) with serial number 1022 was in a front position looking forward away from the platform, number 1023 with a nadir view was in the middle position, and sensor number 1052 was in a rear position looking back toward the platform frame. We have a long and successful history of testing, benchmarking, and deploying Apogee Instruments infrared radiometers in field experimentation. They sense a biologically relevant spectral window and return a fast-updating average surface temperature accurate to 0.2 C, derived from what is (geometrically weighted) in their field of view.
Data gaps do exist beyond the -9999 null-value designations: there are some instances when the GPS signal was lost and, rarely, an HS GeoScout logger error. GPS information may be missing at the start of data recording; however, once the receiver supplies a signal, the values populate. Likewise, information may be missing at the end of a data collection, where the GPS signal was lost but the sensors continued to record along with the data logger timestamping.
In the raw CS data, collections 1 through 7 are represented by only one table file, where the UTC from the GPS
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides processed and normalized/standardized indices for the management tool 'Knowledge Management' (KM), including related concepts like Intellectual Capital Management and Knowledge Transfer. Derived from five distinct raw data sources, these indices are specifically designed for comparative longitudinal analysis, enabling the examination of trends and relationships across different empirical domains (web search, literature, academic publishing, and executive adoption). The data presented here represent transformed versions of the original source data, aimed at achieving metric comparability. Users requiring the unprocessed source data should consult the corresponding KM dataset in the Management Tool Source Data (Raw Extracts) Dataverse.

Data Files and Processing Methodologies:

Google Trends File (Prefix: GT_): Normalized Relative Search Interest (RSI). Input Data: Native monthly RSI values from Google Trends (Jan 2004 - Jan 2025) for the query "knowledge management" + "knowledge management organizational". Processing: None. Utilizes the original base-100 normalized Google Trends index. Output Metric: Monthly Normalized RSI (Base 100). Frequency: Monthly.

Google Books Ngram Viewer File (Prefix: GB_): Normalized Relative Frequency. Input Data: Annual relative frequency values from Google Books Ngram Viewer (1950-2022, English corpus, no smoothing) for the query Knowledge Management + Intellectual Capital Management + Knowledge Transfer. Processing: Annual relative frequency series normalized (peak year = 100). Output Metric: Annual Normalized Relative Frequency Index (Base 100). Frequency: Annual.

Crossref.org File (Prefix: CR_): Normalized Relative Publication Share Index. Input Data: Absolute monthly publication counts matching KM-related keywords [("knowledge management" OR ...) AND (...) - see raw data for full query] in titles/abstracts (1950-2025), alongside total monthly Crossref publications. Deduplicated via DOIs. Processing: Monthly relative share calculated (KM Count / Total Count). Monthly relative share series normalized (peak month's share = 100). Output Metric: Monthly Normalized Relative Publication Share Index (Base 100). Frequency: Monthly.

Bain & Co. Survey - Usability File (Prefix: BU_): Normalized Usability Index. Input Data: Original usability percentages (%) from Bain surveys for specific years: Knowledge Management (1999, 2000, 2002, 2004, 2006, 2008, 2010). Note: Not reported after 2010. Processing: Normalization: Original usability percentages normalized relative to its historical peak (Max % = 100). Output Metric: Biennial Estimated Normalized Usability Index (Base 100 relative to historical peak). Frequency: Biennial (Approx.).

Bain & Co. Survey - Satisfaction File (Prefix: BS_): Standardized Satisfaction Index. Input Data: Original average satisfaction scores (1-5 scale) from Bain surveys for specific years: Knowledge Management (1999-2010). Note: Not reported after 2010. Processing: Standardization (Z-scores): Using Z = (X - 3.0) / 0.891609. Index Scale Transformation: Index = 50 + (Z * 22). Output Metric: Biennial Standardized Satisfaction Index (Center = 50, Range ≈ [1,100]). Frequency: Biennial (Approx.).

File Naming Convention: Files generally follow the pattern PREFIX_Tool_Processed.csv or similar, where the PREFIX indicates the data source (GT_, GB_, CR_, BU_, BS_). Consult the parent Dataverse description (Management Tool Comparative Indices) for general context and the methodological disclaimer.
For original extraction details (specific keywords, URLs, etc.), refer to the corresponding KM dataset in the Raw Extracts Dataverse. Comprehensive project documentation provides full details on all processing steps.
A machine-readable version of Table 3. For each model parameter, we report the 16th, 50th, and 84th percentile of the samples from our dynesty chain, which should be regarded as the statistical uncertainties. An additional systematic uncertainty of 5% should be added to the distances. The column headings are as follows:
'name' is the cloud coincident with the sightline
'l' is the Galactic longitude of the sightline (in degrees)
'b' is the Galactic latitude of the sightline (in degrees)
'n' is the normalization parameter
'f' is the foreground extinction parameter (in mag)
'm' is the cloud distance modulus parameter (in mag)
'd' is the cloud distance (derived from m) in pc
'p' is the outlier fraction parameter
'sfore' is the foreground smoothing parameter
'sback' is the background smoothing parameter
See Section 3.2 for a complete description of the model parameters.
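The derived column 'd' follows from 'm' through the standard distance-modulus relation m = 5 log10(d) - 5:

```python
def distance_pc(m: float) -> float:
    """Cloud distance in parsecs from the distance modulus parameter m (mag)."""
    return 10.0 ** (m / 5.0 + 1.0)
```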
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides processed and normalized/standardized indices for the management tool 'Core Competencies' (also Core Competence). Derived from five distinct raw data sources, these indices are specifically designed for comparative longitudinal analysis, enabling the examination of trends and relationships across different empirical domains (web search, literature, academic publishing, and executive adoption). The data presented here represent transformed versions of the original source data, aimed at achieving metric comparability. Users requiring the unprocessed source data should consult the corresponding Core Competencies dataset in the Management Tool Source Data (Raw Extracts) Dataverse.

Data Files and Processing Methodologies:

Google Trends File (Prefix: GT_): Normalized Relative Search Interest (RSI). Input Data: Native monthly RSI values from Google Trends (Jan 2004 - Jan 2025) for the query "core competencies" + "core competence strategy". Processing: None. Utilizes the original base-100 normalized Google Trends index. Output Metric: Monthly Normalized RSI (Base 100). Frequency: Monthly.

Google Books Ngram Viewer File (Prefix: GB_): Normalized Relative Frequency. Input Data: Annual relative frequency values from Google Books Ngram Viewer (1950-2022, English corpus, no smoothing) for the query Core Competencies + Core Competence. Processing: Annual relative frequency series normalized (peak year = 100). Output Metric: Annual Normalized Relative Frequency Index (Base 100). Frequency: Annual.

Crossref.org File (Prefix: CR_): Normalized Relative Publication Share Index. Input Data: Absolute monthly publication counts matching Core Competencies-related keywords [("core competencies" OR ...) AND (...) - see raw data for full query] in titles/abstracts (1950-2025), alongside total monthly Crossref publications. Deduplicated via DOIs. Processing: Monthly relative share calculated (Core Competencies Count / Total Count). Monthly relative share series normalized (peak month's share = 100). Output Metric: Monthly Normalized Relative Publication Share Index (Base 100). Frequency: Monthly.

Bain & Co. Survey - Usability File (Prefix: BU_): Normalized Usability Index. Input Data: Original usability percentages (%) from Bain surveys for specific years: Core Competencies (1993, 1996, 1999, 2000, 2002, 2004, 2006, 2008, 2010, 2012, 2014, 2017). Note: Not reported in 2022 survey data. Processing: Normalization: Original usability percentages normalized relative to its historical peak (Max % = 100). Output Metric: Biennial Estimated Normalized Usability Index (Base 100 relative to historical peak). Frequency: Biennial (Approx.).

Bain & Co. Survey - Satisfaction File (Prefix: BS_): Standardized Satisfaction Index. Input Data: Original average satisfaction scores (1-5 scale) from Bain surveys for specific years: Core Competencies (1993-2017). Note: Not reported in 2022 survey data. Processing: Standardization (Z-scores): Using Z = (X - 3.0) / 0.891609. Index Scale Transformation: Index = 50 + (Z * 22). Output Metric: Biennial Standardized Satisfaction Index (Center = 50, Range ≈ [1,100]). Frequency: Biennial (Approx.).

File Naming Convention: Files generally follow the pattern PREFIX_Tool_Processed.csv or similar, where the PREFIX indicates the data source (GT_, GB_, CR_, BU_, BS_). Consult the parent Dataverse description (Management Tool Comparative Indices) for general context and the methodological disclaimer.
For original extraction details (specific keywords, URLs, etc.), refer to the corresponding Core Competencies dataset in the Raw Extracts Dataverse. Comprehensive project documentation provides full details on all processing steps.
http://earth.jaxa.jp/policy/en.html
The GCOM-C/SGLI L2 Normalized water leaving radiance and aerosol parameters and PAR (1 km) dataset is obtained from the SGLI sensor onboard GCOM-C and produced by the Japan Aerospace Exploration Agency (JAXA). GCOM-C is a Sun-synchronous sub-recurrent-orbit satellite launched on December 23, 2017, which carries SGLI and conducts long-term global observations of geophysical variables related to the global climate system across 28 items, including aerosol and vegetation, over four areas: atmosphere, land, ocean, and cryosphere. The data will be used to contribute to higher accuracy of global warming prediction. The SGLI has a swath of 1150 km in the visible band and 1400 km in the infrared band. Level 2 products are defined as products composed of physical quantity data calculated from Level 1B products and meteorological data, as well as various additional data related to the physical quantity data. This dataset includes Normalized water leaving radiance (NWLR), TAUA, and Photosynthetically available radiation (PAR). NWLR is the upwelling radiance just above the sea surface at 380, 412, 443, 490, 530, 565, and 670 nm, with the sun at the zenith, at the average distance from the earth to the sun (1 AU). It is corrected for the viewing angle dependence and for the effects of the non-isotropic distribution of the in-water light field. The NWLR data can be converted into Rrs (Remote Sensing Reflectance) using parameters contained in the product itself. The physical quantity unit is W/m^2/sr/um. TAUA (τA) is the aerosol optical thickness at 670 and 865 nm estimated when the atmospheric correction algorithm of ocean color derived the NWLR data. Note that 'TAUA' and 'AROT_ocean in GCOM-C/SGLI L2 Aerosol over the ocean-land aerosol by near ultra violet (doi:xxxx)' differ in algorithm. PAR is the photon flux density potentially available to plants for photosynthesis within the visible wavelength range of 400 to 700 nm over the ocean. The physical quantity unit is Ein/m^2/day. The provided format is HDF5. The spatial resolution is 1 km. The projection method is L1B reference coordinates. The generation unit is Scene. The current version of the product is Version 3. Version 2 is also available, but please note that the "QA_Flag" data has been changed.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The file geocenter_models.tar contains all models comprising the GeoCenter ensemble: 4 convolutional neural networks (CNNs) and 4 isotonic-regression models. Each model is found in a subdirectory whose name indicates which infrared (IR) wavelengths are used as input to the CNN. For example:
wavelengths-microns=3.900_6.185_6.950/model.h5: An HDF5 file containing the trained CNN that uses data from bands 7, 8, 9 (corresponding to 3.9, 6.185, and 6.95 microns on the GOES ABI imager). The trained CNN can always be read by neural_net_utils.read_model() in the ml4tccf library (https://doi.org/10.5281/zenodo.13787645).
wavelengths-microns=3.900_6.185_6.950/model_metadata.p: A Pickle file containing metadata for the trained CNN. This file is needed to read the CNN itself with neural_net_utils.read_model(). Otherwise, you will probably never need to access this metafile directly.
wavelengths-microns=3.900_6.185_6.950/isotonic_regression/isotonic_regression.dill: A Dill file containing the isotonic-regression models used to bias-correct the above CNN. The trained isotonic-regression models can always be read by scalar_isotonic_regression.read_file() in the ml4tccf library. Note that there are technically two isotonic-regression models for every CNN: one that bias-corrects the x-coordinate of the TC-center, another that bias-corrects the y-coordinate.
As mentioned above, every trained CNN can be read by neural_net_utils.read_model(). Also, every trained CNN can be applied to new data (inference mode) by neural_net_utils.apply_model(). The input argument model_object should be the object returned by neural_net_utils.read_model(), and I suggest setting num_examples_per_batch = 10 to avoid out-of-memory errors. The only other input argument is predictor_matrices, which is a list of two numpy arrays. The first numpy array contains IR imagery centered at the first-guess TC center, and the second numpy array contains ATCF scalars. The first numpy array should have dimensions S (number of TC samples) x 500 (grid rows) x 500 (grid columns) x 7 (lag times) x 3 (wavelengths). Lag times should be in the following order: 180, 150, 120, 90, 60, 30, 0 min ago. Wavelengths should be in the order indicated by the subdirectory name. The numpy array itself should contain normalized brightness temperatures at the given lag times and wavelengths, following the grid specifications laid out in the journal paper (a plate carrée grid with 2-km spacing). The original IR data (brightness temperatures) must be normalized to z-scores using the same normalization parameters as in the journal paper, i.e., those based on the training data. See details below. The second numpy array in predictor_matrices should have dimensions S (number of TC samples) x 9 (variables). The variables must be in the order: absolute latitude, cosine of longitude, sine of longitude, TC intensity, minimum central pressure, tropical flag, subtropical flag, extratropical flag, disturbance flag. The journal paper contains details on all these variables in one table. These variables must come from A-deck files at the most recent synoptic time. Like the IR data, these ATCF scalars must be normalized to z-scores using the same normalization parameters as in the journal paper. See details below.
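Put together, inference with one ensemble member might look like the sketch below; the exact import path within ml4tccf is an assumption, and the random arrays are placeholders standing in for properly normalized IR imagery and ATCF scalars.

```python
import numpy as np
from ml4tccf import neural_net_utils   # exact import path is an assumption

model_object = neural_net_utils.read_model(
    "wavelengths-microns=3.900_6.185_6.950/model.h5")

num_samples = 10
ir_matrix = np.random.normal(size=(num_samples, 500, 500, 7, 3)).astype("float32")
atcf_matrix = np.random.normal(size=(num_samples, 9)).astype("float32")

# predictor_matrices = [IR imagery, ATCF scalars], as described above.
prediction_matrix = neural_net_utils.apply_model(
    model_object=model_object,
    predictor_matrices=[ir_matrix, atcf_matrix],
    num_examples_per_batch=10)
```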
Once you have predictions (estimated TC-center locations) from a CNN, you can bias-correct these predictions with isotonic regression. To read the isotonic-regression model corresponding to the given CNN, use scalar_isotonic_regression.read_file() in the ml4tccf library. To apply the isotonic-regression model, use scalar_isotonic_regression.apply_models().
To normalize the IR data, you will need the file ir_satellite_normalization_params.tar included with this dataset. Within the tar file is a single zarr file. You can read the zarr file with normalization.read_file() in the ml4tccf library; then you can normalize new data with normalization.normalize_data().
To normalize the ATCF data, you will need the file a_deck_normalization_params.nc included with this dataset. This is a NetCDF file, containing the full set of training values for all 5 ATCF variables that are normalized (the binary storm-type flags are not normalized). You can read this file using any of the standard Python methods for reading NetCDF files, such as xarray.open_dataset(). To normalize new ATCF data, you can use the method normalization._normalize_one_variable(), where the argument actual_values_training is the list of training values from a_deck_normalization_params.nc for the given variable, while actual_values_new is the list of values to be normalized (currently in physical units, to be converted to z-score units).
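A sketch of that ATCF normalization step, assuming a hypothetical variable name inside a_deck_normalization_params.nc (check the file's data variables for the real names) and the same ml4tccf import-path caveat as above:

```python
import xarray as xr
from ml4tccf import normalization   # exact import path is an assumption

params_table = xr.open_dataset("a_deck_normalization_params.nc")
training_values = params_table["tc_intensity"].values   # hypothetical variable name

new_values = [25.0, 40.0, 55.0]                          # ATCF values in physical units
normalized_values = normalization._normalize_one_variable(
    actual_values_training=training_values,
    actual_values_new=new_values)
```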
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Processed and Input Data for the Pipelines of the Project "Multiomics and quantitative modelling disentangle diet, host, and microbiota contributions to the host metabolome"
-----------------------------------------------------------------------------------------------------
Contents:
-----------------------------------------------------------------------------------------------------
Folder /ProcessedData/metabolomics/ contains processed metabolomics data from the project:
/metabolomics/metabolites_allions_combined_norm_intensity.csv - file containing normalized intensities of ions detected across tissues with six measurement methods.
/metabolomics/metabolites_allions_combined_formulas_with_metabolite_filters_spatial100clusters_with_mean.csv - file containing metabolite attribution to spatial clusters and mean intensity values across tissues and conditions.
Other files are described in README_ProcessedData.md.
-----------------------------------------------------------------------------------------------------
Folder /ProcessedData/sequencing/ contains raw and normalized counts of metagenomics and metatranscriptomics data mapped to bacterial genomes.
Folder /ProcessedData/util/ contains files used for data preprocessing and attribution to chemical classes and pathways.
Folder /ProcessedData/example_output/ contains example output of the pipelines:
/output/model_results_SMOOTH_raw_2LIcoefHost1LIcoefbact_allions.csv - file containing estimated model parameters (intestinal flux and metabolic flux values) for the forward problem for metabolomics measurements in the GIT.
/output/model_results_SMOOTH_normbyabsmax_reciprocal_problem_allions.csv - file containing estimated model parameters for the reverse problem (metabolite intensities) for the parameters estimated with the forward problem.
/output/model_results_SMOOTH_normbyabsmax_2LIcoefHost1LIcoefbact_allions.csv - file containing estimated model parameters (intestinal flux and metabolic flux values) for the forward problem for metabolomics measurements in the GIT, normalized by absolute maximum value.
/output/model_results_SMOOTH_normbyabsmax_ONLYMETCOEF_2LIcoefHost1LIcoefbact_allions.csv - file containing estimated model parameters (only metabolic flux values) for the forward problem for metabolomics measurements in the GIT, normalized by absolute maximum value.
/output/table_hierarchical_clustering_groups.csv - file containing attribution of the annotated metabolites to groups according to hierarchical clustering of the normalized model parameters.
/output/cgo_clustergrams_of_model_coefficients.mat - matlab object containing clustergram of the normalized model parameters and manually derived sub-clustergrams corresponding to different largest parameter values.
Description of other files is provided in the file README_ProcessedData.md.
-----------------------------------------------------------------------------------------------------
Folder /InputData/ contains HMDB and KEGG tables used for metabolite annotations and chemical group analysis.
Folder InputData_KEGGreaction_path contains matlab files with metabolite-metabolite paths calculated from KEGG reaction-pair information (Each matrix contains a subset of paths). These files are used by the script workflow_extract_keggECpathes_for_SPpairs_final.m.
Folder InputData_metabolomics_data contains raw metabolomics data from six methods (three LC columns: C08, C18 and HILIC, and positive and negative acquisition modes) and file tissue_weights.txt with tissue weight information used for normalization.
Folder InputData_sequencing_data contains folders ballgown_DNA and ballgown_RNA with results of metagenomic and metatranscriptomic data analysis (raw counts, GetMM normalized counts, EdgeR and DeSeq2 analysis).
Description of folders is provided in the file readme_InputData.md.
-----------------------------------------------------------------------------------------------------
Deriving metallicities for solar-like stars follows well-established methods, but for cooler stars such as M dwarfs, the determination is much more complicated due to forests of molecular lines that are present. Several methods have been developed in recent years to determine accurate stellar parameters for these cool stars (Teff<4000K). However, significant differences can be found at times when comparing metallicities for the same star derived using different methods. In this work, we determine the effective temperatures, surface gravities, and metallicities of 18 well-studied M dwarfs observed with the CARMENES high-resolution spectrograph following different approaches, including synthetic spectral fitting, analysis of pseudo-equivalent widths, and machine learning. We analyzed the discrepancies in the derived stellar parameters, including metallicity, in several analysis runs. Our goal is to minimize these discrepancies and find stellar parameters that are more consistent with the literature values. We attempted to achieve this consistency by standardizing the most commonly used components, such as wavelength ranges, synthetic model spectra, continuum normalization methods, and stellar parameters. We conclude that although such modifications work quite well for hotter main-sequence stars, they do not improve the consistency in stellar parameters for M dwarfs, leading to mean deviations of around 50-200K in temperature and 0.1-0.3dex in metallicity. In particular, M dwarfs are much more complex and a standardization of the aforementioned components cannot be considered as a straightforward recipe for bringing consistency to the derived parameters. Further in-depth investigations of the employed methods would be necessary in order to identify and correct for the discrepancies that remain. Cone search capability for table J/A+A/658/A194/stars (List of studied stars)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comparisons of estimates of normalizing parameter.