37 datasets found

Normalized Dataset
kaggle.com
zip
Updated Jun 15, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Hemanth S (2022). Normalized Dataset [Dataset]. https://www.kaggle.com/datasets/hemanth012/normalized-dataset
Explore at:
zip(1009250933 bytes)Available download formats
Dataset updated
Jun 15, 2022
Authors
Hemanth S
Description
Dataset

This dataset was created by Hemanth S

Contents
c
Data from: LVMED: Dataset of Latvian text normalisation samples for the...
repository.clarin.lv
Updated May 30, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Viesturs Jūlijs Lasmanis; Normunds Grūzītis (2023). LVMED: Dataset of Latvian text normalisation samples for the medical domain [Dataset]. https://repository.clarin.lv/repository/xmlui/handle/20.500.12574/85
Explore at:
Dataset updated
May 30, 2023
Authors
Viesturs Jūlijs Lasmanis; Normunds Grūzītis
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
The CSV dataset contains sentence pairs for a text-to-text transformation task: given a sentence that contains 0..n abbreviations, rewrite (normalize) the sentence in full words (word forms).

Training dataset: 64,665 sentence pairs Validation dataset: 7,185 sentence pairs. Testing dataset: 7,984 sentence pairs.

All sentences are extracted from a public web corpus (https://korpuss.lv/id/Tīmeklis2020) and contain at least one medical term.
d
Residential Existing Homes (One to Four Units) Energy Efficiency Meter...
catalog.data.gov
datasets.ai
+2more
Updated Jul 26, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
data.ny.gov (2025). Residential Existing Homes (One to Four Units) Energy Efficiency Meter Evaluated Project Data: 2007 – 2012 [Dataset]. https://catalog.data.gov/dataset/residential-existing-homes-one-to-four-units-energy-efficiency-meter-evaluated-projec-2007
Explore at:
Dataset updated
Jul 26, 2025
Dataset provided by
data.ny.gov
Description
IMPORTANT! PLEASE READ DISCLAIMER BEFORE USING DATA. This dataset backcasts estimated modeled savings for a subset of 2007-2012 completed projects in the Home Performance with ENERGY STAR® Program against normalized savings calculated by an open source energy efficiency meter available at https://www.openee.io/. Open source code uses utility-grade metered consumption to weather-normalize the pre- and post-consumption data using standard methods with no discretionary independent variables. The open source energy efficiency meter allows private companies, utilities, and regulators to calculate energy savings from energy efficiency retrofits with increased confidence and replicability of results. This dataset is intended to lay a foundation for future innovation and deployment of the open source energy efficiency meter across the residential energy sector, and to help inform stakeholders interested in pay for performance programs, where providers are paid for realizing measurable weather-normalized results. To download the open source code, please visit the website at https://github.com/openeemeter/eemeter/releases D I S C L A I M E R: Normalized Savings using open source OEE meter. Several data elements, including, Evaluated Annual Elecric Savings (kWh), Evaluated Annual Gas Savings (MMBtu), Pre-retrofit Baseline Electric (kWh), Pre-retrofit Baseline Gas (MMBtu), Post-retrofit Usage Electric (kWh), and Post-retrofit Usage Gas (MMBtu) are direct outputs from the open source OEE meter. Home Performance with ENERGY STAR® Estimated Savings. Several data elements, including, Estimated Annual kWh Savings, Estimated Annual MMBtu Savings, and Estimated First Year Energy Savings represent contractor-reported savings derived from energy modeling software calculations and not actual realized energy savings. The accuracy of the Estimated Annual kWh Savings and Estimated Annual MMBtu Savings for projects has been evaluated by an independent third party. The results of the Home Performance with ENERGY STAR impact analysis indicate that, on average, actual savings amount to 35 percent of the Estimated Annual kWh Savings and 65 percent of the Estimated Annual MMBtu Savings. For more information, please refer to the Evaluation Report published on NYSERDA’s website at: http://www.nyserda.ny.gov/-/media/Files/Publications/PPSER/Program-Evaluation/2012ContractorReports/2012-HPwES-Impact-Report-with-Appendices.pdf. This dataset includes the following data points for a subset of projects completed in 2007-2012: Contractor ID, Project County, Project City, Project ZIP, Climate Zone, Weather Station, Weather Station-Normalization, Project Completion Date, Customer Type, Size of Home, Volume of Home, Number of Units, Year Home Built, Total Project Cost, Contractor Incentive, Total Incentives, Amount Financed through Program, Estimated Annual kWh Savings, Estimated Annual MMBtu Savings, Estimated First Year Energy Savings, Evaluated Annual Electric Savings (kWh), Evaluated Annual Gas Savings (MMBtu), Pre-retrofit Baseline Electric (kWh), Pre-retrofit Baseline Gas (MMBtu), Post-retrofit Usage Electric (kWh), Post-retrofit Usage Gas (MMBtu), Central Hudson, Consolidated Edison, LIPA, National Grid, National Fuel Gas, New York State Electric and Gas, Orange and Rockland, Rochester Gas and Electric. How does your organization use this dataset? What other NYSERDA or energy-related datasets would you like to see on Open NY? Let us know by emailing OpenNY@nyserda.ny.gov.
f
Data from: proteiNorm – A User-Friendly Tool for Normalization and Analysis...
datasetcatalog.nlm.nih.gov
Updated Sep 30, 2020
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Byrd, Alicia K; Zafar, Maroof K; Graw, Stefan; Tang, Jillian; Byrum, Stephanie D; Peterson, Eric C.; Bolden, Chris (2020). proteiNorm – A User-Friendly Tool for Normalization and Analysis of TMT and Label-Free Protein Quantification [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000568582
Explore at:
Dataset updated
Sep 30, 2020
Authors
Byrd, Alicia K; Zafar, Maroof K; Graw, Stefan; Tang, Jillian; Byrum, Stephanie D; Peterson, Eric C.; Bolden, Chris
Description
The technological advances in mass spectrometry allow us to collect more comprehensive data with higher quality and increasing speed. With the rapidly increasing amount of data generated, the need for streamlining analyses becomes more apparent. Proteomics data is known to be often affected by systemic bias from unknown sources, and failing to adequately normalize the data can lead to erroneous conclusions. To allow researchers to easily evaluate and compare different normalization methods via a user-friendly interface, we have developed “proteiNorm”. The current implementation of proteiNorm accommodates preliminary filters on peptide and sample levels followed by an evaluation of several popular normalization methods and visualization of the missing value. The user then selects an adequate normalization method and one of the several imputation methods used for the subsequent comparison of different differential expression methods and estimation of statistical power. The application of proteiNorm and interpretation of its results are demonstrated on two tandem mass tag multiplex (TMT6plex and TMT10plex) and one label-free spike-in mass spectrometry example data set. The three data sets reveal how the normalization methods perform differently on different experimental designs and the need for evaluation of normalization methods for each mass spectrometry experiment. With proteiNorm, we provide a user-friendly tool to identify an adequate normalization method and to select an appropriate method for differential expression analysis.
h
dhivehi-sentences-extended
huggingface.co
Updated May 11, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
rusputin (2025). dhivehi-sentences-extended [Dataset]. https://huggingface.co/datasets/alakxender/dhivehi-sentences-extended
Explore at:
Dataset updated
May 11, 2025
Authors
rusputin
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Dhivehi Sentences Dataset

This repository contains a dataset of Dhivehi (Thaana, Maldivian) sentences processed and normalized for natural language processing tasks. This is an extended version of the Dhivehi Sentences Dataset

Dataset Description

The dataset combines text from multiple sources:

Random News Articles Glot500 Dhivehi-Thaana FineWeb-2 Dhivehi-Thaana

The sentences have been processed to:

Split into individual sentences Normalize numbers into Dhivehi text… See the full description on the dataset page: https://huggingface.co/datasets/alakxender/dhivehi-sentences-extended.
Naturalistic Neuroimaging Database
openneuro.org
Updated Apr 20, 2021
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sarah Aliko; Jiawen Huang; Florin Gheorghiu; Stefanie Meliss; Jeremy I Skipper (2021). Naturalistic Neuroimaging Database [Dataset]. http://doi.org/10.18112/openneuro.ds002837.v1.1.3
Explore at:
Unique identifier
https://doi.org/10.18112/openneuro.ds002837.v1.1.3
Dataset updated
Apr 20, 2021
Dataset provided by
OpenNeurohttps://openneuro.org/
Authors
Sarah Aliko; Jiawen Huang; Florin Gheorghiu; Stefanie Meliss; Jeremy I Skipper
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Overview

The Naturalistic Neuroimaging Database (NNDb v2.0) contains datasets from 86 human participants doing the NIH Toolbox and then watching one of 10 full-length movies during functional magnetic resonance imaging (fMRI).The participants were all right-handed, native English speakers, with no history of neurological/psychiatric illnesses, with no hearing impairments, unimpaired or corrected vision and taking no medication. Each movie was stopped in 40-50 minute intervals or when participants asked for a break, resulting in 2-6 runs of BOLD-fMRI. A 10 minute high-resolution defaced T1-weighted anatomical MRI scan (MPRAGE) is also provided.

The NNDb V2.0 is now on Neuroscout, a platform for fast and flexible re-analysis of (naturalistic) fMRI studies. See: https://neuroscout.org/

v2.0 Changes

Overview

We have replaced our own preprocessing pipeline with that implemented in AFNI’s afni_proc.py, thus changing only the derivative files. This introduces a fix for an issue with our normalization (i.e., scaling) step and modernizes and standardizes the preprocessing applied to the NNDb derivative files. We have done a bit of testing and have found that results in both pipelines are quite similar in terms of the resulting spatial patterns of activity but with the benefit that the afni_proc.py results are 'cleaner' and statistically more robust.

Normalization

Emily Finn and Clare Grall at Dartmouth and Rick Reynolds and Paul Taylor at AFNI, discovered and showed us that the normalization procedure we used for the derivative files was less than ideal for timeseries runs of varying lengths. Specifically, the 3dDetrend flag -normalize makes 'the sum-of-squares equal to 1'. We had not thought through that an implication of this is that the resulting normalized timeseries amplitudes will be affected by run length, increasing as run length decreases (and maybe this should go in 3dDetrend’s help text). To demonstrate this, I wrote a version of 3dDetrend’s -normalize for R so you can see for yourselves by running the following code:

# Generate a resting state (rs) timeseries (ts) # Install / load package to make fake fMRI ts # install.packages("neuRosim") library(neuRosim) # Generate a ts ts.rs <- simTSrestingstate(nscan=2000, TR=1, SNR=1) # 3dDetrend -normalize # R command version for 3dDetrend -normalize -polort 0 which normalizes by making "the sum-of-squares equal to 1" # Do for the full timeseries ts.normalised.long <- (ts.rs-mean(ts.rs))/sqrt(sum((ts.rs-mean(ts.rs))^2)); # Do this again for a shorter version of the same timeseries ts.shorter.length <- length(ts.normalised.long)/4 ts.normalised.short <- (ts.rs[1:ts.shorter.length]- mean(ts.rs[1:ts.shorter.length]))/sqrt(sum((ts.rs[1:ts.shorter.length]- mean(ts.rs[1:ts.shorter.length]))^2)); # By looking at the summaries, it can be seen that the median values become larger summary(ts.normalised.long) summary(ts.normalised.short) # Plot results for the long and short ts # Truncate the longer ts for plotting only ts.normalised.long.made.shorter <- ts.normalised.long[1:ts.shorter.length] # Give the plot a title title <- "3dDetrend -normalize for long (blue) and short (red) timeseries"; plot(x=0, y=0, main=title, xlab="", ylab="", xaxs='i', xlim=c(1,length(ts.normalised.short)), ylim=c(min(ts.normalised.short),max(ts.normalised.short))); # Add zero line lines(x=c(-1,ts.shorter.length), y=rep(0,2), col='grey'); # 3dDetrend -normalize -polort 0 for long timeseries lines(ts.normalised.long.made.shorter, col='blue'); # 3dDetrend -normalize -polort 0 for short timeseries lines(ts.normalised.short, col='red');

Standardization/modernization

The above individuals also encouraged us to implement the afni_proc.py script over our own pipeline. It introduces at least three additional improvements: First, we now use Bob’s @SSwarper to align our anatomical files with an MNI template (now MNI152_2009_template_SSW.nii.gz) and this, in turn, integrates nicely into the afni_proc.py pipeline. This seems to result in a generally better or more consistent alignment, though this is only a qualitative observation. Second, all the transformations / interpolations and detrending are now done in fewers steps compared to our pipeline. This is preferable because, e.g., there is less chance of inadvertently reintroducing noise back into the timeseries (see Lindquist, Geuter, Wager, & Caffo 2019). Finally, many groups are advocating using tools like fMRIPrep or afni_proc.py to increase standardization of analyses practices in our neuroimaging community. This presumably results in less error, less heterogeneity and more interpretability of results across studies. Along these lines, the quality control (‘QC’) html pages generated by afni_proc.py are a real help in assessing data quality and almost a joy to use.

New afni_proc.py command line

The following is the afni_proc.py command line that we used to generate blurred and censored timeseries files. The afni_proc.py tool comes with extensive help and examples. As such, you can quickly understand our preprocessing decisions by scrutinising the below. Specifically, the following command is most similar to Example 11 for ‘Resting state analysis’ in the help file (see https://afni.nimh.nih.gov/pub/dist/doc/program_help/afni_proc.py.html): afni_proc.py \ -subj_id "$sub_id_name_1" \ -blocks despike tshift align tlrc volreg mask blur scale regress \ -radial_correlate_blocks tcat volreg \ -copy_anat anatomical_warped/anatSS.1.nii.gz \ -anat_has_skull no \ -anat_follower anat_w_skull anat anatomical_warped/anatU.1.nii.gz \ -anat_follower_ROI aaseg anat freesurfer/SUMA/aparc.a2009s+aseg.nii.gz \ -anat_follower_ROI aeseg epi freesurfer/SUMA/aparc.a2009s+aseg.nii.gz \ -anat_follower_ROI fsvent epi freesurfer/SUMA/fs_ap_latvent.nii.gz \ -anat_follower_ROI fswm epi freesurfer/SUMA/fs_ap_wm.nii.gz \ -anat_follower_ROI fsgm epi freesurfer/SUMA/fs_ap_gm.nii.gz \ -anat_follower_erode fsvent fswm \ -dsets media_?.nii.gz \ -tcat_remove_first_trs 8 \ -tshift_opts_ts -tpattern alt+z2 \ -align_opts_aea -cost lpc+ZZ -giant_move -check_flip \ -tlrc_base "$basedset" \ -tlrc_NL_warp \ -tlrc_NL_warped_dsets \ anatomical_warped/anatQQ.1.nii.gz \ anatomical_warped/anatQQ.1.aff12.1D \ anatomical_warped/anatQQ.1_WARP.nii.gz \ -volreg_align_to MIN_OUTLIER \ -volreg_post_vr_allin yes \ -volreg_pvra_base_index MIN_OUTLIER \ -volreg_align_e2a \ -volreg_tlrc_warp \ -mask_opts_automask -clfrac 0.10 \ -mask_epi_anat yes \ -blur_to_fwhm -blur_size $blur \ -regress_motion_per_run \ -regress_ROI_PC fsvent 3 \ -regress_ROI_PC_per_run fsvent \ -regress_make_corr_vols aeseg fsvent \ -regress_anaticor_fast \ -regress_anaticor_label fswm \ -regress_censor_motion 0.3 \ -regress_censor_outliers 0.1 \ -regress_apply_mot_types demean deriv \ -regress_est_blur_epits \ -regress_est_blur_errts \ -regress_run_clustsim no \ -regress_polort 2 \ -regress_bandpass 0.01 1 \ -html_review_style pythonic We used similar command lines to generate ‘blurred and not censored’ and the ‘not blurred and not censored’ timeseries files (described more fully below). We will provide the code used to make all derivative files available on our github site (https://github.com/lab-lab/nndb).

We made one choice above that is different enough from our original pipeline that it is worth mentioning here. Specifically, we have quite long runs, with the average being ~40 minutes but this number can be variable (thus leading to the above issue with 3dDetrend’s -normalise). A discussion on the AFNI message board with one of our team (starting here, https://afni.nimh.nih.gov/afni/community/board/read.php?1,165243,165256#msg-165256), led to the suggestion that '-regress_polort 2' with '-regress_bandpass 0.01 1' be used for long runs. We had previously used only a variable polort with the suggested 1 + int(D/150) approach. Our new polort 2 + bandpass approach has the added benefit of working well with afni_proc.py.

Which timeseries file you use is up to you but I have been encouraged by Rick and Paul to include a sort of PSA about this. In Paul’s own words: * Blurred data should not be used for ROI-based analyses (and potentially not for ICA? I am not certain about standard practice). * Unblurred data for ISC might be pretty noisy for voxelwise analyses, since blurring should effectively boost the SNR of active regions (and even good alignment won't be perfect everywhere). * For uncensored data, one should be concerned about motion effects being left in the data (e.g., spikes in the data). * For censored data: * Performing ISC requires the users to unionize the censoring patterns during the correlation calculation. * If wanting to calculate power spectra or spectral parameters like ALFF/fALFF/RSFA etc. (which some people might do for naturalistic tasks still), then standard FT-based methods can't be used because sampling is no longer uniform. Instead, people could use something like 3dLombScargle+3dAmpToRSFC, which calculates power spectra (and RSFC params) based on a generalization of the FT that can handle non-uniform sampling, as long as the censoring pattern is mostly random and, say, only up to about 10-15% of the data. In sum, think very carefully about which files you use. If you find you need a file we have not provided, we can happily generate different versions of the timeseries upon request and can generally do so in a week or less.

Effect on results

From numerous tests on our own analyses, we have qualitatively found that results using our old vs the new afni_proc.py preprocessing pipeline do not change all that much in terms of general spatial patterns. There is, however, an
CYGNSS Level 1 Science Data Record Version 2.1 - Dataset - NASA Open Data...
data.nasa.gov
Updated Apr 1, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
nasa.gov (2025). CYGNSS Level 1 Science Data Record Version 2.1 - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/cygnss-level-1-science-data-record-version-2-1-c4d25
Explore at:
Dataset updated
Apr 1, 2025
Dataset provided by
NASAhttp://nasa.gov/
Description
This Level 1 (L1) dataset contains the Version 2.1 geo-located Delay Doppler Maps (DDMs) calibrated into Power Received (Watts) and Bistatic Radar Cross Section (BRCS) expressed in units of meters squared from the Delay Doppler Mapping Instrument aboard the CYGNSS satellite constellation. This version supersedes Version 2.0. Other useful scientific and engineering measurement parameters include the DDM of Normalized Bistatic Radar Cross Section (NBRCS), the Delay Doppler Map Average (DDMA) of the NBRCS near the specular reflection point, and the Leading Edge Slope (LES) of the integrated delay waveform. The L1 dataset contains a number of other engineering and science measurement parameters, including sets of quality flags/indicators, error estimates, and bias estimates as well as a variety of orbital, spacecraft/sensor health, timekeeping, and geolocation parameters. At most, 8 netCDF data files (each file corresponding to a unique spacecraft in the CYGNSS constellation) are provided each day; under nominal conditions, there are typically 6-8 spacecraft retrieving data each day, but this can be maximized to 8 spacecraft under special circumstances in which higher than normal retrieval frequency is needed (i.e., during tropical storms and or hurricanes). Latency is approximately 6 days (or better) from the last recorded measurement time. The Version 2.1 release represents the second science-quality release. Here is a summary of improvements that reflect the quality of the Version 2.1 data release: 1) data is now available when the CYGNSS satellites are rolled away from nadir during orbital high beta-angle periods, resulting in a significant amount of additional data; 2) correction to coordinate frames result in more accurate estimates of receiver antenna gain at the specular point; 3) improved calibration for analog-to-digital conversion results in better consistency between CYGNSS satellites measurements at nearly the same location and time; 4) improved GPS EIRP and transmit antenna pattern calibration results in significantly reduced PRN-dependence in the observables; 5) improved estimation of the location of the specular point within the DDM; 6) an altitude-dependent scattering area is used to normalize the scattering cross section (v2.0 used a simpler scattering area model that varied with incidence and azimuth angles but not altitude); 7) corrections added for noise floor-dependent biases in scattering cross section and leading edge slope of delay waveform observed in the v2.0 data. Users should also note that the receiver antenna pattern calibration is not applied per-DDM-bin in this v2.1 release.
US State populations - 2018
kaggle.com
zip
Updated May 29, 2018
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Vikas (2018). US State populations - 2018 [Dataset]. https://www.kaggle.com/lucasvictor/us-state-populations-2018
Explore at:
zip(805 bytes)Available download formats
Dataset updated
May 29, 2018
Authors
Vikas
Area covered
United States
Description
Context

While working on the gun violence data set, i wanted to normalize the number of incidents because some states are more populous than others so normalizing the gun incidents per million people gave me a different outlook towards the data. The source of this data is unofficial as the last numbers from US census bureau were available only from 2010. I just wanted to get a quick unofficial source of this data and stumbled upon this site

http://worldpopulationreview.com/states/

Content

Simple two columns - state and population as of 2018

Acknowledgements

http://worldpopulationreview.com/states/

Inspiration

Your data will be in front of the world's largest data science community. What questions do you want to see answered?
f
Per-fold number of features on Álvez dataset, weighted and unweighted...
plos.figshare.com
xlsx
Updated Mar 26, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin (2025). Per-fold number of features on Álvez dataset, weighted and unweighted models. [Dataset]. http://doi.org/10.1371/journal.pdig.0000780.s010
Explore at:
xlsxAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pdig.0000780.s010
Dataset updated
Mar 26, 2025
Dataset provided by
PLOS Digital Health
Authors
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Per-fold number of features on Álvez dataset, weighted and unweighted models.
f
Per-fold number of features on RAPIDS dataset, weighted and unweighted...
plos.figshare.com
xlsx
Updated Mar 26, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin (2025). Per-fold number of features on RAPIDS dataset, weighted and unweighted models. [Dataset]. http://doi.org/10.1371/journal.pdig.0000780.s008
Explore at:
xlsxAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pdig.0000780.s008
Dataset updated
Mar 26, 2025
Dataset provided by
PLOS Digital Health
Authors
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Per-fold number of features on RAPIDS dataset, weighted and unweighted models.
Identification of Novel Reference Genes Suitable for qRT-PCR Normalization...
plos.figshare.com
tiff
Updated May 31, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Yu Hu; Shuying Xie; Jihua Yao (2023). Identification of Novel Reference Genes Suitable for qRT-PCR Normalization with Respect to the Zebrafish Developmental Stage [Dataset]. http://doi.org/10.1371/journal.pone.0149277
Explore at:
tiffAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0149277
Dataset updated
May 31, 2023
Dataset provided by
PLOShttp://plos.org/
Authors
Yu Hu; Shuying Xie; Jihua Yao
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Reference genes used in normalizing qRT-PCR data are critical for the accuracy of gene expression analysis. However, many traditional reference genes used in zebrafish early development are not appropriate because of their variable expression levels during embryogenesis. In the present study, we used our previous RNA-Seq dataset to identify novel reference genes suitable for gene expression analysis during zebrafish early developmental stages. We first selected 197 most stably expressed genes from an RNA-Seq dataset (29,291 genes in total), according to the ratio of their maximum to minimum RPKM values. Among the 197 genes, 4 genes with moderate expression levels and the least variation throughout 9 developmental stages were identified as candidate reference genes. Using four independent statistical algorithms (delta-CT, geNorm, BestKeeper and NormFinder), the stability of qRT-PCR expression of these candidates was then evaluated and compared to that of actb1 and actb2, two commonly used zebrafish reference genes. Stability rankings showed that two genes, namely mobk13 (mob4) and lsm12b, were more stable than actb1 and actb2 in most cases. To further test the suitability of mobk13 and lsm12b as novel reference genes, they were used to normalize three well-studied target genes. The results showed that mobk13 and lsm12b were more suitable than actb1 and actb2 with respect to zebrafish early development. We recommend mobk13 and lsm12b as new optimal reference genes for zebrafish qRT-PCR analysis during embryogenesis and early larval stages.
Predictive Validity Data Set
figshare.com
txt
Updated Dec 18, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Antonio Abeyta (2022). Predictive Validity Data Set [Dataset]. http://doi.org/10.6084/m9.figshare.17030021.v1
Explore at:
txtAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.17030021.v1
Dataset updated
Dec 18, 2022
Dataset provided by
Figsharehttp://figshare.com/
Authors
Antonio Abeyta
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Verbal and Quantitative Reasoning GRE scores and percentiles were collected by querying the student database for the appropriate information. Any student records that were missing data such as GRE scores or grade point average were removed from the study before the data were analyzed. The GRE Scores of entering doctoral students from 2007-2012 were collected and analyzed. A total of 528 student records were reviewed. Ninety-six records were removed from the data because of a lack of GRE scores. Thirty-nine of these records belonged to MD/PhD applicants who were not required to take the GRE to be reviewed for admission. Fifty-seven more records were removed because they did not have an admissions committee score in the database. After 2011, the GRE’s scoring system was changed from a scale of 200-800 points per section to 130-170 points per section. As a result, 12 more records were removed because their scores were representative of the new scoring system and therefore were not able to be compared to the older scores based on raw score. After removal of these 96 records from our analyses, a total of 420 student records remained which included students that were currently enrolled, left the doctoral program without a degree, or left the doctoral program with an MS degree. To maintain consistency in the participants, we removed 100 additional records so that our analyses only considered students that had graduated with a doctoral degree. In addition, thirty-nine admissions scores were identified as outliers by statistical analysis software and removed for a final data set of 286 (see Outliers below). Outliers We used the automated ROUT method included in the PRISM software to test the data for the presence of outliers which could skew our data. The false discovery rate for outlier detection (Q) was set to 1%. After removing the 96 students without a GRE score, 432 students were reviewed for the presence of outliers. ROUT detected 39 outliers that were removed before statistical analysis was performed. Sample See detailed description in the Participants section. Linear regression analysis was used to examine potential trends between GRE scores, GRE percentiles, normalized admissions scores or GPA and outcomes between selected student groups. The D’Agostino & Pearson omnibus and Shapiro-Wilk normality tests were used to test for normality regarding outcomes in the sample. The Pearson correlation coefficient was calculated to determine the relationship between GRE scores, GRE percentiles, admissions scores or GPA (undergraduate and graduate) and time to degree. Candidacy exam results were divided into students who either passed or failed the exam. A Mann-Whitney test was then used to test for statistically significant differences between mean GRE scores, percentiles, and undergraduate GPA and candidacy exam results. Other variables were also observed such as gender, race, ethnicity, and citizenship status within the samples. Predictive Metrics. The input variables used in this study were GPA and scores and percentiles of applicants on both the Quantitative and Verbal Reasoning GRE sections. GRE scores and percentiles were examined to normalize variances that could occur between tests. Performance Metrics. The output variables used in the statistical analyses of each data set were either the amount of time it took for each student to earn their doctoral degree, or the student’s candidacy examination result.
car_number_segment
kaggle.com
zip
Updated Jul 6, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Egor Andreasyan (2024). car_number_segment [Dataset]. https://www.kaggle.com/datasets/egorandreasyan/car-number-segment/versions/2
Explore at:
zip(9895621 bytes)Available download formats
Dataset updated
Jul 6, 2024
Authors
Egor Andreasyan
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
This dataset was marked by the CLIO team as part of the “Digital Breakthrough Season: Artificial Intelligence” hackathon.

By case Alignment of license plate images The customer of the case is Beeline.

Link to our team solution: https://github.com/NSO-Clio/Normalize-cars-numbers

Этот датасет был подготовлен командой CLIO в рамках хакатона «Цифровой прорыв. Сезон: Искусственный интеллект»

Кейс: Выравнивание изображений номерных знаков автомобилей Заказчик кейса: Beeline.

Ссылка на решение нашей команды: https://github.com/NSO-Clio/Normalize-cars-numbers
Data for A Systemic Framework for Assessing the Risk of Decarbonization to...
zenodo.org
txt
Updated Sep 18, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Soheil Shayegh; Soheil Shayegh; Giorgia Coppola; Giorgia Coppola (2025). Data for A Systemic Framework for Assessing the Risk of Decarbonization to Regional Manufacturing Activities in the European Union [Dataset]. http://doi.org/10.5281/zenodo.17152310
Explore at:
txtAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.17152310
Dataset updated
Sep 18, 2025
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Soheil Shayegh; Soheil Shayegh; Giorgia Coppola; Giorgia Coppola
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Sep 18, 2025
Area covered
European Union
Description
README — Code and data
Project: LOCALISED

Work Package 7, Task 7.1

Paper: A Systemic Framework for Assessing the Risk of Decarbonization to Regional Manufacturing Activities in the European Union

What this repo does
-------------------
Builds the Transition‑Risk Index (TRI) for EU manufacturing at NUTS‑2 × NACE Rev.2, and reproduces the article’s Figures 3–6:
• Exposure (emissions by region/sector)
• Vulnerability (composite index)
• Risk = Exposure ⊗ Vulnerability
Outputs include intermediate tables, the final analysis dataset, and publication figures.

Folder of interest
------------------
Code and data/
├─ Code/ # R scripts (run in order 1A → 5)
│ └─ Create Initial Data/ # scripts to (re)build Initial data/ from Eurostat API with imputation
├─ Initial data/ # Eurostat inputs imputed for missing values
├─ Derived data/ # intermediates
├─ Final data/ # final analysis-ready tables
└─ Figures/ # exported figures

Quick start
-----------
1) Open R (or RStudio) and set the working directory to “Code and data/Code”.
Example: setwd(".../Code and data/Code")
2) Initial data/ contains the required Eurostat inputs referenced by the scripts.
To reproduce the inputs in Initial data/, run the scripts in Code/Create Initial Data/.
These scripts download the required datasets from the respective API and impute missing values; outputs are written to ../Initial data/.
3) Run scripts sequentially (they use relative paths to ../Raw data, ../Derived data, etc.):
1A-non-sector-data.R → 1B-sector-data.R → 1C-all-data.R → 2-reshape-data.R → 3-normalize-data-by-n-enterpr.R → 4-risk-aggregation.R → 5A-results-maps.R, 5B-results-radar.R

What each script does
---------------------
Create Initial Data — Recreate inputs
• Download source tables from the Eurostat API or the Localised DSP, apply light cleaning, and impute missing values.
• Write the resulting inputs to Initial data/ for the analysis pipeline.

1A / 1B / 1C — Build the unified base
• Read individual Eurostat datasets (some sectoral, some only regional).
• Harmonize, aggregate, and align them into a single analysis-ready schema.
• Write aggregated outputs to Derived data/ (and/or Final data/ as needed).

2 — Reshape and enrich
• Reshapes the combined data and adds metadata.
• Output: Derived data/2_All_data_long_READY.xlsx (all raw indicators in tidy long format, with indicator names and values).

3 — Normalize (enterprises & min–max)
• Divide selected indicators by number of enterprises.
• Apply min–max normalization to [0.01, 0.99].
• Exposure keeps real zeros (zeros remain zero).
• Write normalized tables to Derived data/ or Final data/.

4 — Aggregate indices
• Vulnerability: build dimension scores (Energy, Labour, Finance, Supply Chain, Technology).
– Within each dimension: equal‑weight mean of directionally aligned, [0.01,0.99]‑scaled indicators.
– Dimension scores are re‑scaled to [0.01,0.99].
• Aggregate Vulnerability: equal‑weight mean of the five dimensions.
• TRI (Risk): combine Exposure (E) and Vulnerability (V) via a weighted geometric rule with α = 0.5 in the baseline.
– Policy‑intuitive properties: high E & high V → high risk; imbalances penalized (non‑compensatory).
• Output: Final data/ (main analysis tables).

5A / 5B — Visualize results
• 5A: maps and distribution plots for Exposure, Vulnerability, and Risk → Figures 3 & 4.
• 5B: comparative/radar profiles for selected countries/regions/subsectors → Figures 5 & 6.
• Outputs saved to Figures/.

Data flow (at a glance)
-----------------------
Initial data → (1A–1C) Aggregated base → (2) Tidy long file → (3) Normalized indicators → (4) Composite indices → (5) Figures
| | |
v v v
Derived data/ 2_All_data_long_READY.xlsx Final data/ & Figures/

Assumptions & conventions
-------------------------
• Geography: EU NUTS‑2 regions; Sector: NACE Rev.2 manufacturing subsectors.
• Equal weights by default where no evidence supports alternatives.
• All indicators directionally aligned so that higher = greater transition difficulty.
• Relative paths assume working directory = Code/.

Reproducing the article
-----------------------
• Optionally run the codes from the Code/Create Initial Data subfolder
• Run 1A → 5B without interruption to regenerate:
– Figure 3: Exposure, Vulnerability, Risk maps (total manufacturing).
– Figure 4: Vulnerability dimensions (Energy, Labour, Finance, Supply Chain, Technology).
– Figure 5: Drivers of risk—highest vs. lowest risk regions (example: Germany & Greece).
– Figure 6: Subsector case (e.g., basic metals) by selected regions.
• Final tables for the paper live in Final data/. Figures export to Figures/.

Requirements
------------
• R (version per your environment).
• Install any missing packages listed at the top of each script (e.g., install.packages("...")).

Troubleshooting
---------------
• “File not found”: check that the previous script finished and wrote its outputs to the expected folder.
• Paths: confirm getwd() ends with /Code so relative paths resolve to ../Raw data, ../Derived data, etc.
• Reruns: optionally clear Derived data/, Final data/, and Figures/ before a clean rebuild.

Provenance & citation
---------------------
• Inputs: Eurostat and related sources cited in the paper and headers of the scripts.
• Methods: OECD composite‑indicator guidance; IPCC AR6 risk framing (see paper references).
• If you use this code, please cite the article:
A Systemic Framework for Assessing the Risk of Decarbonization to Regional Manufacturing Activities in the European Union.
f
Best regions for predicting False Negatives (FN, i.e. no prediction of...
figshare.com
csv
Updated Oct 21, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Mélanie Garcia; Clare Kelly (2024). Best regions for predicting False Negatives (FN, i.e. no prediction of Autism whereas diagnosed Autism). [Dataset]. http://doi.org/10.1371/journal.pone.0276832.s009
Explore at:
csvAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0276832.s009
Dataset updated
Oct 21, 2024
Dataset provided by
PLOS ONE
Authors
Mélanie Garcia; Clare Kelly
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Each row is for one region, each column is for one model and one combination of datasets considered (training+validation+testing 1 sets (no comorbidity), or all these sets + testing set 2 (containing subjects with comorbidities)), each case returns the number of datasets where the region was important for predicting TN for the model considered. (CSV)
f
Per-fold number of features selected by all models on binary datasets.
plos.figshare.com
xlsx
Updated Mar 26, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin (2025). Per-fold number of features selected by all models on binary datasets. [Dataset]. http://doi.org/10.1371/journal.pdig.0000780.s007
Explore at:
xlsxAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pdig.0000780.s007
Dataset updated
Mar 26, 2025
Dataset provided by
PLOS Digital Health
Authors
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Per-fold number of features selected by all models on binary datasets.
ModelSC 1: Capacity of the different models to normalize for the varying...
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated Jun 1, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sylvie Troncale; Aurélie Barbet; Lamine Coulibaly; Emilie Henry; Beilei He; Emmanuel Barillot; Thierry Dubois; Philippe Hupé; Leanne de Koning (2023). ModelSC 1: Capacity of the different models to normalize for the varying amounts of total protein spotted. [Dataset]. http://doi.org/10.1371/journal.pone.0038686.t001
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0038686.t001
Dataset updated
Jun 1, 2023
Dataset provided by
PLOShttp://plos.org/
Authors
Sylvie Troncale; Aurélie Barbet; Lamine Coulibaly; Emilie Henry; Beilei He; Emmanuel Barillot; Thierry Dubois; Philippe Hupé; Leanne de Koning
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Represented are the p-values of the amount effects without neither nor (), with (), with (), with and ().
f
Table_2_Comparison of Normalization Methods for Analysis of TempO-Seq...
frontiersin.figshare.com
datasetcatalog.nlm.nih.gov
+1more
xlsx
Updated Jun 2, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Pierre R. Bushel; Stephen S. Ferguson; Sreenivasa C. Ramaiahgari; Richard S. Paules; Scott S. Auerbach (2023). Table_2_Comparison of Normalization Methods for Analysis of TempO-Seq Targeted RNA Sequencing Data.xlsx [Dataset]. http://doi.org/10.3389/fgene.2020.00594.s002
Explore at:
xlsxAvailable download formats
Unique identifier
https://doi.org/10.3389/fgene.2020.00594.s002
Dataset updated
Jun 2, 2023
Dataset provided by
Frontiers
Authors
Pierre R. Bushel; Stephen S. Ferguson; Sreenivasa C. Ramaiahgari; Richard S. Paules; Scott S. Auerbach
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of bulk RNA sequencing (RNA-Seq) data is a valuable tool to understand transcription at the genome scale. Targeted sequencing of RNA has emerged as a practical means of assessing the majority of the transcriptomic space with less reliance on large resources for consumables and bioinformatics. TempO-Seq is a templated, multiplexed RNA-Seq platform that interrogates a panel of sentinel genes representative of genome-wide transcription. Nuances of the technology require proper preprocessing of the data. Various methods have been proposed and compared for normalizing bulk RNA-Seq data, but there has been little to no investigation of how the methods perform on TempO-Seq data. We simulated count data into two groups (treated vs. untreated) at seven-fold change (FC) levels (including no change) using control samples from human HepaRG cells run on TempO-Seq and normalized the data using seven normalization methods. Upper Quartile (UQ) performed the best with regard to maintaining FC levels as detected by a limma contrast between treated vs. untreated groups. For all FC levels, specificity of the UQ normalization was greater than 0.84 and sensitivity greater than 0.90 except for the no change and +1.5 levels. Furthermore, K-means clustering of the simulated genes normalized by UQ agreed the most with the FC assignments [adjusted Rand index (ARI) = 0.67]. Despite having an assumption of the majority of genes being unchanged, the DESeq2 scaling factors normalization method performed reasonably well as did simple normalization procedures counts per million (CPM) and total counts (TCs). These results suggest that for two class comparisons of TempO-Seq data, UQ, CPM, TC, or DESeq2 normalization should provide reasonably reliable results at absolute FC levels ≥2.0. These findings will help guide researchers to normalize TempO-Seq gene expression data for more reliable results.

toll register (probably Gemona, 1426-1427)

zenodo.org

Updated Sep 27, 2023

Facebook

Twitter

Click to copy link

Link copied

Cite

Tommaso Vidal; Tommaso Vidal (2023). toll register (probably Gemona, 1426-1427) [Dataset]. http://doi.org/10.5281/zenodo.8060450

Explore at:

Unique identifier

https://doi.org/10.5281/zenodo.8060450

Dataset updated

Sep 27, 2023

Dataset provided by

Zenodohttp://zenodo.org/

Authors

Tommaso Vidal; Tommaso Vidal

License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Area covered

Gemona, Gemona del Friuli

Description

The following dataset is based on the register of a toll levied on 'German' merchants possibly in the friulian town of Gemona from October 1426 to September 1427 (the register is preserved in State Archive of Udine, Colloredo Mels, part. I, b. 5, register "MIIIIXXVI todeschi"). The research has been produced within the framework of the 2017 PRIN "LOC-GLOB. The local connectivity in an age of global intensification: infrastructural networks, production and trading areas in late-medieval Italy (1280-1500)".

The register is divided into accounts belonging to individual ‘German’ merchants (further information on the notion of ‘German’ in this context can be found in the paper T. Vidal, Fiscality and Infrastructures). Under each account the scribe recorded in chronological order all the commodities of each merchant as well as the carters and carriers.

The original language of the register is Friulian-venetian vernacular. The entrances in the original are structured as in the following example (f. 3r):

	Lenart di Tarvise
October
XII	Andrege di Pultebe	fero I^MIIII^C
XIIII	Yori di Dogna	fero I^M
XV	Stefin di Chuviza	fero I^MII^C
XVIIII^or	Petri di Chuviza	fero I^MV^C

Structure of the dataset

1) id.: progressive id of the items of the dataset.

2) account id.: progressive id given to each account as it appears in the register. Individual merchants can hold multiple accounts. I have chosen to keep the progressive id order rather than attributing the id according to the merchant to preserve the original structure of the document.

3-5) year, month, day.

6) merchant: name of the merchant holding each account. Since most of the names are a rather rough transliteration of German names in Friulian-venetian vernacular I have chosen not to normalize them to modern forms.

7) merchant origin: declared origin of the merchant. As for (6) many names appear to be tentative transliterations of the scribe. For all the cases were identification is evident the modern toponym is given; for all other cases I have kept the spelling of the manuscript.

8) merchant origin (normalized): proposed normalization and identification of non-normalized toponyms recorded under (7).

9) merchant jurisdiction: jurisdictional area of provenance of each merchant.

10) carrier: name of the carrier to whom the merchant had entrusted his commodities. See (6) for the choices regarding normalization.

11) carrier origin: declared origin of the carrier. See (7) for the choices regarding normalization.

12) carrier origin (normalized): proposed normalization and identification of non-normalized toponyms recorded under (11).

13) carrier jurisdiction: jurisdictional area of provenance of each carrier.

14) commodity: type of commodity declared. In all cases in which identification of the commodity as been impossible I have chosen to keep the origin spelling of the manuscript.

15) commodity unit: type of unit used to measure the commodity as it appears in the register. Given the wide variability of the equivalence between haulage (cart, soma, balla) and weight (libbre) I have chosen not to uniform the data. The libbra is the Friulian libbra grossa = 0.477 kg.

16) unit number: numerical value referred to (15).

17) folio: reference to the folio of the manuscript.

BiLSTM parameter setting.
plos.figshare.com
xls
Updated Mar 21, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Zainab Mansur; Nazlia Omar; Sabrina Tiun; Eissa M. Alshari (2024). BiLSTM parameter setting. [Dataset]. http://doi.org/10.1371/journal.pone.0299652.t006
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0299652.t006
Dataset updated
Mar 21, 2024
Dataset provided by
PLOShttp://plos.org/
Authors
Zainab Mansur; Nazlia Omar; Sabrina Tiun; Eissa M. Alshari
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
As social media booms, abusive online practices such as hate speech have unfortunately increased as well. As letters are often repeated in words used to construct social media messages, these types of words should be eliminated or reduced in number to enhance the efficacy of hate speech detection. Although multiple models have attempted to normalize out-of-vocabulary (OOV) words with repeated letters, they often fail to determine whether the in-vocabulary (IV) replacement words are correct or incorrect. Therefore, this study developed an improved model for normalizing OOV words with repeated letters by replacing them with correct in-vocabulary (IV) replacement words. The improved normalization model is an unsupervised method that does not require the use of a special dictionary or annotated data. It combines rule-based patterns of words with repeated letters and the SymSpell spelling correction algorithm to remove repeated letters within the words by multiple rules regarding the position of repeated letters in a word, be it at the beginning, middle, or end of the word and the repetition pattern. Two hate speech datasets were then used to assess performance. The proposed normalization model was able to decrease the percentage of OOV words to 8%. Its F1 score was also 9% and 13% higher than the models proposed by two extant studies. Therefore, the proposed normalization model performed better than the benchmark studies in replacing OOV words with the correct IV replacement and improved the performance of the detection model. As such, suitable rule-based patterns can be combined with spelling correction to develop a text normalization model to correctly replace words with repeated letters, which would, in turn, improve hate speech detection in texts.

Facebook

Twitter

Click to copy link

Link copied

Cite

Hemanth S (2022). Normalized Dataset [Dataset]. https://www.kaggle.com/datasets/hemanth012/normalized-dataset

Normalized Dataset

Explore at:

zip(1009250933 bytes)Available download formats

Dataset updated

Jun 15, 2022

Authors

Hemanth S

Description

Dataset

This dataset was created by Hemanth S

Clear search

Close search

Google apps

Main menu

Normalized Dataset

Dataset

Contents

Data from: LVMED: Dataset of Latvian text normalisation samples for the...

Residential Existing Homes (One to Four Units) Energy Efficiency Meter...

Data from: proteiNorm – A User-Friendly Tool for Normalization and Analysis...

dhivehi-sentences-extended

Naturalistic Neuroimaging Database

Overview

v2.0 Changes

CYGNSS Level 1 Science Data Record Version 2.1 - Dataset - NASA Open Data...

US State populations - 2018

Context

Content

Acknowledgements

Inspiration

Per-fold number of features on Álvez dataset, weighted and unweighted...

Per-fold number of features on RAPIDS dataset, weighted and unweighted...

Identification of Novel Reference Genes Suitable for qRT-PCR Normalization...

Predictive Validity Data Set

car_number_segment

Data for A Systemic Framework for Assessing the Risk of Decarbonization to...

Best regions for predicting False Negatives (FN, i.e. no prediction of...

Per-fold number of features selected by all models on binary datasets.

ModelSC 1: Capacity of the different models to normalize for the varying...

Table_2_Comparison of Normalization Methods for Analysis of TempO-Seq...

toll register (probably Gemona, 1426-1427)

BiLSTM parameter setting.

Normalized Dataset

Dataset

Contents