100+ datasets found
  1. Luecken Cite-seq human bone marrow 2021 preprocessing

    • figshare.com
    hdf
    Updated Oct 5, 2023
    Cite
    Single-cell best practices (2023). Luecken Cite-seq human bone marrow 2021 preprocessing [Dataset]. http://doi.org/10.6084/m9.figshare.23623950.v2
    Explore at:
    Available download formats: hdf
    Dataset updated
    Oct 5, 2023
    Dataset provided by
    figshare
    Authors
    Single-cell best practices
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset published by Luecken et al. (2021), containing data from human bone marrow measured through joint profiling of single-nucleus RNA and Antibody-Derived Tags (ADTs) using the 10X 3' Single-Cell Gene Expression kit with Feature Barcoding in combination with the BioLegend TotalSeq B Universal Human Panel v1.0.

    File description:
    cite_quality_control.h5mu: Filtered cell-by-feature MuData object after quality control.
    cite_normalization.h5mu: MuData object of normalized data using DSB (denoised and scaled by background) normalization.
    cite_doublet_removal_xdbt.h5mu: MuData of data after doublet removal based on known cell type markers. Cells were removed if they were double positive for mutually exclusive markers with a DSB value > 2.5.
    cite_dimensionality_reduction.h5mu: MuData of data after dimensionality reduction.
    cite_batch_correction.h5mu: MuData of data after batch correction.

    Citation: Luecken, M. D. et al. A sandbox for prediction and integration of DNA, RNA, and proteins in single cells. Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (2021).

    Original data link: https://openproblems.bio/neurips_docs/data/dataset/
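
    The marker-based doublet-removal step described above (dropping cells that are double-positive for mutually exclusive markers at a DSB value > 2.5) can be sketched as follows; the marker pair and the tiny arrays are illustrative, not taken from the dataset:

```python
import numpy as np

def flag_marker_doublets(dsb, marker_pairs, threshold=2.5):
    """Flag cells that are double-positive for mutually exclusive markers.

    dsb          -- dict: marker name -> per-cell DSB-normalized values
    marker_pairs -- pairs of markers that should not co-occur in one cell
    """
    n_cells = len(next(iter(dsb.values())))
    doublet = np.zeros(n_cells, dtype=bool)
    for a, b in marker_pairs:
        doublet |= (dsb[a] > threshold) & (dsb[b] > threshold)
    return doublet

# Illustrative values: the first cell is positive for both a T-cell
# marker and a B-cell marker, so it is flagged for removal.
dsb = {"CD3": np.array([4.0, 0.1, 3.0]),
       "CD19": np.array([3.5, 0.2, 0.1])}
doublets = flag_marker_doublets(dsb, [("CD3", "CD19")])
# doublets -> [True, False, False]
```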

  2. Data from: Adapting Phrase-based Machine Translation to Normalise Medical...

    • zenodo.org
    • data.niaid.nih.gov
    txt, zip
    Updated Jan 24, 2020
    Cite
    Nut Limsopatham; Nigel Collier (2020). Adapting Phrase-based Machine Translation to Normalise Medical Terms in Social Media Messages [Dataset]. http://doi.org/10.5281/zenodo.27354
    Explore at:
    Available download formats: zip, txt
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Nut Limsopatham; Nigel Collier
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Data and supplementary information for the paper entitled "Adapting Phrase-based Machine Translation to Normalise Medical Terms in Social Media Messages" to be published at EMNLP 2015: Conference on Empirical Methods in Natural Language Processing — September 17–21, 2015 — Lisboa, Portugal.

    ABSTRACT: Previous studies have shown that health reports in social media, such as DailyStrength and Twitter, have potential for monitoring health conditions (e.g. adverse drug reactions, infectious diseases) in particular communities. However, in order for a machine to understand and make inferences on these health conditions, the ability to recognise when laymen's terms refer to a particular medical concept (i.e. text normalisation) is required. To achieve this, we propose to adapt an existing phrase-based machine translation (MT) technique and a vector representation of words to map between a social media phrase and a medical concept. We evaluate our proposed approach using a collection of phrases from tweets related to adverse drug reactions. Our experimental results show that the combination of a phrase-based MT technique and the similarity between word vector representations outperforms the baselines that apply only either of them by up to 55%.

    Data_Sheet_1_NormExpression: An R Package to Normalize Gene Expression Data...

    • frontiersin.figshare.com
    application/cdfv2
    Updated Jun 1, 2023
    Cite
    Zhenfeng Wu; Weixiang Liu; Xiufeng Jin; Haishuo Ji; Hua Wang; Gustavo Glusman; Max Robinson; Lin Liu; Jishou Ruan; Shan Gao (2023). Data_Sheet_1_NormExpression: An R Package to Normalize Gene Expression Data Using Evaluated Methods.doc [Dataset]. http://doi.org/10.3389/fgene.2019.00400.s001
    Explore at:
    Available download formats: application/cdfv2
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Frontiers
    Authors
    Zhenfeng Wu; Weixiang Liu; Xiufeng Jin; Haishuo Ji; Hua Wang; Gustavo Glusman; Max Robinson; Lin Liu; Jishou Ruan; Shan Gao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data normalization is a crucial step in gene expression analysis, as it ensures the validity of downstream analyses. Although many metrics have been designed to evaluate existing normalization methods, different metrics, or the same metric applied to different datasets, yield inconsistent results, particularly for single-cell RNA sequencing (scRNA-seq) data. In the worst cases, a method evaluated as the best by one metric is evaluated as the poorest by another, or a method evaluated as the best on one dataset is evaluated as the poorest on another. This raises an open question: principles need to be established to guide the evaluation of normalization methods. In this study, we propose the principle that a normalization method evaluated as the best by one metric should also be evaluated as the best by another metric (the consistency of metrics), and that a method evaluated as the best using scRNA-seq data should also be evaluated as the best using bulk RNA-seq data or microarray data (the consistency of datasets). We then designed a new metric named Area Under normalized CV threshold Curve (AUCVC) and applied it, together with another metric, mSCC, to evaluate 14 commonly used normalization methods using both scRNA-seq and bulk RNA-seq data, satisfying the consistency of metrics and the consistency of datasets. Our findings pave the way for future studies on the normalization of gene expression data and its evaluation. The raw gene expression data, normalization methods, and evaluation metrics used in this study have been included in an R package named NormExpression. NormExpression provides a framework and a fast, simple way for researchers to select the best method for normalizing their gene expression data, based on the evaluation of different methods (particularly data-driven methods or their own methods) under the principles of the consistency of metrics and the consistency of datasets.
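
    The AUCVC idea (the area under a curve of the fraction of genes whose coefficient of variation falls below increasing cutoffs) can be sketched roughly as below; this is a simplified reading of the metric, not the NormExpression package's exact implementation:

```python
import numpy as np

def aucvc_sketch(expr, thresholds=None):
    """Rough sketch of an Area Under normalized CV threshold Curve:
    for each CV cutoff, compute the fraction of genes whose coefficient
    of variation across samples is below it, then average over cutoffs.

    expr -- genes x samples matrix of normalized expression values
    """
    if thresholds is None:
        thresholds = np.linspace(0.1, 1.0, 10)
    cv = expr.std(axis=1) / expr.mean(axis=1)
    curve = [(cv < t).mean() for t in thresholds]
    return float(np.mean(curve))

# A perfectly stable matrix (zero variance per gene) scores 1.0.
stable = np.full((5, 4), 7.0)
aucvc_sketch(stable)  # -> 1.0
```

    Under this reading, a better normalization leaves more genes with low cross-sample variation, which pushes the score toward 1.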

    Data from: Methods for normalizing microbiome data: an ecological...

    • data.niaid.nih.gov
    • datadryad.org
    Updated May 30, 2022
    Cite
    Huerlimann, Roger (2022). Data from: Methods for normalizing microbiome data: an ecological perspective [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_4950179
    Explore at:
    Dataset updated
    May 30, 2022
    Dataset provided by
    Zenger, Kyall R.
    Schwarzkopf, Lin
    McKnight, Donald T.
    Alford, Ross A.
    Bower, Deborah S.
    Huerlimann, Roger
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description
    1. Microbiome sequencing data often need to be normalized due to differences in read depths, and recommendations for microbiome analyses generally warn against using proportions or rarefying to normalize data, instead advocating alternatives such as upper quartile, CSS, edgeR-TMM, or DESeq-VS. Those recommendations are, however, based on studies that focused on differential abundance testing and variance standardization rather than community-level comparisons (i.e., beta diversity). Also, standardizing the within-sample variance across samples may suppress differences in species evenness, potentially distorting community-level patterns. Furthermore, the recommended methods use log transformations, which we expect to exaggerate the importance of differences among rare OTUs while suppressing the importance of differences among common OTUs.
    2. We tested these theoretical predictions via simulations and a real-world data set.
    3. Proportions and rarefying produced more accurate comparisons among communities and were the only methods that fully normalized read depths across samples. Additionally, upper quartile, CSS, edgeR-TMM, and DESeq-VS often masked differences among communities when common OTUs differed, and they produced false positives when rare OTUs differed.
    4. Based on our simulations, normalizing via proportions may be superior to other commonly used methods for comparing ecological communities.
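
    The two normalizations this study favors can be sketched in a few lines; the sample-by-OTU count table below is made up for illustration:

```python
import numpy as np

def to_proportions(counts):
    """Convert each sample (row) of an OTU count table to proportions."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)

def rarefy(counts, depth, seed=0):
    """Randomly subsample each sample's reads, without replacement,
    down to a common depth."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    out = np.zeros_like(counts)
    for i, row in enumerate(counts):
        reads = np.repeat(np.arange(len(row)), row)  # one entry per read
        picked = rng.choice(reads, size=depth, replace=False)
        out[i] = np.bincount(picked, minlength=len(row))
    return out

# Two samples, two OTUs, very different read depths
counts = np.array([[80, 20], [300, 700]])
props = to_proportions(counts)        # each row sums to 1.0
rarefied = rarefy(counts, depth=100)  # each row sums to exactly 100 reads
```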
  5. Additional file 2 of Pooling across cells to normalize single-cell RNA...

    • springernature.figshare.com
    txt
    Updated Jun 1, 2023
    Cite
    Aaron L. Lun; Karsten Bach; John Marioni (2023). Additional file 2 of Pooling across cells to normalize single-cell RNA sequencing data with many zero counts [Dataset]. http://doi.org/10.6084/m9.figshare.c.3629252_D1.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Aaron L. Lun; Karsten Bach; John Marioni
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Enriched GO terms for deconvolution. This file is in a tab-separated format and contains the top 200 GO terms that were enriched in the set of DE genes unique to deconvolution. The identifier and name of each term is shown along with the total number of genes associated with the term, the number of associated genes that are also DE, the expected number under the null hypothesis, and the Fisher p value. (13 KB PDF)
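
    The per-term expected count and Fisher p-value described above come from a 2x2 enrichment test of DE membership against term membership; the one-sided version equals a hypergeometric tail, sketched here with made-up gene counts:

```python
from math import comb

def hypergeom_sf(k, M, n, N):
    """P[X >= k] for X ~ Hypergeometric: M genes total, n annotated to the
    term, N DE genes drawn. This one-sided tail equals Fisher's exact
    test for enrichment."""
    total = comb(M, N)
    return sum(comb(n, x) * comb(M - n, N - x)
               for x in range(k, min(n, N) + 1)) / total

# Hypothetical GO term: 2000 genes total, 100 of them DE,
# 40 annotated to the term, 10 of those 40 are DE.
p_value = hypergeom_sf(10, M=2000, n=40, N=100)

# Expected DE genes in the term under the null hypothesis:
expected = 40 * 100 / 2000  # = 2.0, so observing 10 is a strong enrichment
```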

    Dataset of normalised Slovene text KonvNormSl 1.0

    • live.european-language-grid.eu
    • clarin.si
    binary format
    Updated Sep 18, 2016
    Cite
    (2016). Dataset of normalised Slovene text KonvNormSl 1.0 [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/8217
    Explore at:
    Available download formats: binary format
    Dataset updated
    Sep 18, 2016
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Data used in the experiments described in:

    Nikola Ljubešić, Katja Zupan, Darja Fišer and Tomaž Erjavec: Normalising Slovene data: historical texts vs. user-generated content. Proceedings of KONVENS 2016, September 19–21, 2016, Bochum, Germany.

    https://www.linguistics.rub.de/konvens16/pub/19_konvensproc.pdf

    (https://www.linguistics.rub.de/konvens16/)

    Data are split into the "token" folder (experiment on normalising individual tokens) and "segment" folder (experiment on normalising whole segments of text, i.e. sentences or tweets). Each experiment folder contains the "train", "dev" and "test" subfolders. Each subfolder contains two files for each sample, the original data (.orig.txt) and the data with hand-normalised words (.norm.txt). The files are aligned by lines.

    There are four datasets:

    - goo300k-bohoric: historical Slovene, hard case (<1850)

    - goo300k-gaj: historical Slovene, easy case (1850 - 1900)

    - tweet-L3: Slovene tweets, hard case (non-standard language)

    - tweet-L1: Slovene tweets, easy case (mostly standard language)

    The goo300k data come from http://hdl.handle.net/11356/1025, while the tweet data originate from the JANES project (http://nl.ijs.si/janes/english/).

    The text in the files has been split by inserting spaces between characters, with underscore (_) substituting the space character. Tokens not relevant for normalisation (e.g. URLs, hashtags) have been substituted by the inverted question mark '¿' character.
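
    The character-splitting scheme described above (spaces between characters, underscore standing in for the original space) is easy to reproduce and invert; a sketch, leaving aside the '¿' substitution, which requires token-level filtering:

```python
def to_char_format(text):
    """Split text into space-separated characters, with '_' standing in
    for the original space character."""
    return " ".join("_" if ch == " " else ch for ch in text)

def from_char_format(encoded):
    """Invert the character-level encoding."""
    return "".join(" " if tok == "_" else tok for tok in encoded.split(" "))

to_char_format("lep dan")          # -> "l e p _ d a n"
from_char_format("l e p _ d a n")  # -> "lep dan"
```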

    2018 LiDAR - Normalized Digital Surface Model - Tiles

    • catalog.data.gov
    • opendata.dc.gov
    • +2more
    Updated Feb 4, 2025
    Cite
    D.C. Office of the Chief Technology Officer (2025). 2018 LiDAR - Normalized Digital Surface Model - Tiles [Dataset]. https://catalog.data.gov/dataset/2018-lidar-normalized-digital-surface-model-tiles
    Explore at:
    Dataset updated
    Feb 4, 2025
    Dataset provided by
    D.C. Office of the Chief Technology Officer
    Description

    Normalised Digital Surface Model - 1m resolution. The dataset contains the Normalised Digital Surface Model for the Washington area. Voids exist in the data due to data redaction conducted under the guidance of the United States Secret Service. All lidar data returns and collected data were removed from the dataset based on the redaction footprint shapefile generated in 2017.

    A radiometric normalization dataset of Shandong Province based on Gaofen-1...

    • scidb.cn
    Updated Feb 20, 2020
    Cite
    黄莉婷; 焦伟利; 龙腾飞 (2020). A radiometric normalization dataset of Shandong Province based on Gaofen-1 WFV image (2018) [Dataset]. http://doi.org/10.11922/sciencedb.947
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 20, 2020
    Dataset provided by
    Science Data Bank
    Authors
    黄莉婷; 焦伟利; 龙腾飞
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Shandong
    Description

    Surface reflectance is a critical physical variable that affects the energy budget in land-atmosphere interactions, feature recognition and classification, and climate change research. This dataset uses the relative radiometric normalization method and takes the Landsat-8 Operational Land Imager (OLI) surface reflectance products as the reference image to normalize the cloud-free GF-1 satellite WFV sensor images of Shandong Province in 2018. Relative radiometric normalization processing mainly includes atmospheric correction, image resampling, image registration, masking, extraction of no-change pixels, and calculation of normalization coefficients. After relative radiometric normalization, for the no-change pixels of each GF-1 WFV image and its reference image, R² is above 0.7295 and RMSE is below 0.0172. The surface reflectance accuracy of the GF-1 WFV images is improved, so they can be used together with Landsat data to provide data support for quantitative remote sensing inversion. This dataset is in GeoTIFF format, and the spatial resolution of the imagery is 16 m.
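
    Relative radiometric normalization of this kind typically fits a linear gain and offset over the no-change pixels, mapping the subject image onto the reference; a minimal sketch with synthetic reflectances (not the dataset's actual processing code):

```python
import numpy as np

def fit_gain_offset(subject, reference):
    """Least-squares linear fit mapping subject-image reflectance to the
    reference image, computed over pixels assumed unchanged between
    the two acquisitions."""
    gain, offset = np.polyfit(subject, reference, 1)
    return gain, offset

# Synthetic no-change pixels: reference = 0.9 * subject + 0.02
subject = np.array([0.05, 0.10, 0.20, 0.30, 0.40])
reference = 0.9 * subject + 0.02

gain, offset = fit_gain_offset(subject, reference)
normalized = gain * subject + offset  # matches the reference exactly here
```

    On real imagery the fit is only approximate, and the R² and RMSE of `normalized` against `reference` over the no-change pixels quantify how well the normalization worked.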

  9. ARCS White Beam Vanadium Normalization for SNS Cycle 2021A

    • osti.gov
    Updated Oct 11, 2023
    Cite
    Abernathy, Douglas; Goyette, Rick; Granroth, Garrett (2023). ARCS White Beam Vanadium Normalization for SNS Cycle 2021A [Dataset]. https://www.osti.gov/dataexplorer/biblio/dataset/2008337-arcs-white-beam-vanadium-normalization-sns-cycle
    Explore at:
    Dataset updated
    Oct 11, 2023
    Dataset provided by
    Office of Science (http://www.er.doe.gov/)
    United States Department of Energy (http://energy.gov/)
    High Flux Isotope Reactor (HFIR) & Spallation Neutron Source (SNS), Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
    Authors
    Abernathy, Douglas; Goyette, Rick; Granroth, Garrett
    Description

    This is a white-beam data set from vanadium (V) used to normalize the relative detector performance. See the ARCS_175852.md file for more information.

    Pi-plus-minus p elastic scattering in the 2-gev region

    • doi.org
    • hepdata.net
    Updated Sep 2, 2015
    Cite
    (2015). Pi-plus-minus p elastic scattering in the 2-gev region [Dataset]. http://doi.org/10.17182/hepdata.6227.v1
    Explore at:
    Dataset updated
    Sep 2, 2015
    Description

    ALL DATA IN THIS RECORD ARE REDUNDANT, I.E., THEY WERE OBTAINED DIRECTLY FROM OTHER DATA IN THIS FILE, USUALLY BY EXTRAPOLATION OR INTEGRATION. THE FOLLOWING COMMENTS ARE TAKEN FROM THE PI N COMPILATION OF R.L. KELLY. THEY ARE THAT COMPILATION'S COMPLETE SET OF COMMENTS FOR PAPERS RELATED TO THE SAME EXPERIMENT (DESIGNATED BUSZA69) AS THE CURRENT PAPER. (THE IDENTIFIER PRECEDING THE REFERENCE AND COMMENT FOR EACH PAPER IS FOR CROSS-REFERENCING WITHIN THESE COMMENTS ONLY AND DOES NOT NECESSARILY AGREE WITH THE SHORT CODE USED ELSEWHERE IN THE PRESENT COMPILATION.) /// BELLAMY65 [E. H. BELLAMY, PROC. ROY. SOC. (LONDON) 289, 509 (1965)] -- /// BUSZA67 [W. BUSZA, NC 52A, 331 (1967)] -- PI- P DCS FROM 2K ELASTIC EVENTS AT EACH OF 5 MOMENTA BETWEEN 1.72 AND 2.46 GEV/C. DONE AT NIMROD WITH OPTICAL SPARK CHAMBERS. THE APPARATUS IS DESCRIBED IN BELLAMY65, THE RESULTS IN BUSZA67. /// BUSZA69 [W. BUSZA, PR 180, 1339 (1969)] -- PI+ P DCS AT 10 MOMENTA BETWEEN 1.72 AND 2.80 GEV/C, AND PI- P DCS AT 5 MOMENTA BETWEEN 2.17 AND 2.80 GEV/C. THE DATA REPORTED IN BUSZA67 ARE ALSO REPEATED HERE. THE NEW MEASUREMENTS WERE DONE WITH AN IMPROVED VERSION OF THE APPARATUS USED BY BUSZA67. THE PI- DATA (INCLUDING BUSZA67) ARE NORMALIZED TO FORWARD DISPERSION RELATIONS; THE PI+ DATA HAS ITS OWN EXPERIMENTAL NORMALIZATION BUT NO NE IS GIVEN. WE HAVE INCREASED THE ERROR OF THE MOST FORWARD PI+ POINT AT 1.72 GEV/C BECAUSE OF AN AMBIGUOUS FOOTNOTE CONCERNING THIS POINT. /// COMMENTS FROM LOVELACE71 COMPILATION OF THESE DATA -- LOVELACE71 CLAIMS SOME USE WAS MADE OF FORWARD DISPERSION RELATIONS TO NORMALIZE THE PI+ DATA AS WELL AS THE PI-. THE FOLLOWING NORMALIZATION ERRORS (NE) AND RENORMALIZATION FACTORS (RF) ARE RECOMMENDED FOR THE PI+ P AND PI- P DIFFERENTIAL CROSS SECTIONS -- PLAB=1720 MEV/C -- NE(PI+ P)=INFIN, NE(PI- P)=INFIN. PLAB=1890 MEV/C -- RF(PI+ P)=1.245, RF(PI- P)=0.941. PLAB=2070 MEV/C -- NE(PI+ P)=INFIN, RF(PI- P)=1.224. PLAB=2170 MEV/C -- NE(PI+ P)=0.1, NE(PI- P)=0.1. PLAB=2270 MEV/C -- NE(PI+ P)=0.1, NE(PI- P)=INFIN. PLAB=2360 MEV/C -- NE(PI+ P)=0.1, NE(PI- P)=0.1. PLAB=2460 MEV/C -- NE(PI+ P)=0.1, NE(PI- P)=INFIN. PLAB=2560 MEV/C -- NE(PI+ P)=0.1, NE(PI- P)=0.1. PLAB=2650 MEV/C -- NE(PI+ P)=0.1, NE(PI- P)=0.1. PLAB=2800 MEV/C -- NE(PI+ P)=0.1, NE(PI- P)=0.1. /// COMMENTS ON MODIFICATIONS TO LOVELACE71 COMPILATION BY KELLY -- WE HAVE TAKEN ALL PI- NES TO BE INFINITE, AND ALL PI+ NES TO BE UNKNOWN. ALSO, ONE MINOR MISTAKE IN THE PI- (PI+) DATA AT 2.36 (2.65) GEV/C HAS BEEN CORRECTED.

    Table_2_Comparison of Normalization Methods for Analysis of TempO-Seq...

    • frontiersin.figshare.com
    • figshare.com
    xlsx
    Updated Jun 2, 2023
    Cite
    Pierre R. Bushel; Stephen S. Ferguson; Sreenivasa C. Ramaiahgari; Richard S. Paules; Scott S. Auerbach (2023). Table_2_Comparison of Normalization Methods for Analysis of TempO-Seq Targeted RNA Sequencing Data.xlsx [Dataset]. http://doi.org/10.3389/fgene.2020.00594.s002
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    Frontiers
    Authors
    Pierre R. Bushel; Stephen S. Ferguson; Sreenivasa C. Ramaiahgari; Richard S. Paules; Scott S. Auerbach
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of bulk RNA sequencing (RNA-Seq) data is a valuable tool to understand transcription at the genome scale. Targeted sequencing of RNA has emerged as a practical means of assessing the majority of the transcriptomic space with less reliance on large resources for consumables and bioinformatics. TempO-Seq is a templated, multiplexed RNA-Seq platform that interrogates a panel of sentinel genes representative of genome-wide transcription. Nuances of the technology require proper preprocessing of the data. Various methods have been proposed and compared for normalizing bulk RNA-Seq data, but there has been little to no investigation of how the methods perform on TempO-Seq data. We simulated count data into two groups (treated vs. untreated) at seven-fold change (FC) levels (including no change) using control samples from human HepaRG cells run on TempO-Seq and normalized the data using seven normalization methods. Upper Quartile (UQ) performed the best with regard to maintaining FC levels as detected by a limma contrast between treated vs. untreated groups. For all FC levels, specificity of the UQ normalization was greater than 0.84 and sensitivity greater than 0.90 except for the no change and +1.5 levels. Furthermore, K-means clustering of the simulated genes normalized by UQ agreed the most with the FC assignments [adjusted Rand index (ARI) = 0.67]. Despite having an assumption of the majority of genes being unchanged, the DESeq2 scaling factors normalization method performed reasonably well as did simple normalization procedures counts per million (CPM) and total counts (TCs). These results suggest that for two class comparisons of TempO-Seq data, UQ, CPM, TC, or DESeq2 normalization should provide reasonably reliable results at absolute FC levels ≥2.0. These findings will help guide researchers to normalize TempO-Seq gene expression data for more reliable results.
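
    Of the methods compared, upper quartile (UQ) and counts per million (CPM) are simple enough to sketch directly; the count matrix below is made up, and real analyses would use established implementations (e.g. edgeR or limma):

```python
import numpy as np

def cpm(counts):
    """Counts per million: scale each sample (column) by its library size."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=0, keepdims=True) * 1e6

def upper_quartile(counts):
    """Upper-quartile normalization: scale each sample by the 75th
    percentile of its nonzero gene counts."""
    counts = np.asarray(counts, dtype=float)
    uq = np.array([np.percentile(col[col > 0], 75) for col in counts.T])
    return counts / uq

# genes x samples
counts = np.array([[10, 100],
                   [20, 200],
                   [ 0,  40]])
norm_cpm = cpm(counts)            # every column now sums to 1e6
norm_uq = upper_quartile(counts)  # every column's nonzero UQ is now 1
```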

    Residential Existing Homes (One to Four Units) Energy Efficiency Meter...

    • catalog.data.gov
    • datasets.ai
    • +2more
    Updated Sep 15, 2023
    Cite
    data.ny.gov (2023). Residential Existing Homes (One to Four Units) Energy Efficiency Meter Evaluated Project Data: 2007 – 2012 [Dataset]. https://catalog.data.gov/dataset/residential-existing-homes-one-to-four-units-energy-efficiency-meter-evaluated-projec-2007
    Explore at:
    Dataset updated
    Sep 15, 2023
    Dataset provided by
    data.ny.gov
    Description

    IMPORTANT! PLEASE READ DISCLAIMER BEFORE USING DATA. This dataset backcasts estimated modeled savings for a subset of 2007-2012 completed projects in the Home Performance with ENERGY STAR® Program against normalized savings calculated by an open source energy efficiency meter available at https://www.openee.io/. The open source code uses utility-grade metered consumption to weather-normalize the pre- and post-consumption data using standard methods with no discretionary independent variables. The open source energy efficiency meter allows private companies, utilities, and regulators to calculate energy savings from energy efficiency retrofits with increased confidence and replicability of results. This dataset is intended to lay a foundation for future innovation and deployment of the open source energy efficiency meter across the residential energy sector, and to help inform stakeholders interested in pay-for-performance programs, where providers are paid for realizing measurable weather-normalized results. To download the open source code, please visit https://github.com/openeemeter/eemeter/releases

    D I S C L A I M E R: Normalized savings using the open source OEE meter: several data elements, including Evaluated Annual Electric Savings (kWh), Evaluated Annual Gas Savings (MMBtu), Pre-retrofit Baseline Electric (kWh), Pre-retrofit Baseline Gas (MMBtu), Post-retrofit Usage Electric (kWh), and Post-retrofit Usage Gas (MMBtu), are direct outputs from the open source OEE meter. Home Performance with ENERGY STAR® estimated savings: several data elements, including Estimated Annual kWh Savings, Estimated Annual MMBtu Savings, and Estimated First Year Energy Savings, represent contractor-reported savings derived from energy modeling software calculations and not actual realized energy savings. The accuracy of the Estimated Annual kWh Savings and Estimated Annual MMBtu Savings for projects has been evaluated by an independent third party. The results of the Home Performance with ENERGY STAR impact analysis indicate that, on average, actual savings amount to 35 percent of the Estimated Annual kWh Savings and 65 percent of the Estimated Annual MMBtu Savings. For more information, please refer to the Evaluation Report published on NYSERDA's website at: http://www.nyserda.ny.gov/-/media/Files/Publications/PPSER/Program-Evaluation/2012ContractorReports/2012-HPwES-Impact-Report-with-Appendices.pdf

    This dataset includes the following data points for a subset of projects completed in 2007-2012: Contractor ID, Project County, Project City, Project ZIP, Climate Zone, Weather Station, Weather Station-Normalization, Project Completion Date, Customer Type, Size of Home, Volume of Home, Number of Units, Year Home Built, Total Project Cost, Contractor Incentive, Total Incentives, Amount Financed through Program, Estimated Annual kWh Savings, Estimated Annual MMBtu Savings, Estimated First Year Energy Savings, Evaluated Annual Electric Savings (kWh), Evaluated Annual Gas Savings (MMBtu), Pre-retrofit Baseline Electric (kWh), Pre-retrofit Baseline Gas (MMBtu), Post-retrofit Usage Electric (kWh), Post-retrofit Usage Gas (MMBtu), Central Hudson, Consolidated Edison, LIPA, National Grid, National Fuel Gas, New York State Electric and Gas, Orange and Rockland, Rochester Gas and Electric.

    How does your organization use this dataset? What other NYSERDA or energy-related datasets would you like to see on Open NY? Let us know by emailing OpenNY@nyserda.ny.gov.

    Attributes for NHDPlus Catchments (Version 1.1) for the Conterminous United...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Sep 18, 2024
    Cite
    U.S. Geological Survey (2024). Attributes for NHDPlus Catchments (Version 1.1) for the Conterminous United States: Normalized Atmospheric Deposition for 2002, Total Inorganic Nitrogen [Dataset]. https://catalog.data.gov/dataset/attributes-for-nhdplus-catchments-version-1-1-for-the-conterminous-united-states-normalize
    Explore at:
    Dataset updated
    Sep 18, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Contiguous United States, United States
    Description

    This data set represents the average normalized atmospheric (wet) deposition, in kilograms, of Total Inorganic Nitrogen for the year 2002, compiled for every catchment of NHDPlus for the conterminous United States. Estimates of Total Inorganic Nitrogen deposition are based on National Atmospheric Deposition Program (NADP) measurements (B. Larsen, U.S. Geological Survey, written commun., 2007). De-trending methods applied to the year 2002 are described in Alexander and others, 2001. NADP site selection met the following criteria: stations must have records from 1995 to 2002 and have a minimum of 30 observations. The NHDPlus Version 1.1 is an integrated suite of application-ready geospatial datasets that incorporates many of the best features of the National Hydrography Dataset (NHD) and the National Elevation Dataset (NED). The NHDPlus includes a stream network (based on the 1:100,000-scale NHD), improved networking, naming, and value-added attributes (VAAs). NHDPlus also includes elevation-derived catchments (drainage areas) produced using a drainage enforcement technique first widely used in New England, and thus referred to as "the New England Method." This technique involves "burning in" the 1:100,000-scale NHD and, when available, building "walls" using the National Watershed Boundary Dataset (WBD). The resulting modified digital elevation model (HydroDEM) is used to produce hydrologic derivatives that agree with the NHD and WBD. Over the past two years, an interdisciplinary team from the U.S. Geological Survey (USGS), the U.S. Environmental Protection Agency (USEPA), and contractors found that this method produces the best quality NHD catchments using an automated process (USEPA, 2007). The NHDPlus dataset is organized by 18 Production Units that cover the conterminous United States. The NHDPlus version 1.1 data are grouped by the U.S. Geological Survey's Major River Basins (MRBs, Crawford and others, 2006).
MRB1, covering the New England and Mid-Atlantic River basins, contains NHDPlus Production Units 1 and 2. MRB2, covering the South Atlantic-Gulf and Tennessee River basins, contains NHDPlus Production Units 3 and 6. MRB3, covering the Great Lakes, Ohio, Upper Mississippi, and Souris-Red-Rainy River basins, contains NHDPlus Production Units 4, 5, 7 and 9. MRB4, covering the Missouri River basins, contains NHDPlus Production Units 10-lower and 10-upper. MRB5, covering the Lower Mississippi, Arkansas-White-Red, and Texas-Gulf River basins, contains NHDPlus Production Units 8, 11 and 12. MRB6, covering the Rio Grande, Colorado and Great Basin River basins, contains NHDPlus Production Units 13, 14, 15 and 16. MRB7, covering the Pacific Northwest River basins, contains NHDPlus Production Unit 17. MRB8, covering California River basins, contains NHDPlus Production Unit 18.

  14. GINI Index Data

    • johnsnowlabs.com
    csv
    Updated Jan 20, 2021
    Cite
    John Snow Labs (2021). GINI Index Data [Dataset]. https://www.johnsnowlabs.com/marketplace/gini-index-data/
    Explore at:
    Available download formats: csv
    Dataset updated
    Jan 20, 2021
    Dataset authored and provided by
    John Snow Labs
    Time period covered
    1981 - 2021
    Area covered
    World
    Description

    GINI Index Data consists of information based on primary household survey data obtained from government statistical agencies and World Bank country departments. In economics, the GINI index (sometimes expressed as a GINI ratio, GINI coefficient or a normalized GINI index) is a measure of statistical dispersion intended to represent the income or wealth distribution of a nation's residents, and is the most commonly used measure of inequality.
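
    The Gini coefficient itself has a compact closed form over sorted values; a sketch with toy incomes (0 = perfect equality, values approaching 1 = maximal inequality):

```python
import numpy as np

def gini(values):
    """Gini coefficient from the sorted-values closed form:
    G = sum_i (2i - n - 1) * x_i / (n * sum(x)), i = 1..n ascending."""
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    i = np.arange(1, n + 1)
    return float(np.sum((2 * i - n - 1) * x) / (n * x.sum()))

gini([1, 1, 1, 1])  # -> 0.0 (perfect equality)
gini([0, 0, 0, 1])  # -> 0.75 (one person holds everything)
```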

  15. NOAA Climate Data Record (CDR) of AVHRR Normalized Difference Vegetation...

    • ncei.noaa.gov
    • datasets.ai
    • +2more
    html
    Updated Jul 11, 2019
    Cite
    Vermote, Eric (2019). NOAA Climate Data Record (CDR) of AVHRR Normalized Difference Vegetation Index (NDVI), Version 5 [Dataset]. http://doi.org/10.7289/v5zg6qh9
    Explore at:
    Available download formats: html
    Dataset updated
    Jul 11, 2019
    Dataset provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    National Centers for Environmental Information (https://www.ncei.noaa.gov/)
    Authors
    Vermote, Eric
    Time period covered
    Jun 24, 1981 - Present
    Area covered
    Description

    This dataset contains gridded daily Normalized Difference Vegetation Index (NDVI) derived from the NOAA Climate Data Record (CDR) of Advanced Very High Resolution Radiometer (AVHRR) Surface Reflectance. The data record spans from 1981 to 10 days before the present using data from eight NOAA polar orbiting satellites: NOAA-7, -9, -11, -14, -16, -17, -18 and -19. The data are projected on a 0.05 degree x 0.05 degree global grid. This dataset is one of the Land Surface CDR Version 5 products produced by the NASA Goddard Space Flight Center (GSFC) and the University of Maryland (UMD). Improvements for Version 5 include using the improved surface reflectance data, correcting known errors in the time, latitude, and longitude variables, and improving the global and variable attribute definitions. The dataset is in the netCDF-4 file format following ACDD and CF Conventions. The dataset is accompanied by algorithm documentation, a data flow diagram, and source code for the NOAA CDR Program.
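The index itself has a standard definition from red and near-infrared (NIR) reflectance; a minimal sketch is below (the CDR product applies calibration, cloud masking, and QA steps beyond this):

```python
import numpy as np

# Standard NDVI definition: (NIR - red) / (NIR + red) on surface
# reflectance. A sketch of the index only, not the full CDR algorithm.
def ndvi(red, nir):
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    denom = nir + red
    # Guard against division by zero; mark those cells as NaN.
    safe = np.where(denom == 0, 1.0, denom)
    return np.where(denom == 0, np.nan, (nir - red) / safe)

# Healthy vegetation reflects strongly in NIR and weakly in red,
# pushing NDVI toward +1.
print(ndvi([0.05], [0.45]))  # -> [0.8]
```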

  16. Composite Leading Indicators: Composite Leading Indicator (CLI) Normalized for United States

    • fred.stlouisfed.org
    json
    Updated Apr 10, 2024
    + more versions
    Cite
    (2024). Composite Leading Indicators: Composite Leading Indicator (CLI) Normalized for United States [Dataset]. https://fred.stlouisfed.org/series/USALOLITONOSTSAM
    Explore at:
    Available download formats: json
    Dataset updated
    Apr 10, 2024
    License

    https://fred.stlouisfed.org/legal/#copyright-citation-required

    Area covered
    United States
    Description

    Graph and download economic data for Composite Leading Indicators: Composite Leading Indicator (CLI) Normalized for United States (USALOLITONOSTSAM) from Jan 1955 to Jan 2024 about leading indicator and USA.
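Since the series is distributed as JSON, a sketch of retrieving it through the public FRED API is shown below. The API key is a placeholder and the sample payload is illustrative; the endpoint shape follows the FRED API documentation:

```python
import json
from urllib.parse import urlencode

# Build a FRED series/observations request URL (API key is a placeholder).
def fred_observations_url(series_id, api_key):
    base = "https://api.stlouisfed.org/fred/series/observations"
    return base + "?" + urlencode(
        {"series_id": series_id, "api_key": api_key, "file_type": "json"}
    )

def parse_observations(payload):
    """Return (date, value) pairs, skipping FRED's '.' missing-value marker."""
    return [(o["date"], float(o["value"]))
            for o in json.loads(payload)["observations"] if o["value"] != "."]

url = fred_observations_url("USALOLITONOSTSAM", "YOUR_API_KEY")

# Illustrative response fragment, not real data.
sample = ('{"observations": [{"date": "1955-01-01", "value": "101.2"}, '
          '{"date": "1955-02-01", "value": "."}]}')
print(parse_observations(sample))  # -> [('1955-01-01', 101.2)]
```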

  17. Data from: Attributes for NHDPlus Catchments (Version 1.1) for the Conterminous United States: Normalized Atmospheric Deposition for 2002, Ammonium (NH4)

    • catalog.data.gov
    • data.usgs.gov
    • +2more
    Updated Nov 1, 2024
    + more versions
    Cite
    U.S. Geological Survey (2024). Attributes for NHDPlus Catchments (Version 1.1) for the Conterminous United States: Normalized Atmospheric Deposition for 2002, Ammonium (NH4) [Dataset]. https://catalog.data.gov/dataset/attributes-for-nhdplus-catchments-version-1-1-for-the-conterminous-united-states-normalize-57d70
    Explore at:
    Dataset updated
    Nov 1, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Contiguous United States, United States
    Description

    This data set represents the average normalized atmospheric (wet) deposition, in kilograms, of Ammonium (NH4) for the year 2002 compiled for every catchment of NHDPlus for the conterminous United States. Estimates of NH4 deposition are based on National Atmospheric Deposition Program (NADP) measurements (B. Larsen, U.S. Geological Survey, written commun., 2007). De-trending methods applied to the year 2002 are described in Alexander and others, 2001. NADP site selection met the following criteria: stations must have records from 1995 to 2002 and have a minimum of 30 observations. The NHDPlus Version 1.1 is an integrated suite of application-ready geospatial datasets that incorporates many of the best features of the National Hydrography Dataset (NHD) and the National Elevation Dataset (NED). The NHDPlus includes a stream network (based on the 1:100,000-scale NHD), improved networking, naming, and value-added attributes (VAAs). NHDPlus also includes elevation-derived catchments (drainage areas) produced using a drainage enforcement technique first widely used in New England, and thus referred to as "the New England Method." This technique involves "burning in" the 1:100,000-scale NHD and, when available, building "walls" using the National Watershed Boundary Dataset (WBD). The resulting modified digital elevation model (HydroDEM) is used to produce hydrologic derivatives that agree with the NHD and WBD. Over the past two years, an interdisciplinary team from the U.S. Geological Survey (USGS), the U.S. Environmental Protection Agency (USEPA), and contractors found that this method produces the best quality NHD catchments using an automated process (USEPA, 2007). The NHDPlus dataset is organized by 18 Production Units that cover the conterminous United States. The NHDPlus version 1.1 data are grouped by the U.S. Geological Survey's Major River Basins (MRBs; Crawford and others, 2006). 
MRB1, covering the New England and Mid-Atlantic River basins, contains NHDPlus Production Units 1 and 2. MRB2, covering the South Atlantic-Gulf and Tennessee River basins, contains NHDPlus Production Units 3 and 6. MRB3, covering the Great Lakes, Ohio, Upper Mississippi, and Souris-Red-Rainy River basins, contains NHDPlus Production Units 4, 5, 7 and 9. MRB4, covering the Missouri River basins, contains NHDPlus Production Units 10-lower and 10-upper. MRB5, covering the Lower Mississippi, Arkansas-White-Red, and Texas-Gulf River basins, contains NHDPlus Production Units 8, 11 and 12. MRB6, covering the Rio Grande, Colorado and Great Basin River basins, contains NHDPlus Production Units 13, 14, 15 and 16. MRB7, covering the Pacific Northwest River basins, contains NHDPlus Production Unit 17. MRB8, covering California River basins, contains NHDPlus Production Unit 18.
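The station-selection criteria quoted in the description (records spanning 1995 to 2002, at least 30 observations) amount to a simple filter; the station records below are hypothetical, for illustration only:

```python
# Hypothetical NADP station records: (station_id, first_year, last_year, n_obs).
# The selection rule mirrors the criteria quoted in the description.
stations = [
    ("NH02", 1990, 2005, 120),  # spans 1995-2002 with enough observations
    ("TX56", 1997, 2002, 45),   # record starts after 1995 -> excluded
    ("CA42", 1994, 2003, 25),   # fewer than 30 observations -> excluded
]

def meets_criteria(first_year, last_year, n_obs):
    # Stations must have records from 1995 to 2002 and >= 30 observations.
    return first_year <= 1995 and last_year >= 2002 and n_obs >= 30

selected = [sid for sid, y0, y1, n in stations if meets_criteria(y0, y1, n)]
print(selected)  # -> ['NH02']
```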

  18. Composite Leading Indicators: Reference Series (GDP) Normalized for United Kingdom

    • fred.stlouisfed.org
    json
    Updated Apr 10, 2024
    + more versions
    Cite
    (2024). Composite Leading Indicators: Reference Series (GDP) Normalized for United Kingdom [Dataset]. https://fred.stlouisfed.org/series/GBRLORSGPNOSTSAM
    Explore at:
    Available download formats: json
    Dataset updated
    Apr 10, 2024
    License

    https://fred.stlouisfed.org/legal/#copyright-citation-required

    Area covered
    United Kingdom
    Description

    Graph and download economic data for Composite Leading Indicators: Reference Series (GDP) Normalized for United Kingdom (GBRLORSGPNOSTSAM) from Feb 1955 to Aug 2023 about leading indicator, United Kingdom, and GDP.

  19. Composite Leading Indicators: Reference Series (GDP) Normalized for China

    • fred.stlouisfed.org
    json
    Updated Apr 10, 2024
    + more versions
    Cite
    (2024). Composite Leading Indicators: Reference Series (GDP) Normalized for China [Dataset]. https://fred.stlouisfed.org/series/CHNLORSGPNOSTSAM
    Explore at:
    Available download formats: json
    Dataset updated
    Apr 10, 2024
    License

    https://fred.stlouisfed.org/legal/#copyright-citation-required

    Area covered
    China
    Description

    Graph and download economic data for Composite Leading Indicators: Reference Series (GDP) Normalized for China (CHNLORSGPNOSTSAM) from Jan 1978 to Dec 2023 about leading indicator, China, and GDP.

  20. Composite Leading Indicators: Reference Series (GDP) Normalized for Korea

    • fred.stlouisfed.org
    json
    Updated Apr 10, 2024
    + more versions
    Cite
    (2024). Composite Leading Indicators: Reference Series (GDP) Normalized for Korea [Dataset]. https://fred.stlouisfed.org/series/KORLORSGPNOSTSAM
    Explore at:
    Available download formats: json
    Dataset updated
    Apr 10, 2024
    License

    https://fred.stlouisfed.org/legal/#copyright-citation-required

    Area covered
    South Korea
    Description

    Graph and download economic data for Composite Leading Indicators: Reference Series (GDP) Normalized for Korea (KORLORSGPNOSTSAM) from Feb 1960 to Nov 2023 about leading indicator, Korea, and GDP.
