81 datasets found
  1. Luecken Cite-seq human bone marrow 2021 preprocessing

    • figshare.com
    hdf
    Updated Oct 5, 2023
    Cite
    Single-cell best practices (2023). Luecken Cite-seq human bone marrow 2021 preprocessing [Dataset]. http://doi.org/10.6084/m9.figshare.23623950.v2
    Explore at:
    hdf (available download formats)
    Dataset updated
    Oct 5, 2023
    Dataset provided by
    figshare
    Authors
    Single-cell best practices
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset published by Luecken et al. 2021, containing data from human bone marrow measured through joint profiling of single-nucleus RNA and Antibody-Derived Tags (ADTs), using the 10X 3' Single-Cell Gene Expression kit with Feature Barcoding in combination with the BioLegend TotalSeq B Universal Human Panel v1.0.

    File description:
    cite_quality_control.h5mu: filtered cell-by-feature MuData object after quality control.
    cite_normalization.h5mu: MuData object of data normalized using DSB (denoised and scaled by background) normalization.
    cite_doublet_removal_xdbt.h5mu: MuData object after doublet removal based on known cell type markers; cells were removed if they were double-positive for mutually exclusive markers with a DSB value > 2.5.
    cite_dimensionality_reduction.h5mu: MuData object after dimensionality reduction.
    cite_batch_correction.h5mu: MuData object after batch correction.

    Citation: Luecken, M. D. et al. A sandbox for prediction and integration of DNA, RNA, and proteins in single cells. Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (2021).

    Original data link: https://openproblems.bio/neurips_docs/data/dataset/
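    The doublet-removal rule described for cite_doublet_removal_xdbt.h5mu (drop cells that are double-positive, DSB value > 2.5, for mutually exclusive markers) can be sketched in a few lines of numpy. This is an illustrative reimplementation under assumed array shapes, not the code used to build the dataset; the marker pair in the example is hypothetical.

```python
import numpy as np

def flag_marker_doublets(dsb, exclusive_pairs, threshold=2.5):
    """Flag cells double-positive for any mutually exclusive marker pair.

    dsb: (n_cells, n_proteins) array of DSB-normalized ADT values.
    exclusive_pairs: (i, j) column-index pairs of markers a singlet
    should not co-express (hypothetical choice for illustration).
    """
    is_doublet = np.zeros(dsb.shape[0], dtype=bool)
    for i, j in exclusive_pairs:
        is_doublet |= (dsb[:, i] > threshold) & (dsb[:, j] > threshold)
    return is_doublet

# Toy example: 3 cells x 2 mutually exclusive markers.
dsb = np.array([[4.0, 3.1],   # double-positive -> flagged as doublet
                [4.0, 0.2],   # single-positive -> kept
                [0.1, 3.5]])  # single-positive -> kept
keep = ~flag_marker_doublets(dsb, [(0, 1)])
```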

  2. Data from: Adapting Phrase-based Machine Translation to Normalise Medical...

    • zenodo.org
    • data.niaid.nih.gov
    txt, zip
    Updated Jan 24, 2020
    Cite
    Nut Limsopatham; Nigel Collier (2020). Adapting Phrase-based Machine Translation to Normalise Medical Terms in Social Media Messages [Dataset]. http://doi.org/10.5281/zenodo.27354
    Explore at:
    zip, txt (available download formats)
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Nut Limsopatham; Nigel Collier
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Data and supplementary information for the paper entitled "Adapting Phrase-based Machine Translation to Normalise Medical Terms in Social Media Messages" to be published at EMNLP 2015: Conference on Empirical Methods in Natural Language Processing — September 17–21, 2015 — Lisboa, Portugal.

    ABSTRACT: Previous studies have shown that health reports in social media, such as DailyStrength and Twitter, have potential for monitoring health conditions (e.g. adverse drug reactions, infectious diseases) in particular communities. However, in order for a machine to understand and make inferences on these health conditions, the ability to recognise when laymen's terms refer to a particular medical concept (i.e. text normalisation) is required. To achieve this, we propose to adapt an existing phrase-based machine translation (MT) technique and a vector representation of words to map between a social media phrase and a medical concept. We evaluate our proposed approach using a collection of phrases from tweets related to adverse drug reactions. Our experimental results show that the combination of a phrase-based MT technique and the similarity between word vector representations outperforms the baselines that apply only either of them by up to 55%.

  3. Table_2_Comparison of Normalization Methods for Analysis of TempO-Seq...

    • frontiersin.figshare.com
    • figshare.com
    xlsx
    Updated Jun 2, 2023
    Cite
    Pierre R. Bushel; Stephen S. Ferguson; Sreenivasa C. Ramaiahgari; Richard S. Paules; Scott S. Auerbach (2023). Table_2_Comparison of Normalization Methods for Analysis of TempO-Seq Targeted RNA Sequencing Data.xlsx [Dataset]. http://doi.org/10.3389/fgene.2020.00594.s002
    Explore at:
    xlsx (available download formats)
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    Frontiers
    Authors
    Pierre R. Bushel; Stephen S. Ferguson; Sreenivasa C. Ramaiahgari; Richard S. Paules; Scott S. Auerbach
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of bulk RNA sequencing (RNA-Seq) data is a valuable tool to understand transcription at the genome scale. Targeted sequencing of RNA has emerged as a practical means of assessing the majority of the transcriptomic space with less reliance on large resources for consumables and bioinformatics. TempO-Seq is a templated, multiplexed RNA-Seq platform that interrogates a panel of sentinel genes representative of genome-wide transcription. Nuances of the technology require proper preprocessing of the data. Various methods have been proposed and compared for normalizing bulk RNA-Seq data, but there has been little to no investigation of how these methods perform on TempO-Seq data. We simulated count data into two groups (treated vs. untreated) at seven fold-change (FC) levels (including no change) using control samples from human HepaRG cells run on TempO-Seq and normalized the data using seven normalization methods. Upper Quartile (UQ) performed the best with regard to maintaining FC levels as detected by a limma contrast between the treated and untreated groups. For all FC levels, the specificity of UQ normalization was greater than 0.84, and sensitivity was greater than 0.90 except for the no-change and +1.5 levels. Furthermore, K-means clustering of the simulated genes normalized by UQ agreed the most with the FC assignments [adjusted Rand index (ARI) = 0.67]. Despite assuming that the majority of genes are unchanged, the DESeq2 scaling-factor normalization method performed reasonably well, as did the simple normalization procedures counts per million (CPM) and total counts (TC). These results suggest that for two-class comparisons of TempO-Seq data, UQ, CPM, TC, or DESeq2 normalization should provide reasonably reliable results at absolute FC levels ≥2.0. These findings will help guide researchers to normalize TempO-Seq gene expression data for more reliable results.
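    Upper Quartile normalization, the best performer above, is simple to state: divide each sample's counts by the 75th percentile of its nonzero counts, then rescale by a common factor. Below is a minimal numpy sketch, an assumed implementation for illustration rather than the study's code.

```python
import numpy as np

def upper_quartile_normalize(counts, scale=1e6):
    """UQ-normalize a genes x samples count matrix: each sample is divided
    by the upper quartile (75th percentile) of its nonzero counts, then
    rescaled by a common factor."""
    counts = np.asarray(counts, dtype=float)
    normed = np.empty_like(counts)
    for j in range(counts.shape[1]):
        col = counts[:, j]
        uq = np.percentile(col[col > 0], 75)  # upper quartile of expressed genes
        normed[:, j] = col / uq * scale
    return normed

# Sample 2 is sample 1 sequenced at twice the depth; UQ removes the
# depth difference so the two normalized profiles coincide.
raw = np.array([[10, 20],
                [30, 60],
                [ 0,  0],
                [ 5, 10]])
normed = upper_quartile_normalize(raw)
```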

  4. Data from: Methods for normalizing microbiome data: an ecological...

    • data.niaid.nih.gov
    • datadryad.org
    Updated May 30, 2022
    Cite
    Huerlimann, Roger (2022). Data from: Methods for normalizing microbiome data: an ecological perspective [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_4950179
    Explore at:
    Dataset updated
    May 30, 2022
    Dataset provided by
    Huerlimann, Roger
    Schwarzkopf, Lin
    McKnight, Donald T.
    Alford, Ross A.
    Zenger, Kyall R.
    Bower, Deborah S.
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description
    1. Microbiome sequencing data often need to be normalized due to differences in read depths. Recommendations for microbiome analyses generally warn against using proportions or rarefying to normalize data and instead advocate alternatives such as upper quartile, CSS, edgeR-TMM, or DESeq-VS. Those recommendations are, however, based on studies that focused on differential abundance testing and variance standardization rather than community-level comparisons (i.e., beta diversity). Also, standardizing the within-sample variance across samples may suppress differences in species evenness, potentially distorting community-level patterns. Furthermore, the recommended methods use log transformations, which we expect to exaggerate the importance of differences among rare OTUs while suppressing the importance of differences among common OTUs.
    2. We tested these theoretical predictions via simulations and a real-world data set.
    3. Proportions and rarefying produced more accurate comparisons among communities and were the only methods that fully normalized read depths across samples. Additionally, upper quartile, CSS, edgeR-TMM, and DESeq-VS often masked differences among communities when common OTUs differed, and they produced false positives when rare OTUs differed.
    4. Based on our simulations, normalizing via proportions may be superior to other commonly used methods for comparing ecological communities.
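    The two depth normalizations the authors favor are easy to make concrete; the numpy sketch below (an illustrative implementation, not the study's code) converts an OTU table to proportions and rarefies it to a common read depth.

```python
import numpy as np

def to_proportions(counts):
    """Relative abundance: divide each sample (row) by its read depth,
    so every sample sums to 1 regardless of sequencing effort."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)

def rarefy(counts, depth, seed=0):
    """Rarefying: randomly subsample each sample's reads, without
    replacement, down to a common depth."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    out = np.zeros_like(counts)
    for i, row in enumerate(counts):
        reads = np.repeat(np.arange(row.size), row)        # one entry per read
        kept = rng.choice(reads, size=depth, replace=False)
        out[i] = np.bincount(kept, minlength=row.size)
    return out

# Two samples of the same community at 1000x and 100x read depth.
otus = np.array([[500, 300, 200],
                 [ 50,  30,  20]])
props = to_proportions(otus)    # identical rows: depth fully removed
rare = rarefy(otus, depth=100)  # every sample now has exactly 100 reads
```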

  5. Data_Sheet_1_NormExpression: An R Package to Normalize Gene Expression Data...

    • frontiersin.figshare.com
    application/cdfv2
    Updated Jun 1, 2023
    Cite
    Zhenfeng Wu; Weixiang Liu; Xiufeng Jin; Haishuo Ji; Hua Wang; Gustavo Glusman; Max Robinson; Lin Liu; Jishou Ruan; Shan Gao (2023). Data_Sheet_1_NormExpression: An R Package to Normalize Gene Expression Data Using Evaluated Methods.doc [Dataset]. http://doi.org/10.3389/fgene.2019.00400.s001
    Explore at:
    application/cdfv2 (available download formats)
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Frontiers
    Authors
    Zhenfeng Wu; Weixiang Liu; Xiufeng Jin; Haishuo Ji; Hua Wang; Gustavo Glusman; Max Robinson; Lin Liu; Jishou Ruan; Shan Gao
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data normalization is a crucial step in gene expression analysis, as it ensures the validity of downstream analyses. Although many metrics have been designed to evaluate existing normalization methods, different metrics, or the same metric applied to different datasets, yield inconsistent results, particularly for single-cell RNA sequencing (scRNA-seq) data. In the worst cases, a method evaluated as the best by one metric is evaluated as the poorest by another metric, or a method evaluated as the best on one dataset is evaluated as the poorest on another. This raises an open question: principles need to be established to guide the evaluation of normalization methods. In this study, we propose the principle that a normalization method evaluated as the best by one metric should also be evaluated as the best by another metric (the consistency of metrics), and that a method evaluated as the best using scRNA-seq data should also be evaluated as the best using bulk RNA-seq data or microarray data (the consistency of datasets). We then designed a new metric named Area Under normalized CV threshold Curve (AUCVC) and applied it, together with another metric, mSCC, to evaluate 14 commonly used normalization methods using both scRNA-seq and bulk RNA-seq data, satisfying the consistency of metrics and the consistency of datasets. Our findings pave the way for future studies on the normalization of gene expression data and its evaluation. The raw gene expression data, normalization methods, and evaluation metrics used in this study have been included in an R package named NormExpression. NormExpression provides a framework and a fast and simple way for researchers to select the best method for the normalization of their gene expression data, based on the evaluation of different methods (particularly data-driven methods or their own methods) under the principles of the consistency of metrics and the consistency of datasets.

  6. A radiometric normalization dataset of Shandong Province based on Gaofen-1...

    • scidb.cn
    Updated Feb 20, 2020
    Cite
    黄莉婷; 焦伟利; 龙腾飞 (2020). A radiometric normalization dataset of Shandong Province based on Gaofen-1 WFV image (2018) [Dataset]. http://doi.org/10.11922/sciencedb.947
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 20, 2020
    Dataset provided by
    Science Data Bank
    Authors
    黄莉婷; 焦伟利; 龙腾飞
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Shandong
    Description

    Surface reflectance is a critical physical variable that affects the energy budget in land-atmosphere interactions, feature recognition and classification, and climate change research. This dataset was produced with a relative radiometric normalization method, taking Landsat-8 Operational Land Imager (OLI) surface reflectance products as the reference images to normalize the cloud-free GF-1 satellite WFV sensor images of Shandong Province in 2018. Relative radiometric normalization processing mainly includes atmospheric correction, image resampling, image registration, masking, extraction of no-change pixels, and calculation of normalization coefficients. After relative radiometric normalization, for the no-change pixels of each GF-1 WFV image and its reference image, R² is above 0.7295 and RMSE is below 0.0172. The surface reflectance accuracy of the GF-1 WFV images is thereby improved, so they can be used together with Landsat data to provide data support for quantitative remote sensing inversion. This dataset is in GeoTIFF format, and the spatial resolution of the imagery is 16 m.
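    The last step of the pipeline, calculating normalization coefficients from the no-change pixels, is essentially a per-band linear fit. The numpy sketch below is a hedged illustration of that idea (an ordinary least-squares gain/offset fit on a synthetic image pair), not the exact procedure used to produce this dataset.

```python
import numpy as np

def fit_normalization_coefficients(subject, reference, no_change_mask):
    """Fit gain/offset mapping the subject band (e.g. GF-1 WFV reflectance)
    onto the reference band (e.g. Landsat-8 OLI), using only pixels
    identified as unchanged between the two acquisitions."""
    x = subject[no_change_mask].astype(float)
    y = reference[no_change_mask].astype(float)
    gain, offset = np.polyfit(x, y, 1)  # least squares: y ~= gain * x + offset
    return gain, offset

def apply_normalization(subject, gain, offset):
    """Radiometrically normalize the subject band to the reference scale."""
    return gain * subject + offset

# Synthetic check: the reference differs by a known gain and offset.
subject = np.linspace(0.05, 0.40, 100).reshape(10, 10)
reference = 1.1 * subject + 0.02
mask = np.ones_like(subject, dtype=bool)  # pretend all pixels are unchanged
gain, offset = fit_normalization_coefficients(subject, reference, mask)
```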

  7. ARCS White Beam Vanadium Normalization for SNS Cycle 2021A

    • osti.gov
    Updated Oct 11, 2023
    Cite
    Abernathy, Douglas; Goyette, Rick; Granroth, Garrett (2023). ARCS White Beam Vanadium Normalization for SNS Cycle 2021A [Dataset]. https://www.osti.gov/dataexplorer/biblio/dataset/2008337-arcs-white-beam-vanadium-normalization-sns-cycle
    Explore at:
    Dataset updated
    Oct 11, 2023
    Dataset provided by
    Office of Science (http://www.er.doe.gov/)
    United States Department of Energy (http://energy.gov/)
    High Flux Isotope Reactor (HFIR) & Spallation Neutron Source (SNS), Oak Ridge National Laboratory (ORNL); Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Spallation Neutron Source (SNS)
    Authors
    Abernathy, Douglas; Goyette, Rick; Granroth, Garrett
    Description

    This is a white-beam data set from vanadium (V) used to normalize the relative detector performance. See the ARCS_175852.md file for more information.

  8. Additional file 2 of Pooling across cells to normalize single-cell RNA...

    • springernature.figshare.com
    txt
    Updated Jun 1, 2023
    Cite
    Aaron L. Lun; Karsten Bach; John Marioni (2023). Additional file 2 of Pooling across cells to normalize single-cell RNA sequencing data with many zero counts [Dataset]. http://doi.org/10.6084/m9.figshare.c.3629252_D1.v1
    Explore at:
    txtAvailable download formats
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Aaron L. Lun; Karsten Bach; John Marioni
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Enriched GO terms for deconvolution. This file is in a tab-separated format and contains the top 200 GO terms that were enriched in the set of DE genes unique to deconvolution. The identifier and name of each term are shown along with the total number of genes associated with the term, the number of associated genes that are also DE, the expected number under the null hypothesis, and the Fisher p-value. (13 KB PDF)

  9. Pi-plus-minus p elastic scattering in the 2-gev region

    • doi.org
    • hepdata.net
    Updated Sep 2, 2015
    Cite
    (2015). Pi-plus-minus p elastic scattering in the 2-gev region [Dataset]. http://doi.org/10.17182/hepdata.6227.v1
    Explore at:
    Dataset updated
    Sep 2, 2015
    Description

    ALL DATA IN THIS RECORD ARE REDUNDANT, I.E., THEY WERE OBTAINED DIRECTLY FROM OTHER DATA IN THIS FILE, USUALLY BY EXTRAPOLATION OR INTEGRATION. THE FOLLOWING COMMENTS ARE TAKEN FROM THE PI N COMPILATION OF R.L. KELLY. THEY ARE THAT COMPILATION'S COMPLETE SET OF COMMENTS FOR PAPERS RELATED TO THE SAME EXPERIMENT (DESIGNATED BUSZA69) AS THE CURRENT PAPER. (THE IDENTIFIER PRECEDING THE REFERENCE AND COMMENT FOR EACH PAPER IS FOR CROSS-REFERENCING WITHIN THESE COMMENTS ONLY AND DOES NOT NECESSARILY AGREE WITH THE SHORT CODE USED ELSEWHERE IN THE PRESENT COMPILATION.) /// BELLAMY65 [E. H. BELLAMY, PROC. ROY. SOC. (LONDON) 289, 509 (1965)] -- /// BUSZA67 [W. BUSZA, NC 52A, 331 (1967)] -- PI- P DCS FROM 2K ELASTIC EVENTS AT EACH OF 5 MOMENTA BETWEEN 1.72 AND 2.46 GEV/C. DONE AT NIMROD WITH OPTICAL SPARK CHAMBERS. THE APPARATUS IS DESCRIBED IN BELLAMY65, THE RESULTS IN BUSZA67. /// BUSZA69 [W. BUSZA, PR 180, 1339 (1969)] -- PI+ P DCS AT 10 MOMENTA BETWEEN 1.72 AND 2.80 GEV/C, AND PI- P DCS AT 5 MOMENTA BETWEEN 2.17 AND 2.80 GEV/C. THE DATA REPORTED IN BUSZA67 ARE ALSO REPEATED HERE. THE NEW MEASUREMENTS WERE DONE WITH AN IMPROVED VERSION OF THE APPARATUS USED BY BUSZA67. THE PI- DATA (INCLUDING BUSZA67) ARE NORMALIZED TO FORWARD DISPERSION RELATIONS; THE PI+ DATA HAS ITS OWN EXPERIMENTAL NORMALIZATION BUT NO NE IS GIVEN. WE HAVE INCREASED THE ERROR OF THE MOST FORWARD PI+ POINT AT 1.72 GEV/C BECAUSE OF AN AMBIGUOUS FOOTNOTE CONCERNING THIS POINT. /// COMMENTS FROM LOVELACE71 COMPILATION OF THESE DATA -- LOVELACE71 CLAIMS SOME USE WAS MADE OF FORWARD DISPERSION RELATIONS TO NORMALIZE THE PI+ DATA AS WELL AS THE PI-. THE FOLLOWING NORMALIZATION ERRORS (NE) AND RENORMALIZATION FACTORS (RF) ARE RECOMMENDED FOR THE PI+ P AND PI- P DIFFERENTIAL CROSS SECTIONS -- PLAB=1720 MEV/C -- NE(PI+ P)=INFIN, NE(PI- P)=INFIN. PLAB=1890 MEV/C -- RF(PI+ P)=1.245, RF(PI- P)=0.941. PLAB=2070 MEV/C -- NE(PI+ P)=INFIN, RF(PI- P)=1.224. PLAB=2170 MEV/C -- NE(PI+ P)=0.1, NE(PI- P)=0.1. PLAB=2270 MEV/C -- NE(PI+ P)=0.1, NE(PI- P)=INFIN. PLAB=2360 MEV/C -- NE(PI+ P)=0.1, NE(PI- P)=0.1. PLAB=2460 MEV/C -- NE(PI+ P)=0.1, NE(PI- P)=INFIN. PLAB=2560 MEV/C -- NE(PI+ P)=0.1, NE(PI- P)=0.1. PLAB=2650 MEV/C -- NE(PI+ P)=0.1, NE(PI- P)=0.1. PLAB=2800 MEV/C -- NE(PI+ P)=0.1, NE(PI- P)=0.1. /// COMMENTS ON MODIFICATIONS TO LOVELACE71 COMPILATION BY KELLY -- WE HAVE TAKEN ALL PI- NES TO BE INFINITE, AND ALL PI+ NES TO BE UNKNOWN. ALSO, ONE MINOR MISTAKE IN THE PI- (PI+) DATA AT 2.36 (2.65) GEV/C HAS BEEN CORRECTED.

  10. Residential Existing Homes (One to Four Units) Energy Efficiency Meter...

    • catalog.data.gov
    • datasets.ai
    • +2more
    Updated Sep 15, 2023
    Cite
    data.ny.gov (2023). Residential Existing Homes (One to Four Units) Energy Efficiency Meter Evaluated Project Data: 2007 – 2012 [Dataset]. https://catalog.data.gov/dataset/residential-existing-homes-one-to-four-units-energy-efficiency-meter-evaluated-projec-2007
    Explore at:
    Dataset updated
    Sep 15, 2023
    Dataset provided by
    data.ny.gov
    Description

    IMPORTANT! PLEASE READ DISCLAIMER BEFORE USING DATA. This dataset backcasts estimated modeled savings for a subset of 2007-2012 completed projects in the Home Performance with ENERGY STAR® Program against normalized savings calculated by an open source energy efficiency meter available at https://www.openee.io/. The open source code uses utility-grade metered consumption to weather-normalize the pre- and post-retrofit consumption data using standard methods with no discretionary independent variables. The open source energy efficiency meter allows private companies, utilities, and regulators to calculate energy savings from energy efficiency retrofits with increased confidence and replicability of results. This dataset is intended to lay a foundation for future innovation and deployment of the open source energy efficiency meter across the residential energy sector, and to help inform stakeholders interested in pay-for-performance programs, where providers are paid for realizing measurable weather-normalized results. To download the open source code, please visit https://github.com/openeemeter/eemeter/releases

    D I S C L A I M E R: Normalized Savings using the open source OEE meter. Several data elements, including Evaluated Annual Electric Savings (kWh), Evaluated Annual Gas Savings (MMBtu), Pre-retrofit Baseline Electric (kWh), Pre-retrofit Baseline Gas (MMBtu), Post-retrofit Usage Electric (kWh), and Post-retrofit Usage Gas (MMBtu), are direct outputs from the open source OEE meter.

    Home Performance with ENERGY STAR® Estimated Savings. Several data elements, including Estimated Annual kWh Savings, Estimated Annual MMBtu Savings, and Estimated First Year Energy Savings, represent contractor-reported savings derived from energy modeling software calculations and not actual realized energy savings. The accuracy of the Estimated Annual kWh Savings and Estimated Annual MMBtu Savings for projects has been evaluated by an independent third party. The results of the Home Performance with ENERGY STAR impact analysis indicate that, on average, actual savings amount to 35 percent of the Estimated Annual kWh Savings and 65 percent of the Estimated Annual MMBtu Savings. For more information, please refer to the Evaluation Report published on NYSERDA's website at: http://www.nyserda.ny.gov/-/media/Files/Publications/PPSER/Program-Evaluation/2012ContractorReports/2012-HPwES-Impact-Report-with-Appendices.pdf.

    This dataset includes the following data points for a subset of projects completed in 2007-2012: Contractor ID, Project County, Project City, Project ZIP, Climate Zone, Weather Station, Weather Station-Normalization, Project Completion Date, Customer Type, Size of Home, Volume of Home, Number of Units, Year Home Built, Total Project Cost, Contractor Incentive, Total Incentives, Amount Financed through Program, Estimated Annual kWh Savings, Estimated Annual MMBtu Savings, Estimated First Year Energy Savings, Evaluated Annual Electric Savings (kWh), Evaluated Annual Gas Savings (MMBtu), Pre-retrofit Baseline Electric (kWh), Pre-retrofit Baseline Gas (MMBtu), Post-retrofit Usage Electric (kWh), Post-retrofit Usage Gas (MMBtu), Central Hudson, Consolidated Edison, LIPA, National Grid, National Fuel Gas, New York State Electric and Gas, Orange and Rockland, Rochester Gas and Electric.

    How does your organization use this dataset? What other NYSERDA or energy-related datasets would you like to see on Open NY? Let us know by emailing OpenNY@nyserda.ny.gov.

  11. Attributes for NHDPlus Catchments (Version 1.1) for the Conterminous United...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Sep 18, 2024
    Cite
    U.S. Geological Survey (2024). Attributes for NHDPlus Catchments (Version 1.1) for the Conterminous United States: Normalized Atmospheric Deposition for 2002, Total Inorganic Nitrogen [Dataset]. https://catalog.data.gov/dataset/attributes-for-nhdplus-catchments-version-1-1-for-the-conterminous-united-states-normalize
    Explore at:
    Dataset updated
    Sep 18, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Contiguous United States, United States
    Description

    This data set represents the average normalized atmospheric (wet) deposition, in kilograms, of Total Inorganic Nitrogen for the year 2002 compiled for every catchment of NHDPlus for the conterminous United States. Estimates of Total Inorganic Nitrogen deposition are based on National Atmospheric Deposition Program (NADP) measurements (B. Larsen, U.S. Geological Survey, written commun., 2007). De-trending methods applied to the year 2002 are described in Alexander and others, 2001. NADP site selection met the following criteria: stations must have records from 1995 to 2002 and have a minimum of 30 observations. The NHDPlus Version 1.1 is an integrated suite of application-ready geospatial datasets that incorporates many of the best features of the National Hydrography Dataset (NHD) and the National Elevation Dataset (NED). The NHDPlus includes a stream network (based on the 1:100,000-scale NHD), improved networking, naming, and value-added attributes (VAAs). NHDPlus also includes elevation-derived catchments (drainage areas) produced using a drainage enforcement technique first widely used in New England, and thus referred to as "the New England Method." This technique involves "burning in" the 1:100,000-scale NHD and, when available, building "walls" using the National Watershed Boundary Dataset (WBD). The resulting modified digital elevation model (HydroDEM) is used to produce hydrologic derivatives that agree with the NHD and WBD. Over the past two years, an interdisciplinary team from the U.S. Geological Survey (USGS), the U.S. Environmental Protection Agency (USEPA), and contractors found that this method produces the best quality NHD catchments using an automated process (USEPA, 2007). The NHDPlus dataset is organized by 18 Production Units that cover the conterminous United States. The NHDPlus version 1.1 data are grouped by the U.S. Geological Survey's Major River Basins (MRBs, Crawford and others, 2006).
    MRB1, covering the New England and Mid-Atlantic River basins, contains NHDPlus Production Units 1 and 2. MRB2, covering the South Atlantic-Gulf and Tennessee River basins, contains NHDPlus Production Units 3 and 6. MRB3, covering the Great Lakes, Ohio, Upper Mississippi, and Souris-Red-Rainy River basins, contains NHDPlus Production Units 4, 5, 7 and 9. MRB4, covering the Missouri River basins, contains NHDPlus Production Units 10-lower and 10-upper. MRB5, covering the Lower Mississippi, Arkansas-White-Red, and Texas-Gulf River basins, contains NHDPlus Production Units 8, 11 and 12. MRB6, covering the Rio Grande, Colorado and Great Basin River basins, contains NHDPlus Production Units 13, 14, 15 and 16. MRB7, covering the Pacific Northwest River basins, contains NHDPlus Production Unit 17. MRB8, covering California River basins, contains NHDPlus Production Unit 18.

  12. Data from: Analysis Ready Data Sensitivity Analyses

    • ecat.ga.gov.au
    Updated Jun 19, 2024
    Cite
    (2024). Analysis Ready Data Sensitivity Analyses [Dataset]. https://ecat.ga.gov.au/geonetwork/static9008124/search?keyword=surface%20reflectance
    Explore at:
    Dataset updated
    Jun 19, 2024
    Description

    CEOS Analysis Ready Data for Land (CARD4L) are satellite data that have been processed to a minimum set of requirements and organized into a form that allows immediate analysis with a minimum of additional user effort, and interoperability both through time and with other datasets [1]. In this paper, key input data (e.g. aerosol optical depth, precipitable water, BRDF parameters) needed for atmospheric and BRDF corrections of Landsat data are identified, and a sensitivity analysis is conducted using outputs of a physics-based atmospheric and BRDF model. The results show that aerosol has the greatest impact on the visible bands, where the average variation of reflectance can reach 0.05 reflectance units. The variation over dark targets can be much higher, so aerosol optical depth is a critical parameter for aquatic applications. By contrast, precipitable water (water vapor in the rest of the paper) only impacts the near-infrared (NIR) and shortwave infrared (SWIR) bands, and the extent of change is much smaller. BRDF parameters impact time series most on winter and summer images of highly anisotropic areas and when they are normalized to a 45° solar angle. Different BRDF levels for different spectrum ranges impact not only the magnitude of reflectance but also the signature of these areas. It appears necessary to normalize surface BRDF to ensure time-series consistency of the Landsat ARD product. Abstract presented at the 2019 IEEE International Geoscience and Remote Sensing Symposium (IGARSS).

  13. Data from: Attributes for NHDPlus Catchments (Version 1.1) for the...

    • catalog.data.gov
    • data.usgs.gov
    • +2more
    Updated Nov 1, 2024
    Cite
    U.S. Geological Survey (2024). Attributes for NHDPlus Catchments (Version 1.1) for the Conterminous United States: Normalized Atmospheric Deposition for 2002, Ammonium (NH4) [Dataset]. https://catalog.data.gov/dataset/attributes-for-nhdplus-catchments-version-1-1-for-the-conterminous-united-states-normalize-57d70
    Explore at:
    Dataset updated
    Nov 1, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Contiguous United States, United States
    Description

    This data set represents the average normalized atmospheric (wet) deposition, in kilograms, of Ammonium (NH4) for the year 2002 compiled for every catchment of NHDPlus for the conterminous United States. Estimates of NH4 deposition are based on National Atmospheric Deposition Program (NADP) measurements (B. Larsen, U.S. Geological Survey, written commun., 2007). De-trending methods applied to the year 2002 are described in Alexander and others, 2001. NADP site selection met the following criteria: stations must have records from 1995 to 2002 and have a minimum of 30 observations. The NHDPlus Version 1.1 is an integrated suite of application-ready geospatial datasets that incorporates many of the best features of the National Hydrography Dataset (NHD) and the National Elevation Dataset (NED). The NHDPlus includes a stream network (based on the 1:100,00-scale NHD), improved networking, naming, and value-added attributes (VAAs). NHDPlus also includes elevation-derived catchments (drainage areas) produced using a drainage enforcement technique first widely used in New England, and thus referred to as "the New England Method." This technique involves "burning in" the 1:100,000-scale NHD and when available building "walls" using the National Watershed Boundary Dataset (WBD). The resulting modified digital elevation model (HydroDEM) is used to produce hydrologic derivatives that agree with the NHD and WBD. Over the past two years, an interdisciplinary team from the U.S. Geological Survey (USGS), and the U.S. Environmental Protection Agency (USEPA), and contractors, found that this method produces the best quality NHD catchments using an automated process (USEPA, 2007). The NHDPlus dataset is organized by 18 Production Units that cover the conterminous United States. The NHDPlus version 1.1 data are grouped by the U.S. Geologic Survey's Major River Basins (MRBs, Crawford and others, 2006). 
MRB1, covering the New England and Mid-Atlantic River basins, contains NHDPlus Production Units 1 and 2. MRB2, covering the South Atlantic-Gulf and Tennessee River basins, contains NHDPlus Production Units 3 and 6. MRB3, covering the Great Lakes, Ohio, Upper Mississippi, and Souris-Red-Rainy River basins, contains NHDPlus Production Units 4, 5, 7 and 9. MRB4, covering the Missouri River basins, contains NHDPlus Production Units 10-lower and 10-upper. MRB5, covering the Lower Mississippi, Arkansas-White-Red, and Texas-Gulf River basins, contains NHDPlus Production Units 8, 11 and 12. MRB6, covering the Rio Grande, Colorado and Great Basin River basins, contains NHDPlus Production Units 13, 14, 15 and 16. MRB7, covering the Pacific Northwest River basins, contains NHDPlus Production Unit 17. MRB8, covering California River basins, contains NHDPlus Production Unit 18.
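The NADP site-selection rule quoted above (station records spanning 1995 to 2002 with at least 30 observations) is simple to express in code. A minimal sketch, assuming observations arrive as (station_id, year) pairs; that layout and the function name are chosen here purely for illustration:

```python
# Keep only stations whose records span the study window and that have
# enough observations; thresholds default to the criteria in the text.
from collections import defaultdict

def select_stations(observations, start=1995, end=2002, min_obs=30):
    """observations: iterable of (station_id, year) tuples, one per record."""
    years = defaultdict(list)
    for station, year in observations:
        years[station].append(year)
    # A station qualifies if its record spans [start, end] and is long enough.
    return sorted(
        s for s, ys in years.items()
        if min(ys) <= start and max(ys) >= end and len(ys) >= min_obs
    )
```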

  14. VIIRS Satellite Remote Sensing of Ocean Color, January-April 2016, Gulf of Mexico

    • data.griidc.org
    • search.dataone.org
    Updated Feb 14, 2019
    + more versions
    Cite
    Robert (Bob) Arnone (2019). VIIRS Satellite Remote Sensing of Ocean Color, January-April 2016, Gulf of Mexico [Dataset]. http://doi.org/10.7266/N7416V4D
    Explore at:
    Dataset updated
    Feb 14, 2019
    Dataset provided by
    GRIIDC
    Authors
    Robert (Bob) Arnone
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Gulf of Mexico (Gulf of America)
    Description

    The spatial extent of surface bio-optical properties (chlorophyll-a, backscattering, absorption, light attenuation, etc.) can be determined from changes in satellite-detected water-leaving radiance (Lw), enabling a synoptic, albeit two-dimensional, view of various ocean parameters on a global basis. This wider field of view will enable a large-scale characterization of water properties in the sampling area, providing information that will allow the sampling vessel to target key features, such as river filaments. We have provided satellite-derived measurements from the Visible Infrared Imaging Radiometer Suite (VIIRS) aboard the Suomi National Polar-orbiting Partnership (SNPP) satellite. This sensor has 7 visible/near-infrared moderate resolution (M)-bands (wavelengths = 410, 443, 486, 551, 671, 745, 865 nm) at a spatial resolution of 750 m. Raw calibrated radiance products were orthorectified and processed by USM to correct for atmospheric components and obtain the radiance signal emitted from the ocean (Lw). The spectral Lw is the basis from which subsequent bio-optical algorithms are calculated.

  15. UniCourt Court Data API - USA Court Records (AI Normalized)

    • datarade.ai
    .json, .csv, .xls
    Updated Jul 8, 2022
    Cite
    UniCourt (2022). UniCourt Court Data API - USA Court Records (AI Normalized) [Dataset]. https://datarade.ai/data-products/court-data-api-unicourt-2c86
    Explore at:
    .json, .csv, .xlsAvailable download formats
    Dataset updated
    Jul 8, 2022
    Dataset provided by
    Unicourt
    Authors
    UniCourt
    Area covered
    United States
    Description

    UniCourt simplifies access to structured court records with our Court Data API, so you can search court cases via API, get real-time alerts with webhooks, streamline your account management, and get bulk access to the AI normalized court data you need.

    Search Court Cases with APIs

    • Leverage UniCourt’s easy API integrations to search state and federal (PACER) court records directly from your own internal applications and systems.
    • Access the docket entries and case details you need on the parties, attorneys, law firms, and judges involved in litigation.
    • Conduct the same detailed case searches you can in our app with our APIs and easily narrow your search results using our jurisdiction, case type, and case status filters.
    • Use our Related Cases API to search for and download all of the court data for consolidated cases from the Judicial Panel on Multidistrict Litigation, as well as associated civil and criminal cases from U.S. District Courts.
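A search call like the ones described above typically reduces to a parameterized GET request. The sketch below is a hypothetical illustration: the endpoint path, parameter names, and auth header are assumptions, not UniCourt's documented API.

```python
# Hypothetical client sketch for a court-records search API. The endpoint
# path and parameter names are illustrative assumptions.
import json
import urllib.parse
import urllib.request

def build_search_url(base_url, query, case_status=None):
    """Compose a case-search URL with an optional status filter."""
    params = {"q": query}
    if case_status:
        params["case_status"] = case_status   # e.g. "open" (assumed value)
    return f"{base_url}/search/cases?" + urllib.parse.urlencode(params)

def search_cases(base_url, token, query, **filters):
    """Issue the GET request with a bearer token and decode the JSON body."""
    req = urllib.request.Request(
        build_search_url(base_url, query, **filters),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```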

    Get Real-Time Alerts with Webhooks

    • UniCourt’s webhooks provide you with industry-leading automation tools for real-time push notifications to your internal applications for all your case tracking needs.
    • Get daily court data feeds with new case results for your automated court searches pushed directly to your applications in a structured format.
    • Use our custom search file webhook to search for and track thousands of entities at once and receive your results packaged into a custom CSV file.
    • Avoid making multiple API calls to figure out if a case has updates or not and remove the need to continuously check the status of large document orders and updates.

    Bulk Access to Court Data

    • UniCourt downloads thousands of new cases every day from state and federal courts, and we structure them, normalize them with our AI, and make them accessible in bulk via our Court Data API.
    • Our rapidly growing CrowdSourced Library™ provides you with a massive free repository of 100+ million court cases, tens of millions of court documents, and billions of docket entries all at your fingertips.
    • Leverage your bulk access to AI normalized court data that’s been enriched with other public data sets to build your own analytics, competitive intelligence, and machine learning models.

    Streamlined Account Management

    • Easily manage your UniCourt account with information on your billing cycle and billing usage delivered to you via API.
    • Eliminate the requirement of logging in to your account to get a list of all of your invoices and use our APIs to directly download the invoices you need.
    • Get detailed data on which cases are being tracked by the users for your account and access all of the related tracking schedules for cases your users are tracking.
    • Gather complete information on the saved searches being run by your account, including the search parameters, filters, and much more.

  16. Data from: Filamentation and restoration of normal growth in Escherichia coli using a combined CRISPRi sgRNA/antisense RNA approach

    • datadryad.org
    • search.dataone.org
    • +2more
    zip
    Updated Aug 28, 2019
    Cite
    Andrea Mückl; Matthaeus Schwarz-Schilling; Katrin Fischer; Friedrich C. Simmel (2019). Filamentation and restoration of normal growth in Escherichia coli using a combined CRISPRi sgRNA/antisense RNA approach [Dataset]. http://doi.org/10.5061/dryad.t153690
    Explore at:
    zipAvailable download formats
    Dataset updated
    Aug 28, 2019
    Dataset provided by
    Dryad
    Authors
    Andrea Mückl; Matthaeus Schwarz-Schilling; Katrin Fischer; Friedrich C. Simmel
    Time period covered
    2019
    Description

    Fig_2b_c_cell-free: This file contains the data from Fig 2. mVenus fluorescence intensities were measured with a plate reader. mVenus was expressed from a purified plasmid in an E. coli based cell extract, and purified sgRNA, asgRNA and/or dCas9 were supplemented.

    Fig_2d_RT-qPCR: This zip-file contains the following files: "ftsZ -dRFU.xls" contains the data for the melt curve in Figure S3 (inset); "Cq ftsZ.xls" contains the data shown in Figure S3A; "Cq ref genes.xls" contains the data used to normalize the data shown in Figure 2D; "ftsZgenePos.xls" contains the position list and sample names for the data in "cq ftsZ" and "ftsZ -dRFU"; "RefgenesPos.xls" contains the position list and sample names for the data in "cq ref genes".

    Fig_3a_Images: This file contains the phase-contrast microscopy images of the induced cells at different time points used for the histogram of the cell lengths shown in Fig. 3a. The raw data for Fig. 3a are in the Excel file “Fig_3a_CellLength.xlsx”.

    Fig_...

  17. MEDDOPROF corpus: complete gold standard annotations for occupation detection in medical documents in Spanish

    • zenodo.org
    zip
    Updated May 22, 2023
    Cite
    Salvador Lima-López; Salvador Lima-López; Eulàlia Farré-Maduell; Antonio Miranda-Escalada; Antonio Miranda-Escalada; Vicent Briva-Iglesias; Martin Krallinger; Martin Krallinger; Eulàlia Farré-Maduell; Vicent Briva-Iglesias (2023). MEDDOPROF corpus: complete gold standard annotations for occupation detection in medical documents in Spanish [Dataset]. http://doi.org/10.5281/zenodo.5070541
    Explore at:
    zipAvailable download formats
    Dataset updated
    May 22, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Salvador Lima-López; Salvador Lima-López; Eulàlia Farré-Maduell; Antonio Miranda-Escalada; Antonio Miranda-Escalada; Vicent Briva-Iglesias; Martin Krallinger; Martin Krallinger; Eulàlia Farré-Maduell; Vicent Briva-Iglesias
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The MEDDOPROF Shared Task tackles the detection of occupations and employment statuses in clinical cases in Spanish from different specialties. Systems capable of automatically processing clinical texts are of interest to the medical community, social workers, researchers, the pharmaceutical industry, computer engineers, AI developers, policy makers, citizens’ associations and patients. Additionally, other NLP tasks (such as anonymization) can also benefit from this type of data.

    MEDDOPROF has three different sub-tasks:

    1) MEDDOPROF-NER: Participants must find the beginning and end of occupation mentions and classify them as PROFESION (PROFESSION), SITUACION_LABORAL (WORKING_STATUS) or ACTIVIDAD (ACTIVITY).

    2) MEDDOPROF-CLASS: Participants must find the beginning and end of occupation mentions and classify them according to their referent (PACIENTE [patient], FAMILIAR [family member], SANITARIO [health professional] or OTRO [other]).

    3) MEDDOPROF-NORM: Participants must find the beginning and end of occupation mentions and normalize them according to a reference codes list.

    This is the complete Gold Standard. Annotations for the NER and CLASS sub-tracks are provided both separately and joined together (with each annotation level separated by a dash, e.g. PROFESION-PACIENTE). The normalized mentions are given as a tab-separated file (.tsv) with four columns: filename, mention text, span and code.
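Given the four-column layout described above (filename, mention text, span, code), the normalized-mentions file can be read in a few lines. A minimal sketch, assuming no header row; the function and key names are illustrative:

```python
# Read a MEDDOPROF-NORM-style .tsv into a list of dicts, one per mention.
import csv

def read_norm_annotations(path):
    rows = []
    with open(path, encoding="utf-8") as fh:
        for filename, mention, span, code in csv.reader(fh, delimiter="\t"):
            rows.append({"filename": filename, "mention": mention,
                         "span": span, "code": code})
    return rows
```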

    Please cite if you use this resource:

    Salvador Lima-López, Eulàlia Farré-Maduell, Antonio Miranda-Escalada, Vicent Brivá-Iglesias and Martin Krallinger. NLP applied to occupational health: MEDDOPROF shared task at IberLEF 2021 on automatic recognition, classification and normalization of professions and occupations from medical texts. In Procesamiento del Lenguaje Natural, 67. 2021.

    @article{meddoprof,
      title = {NLP applied to occupational health: MEDDOPROF shared task at IberLEF 2021 on automatic recognition, classification and normalization of professions and occupations from medical texts},
      author = {Lima-López, Salvador and Farré-Maduell, Eulàlia and Miranda-Escalada, Antonio and Brivá-Iglesias, Vicent and Krallinger, Martin},
      journal = {Procesamiento del Lenguaje Natural},
      volume = {67},
      year = {2021},
      issn = {1989-7553},
      url = {http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6393},
      pages = {243--256}
    }

    Resources:

    - Web

    - Training Data

    - Test set

    - Codes Reference List (for MEDDOPROF-NORM)

    - Annotation Guidelines

    MEDDOPROF is part of the IberLEF 2021 workshop, which is co-located with the SEPLN 2021 conference. For further information, please visit https://temu.bsc.es/meddoprof/ or email us at encargo-pln-life@bsc.es

    MEDDOPROF is promoted by the Plan de Impulso de las Tecnologías del Lenguaje de la Agenda Digital (Plan TL).

  18. Auroral Electrojet (AE, AL, AO, AU) - A Global Measure of Auroral Zone Magnetic Activity

    • catalog.data.gov
    • ncei.noaa.gov
    • +1more
    Updated Oct 18, 2024
    + more versions
    Cite
    DOC/NOAA/NESDIS/NCEI > National Centers for Environmental Information, NESDIS, NOAA, U.S. Department of Commerce (Point of Contact) (2024). Auroral Electrojet (AE, AL, AO, AU) - A Global Measure of Auroral Zone Magnetic Activity [Dataset]. https://catalog.data.gov/dataset/auroral-electrojet-ae-al-ao-au-a-global-measure-of-auroral-zone-magnetic-activity2
    Explore at:
    Dataset updated
    Oct 18, 2024
    Dataset provided by
    United States Department of Commerce (http://www.commerce.gov/)
    National Centers for Environmental Information (https://www.ncei.noaa.gov/)
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    National Environmental Satellite, Data, and Information Service
    Description

    The AE index is derived from geomagnetic variations in the horizontal component observed at selected (10-13) observatories along the auroral zone in the northern hemisphere. To normalize the data, a base value for each station is first calculated for each month by averaging all the data from the station on the five international quietest days. This base value is subtracted from each value of one-minute data obtained at the station during that month. Then, among the data from all the stations at each given time (UT), the largest and smallest values are selected. The AU and AL indices are respectively defined by the largest and the smallest values so selected. The symbols AU and AL derive from the fact that these values form the upper and lower envelopes of the superposed plots of all the data from these stations as functions of UT. The difference, AU minus AL, defines the AE index, and the mean value of AU and AL, i.e. (AU+AL)/2, defines the AO index. The term "AE indices" is usually used to represent these four indices (AU, AL, AE and AO). The AU and AL indices are intended to express the strongest current intensity of the eastward and westward auroral electrojets, respectively. The AE index represents the overall activity of the electrojets, and the AO index provides a measure of the equivalent zonal current.
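The derivation above (quiet-day baseline per station, then upper and lower envelopes across stations) can be sketched as follows. The array layout and function name are assumptions for illustration, not the official index-production code:

```python
# Compute AU, AL, AE, AO from one month of one-minute station data.
import numpy as np

def ae_indices(h, quiet_minute_mask):
    """h: (n_stations, n_minutes) horizontal-component values for the month.
    quiet_minute_mask: boolean mask over the minute axis selecting the five
    international quietest days."""
    # Base value per station: mean over the five quietest days.
    base = h[:, quiet_minute_mask].mean(axis=1, keepdims=True)
    dev = h - base               # deviation from the quiet-day baseline
    au = dev.max(axis=0)         # upper envelope across stations
    al = dev.min(axis=0)         # lower envelope across stations
    ae = au - al                 # overall electrojet activity
    ao = (au + al) / 2           # equivalent zonal current
    return au, al, ae, ao
```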

  19. File Merge Tool (aLink)

    • data.europa.eu
    octet stream
    Updated Mar 17, 2024
    + more versions
    Cite
    Junta de Andalucía (2024). File Merge Tool (aLink) [Dataset]. https://data.europa.eu/data/datasets/https-pdpopendata-ckan-paas-junta-andalucia-es-datosabiertos-portal-dataset-2020202012a06258-afa8-4670-9bbe-0192c9d004cb?locale=en
    Explore at:
    octet streamAvailable download formats
    Dataset updated
    Mar 17, 2024
    Dataset provided by
    Regional Government of Andalusia (http://www.juntadeandalucia.es/)
    Authors
    Junta de Andalucía
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    An IT application called aLink: a file-merging tool that combines a series of techniques in different stages to merge files from large volumes of data. In addition to linking files through probabilistic matching on common variables, it can also normalize variables containing postal addresses, people's names and surnames, and DNI, NIF or NIE (Foreigner Identification Number) identifiers.

  20. Mother's own milk normalize immune system development in extremely preterm infants

    • data.mendeley.com
    Updated Mar 10, 2025
    Cite
    Ziyang Tan (2025). Mother's own milk normalize immune system development in extremely preterm infants [Dataset]. http://doi.org/10.17632/fpc6ypbsts.2
    Explore at:
    Dataset updated
    Mar 10, 2025
    Authors
    Ziyang Tan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset is used to reproduce figures for the paper "Mother's own milk normalize immune system development in extremely preterm infants". The scripts are shared under https://github.com/Brodinlab/preterm_DHA.
