16 datasets found

GEO gene expression dataset recompute for selected tumor samples
data.niaid.nih.gov
Updated May 13, 2024
Visentin, Luca (2024). GEO gene expression dataset recompute for selected tumor samples [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10817923
Dataset updated
May 13, 2024
Dataset authored and provided by
Visentin, Luca
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We aligned and quantified RNA-Seq data present in GEO with a standardized pipeline to homogenize data preprocessing for downstream applications.

All uploaded files are UTF-8, .csv-formatted matrices. The *_expected_count.csv.gz files are unlogged, raw expression counts as reported by rsem-quantify-expression (see details below). The associated *_metadata.csv.gz files contain metadata pertinent to each column of the corresponding expression matrix. Some metadata files may have more rows than the expression matrix has sample columns. This happens for series that were only partially RNA-Seq based (e.g. combined RNA-Seq plus miRNA-Seq samples in the same GEO accession ID).

Metadata columns are derived from GEO series files, and follow their definitions. See each GEO entry directly to determine metadata meaning.

Each recompute has at least the gene_id column holding Ensembl Gene IDs. The remaining columns are ENA run accession IDs of the specific recomputed samples. Each associated metadata file has at least the following columns:

geo_accession: The GEO sample ID of the sample.

ena_sample: The ENA sample ID of the sample.

ena_run: The ENA run accession ID of the sample, to be cross-referenced with the expression matrices.

The remaining columns are derived from GEO metadata files and other ENA-provided data. Please refer to the x.FASTQ package for more information.
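The cross-referencing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two CSV snippets are made-up miniature stand-ins for a *_expected_count.csv.gz file and its *_metadata.csv.gz companion, and the helper names sample_columns and metadata_for_matrix are hypothetical, not part of x.FASTQ.

```python
import csv
import io

# Hypothetical miniature stand-ins for an expression matrix and its metadata
# file; every accession ID and count value below is made up for illustration.
COUNTS_CSV = """gene_id,ERR0000001,ERR0000002
ENSG00000000003,10.00,12.50
ENSG00000000005,0.00,3.00
"""

METADATA_CSV = """geo_accession,ena_sample,ena_run
GSM0000001,SAMEA0000001,ERR0000001
GSM0000002,SAMEA0000002,ERR0000002
GSM0000003,SAMEA0000003,ERR0000003
"""


def sample_columns(counts_text):
    """Return the sample (ENA run) column names of an expression matrix."""
    header = next(csv.reader(io.StringIO(counts_text)))
    return [name for name in header if name != "gene_id"]


def metadata_for_matrix(counts_text, metadata_text):
    """Return the metadata rows matching the matrix columns, in column order."""
    reader = csv.DictReader(io.StringIO(metadata_text))
    by_run = {row["ena_run"]: row for row in reader}
    # Metadata rows without a matching matrix column (for example a
    # non-RNA-Seq sample of the same series) are simply not selected.
    return [by_run[run] for run in sample_columns(counts_text)]


rows = metadata_for_matrix(COUNTS_CSV, METADATA_CSV)
print([row["geo_accession"] for row in rows])
```

With the real files, the same functions could be fed text read through gzip.open(path, "rt", encoding="utf-8", newline=""). The third metadata row in the sketch plays the role of a sample that has no column in the expression matrix.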

Pipeline Details

The alignment and quantification were performed with the x.FASTQ tool (available on GitHub) at commit 3a93dd77a70df59c74f7b15216c26f12cd918e81, installed locally on an Arch Linux machine running the Linux 6.7.8-zen1-1-zen kernel with an 11th Gen Intel i7-1185G7 (8) CPU and an Intel TigerLake-LP GT2 [Iris Xe Graphics] GPU. Please note that no samples were filtered or omitted based on sample quality or sequencing depth. However, sensible trimming (e.g. of low-quality bases and common adapters) was performed on all samples.

The reference genome was downloaded from Ensembl, version hg38. STAR was used to create the genome index with the overhang set to 149.
Field-wide assessment of differential HT-seq from NCBI GEO database
data.niaid.nih.gov
zenodo.org
Updated Jan 13, 2023
Tenson, Tanel (2023). Field-wide assessment of differential HT-seq from NCBI GEO database [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3747112
Dataset updated
Jan 13, 2023
Dataset provided by
Tenson, Tanel
Luidalepp, Hannes
Päll, Taavi
Maiväli, Ülo
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We analysed the field of expression profiling by high throughput sequencing, or HT-seq, in terms of replicability and reproducibility, using data from the NCBI GEO (Gene Expression Omnibus) repository.

This release includes GEO series published up to Dec-31, 2020.

The geo-htseq.tar.gz archive contains the following files:

output/parsed_suppfiles.csv, p-value histograms, histogram classes, estimated number of true null hypotheses (pi0).

output/document_summaries.csv, document summaries of NCBI GEO series.

output/suppfilenames.txt, list of all supplementary file names of NCBI GEO submissions.

output/suppfilenames_filtered.txt, list of supplementary file names used for downloading files from NCBI GEO.

output/publications.csv, publication info of NCBI GEO series.

output/scopus_citedbycount.csv, Scopus citation info of NCBI GEO series

output/spots.csv, NCBI SRA sequencing run metadata.

output/cancer.csv, cancer related experiment accessions.

output/transcription_factor.csv, TF related experiment accessions.

output/single-cell.csv, single cell experiment accessions.

blacklist.txt, list of supplementary files that were either too large to import or caused the computing environment to crash during import.

The workflow to produce this dataset is available on GitHub at rstats-tartu/geo-htseq.

The geo-htseq-updates.tar.gz archive contains the following files:

results/detools_from_pmc.csv, differential expression analysis programs inferred from published articles

results/n_data.csv, manually curated sample size info for NCBI GEO HT-seq series

results/simres_df_parsed.csv, pi0 values estimated from differential expression results obtained from simulated RNA-seq data

results/data/parsed_suppfiles_rerun.csv, pi0 values estimated using smoother method from anti-conservative p-value sets
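
Several of the files above report pi0, the estimated proportion of true null hypotheses in a set of p-values. As a rough illustration of what that quantity measures, the sketch below implements the simple fixed-threshold estimator usually attributed to Storey. It is an assumption-laden example, not the method used in the geo-htseq workflow (the listing mentions a smoother-based variant), and the function name and the default threshold of 0.5 are chosen here for illustration only.

```python
def storey_pi0(p_values, lam=0.5):
    """Estimate the proportion of true nulls from p-values above a threshold.

    Under the null hypothesis p-values are uniform on [0, 1], so the share of
    p-values above `lam`, rescaled by (1 - lam), approximates pi0.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must be in [0, 1)")
    total = len(p_values)
    if total == 0:
        raise ValueError("at least one p-value is required")
    above = sum(1 for p in p_values if p > lam)
    # Cap at 1.0 because a proportion cannot exceed one.
    return min(1.0, above / (total * (1.0 - lam)))


# Made-up example: five strong signals mixed with five larger p-values.
mixed = [0.001] * 5 + [0.1, 0.3, 0.6, 0.7, 0.9]
print(storey_pi0(mixed))
```

In this made-up example three of ten p-values exceed 0.5, so the estimate is 3 / (10 × 0.5) = 0.6.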
Expression Data recompute of selected GEO-deposited RNA-Seq data of HMEC-1...
zenodo.org
application/gzip
Updated Feb 3, 2025
Luca Visentin (2025). Expression Data recompute of selected GEO-deposited RNA-Seq data of HMEC-1 cell lines [Dataset]. http://doi.org/10.5281/zenodo.14793942
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.14793942
Dataset updated
Feb 3, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Luca Visentin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We aligned and quantified RNA-Seq data present in GEO regarding HMEC-1 cell lines with a standardized pipeline to homogenize data preprocessing for downstream applications.

All uploaded files are UTF-8, .csv-formatted matrices. The *_expected_count.csv.gz files are unlogged, raw expression counts as reported by rsem-quantify-expression with the 'expected counts' feature. The associated *_metadata.csv.gz files contain metadata pertinent to each column of the corresponding expression matrix.
Some metadata files may have more rows than the expression matrix has sample columns. This happens for series that were only partially RNA-Seq based (e.g. combined RNA-Seq plus miRNA-Seq samples in the same GEO accession ID).

Metadata columns are derived from GEO series files, and follow their definitions. See each GEO entry directly to determine metadata meaning.

Each recompute has at least the gene_id column holding Ensembl Gene IDs. The remaining columns are ENA run accession IDs of the specific recomputed samples.
Each associated metadata file has at least the following columns:

geo_sample: The GEO sample ID of the sample.

geo_series: The GEO series ID of the sample.

ena_sample: The ENA sample ID of the sample.

ena_run: The ENA run accession ID of the sample, to be cross-referenced with the expression matrices.

The remaining columns are derived from GEO metadata files and other ENA-provided data. Please refer to the x.FASTQ package for more information (https://github.com/TCP-Lab/x.FASTQ).

The reference genome was downloaded from Ensembl, version hg38. STAR was used to create the genome index with the overhang set to 149.

The different datasets were generated over a long period of time through a variety of different versions of x.FASTQ. However, the versions of the software that acted on the files themselves (e.g. STAR, RSEM, etc.) were unchanged, and are reported below:
Geo-referencing of journal articles and platform design for spatial query...
data.niaid.nih.gov
explore.openaire.eu
Updated Jan 24, 2020
Kmoch, Alexander (2020). Geo-referencing of journal articles and platform design for spatial query capabilities [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_1153886
Dataset updated
Jan 24, 2020
Dataset provided by
Kmoch, Alexander
Uuemaa, Evelyn
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We analyzed the corpus of three geoscientific journals to investigate if there are enough locational references in research articles to apply a geographical search method, on the example of New Zealand. We counted place name occurrences that match records from the official Land Information New Zealand (LINZ) gazetteer in the titles, abstracts and full texts of freely available papers of the New Zealand Journal of Geology and Geophysics, the New Zealand Journal of Marine and Freshwater Research, and the Journal of Hydrology, New Zealand, for the years 1958 to 2015. We generated ISO standard compliant metadata records for each article including the spatial references and make them available in a public catalogue service.

articles_georef_count_data.xlsx: The counts and evaluation tracking of the place name occurrences in the journal articles.

summary_final.xlsx: Summary statistics for evaluation based on the counts data.

article_template.xml: XML template for ISO 19139 compliant metadata record filled for each article.

full_article.xml: Exemplary fully filled ISO 19139 compliant metadata record.
System for Earth Sample Registration
uri.interlex.org
neuinfo.org
Updated Dec 4, 2023
(2023). System for Earth Sample Registration [Dataset]. http://identifiers.org/RRID:SCR_002222
Unique identifier
https://identifiers.org/RRID:SCR_002222
Dataset updated
Dec 4, 2023
Description
Sample Catalog and Registry for the International Geo Sample Number. SESAR catalogs and preserves sample metadata profiles, and provides access to the sample catalog via the Global Sample Search.
Nordic Coastal Standardised metadata template 2023
geo.abds.is
catalogue.arctic-sdi.org
Updated Nov 28, 2023
Conservation of Arctic Flora and Fauna (CAFF) (2023). Nordic Coastal Standardised metadata template 2023 [Dataset]. https://geo.abds.is/geonetwork/srv/api/records/90080b13-5ffa-4ace-98e6-aad739eda345
Available download formats: www:download-1.0-http--download
Dataset updated
Nov 28, 2023
Dataset authored and provided by
Conservation of Arctic Flora and Fauna (CAFF)
Description
Standardized metadata template for identifying knowledge locations on Arctic Coastal Ecosystems, applicable to different knowledge systems. This template was developed by the Nordic Coastal Group, composed of the Nordic representatives on CBMP Coastal. The template is intended to identify locations for Indigenous Knowledge, Scientific, Hunters Knowledge, Local Knowledge, and community-based monitoring. The template is composed of two files: a Word document that provides the rationale and detailed description for the Excel sheet, and the Excel sheet itself, which allows for standardized data gathering.
WSDOT - GIS Line Feature Class Template
geo.wa.gov
Updated Jan 16, 2020
WSDOT Online Map Center (2020). WSDOT - GIS Line Feature Class Template [Dataset]. https://geo.wa.gov/datasets/31dda2646dbe450897ca6323f31f440f
Dataset updated
Jan 16, 2020
Dataset provided by
Washington State Department of Transportation
Authors
WSDOT Online Map Center
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
This Esri feature class and associated metadata are a template. The attribute schema is pre-defined to help users create data that is more consistent or compliant with agency standards. Metadata has been created using the FGDC metadata style but stored in the ArcGIS format. Content presentation will change upon export to FGDC format.
Data from: Metadata record for the manuscript: FOXA1 and adaptive response...
springernature.figshare.com
xlsx
Updated Feb 14, 2024
Steven P. Angus; Timothy J. Stuhlmiller; Gaurav Mehta; Samantha M. Bevill; Daniel R. Goulet; J. Felix Olivares-Quintero; Michael P. East; Maki Tanioka; Jon S. Zawistowski; Darshan Singh; Noah Sciaky; Xin Chen; Xiaping He; Naim U. Rashid; Lynn Chollet-Hinton; Cheng Fan; Matthew G. Soloway; Patricia A. Spears; Stuart Jefferys; Joel S. Parker; Kristalyn K. Gallagher; Andres Forero-Torres; Ian E. Krop; Alastair M. Thompson; Rashmi Murthy; Michael L. Gatza; Charles M. Perou; H. Shelton Earp; Lisa A. Carey; Gary L. Johnson (2024). Metadata record for the manuscript: FOXA1 and adaptive response determinants to HER2 targeted therapy in TBCRC 036 [Dataset]. http://doi.org/10.6084/m9.figshare.14376746.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.14376746.v1
Dataset updated
Feb 14, 2024
Dataset provided by
figshare
Authors
Steven P. Angus; Timothy J. Stuhlmiller; Gaurav Mehta; Samantha M. Bevill; Daniel R. Goulet; J. Felix Olivares-Quintero; Michael P. East; Maki Tanioka; Jon S. Zawistowski; Darshan Singh; Noah Sciaky; Xin Chen; Xiaping He; Naim U. Rashid; Lynn Chollet-Hinton; Cheng Fan; Matthew G. Soloway; Patricia A. Spears; Stuart Jefferys; Joel S. Parker; Kristalyn K. Gallagher; Andres Forero-Torres; Ian E. Krop; Alastair M. Thompson; Rashmi Murthy; Michael L. Gatza; Charles M. Perou; H. Shelton Earp; Lisa A. Carey; Gary L. Johnson
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Summary

This metadata record provides details of the data supporting the claims of the related manuscript: “FOXA1 and adaptive response determinants to HER2 targeted therapy in TBCRC 036”.

The related study aimed to determine the global alterations in gene enhancers and transcriptional changes to identify factors involved in the adaptive response to HER2 inhibition. In parallel, it analysed the in vivo human adaptive molecular responses to HER2 targeting in a window-of-opportunity clinical trial using both RNAseq and a chemical proteomics method (MIB/MS) to assess the functional kinome.

Type of data: mass spectrometry proteomics data; normalised patient RNA sequencing data; cell line RNA sequencing data; cell line ChIPseq data

Subject of data: Homo sapiens; Eukaryotic cell lines

Recruitment: Eligible women included those with newly diagnosed Stage I-IV HER2+ breast cancer scheduled to undergo definitive surgery (either lumpectomy or mastectomy). Stage I-IIIc patients could not be candidates for a therapeutic neoadjuvant treatment. Study subjects provided informed written consent that included details of the nontherapeutic nature of the trial.

Trial registration number: https://clinicaltrials.gov/ct2/show/NCT01875666

Data access

The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the data set identifier https://identifiers.org/pride.project:PXD021865.

Normalized patient RNAseq data (https://identifiers.org/geo:GSE161743), cell line RNAseq (https://identifiers.org/geo:GSE160001 and https://identifiers.org/geo:GSE160001), and cell line ChIPseq (https://identifiers.org/geo:GSE160667) are all part of the SuperSeries https://identifiers.org/geo:GSE160670 available through the Gene Expression Omnibus.

Processed and normalized data are provided as supplemental materials associated with the article on the journal website, and also attached to this data record in the Excel spreadsheets called Supplementary Data 1-10 and the PDF called Supplementary material file.PDF. Accompanying Supplementary Information and Supplementary Data files contain relevant data used to produce the included figures and are available with this article. A detailed list of which data files underlie which figures and tables in the related article is included in the file ‘Angus_et_al_2021_underlying_data_files_list.xlsx’, which is shared with this data record.

The data supporting Figure 3c is in the GraphPad Prism file called ‘siGrowth’, which is not shared publicly as it is in a non-open format, but it can be made available upon reasonable request to the corresponding author.

Corresponding author(s) for this study

Gary L. Johnson, PhD, Department of Pharmacology, 4079 Genetic Medicine Building, University of North Carolina School of Medicine, Chapel Hill, NC 27599. Email: glj@med.unc.edu. Phone: 919-843-3106.

Study approval

Approved by the UNC Office of Human Research Ethics and conducted in accordance with the Declaration of Helsinki. IRB# 13-1826
US Restaurant POI dataset with metadata
datarade.ai
.csv
Updated Jul 30, 2022
Geolytica (2022). US Restaurant POI dataset with metadata [Dataset]. https://datarade.ai/data-products/us-restaurant-poi-dataset-with-metadata-geolytica
Available download formats: .csv
Dataset updated
Jul 30, 2022
Dataset authored and provided by
Geolytica
Area covered
United States of America
Description
Point of Interest (POI) is defined as an entity (such as a business) at a ground location (point) which may be (of interest). We provide high-quality POI data that is fresh, consistent, customizable, easy to use and with high-density coverage for all countries of the world.

This is our process flow:

Our machine learning systems continuously crawl for new POI data

Our geoparsing and geocoding calculate their geo locations

Our categorization systems clean up and standardize the datasets

Our data pipeline API publishes the datasets on our data store

A new POI comes into existence. It could be a bar, a stadium, a museum, a restaurant, a cinema, a store, etc. In today's interconnected world, its information will appear very quickly in social media, pictures, websites, and press releases. Soon after that, our systems will pick it up.

POI Data is in constant flux. Every minute worldwide over 200 businesses will move, over 600 new businesses will open their doors and over 400 businesses will cease to exist. And over 94% of all businesses have a public online presence of some kind tracking such changes. When a business changes, their website and social media presence will change too. We'll then extract and merge the new information, thus creating the most accurate and up-to-date business information dataset across the globe.

We offer our customers perpetual data licenses for any dataset representing this ever changing information, downloaded at any given point in time. This makes our company's licensing model unique in the current Data as a Service - DaaS Industry. Our customers don't have to delete our data after the expiration of a certain "Term", regardless of whether the data was purchased as a one time snapshot, or via our data update pipeline.

Customers requiring regularly updated datasets may subscribe to our Annual subscription plans. Our data is continuously being refreshed, therefore subscription plans are recommended for those who need the most up to date data. The main differentiators between us vs the competition are our flexible licensing terms and our data freshness.

Data samples may be downloaded at https://store.poidata.xyz/us
WSDOT - GIS Point Feature Class Template
geo.wa.gov
gisdata-wsdot.opendata.arcgis.com
Updated Jan 16, 2020
WSDOT Online Map Center (2020). WSDOT - GIS Point Feature Class Template [Dataset]. https://geo.wa.gov/items/875e78af7574474aa5463be15bcbeb4d
Dataset updated
Jan 16, 2020
Dataset provided by
Washington State Department of Transportation
Authors
WSDOT Online Map Center
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
WSDOT template for Esri file geodatabase point feature class. Template has pre-defined attribute schema to help users create data that is more consistent or compliant with agency standards. Metadata has been created using the FGDC metadata style but stored in the ArcGIS format. Content presentation will change upon export to FGDC format. This service is maintained by the WSDOT Transportation Data, GIS & Modeling Office. If you are having trouble viewing the service, please contact Online Map Support at onlinemapsupport@wsdot.wa.gov.
Sea-Surface Temperature, NOAA Geo-polar Blended Analysis Diurnal Correction...
coastwatch.noaa.gov
Updated Sep 1, 2020
Office of Satellite Products and Operations (2020). Sea-Surface Temperature, NOAA Geo-polar Blended Analysis Diurnal Correction (Day+Night), GHRSST, Near Real-Time, Global 5km, 2019-Present, Daily [Dataset]. https://coastwatch.noaa.gov/erddap/info/noaacwBLENDEDsstDLDaily/index.html
Dataset updated
Sep 1, 2020
Dataset provided by
The GHRSST Project Office
Authors
Office of Satellite Products and Operations
Time period covered
Jul 22, 2019 - Jul 31, 2025
Variables measured
mask, time, latitude, longitude, analysed_sst, analysis_error, sea_ice_fraction
Description
Analysed blended sea surface temperature over the global ocean using day and night input data. An SST estimation scheme which combines multi-satellite retrievals of sea surface temperature datasets available from polar orbiters, geostationary InfraRed (IR) and microwave sensors into a single global analysis. This global SST analysis provides a daily gap-free map of the foundation sea surface temperature at 0.05° spatial resolution.

acknowledgement=NOAA/NESDIS
cdm_data_type=Grid
comment=The Geo-Polar Blended Sea Surface Temperature (SST) Analysis combines multi-satellite retrievals of sea surface temperature into a single analysis of SST
Conventions=CF-1.6, Unidata Observation Dataset v1.0, COARDS, ACDD-1.3
Easternmost_Easting=179.975
gds_version_id=2.0
geospatial_lat_max=89.975
geospatial_lat_min=-89.975
geospatial_lat_resolution=0.049999999999999996
geospatial_lat_units=degrees_north
geospatial_lon_max=179.975
geospatial_lon_min=-179.975
geospatial_lon_resolution=0.049999999999999996
geospatial_lon_units=degrees_east
history=NESDIS geo-SST L1 to L2 processor, NESDIS Advanced Clear-Sky Processor for Oceans (ACSPO), NESDIS Geo-Polar 1/20th degree Blended SST Analysis
id=Geo_Polar_DABlended-OSPO-L4-GLOB-v1.0
infoUrl=https://podaac.jpl.nasa.gov/dataset/Geo_Polar_DABlended-OSPO-L4-GLOB-v1.0
institution=NOAA NESDIS CoastWatch
keywords_vocabulary=GCMD Science Keywords
metadata_link=https://podaac.jpl.nasa.gov/ws/metadata/dataset?format=iso&shortName=Geo_Polar_DABlended-OSPO-L4-GLOB-v1.0
naming_authority=gov.noaa.coastwatch
Northernmost_Northing=89.975
platform=goes-18, goes-19, himawari-9, metop-b, metop-c, meteosat-9, meteosat-10, NOAA-20, NOAA-21,
processing_level=L4
project=Group for High Resolution Sea Surface Temperature
references=Fieguth,P.W. et al. "Mapping Mediterranean altimeter data with a multiresolution optimal interpolation algorithm", J. Atmos. Ocean Tech, 15 (2): 535-546, 1998. Fieguth, P. Multiply-Rooted Multiscale Models for Large-Scale Estimation, IEEE Image Processing, 10(11), 1676-1686, 2001. Khellah, F., P.W. Fieguth, M.J. Murray and M.R. Allen, "Statistical Processing of Large Image Sequences", IEEE Transactions on Geoscience and Remote Sensing, 12 (1), 80-93, 2005.
sensor=abi, abi, ahi, avhrr, avhrr, severi, severi, viirs, viirs,
source=OSPO_acspoSST_GOES18_ABI, OSPO_acspoSST_GOES19_ABI, STAR_acspoSST_HIMAWARI-9_AHI, OSPO_acspoSST_METOPB_AVHRR, OSPO_acspoSST_METOPC_AVHRR, OSPO_geoSST_METEOSAT9_severi, OSPO_geoSST_METEOSAT10_severi, OSPO_acspoSST_NOAA20_VIIRS, OSPO_acspoSST_NOAA21_VIIRS,
sourceUrl=(local files)
Southernmost_Northing=-89.975
spatial_resolution=0.05 degree
standard_name_vocabulary=CF Standard Name Table v29
testOutOfDate=now-4days
time_coverage_end=2025-07-31T12:00:00Z
time_coverage_start=2019-07-22T12:00:00Z
Westernmost_Easting=-179.975
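
The coordinate bounds and resolution listed in this record imply the size of the global grid. The short sketch below works that arithmetic out, assuming (this is not stated in the record) that the minimum and maximum coordinates are cell centres and that the nominal spacing is exactly 0.05 degrees; the function name axis_cells is hypothetical.

```python
def axis_cells(minimum, maximum, resolution):
    """Count evenly spaced cell centres from minimum to maximum, inclusive."""
    # round() absorbs the small floating-point error of the division.
    return round((maximum - minimum) / resolution) + 1


# Bounds taken from the geospatial_* attributes of the record.
lat_cells = axis_cells(-89.975, 89.975, 0.05)
lon_cells = axis_cells(-179.975, 179.975, 0.05)
print(lat_cells, lon_cells, lat_cells * lon_cells)
```

Under these assumptions each daily field holds 3600 latitude rows by 7200 longitude columns, which is useful to know before requesting a full-resolution subset.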
Updated GIS Brand and Metadata Standards
city-of-vancouver-wa-geo-hub-cityofvancouver.hub.arcgis.com
Updated Feb 28, 2024
Vancouver Online Maps (2024). Updated GIS Brand and Metadata Standards [Dataset]. https://city-of-vancouver-wa-geo-hub-cityofvancouver.hub.arcgis.com/content/5dfe17cff7cc4e3e88cff47fe61d725f
Dataset updated
Feb 28, 2024
Dataset authored and provided by
Vancouver Online Maps
Description
This project contains:

One map containing the City of Vancouver City Limits layer for use in populating sample templates

600x400 pixel formatted templates for generating thumbnails for Applications, Web Maps/Maps/Packages, Layers, and items to be deprecated

Templates for print layouts at 8.5x11", landscape and portrait sizes

Style file containing correct fonts and colors in alignment with City of Vancouver branding

Example text should be replaced as necessary, and it is recommended you save any applicable changes as a copy of this package for your use.
Data from: A risk-reward examination of sample multiplexing reagents for...
zenodo.org
bin
Updated Jun 24, 2023
Daniel Brown; Rory Bowden (2023). A risk-reward examination of sample multiplexing reagents for Single Cell RNA-Seq [Dataset]. http://doi.org/10.5281/zenodo.8031079
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.8031079
Dataset updated
Jun 24, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Daniel Brown; Rory Bowden
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
For the preprint I originally planned to upload count matrices to GEO. Upon publication I would then upload fastqs and count matrices to a controlled server. I have to organise a data access committee for the human samples before releasing the fastqs.

Upon looking at the GEO metadata template my eyes watered, and I decided to upload the processed data to Zenodo instead.

Be sure to check the md5 checksum: checklist.chk
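
A checksum check of this kind can be scripted. The sketch below is a minimal example that assumes checklist.chk follows the common md5sum layout (one line per file, hex digest followed by the file name); the record does not state the actual format, so treat that layout and the helper names md5_of and verify_checklist as assumptions. The demo at the end builds a throwaway folder rather than using the real download.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checklist(checklist_path):
    """Map each listed file name to True if its MD5 matches the checklist."""
    checklist = Path(checklist_path)
    results = {}
    for line in checklist.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Assumed layout: "<hex digest>  <file name>", as written by md5sum.
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")
        target = checklist.parent / name
        results[name] = target.is_file() and md5_of(target) == expected.lower()
    return results


# Demo on made-up files: one intact file and one with a wrong digest.
with tempfile.TemporaryDirectory() as folder:
    intact = Path(folder) / "counts.csv"
    intact.write_bytes(b"gene_id,sample\n")
    altered = Path(folder) / "other.csv"
    altered.write_bytes(b"changed after upload\n")
    lines = [
        f"{md5_of(intact)}  counts.csv",
        "00000000000000000000000000000000  other.csv",
    ]
    (Path(folder) / "checklist.chk").write_text("\n".join(lines) + "\n", encoding="utf-8")
    demo_results = verify_checklist(Path(folder) / "checklist.chk")

print(demo_results)
```

Any file reported as False should be downloaded again before it is used.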
National Agricultural Imagery Program (NAIP), Minnesota, 2006, Geo-rectified...
datadiscoverystudio.org
Updated Oct 2, 2015
(2015). National Agricultural Imagery Program (NAIP), Minnesota, 2006, Geo-rectified Images. [Dataset]. http://datadiscoverystudio.org/geoportal/rest/metadata/item/1693e1010f7f4b49a31be3e245c39634/html
Dataset updated
Oct 2, 2015
Description
This data set contains natural color imagery from the National Agricultural Imagery Program (NAIP). NAIP acquires digital ortho imagery during the agricultural growing seasons in the continental U.S. A primary goal of the NAIP program is to enable availability of ortho imagery within one year of acquisition. The source files are 2 meter ground sample distance (GSD) ortho imagery rectified to a horizontal accuracy of within 10 meters of reference digital ortho quarter quads (DOQQ's) from the National Digital Ortho Program (NDOP). The tiling format of NAIP imagery is based on a 3.75' x 3.75' quarter quadrangle with a 300 meter buffer on all four sides. NAIP quarter quads are formatted to the UTM coordinate system using NAD83. NAIP imagery may contain as much as 10% cloud cover per tile. This file was generated by compressing NAIP quarter quadrangle tiles that cover a county. MrSID compression, with mosaic option, was used. Target values for the compression ratio are (15:1) and MrSID Generation 3. (Note: MnGeo created this metadata record using information from Farm Service Agency metadata. Each county file is accompanied by the original FSA metadata for that county.)
Processed ship-based Navigation Data acquired during the Robert D. Conrad...
datadiscoverystudio.org
search.dataone.org
mgds:nav v.1
Updated Oct 7, 2009
(2009). Processed ship-based Navigation Data acquired during the Robert D. Conrad expedition RC1112 (1967) Marine Geoscience Digital Library internal dataset identifiers [Dataset]. http://doi.org/10.1594/IEDA/310686
Available download formats: mgds:nav v.1
Unique identifier
https://doi.org/10.1594/IEDA/310686
Dataset updated
Oct 7, 2009
Description
This data set was acquired with a ship-based Navigation system during Robert D. Conrad expedition RC1112 conducted in 1967 (Chief Scientist: Dr. George Bryan). These data files are of Text File (ASCII) format and include Navigation data and were processed after data collection.
Kongsberg EM710 Multibeam Sonar System Metadata and Documentation from the...
search.dataone.org
marine-geo.org
Updated Mar 4, 2019
IEDA: Marine-Geo Digital Library (2019). Kongsberg EM710 Multibeam Sonar System Metadata and Documentation from the Papahanaumokuakea Marine National Monument acquired during the Falkor expedition FK140307 (2014) [Dataset]. http://doi.org/10.1594/IEDA/321509
Unique identifier
https://doi.org/10.1594/IEDA/321509
Dataset updated
Mar 4, 2019
Dataset provided by
IEDA: Marine-Geo Digital Library
Time period covered
Mar 7, 2014 - Apr 11, 2014
Area covered
Papahanaumokuakea Marine National Monument
Description
This data set contains Kongsberg EM710 Multibeam Sonar system metadata and documentation from the Falkor expedition FK140307 conducted in 2014 (Chief Scientist: Dr. Christopher Kelley; Investigator(s): Dr. Christopher Kelley and Dr. John R. Smith). These data files are of PDF format. Data were acquired as part of the project(s): Multibeam Sonar Data from Cruise FK140307 Papahanaumokuakea Marine National Monument, Northwestern Hawaiian Islands, USA.