3 datasets found
  1. Field-wide assessment of differential HT-seq from NCBI GEO database

    • zenodo.org
    application/gzip
    Updated Jan 13, 2023
    Taavi Päll; Hannes Luidalepp; Tanel Tenson; Ülo Maiväli (2023). Field-wide assessment of differential HT-seq from NCBI GEO database [Dataset]. http://doi.org/10.5281/zenodo.5356064
    Available download formats
    application/gzip
    Dataset updated
    Jan 13, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Taavi Päll; Hannes Luidalepp; Tanel Tenson; Ülo Maiväli
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We analysed the field of expression profiling by high throughput sequencing, or HT-seq, in terms of replicability and reproducibility, using data from the NCBI GEO (Gene Expression Omnibus) repository.

    - This release includes GEO series up to Dec-31, 2020;

    - Fixed a missing optional dependency, xlrd, which affected the import of some xls files; previously we were using only openpyxl (thanks to an anonymous reviewer);

    - All files in supplementary _RAW.tar files were checked for p values; previously, _RAW.tar files were omitted entirely (thanks to an anonymous reviewer).
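
As a rough illustration of the _RAW.tar check described above, the following sketch walks the members of a tar archive and reports those whose header row contains a p-value-like column. It is not the authors' workflow code: the function name `members_with_p_values`, the `P_VALUE_HEADERS` set, and the tab delimiter are all assumptions made for this example, and compressed members inside the archive are not handled.

```python
import csv
import io
import tarfile

# Hypothetical header names treated as p-value columns; the matching rules
# of the real workflow are not described in this listing.
P_VALUE_HEADERS = {"pvalue", "p.value", "p_value", "pval", "p-value"}


def members_with_p_values(tar_bytes: bytes, delimiter: str = "\t") -> list[str]:
    """Return names of tar members whose header row has a p-value-like column."""
    hits = []
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, and other non-file entries
            handle = archive.extractfile(member)
            if handle is None:
                continue
            # Only the first line is inspected; it is assumed to be the header.
            first_line = handle.readline().decode("utf-8", errors="replace")
            header = next(csv.reader([first_line], delimiter=delimiter), [])
            if any(col.strip().lower() in P_VALUE_HEADERS for col in header):
                hits.append(member.name)
    return hits
```

The sketch reads only plain-text members; files stored compressed inside the archive would need to be decompressed before the header is inspected.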

    The archived dataset contains the following files:

    - output/parsed_suppfiles.csv, p-value histograms, histogram classes, estimated number of true null hypotheses (pi0).

    - output/document_summaries.csv, document summaries of NCBI GEO series

    - output/publications.csv, publication info of NCBI GEO series

    - output/scopus_citedbycount.csv, Scopus citation info of NCBI GEO series

    - output/single-cell.csv, single cell experiments

    - spots.csv, NCBI SRA sequencing run metadata

    - suppfilenames.txt, list of all supplementary file names of NCBI GEO submissions. One filename per row.

    - suppfilenames_filtered.txt, list of supplementary file names used for downloading files from NCBI GEO. One filename per row.

  2. Field-wide assessment of differential HT-seq from NCBI GEO database

    • zenodo.org
    application/gzip
    Updated Jan 13, 2023
    Taavi Päll; Hannes Luidalepp; Tanel Tenson; Ülo Maiväli (2023). Field-wide assessment of differential HT-seq from NCBI GEO database [Dataset]. http://doi.org/10.5281/zenodo.6769241
    Available download formats
    application/gzip
    Dataset updated
    Jan 13, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Taavi Päll; Hannes Luidalepp; Tanel Tenson; Ülo Maiväli
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We analysed the field of expression profiling by high throughput sequencing, or HT-seq, in terms of replicability and reproducibility, using data from the NCBI GEO (Gene Expression Omnibus) repository.

    - This release includes GEO series up to Dec-31, 2020;

    The archived dataset contains the following files:

    - output/parsed_suppfiles.csv, p-value histograms, histogram classes, estimated number of true null hypotheses (pi0).

    - output/document_summaries.csv, document summaries of NCBI GEO series

    - output/publications.csv, publication info of NCBI GEO series

    - output/scopus_citedbycount.csv, Scopus citation info of NCBI GEO series

    - output/single-cell.csv, single cell experiments

    - spots.csv, NCBI SRA sequencing run metadata

    - suppfilenames.txt, list of all supplementary file names of NCBI GEO submissions. One filename per row.

    - suppfilenames_filtered.txt, list of supplementary file names used for downloading files from NCBI GEO. One filename per row.

    The workflow to produce this dataset is available on GitHub at rstats-tartu/geo-htseq.

  3. Field-wide assessment of differential HT-seq from NCBI GEO database

    • data.niaid.nih.gov
    Updated Jan 13, 2023
    Luidalepp, Hannes (2023). Field-wide assessment of differential HT-seq from NCBI GEO database [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3747112
    Dataset updated
    Jan 13, 2023
    Dataset provided by
    Päll, Taavi
    Maiväli, Ülo
    Tenson, Tanel
    Luidalepp, Hannes
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We analysed the field of expression profiling by high throughput sequencing, or HT-seq, in terms of replicability and reproducibility, using data from the NCBI GEO (Gene Expression Omnibus) repository.

    • This release includes GEO series published up to Dec-31, 2020;

    The geo-htseq.tar.gz archive contains the following files:

    • output/parsed_suppfiles.csv, p-value histograms, histogram classes, estimated number of true null hypotheses (pi0).

    • output/document_summaries.csv, document summaries of NCBI GEO series.

    • output/suppfilenames.txt, list of all supplementary file names of NCBI GEO submissions.

    • output/suppfilenames_filtered.txt, list of supplementary file names used for downloading files from NCBI GEO.

    • output/publications.csv, publication info of NCBI GEO series.

    • output/scopus_citedbycount.csv, Scopus citation info of NCBI GEO series

    • output/spots.csv, NCBI SRA sequencing run metadata.

    • output/cancer.csv, cancer-related experiment accessions.

    • output/transcription_factor.csv, transcription factor (TF)-related experiment accessions.

    • output/single-cell.csv, single-cell experiment accessions.

    • blacklist.txt, list of supplementary files that were either too large to import or crashed the computing environment during import.

    The workflow to produce this dataset is available on GitHub at rstats-tartu/geo-htseq.

    The geo-htseq-updates.tar.gz archive contains the following files:

    • results/detools_from_pmc.csv, differential expression analysis programs inferred from published articles

    • results/n_data.csv, manually curated sample size info for NCBI GEO HT-seq series

    • results/simres_df_parsed.csv, pi0 values estimated from differential expression results obtained from simulated RNA-seq data

    • results/data/parsed_suppfiles_rerun.csv, pi0 values estimated using the smoother method from anti-conservative p-value sets
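
To make the pi0 quantity mentioned in these file descriptions more concrete, here is a minimal sketch of a Storey-type estimate of the proportion of true null hypotheses from a set of p-values. It uses a single fixed threshold `lam` rather than the smoother method named in the listing, so it should be read as a simplified stand-in and not as the authors' procedure; the function name `estimate_pi0` and the default `lam=0.5` are assumptions for this example.

```python
def estimate_pi0(p_values, lam=0.5):
    """Estimate pi0 as the share of p-values above lam, rescaled by 1 - lam.

    Under the null, p-values are uniform on [0, 1], so the fraction above
    lam is expected to be about pi0 * (1 - lam).
    """
    if not 0 <= lam < 1:
        raise ValueError("lam must be in [0, 1)")
    p = list(p_values)
    if not p:
        raise ValueError("need at least one p-value")
    above = sum(1 for value in p if value > lam)
    # Cap at 1.0 because a proportion cannot exceed one.
    return min(1.0, above / (len(p) * (1.0 - lam)))
```

For an evenly spread (uniform-looking) set of p-values the estimate is close to 1, while an anti-conservative set with a peak near zero gives a smaller value, which is the pattern the pi0 columns in these files summarise.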

