Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw data outputs 1-18
Raw data output 1. Differentially expressed genes in AML CSCs compared with GTCs as well as in TCGA AML cancer samples compared with normal ones. This data was generated based on the results of AML microarray and TCGA data analysis.
Raw data output 2. Commonly and uniquely differentially expressed genes in AML CSC/GTC microarray and TCGA bulk RNA-seq datasets. This data was generated based on the results of AML microarray and TCGA data analysis.
Raw data output 3. Common differentially expressed genes between training and test set samples of the microarray dataset. This data was generated based on the results of AML microarray data analysis.
Raw data output 4. Detailed information on the samples of the breast cancer microarray dataset (GSE52327) used in this study.
Raw data output 5. Differentially expressed genes in breast CSCs compared with GTCs as well as in TCGA BRCA cancer samples compared with normal ones.
Raw data output 6. Commonly and uniquely differentially expressed genes in breast cancer CSC/GTC microarray and TCGA BRCA bulk RNA-seq datasets. This data was generated based on the results of breast cancer microarray and TCGA BRCA data analysis. CSC and GTC are abbreviations of cancer stem cell and general tumor cell, respectively.
Raw data output 7. Differential and common co-expression and protein-protein interaction of genes between CSC and GTC samples. This data was generated based on the results of AML microarray and STRING database-based protein-protein interaction data analysis. CSC and GTC are abbreviations of cancer stem cell and general tumor cell, respectively.
Raw data output 8. Differentially expressed genes between AML dormant and active CSCs. This data was generated based on the results of AML scRNA-seq data analysis.
Raw data output 9. Uniquely expressed genes in dormant or active AML CSCs. This data was generated based on the results of AML scRNA-seq data analysis.
Raw data output 10. Intersections between the targeting transcription factors of AML key CSC genes and differentially expressed genes between AML CSCs vs GTCs and between dormant and active AML CSCs or the uniquely expressed genes in either class of CSCs.
Raw data output 11. Targeting desirableness score of AML key CSC genes and their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.
Raw data output 12. CSC-specific targeting desirableness score of AML key CSC genes and their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.
Raw data output 13. The protein-protein interactions between AML key CSC genes with themselves and their targeting transcription factors. This data was generated based on the results of AML microarray and STRING database-based protein-protein interaction data analysis.
Raw data output 14. The previously confirmed associations of genes having the highest targeting desirableness and CSC-specific targeting desirableness scores with AML or other cancers’ (stem) cells as well as hematopoietic stem cells. These data were generated based on PubMed database-based literature mining.
Raw data output 15. Drug score of available drugs and bioactive small molecules targeting AML key CSC genes and/or their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.
Raw data output 16. CSC-specific drug score of available drugs and bioactive small molecules targeting AML key CSC genes and/or their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.
Raw data output 17. Candidate drugs for experimental validation. These drugs were selected based on their respective (CSC-specific) drug scores. CSC is the abbreviation of cancer stem cell.
Raw data output 18. Detailed information on the samples of the AML microarray dataset GSE30375 used in this study.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
The dataset contains data collected as part of the Ancient Adhesives project under the European Union’s Horizon 2020 research and innovation programme Grant Agreement No. 678 804151 (Grant holder G.H.J.L.).
It is being made public to act as supplementary data for a publication and for other researchers to use this data in their own work.
The data in this dataset were collected at TUDelft, University of Cantabria, and Museum of Prehistory and Archaeology of Cantabria in 2023.
This dataset contains:
The acronym MOR stands for Morín Cave, a cave in Cantabria (Spain) where the objects were found.
The data included in this dataset has been organized per method. For each specimen, more than one point was measured as indicated in the file name. Only the measurements with interpretable results are made available.
The file name includes the unique ID of the object + the analytical technique + the number of the scan. For example: MOR11_ATR_loc1
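The naming scheme above can be split back into its parts programmatically. The following is a minimal sketch that assumes every file stem follows the single example given (underscore-separated fields and a `loc` prefix before the scan number); the `ScanName` type and `parse_scan_name` function are hypothetical names introduced here, not part of the dataset.

```python
import re
from typing import NamedTuple


class ScanName(NamedTuple):
    object_id: str   # e.g. "MOR11" (unique ID of the object)
    technique: str   # e.g. "ATR" (analytical technique)
    location: int    # e.g. 1 (number of the scan)


# Assumed pattern, inferred only from the example "MOR11_ATR_loc1".
_PATTERN = re.compile(
    r"^(?P<object_id>[A-Za-z]+\d+)_(?P<technique>[A-Za-z0-9-]+)_loc(?P<location>\d+)$"
)


def parse_scan_name(stem: str) -> ScanName:
    """Split a file stem (file name without extension) into its three fields."""
    match = _PATTERN.match(stem)
    if match is None:
        raise ValueError(f"unexpected file name: {stem!r}")
    return ScanName(match["object_id"], match["technique"], int(match["location"]))


print(parse_scan_name("MOR11_ATR_loc1"))
```

File names that deviate from the example raise a `ValueError` rather than being silently misread.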
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This imaging mass cytometry (IMC) dataset serves as an example to demonstrate raw data processing and downstream analysis tools. The data was generated as part of the Integrated iMMUnoprofiling of large adaptive CANcer patient cohorts (IMMUcan) project (immucan.eu) using the Hyperion imaging system (www.fluidigm.com/products-services/instruments/hyperion). To get an overview on the technology and available analysis strategies, please visit bodenmillergroup.github.io/IMCWorkflow. The individual data files are described below:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Sample 1 was used for Exploratory Factor Analysis, Sample 2 was used for Confirmatory Factor Analysis.
MIT License https://opensource.org/licenses/MIT
This document mainly includes coding sequences of the PME-domain and pro-region of Type-1 PME in representative plants, raw data from fusion gene analysis by LIR inference, raw data from repeated sequence studies within four Cruciferae representative species, and the Graphical Abstract.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Raw data from HDMSe and SWATH MS analyses of 309 prostate cancer serum samples. Prostate cancer cohort:
309 patients were divided into control (n=112), prostate cancer (PCa) (n=175), and benign prostate hyperplasia (BPH) (n=22). PCa patients were then subdivided into active surveillance (AS) (n=51) or treatment group. Treatments were radiotherapy (pre: n=26, post: n=14), hormone therapy (pre: n=7, post: n=8), prostatectomy (pre: n=21, post: n=8), and radiotherapy (pre: n=23, post: n=17)
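As a quick consistency check, the group sizes quoted above can be summed. This is only an arithmetic sketch using the numbers as printed; the description lists "radiotherapy" twice with different counts, so the second entry is labelled as such here rather than renamed.

```python
# Top-level groups as stated in the description.
groups = {"control": 112, "PCa": 175, "BPH": 22}

# PCa subgroups: active surveillance plus (pre, post) counts per treatment.
active_surveillance = 51
treatments = [
    ("radiotherapy", 26, 14),
    ("hormone therapy", 7, 8),
    ("prostatectomy", 21, 8),
    ("radiotherapy (second listing in the description)", 23, 17),
]

total_patients = sum(groups.values())
treated = sum(pre + post for _, pre, post in treatments)

print(total_patients)                  # all patients
print(active_surveillance + treated)   # PCa patients accounted for by subgroup
```

With these numbers the three groups add up to the 309 patients, and active surveillance plus the four treatment entries add up to the 175 PCa patients.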
XRD raw data collected. This dataset is associated with the following publication: Nadagouda, M., C. Han, D. Dionysiou, and L. Wang. An innovative zinc oxide-coated zeolite adsorbent for removal of humic acid. JOURNAL OF HAZARDOUS MATERIALS. Elsevier Science Ltd, New York, NY, USA, 313: 283-290, (2016).
These data (Illumina paired-end FASTQ) exemplify the different WGS types which have been isolated from UK
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
Sample data for exercises in Further Adventures in Data Cleaning.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Raw data on sample languages
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
Raw data and descriptive statistics of the market survey, performed with the Add-In XLSTAT 2009.1.02, are provided as an Excel file (CSV). The data include file name, sample name, area, calculated N2O amounts, test result and statistical values.
https://ega-archive.org/dacs/EGAC00001002814
Dataset comprising raw paired RNA-seq data in fastq.gz format for 7 samples of rosette forming brain tumors
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Over the course of 24 hours, we collected raw (Photoplethysmography (PPG), Acceleration, and Gyro) and processed (steps, calories, sleep, HR, HRV, SPO2, Respiratory Rate, R-R) data samples. Biostrap approaches health insights from a data-driven perspective. Our clinical-grade hardware enables users to accurately track SpO2, HRV, RHR, and a variety of other biometrics with confidence.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Geological Survey Ireland has a core scanning suite consisting of a Short-Wave Infra-red (SWIR) camera and a Medium-Wave Infra-red (MWIR) camera. We have over 400 km of drill core in our core store and are in the process of scanning all of it. We currently have ~7Tb of data. This data is freely available, but due to the size of the files please email gsi.corestore[AT]gsi.ie so we can facilitate delivery.
This is a sample dataset consisting of 1 box of core: a single core-box scanned in the Short Wave Infra-red range for use with explanatory notebooks available on our GitHub repository. This data consists of box 25 of drillhole GSI-17-007, 105.98m to 110.35m. This box contains the contact between the Ballymore Formation and the Oakport Formation.
We are open to collaboration using either the scanner or the data with any of our stakeholders. For questions, issues, suggestions for improvement or to discuss collaboration, please contact Russell Rogers, c/o duty.geologist[AT]gsi.ie.
We also have a GitHub repository that hosts notebooks using the sample dataset, explaining some of the methods we have used in Python to pre-process and process our image data:
1. Opening and Starting with Geological Survey Ireland Hyperspectral Data
2. Denoising Geological Survey Ireland Hyperspectral Data
3. Removing the core box from the image
4. Removing the continuum
5. Clustering
The notebook uses the MiniSom module, because it is a very lightweight implementation with minimal dependencies, but there are many other SOM implementations available in Python.
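To give a feel for one of the steps listed above, the sketch below illustrates continuum removal on a single spectrum using only the Python standard library. It is an illustration under stated assumptions, not the code from the Geological Survey Ireland notebooks: it assumes the common convex-hull definition of the continuum (the upper hull of the spectrum, by which each reflectance value is divided), strictly increasing wavelengths and positive reflectance. The function names and the five-band spectrum are made up for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def upper_hull(points: Sequence[Point]) -> List[Point]:
    """Upper convex hull of points sorted by increasing x (monotone chain)."""
    hull: List[Point] = []
    for p in points:
        # Drop the previous vertex while it lies on or below the line to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def remove_continuum(wavelengths: Sequence[float],
                     reflectance: Sequence[float]) -> List[float]:
    """Divide each reflectance value by the hull (continuum) value at its wavelength."""
    points = list(zip(wavelengths, reflectance))
    hull = upper_hull(points)
    removed: List[float] = []
    seg = 0  # index of the hull segment that currently brackets x
    for x, y in points:
        while hull[seg + 1][0] < x:
            seg += 1
        (x0, y0), (x1, y1) = hull[seg], hull[seg + 1]
        continuum = y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        removed.append(y / continuum)
    return removed


# Hypothetical five-band spectrum (wavelength in nm, reflectance as a fraction).
wl = [1000.0, 1100.0, 1200.0, 1300.0, 1400.0]
refl = [0.5, 0.4, 0.6, 0.3, 0.5]
cr = remove_continuum(wl, refl)
print([round(v, 3) for v in cr])
```

Points that lie on the hull come out at 1.0 and absorption features drop below 1.0, which makes feature depths comparable between spectra with different overall brightness.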
This data set includes the raw data and the posterior samples from the Bayesian models referring to the article, Seeing Both the Forest and the Trees Distinct Resolution of Hierarchical Representations in Visual Working Memory
Overview
In support of the Wind Forecasting Improvement Project, Pacific Northwest National Laboratory (PNNL) deployed surface meteorological stations in Oregon.
Data Details
A PNNL computer is used as the base station to download the meteorological data acquired by the data logger at each site via a cellular modem. The data collected will be made available to the National Oceanic and Atmospheric Administration each hour and used to support the short-term forecasting project by providing an independent evaluation of the added value of new data to meteorological forecasts. Each meteorological station consists of a solar-powered data acquisition system and wind speed, wind direction, temperature, humidity, barometric pressure, and solar radiation sensors on a 3-m tower. Specifically, the stations comprise the following instruments and equipment:
Campbell Scientific CM6 Tripod
Campbell Scientific CR10X Measurement and Control System
R.M. Young 05106 Wind Monitor
Vaisala HMP45C Temperature and Humidity Probe
Vaisala PTB101B Barometric Pressure Sensor
Li-Cor LI200X Pyranometer
RavenXT Cellular Modem
The data logger is used to sample, at 1-second intervals, the horizontal wind speed and direction at 3 meters above ground level (AGL); the air temperature, relative humidity, barometric pressure, and solar radiation at 2 meters AGL; and the logger temperature and power supply. The logger outputs the 1-minute averages of these measurements to final storage and powers on the cellular modem, so the data can be retrieved and downloaded to a base station computer. The data are archived as 1-hour comma-delimited ASCII files (see "Table 2. Format of the WFIP2 Comma-delimited ASCII Data Files" in wfip2-met-data.pdf). All dates and times in the file names and data records are in UTC and denote the end of the 1-minute average.
Data Quality
Data for each primary measurement at every site are automatically plotted daily and reviewed about every three days.
Instrument outages or events are reported with the Instrument and Model Data Problem Log at: .
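The relationship between the 1-second samples and the archived 1-minute records can be sketched as follows. This is an illustration only, not the logger program: it assumes each averaging interval is half-open at the start, so a sample taken exactly on the minute closes the interval it labels, and it uses a plain arithmetic mean, which suits quantities such as temperature or pressure (a circular quantity such as wind direction would need a vector average instead). The function names and the 2016-01-01 timestamps are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Tuple


def interval_end(ts: datetime) -> datetime:
    """Return the end of the 1-minute interval that a sample falls into."""
    floor = ts.replace(second=0, microsecond=0)
    # Assumption: a sample exactly on the minute belongs to the interval ending there.
    return floor if ts == floor else floor + timedelta(minutes=1)


def one_minute_averages(
    samples: Iterable[Tuple[datetime, float]]
) -> List[Tuple[datetime, float]]:
    """Average (UTC timestamp, value) samples per minute, labelled by interval end."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[interval_end(ts)].append(value)
    return [(end, sum(vals) / len(vals)) for end, vals in sorted(buckets.items())]


# Sixty hypothetical 1-second samples covering 00:00:01 through 00:01:00 UTC.
start = datetime(2016, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
samples = [(start + timedelta(seconds=i), float(i)) for i in range(1, 61)]
averages = one_minute_averages(samples)
print(averages)
```

All sixty samples land in a single record stamped 00:01:00 UTC, matching the convention that the timestamp denotes the end of the 1-minute average.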
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This is an example of data from one clutch over 4 days (7-10 dpf).
If you need the other datasets, please contact the authors.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Transparency in data visualization is an essential ingredient for scientific communication. The traditional approach of visualizing continuous quantitative data solely in the form of summary statistics (i.e., measures of central tendency and dispersion) has repeatedly been criticized for not revealing the underlying raw data distribution. Remarkably, however, systematic and easy-to-use solutions for raw data visualization using the most commonly reported statistical software package for data analysis, IBM SPSS Statistics, are missing. Here, a comprehensive collection of more than 100 SPSS syntax files and an SPSS dataset template is presented and made freely available that allow the creation of transparent graphs for one-sample designs, for one- and two-factorial between-subject designs, for selected one- and two-factorial within-subject designs as well as for selected two-factorial mixed designs and, with some creativity, even beyond (e.g., three-factorial mixed-designs). Depending on graph type (e.g., pure dot plot, box plot, and line plot), raw data can be displayed along with standard measures of central tendency (arithmetic mean and median) and dispersion (95% CI and SD). The free-to-use syntax can also be modified to match with individual needs. A variety of example applications of syntax are illustrated in a tutorial-like fashion along with fictitious datasets accompanying this contribution. The syntax collection is hoped to provide researchers, students, teachers, and others working with SPSS a valuable tool to move towards more transparency in data visualization.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This entry contains raw data files from experiments performed on the Vulcan beamline at the Spallation Neutron Source at Oak Ridge National Laboratory using a pressure cell. Cylindrical granite and marble samples were subjected to confining pressures of either 0 psi or approximately 2500 psi and internal pressures of either 0 psi, 1500 psi or 2500 psi through a blind axial hole at the center of one end of the sample. The sample diameters were 1.5" and the sample lengths were 6". The blind hole was 0.25" in diameter and 3" deep. One set of experiments measured strains at points located circumferentially around the center of the sample with identical radii to determine if there was strain variability (this would not be expected for a homogeneous material based on the symmetry of loading). Another set of experiments measured load variation across the radius of the sample at a fixed axial and circumferential location. Raw neutron diffraction intensity files and experimental parameter descriptions are included.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
General information
The script runs with R (Version 3.1.1; 2014-07-10) and packages plyr (Version 1.8.1), XLConnect (Version 0.2-9), utilsMPIO (Version 0.0.25), sp (Version 1.0-15), rgdal (Version 0.8-16), tools (Version 3.1.1) and lattice (Version 0.20-29).
Questions can be directed to: Martin Bulla (bulla.mar@gmail.com)
Data collection and how the individual variables were derived is described in:
Steiger, S.S., et al., When the sun never sets: diverse activity rhythms under continuous daylight in free-living arctic-breeding birds. Proceedings of the Royal Society B: Biological Sciences, 2013. 280(1764): p. 20131016-20131016.
Dale, J., et al., The effects of life history and sexual selection on male and female plumage colouration. Nature, 2015.
Data are available as Rdata file. Missing values are NA.
For better readability the subsections of the script can be collapsed.

Description of the method
1 - Data are visualized in an interactive actogram with time of day on the x-axis and one panel for each day of data.
2 - A red rectangle indicates the active field. Clicking with the mouse in that field on the depicted light signal generates a data point that is automatically (via a custom-made function) saved in the csv file. For this data extraction I recommend always clicking on the bottom line of the red rectangle, as there is always data available due to a dummy variable ("lin") that creates continuous data at the bottom of the active panel. The data are captured only if a greenish vertical bar appears and a new line of data appears in the R console.
3 - To extract incubation bouts, the first click in the new plot has to be the start of incubation; the next click marks the end of incubation, and a click on the same spot marks the start of incubation for the other sex. If the end and start of incubation are at different times, the data will still be extracted, but the sex, logger and bird_ID will be wrong. These need to be changed manually in the csv file. Similarly, the first bout for a given plot will always be assigned to the male (if no data are present in the csv file) or based on previous data. Hence, whenever data from a new plot are extracted, at the first mouse click it is worth checking whether the sex, logger and bird_ID information is correct and, if not, adjusting it manually.
4 - If all information from one day (panel) is extracted, right-click on the plot and choose "stop". This will activate the following day (panel) for extraction.
5 - If you wish to end extraction before going through all the rectangles, just press "escape".

Annotations of data-files from turnstone_2009_Barrow_nest-t401_transmitter.RData

dfr: contains raw data on signal strength from the radio tag attached to the rump of the female and male, and information about when the birds were captured and the incubation stage of the nest
1. who: identifies whether the recording refers to female, male, capture or start of hatching
2. datetime_: date and time of each recording
3. logger: unique identity of the radio tag
4. signal_: signal strength of the radio tag
5. sex: sex of the bird (f = female, m = male)
6. nest: unique identity of the nest
7. day: datetime_ variable truncated to year-month-day format
8. time: time of day in hours
9. datetime_utc: date and time of each recording, but in UTC time
10. cols: colors assigned to "who"

m: contains metadata for a given nest
1. sp: identifies species (RUTU = Ruddy turnstone)
2. nest: unique identity of the nest
3. year_: year of observation
4. IDfemale: unique identity of the female
5. IDmale: unique identity of the male
6. lat: latitude coordinate of the nest
7. lon: longitude coordinate of the nest
8. hatch_start: date and time when the hatching of the eggs started
9. scinam: scientific name of the species
10. breeding_site: unique identity of the breeding site (barr = Barrow, Alaska)
11. logger: type of device used to record incubation (IT - radio tag)
12. sampling: mean incubation sampling interval in seconds

s: contains metadata for the incubating parents
1. year_: year of capture
2. species: identifies species (RUTU = Ruddy turnstone)
3. author: identifies the author who measured the bird
4. nest: unique identity of the nest
5. caught_date_time: date and time when the bird was captured
6. recapture: was the bird captured before? (0 - no, 1 - yes)
7. sex: sex of the bird (f = female, m = male)
8. bird_ID: unique identity of the bird
9. logger: unique identity of the radio tag
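The two derived variables in dfr, day and time, can be illustrated with a short sketch. The original script is written in R; the Python below is only an illustration of the definitions given above, and it assumes that "time of day in hours" means decimal hours. The function name and the example timestamp are hypothetical.

```python
from datetime import datetime
from typing import Tuple


def derive_day_and_time(recorded: datetime) -> Tuple[str, float]:
    """Return (day, time): the date as year-month-day and the time of day in decimal hours."""
    day = recorded.strftime("%Y-%m-%d")
    hours = recorded.hour + recorded.minute / 60 + recorded.second / 3600
    return day, hours


# Hypothetical recording timestamp, not taken from the dataset.
print(derive_day_and_time(datetime(2009, 6, 15, 13, 30, 0)))
```

Under that assumption, a recording at 13:30:00 gets a time value of 13.5.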