29 datasets found
  1. f

    Data and probability for an incomplete 2×2 table.

    • plos.figshare.com
    xls
    Updated Jun 10, 2023
    Cite
    Hezhi Lu; Fengjing Cai; Yuan Li; Xionghui Ou (2023). Data and probability for an incomplete 2×2 table. [Dataset]. http://doi.org/10.1371/journal.pone.0272007.t001
    Dataset updated
    Jun 10, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Hezhi Lu; Fengjing Cai; Yuan Li; Xionghui Ou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data and probability for an incomplete 2×2 table.

  2. f

    Proportion of a nominal sample from each respondent category according to...

    • plos.figshare.com
    xls
    Updated Jun 10, 2023
    Cite
    Daniel J. Simons; Christopher F. Chabris (2023). Proportion of a nominal sample from each respondent category according to the 2010 Census data, along with weights applied to individual respondents in the MTurk and SurveyUSA samples. [Dataset]. http://doi.org/10.1371/journal.pone.0051876.t002
    Dataset updated
    Jun 10, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Daniel J. Simons; Christopher F. Chabris
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Note: SurveyUSA weights are based on data from Simons & Chabris (2011), re-normed to 2010 Census data.

  3. d

    Spatial Coverage Map and Resampling Error Assessment for Hyperspectral...

    • search.dataone.org
    • borealisdata.ca
    Updated Dec 28, 2023
    Cite
    Inamdar, Deep; Soffer, Raymond; Kalacska, Margaret; Naprstek, Tomas (2023). Spatial Coverage Map and Resampling Error Assessment for Hyperspectral Imaging Data [Dataset]. http://doi.org/10.5683/SP3/EO8LM8
    Dataset updated
    Dec 28, 2023
    Dataset provided by
    Borealis
    Authors
    Inamdar, Deep; Soffer, Raymond; Kalacska, Margaret; Naprstek, Tomas
    Description

    A set of MATLAB functions (HSI_PSFS, SC_RS_Analysis_NAD.m, SC_RS_Analysis_sim.m) were developed to assess the spatial coverage of pushbroom hyperspectral imaging (HSI) data. HSI_PSFs derives the net point spread function of HSI data based on nominal data acquisition and sensor parameters (sensor speed, sensor heading, sensor altitude, number of cross track pixels, sensor field of view, integration time, frame time and pixel summing level). SC_RS_Analysis_sim calculates a theoretical spatial coverage map for HSI data based on nominal data acquisition and sensor parameters. The spatial coverage map is the sum of the point spread functions of all the pixels collected within an HSI dataset. Practically, the spatial coverage map quantifies how HSI data spatially samples spectral information across an imaged scene. A secondary theoretical spatial coverage map is also calculated for spatially resampled (nearest neighbour approach) HSI data. The function also calculates theoretical resampling errors such as pixel duplication (%), pixel loss (%) and pixel shifting (m). SC_RS_Analysis_NAD calculates an empirical spatial coverage map for collected HSI data (before and after spatial resampling) based on its nominal data acquisition and sensor parameters. The function also calculates empirical resampling errors. The current implementation of SC_RS_Analysis_NAD only works for ITRES (Calgary, Alberta, Canada) data products as it uses auxiliary information generated during the ITRES data processing workflow. This auxiliary information includes a ground look-up table that specifies the location (easting and northing) of each pixel of the HSI data in its raw sensor geometry. This auxiliary information also includes the pixel-to-pixel mapping between the HSI data in its raw sensor geometry and the spatially resampled HSI data. 
    SC_RS_Analysis_NAD can readily be modified to work with HSI data collected by sensors from other manufacturers so long as the required auxiliary information can be extracted during data processing.

  4. D

    Experimental Data for Fault Diagnosis in the Adaptive High-Rise D1244

    • darus.uni-stuttgart.de
    Updated Feb 24, 2025
    Cite
    Jonas Stiefelmaier (2025). Experimental Data for Fault Diagnosis in the Adaptive High-Rise D1244 [Dataset]. http://doi.org/10.18419/DARUS-4784
    Dataset updated
    Feb 24, 2025
    Dataset provided by
    DaRUS
    Authors
    Jonas Stiefelmaier
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Dataset funded by
    DFG
    Description

    General information: This dataset is meant to serve as a benchmark problem for fault detection and isolation in dynamic systems. It contains preprocessed sensor data from the adaptive high-rise demonstrator building D1244, built in the scope of the CRC1244. Parts of the measurements have been artificially corrupted and labeled accordingly. Please note that although the measurements are stored in Matlab's .mat-format (Version 7.0), they can easily be processed using free software such as the SciPy library in Python.

    Structure of the dataset:

    • train contains training data (only nominal)
    • validation contains validation data (nominal and faulty). Faulty samples were obtained by manipulating a single signal in a random nominal sample from the validation data.
    • test contains test data (nominal and faulty). Faulty samples were obtained by manipulating a single signal in a random nominal sample from the test data.
    • meta contains textual labels for all signals as well as additional information on the considered fault classes

    File contents: Each file contains the following data from 1200 timesteps (60 seconds sampled at 20 Hz):

    • t: time in seconds
    • u: actuator forces (obtained from pressure measurements) in newtons
    • y: relative elongations as well as bending curvatures of structural elements obtained from strain gauge measurements, and actuator displacements measured by position encoders
    • label: categorical label of the present fault class, where 0 denotes the nominal class and faults in the different signals are encoded according to their index in the list of fault types in meta/labels.mat

    Faulty samples additionally include the corresponding nominal values for reference:

    • u_true: actuator forces without faults
    • y_true: measured outputs without faults

    Textual labels for all in- and output signals as well as all faults are given in the struct labels. Each sample's textual fault label is additionally contained in its filename (between the first and second underscore).
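
    The SciPy route and the filename rule described for this dataset can be sketched roughly as follows. This is only an illustration: the variable names t, u, y, label, u_true and y_true come from the description, but the helper names and the example file name are hypothetical, and it is an assumption that every file name contains at least two underscores.

```python
from pathlib import Path


def fault_label_from_filename(path):
    """Return the text between the first and second underscore of the file name."""
    parts = Path(path).stem.split("_")
    if len(parts) < 3:
        raise ValueError(f"expected at least two underscores in {path!r}")
    return parts[1]


def load_sample(path):
    """Load one sample file into a plain dict of arrays.

    SciPy is imported lazily so the filename helper above works without it.
    """
    from scipy.io import loadmat  # third-party dependency

    data = loadmat(path)  # dict: variable name -> NumPy array
    # Keys named in the description; u_true / y_true exist only in faulty samples.
    keys = ("t", "u", "y", "label", "u_true", "y_true")
    return {k: data[k] for k in keys if k in data}


# Hypothetical file name, used only to illustrate the naming rule.
print(fault_label_from_filename("test/sample_offset_0042.mat"))
```

    Keeping the label parser separate from the loader means the files can be grouped by fault class from a directory listing alone, before any data is read.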

  5. Solar Radiation Spectrum 2018-2023

    • kaggle.com
    Updated Aug 18, 2023
    Cite
    TavoGLC (2023). Solar Radiation Spectrum 2018-2023 [Dataset]. https://www.kaggle.com/datasets/tavoglc/solar-radiation-spectrum-2018-2023
    Dataset updated
    Aug 18, 2023
    Dataset provided by
    Kaggle
    Authors
    TavoGLC
    Description

    TSIS-1 SIM Solar Spectral Irradiance V09

    SELECTION CRITERIA

    • date range: 20180314 to 20230129
    • cadence: 24 hours
    • spectral range: 200.0 to 2400.0 nm
    • number of data: 3307488
    • identifier_product_doi: 10.5067/TSIS/SIM/DATA318
    • identifier_product_doi_authority: http://dx.doi.org/

    END SELECTION CRITERIA

    DATA DEFINITIONS, number = 11 (name, type, format)

    • nominal_date_yyyymmdd, R8, f11.2
    • nominal_date_jdn, R8, f11.2
    • wavelength, R4, f9.3 (nm)
    • instrument_mode_id, I2, i3
    • data_version, I2, i3
    • irradiance_1AU, R8, e15.8 (W/m^2/nm)
    • instrument_uncertainty, R8, e15.8 (W/m^2/nm, 1 sigma)
    • measurement_precision, R8, e15.8 (W/m^2/nm, 1 sigma)
    • measurement_stability, R8, e15.8 (W/m^2/nm, 1 sigma)
    • additional_uncertainty, R8, e15.8 (W/m^2/nm, 1 sigma)
    • quality, UI2, i6

    END DATA DEFINITIONS

    Background on the Total and Spectral Solar Irradiance Sensor (TSIS-1)

    The Total and Spectral Solar Irradiance Sensor (TSIS-1) level 3 (L3) data product is constructed using measurements from the Total Irradiance Monitor (TIM) and Spectral Irradiance Monitor (SIM) instruments. The TIM instrument measures the total solar irradiance (TSI) that is incident at the outer boundaries of the atmosphere and the SIM instrument measures the solar spectral irradiance (SSI) from 200 nm to 2400 nm, which are combined into 12-hr and 24-hr solar spectra. The TSIS-1 data products are provided on a fixed wavelength scale, which has a variable resolution over the spectral range. Irradiances are reported at a mean solar distance of 1 AU and zero relative line-of-sight velocity with respect to the Sun.

    Table: Solar Spectral Irradiance (SSI) Measurement Summary.

    Measuring Instrument    SIM
    Temporal Cadence        Daily
    Detector                Diodes (200 nm to 1620 nm), ESR (1620 nm to 2400 nm)
    Instrument Modes        86 (UV), 85 (VIS), 84 (IR), 83 (ESR)
    Spectral Range          200 nm to 2400 nm

    The spectral irradiances are tabulated below ("DATA RECORDS"), with each row giving the nominal date (YYYYMMDD.D), nominal date (Julian Day), wavelength center (nm), instrument mode, data version, spectral irradiance @ 1au (irradiance_1AU, Watts/m^2/nm), instrument_uncertainty (Watts/m^2/nm), measurement_precision (Watts/m^2/nm), measurement_stability (Watts/m^2/nm), additional_uncertainty (Watts/m^2/nm), and a "quality" (data quality flag) value. Measurement_stability is given as 0.00000000e+00 (0.0) at wavelengths > 1050 nm, where we do not currently calculate a degradation correction, and for all data that arrives after the bi-annual Channel C calibration scans. The bi-annual Channel C scans trigger a new data release version, so there could be up to six months of measurement stability values that are 0.0 until determined during the creation of the next data release.

    Data quality flags are assigned to each spectral measurement in the 'quality' column. The value in this column is the addition of all the bit-wise data quality flags (DQF) associated with a given measurement. Nominal data has a DQF of '0'. The L3 TSIS-1 SIM data quality flags are:

    VALUE   CONDITION
    -----   ---------
    1       Missing data
    2       Backfilled data (from previous day)
    512     Data taken with offset pointing; a spectral correction has been applied

    Data with the '512' bit set was taken from March 19, 2022 through May 19, 2022. During this period, the TSIS-1 SIM pointing was off by ~1 arcmin due to external contamination of the pointing system quad-diode (HFSSB). A wavelength-dependent correction has been applied to data during this period, and the corresponding additional irradiance uncertainties associated with this correction are given in the 'additional_uncertainty' column. Note that it is possible that multiple flags can be set on the same measurement. For example, a quality of '514' is backfilled data, and the data used was taken during the offset pointing.

    Instrument_uncertainty, measurement_precision, measurement_stability, and additional_uncertainty are all in units of (Watts/m^2/nm).

    Each field (column) is defined and further described in the "DATA DEFINITIONS" section.

    An IDL file reader (http://lasp.colorado.edu/data/tsis/file_readers/read_lasp_ascii_file.pro) is available which will read this file and return an array of structures whose field names and types are taken from the "DATA DEFINITIONS" section.

    Erik Richard (2023), Level 3 (L3) Solar Spectral Irradiance Daily Means V009, Greenbelt, MD, USA, Goddard Earth Sciences Data and Information Services Center (GES DISC), Accessed [Data Access Date] at http://dx.doi.org/10.5067/TSIS/SIM/DATA318

    For more information on the TSIS-1 instruments and data products, see: http://lasp.colorado...
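
    Because the 'quality' value is a sum of bit-wise flags, it can be split back into its conditions with a bitwise AND. The following is a minimal sketch: the three flag values and their meanings are taken from the description above, while the names QUALITY_FLAGS and decode_quality are hypothetical.

```python
# Bit values and meanings as listed in the data quality flag table.
QUALITY_FLAGS = {
    1: "Missing data",
    2: "Backfilled data (from previous day)",
    512: "Data taken with offset pointing; a spectral correction has been applied",
}


def decode_quality(quality):
    """Return the conditions whose bit is set; an empty list means nominal data."""
    return [text for bit, text in QUALITY_FLAGS.items() if quality & bit]


# 514 = 512 + 2, the combined example given in the description.
print(decode_quality(514))
print(decode_quality(0))
```

    Testing individual bits rather than comparing against whole values keeps the check valid when several flags are set on the same measurement.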

  6. Frame-Labeled 60 GHz FMCW Radar Gesture Dataset

    • zenodo.org
    zip
    Updated May 7, 2025
    Cite
    Sarah Seifi; Tobias Sukianto; Cecilia Carbonelli (2025). Frame-Labeled 60 GHz FMCW Radar Gesture Dataset [Dataset]. http://doi.org/10.5281/zenodo.15178095
    Dataset updated
    May 7, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sarah Seifi; Tobias Sukianto; Cecilia Carbonelli
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    As the field of human-computer interaction continues to evolve, there is a growing need for robust datasets that can enable the development of gesture recognition systems that operate reliably in diverse real-world scenarios. We present a radar-based gesture dataset, recorded using the BGT60TR13C XENSIV™ 60GHz Frequency Modulated Continuous Radar sensor to address this need. This dataset includes both nominal gestures and anomalous gestures, providing a diverse and challenging benchmark for understanding and improving gesture recognition systems.

    The dataset contains a total of 49,000 gesture recordings, with 25,000 nominal gestures and 24,000 anomalous gestures. Each recording consists of 100 frames of raw radar data, accompanied by a label file that provides annotations for every individual frame in each gesture sequence. This frame-based annotation allows for high-resolution temporal analysis and evaluation.

    Nominal Gesture Data

    The nominal gestures represent standard, correctly performed gestures. These gestures were collected to serve as the baseline for gesture recognition tasks. The details of the nominal data are as follows:

    • Gesture Types: The dataset includes five nominal gesture types:

      1. Swipe Left
      2. Swipe Right
      3. Swipe Up
      4. Swipe Down
      5. Push
    • Total Samples: 25,000 nominal gestures.

    • Participants: The nominal gestures were performed by 12 participants (p1 through p12).

    Each nominal gesture has a corresponding label file that annotates every frame with the nominal gesture type, providing a detailed temporal profile for training and evaluation purposes.

    Anomalous Gesture Data

    The anomalous gestures represent deviations from the nominal gestures. These anomalies were designed to simulate real-world conditions in which gestures might be performed incorrectly, under varying speeds, or with modified execution patterns. The anomalous data introduces additional challenges for gesture recognition models, testing their ability to generalize and handle edge cases effectively.

    • Total Samples: 24,000 anomalous gestures.

    • Anomaly Types: The anomalous gestures include three distinct types of anomalies:

      1. Fast Executions: Gestures performed at a rapid pace, lasting approximately 0.1 seconds (much faster than the nominal average of 0.5 seconds).
      2. Slow Executions: Gestures performed at a significantly slower pace, lasting approximately 3 seconds (much slower than the nominal average).
      3. Wrist Executions: Gestures performed using the wrist instead of a fully extended arm, significantly altering the execution pattern.
    • Participants: The anomalous gestures involved contributions from eight participants, including p1, p2, p6, p7, p9, p10, p11, and p12.

    • Locations: All anomalous gestures were collected in location e1 (a closed-space meeting room).

    Radar Configuration Details

    The radar system was configured with an operational frequency range spanning from 58.5 GHz to 62.5 GHz. This configuration provides a range resolution of 37.5 mm and the ability to resolve targets at a maximum range of 1.2 meters. For signal transmission, the radar employed a burst configuration comprising 32 chirps per burst with a frame rate of 33 Hz and a pulse repetition time of 300 µs.
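
    The stated range resolution and maximum range can be checked against the sweep bandwidth with the standard FMCW relation ΔR = c / (2B). The sketch below rests on two assumptions that the description does not state: that the full 58.5 to 62.5 GHz span is swept by each chirp, and that the 64 samples per chirp listed under Data Format are real-valued, so that only half of the FFT bins give unambiguous range. The function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s


def range_resolution(f_start_hz, f_stop_hz):
    """FMCW range resolution for a chirp sweeping f_start..f_stop."""
    bandwidth = f_stop_hz - f_start_hz
    return C / (2.0 * bandwidth)


def max_range(resolution_m, n_samples, real_sampling=True):
    """Maximum range covered by the range FFT.

    With real-valued samples only n_samples / 2 bins are unambiguous
    (assumption; complex sampling would give n_samples bins).
    """
    n_bins = n_samples // 2 if real_sampling else n_samples
    return resolution_m * n_bins


res = range_resolution(58.5e9, 62.5e9)
print(f"range resolution: {res * 1e3:.1f} mm")
print(f"maximum range: {max_range(res, 64):.2f} m")
```

    Under these assumptions both figures agree with the 37.5 mm and 1.2 m quoted in the description.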

    Data Format

    The data for each user, categorized by location and anomaly type, is saved in compressed .npz files. Each .npz file contains key-value pairs for the data and its corresponding labels. The file naming convention is as follows:
    UserLabel_EnvironmentLabel(_AnomalyLabel).npy. For nominal gestures, the anomaly label is omitted.
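
    The naming convention can be parsed along the following lines. This is a sketch rather than the authors' tooling: the label vocabularies (p1 to p12, e1 to e6, and fast, slow, wrist) are taken from the Metadata section of this description, the example file names are hypothetical, and the extension is ignored because the description mentions both .npz and .npy.

```python
from pathlib import Path
from typing import NamedTuple, Optional

# Label vocabularies as listed in the dataset's Metadata section.
USERS = {f"p{i}" for i in range(1, 13)}
ENVIRONMENTS = {f"e{i}" for i in range(1, 7)}
ANOMALIES = {"fast", "slow", "wrist"}


class RecordingName(NamedTuple):
    user: str
    environment: str
    anomaly: Optional[str]  # None for nominal gestures


def parse_recording_name(path):
    """Split UserLabel_EnvironmentLabel(_AnomalyLabel) into its parts."""
    parts = Path(path).stem.split("_")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected file name: {path!r}")
    user, environment = parts[0], parts[1]
    anomaly = parts[2] if len(parts) == 3 else None
    if user not in USERS or environment not in ENVIRONMENTS:
        raise ValueError(f"unknown user or environment label in {path!r}")
    if anomaly is not None and anomaly not in ANOMALIES:
        raise ValueError(f"unknown anomaly label in {path!r}")
    return RecordingName(user, environment, anomaly)


# Hypothetical file names following the stated convention.
print(parse_recording_name("p1_e1_fast.npz"))
print(parse_recording_name("p3_e2.npz"))
```

    Validating against the known label sets catches stray files early, before any large array is loaded.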

    The .npz file contains two primary keys:

    1. inputs: Represents the raw radar data.
    2. targets: Refers to the corresponding label vector for the raw data.

    The raw radar data (inputs) is stored as a NumPy array with 5 dimensions, structured as follows:
    n_recordings x n_frames x n_antennas x n_chirps x n_samples, where:

    1. n_recordings: The number of gesture sequence instances (i.e., recordings).
    2. n_frames: The frame length of each gesture (100 frames per gesture).
    3. n_antennas: The number of virtual antennas (3 antennas).
    4. n_chirps: The number of chirps per frame (32 chirps).
    5. n_samples: The number of samples per chirp (64 samples).

    The labels (targets) are stored as a NumPy array with 2 dimensions, structured as follows:
    n_recordings x n_frames, where:

    1. n_recordings: The number of gesture sequence instances (i.e., recordings).
    2. n_frames: The frame length of each gesture (100 frames per gesture).

    Each entry in the targets matrix corresponds to the frame-level label for the associated raw radar data in inputs.

    The total size of the dataset is approximately 48.1 GB, provided as a compressed file named radar_dataset.zip.

    Metadata

    The user labels are defined as follows:

    • p1: Male
    • p2: Female
    • p3: Female
    • p4: Male
    • p5: Male
    • p6: Male
    • p7: Male
    • p8: Male
    • p9: Male
    • p10: Female
    • p11: Male
    • p12: Male

    The environmental labels included in the dataset are defined as follows:

    • e1: Closed-space meeting room
    • e2: Open-space office room
    • e3: Library
    • e4: Kitchen
    • e5: Exercise room
    • e6: Bedroom

    The anomaly labels included in the dataset are defined as follows:

    • fast: Fast gesture execution
    • slow: Slow gesture execution
    • wrist: Wrist gesture execution

    This dataset represents a robust and diverse set of radar-based gesture data, enabling researchers and developers to explore novel models and evaluate their robustness in a variety of scenarios. The inclusion of frame-based labeling provides an additional level of detail to facilitate the design of advanced gesture recognition systems that can operate with high temporal resolution.

    Disclaimer

    This dataset builds upon the version previously published on IEEE DataExplorer (https://ieee-dataport.org/documents/60-ghz-fmcw-radar-gesture-dataset), which included only one label per recording. In contrast, this version includes frame-based labels, providing individual annotations for each frame of the recorded gestures. By offering more granular labeling, this dataset further supports the development and evaluation of gesture recognition models with enhanced temporal precision. However, the raw radar data remains unchanged compared to the dataset available on IEEE DataExplorer.

  7. f

    Data_Sheet_1_Predicting the data structure prior to extreme events from...

    • frontiersin.figshare.com
    pdf
    Updated Jun 4, 2023
    Cite
    Abhirup Banerjee; Arindam Mishra; Syamal K. Dana; Chittaranjan Hens; Tomasz Kapitaniak; Jürgen Kurths; Norbert Marwan (2023). Data_Sheet_1_Predicting the data structure prior to extreme events from passive observables using echo state network.PDF [Dataset]. http://doi.org/10.3389/fams.2022.955044.s001
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    Frontiers
    Authors
    Abhirup Banerjee; Arindam Mishra; Syamal K. Dana; Chittaranjan Hens; Tomasz Kapitaniak; Jürgen Kurths; Norbert Marwan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Extreme events are defined as events that largely deviate from the nominal state of the system as observed in a time series. Due to the rarity and uncertainty of their occurrence, predicting extreme events has been challenging. In real life, some variables (passive variables) often encode significant information about the occurrence of extreme events manifested in another variable (active variable). For example, observables such as temperature, pressure, etc., act as passive variables in case of extreme precipitation events. These passive variables do not show any large excursion from the nominal condition yet carry the fingerprint of the extreme events. In this study, we propose a reservoir computation-based framework that can predict the preceding structure or pattern in the time evolution of the active variable that leads to an extreme event using information from the passive variable. An appropriate threshold height of events is a prerequisite for detecting extreme events and improving the skill of their prediction. We demonstrate that the magnitude of extreme events and the appearance of a coherent pattern before the arrival of the extreme event in a time series affect the prediction skill. Quantitatively, we confirm this using a metric describing the mean phase difference between the input time signals, which decreases when the magnitude of the extreme event is relatively higher, thereby increasing the predictability skill.

  8. D

    Data for Fault Diagnosis in Adaptive Buildings

    • darus.uni-stuttgart.de
    Updated Feb 7, 2023
    Cite
    Jonas Stiefelmaier (2023). Data for Fault Diagnosis in Adaptive Buildings [Dataset]. http://doi.org/10.18419/DARUS-3332
    Dataset updated
    Feb 7, 2023
    Dataset provided by
    DaRUS
    Authors
    Jonas Stiefelmaier
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Dataset funded by
    DFG
    Description

    General information: This dataset is meant to serve as a benchmark problem for fault detection and isolation in dynamical systems. It contains pre-processed sensor data from the adaptive high-rise demonstrator building D1244, built in the scope of the CRC1244. Parts of the measurements have been artificially corrupted and labeled accordingly. Please note that although the measurements are stored in Matlab's .mat-format (Version 7.0), they can easily be processed using free software such as the SciPy library in Python.

    Structure of the dataset:

    • train contains the training data (only nominal)
    • test_easy contains test data (nominal and faulty with high fault amplitude). Faulty samples were obtained by manipulating a single signal in a random nominal sample from the test data.
    • test_hard contains test data (nominal and faulty with low fault amplitude)
    • meta contains textual labels for all signals and fault types

    File contents: Each file contains the following data from 16384 timesteps:

    • t: time in seconds
    • u: demanded actuator forces in newtons
    • y: measured outputs (relative elongations measured by strain gauges and actuator displacements in meters measured by position encoders)
    • label: categorical label of the present fault class, where 0 denotes the nominal class and faults in the different signals are encoded according to their index in the list of fault types meta/labels.txt

    Faulty samples additionally include the corresponding nominal values for reference:

    • u_true: delivered actuator forces
    • y_true: measured outputs without faults

    A sample's textual fault label is also contained in its filename (between the first and second underscore).
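
    The label encoding described here (0 for nominal, otherwise an index into the list of fault types) might be resolved to text as sketched below. Two points are assumptions not confirmed by the description: that meta/labels.txt holds one fault type per line, and that the index is 1-based, so that 0 stays free for the nominal class. The fault type names in the example are made up.

```python
def read_fault_types(text):
    """Parse the contents of a labels file, assuming one fault type per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def fault_name(label, fault_types):
    """Map a categorical label to text; 0 is nominal, k >= 1 is the k-th fault type."""
    if label == 0:
        return "nominal"
    if not 1 <= label <= len(fault_types):
        raise ValueError(f"label {label} is outside the list of fault types")
    return fault_types[label - 1]  # assumed 1-based index


# Made-up fault type names, for illustration only.
types = read_fault_types("sensor_offset\nsensor_drift\nactuator_stuck\n")
print(fault_name(0, types))
print(fault_name(2, types))
```

    If the index turns out to be 0-based in the actual files, only the lookup in fault_name needs to change.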

  9. f

    Summary statistics for expression quantitative trait loci in the developing...

    • springernature.figshare.com
    application/gzip
    Updated May 30, 2023
    Cite
    Heath O'Brien; Nicholas J. Bray (2023). Summary statistics for expression quantitative trait loci in the developing human brain and their enrichment in neuropsychiatric disorders [Dataset]. http://doi.org/10.6084/m9.figshare.6881825.v1
    Dataset updated
    May 30, 2023
    Dataset provided by
    figshare
    Authors
    Heath O'Brien; Nicholas J. Bray
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains summary statistics for eQTL (Expression Quantitative Trait Loci) analyses for 120 human fetal brains from the second trimester of gestation (12 to 19 post-conception weeks). Expression matrices, covariates, and summary statistics are provided for all tested eQTL and for top eQTL for all genes.

    The data are contained within a single .zip archive file. Individual data files are in openly accessible .txt text file format, containing p- or q-values by SNP, and in .bed (Browser Extensible Data) format, containing annotation track data such as chromosomal coordinates. Data files of multiple GB in size are stored in individual .gz gzip-compressed files.

    The related study investigates genetic influences on gene expression in the human fetal brain and their relationship with a variety of postnatal brain-related traits, including susceptibility to neuropsychiatric disorders. This dataset represents the first eQTL dataset derived exclusively from the human fetal brain, and is based on initial deep RNA sequencing and genotyping.

    The detailed breakdown of the files in this dataset is provided below and in README.md.

    Gene Level Analyses:

    • expression_gene.bed.gz
      · normalised, variance-stabilising transformed count data (29,875 genes)
      · columns: chr, gene_start, gene_end, gene_id, samples...

    • all_eqtls_gene.txt.gz
      · nominal p-values for all SNPs within 1 MB of each gene
      · columns: gene_id, variant_id, tss_distance, ma_samples, ma_count, maf, pval_nominal, slope, slope_se

    • top_eqtls_gene.txt.gz
      · q-values for most significant eQTL for each gene (includes nominal p-value thresholds that can be used to filter significant SNPs)
      · columns: chr, snp_start, snp_end, gene_id, num_var, beta_shape1, beta_shape2, true_df, pval_true_df, variant_id, tss_distance, minor_allele_samples, minor_allele_count, maf, ref_factor, pval_nominal, slope, slope_se, pval_perm, pval_beta, qval, pval_nominal_threshold

    Transcript Level Analyses:

    • expression_transcript.bed.gz
      · normalised, variance-stabilising transformed count data (144,448 transcripts)
      · columns: chr, transcript_start, transcript_end, transcript_id, samples...

    • all_eqtls_transcript.txt.gz
      · nominal p-values for all SNPs within 1 MB of each transcript
      · columns: transcript_id, variant_id, tss_distance, ma_samples, ma_count, maf, pval_nominal, slope, slope_se

    • top_eqtls_transcript.txt.gz
      · q-values for most significant eQTL for each transcript (includes nominal p-value thresholds that can be used to filter significant SNPs)
      · columns: chr, snp_start, snp_end, transcript_id, num_var, beta_shape1, beta_shape2, true_df, pval_true_df, variant_id, tss_distance, minor_allele_samples, minor_allele_count, maf, ref_factor, pval_nominal, slope, slope_se, pval_perm, pval_beta, qval, pval_nominal_threshold

    Covariates (Used For Both Gene Level and Transcript-Level Analyses):

    • covariates.txt
      · columns: Sample, Sex, PCW, RIN, ReadLength, PC1, PC2, PC3, PEER1, PEER2, PEER3, PEER4, PEER5, PEER6, PEER7, PEER8, PEER9, PEER10
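
    The filtering step hinted at above (using each gene's nominal p-value threshold from the top_eqtls file to select significant SNPs in the all_eqtls file) could look roughly like this. It is a sketch under assumptions: the files are taken to be tab-delimited with a header row, the comparison is taken to be "less than or equal to", and the tiny tables below are invented for illustration. Only the column names come from the description.

```python
import csv
import io


def read_tsv(handle):
    """Read a tab-delimited table with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def significant_pairs(all_rows, top_rows, unit="gene_id"):
    """Keep (unit, variant) pairs whose nominal p-value passes the unit's threshold."""
    thresholds = {r[unit]: float(r["pval_nominal_threshold"]) for r in top_rows}
    return [
        (r[unit], r["variant_id"])
        for r in all_rows
        if r[unit] in thresholds and float(r["pval_nominal"]) <= thresholds[r[unit]]
    ]


# Invented miniature tables using the documented column names.
ALL = "gene_id\tvariant_id\tpval_nominal\nG1\trs1\t1e-8\nG1\trs2\t0.2\nG2\trs3\t1e-4\n"
TOP = "gene_id\tpval_nominal_threshold\nG1\t1e-5\nG2\t1e-6\n"

hits = significant_pairs(read_tsv(io.StringIO(ALL)), read_tsv(io.StringIO(TOP)))
print(hits)
```

    For the real archives, the same functions can be fed from gzip.open(path, "rt", newline=""), and passing unit="transcript_id" covers the transcript-level files.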

  10. F

    Gross Domestic Product

    • fred.stlouisfed.org
    • trends.sourcemedium.com
    json
    Updated May 29, 2025
    Cite
    (2025). Gross Domestic Product [Dataset]. https://fred.stlouisfed.org/series/GDP
    Dataset updated
    May 29, 2025
    License

    https://fred.stlouisfed.org/legal/#copyright-public-domain

    Description

    View economic output, reported as the nominal value of all new goods and services produced by labor and property located in the U.S.

  11. Fundamental Data Record for Atmospheric Composition [ATMOS_L1B]

    • earth.esa.int
    Updated Sep 12, 2024
    Cite
    European Space Agency (2024). Fundamental Data Record for Atmospheric Composition [ATMOS_L1B] [Dataset]. https://earth.esa.int/eogateway/catalog/fdr-for-atmospheric-composition
    Dataset updated
    Sep 12, 2024
    Dataset authored and provided by
    European Space Agency (http://www.esa.int/)
    License

    https://earth.esa.int/eogateway/documents/20142/1564626/Terms-and-Conditions-for-the-use-of-ESA-Data.pdf

    Time period covered
    Jun 28, 1995 - Apr 7, 2012
    Description

    The Fundamental Data Record (FDR) for Atmospheric Composition UVN v.1.0 dataset is a cross-instrument Level-1 product [ATMOS_L1B] generated in 2023 and resulting from the ESA FDR4ATMOS project. The FDR contains selected Earth Observation Level 1b parameters (irradiance/reflectance) from the nadir-looking measurements of the ERS-2 GOME and Envisat SCIAMACHY missions for the period ranging from 1995 to 2012. The data record offers harmonised cross-calibrated spectra with focus on spectral windows in the Ultraviolet-Visible-Near Infrared regions for the retrieval of critical atmospheric constituents like ozone (O3), sulphur dioxide (SO2), nitrogen dioxide (NO2) column densities, alongside cloud parameters.

    The FDR4ATMOS products should be regarded as experimental due to the innovative approach and the current use of a limited-sized test dataset to investigate the impact of harmonization on the Level 2 target species, specifically SO2, O3 and NO2. Presently, this analysis is being carried out within follow-on activities. The FDR4ATMOS V1 is currently being extended to include the MetOp GOME-2 series.

    Product format

    In many respects, the FDR product has improved compared to the existing individual mission datasets:

    • GOME solar irradiances are harmonised using a validated SCIAMACHY solar reference spectrum, solving the problem of the fast-changing etalon present in the original GOME Level 1b data;
    • Reflectances for both GOME and SCIAMACHY are provided in the FDR product. GOME reflectances are harmonised to degradation-corrected SCIAMACHY values, using collocated data from the CEOS PIC sites;
    • SCIAMACHY data are scaled to the lowest integration time within the spectral band using high-frequency PMD measurements from the same wavelength range. This simplifies the use of the SCIAMACHY spectra, which were split in a complex cluster structure (each with its own integration time) in the original Level 1b data;
    • The harmonization process applied mitigates the viewing angle dependency observed in the UV spectral region for GOME data;
    • Uncertainties are provided.

    Each FDR product provides, within the same file, irradiance/reflectance data for UV-VIS-NIR spectral regions across all orbits on a single day, including therein information from the individual ERS-2 GOME and Envisat SCIAMACHY measurements. The FDR has been generated in two formats, Level 1A and Level 1B, targeting expert users and nominal applications respectively. The Level 1A [ATMOS_L1A] data include additional parameters such as harmonisation factors, PMD, and polarisation data extracted from the original mission Level 1 products. The ATMOS_L1A dataset is not part of the nominal dissemination to users. In case of specific requirements, please contact EOHelp.

    Please refer to the README file for essential guidance before using the data. All the new products are conveniently formatted in NetCDF. Free standard tools, such as Panoply, can be used to read NetCDF data. Panoply is sourced and updated by external entities. For further details, please consult our Terms and Conditions page.

    Uncertainty characterisation

    One of the main aspects of the project was the characterization of Level 1 uncertainties for both instruments, based on metrological best practices. The following documents are provided:

    • General guidance on a metrological approach to Fundamental Data Records (FDR)
    • Uncertainty Characterisation document
    • Effect tables
    • NetCDF files containing example uncertainty propagation analysis and spectral error correlation matrices for SCIAMACHY (Atlantic and Mauretania scene for 2003 and 2010) and GOME (Atlantic scene for 2003): reflectance_uncertainty_example_FDR4ATMOS_GOME.nc, reflectance_uncertainty_example_FDR4ATMOS_SCIA.nc

    Known Issues

    Non-monotonous wavelength axis for SCIAMACHY in FDR data version 1.0

    In the SCIAMACHY OBSERVATION group of the atmospheric FDR v1.0 dataset (DOI: 10.5270/ESA-852456e), the wavelength axis (lambda variable) is not monotonically increasing. This issue affects all spectral channels (UV, VIS, NIR) in the SCIAMACHY group, while GOME OBSERVATION data remain unaffected. The root cause of the issue lies in the incorrect indexing of the lambda variable during the NetCDF writing process. Notably, the wavelength values themselves are calculated correctly within the processing chain.

    Temporary Workaround

    The wavelength axis is correct in the first record of each product. As a workaround, users can extract the wavelength axis from the first record and apply it to all subsequent measurements within the same product. The first record can be retrieved by setting the first two indices (time and scanline) to 0 (assuming counting of array indices starts at 0). Note that this process must be repeated separately for each spectral range (UV, VIS, NIR) and every daily product. Since the wavelength axis of SCIAMACHY is highly stable over time, using the first record introduces no expected impact on retrieval results. Python pseudo-code example: lambda_...
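
The workaround described above for the SCIAMACHY wavelength axis can be sketched in plain Python. This is only an illustration, not the provider's own example: it assumes the lambda variable has already been read into a nested list indexed as [time][scanline][pixel], and the function name and sample values are hypothetical.

```python
def apply_first_record_wavelengths(lam):
    """Return a copy of `lam` in which every record uses the axis of record [0][0].

    `lam` is assumed to be a nested list indexed as [time][scanline][pixel]
    (an assumption about the layout after reading the file).
    """
    # The first record (time index 0, scanline index 0) holds the correct axis.
    reference = list(lam[0][0])
    # Rebuild the array with the same shape, copying the reference axis everywhere.
    return [[list(reference) for _scanline in time_step] for time_step in lam]


# Hypothetical values: only the first record is monotonically increasing.
lam = [
    [[240.0, 240.1, 240.2], [240.2, 240.0, 240.1]],
    [[240.1, 240.2, 240.0], [240.0, 240.2, 240.1]],
]
fixed = apply_first_record_wavelengths(lam)
```

The input is left unchanged, so the original values remain available for comparison. Following the note above, the same step would be run once per spectral range (UV, VIS, NIR) and once per daily product.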

  12. Data from: ENSEMBLES CNRM-CM3 1PCTTO2X run1, daily values

    • wdc-climate.de
    • cera-www.dkrz.de
    Updated Sep 10, 2007
    + more versions
    Cite
    Royer, Jean-Francois (2007). ENSEMBLES CNRM-CM3 1PCTTO2X run1, daily values [Dataset]. https://www.wdc-climate.de/ui/entry?acronym=ENSEMBLES_CNCM3_1PTO2X_1_D
    Explore at:
    Dataset updated
    Sep 10, 2007
    Dataset provided by
    World Data Center for Climate (WDCC) at DKRZ
    Authors
    Royer, Jean-Francois
    License

    http://ensembles-eu.metoffice.com/docs/Ensembles_Data_Policy_261108.pdf

    Time period covered
    Jan 1, 1860 - Dec 31, 2080
    Area covered
    Description

    These data represent daily values (daily mean, instantaneous daily output) of selected variables for ENSEMBLES (http://www.ensembles-eu.org). The list of output variables can be found in: http://ensembles.wdc-climate.de/output-variables

    The 1PCTTO2X simulation (including year 2080) was initiated from nominal year 1970 of the preindustrial run, when equilibrium was reached (this corresponds to nominal year 1860 of the CO2-doubling experiment). Forcing agents included: CO2, CH4, N2O, O3, CFC11 (including other CFCs and HFCs), CFC12; sulfate (Boucher), BC, sea salt, desert dust aerosols.

    These datasets are available in netCDF format. The dataset names are composed of:

    • centre/model acronym (e.g. CNCM3: CNRM/CM3)
    • scenario acronym (e.g. SRA1B: SRES A1B)
    • run number (e.g. 1: run 1)
    • time interval (MM: monthly mean, DM: daily mean, DC: diurnal cycle, 6H: 6 hourly, 12h: 12 hourly)
    • variable acronym with level value

    Example: CNCM3_SRA1B_1_MM_hur850
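
The naming convention above can be unpacked with a short parser. This is a sketch under stated assumptions: the underscore separator and the "letters then digits" shape of the variable part are inferred from the single example name, and the function and field names are hypothetical.

```python
import re


def parse_dataset_name(name):
    """Split a name such as 'CNCM3_SRA1B_1_MM_hur850' into its five parts."""
    parts = name.split("_", 4)
    if len(parts) != 5:
        raise ValueError(f"expected 5 underscore-separated parts, got: {name!r}")
    centre, scenario, run, interval, variable = parts
    # Assumption: the variable acronym is letters, optionally followed by a numeric level.
    match = re.fullmatch(r"([A-Za-z]+)(\d*)", variable)
    if match:
        acronym = match.group(1)
        level = int(match.group(2)) if match.group(2) else None
    else:
        acronym, level = variable, None
    return {
        "centre": centre,
        "scenario": scenario,
        "run": int(run),
        "interval": interval,
        "variable": acronym,
        "level": level,
    }


info = parse_dataset_name("CNCM3_SRA1B_1_MM_hur850")
```

For the example name this yields the variable acronym `hur` with level `850`. Names that do not follow the five-part pattern raise a `ValueError` rather than being guessed at.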

    Technical data to this experiment: CNRM-CM3 (2004): atmosphere: Arpege-Climat v3 (T42L45, cy 22b+); ocean: OPA8.1; sea ice: Gelato 3.10; river routing: TRIP

  13. United Kingdom Nominal Average Weekly Earnings: sa: Total Pay (TP): Whole...

    • ceicdata.com
    Updated Feb 15, 2025
    Cite
    CEICdata.com (2025). United Kingdom Nominal Average Weekly Earnings: sa: Total Pay (TP): Whole Economy [Dataset]. https://www.ceicdata.com/en/united-kingdom/average-weekly-earnings-seasonally-adjusted-sic-2007/nominal-average-weekly-earnings-sa-total-pay-tp-whole-economy
    Explore at:
    Dataset updated
    Feb 15, 2025
    Dataset provided by
    CEIC Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2023 - Nov 1, 2024
    Area covered
    United Kingdom
    Variables measured
    Wage/Earnings
    Description

    United Kingdom Nominal Average Weekly Earnings: sa: Total Pay (TP): Whole Economy data was reported at 716.000 GBP in Feb 2025. This records an increase from the previous number of 711.000 GBP for Jan 2025. United Kingdom Nominal Average Weekly Earnings: sa: Total Pay (TP): Whole Economy data is updated monthly, averaging 461.000 GBP (median) from Jan 2000 to Feb 2025, with 302 observations. The data reached an all-time high of 716.000 GBP in Feb 2025 and a record low of 299.809 GBP in Feb 2000. United Kingdom Nominal Average Weekly Earnings: sa: Total Pay (TP): Whole Economy data remains in active status in CEIC and is reported by the Office for National Statistics. The data is categorized under Global Database’s United Kingdom – Table UK.G083: Average Weekly Earnings: Seasonally Adjusted: SIC 2007. Labour Force Estimates are shown for the mid-month of the three-month average time periods. For example, estimates for January to March 2012 are shown as 'February 2012', estimates for February to April 2012 are shown as 'March 2012', etc. [COVID-19-IMPACT]
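
The mid-month labelling rule described above (a three-month average is shown under its middle month) can be illustrated as follows. The function name is hypothetical, and the year rollover for a window starting in December is an assumption that follows from the rule rather than something stated in the source.

```python
import calendar


def mid_month_label(start_year, start_month):
    """Label a three-month window by its middle month, e.g. Jan-Mar 2012 -> 'February 2012'."""
    if not 1 <= start_month <= 12:
        raise ValueError("start_month must be in 1..12")
    # The middle month is the month after the window's first month.
    mid_month = start_month % 12 + 1
    # A window starting in December has its middle month in the next year.
    mid_year = start_year + (1 if start_month == 12 else 0)
    return f"{calendar.month_name[mid_month]} {mid_year}"


label = mid_month_label(2012, 1)
```

Calling it with `(2012, 2)` gives the second example from the text, the February to April 2012 window.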

  14. MKAD (Open Sourced Code)

    • catalog.data.gov
    Updated Apr 11, 2025
    + more versions
    Cite
    Dashlink (2025). MKAD (Open Sourced Code) [Dataset]. https://catalog.data.gov/dataset/mkad-open-sourced-code
    Explore at:
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Area covered
    MKAD
    Description

    The Multiple Kernel Anomaly Detection (MKAD) algorithm is designed for anomaly detection over a set of files. It combines multiple kernels into a single optimization function using the One Class Support Vector Machine (OCSVM) framework. Any kernel function can be combined in the algorithm as long as it meets the Mercer conditions; however, for the purposes of this code, the data preformatting and kernel type are specific to the Flight Operations Quality Assurance (FOQA) data and have been integrated into the coding steps. For this domain, discrete binary switch sequences are used in the discrete kernel, and discretized continuous parameter features are used to form the continuous kernel. The OCSVM uses a training set of nominal examples (in this case flights) and evaluates test examples to determine whether they are anomalous. After completing this analysis, the algorithm reports the anomalous examples and determines whether there is a contribution from either or both of the continuous and discrete elements.
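
One common way to combine several kernels into one is a convex combination of their Gram matrices, which keeps the Mercer property when each input kernel has it. The sketch below shows only that combination step; whether the MKAD code weights its discrete and continuous kernels exactly this way is an assumption, and the function name and toy matrices are hypothetical. The combined matrix would then be handed to a one-class SVM that accepts a precomputed kernel.

```python
def combine_kernels(gram_matrices, weights):
    """Return the weighted sum of equally sized Gram matrices (lists of lists)."""
    if len(gram_matrices) != len(weights):
        raise ValueError("need one weight per Gram matrix")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    size = len(gram_matrices[0])
    combined = [[0.0] * size for _ in range(size)]
    for gram, weight in zip(gram_matrices, weights):
        for i in range(size):
            for j in range(size):
                combined[i][j] += weight * gram[i][j]
    return combined


# Toy 2x2 Gram matrices standing in for a discrete and a continuous kernel.
k_discrete = [[1.0, 0.0], [0.0, 1.0]]
k_continuous = [[1.0, 0.5], [0.5, 1.0]]
k_combined = combine_kernels([k_discrete, k_continuous], [0.5, 0.5])
```

Keeping the weights explicit makes it easy to see how much each kernel contributes to the combined similarity.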

  15. Degradation Measurement of Robot Arm Position Accuracy

    • data.nist.gov
    Updated Sep 7, 2018
    Cite
    Helen Qiao (2018). Degradation Measurement of Robot Arm Position Accuracy [Dataset]. http://doi.org/10.18434/M31962
    Explore at:
    Dataset updated
    Sep 7, 2018
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Authors
    Helen Qiao
    License

    https://www.nist.gov/open/license

    Description

    The dataset contains both the robot's high-level tool center position (TCP) health data and controller-level components' information (i.e., joint positions, velocities, currents, and temperatures). The datasets can be used by users (e.g., software developers, data scientists) who work on robot health management (including accuracy) but have limited or no access to robots that can capture real data. The datasets can support the:

    • Development of robot health monitoring algorithms and tools
    • Research of technologies and tools to support robot monitoring, diagnostics, prognostics, and health management (collectively called PHM)
    • Validation and verification of the industrial PHM implementation, for example, verifying a robot's TCP accuracy after the work cell has been reconfigured, or whenever a manufacturer wants to determine if the robot arm has experienced a degradation

    For data collection, a trajectory is programmed for the Universal Robot (UR5) approaching and stopping at randomly selected locations in its workspace. The robot moves along this preprogrammed trajectory under different conditions of temperature, payload, and speed. The TCP coordinates (x, y, z) of the robot are measured by a 7-D measurement system developed at NIST. Differences are calculated between the measured positions from the 7-D measurement system and the nominal positions calculated by the nominal robot kinematic parameters. The results are recorded within the dataset.

    Controller-level sensing data are also collected from each joint (direct output from the controller of the UR5) to understand the influence of temperature, payload, and speed on position degradation. Controller-level data can be used for the root cause analysis of the robot performance degradation, by providing joint positions, velocities, currents, accelerations, torques, and temperatures. For example, the cold-start temperatures of the six joints were approximately 25 degrees Celsius. After two hours of operation, the joint temperatures increased to approximately 35 degrees Celsius. Control variables are listed in the header file in the data set (UR5TestResult_header.xlsx). If you'd like to comment on this data and/or offer recommendations on future datasets, please email guixiu.qiao@nist.gov.
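
The position differences mentioned above (measured TCP position minus nominal TCP position) can be computed as in this minimal sketch. The function name and sample coordinates are hypothetical and are not taken from the dataset or its header file.

```python
import math


def tcp_deviation(measured, nominal):
    """Return the per-axis difference (dx, dy, dz) and its Euclidean magnitude."""
    if len(measured) != 3 or len(nominal) != 3:
        raise ValueError("positions must be (x, y, z) triples")
    # Per-axis difference: measured minus nominal.
    delta = tuple(m - n for m, n in zip(measured, nominal))
    # Overall position error as the straight-line distance between the two points.
    magnitude = math.dist(measured, nominal)
    return delta, magnitude


# Hypothetical positions in arbitrary length units.
delta, magnitude = tcp_deviation((1.0, 2.0, 2.0), (0.0, 0.0, 0.0))
```

Tracking the magnitude over repeated runs of the same trajectory is one simple way to look for accuracy degradation across temperature, payload, and speed conditions.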

  16. Student Academic Performance and Probation Dataset

    • kaggle.com
    Updated Nov 29, 2024
    Cite
    Rathnakumarw (2024). Student Academic Performance and Probation Dataset [Dataset]. https://www.kaggle.com/datasets/rathnakumarw/student-academic-performance-and-probation-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 29, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Rathnakumarw
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset Description

    This dataset consists of academic and demographic information about 300 students from a university, which can be used for predicting academic outcomes, such as probation status. The dataset was simulated to represent a variety of student attributes across multiple categories like personal data, academic history, and other related information. The primary goal of this dataset is to analyze factors contributing to academic performance and identify students at risk of probation.

    Column Descriptions

    • Student No.: (Numeric) A unique identifier for each student. In this dataset, each student has a different ID number, making it a 100% unique column.
    • Cohort: (Numeric) The year a student enrolled in the university. No missing values and consistent across the dataset.
    • College: (Nominal) The name of the college the student belongs to. Examples include "Engineering," "Science," etc. No missing values.
    • College Code: (Nominal) A numerical or alphanumerical code representing the college. This is an alternative representation of the "College" column.
    • Major: (Nominal) The major field of study of the student. Some missing values (23%) represent students who haven’t declared a major or are in an undeclared status.
    • Major Code: (Nominal) A code representing the major subject. Similar to the "Major" column, this has 23% missing values due to undeclared majors.
    • Minor: (Nominal) The minor subject, if any, chosen by the student. This column has a high percentage of missing data (91%) since most students do not have minors.
    • Spec: (Nominal) Specialization within the major field of study. Like the "Minor" column, this has 93% missing data as most students do not declare a specialization.
    • Degree: (Numeric) The type of degree the student is pursuing (e.g., Bachelor's). In this dataset, all students are pursuing the same degree, so there are no missing values.
    • Status: (Nominal) The current academic standing of the student (e.g., "Active," "Inactive"). No missing values.
    • Load Status: (Nominal) The academic load status (e.g., "Full-time," "Part-time"). This column has very few missing values (1%).
    • Gender: (Nominal) The gender of the student (e.g., "Male," "Female"). No missing values.
    • Country: (Nominal) The country of origin of the student. Only 2 missing values, making it nearly complete.
    • Governorate: (Nominal) The administrative region (governorate) the student comes from. This column has a small percentage of missing values (1%).
    • Wellayah: (Nominal) The district or locality within the governorate. Around 1% of the data is missing.
    • CGPA: (Numeric) The cumulative grade point average (CGPA) of the student. This field has 145 missing values, representing students without available CGPA records.
    • Estimated Graduation Year: (Numeric) The expected year in which the student will graduate. No missing values.
    • From HEAC: (Nominal) Indicates whether the student was admitted through the Higher Education Admission Center (HEAC). This column has 4% missing values.
    • Admission Category: (Nominal) The category of admission (e.g., scholarship, self-funded). This column has a significant amount of missing data (98%), indicating that admission category data is either unavailable or irrelevant for most students.
    • Birth Date: (Nominal) The birth date of the student. The dataset includes very few missing values (0%) and has been replaced by the derived feature "Age."
    • Actual Graduation Date: (Nominal) The actual date on which a student graduates. More than half of the values are missing (54%), representing students who haven’t graduated yet.
    • Withdrawal: (Nominal) Indicates whether the student has withdrawn from the university. This column has 89% missing data since the majority of students haven’t withdrawn.
    • Marital Status: (Nominal) The marital status of the student (e.g., "Single," "Married"). No missing values.
    • SQU Hostel: (Nominal) Indicates whether the student lives in the university hostel. No missing values.
    • Percentage (Secondary School Score): (Nominal) The student’s percentage score from secondary school. No missing values.
    • Probation Student: (Nominal) Indicates whether the student is under academic probation. This is the target variable for classification, with no missing values.

    Record Details

    • Total Records: 300
    • Total Attributes: 26
    • Missing Values: Some columns have a significant proportion of missing data (e.g., Minor, Spec, Major Code), while others have very few or no missing values (e.g., Gender, Cohort, College). Missing values were handled using a placeholder for clarity in certain columns.
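
The per-column missing-value percentages quoted above could be recomputed along these lines. This is a sketch only: the rows are assumed to be already loaded as dictionaries, the placeholder used in the real file is not stated in the description (so `None` and the empty string are assumed here), and the sample rows are invented.

```python
def missing_share(rows, placeholders=(None, "")):
    """Return {column: percent of rows whose value is missing} for a list of dict rows."""
    # Collect column names in first-seen order.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    total = len(rows)
    shares = {}
    for column in columns:
        # A value counts as missing if the key is absent or holds a placeholder.
        missing = sum(1 for row in rows if row.get(column) in placeholders)
        shares[column] = 100.0 * missing / total if total else 0.0
    return shares


# Invented two-row sample using column names from the description.
sample = [
    {"Minor": None, "Gender": "Female"},
    {"Minor": "Mathematics", "Gender": "Male"},
]
shares = missing_share(sample)
```

If the file marks missing entries with a different placeholder, it can be passed through the `placeholders` argument.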

  17. United Kingdom Nominal Average Weekly Earnings: sa: Bonus Pay (BP): Whole...

    • ceicdata.com
    Updated Feb 15, 2025
    Cite
    CEICdata.com (2018). United Kingdom Nominal Average Weekly Earnings: sa: Bonus Pay (BP): Whole Economy [Dataset]. https://www.ceicdata.com/en/united-kingdom/average-weekly-earnings-seasonally-adjusted-sic-2007/nominal-average-weekly-earnings-sa-bonus-pay-bp-whole-economy
    Explore at:
    Dataset updated
    Feb 15, 2025
    Dataset provided by
    CEIC Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jul 1, 2017 - Jun 1, 2018
    Area covered
    United Kingdom
    Variables measured
    Wage/Earnings
    Description

    United Kingdom Nominal Average Weekly Earnings: sa: Bonus Pay (BP): Whole Economy data was reported at 31.581 GBP in Sep 2018. This records a decrease from the previous number of 31.594 GBP for Aug 2018. United Kingdom Nominal Average Weekly Earnings: sa: Bonus Pay (BP): Whole Economy data is updated monthly, averaging 26.253 GBP (median) from Jan 2000 to Sep 2018, with 225 observations. The data reached an all-time high of 40.836 GBP in Apr 2013 and a record low of 12.800 GBP in Feb 2000. United Kingdom Nominal Average Weekly Earnings: sa: Bonus Pay (BP): Whole Economy data remains in active status in CEIC and is reported by the Office for National Statistics. The data is categorized under Global Database’s United Kingdom – Table UK.G049: Average Weekly Earnings: Seasonally Adjusted: SIC 2007. Labour Force Estimates are shown for the mid-month of the three-month average time periods. For example, estimates for January to March 2012 are shown as 'February 2012', estimates for February to April 2012 are shown as 'March 2012', etc.

  18. Data from: Total Synthesis of Nominal ent-Chlorabietol B

    • acs.figshare.com
    txt
    Updated Jun 4, 2023
    Cite
    Yulong Li; Zhezhe Xu; Zhipeng Xie; Xingchao Guan; Zhixiang Xie (2023). Total Synthesis of Nominal ent-Chlorabietol B [Dataset]. http://doi.org/10.1021/acs.joc.0c00233.s002
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    ACS Publications
    Authors
    Yulong Li; Zhezhe Xu; Zhipeng Xie; Xingchao Guan; Zhixiang Xie
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    The nominal enantiomer of chlorabietol B was regio- and stereoselectively synthesized from (−)-abietic acid in 13 steps. Key features of the synthesis involved an oxidative [3+2] cycloaddition to install the dihydrobenzofuran moiety and an Aldol reaction, followed by elimination and reduction steps to introduce the long chain with three cis double bonds. However, obvious differences in the NMR spectra of the synthetic and natural samples suggested that the proposed structure of chlorabietol B should be revised carefully.

  19. Statistical Area 2 2025 Clipped

    • datafinder.stats.govt.nz
    csv, dwg, geodatabase +6
    Updated Dec 15, 2022
    Cite
    Stats NZ (2022). Statistical Area 2 2025 Clipped [Dataset]. https://datafinder.stats.govt.nz/layer/120969-statistical-area-2-2025-clipped/
    Explore at:
    Available download formats: pdf, csv, geopackage / sqlite, kml, geodatabase, mapinfo tab, dwg, mapinfo mif, shapefile
    Dataset updated
    Dec 15, 2022
    Dataset provided by
    Statistics New Zealand (http://www.stats.govt.nz/)
    Authors
    Stats NZ
    License

    https://datafinder.stats.govt.nz/license/attribution-4-0-international/

    Area covered
    Description

    Refer to the current geographies boundaries table for a list of all current geographies and recent updates.

    This dataset is the definitive version of the annually released statistical area 2 (SA2) boundaries as at 1 January 2025 as defined by Stats NZ, clipped to the coastline. This clipped version has been created for cartographic purposes and so does not fully represent the official full extent boundaries. This clipped version contains 2,311 SA2 areas.

    SA2 is an output geography that provides higher aggregations of population data than can be provided at the statistical area 1 (SA1) level. The SA2 geography aims to reflect communities that interact together socially and economically. In populated areas, SA2s generally contain similar sized populations.

    The SA2 should:

    form a contiguous cluster of one or more SA1s,

    excluding exceptions below, allow the release of multivariate statistics with minimal data suppression,

    capture a similar type of area, such as a high-density urban area, farmland, wilderness area, and water area,

    be socially homogeneous and capture a community of interest. It may have, for example:

    • a shared road network,

    • shared community facilities,

    • shared historical or social links, or

    • socio-economic similarity,

    form a nested hierarchy with statistical output geographies and administrative boundaries. It must:

    • be built from SA1s,

    • either define or aggregate to define SA3s, urban areas, territorial authorities, and regional councils.

    SA2s in city council areas generally have a population of 2,000–4,000 residents while SA2s in district council areas generally have a population of 1,000–3,000 residents.

    In major urban areas, an SA2 or a group of SA2s often approximates a single suburb. In rural areas, rural settlements are included in their respective SA2 with the surrounding rural area.

    SA2s in urban areas where there is significant business and industrial activity, for example ports, airports, industrial, commercial, and retail areas, often have fewer than 1,000 residents. These SA2s are useful for analysing business demographics, labour markets, and commuting patterns.

    In rural areas, some SA2s have fewer than 1,000 residents because they are in conservation areas or contain sparse populations that cover a large area.

    To minimise suppression of population data, small islands with zero or low populations close to the mainland, and marinas are generally included in their adjacent land-based SA2.

    Zero or nominal population SA2s

    To ensure that the SA2 geography covers all of New Zealand and aligns with New Zealand’s topography and local government boundaries, some SA2s have zero or nominal populations. These include:

    • SA2s where territorial authority boundaries straddle regional council boundaries. These SA2s each have fewer than 200 residents and are: Arahiwi, Tiroa, Rangataiki, Kaimanawa, Taharua, Te More, Ngamatea, Whangamomona, and Mara.

    • SA2s created for single islands or groups of islands that are some distance from the mainland or to separate large unpopulated islands from urban areas

    • SA2s that represent inland water, inlets or oceanic areas including: inland lakes larger than 50 square kilometres, harbours larger than 40 square kilometres, major ports, other non-contiguous inlets and harbours defined by territorial authority, and contiguous oceanic areas defined by regional council.

    • SA2s for non-digitised oceanic areas, offshore oil rigs, islands, and the Ross Dependency. Each SA2 is represented by a single meshblock. The following 16 SA2s are held in non-digitised form (SA2 code; SA2 name):

    400001; New Zealand Economic Zone, 400002; Oceanic Kermadec Islands, 400003; Kermadec Islands, 400004; Oceanic Oil Rig Taranaki, 400005; Oceanic Campbell Island, 400006; Campbell Island, 400007; Oceanic Oil Rig Southland, 400008; Oceanic Auckland Islands, 400009; Auckland Islands, 400010; Oceanic Bounty Islands, 400011; Bounty Islands, 400012; Oceanic Snares Islands, 400013; Snares Islands, 400014; Oceanic Antipodes Islands, 400015; Antipodes Islands, 400016; Ross Dependency.

    SA2 numbering and naming

    Each SA2 is a single geographic entity with a name and a numeric code. The name refers to a geographic feature or a recognised place name or suburb. In some instances where place names are the same or very similar, the SA2s are differentiated by their territorial authority name, for example, Gladstone (Carterton District) and Gladstone (Invercargill City).

    SA2 codes have six digits. North Island SA2 codes start with a 1 or 2, South Island SA2 codes start with a 3 and non-digitised SA2 codes start with a 4. They are numbered approximately north to south within their respective territorial authorities. To ensure the north–south code pattern is maintained, the SA2 codes were given 00 for the last two digits when the geography was created in 2018. When SA2 names or boundaries change only the last two digits of the code will change.
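
The leading-digit rule described above can be turned into a small lookup. The function name and the returned labels are hypothetical; only the six-digit length and the meaning of the first digit come from the text.

```python
def sa2_region(code):
    """Classify a six-digit SA2 code by its leading digit."""
    text = str(code)
    if len(text) != 6 or not text.isdigit():
        raise ValueError(f"SA2 codes have six digits, got: {code!r}")
    first = text[0]
    if first in "12":
        return "North Island"
    if first == "3":
        return "South Island"
    if first == "4":
        return "non-digitised"
    raise ValueError(f"unexpected leading digit in SA2 code: {code!r}")


# 400016 is the Ross Dependency code listed among the non-digitised SA2s.
region = sa2_region("400016")
```

Codes are handled as strings so that the leading digit is read directly without integer arithmetic.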

    Clipped Version

    This clipped version has been created for cartographic purposes and so does not fully represent the official full extent boundaries.

    High-definition version

    This high definition (HD) version is the most detailed geometry, suitable for use in GIS for geometric analysis operations and for the computation of areas, centroids and other metrics. The HD version is aligned to the LINZ cadastre.

    Macrons

    Names are provided with and without tohutō/macrons. The column name for those without macrons is suffixed ‘ascii’.

    Digital data

    Digital boundary data became freely available on 1 July 2007.

    Further information

    To download geographic classifications in table formats such as CSV please use Ariā

    For more information please refer to the Statistical standard for geographic areas 2023.

    Contact: geography@stats.govt.nz

  20. The estimated FDR of the four tests under the six models for 1,000...

    • plos.figshare.com
    xls
    Updated Jun 21, 2023
    + more versions
    Cite
    Xiaoyu Liang; Xuewei Cao; Qiuying Sha; Shuanglin Zhang (2023). The estimated FDR of the four tests under the six models for 1,000 phenotypes (K = 1,000). [Dataset]. http://doi.org/10.1371/journal.pone.0276646.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Xiaoyu Liang; Xuewei Cao; Qiuying Sha; Shuanglin Zhang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MAF is 0.3. The sample size (n) is 2,000. ρf = 0.2, ρe = 0.3, and c2 = 0.5. β is the effect size. FDR is evaluated using 200 replicated samples at a nominal FDR level of 5%. All estimated FDR are within the 95% confidence interval (0.0198, 0.0802).
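
The quoted interval (0.0198, 0.0802) is consistent with a normal-approximation binomial interval around the nominal 5% level with 200 replicates. The description does not say how the interval was derived, so the formula below is an assumption that happens to reproduce the stated bounds; the function name is hypothetical.

```python
import math


def nominal_level_interval(level, replicates, z=1.96):
    """Normal-approximation interval for a proportion estimated from `replicates` samples."""
    # Standard error of a binomial proportion at the nominal level.
    standard_error = math.sqrt(level * (1.0 - level) / replicates)
    return level - z * standard_error, level + z * standard_error


lower, upper = nominal_level_interval(0.05, 200)
print(round(lower, 4), round(upper, 4))  # 0.0198 0.0802
```

An estimated FDR falling inside this range is compatible with the test controlling FDR at the nominal level, given the Monte Carlo error from 200 replicated samples.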
