100+ datasets found
  1. d

    Political Analysis Using R: Example Code and Data, Plus Data for Practice...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Monogan, Jamie (2023). Political Analysis Using R: Example Code and Data, Plus Data for Practice Problems [Dataset]. http://doi.org/10.7910/DVN/ARKOTI
    Explore at:
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Monogan, Jamie
    Description

    Each R script replicates all of the example code from one chapter from the book. All required data for each script are also uploaded, as are all data used in the practice problems at the end of each chapter. The data are drawn from a wide array of sources, so please cite the original work if you ever use any of these data sets for research purposes.

  2. d

    Data from: PISA Data Analysis Manual: SPSS, Second Edition

    • catalog.data.gov
    Updated Mar 30, 2021
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    U.S. Department of State (2021). PISA Data Analysis Manual: SPSS, Second Edition [Dataset]. https://catalog.data.gov/dataset/pisa-data-analysis-manual-spss-second-edition
    Explore at:
    Dataset updated
    Mar 30, 2021
    Dataset provided by
    U.S. Department of State
    Description

    The OECD Programme for International Student Assessment (PISA) surveys collected data on students’ performances in reading, mathematics and science, as well as contextual information on students’ background, home characteristics and school factors which could influence performance. This publication includes detailed information on how to analyse the PISA data, enabling researchers to both reproduce the initial results and to undertake further analyses. In addition to the inclusion of the necessary techniques, the manual also includes a detailed account of the PISA 2006 database and worked examples providing full syntax in SPSS.

  3. Data Science Platform Market Analysis North America, Europe, APAC, South...

    • technavio.com
    Updated Feb 13, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Technavio (2025). Data Science Platform Market Analysis North America, Europe, APAC, South America, Middle East and Africa - US, Germany, China, Canada, UK, India, France, Japan, Brazil, UAE - Size and Forecast 2025-2029 [Dataset]. https://www.technavio.com/report/data-science-platform-market-industry-analysis
    Explore at:
    Dataset updated
    Feb 13, 2025
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2021 - 2025
    Area covered
    United Kingdom, United States, Global
    Description

    Snapshot img

    Data Science Platform Market Size 2025-2029

    The data science platform market size is forecast to increase by USD 763.9 million at a CAGR of 40.2% between 2024 and 2029.

    The market is experiencing significant growth, driven by the integration of artificial intelligence (AI) and machine learning (ML). This enhancement enables more advanced data analysis and prediction capabilities, making data science platforms an essential tool for businesses seeking to gain insights from their data. Another trend shaping the market is the emergence of containerization and microservices in platforms. This development offers increased flexibility and scalability, allowing organizations to efficiently manage their projects. 
    However, the use of platforms also presents challenges, particularly In the area of data privacy and security. Ensuring the protection of sensitive data is crucial for businesses, and platforms must provide strong security measures to mitigate risks. In summary, the market is witnessing substantial growth due to the integration of AI and ML technologies, containerization, and microservices, while data privacy and security remain key challenges.
    

    What will be the Size of the Data Science Platform Market During the Forecast Period?

    Request Free Sample

    The market is experiencing significant growth due to the increasing demand for advanced data analysis capabilities in various industries. Cloud-based solutions are gaining popularity as they offer scalability, flexibility, and cost savings. The market encompasses the entire project life cycle, from data acquisition and preparation to model development, training, and distribution. Big data, IoT, multimedia, machine data, consumer data, and business data are prime sources fueling this market's expansion. Unstructured data, previously challenging to process, is now being effectively managed through tools and software. Relational databases and machine learning models are integral components of platforms, enabling data exploration, preprocessing, and visualization.
    Moreover, Artificial intelligence (AI) and machine learning (ML) technologies are essential for handling complex workflows, including data cleaning, model development, and model distribution. Data scientists benefit from these platforms by streamlining their tasks, improving productivity, and ensuring accurate and efficient model training. The market is expected to continue its growth trajectory as businesses increasingly recognize the value of data-driven insights.
    

    How is this Data Science Platform Industry segmented and which is the largest segment?

    The industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

    Deployment
    
      On-premises
      Cloud
    
    
    Component
    
      Platform
      Services
    
    
    End-user
    
      BFSI
      Retail and e-commerce
      Manufacturing
      Media and entertainment
      Others
    
    
    Sector
    
      Large enterprises
      SMEs
    
    
    Geography
    
      North America
    
        Canada
        US
    
    
      Europe
    
        Germany
        UK
        France
    
    
      APAC
    
        China
        India
        Japan
    
    
      South America
    
        Brazil
    
    
      Middle East and Africa
    

    By Deployment Insights

    The on-premises segment is estimated to witness significant growth during the forecast period.
    

    On-premises deployment is a traditional method for implementing technology solutions within an organization. This approach involves purchasing software with a one-time license fee and a service contract. On-premises solutions offer enhanced security, as they keep user credentials and data within the company's premises. They can be customized to meet specific business requirements, allowing for quick adaptation. On-premises deployment eliminates the need for third-party providers to manage and secure data, ensuring data privacy and confidentiality. Additionally, it enables rapid and easy data access, and keeps IP addresses and data confidential. This deployment model is particularly beneficial for businesses dealing with sensitive data, such as those in manufacturing and large enterprises. While cloud-based solutions offer flexibility and cost savings, on-premises deployment remains a popular choice for organizations prioritizing data security and control.

    Get a glance at the Data Science Platform Industry report of share of various segments. Request Free Sample

    The on-premises segment was valued at USD 38.70 million in 2019 and showed a gradual increase during the forecast period.

    Regional Analysis

    North America is estimated to contribute 48% to the growth of the global market during the forecast period.
    

    Technavio's analysts have elaborately explained the regional trends and drivers that shape the market during the forecast period.

    For more insights on the market share of various regions, Request F

  4. f

    Data from: pmartR: Quality Control and Statistics for Mass...

    • acs.figshare.com
    • figshare.com
    xlsx
    Updated May 31, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Kelly G. Stratton; Bobbie-Jo M. Webb-Robertson; Lee Ann McCue; Bryan Stanfill; Daniel Claborne; Iobani Godinez; Thomas Johansen; Allison M. Thompson; Kristin E. Burnum-Johnson; Katrina M. Waters; Lisa M. Bramer (2023). pmartR: Quality Control and Statistics for Mass Spectrometry-Based Biological Data [Dataset]. http://doi.org/10.1021/acs.jproteome.8b00760.s001
    Explore at:
    xlsxAvailable download formats
    Dataset updated
    May 31, 2023
    Dataset provided by
    ACS Publications
    Authors
    Kelly G. Stratton; Bobbie-Jo M. Webb-Robertson; Lee Ann McCue; Bryan Stanfill; Daniel Claborne; Iobani Godinez; Thomas Johansen; Allison M. Thompson; Kristin E. Burnum-Johnson; Katrina M. Waters; Lisa M. Bramer
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Prior to statistical analysis of mass spectrometry (MS) data, quality control (QC) of the identified biomolecule peak intensities is imperative for reducing process-based sources of variation and extreme biological outliers. Without this step, statistical results can be biased. Additionally, liquid chromatography–MS proteomics data present inherent challenges due to large amounts of missing data that require special consideration during statistical analysis. While a number of R packages exist to address these challenges individually, there is no single R package that addresses all of them. We present pmartR, an open-source R package, for QC (filtering and normalization), exploratory data analysis (EDA), visualization, and statistical analysis robust to missing data. Example analysis using proteomics data from a mouse study comparing smoke exposure to control demonstrates the core functionality of the package and highlights the capabilities for handling missing data. In particular, using a combined quantitative and qualitative statistical test, 19 proteins whose statistical significance would have been missed by a quantitative test alone were identified. The pmartR package provides a single software tool for QC, EDA, and statistical comparisons of MS data that is robust to missing data and includes numerous visualization capabilities.

  5. Company Datasets for Business Profiling

    • datarade.ai
    Updated Feb 23, 2017
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Oxylabs (2017). Company Datasets for Business Profiling [Dataset]. https://datarade.ai/data-products/company-datasets-for-business-profiling-oxylabs
    Explore at:
    .json, .xml, .csv, .xlsAvailable download formats
    Dataset updated
    Feb 23, 2017
    Dataset authored and provided by
    Oxylabs
    Area covered
    Isle of Man, Tunisia, Taiwan, Canada, Bangladesh, Northern Mariana Islands, Andorra, British Indian Ocean Territory, Nepal, Moldova (Republic of)
    Description

    Company Datasets for valuable business insights!

    Discover new business prospects, identify investment opportunities, track competitor performance, and streamline your sales efforts with comprehensive Company Datasets.

    These datasets are sourced from top industry providers, ensuring you have access to high-quality information:

    • Owler: Gain valuable business insights and competitive intelligence. -AngelList: Receive fresh startup data transformed into actionable insights. -CrunchBase: Access clean, parsed, and ready-to-use business data from private and public companies. -Craft.co: Make data-informed business decisions with Craft.co's company datasets. -Product Hunt: Harness the Product Hunt dataset, a leader in curating the best new products.

    We provide fresh and ready-to-use company data, eliminating the need for complex scraping and parsing. Our data includes crucial details such as:

    • Company name;
    • Size;
    • Founding date;
    • Location;
    • Industry;
    • Revenue;
    • Employee count;
    • Competitors.

    You can choose your preferred data delivery method, including various storage options, delivery frequency, and input/output formats.

    Receive datasets in CSV, JSON, and other formats, with storage options like AWS S3 and Google Cloud Storage. Opt for one-time, monthly, quarterly, or bi-annual data delivery.

    With Oxylabs Datasets, you can count on:

    • Fresh and accurate data collected and parsed by our expert web scraping team.
    • Time and resource savings, allowing you to focus on data analysis and achieving your business goals.
    • A customized approach tailored to your specific business needs.
    • Legal compliance in line with GDPR and CCPA standards, thanks to our membership in the Ethical Web Data Collection Initiative.

    Pricing Options:

    Standard Datasets: choose from various ready-to-use datasets with standardized data schemas, priced from $1,000/month.

    Custom Datasets: Tailor datasets from any public web domain to your unique business needs. Contact our sales team for custom pricing.

    Experience a seamless journey with Oxylabs:

    • Understanding your data needs: We work closely to understand your business nature and daily operations, defining your unique data requirements.
    • Developing a customized solution: Our experts create a custom framework to extract public data using our in-house web scraping infrastructure.
    • Delivering data sample: We provide a sample for your feedback on data quality and the entire delivery process.
    • Continuous data delivery: We continuously collect public data and deliver custom datasets per the agreed frequency.

    Unlock the power of data with Oxylabs' Company Datasets and supercharge your business insights today!

  6. Google Analytics Sample

    • kaggle.com
    zip
    Updated Sep 19, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Google BigQuery (2019). Google Analytics Sample [Dataset]. https://www.kaggle.com/datasets/bigquery/google-analytics-sample
    Explore at:
    zip(0 bytes)Available download formats
    Dataset updated
    Sep 19, 2019
    Dataset provided by
    BigQueryhttps://cloud.google.com/bigquery
    Googlehttp://google.com/
    Authors
    Google BigQuery
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    The Google Merchandise Store sells Google branded merchandise. The data is typical of what you would see for an ecommerce website.

    Content

    The sample dataset contains Google Analytics 360 data from the Google Merchandise Store, a real ecommerce store. The Google Merchandise Store sells Google branded merchandise. The data is typical of what you would see for an ecommerce website. It includes the following kinds of information:

    Traffic source data: information about where website visitors originate. This includes data about organic traffic, paid search traffic, display traffic, etc. Content data: information about the behavior of users on the site. This includes the URLs of pages that visitors look at, how they interact with content, etc. Transactional data: information about the transactions that occur on the Google Merchandise Store website.

    Fork this kernel to get started.

    Acknowledgements

    Data from: https://bigquery.cloud.google.com/table/bigquery-public-data:google_analytics_sample.ga_sessions_20170801

    Banner Photo by Edho Pratama from Unsplash.

    Inspiration

    What is the total number of transactions generated per device browser in July 2017?

    The real bounce rate is defined as the percentage of visits with a single pageview. What was the real bounce rate per traffic source?

    What was the average number of product pageviews for users who made a purchase in July 2017?

    What was the average number of product pageviews for users who did not make a purchase in July 2017?

    What was the average total transactions per user that made a purchase in July 2017?

    What is the average amount of money spent per session in July 2017?

    What is the sequence of pages viewed?

  7. f

    Comparison of the Sample Included in Data Analysis and the Sample Excluded...

    • figshare.com
    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Margie E. Lachman; Stefan Agrigoroaei (2023). Comparison of the Sample Included in Data Analysis and the Sample Excluded from Data Analysis. [Dataset]. http://doi.org/10.1371/journal.pone.0013297.t001
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Margie E. Lachman; Stefan Agrigoroaei
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    p values for means are derived from independent samples t-tests.p values for percentages are derived from χ2 tests.

  8. Example data for working with the ASpecD framework

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jul 16, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Till Biskup; Till Biskup (2023). Example data for working with the ASpecD framework [Dataset]. http://doi.org/10.5281/zenodo.8150115
    Explore at:
    zipAvailable download formats
    Dataset updated
    Jul 16, 2023
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Till Biskup; Till Biskup
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ASpecD is a Python framework for handling spectroscopic data focussing on reproducibility. In short: Each and every processing step applied to your data will be recorded and can be traced back. Additionally, for each representation of your data (e.g., figures, tables) you can easily follow how the data shown have been processed and where they originate from.

    To provide readers of the publication describing the ASpecD framework with a concrete example of data analysis making use of recipe-driven data analysis, this repository contains both, a recipe as well as the data that are analysed, as shown in the publication describing the ASpecD framework:

    • Jara Popp, Till Biskup: ASpecD: A Modular Framework for the Analysis of Spectroscopic Data Focussing on Reproducibility and Good Scientific Practice. Chemistry--Methods 2:e202100097, 2022. doi:10.1002/cmtd.202100097
  9. Starfleet Headache Treatment - Example Data for Repeated ANOVA

    • figshare.com
    txt
    Updated Jan 28, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Jesus Rogel-Salazar (2022). Starfleet Headache Treatment - Example Data for Repeated ANOVA [Dataset]. http://doi.org/10.6084/m9.figshare.19089896.v1
    Explore at:
    txtAvailable download formats
    Dataset updated
    Jan 28, 2022
    Dataset provided by
    Figsharehttp://figshare.com/
    figshare
    Authors
    Jesus Rogel-Salazar
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Fictitious data to look at the overall health of Starfleet volunteers on four different drugs. Since each volunteer is measured on each of the four drugs, we propose to use this datasets to look at repeater measures ANOVA to determine if the mean health scores differs between drugs.

  10. U

    Research data supporting “Hybrid Sankey diagrams: visual analysis of...

    • researchdata.bath.ac.uk
    • repository.cam.ac.uk
    Updated Oct 24, 2016
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Rick Lupton; Julian M. Allwood (2016). Research data supporting “Hybrid Sankey diagrams: visual analysis of multidimensional data for understanding resource use” [Dataset]. http://doi.org/10.17863/CAM.6038
    Explore at:
    Dataset updated
    Oct 24, 2016
    Dataset provided by
    University of Bath
    University of Cambridge
    Authors
    Rick Lupton; Julian M. Allwood
    Dataset funded by
    Engineering and Physical Sciences Research Council
    Description

    This data includes two example databases from the paper "Hybrid Sankey diagrams: visual analysis of multidimensional data for understanding resource use": the made-up fruit flows, and real global steel flow data from Cullen et al. (2012). It also includes the Sankey Diagram Definitions to reproduce the diagrams in the paper. The code to reproduce the figures is written in Python in the form of Jupyter notebooks. A conda environment file is included to easily set up the necessary Python packages to run the notebooks. All files are included in the "examples.zip" file. The notebook files are also uploaded standalone so they can be linked to nbviewer.

  11. z

    Missing data in the analysis of multilevel and dependent data (Example data...

    • zenodo.org
    bin
    Updated Jul 20, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Simon Grund; Simon Grund; Oliver Lüdtke; Oliver Lüdtke; Alexander Robitzsch; Alexander Robitzsch (2023). Missing data in the analysis of multilevel and dependent data (Example data sets) [Dataset]. http://doi.org/10.5281/zenodo.7773614
    Explore at:
    binAvailable download formats
    Dataset updated
    Jul 20, 2023
    Dataset provided by
    Springer
    Authors
    Simon Grund; Simon Grund; Oliver Lüdtke; Oliver Lüdtke; Alexander Robitzsch; Alexander Robitzsch
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Example data sets for the book chapter titled "Missing Data in the Analysis of Multilevel and Dependent Data" submitted for publication in the second edition of "Dependent Data in Social Science Research" (Stemmler et al., 2015). This repository includes the data sets used in both example analyses (Examples 1 and 2) in two file formats (binary ".rda" for use in R; plain-text ".dat").

    The data sets contain simulated data from 23,376 (Example 1) and 23,072 (Example 2) individuals from 2,000 groups on four variables:

    ID = group identifier (1-2000)
    x = numeric (Level 1)
    y = numeric (Level 1)
    w = binary (Level 2)

    In all data sets, missing values are coded as "NA".

  12. m

    Raw data outputs 1-18

    • bridges.monash.edu
    • researchdata.edu.au
    xlsx
    Updated May 30, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Abbas Salavaty Hosein Abadi; Sara Alaei; Mirana Ramialison; Peter Currie (2023). Raw data outputs 1-18 [Dataset]. http://doi.org/10.26180/21259491.v1
    Explore at:
    xlsxAvailable download formats
    Dataset updated
    May 30, 2023
    Dataset provided by
    Monash University
    Authors
    Abbas Salavaty Hosein Abadi; Sara Alaei; Mirana Ramialison; Peter Currie
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Raw data outputs 1-18 Raw data output 1. Differentially expressed genes in AML CSCs compared with GTCs as well as in TCGA AML cancer samples compared with normal ones. This data was generated based on the results of AML microarray and TCGA data analysis. Raw data output 2. Commonly and uniquely differentially expressed genes in AML CSC/GTC microarray and TCGA bulk RNA-seq datasets. This data was generated based on the results of AML microarray and TCGA data analysis. Raw data output 3. Common differentially expressed genes between training and test set samples the microarray dataset. This data was generated based on the results of AML microarray data analysis. Raw data output 4. Detailed information on the samples of the breast cancer microarray dataset (GSE52327) used in this study. Raw data output 5. Differentially expressed genes in breast CSCs compared with GTCs as well as in TCGA BRCA cancer samples compared with normal ones. Raw data output 6. Commonly and uniquely differentially expressed genes in breast cancer CSC/GTC microarray and TCGA BRCA bulk RNA-seq datasets. This data was generated based on the results of breast cancer microarray and TCGA BRCA data analysis. CSC, and GTC are abbreviations of cancer stem cell, and general tumor cell, respectively. Raw data output 7. Differential and common co-expression and protein-protein interaction of genes between CSC and GTC samples. This data was generated based on the results of AML microarray and STRING database-based protein-protein interaction data analysis. CSC, and GTC are abbreviations of cancer stem cell, and general tumor cell, respectively. Raw data output 8. Differentially expressed genes between AML dormant and active CSCs. This data was generated based on the results of AML scRNA-seq data analysis. Raw data output 9. Uniquely expressed genes in dormant or active AML CSCs. This data was generated based on the results of AML scRNA-seq data analysis. Raw data output 10. Intersections between the targeting transcription factors of AML key CSC genes and differentially expressed genes between AML CSCs vs GTCs and between dormant and active AML CSCs or the uniquely expressed genes in either class of CSCs. Raw data output 11. Targeting desirableness score of AML key CSC genes and their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section. Raw data output 12. CSC-specific targeting desirableness score of AML key CSC genes and their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section. Raw data output 13. The protein-protein interactions between AML key CSC genes with themselves and their targeting transcription factors. This data was generated based on the results of AML microarray and STRING database-based protein-protein interaction data analysis. Raw data output 14. The previously confirmed associations of genes having the highest targeting desirableness and CSC-specific targeting desirableness scores with AML or other cancers’ (stem) cells as well as hematopoietic stem cells. These data were generated based on a PubMed database-based literature mining. Raw data output 15. Drug score of available drugs and bioactive small molecules targeting AML key CSC genes and/or their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section. Raw data output 16. CSC-specific drug score of available drugs and bioactive small molecules targeting AML key CSC genes and/or their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section. Raw data output 17. Candidate drugs for experimental validation. These drugs were selected based on their respective (CSC-specific) drug scores. CSC is the abbreviation of cancer stem cell. Raw data output 18. Detailed information on the samples of the AML microarray dataset GSE30375 used in this study.

  13. Dataset for Exploring case-control samples with non-targeted analysis

    • catalog.data.gov
    • datasets.ai
    Updated Nov 12, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Dataset for Exploring case-control samples with non-targeted analysis [Dataset]. https://catalog.data.gov/dataset/dataset-for-exploring-case-control-samples-with-non-targeted-analysis
    Explore at:
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agencyhttp://www.epa.gov/
    Description

    These data contain the results of GC-MS, LC-MS and immunochemistry analyses of mask sample extracts. The data include tentatively identified compounds through library searches and compound abundance. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: The data can not be accessed. Format: The dataset contains the identification of compounds found in the mask samples as well as the abundance of those compounds for individuals who participated in the trial. This dataset is associated with the following publication: Pleil, J., M. Wallace, J. McCord, M. Madden, J. Sobus, and G. Ferguson. How do cancer-sniffing dogs sort biological samples? Exploring case-control samples with non-targeted LC-Orbitrap, GC-MS, and immunochemistry methods. Journal of Breath Research. Institute of Physics Publishing, Bristol, UK, 14(1): 016006, (2019).

  14. d

    Grain-Size and Data Analysis Results from Sediment Samples Collected at...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Jul 6, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    U.S. Geological Survey (2024). Grain-Size and Data Analysis Results from Sediment Samples Collected at Crocker Reef, Florida, Between 2017 and 2019 [Dataset]. https://catalog.data.gov/dataset/grain-size-and-data-analysis-results-from-sediment-samples-collected-at-crocker-reef-flori
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    U.S. Geological Survey
    Area covered
    Crocker Reef, Florida
    Description

    Sediment samples were collected from undisturbed sections of the seafloor around Crocker Reef, Florida. Crocker Reef is a barrier reef located in the northern portion of the Florida Reef Tract that has been classified by Kellogg and others (2015) as a senile or dead reef consisting of areas of sand and rubble with only scattered stony coral colonies. Samples were collected from November 2017 to April 2019 to help ground truth coincident instrumentation deployed during the same time interval, which was used to record various oceanic (currents, waves, turbidity, and pressure) time series datasets that would be used in subsequent analyses. All sediment samples were analyzed using a laser diffraction Coulter LS13 320 particle-size analyzer and sieves to measure the grain-size distribution of the sediments.

  15. d

    Data from the Chemical Analysis of Archived Stream-Sediment Samples, Alaska

    • catalog.data.gov
    • data.usgs.gov
    • +3more
    Updated Jul 6, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    U.S. Geological Survey (2024). Data from the Chemical Analysis of Archived Stream-Sediment Samples, Alaska [Dataset]. https://catalog.data.gov/dataset/data-from-the-chemical-analysis-of-archived-stream-sediment-samples-alaska
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Surveyhttp://www.usgs.gov/
    Area covered
    Alaska
    Description

    This data release contains the elemental concentration data for more than 1700 archived stream-sediment samples collected in Alaska. Samples were retrieved from the USGS Mineral Program's sample archive in Denver, CO, and the Alaska Division of Geological and Geophysical Surveys Geologic Materials Center in Anchorage, AK. All samples were analyzed using a multi-element analytical method involving fusion of the sample by sodium peroxide, dissolution of the fusion cake by nitric acid, and elemental analysis by inductively coupled plasma-optical emission spectroscopy (ICP-OES) and inductively coupled plasma-mass spectroscopy (ICP-MS). Additionally, 106 samples from the Nixon Fork area were analyzed by a second multi-element method in which the samples are decomposed by a mixture of hydrochloric, nitric, perchloric, and hydrofluoric acids and the elemental composition is determined by ICP-OES and ICP-MS. New Hg (mercury) concentrations, determined by cold-vapor atomic absorption spectrometry, are reported for 296 samples from southeast Alaska.

  16. Household Energy Survey, July 2013 - West Bank and Gaza

    • pcbs.gov.ps
    Updated Aug 31, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Palestinian Central Bureau of Statistics (2020). Household Energy Survey, July 2013 - West Bank and Gaza [Dataset]. https://www.pcbs.gov.ps/PCBS-Metadata-en-v5.2/index.php/catalog/573
    Explore at:
    Dataset updated
    Aug 31, 2020
    Dataset authored and provided by
    Palestinian Central Bureau of Statisticshttp://pcbs.gov.ps/
    Time period covered
    2013
    Area covered
    Gaza Strip, West Bank, Gaza
    Description

    Abstract

    Because of the importance of the household sector and due to it's large contribution to energy consumption in the Palestinian Territory, PCBS decided to conduct a special household energy survey to cover energy indicators in the household sector. To achieve this, a questionnaire was attached to the Labor Force Survey.

    This survey aimed to provide data on energy consumption in the household sector and to provide data on energy consumption behavior in the society by type of energy.

    This report presents data on various energy households indicators in the Palestinian Territory, and presents statistical data on electricity and other fuel consumption for the household sector, using type of fuel by different activities (cooking, Baking, conditioning, lighting, and water Heating).

    Geographic coverage

    Palestine.

    Analysis unit

    Households

    Universe

    The target population was all Palestinian households living in the Palestine.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    Sample Frame The sampling frame consists of all the enumeration areas enumerated in 2007: each enumeration area consists of buildings and housing units with an average of around 124 households. These enumeration areas are used as primary sampling units (PSUs) in the first stage of the sampling selection.

    Sample size The estimated sample size is 3,184 households.

    Sampling Design: The sample of this survey is a part of the main sample of the Labor Force Survey (LFS), which is implemented quarterly (distributed over 13 weeks) by PCBS since 1995. This survey was attached to the LFS in the third quarter of 2013 and the sample comprised six weeks, from the eighth week to the thirteen week of the third round of the Labor Force Survey of 2013. The sample is two-stage stratified cluster sample:

    First stage: selection of a stratified systematic random sample of 206 enumeration areas for the semi-round.

    Second stage: selection of a random area sample of an average of 16 households from each enumeration area selected in the first stage.

    Sample strata The population was divided by: 1. Governorate (16 governorates) 2. Type of locality (urban, rural, refugee camps)

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The design of the questionnaire for the Household Energy Survey was based on the experiences of similar countries as well as on international standards and recommendations for the most important indicators, taking into account the special situation of the Palestinian Territory.

    Cleaning operations

    The data processing stage consisted of the following operations: Editing and coding prior to data entry: all questionnaires were edited and coded in the office using the same instructions adopted for editing in the field.

    Data entry: The household energy survey questionnaire was programmed onto handheld devices and data were entered directly using these devices in the West Bank. With regard to Jerusalem J1 and the Gaza Strip, data were entered into the computer in the offices in Ramallah and Gaza. At this stage, data were entered into the computer using a data entry template developed in Access. The data entry program was prepared to satisfy a number of requirements: · To prevent the duplication of questionnaires during data entry. · To apply checks on the integrity and consistency of entered data. · To handle errors in a user friendly manner. · The ability to transfer captured data to another format for data analysis using statistical analysis software such as SPSS.

    Response rate

    During fieldwork 3,184 families were visited in the Palestinian Territory, There is 2,692 complete questioner. , this percent was about 85%.

    Sampling error estimates

    Sampling Errors Data of this survey may be affected by sampling errors due to use of a sample and not a complete enumeration. Therefore, certain differences are anticipated in comparison with the real values obtained through censuses. The variance was calculated for the most important indicators: the variance table is attached with the final report. There is no problem in the dissemination of results at national and regional level (North, Middle, South of West Bank, Gaza Strip) and by locality. However, the indicator of averages of household consumption for certain fuels by region show a high variance.

    Non Sampling Errors The implementation of the survey encountered non-response where the household was not present at home during the field work visit and where the housing unit was vacant: these made up a high percentage of the non-response cases. The total non-response rate was 10.8%, which is very low when compared to the household surveys conducted by PCBS. The refusal rate was 3.3%, which is very low compared to the household surveys conducted by PCBS and may be attributed to the short and clear questionnaire.

    The survey sample consisted of around 3,184 households, of which 2,692 households completed the interview: 1,757 households from the West Bank and 935 households in the Gaza Strip. Weights were modified to account for the non-response rate. The response rate in the West Bank was 86.8 % while in the Gaza Strip it was 94.3%.

    Non-Response Cases

    No. of cases non-response cases
    2,692 Household completed 35 Household traveling 17 Unit does not exist 111 No one at home
    102 Refused to cooperate
    152 Vacant housing unit 5 No available information
    70 Other
    3,184 Total sample size

    Response and non-response formulas:

    Percentage of over coverage errors = Total cases of over coverage x 100% Number of cases in original sample = 5.3%

    Non response rate = Total cases of non response x 100% Net Sample size = 10.8%

    Net sample = Original sample - cases of over coverage Response rate = 100% - non-response rate = 89.2%

    Treatment of non-response cases using weight adjustment

    Where
    the primary weight before adjustment for the household i g: adjustment group by ( governorate, locality type ). fg: weight adjustment factor for the group g. : Total weights in group g
    cases : Total weights of over coverage : Total weights of response cases

    We calculate fg for each group ,and final we obtain the final household weight () by using the following formula:

    Comparability The data of the survey are comparable geographically and over time by comparing data from different geographical areas to data of previous surveys and the 2007 census.

    Data quality assurance procedures Several procedures were undertaken to ensure appropriate quality control in the survey. Field workers were trained on the main skills prior to data collection, field visits were conducted to field workers to ensure the integrity of data collection, editing of questionnaires took place prior to data entry and a data entry application was used that prevents errors during the data entry process, then the data were reviewed. This was done to ensure that data were error free, while cleaning and inspection of anomalous values were carried out to ensure harmony between the different questions on the questionnaire.

    Technical notes The following are important technical notes on the indicators presented in the results of the survey: · Some households were not present in their houses and could not be seen by interviewers. · Some households were not accurate in answering the questions in the questionnaire.
    · Some errors occurred due to the way the questions were asked by interviewers. · Misunderstanding of the questions by the respondents. · Answering questions related to consumption based on estimations. · In all calculations related to gasoline, the average of all available types of gasoline was used. · In this survey, data were collected about the consumption of olive cake and coal in households, but due to lack of relevant data and fairly high variance, the data were grouped with others in the statistical tables. · The increase in consumption of electricity and the decrease in the consumption of the other types of fuel in the Gaza Strip reflected the Israeli siege imposed on the territory.

    Data appraisal

    The data of the survey is comparable geographically and over time by comparing the data between different geographical areas to data of previous surveys.

  17. Envestnet | Yodlee's De-Identified Bank Statement Data | Row/Aggregate Level...

    • datarade.ai
    .sql, .txt
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Envestnet | Yodlee, Envestnet | Yodlee's De-Identified Bank Statement Data | Row/Aggregate Level | USA Consumer Data covering 3600+ corporations | 90M+ Accounts [Dataset]. https://datarade.ai/data-products/envestnet-yodlee-s-de-identified-bank-statement-data-row-envestnet-yodlee
    Explore at:
    .sql, .txtAvailable download formats
    Dataset provided by
    Yodlee
    Envestnethttp://envestnet.com/
    Authors
    Envestnet | Yodlee
    Area covered
    United States of America
    Description

    Envestnet®| Yodlee®'s Bank Statement Data (Aggregate/Row) Panels consist of de-identified, near-real time (T+1) USA credit/debit/ACH transaction level data – offering a wide view of the consumer activity ecosystem. The underlying data is sourced from end users leveraging the aggregation portion of the Envestnet®| Yodlee®'s financial technology platform.

    Envestnet | Yodlee Consumer Panels (Aggregate/Row) include data relating to millions of transactions, including ticket size and merchant location. The dataset includes de-identified credit/debit card and bank transactions (such as a payroll deposit, account transfer, or mortgage payment). Our coverage offers insights into areas such as consumer, TMT, energy, REITs, internet, utilities, ecommerce, MBS, CMBS, equities, credit, commodities, FX, and corporate activity. We apply rigorous data science practices to deliver key KPIs daily that are focused, relevant, and ready to put into production.

    We offer free trials. Our team is available to provide support for loading, validation, sample scripts, or other services you may need to generate insights from our data.

    Investors, corporate researchers, and corporates can use our data to answer some key business questions such as: - How much are consumers spending with specific merchants/brands and how is that changing over time? - Is the share of consumer spend at a specific merchant increasing or decreasing? - How are consumers reacting to new products or services launched by merchants? - For loyal customers, how is the share of spend changing over time? - What is the company’s market share in a region for similar customers? - Is the company’s loyal user base increasing or decreasing? - Is the lifetime customer value increasing or decreasing?

    Additional Use Cases: - Use spending data to analyze sales/revenue broadly (sector-wide) or granular (company-specific). Historically, our tracked consumer spend has correlated above 85% with company-reported data from thousands of firms. Users can sort and filter by many metrics and KPIs, such as sales and transaction growth rates and online or offline transactions, as well as view customer behavior within a geographic market at a state or city level. - Reveal cohort consumer behavior to decipher long-term behavioral consumer spending shifts. Measure market share, wallet share, loyalty, consumer lifetime value, retention, demographics, and more.) - Study the effects of inflation rates via such metrics as increased total spend, ticket size, and number of transactions. - Seek out alpha-generating signals or manage your business strategically with essential, aggregated transaction and spending data analytics.

    Use Cases Categories (Our data provides an innumerable amount of use cases, and we look forward to working with new ones): 1. Market Research: Company Analysis, Company Valuation, Competitive Intelligence, Competitor Analysis, Competitor Analytics, Competitor Insights, Customer Data Enrichment, Customer Data Insights, Customer Data Intelligence, Demand Forecasting, Ecommerce Intelligence, Employee Pay Strategy, Employment Analytics, Job Income Analysis, Job Market Pricing, Marketing, Marketing Data Enrichment, Marketing Intelligence, Marketing Strategy, Payment History Analytics, Price Analysis, Pricing Analytics, Retail, Retail Analytics, Retail Intelligence, Retail POS Data Analysis, and Salary Benchmarking

    1. Investment Research: Financial Services, Hedge Funds, Investing, Mergers & Acquisitions (M&A), Stock Picking, Venture Capital (VC)

    2. Consumer Analysis: Consumer Data Enrichment, Consumer Intelligence

    3. Market Data: AnalyticsB2C Data Enrichment, Bank Data Enrichment, Behavioral Analytics, Benchmarking, Customer Insights, Customer Intelligence, Data Enhancement, Data Enrichment, Data Intelligence, Data Modeling, Ecommerce Analysis, Ecommerce Data Enrichment, Economic Analysis, Financial Data Enrichment, Financial Intelligence, Local Economic Forecasting, Location-based Analytics, Market Analysis, Market Analytics, Market Intelligence, Market Potential Analysis, Market Research, Market Share Analysis, Sales, Sales Data Enrichment, Sales Enablement, Sales Insights, Sales Intelligence, Spending Analytics, Stock Market Predictions, and Trend Analysis

  18. H

    Advancing Open and Reproducible Water Data Science by Integrating Data...

    • hydroshare.org
    • beta.hydroshare.org
    • +1more
    zip
    Updated Jan 9, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Advancing Open and Reproducible Water Data Science by Integrating Data Analytics with an Online Data Repository [Dataset]. https://www.hydroshare.org/resource/45d3427e794543cfbee129c604d7e865
    Explore at:
    zip(50.9 MB)Available download formats
    Dataset updated
    Jan 9, 2024
    Dataset provided by
    HydroShare
    Authors
    Jeffery S. Horsburgh
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Scientific and related management challenges in the water domain require synthesis of data from multiple domains. Many data analysis tasks are difficult because datasets are large and complex; standard formats for data types are not always agreed upon nor mapped to an efficient structure for analysis; water scientists may lack training in methods needed to efficiently tackle large and complex datasets; and available tools can make it difficult to share, collaborate around, and reproduce scientific work. Overcoming these barriers to accessing, organizing, and preparing datasets for analyses will be an enabler for transforming scientific inquiries. Building on the HydroShare repository’s established cyberinfrastructure, we have advanced two packages for the Python language that make data loading, organization, and curation for analysis easier, reducing time spent in choosing appropriate data structures and writing code to ingest data. These packages enable automated retrieval of data from HydroShare and the USGS’s National Water Information System (NWIS), loading of data into performant structures keyed to specific scientific data types and that integrate with existing visualization, analysis, and data science capabilities available in Python, and then writing analysis results back to HydroShare for sharing and eventual publication. These capabilities reduce the technical burden for scientists associated with creating a computational environment for executing analyses by installing and maintaining the packages within CUAHSI’s HydroShare-linked JupyterHub server. HydroShare users can leverage these tools to build, share, and publish more reproducible scientific workflows. The HydroShare Python Client and USGS NWIS Data Retrieval packages can be installed within a Python environment on any computer running Microsoft Windows, Apple MacOS, or Linux from the Python Package Index using the PIP utility. They can also be used online via the CUAHSI JupyterHub server (https://jupyterhub.cuahsi.org/) or other Python notebook environments like Google Collaboratory (https://colab.research.google.com/). Source code, documentation, and examples for the software are freely available in GitHub at https://github.com/hydroshare/hsclient/ and https://github.com/USGS-python/dataretrieval.

    This presentation was delivered as part of the Hawai'i Data Science Institute's regular seminar series: https://datascience.hawaii.edu/event/data-science-and-analytics-for-water/

  19. Big data and business analytics revenue worldwide 2015-2022

    • statista.com
    Updated Nov 22, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2023). Big data and business analytics revenue worldwide 2015-2022 [Dataset]. https://www.statista.com/statistics/551501/worldwide-big-data-business-analytics-revenue/
    Explore at:
    Dataset updated
    Nov 22, 2023
    Dataset authored and provided by
    Statistahttp://statista.com/
    Area covered
    Worldwide
    Description

    The global big data and business analytics (BDA) market was valued at 168.8 billion U.S. dollars in 2018 and is forecast to grow to 215.7 billion U.S. dollars by 2021. In 2021, more than half of BDA spending will go towards services. IT services is projected to make up around 85 billion U.S. dollars, and business services will account for the remainder. Big data High volume, high velocity and high variety: one or more of these characteristics is used to define big data, the kind of data sets that are too large or too complex for traditional data processing applications. Fast-growing mobile data traffic, cloud computing traffic, as well as the rapid development of technologies such as artificial intelligence (AI) and the Internet of Things (IoT) all contribute to the increasing volume and complexity of data sets. For example, connected IoT devices are projected to generate 79.4 ZBs of data in 2025. Business analytics Advanced analytics tools, such as predictive analytics and data mining, help to extract value from the data and generate business insights. The size of the business intelligence and analytics software application market is forecast to reach around 16.5 billion U.S. dollars in 2022. Growth in this market is driven by a focus on digital transformation, a demand for data visualization dashboards, and an increased adoption of cloud.

  20. r

    Groundwater and surface water sample locations included in hydrochemical...

    • researchdata.edu.au
    Updated May 20, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bioregional Assessment Program (2020). Groundwater and surface water sample locations included in hydrochemical cluster analysis [Dataset]. https://researchdata.edu.au/groundwater-surface-water-cluster-analysis/2980315
    Explore at:
    Dataset updated
    May 20, 2020
    Dataset provided by
    data.gov.au
    Authors
    Bioregional Assessment Program
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    Dataset combines multiple parameters measured in groundwater and surface water samples and interpreted using Hierarchical Cluster Analysis. This dataset combines groundwater and surface water hydrochemical (major ions, EC, TDS) sourced from GA datasets that were originally sourced from the NT government database and other sources listed under column 'Source'.

    Attribution

    Geological and Bioregional Assessment Program

    History

    A hydrochemical dataset composed of 857 groundwater samples and 21 surface water samples was used for Hierarchical Cluster Analysis to investigate inter-aquifer and GW-SW connectivity. Groundwater samples are assigned to the following hydrostratigraphic units: Antrim Plateau Volcanics/Helen Springs/Jindare, Bukalara, Cenozoic, Cretaceous, Jinduckin/AnthonyLagoon/HookerCreek; Proterozoic and Tindall/GumRidge/Montejinni.\r Variables used for the analysis included major ions (Ca, Mg, Na, K, HCO3, Cl, SO4), electrical conductivity (EC) and pH with data normalised for statistical calculation. Datapoints with charge balance error above 10% have been removed prior to interpretation. The interpretation resulted in five clusters with distinct geochemical properties, as described in the respective section of the technical report Hydrogeology of the Beetaloo GBA region.\r Check column source to identify data origin with references in the above mentioned report.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Monogan, Jamie (2023). Political Analysis Using R: Example Code and Data, Plus Data for Practice Problems [Dataset]. http://doi.org/10.7910/DVN/ARKOTI

Political Analysis Using R: Example Code and Data, Plus Data for Practice Problems

Explore at:
Dataset updated
Nov 21, 2023
Dataset provided by
Harvard Dataverse
Authors
Monogan, Jamie
Description

Each R script replicates all of the example code from one chapter from the book. All required data for each script are also uploaded, as are all data used in the practice problems at the end of each chapter. The data are drawn from a wide array of sources, so please cite the original work if you ever use any of these data sets for research purposes.

Search
Clear search
Close search
Google apps
Main menu