100+ datasets found
  1. CRM Practice Data set for CRM Data Analysis

    • kaggle.com
    Updated Feb 3, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    SOURABH RODAGI (2025). CRM Practice Data set for CRM Data Analysis [Dataset]. https://www.kaggle.com/datasets/sourabhrodagi/crm-practice-data-set-for-crm-data-analysis
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 3, 2025
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    SOURABH RODAGI
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset is good start for practicing data analytics with CRM software with User_ID Signup_Date Bet_Amount Region Account_Status Login_Count Customer_Type these columns and it has enough variables and variations for perfect data analysis.

  2. Company Datasets for Business Profiling

    • datarade.ai
    Updated Feb 23, 2017
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Oxylabs (2017). Company Datasets for Business Profiling [Dataset]. https://datarade.ai/data-products/company-datasets-for-business-profiling-oxylabs
    Explore at:
    .json, .xml, .csv, .xlsAvailable download formats
    Dataset updated
    Feb 23, 2017
    Dataset authored and provided by
    Oxylabs
    Area covered
    Isle of Man, Moldova (Republic of), Bangladesh, Canada, British Indian Ocean Territory, Tunisia, Taiwan, Nepal, Andorra, Northern Mariana Islands
    Description

    Company Datasets for valuable business insights!

    Discover new business prospects, identify investment opportunities, track competitor performance, and streamline your sales efforts with comprehensive Company Datasets.

    These datasets are sourced from top industry providers, ensuring you have access to high-quality information:

    • Owler: Gain valuable business insights and competitive intelligence. -AngelList: Receive fresh startup data transformed into actionable insights. -CrunchBase: Access clean, parsed, and ready-to-use business data from private and public companies. -Craft.co: Make data-informed business decisions with Craft.co's company datasets. -Product Hunt: Harness the Product Hunt dataset, a leader in curating the best new products.

    We provide fresh and ready-to-use company data, eliminating the need for complex scraping and parsing. Our data includes crucial details such as:

    • Company name;
    • Size;
    • Founding date;
    • Location;
    • Industry;
    • Revenue;
    • Employee count;
    • Competitors.

    You can choose your preferred data delivery method, including various storage options, delivery frequency, and input/output formats.

    Receive datasets in CSV, JSON, and other formats, with storage options like AWS S3 and Google Cloud Storage. Opt for one-time, monthly, quarterly, or bi-annual data delivery.

    With Oxylabs Datasets, you can count on:

    • Fresh and accurate data collected and parsed by our expert web scraping team.
    • Time and resource savings, allowing you to focus on data analysis and achieving your business goals.
    • A customized approach tailored to your specific business needs.
    • Legal compliance in line with GDPR and CCPA standards, thanks to our membership in the Ethical Web Data Collection Initiative.

    Pricing Options:

    Standard Datasets: choose from various ready-to-use datasets with standardized data schemas, priced from $1,000/month.

    Custom Datasets: Tailor datasets from any public web domain to your unique business needs. Contact our sales team for custom pricing.

    Experience a seamless journey with Oxylabs:

    • Understanding your data needs: We work closely to understand your business nature and daily operations, defining your unique data requirements.
    • Developing a customized solution: Our experts create a custom framework to extract public data using our in-house web scraping infrastructure.
    • Delivering data sample: We provide a sample for your feedback on data quality and the entire delivery process.
    • Continuous data delivery: We continuously collect public data and deliver custom datasets per the agreed frequency.

    Unlock the power of data with Oxylabs' Company Datasets and supercharge your business insights today!

  3. train data for data analysis practice

    • kaggle.com
    zip
    Updated Oct 12, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ayush Dwivedi (2024). train data for data analysis practice [Dataset]. https://www.kaggle.com/datasets/imayushd/train-data-for-data-analysis-practice
    Explore at:
    zip(3763799 bytes)Available download formats
    Dataset updated
    Oct 12, 2024
    Authors
    Ayush Dwivedi
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    it is for practice purpose , there are diff. type of data types in dataset

  4. H

    Political Analysis Using R: Example Code and Data, Plus Data for Practice...

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Apr 28, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Jamie Monogan (2020). Political Analysis Using R: Example Code and Data, Plus Data for Practice Problems [Dataset]. http://doi.org/10.7910/DVN/ARKOTI
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 28, 2020
    Dataset provided by
    Harvard Dataverse
    Authors
    Jamie Monogan
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Each R script replicates all of the example code from one chapter from the book. All required data for each script are also uploaded, as are all data used in the practice problems at the end of each chapter. The data are drawn from a wide array of sources, so please cite the original work if you ever use any of these data sets for research purposes.

  5. Practice - Logarithm in Data Analysis

    • kaggle.com
    zip
    Updated Nov 11, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Panagiotis Prassas (2023). Practice - Logarithm in Data Analysis [Dataset]. https://www.kaggle.com/datasets/panagiotisprassas/practice-logarithm-in-data-analysis
    Explore at:
    zip(257 bytes)Available download formats
    Dataset updated
    Nov 11, 2023
    Authors
    Panagiotis Prassas
    License

    Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Panagiotis Prassas

    Released under Apache 2.0

    Contents

  6. m

    Raw data outputs 1-18

    • bridges.monash.edu
    • researchdata.edu.au
    xlsx
    Updated May 30, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Abbas Salavaty Hosein Abadi; Sara Alaei; Mirana Ramialison; Peter Currie (2023). Raw data outputs 1-18 [Dataset]. http://doi.org/10.26180/21259491.v1
    Explore at:
    xlsxAvailable download formats
    Dataset updated
    May 30, 2023
    Dataset provided by
    Monash University
    Authors
    Abbas Salavaty Hosein Abadi; Sara Alaei; Mirana Ramialison; Peter Currie
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Raw data outputs 1-18 Raw data output 1. Differentially expressed genes in AML CSCs compared with GTCs as well as in TCGA AML cancer samples compared with normal ones. This data was generated based on the results of AML microarray and TCGA data analysis. Raw data output 2. Commonly and uniquely differentially expressed genes in AML CSC/GTC microarray and TCGA bulk RNA-seq datasets. This data was generated based on the results of AML microarray and TCGA data analysis. Raw data output 3. Common differentially expressed genes between training and test set samples the microarray dataset. This data was generated based on the results of AML microarray data analysis. Raw data output 4. Detailed information on the samples of the breast cancer microarray dataset (GSE52327) used in this study. Raw data output 5. Differentially expressed genes in breast CSCs compared with GTCs as well as in TCGA BRCA cancer samples compared with normal ones. Raw data output 6. Commonly and uniquely differentially expressed genes in breast cancer CSC/GTC microarray and TCGA BRCA bulk RNA-seq datasets. This data was generated based on the results of breast cancer microarray and TCGA BRCA data analysis. CSC, and GTC are abbreviations of cancer stem cell, and general tumor cell, respectively. Raw data output 7. Differential and common co-expression and protein-protein interaction of genes between CSC and GTC samples. This data was generated based on the results of AML microarray and STRING database-based protein-protein interaction data analysis. CSC, and GTC are abbreviations of cancer stem cell, and general tumor cell, respectively. Raw data output 8. Differentially expressed genes between AML dormant and active CSCs. This data was generated based on the results of AML scRNA-seq data analysis. Raw data output 9. Uniquely expressed genes in dormant or active AML CSCs. This data was generated based on the results of AML scRNA-seq data analysis. Raw data output 10. Intersections between the targeting transcription factors of AML key CSC genes and differentially expressed genes between AML CSCs vs GTCs and between dormant and active AML CSCs or the uniquely expressed genes in either class of CSCs. Raw data output 11. Targeting desirableness score of AML key CSC genes and their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section. Raw data output 12. CSC-specific targeting desirableness score of AML key CSC genes and their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section. Raw data output 13. The protein-protein interactions between AML key CSC genes with themselves and their targeting transcription factors. This data was generated based on the results of AML microarray and STRING database-based protein-protein interaction data analysis. Raw data output 14. The previously confirmed associations of genes having the highest targeting desirableness and CSC-specific targeting desirableness scores with AML or other cancers’ (stem) cells as well as hematopoietic stem cells. These data were generated based on a PubMed database-based literature mining. Raw data output 15. Drug score of available drugs and bioactive small molecules targeting AML key CSC genes and/or their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section. Raw data output 16. CSC-specific drug score of available drugs and bioactive small molecules targeting AML key CSC genes and/or their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section. Raw data output 17. Candidate drugs for experimental validation. These drugs were selected based on their respective (CSC-specific) drug scores. CSC is the abbreviation of cancer stem cell. Raw data output 18. Detailed information on the samples of the AML microarray dataset GSE30375 used in this study.

  7. Fake Dataset for Practice

    • kaggle.com
    zip
    Updated Aug 21, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Shuvo Kumar Basak-4004 (2023). Fake Dataset for Practice [Dataset]. https://www.kaggle.com/datasets/shuvokumarbasak4004/fake-dataset-for-practice
    Explore at:
    zip(1515599 bytes)Available download formats
    Dataset updated
    Aug 21, 2023
    Authors
    Shuvo Kumar Basak-4004
    Description

    Description: This dataset is created solely for the purpose of practice and learning. It contains entirely fake and fabricated information, including names, phone numbers, emails, cities, ages, and other attributes. None of the information in this dataset corresponds to real individuals or entities. It serves as a resource for those who are learning data manipulation, analysis, and machine learning techniques. Please note that the data is completely fictional and should not be treated as representing any real-world scenarios or individuals.

    Attributes: - phone_number: Fake phone numbers in various formats. - name: Fictitious names generated for practice purposes. - email: Imaginary email addresses created for the dataset. - city: Made-up city names to simulate geographical diversity. - age: Randomly generated ages for practice analysis. - sex: Simulated gender values (Male, Female). - married_status: Synthetic marital status information. - job: Fictional job titles for practicing data analysis. - income: Fake income values for learning data manipulation. - religion: Pretend religious affiliations for practice. - nationality: Simulated nationalities for practice purposes.

    Please be aware that this dataset is not based on real data and should be used exclusively for educational purposes.

  8. d

    Data for: Integrating open education practices with data analysis of open...

    • search.dataone.org
    • data.niaid.nih.gov
    • +1more
    Updated Jul 27, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Marja Bakermans (2024). Data for: Integrating open education practices with data analysis of open science in an undergraduate course [Dataset]. http://doi.org/10.5061/dryad.37pvmcvst
    Explore at:
    Dataset updated
    Jul 27, 2024
    Dataset provided by
    Dryad Digital Repository
    Authors
    Marja Bakermans
    Description

    The open science movement produces vast quantities of openly published data connected to journal articles, creating an enormous resource for educators to engage students in current topics and analyses. However, educators face challenges using these materials to meet course objectives. I present a case study using open science (published articles and their corresponding datasets) and open educational practices in a capstone course. While engaging in current topics of conservation, students trace connections in the research process, learn statistical analyses, and recreate analyses using the programming language R. I assessed the presence of best practices in open articles and datasets, examined student selection in the open grading policy, surveyed students on their perceived learning gains, and conducted a thematic analysis on student reflections. First, articles and datasets met just over half of the assessed fairness practices, but this increased with the publication date. There was a..., Article and dataset fairness To assess the utility of open articles and their datasets as an educational tool in an undergraduate academic setting, I measured the congruence of each pair to a set of best practices and guiding principles. I assessed ten guiding principles and best practices (Table 1), where each category was scored ‘1’ or ‘0’ based on whether it met that criteria, with a total possible score of ten. Open grading policies Students were allowed to specify the percentage weight for each assessment category in the course, including 1) six coding exercises (Exercises), 2) one lead exercise (Lead Exercise), 3) fourteen annotation assignments of readings (Annotations), 4) one final project (Final Project), 5) five discussion board posts and a statement of learning reflection (Discussion), and 6) attendance and participation (Participation). I examined if assessment categories (independent variable) were weighted (dependent variable) differently by students using an analysis of ..., , # Data for: Integrating open education practices with data analysis of open science in an undergraduate course

    Author: Marja H Bakermans Affiliation: Worcester Polytechnic Institute, 100 Institute Rd, Worcester, MA 01609 USA ORCID: https://orcid.org/0000-0002-4879-7771 Institutional IRB approval: IRB-24–0314

    Data and file overview

    The full dataset file called OEPandOSdata (.xlsx extension) contains 8 files. Below are descriptions of the name and contents of each file. NA = not applicable or no data available

    1. BestPracticesData.csv
      • Description: Data to assess the adherence of articles and datasets to open science best practices.
      • Column headers and descriptions:
        • Article: articles used in the study, numbered randomly
        • F1: Findable, Data are assigned a unique and persistent doi
        • F2: Findable, Metadata includes an identifier of data
        • F3: Findable, Data are registered in a searchable database
        • A1: ...
  9. B

    Data Cleaning Sample

    • borealisdata.ca
    • dataone.org
    Updated Jul 13, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Rong Luo (2023). Data Cleaning Sample [Dataset]. http://doi.org/10.5683/SP3/ZCN177
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 13, 2023
    Dataset provided by
    Borealis
    Authors
    Rong Luo
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Sample data for exercises in Further Adventures in Data Cleaning.

  10. An Exceptional Story with Exceptional Data - Data Analysis

    • tomtunguz.com
    Updated Jul 2, 2014
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Tomasz Tunguz (2014). An Exceptional Story with Exceptional Data - Data Analysis [Dataset]. https://tomtunguz.com/telling-stories-with-data/
    Explore at:
    Dataset updated
    Jul 2, 2014
    Dataset provided by
    Theory Ventures
    Authors
    Tomasz Tunguz
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Learn how data storytelling drives impact through FiveThirtyEight's Messi analysis. Key lessons for founders on turning complex data into compelling narratives.

  11. Data Science Platform Market Analysis, Size, and Forecast 2025-2029: North...

    • technavio.com
    pdf
    Updated Feb 8, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Technavio (2025). Data Science Platform Market Analysis, Size, and Forecast 2025-2029: North America (US and Canada), Europe (France, Germany, UK), APAC (China, India, Japan), South America (Brazil), and Middle East and Africa (UAE) [Dataset]. https://www.technavio.com/report/data-science-platform-market-industry-analysis
    Explore at:
    pdfAvailable download formats
    Dataset updated
    Feb 8, 2025
    Dataset provided by
    TechNavio
    Authors
    Technavio
    License

    https://www.technavio.com/content/privacy-noticehttps://www.technavio.com/content/privacy-notice

    Time period covered
    2025 - 2029
    Area covered
    United States
    Description

    Snapshot img

    Data Science Platform Market Size 2025-2029

    The data science platform market size is valued to increase USD 763.9 million, at a CAGR of 40.2% from 2024 to 2029. Integration of AI and ML technologies with data science platforms will drive the data science platform market.

    Major Market Trends & Insights

    North America dominated the market and accounted for a 48% growth during the forecast period.
    By Deployment - On-premises segment was valued at USD 38.70 million in 2023
    By Component - Platform segment accounted for the largest market revenue share in 2023
    

    Market Size & Forecast

    Market Opportunities: USD 1.00 million
    Market Future Opportunities: USD 763.90 million
    CAGR : 40.2%
    North America: Largest market in 2023
    

    Market Summary

    The market represents a dynamic and continually evolving landscape, underpinned by advancements in core technologies and applications. Key technologies, such as machine learning and artificial intelligence, are increasingly integrated into data science platforms to enhance predictive analytics and automate data processing. Additionally, the emergence of containerization and microservices in data science platforms enables greater flexibility and scalability. However, the market also faces challenges, including data privacy and security risks, which necessitate robust compliance with regulations.
    According to recent estimates, the market is expected to account for over 30% of the overall big data analytics market by 2025, underscoring its growing importance in the data-driven business landscape.
    

    What will be the Size of the Data Science Platform Market during the forecast period?

    Get Key Insights on Market Forecast (PDF) Request Free Sample

    How is the Data Science Platform Market Segmented and what are the key trends of market segmentation?

    The data science platform industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

    Deployment
    
      On-premises
      Cloud
    
    
    Component
    
      Platform
      Services
    
    
    End-user
    
      BFSI
      Retail and e-commerce
      Manufacturing
      Media and entertainment
      Others
    
    
    Sector
    
      Large enterprises
      SMEs
    
    
    Application
    
      Data Preparation
      Data Visualization
      Machine Learning
      Predictive Analytics
      Data Governance
      Others
    
    
    Geography
    
      North America
    
        US
        Canada
    
    
      Europe
    
        France
        Germany
        UK
    
    
      Middle East and Africa
    
        UAE
    
    
      APAC
    
        China
        India
        Japan
    
    
      South America
    
        Brazil
    
    
      Rest of World (ROW)
    

    By Deployment Insights

    The on-premises segment is estimated to witness significant growth during the forecast period.

    In the dynamic and evolving the market, big data processing is a key focus, enabling advanced model accuracy metrics through various data mining methods. Distributed computing and algorithm optimization are integral components, ensuring efficient handling of large datasets. Data governance policies are crucial for managing data security protocols and ensuring data lineage tracking. Software development kits, model versioning, and anomaly detection systems facilitate seamless development, deployment, and monitoring of predictive modeling techniques, including machine learning algorithms, regression analysis, and statistical modeling. Real-time data streaming and parallelized algorithms enable real-time insights, while predictive modeling techniques and machine learning algorithms drive business intelligence and decision-making.

    Cloud computing infrastructure, data visualization tools, high-performance computing, and database management systems support scalable data solutions and efficient data warehousing. ETL processes and data integration pipelines ensure data quality assessment and feature engineering techniques. Clustering techniques and natural language processing are essential for advanced data analysis. The market is witnessing significant growth, with adoption increasing by 18.7% in the past year, and industry experts anticipate a further expansion of 21.6% in the upcoming period. Companies across various sectors are recognizing the potential of data science platforms, leading to a surge in demand for scalable, secure, and efficient solutions.

    API integration services and deep learning frameworks are gaining traction, offering advanced capabilities and seamless integration with existing systems. Data security protocols and model explainability methods are becoming increasingly important, ensuring transparency and trust in data-driven decision-making. The market is expected to continue unfolding, with ongoing advancements in technology and evolving business needs shaping its future trajectory.

    Request Free Sample

    The On-premises segment was valued at USD 38.70 million in 2019 and showed

  12. m

    Dataset of development of business during the COVID-19 crisis

    • data.mendeley.com
    • narcis.nl
    Updated Nov 9, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Tatiana N. Litvinova (2020). Dataset of development of business during the COVID-19 crisis [Dataset]. http://doi.org/10.17632/9vvrd34f8t.1
    Explore at:
    Dataset updated
    Nov 9, 2020
    Authors
    Tatiana N. Litvinova
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    To create the dataset, the top 10 countries leading in the incidence of COVID-19 in the world were selected as of October 22, 2020 (on the eve of the second full of pandemics), which are presented in the Global 500 ranking for 2020: USA, India, Brazil, Russia, Spain, France and Mexico. For each of these countries, no more than 10 of the largest transnational corporations included in the Global 500 rating for 2020 and 2019 were selected separately. The arithmetic averages were calculated and the change (increase) in indicators such as profitability and profitability of enterprises, their ranking position (competitiveness), asset value and number of employees. The arithmetic mean values of these indicators for all countries of the sample were found, characterizing the situation in international entrepreneurship as a whole in the context of the COVID-19 crisis in 2020 on the eve of the second wave of the pandemic. The data is collected in a general Microsoft Excel table. Dataset is a unique database that combines COVID-19 statistics and entrepreneurship statistics. The dataset is flexible data that can be supplemented with data from other countries and newer statistics on the COVID-19 pandemic. Due to the fact that the data in the dataset are not ready-made numbers, but formulas, when adding and / or changing the values in the original table at the beginning of the dataset, most of the subsequent tables will be automatically recalculated and the graphs will be updated. This allows the dataset to be used not just as an array of data, but as an analytical tool for automating scientific research on the impact of the COVID-19 pandemic and crisis on international entrepreneurship. The dataset includes not only tabular data, but also charts that provide data visualization. The dataset contains not only actual, but also forecast data on morbidity and mortality from COVID-19 for the period of the second wave of the pandemic in 2020. The forecasts are presented in the form of a normal distribution of predicted values and the probability of their occurrence in practice. This allows for a broad scenario analysis of the impact of the COVID-19 pandemic and crisis on international entrepreneurship, substituting various predicted morbidity and mortality rates in risk assessment tables and obtaining automatically calculated consequences (changes) on the characteristics of international entrepreneurship. It is also possible to substitute the actual values identified in the process and following the results of the second wave of the pandemic to check the reliability of pre-made forecasts and conduct a plan-fact analysis. The dataset contains not only the numerical values of the initial and predicted values of the set of studied indicators, but also their qualitative interpretation, reflecting the presence and level of risks of a pandemic and COVID-19 crisis for international entrepreneurship.

  13. d

    Analysis Practice Data

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 8, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Arshad, Abdul Rehman (2023). Analysis Practice Data [Dataset]. http://doi.org/10.7910/DVN/R1VIPU
    Explore at:
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Arshad, Abdul Rehman
    Description

    This data set comes as a supplementary resource for my book on Biostatistics and SPSS. Readers are free to download this file and practice using SPSS as they go along reading the book.

  14. f

    Data from: Statistical Analysis of Individual Participant Data...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Oct 3, 2012
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Altman, Douglas G.; Stewart, Gavin B.; Duley, Lelia; Stewart, Lesley A.; Simmonds, Mark C.; Askie, Lisa M. (2012). Statistical Analysis of Individual Participant Data Meta-Analyses: A Comparison of Methods and Recommendations for Practice [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001127650
    Explore at:
    Dataset updated
    Oct 3, 2012
    Authors
    Altman, Douglas G.; Stewart, Gavin B.; Duley, Lelia; Stewart, Lesley A.; Simmonds, Mark C.; Askie, Lisa M.
    Description

    BackgroundIndividual participant data (IPD) meta-analyses that obtain “raw” data from studies rather than summary data typically adopt a “two-stage” approach to analysis whereby IPD within trials generate summary measures, which are combined using standard meta-analytical methods. Recently, a range of “one-stage” approaches which combine all individual participant data in a single meta-analysis have been suggested as providing a more powerful and flexible approach. However, they are more complex to implement and require statistical support. This study uses a dataset to compare “two-stage” and “one-stage” models of varying complexity, to ascertain whether results obtained from the approaches differ in a clinically meaningful way. Methods and FindingsWe included data from 24 randomised controlled trials, evaluating antiplatelet agents, for the prevention of pre-eclampsia in pregnancy. We performed two-stage and one-stage IPD meta-analyses to estimate overall treatment effect and to explore potential treatment interactions whereby particular types of women and their babies might benefit differentially from receiving antiplatelets. Two-stage and one-stage approaches gave similar results, showing a benefit of using anti-platelets (Relative risk 0.90, 95% CI 0.84 to 0.97). Neither approach suggested that any particular type of women benefited more or less from antiplatelets. There were no material differences in results between different types of one-stage model. ConclusionsFor these data, two-stage and one-stage approaches to analysis produce similar results. Although one-stage models offer a flexible environment for exploring model structure and are useful where across study patterns relating to types of participant, intervention and outcome mask similar relationships within trials, the additional insights provided by their usage may not outweigh the costs of statistical support for routine application in syntheses of randomised controlled trials. Researchers considering undertaking an IPD meta-analysis should not necessarily be deterred by a perceived need for sophisticated statistical methods when combining information from large randomised trials.

  15. f

    Data_Sheet_4_“R” U ready?: a case study using R to analyze changes in gene...

    • frontiersin.figshare.com
    docx
    Updated Mar 22, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder (2024). Data_Sheet_4_“R” U ready?: a case study using R to analyze changes in gene expression during evolution.docx [Dataset]. http://doi.org/10.3389/feduc.2024.1379910.s004
    Explore at:
    docxAvailable download formats
    Dataset updated
    Mar 22, 2024
    Dataset provided by
    Frontiers
    Authors
    Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    As high-throughput methods become more common, training undergraduates to analyze data must include having them generate informative summaries of large datasets. This flexible case study provides an opportunity for undergraduate students to become familiar with the capabilities of R programming in the context of high-throughput evolutionary data collected using macroarrays. The story line introduces a recent graduate hired at a biotech firm and tasked with analysis and visualization of changes in gene expression from 20,000 generations of the Lenski Lab’s Long-Term Evolution Experiment (LTEE). Our main character is not familiar with R and is guided by a coworker to learn about this platform. Initially this involves a step-by-step analysis of the small Iris dataset built into R which includes sepal and petal length of three species of irises. Practice calculating summary statistics and correlations, and making histograms and scatter plots, prepares the protagonist to perform similar analyses with the LTEE dataset. In the LTEE module, students analyze gene expression data from the long-term evolutionary experiments, developing their skills in manipulating and interpreting large scientific datasets through visualizations and statistical analysis. Prerequisite knowledge is basic statistics, the Central Dogma, and basic evolutionary principles. The Iris module provides hands-on experience using R programming to explore and visualize a simple dataset; it can be used independently as an introduction to R for biological data or skipped if students already have some experience with R. Both modules emphasize understanding the utility of R, rather than creation of original code. Pilot testing showed the case study was well-received by students and faculty, who described it as a clear introduction to R and appreciated the value of R for visualizing and analyzing large datasets.

  16. Property Data set for Data analysis Practice

    • kaggle.com
    zip
    Updated Jan 17, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Satish Dhawale (2024). Property Data set for Data analysis Practice [Dataset]. https://www.kaggle.com/datasets/satishdhawle/property-data-set-for-data-analysis-practice
    Explore at:
    zip(468546 bytes)Available download formats
    Dataset updated
    Jan 17, 2024
    Authors
    Satish Dhawale
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Data includes House data of Mumbai, maharashtra area, you can find out different analysis with the help of above data

    Data includes these are the columns : SR. AREA_TYPE
    LOCATION
    SIZE
    TOTAL_SQFT BATH
    BALCONY PRICE IN LAKHS

  17. Sample data files for Python Course

    • figshare.com
    txt
    Updated Nov 4, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Peter Verhaar (2022). Sample data files for Python Course [Dataset]. http://doi.org/10.6084/m9.figshare.21501549.v1
    Explore at:
    txtAvailable download formats
    Dataset updated
    Nov 4, 2022
    Dataset provided by
    Figsharehttp://figshare.com/
    Authors
    Peter Verhaar
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sample data set used in an introductory course on Programming in Python

  18. Poor statistical reporting, inadequate data presentation and spin persist...

    • plos.figshare.com
    zip
    Updated Jun 1, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Joanna Diong; Annie A. Butler; Simon C. Gandevia; Martin E. Héroux (2023). Poor statistical reporting, inadequate data presentation and spin persist despite editorial advice [Dataset]. http://doi.org/10.1371/journal.pone.0202121
    Explore at:
    zipAvailable download formats
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOShttp://plos.org/
    Authors
    Joanna Diong; Annie A. Butler; Simon C. Gandevia; Martin E. Héroux
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Journal of Physiology and British Journal of Pharmacology jointly published an editorial series in 2011 to improve standards in statistical reporting and data analysis. It is not known whether reporting practices changed in response to the editorial advice. We conducted a cross-sectional analysis of reporting practices in a random sample of research papers published in these journals before (n = 202) and after (n = 199) publication of the editorial advice. Descriptive data are presented. There was no evidence that reporting practices improved following publication of the editorial advice. Overall, 76-84% of papers with written measures that summarized data variability used standard errors of the mean, and 90-96% of papers did not report exact p-values for primary analyses and post-hoc tests. 76-84% of papers that plotted measures to summarize data variability used standard errors of the mean, and only 2-4% of papers plotted raw data used to calculate variability. Of papers that reported p-values between 0.05 and 0.1, 56-63% interpreted these as trends or statistically significant. Implied or gross spin was noted incidentally in papers before (n = 10) and after (n = 9) the editorial advice was published. Overall, poor statistical reporting, inadequate data presentation and spin were present before and after the editorial advice was published. While the scientific community continues to implement strategies for improving reporting practices, our results indicate stronger incentives or enforcements are needed.

  19. c

    Sample Sales Dataset

    • cubig.ai
    zip
    Updated Jun 15, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    CUBIG (2025). Sample Sales Dataset [Dataset]. https://cubig.ai/store/products/477/sample-sales-dataset
    Explore at:
    zipAvailable download formats
    Dataset updated
    Jun 15, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-servicehttps://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data Introduction • The Sample Sales Data is a retail sales dataset of 2,823 orders and 25 columns that includes a variety of sales-related data, including order numbers, product information, quantity, unit price, sales, order date, order status, customer and delivery information.

    2) Data Utilization (1) Sample Sales Data has characteristics that: • This dataset consists of numerical (sales, quantity, unit price, etc.), categorical (product, country, city, customer name, transaction size, etc.), and date (order date) variables, with missing values in some columns (STATE, ADDRESSLINE2, POSTALCODE, etc.). (2) Sample Sales Data can be used to: • Analysis of sales trends and performance by product: Key variables such as order date, product line, and country can be used to visualize and analyze monthly and yearly sales trends, the proportion of sales by product line, and top sales by country and region. • Segmentation and marketing strategies: Segmentation of customer groups based on customer information, transaction size, and regional data, and use them to design targeted marketing and customized promotion strategies.

  20. B

    Easing into Excellent Excel Practices Learning Series / Série...

    • borealisdata.ca
    • search.dataone.org
    Updated Nov 15, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Julie Marcoux (2023). Easing into Excellent Excel Practices Learning Series / Série d'apprentissages en route vers des excellentes pratiques Excel [Dataset]. http://doi.org/10.5683/SP3/WZYO1F
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 15, 2023
    Dataset provided by
    Borealis
    Authors
    Julie Marcoux
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    With a step-by-step approach, learn to prepare Excel files, data worksheets, and individual data columns for data analysis; practice conditional formatting and creating pivot tables/charts; go over basic principles of Research Data Management as they might apply to an Excel project. Avec une approche étape par étape, apprenez à préparer pour l’analyse des données des fichiers Excel, des feuilles de calcul de données et des colonnes de données individuelles; pratiquez la mise en forme conditionnelle et la création de tableaux croisés dynamiques ou de graphiques; passez en revue les principes de base de la gestion des données de recherche tels qu’ils pourraient s’appliquer à un projet Excel.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
SOURABH RODAGI (2025). CRM Practice Data set for CRM Data Analysis [Dataset]. https://www.kaggle.com/datasets/sourabhrodagi/crm-practice-data-set-for-crm-data-analysis
Organization logo

CRM Practice Data set for CRM Data Analysis

This dataset is good start for practicing data analytics with CRM software.

Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Feb 3, 2025
Dataset provided by
Kagglehttp://kaggle.com/
Authors
SOURABH RODAGI
License

https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

Description

This dataset is good start for practicing data analytics with CRM software with User_ID Signup_Date Bet_Amount Region Account_Status Login_Count Customer_Type these columns and it has enough variables and variations for perfect data analysis.

Search
Clear search
Close search
Google apps
Main menu