100+ datasets found
  1. Data Analytics Case Study 1 Cyclistics Bike Share

    • kaggle.com
    Updated Feb 27, 2022
    Cite
    Keri Ransom (2022). Data Analytics Case Study 1 Cyclistics Bike Share [Dataset]. https://www.kaggle.com/datasets/keriransom/da-case-study-cyclistics-bike-share
    Dataset updated
    Feb 27, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Keri Ransom
    Description

    Dataset

    This dataset was created by Keri Ransom

    Contents

  2. Cyclistic (case study)

    • kaggle.com
    Updated May 7, 2023
    Cite
    Josipa Tanocki Varga (2023). Cyclistic (case study) [Dataset]. https://www.kaggle.com/datasets/josipatanockivarga/cyclistic
    Dataset updated
    May 7, 2023
    Dataset provided by
    Kaggle
    Authors
    Josipa Tanocki Varga
    Description

    The dataset is used for the capstone project in Google Data Analytics Certificate course on Coursera.

    Although Cyclistic is a fictional bike-sharing company whose data students must analyse as part of the case study, the dataset is appropriate for the task and will enable students to answer the business questions. The data has been made available by Motivate International Inc. under this licence.

    This is my first dataset published on Kaggle, as a part of the learning process, so any suggestion/comment/constructive criticism is welcome.

  3. ‘Retail Case Study Data’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Retail Case Study Data’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-retail-case-study-data-529d/30064658/?iid=008-653&v=presentation
    Dataset updated
    Jan 28, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Retail Case Study Data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/darpan25bajaj/retail-case-study-data on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Analytics in Retail:

    As the retail market grows more competitive by the day, the ability to optimize service and business processes to meet customer expectations has never been more important. Channeling and managing data so that it works in the customer's favor while also generating profit is essential for survival.

    Ideally, a retailer’s customer data reflects the company’s success in reaching and nurturing its customers. Retailers built reports summarizing customer behavior using metrics such as conversion rate, average order value, recency of purchase and total amount spent in recent transactions. These measurements provided general insight into the behavioral tendencies of customers.

    Customer intelligence is the practice of determining and delivering data-driven insights into past and predicted future customer behavior. To be effective, customer intelligence must combine raw transactional and behavioral data to generate derived measures. In short, big retail players around the world now apply data analytics at every stage of the retail process: tracking emerging popular products, forecasting sales and future demand through predictive simulation, optimizing the placement of products and offers through customer heat-mapping, and more.

    About the Data

    A retail store needs to analyze its day-to-day transactions and keep track of its customers across various locations, along with their purchases and returns across various categories.

    What can be done with the data?

    Create a report and display the calculated metrics, reports and inferences.

    Data Schema

    This book has three sheets (Customer, Transaction, Product Hierarchy):

    • Customer: Customer information including demographics
    • Transaction: Transaction of customers
    • Product Hierarchy: Product information

    --- Original source retains full ownership of the source dataset ---

  4. Data_Sheet_4_“R” U ready?: a case study using R to analyze changes in gene...

    • frontiersin.figshare.com
    docx
    Updated Mar 22, 2024
    Cite
    Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder (2024). Data_Sheet_4_“R” U ready?: a case study using R to analyze changes in gene expression during evolution.docx [Dataset]. http://doi.org/10.3389/feduc.2024.1379910.s004
    Available download formats: docx
    Dataset updated
    Mar 22, 2024
    Dataset provided by
    Frontiers
    Authors
    Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    As high-throughput methods become more common, training undergraduates to analyze data must include having them generate informative summaries of large datasets. This flexible case study provides an opportunity for undergraduate students to become familiar with the capabilities of R programming in the context of high-throughput evolutionary data collected using macroarrays. The story line introduces a recent graduate hired at a biotech firm and tasked with analysis and visualization of changes in gene expression from 20,000 generations of the Lenski Lab’s Long-Term Evolution Experiment (LTEE). Our main character is not familiar with R and is guided by a coworker to learn about this platform. Initially this involves a step-by-step analysis of the small Iris dataset built into R which includes sepal and petal length of three species of irises. Practice calculating summary statistics and correlations, and making histograms and scatter plots, prepares the protagonist to perform similar analyses with the LTEE dataset. In the LTEE module, students analyze gene expression data from the long-term evolutionary experiments, developing their skills in manipulating and interpreting large scientific datasets through visualizations and statistical analysis. Prerequisite knowledge is basic statistics, the Central Dogma, and basic evolutionary principles. The Iris module provides hands-on experience using R programming to explore and visualize a simple dataset; it can be used independently as an introduction to R for biological data or skipped if students already have some experience with R. Both modules emphasize understanding the utility of R, rather than creation of original code. Pilot testing showed the case study was well-received by students and faculty, who described it as a clear introduction to R and appreciated the value of R for visualizing and analyzing large datasets.

  5. Data from: Comprehensive Predictive Analytics for Collaborators' Answers,...

    • ourarchive.otago.ac.nz
    • zenodo.org
    Updated May 3, 2025
    Cite
    Elijah Zolduoarrati; Sherlock Licorish; Nigel Stanger (2025). Comprehensive Predictive Analytics for Collaborators' Answers, Code Quality, and Dropout: Stack Overflow Case Study – Replication Package [Dataset]. https://ourarchive.otago.ac.nz/esploro/outputs/dataset/Comprehensive-Predictive-Analytics-for-Collaborators-Answers/9926743737901891
    Dataset updated
    May 3, 2025
    Dataset provided by
    Zenodo
    Authors
    Elijah Zolduoarrati; Sherlock Licorish; Nigel Stanger
    Time period covered
    May 3, 2025
    Description

    Previous studies that used data from Stack Overflow to develop predictive models often employed limited benchmarks of 3-5 models or adopted arbitrary selection methods. Despite being insightful, such approaches may not provide optimal results given their limited scope, suggesting the need to benchmark more models to avoid overlooking untested algorithms. Our study evaluates 21 algorithms across three tasks: predicting the number of questions a user is likely to answer, their code quality violations, and their dropout status. We employed normalisation, standardisation, as well as logarithmic and power transformations paired with Bayesian hyperparameter optimisation and genetic algorithms. CodeBERT, a pre-trained language model for both natural and programming languages, was fine-tuned to classify user dropout given their posts (questions and answers) and code snippets. This replication package is provided for those interested in further examining our research methodology.
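
As a rough illustration of the preprocessing steps named in this description, the Python sketch below applies min-max normalisation, z-score standardisation, and a logarithmic transform to a small list of values. The function names and the sample counts are hypothetical and are not taken from the replication package; the authors' actual pipeline may differ.

```python
import math
from statistics import mean, pstdev


def min_max_normalise(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardise(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def log_transform(values):
    """Apply log(1 + v), which is defined at zero and compresses large values."""
    return [math.log1p(v) for v in values]


# Hypothetical per-user answer counts (invented for illustration).
answer_counts = [0, 1, 3, 10, 250]

print(min_max_normalise(answer_counts))
print(standardise(answer_counts))
print(log_transform(answer_counts))
```

Skewed count data such as answers per user is a common reason to try a logarithmic transform before fitting a model, which is one plausible motivation for comparing several transformations.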

  6. Poverty Mapping Project: Poverty and Food Security Case Studies

    • catalog.data.gov
    • data.nasa.gov
    • +3 more
    Updated Apr 24, 2025
    Cite
    SEDAC (2025). Poverty Mapping Project: Poverty and Food Security Case Studies [Dataset]. https://catalog.data.gov/dataset/poverty-mapping-project-poverty-and-food-security-case-studies
    Dataset updated
    Apr 24, 2025
    Dataset provided by
    SEDAC
    Description

    The Poverty Mapping Project: Poverty and Food Security Case Studies data set consists of small area estimates of poverty, inequality, food security and related measures for subnational administrative units in Mexico, Ecuador, Kenya, Malawi, Bangladesh, Sri Lanka, Nigeria and Vietnam. These data come from country-level case studies that examine poverty and food security from a spatial analysis perspective. The data products include shapefiles (vector data) and tabular data sets (csv format). Additionally, a data catalog (xls format) containing detailed information and documentation is provided. This data set is produced by the Columbia University Center for International Earth Science Information Network (CIESIN) and Centro Internacional de Agricultura Tropical (CIAT). The data set was originally produced by CIAT, International Maize and Wheat Improvement Center (CIMMYT), International Livestock Research Institute (ILRI), International Food Policy Research Institute (IFPRI), International Rice Research Institute (IRRI), International Water Management Institute (IWMI), and International Institute for Tropical Agriculture (IITA).

  7. summary_of_case_study_insights

    • kaggle.com
    Updated Jan 4, 2022
    Cite
    Shiva Singh (2022). summary_of_case_study_insights [Dataset]. https://www.kaggle.com/shivasinghgogreen/summary-of-case-study-insights/code
    Dataset updated
    Jan 4, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Shiva Singh
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    This is a summary table of insights from my first data analyst project, a Google Data Analytics Professional Certificate Programme case study.

    Content

    It has nearly 5M rows and 20 columns.

  8. Data_Sheet_1_Advanced large language models and visualization tools for data...

    • frontiersin.figshare.com
    txt
    Updated Aug 8, 2024
    Cite
    Jorge Valverde-Rebaza; Aram González; Octavio Navarro-Hinojosa; Julieta Noguez (2024). Data_Sheet_1_Advanced large language models and visualization tools for data analytics learning.csv [Dataset]. http://doi.org/10.3389/feduc.2024.1418006.s001
    Available download formats: txt
    Dataset updated
    Aug 8, 2024
    Dataset provided by
    Frontiers
    Authors
    Jorge Valverde-Rebaza; Aram González; Octavio Navarro-Hinojosa; Julieta Noguez
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: In recent years, numerous AI tools have been employed to equip learners with diverse technical skills such as coding, data analysis, and other competencies related to computational sciences. However, the desired outcomes have not been consistently achieved. This study aims to analyze the perspectives of students and professionals from non-computational fields on the use of generative AI tools, augmented with visualization support, to tackle data analytics projects. The focus is on promoting the development of coding skills and fostering a deep understanding of the solutions generated. Consequently, our research seeks to introduce innovative approaches for incorporating visualization and generative AI tools into educational practices.

    Methods: This article examines how learners perform and their perspectives when using traditional tools vs. LLM-based tools to acquire data analytics skills. To explore this, we conducted a case study with a cohort of 59 participants among students and professionals without computational thinking skills. These participants developed a data analytics project in the context of a Data Analytics short session. Our case study focused on examining the participants' performance using traditional programming tools, ChatGPT, and LIDA with GPT as an advanced generative AI tool.

    Results: The results show the transformative potential of approaches based on integrating advanced generative AI tools like GPT with specialized frameworks such as LIDA. The higher levels of participant preference indicate the superiority of these approaches over traditional development methods. Additionally, our findings suggest that the learning curves for the different approaches vary significantly, since learners encountered technical difficulties in developing the project and interpreting the results. Our findings suggest that the integration of LIDA with GPT can significantly enhance the learning of advanced skills, especially those related to data analytics. We aim to establish this study as a foundation for the methodical adoption of generative AI tools in educational settings, paving the way for more effective and comprehensive training in these critical areas.

    Discussion: It is important to highlight that when using general-purpose generative AI tools such as ChatGPT, users must be aware of the data analytics process and take responsibility for filtering out potential errors or incompleteness in the requirements of a data analytics project. These deficiencies can be mitigated by using more advanced tools specialized in supporting data analytics tasks, such as LIDA with GPT. However, users still need advanced programming knowledge to properly configure this connection via API. There is a significant opportunity for generative AI tools to improve their performance, providing accurate, complete, and convincing results for data analytics projects, thereby increasing user confidence in adopting these technologies. We hope this work underscores the opportunities and needs for integrating advanced LLMs into educational practices, particularly in developing computational thinking skills.

  9. Datasets for Computational Methods and GIS Applications in Social Science

    • search.dataone.org
    Updated Sep 25, 2024
    Cite
    Fahui Wang; Lingbo Liu (2024). Datasets for Computational Methods and GIS Applications in Social Science [Dataset]. http://doi.org/10.7910/DVN/4CM7V4
    Dataset updated
    Sep 25, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Fahui Wang; Lingbo Liu
    Description

    Dataset for the textbook Computational Methods and GIS Applications in Social Science (3rd Edition), 2023
    Fahui Wang, Lingbo Liu

    Main Book Citation: Wang, F., & Liu, L. (2023). Computational Methods and GIS Applications in Social Science (3rd ed.). CRC Press. https://doi.org/10.1201/9781003292302

    KNIME Lab Manual Citation: Liu, L., & Wang, F. (2023). Computational Methods and GIS Applications in Social Science - Lab Manual. CRC Press. https://doi.org/10.1201/9781003304357

    KNIME Hub Dataset and Workflow for Computational Methods and GIS Applications in Social Science-Lab Manual

    Update Log

    • If Python package not found in Package Management, use ArcGIS Pro's Python Command Prompt to install them, e.g., conda install -c conda-forge python-igraph leidenalg
    • NetworkCommDetPro in CMGIS-V3-Tools was updated on July 10, 2024
    • Add spatial adjacency table into Florida on June 29, 2024
    • The dataset and tool for ABM Crime Simulation were updated on August 3, 2023
    • The toolkits in CMGIS-V3-Tools was updated on August 3rd, 2023

    Report Issues on GitHub: https://github.com/UrbanGISer/Computational-Methods-and-GIS-Applications-in-Social-Science

    Following the website of Fahui Wang: http://faculty.lsu.edu/fahui

    Contents

    Chapter 1. Getting Started with ArcGIS: Data Management and Basic Spatial Analysis Tools
    • Case Study 1: Mapping and Analyzing Population Density Pattern in Baton Rouge, Louisiana

    Chapter 2. Measuring Distance and Travel Time and Analyzing Distance Decay Behavior
    • Case Study 2A: Estimating Drive Time and Transit Time in Baton Rouge, Louisiana
    • Case Study 2B: Analyzing Distance Decay Behavior for Hospitalization in Florida

    Chapter 3. Spatial Smoothing and Spatial Interpolation
    • Case Study 3A: Mapping Place Names in Guangxi, China
    • Case Study 3B: Area-Based Interpolations of Population in Baton Rouge, Louisiana
    • Case Study 3C: Detecting Spatiotemporal Crime Hotspots in Baton Rouge, Louisiana

    Chapter 4. Delineating Functional Regions and Applications in Health Geography
    • Case Study 4A: Defining Service Areas of Acute Hospitals in Baton Rouge, Louisiana
    • Case Study 4B: Automated Delineation of Hospital Service Areas in Florida

    Chapter 5. GIS-Based Measures of Spatial Accessibility and Application in Examining Healthcare Disparity
    • Case Study 5: Measuring Accessibility of Primary Care Physicians in Baton Rouge

    Chapter 6. Function Fittings by Regressions and Application in Analyzing Urban Density Patterns
    • Case Study 6: Analyzing Population Density Patterns in Chicago Urban Area

    Chapter 7. Principal Components, Factor and Cluster Analyses and Application in Social Area Analysis
    • Case Study 7: Social Area Analysis in Beijing

    Chapter 8. Spatial Statistics and Applications in Cultural and Crime Geography
    • Case Study 8A: Spatial Distribution and Clusters of Place Names in Yunnan, China
    • Case Study 8B: Detecting Colocation Between Crime Incidents and Facilities
    • Case Study 8C: Spatial Cluster and Regression Analyses of Homicide Patterns in Chicago

    Chapter 9. Regionalization Methods and Application in Analysis of Cancer Data
    • Case Study 9: Constructing Geographical Areas for Mapping Cancer Rates in Louisiana

    Chapter 10. System of Linear Equations and Application of Garin-Lowry in Simulating Urban Population and Employment Patterns
    • Case Study 10: Simulating Population and Service Employment Distributions in a Hypothetical City

    Chapter 11. Linear and Quadratic Programming and Applications in Examining Wasteful Commuting and Allocating Healthcare Providers
    • Case Study 11A: Measuring Wasteful Commuting in Columbus, Ohio
    • Case Study 11B: Location-Allocation Analysis of Hospitals in Rural China

    Chapter 12. Monte Carlo Method and Applications in Urban Population and Traffic Simulations
    • Case Study 12A: Examining Zonal Effect on Urban Population Density Functions in Chicago by Monte Carlo Simulation
    • Case Study 12B: Monte Carlo-Based Traffic Simulation in Baton Rouge, Louisiana

    Chapter 13. Agent-Based Model and Application in Crime Simulation
    • Case Study 13: Agent-Based Crime Simulation in Baton Rouge, Louisiana

    Chapter 14. Spatiotemporal Big Data Analytics and Application in Urban Studies
    • Case Study 14A: Exploring Taxi Trajectory in ArcGIS
    • Case Study 14B: Identifying High Traffic Corridors and Destinations in Shanghai

    Dataset File Structure

    • 1 BatonRouge: Census.gdb, BR.gdb
    • 2A BatonRouge: BR_Road.gdb, Hosp_Address.csv, TransitNetworkTemplate.xml, BR_GTFS, Google API Pro.tbx
    • 2B Florida: FL_HSA.gdb, R_ArcGIS_Tools.tbx (RegressionR)
    • 3A China_GX: GX.gdb
    • 3B BatonRouge: BR.gdb
    • 3C BatonRouge: BRcrime, R_ArcGIS_Tools.tbx (STKDE)
    • 4A BatonRouge: BRRoad.gdb
    • 4B Florida: FL_HSA.gdb, HSA Delineation Pro.tbx, Huff Model Pro.tbx, FLplgnAdjAppend.csv
    • 5 BRMSA: BRMSA.gdb, Accessibility Pro.tbx
    • 6 Chicago: ChiUrArea.gdb, R_ArcGIS_Tools.tbx (RegressionR)
    • 7 Beijing: BJSA.gdb, bjattr.csv, R_ArcGIS_Tools.tbx (PCAandFA, BasicClustering)
    • 8A Yunnan: YN.gdb, R_ArcGIS_Tools.tbx (SaTScanR)
    • 8B Jiangsu: JS.gdb
    • 8C Chicago: ChiCity.gdb, cityattr.csv
    ...

  10. Database: Data analytics and Artificial Neural Network framework to profile...

    • figshare.com
    xlsx
    Updated Feb 23, 2024
    Cite
    Rasikh Tariq (2024). Database: Data analytics and Artificial Neural Network framework to profile academic success: Case Study of Leaders of Tomorrow Program [Dataset]. http://doi.org/10.6084/m9.figshare.25281136.v1
    Available download formats: xlsx
    Dataset updated
    Feb 23, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Rasikh Tariq
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Database for the article: Data analytics and Artificial Neural Network framework to profile academic success: Case Study of Leaders of Tomorrow Program

  11. 2022 Bike Data Case Study

    • kaggle.com
    Updated Oct 20, 2023
    Cite
    Brad (2023). 2022 Bike Data Case Study [Dataset]. https://www.kaggle.com/datasets/bradley3baker/2022-bike-data-case-study
    Dataset updated
    Oct 20, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Brad
    Description

    This is being done as a capstone for the Google Data Analytics Certificate.

    The data for Cyclistic (a fictional company) is located here: https://divvy-tripdata.s3.amazonaws.com/index.html

    This data was made available by Motivate International Inc. under the following license: https://www.divvybikes.com/data-license-agreement.

    This will focus on the data provided for 2022.

  12. Online Data Science Training Programs Market Analysis, Size, and Forecast...

    • technavio.com
    Updated Feb 15, 2025
    Cite
    Technavio (2025). Online Data Science Training Programs Market Analysis, Size, and Forecast 2025-2029: North America (Mexico), Europe (France, Germany, Italy, and UK), Middle East and Africa (UAE), APAC (Australia, China, India, Japan, and South Korea), South America (Brazil), and Rest of World (ROW) [Dataset]. https://www.technavio.com/report/online-data-science-training-programs-market-industry-analysis
    Dataset updated
    Feb 15, 2025
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2021 - 2025
    Area covered
    Mexico, Germany, Global
    Description


    Online Data Science Training Programs Market Size 2025-2029

    The online data science training programs market size is forecast to increase by USD 8.67 billion, at a CAGR of 35.8% between 2024 and 2029.
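
To make the growth figure concrete, the Python sketch below shows how a compound annual growth rate (CAGR) links a starting market size to an absolute increase. It assumes five annual compounding periods (2024 to 2029) and back-solves an implied 2024 base from the two numbers quoted above. That implied base is an inference for illustration only; it is not a figure stated in the report, and the function names are hypothetical.

```python
def compound_growth_factor(cagr, years):
    """Total growth multiplier after compounding `cagr` for `years` periods."""
    return (1.0 + cagr) ** years


def implied_base(increase, cagr, years):
    """Starting size B such that B * (factor - 1) equals the stated increase."""
    return increase / (compound_growth_factor(cagr, years) - 1.0)


CAGR = 0.358            # 35.8%, as quoted in the snapshot
YEARS = 5               # assumption: 2024 -> 2029 treated as five periods
INCREASE_USD_BN = 8.67  # forecast increase, as quoted in the snapshot

base = implied_base(INCREASE_USD_BN, CAGR, YEARS)
final = base * compound_growth_factor(CAGR, YEARS)

print(f"growth factor over {YEARS} years: {compound_growth_factor(CAGR, YEARS):.3f}")
print(f"implied 2024 base (USD bn): {base:.2f}")
print(f"implied 2029 size (USD bn): {final:.2f}")
```

If the report counts the periods differently, the implied base changes accordingly, so the result should be read as an order-of-magnitude check rather than a reported value.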

    The market is experiencing significant growth due to the increasing demand for data science professionals in various industries. The job market offers lucrative opportunities for individuals with data science skills, making online training programs an attractive option for those seeking to upskill or reskill. Another key driver in the market is the adoption of microlearning and gamification techniques in data science training. These approaches make learning more engaging and accessible, allowing individuals to acquire new skills at their own pace. Furthermore, the availability of open-source learning materials has democratized access to data science education, enabling a larger pool of learners to enter the field. However, the market also faces challenges, including the need for continuous updates to keep up with the rapidly evolving data science landscape and the lack of standardization in online training programs, which can make it difficult for employers to assess the quality of graduates. Companies seeking to capitalize on market opportunities should focus on offering up-to-date, high-quality training programs that incorporate microlearning and gamification techniques, while also addressing the challenges of continuous updates and standardization. By doing so, they can differentiate themselves in a competitive market and meet the evolving needs of learners and employers alike.

    What will be the Size of the Online Data Science Training Programs Market during the forecast period?

    The online data science training market continues to evolve, driven by the increasing demand for data-driven insights and innovations across various sectors. Data science applications, from computer vision and deep learning to natural language processing and predictive analytics, are revolutionizing industries and transforming business operations. Industry case studies showcase the impact of data science in action, with big data and machine learning driving advancements in healthcare, finance, and retail. Virtual labs enable learners to gain hands-on experience, while data scientist salaries remain competitive and attractive. Cloud computing and data science platforms facilitate interactive learning and collaborative research, fostering a vibrant data science community. Data privacy and security concerns are addressed through advanced data governance and ethical frameworks. Data science libraries, such as TensorFlow and Scikit-Learn, streamline the development process, while data storytelling tools help communicate complex insights effectively. Data mining and predictive analytics enable organizations to uncover hidden trends and patterns, driving innovation and growth. The future of data science is bright, with ongoing research and development in areas like data ethics, data governance, and artificial intelligence. Data science conferences and education programs provide opportunities for professionals to expand their knowledge and expertise, ensuring they remain at the forefront of this dynamic field.

    How is this Online Data Science Training Programs Industry segmented?

    The online data science training programs industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

    • Type: Professional degree courses, Certification courses
    • Application: Students, Working professionals
    • Language: R programming, Python, Big ML, SAS, Others
    • Method: Live streaming, Recorded
    • Program Type: Bootcamps, Certificates, Degree Programs
    • Geography: North America (US, Mexico), Europe (France, Germany, Italy, UK), Middle East and Africa (UAE), APAC (Australia, China, India, Japan, South Korea), South America (Brazil), Rest of World (ROW)

    By Type Insights

    The professional degree courses segment is estimated to witness significant growth during the forecast period. The market encompasses various segments catering to diverse learning needs. The professional degree course segment holds a significant position, offering comprehensive and in-depth training in data science. This segment's curriculum covers essential aspects such as statistical analysis, machine learning, data visualization, and data engineering. Delivered by industry professionals and academic experts, these courses ensure a high-quality education experience. Interactive learning environments, including live lectures, webinars, and group discussions, foster a collaborative and engaging experience. Data science applications, including deep learning, computer vision, and natural language processing, are integral to the market's growth. Data analysis, a crucial application, is gaining traction due to the increasing demand

  13. University ecosystem analytics: Case study of regional integration and...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Jan 15, 2025
    Cite
    Alexander Petersen (2025). University ecosystem analytics: Case study of regional integration and competitiveness in California and Texas [Dataset]. http://doi.org/10.5061/dryad.2rbnzs7zc
    Available download formats: zip
    Dataset updated
    Jan 15, 2025
    Dataset provided by
    University of California (http://universityofcalifornia.edu/)
    Authors
    Alexander Petersen
    License

    https://spdx.org/licenses/CC0-1.0.html

    Area covered
    California, Texas
    Description

    Despite substantial policy efforts aimed at developing regional innovation systems (RIS), our understanding of institutional factors that promote synergy and integration at the regional scale is limited. To address this gap, we constructed 2 representations of research university ecosystems in California (CA) and Texas (TX) that identify institutional co-occurrences in research and news media, within and across these regions. The selection of these regions is attributed to the University of California and the University of Texas, two multi-campus university systems (MUS) that feature distinct configurations of institutional specialization. As such, we exploit these differences to analyze four institutional assortativity channels that foster system-level synergies: institutional proximity, prestige, homophily, and specialization. The first representation we constructed is based upon ~3 million publications collected from Clarivate Analytics Web of Science Core Collection (WOS) that are affiliated with at least one of the 28 institutions in our sample, which together represent >5% of publications indexed by WOS over the sample period 1970-2020. The 28 institutions consist of 10 institutions belonging to the University of California (UC) system and 12 institutions belonging to the University of Texas (UT) system; we complement these two public multi-campus university systems (MUS) by including six prominent private universities, which represent a non-MUS comparison group. As universities increasingly compete for visibility to attract student enrollment and build scientific reputation, the management of institution of higher education (IHE) brand has emerged as an important strategic endeavor. Hence, the second representation we constructed is based upon ~2 million digital news media articles published between 2000-2020 that specifically mention at least one of these universities. 
    Similar to the first representation, mapping the rates of digital media co-visibility among IHE facilitates a systems-level understanding of the factors that condition the structure and dynamics of brand stratification within research university ecosystems, and fosters the development of novel measures for two dimensions of brand equity – namely, visibility and association.

    Methods

    1) Research affiliated with a particular institution. We collected 2,965,198 records published between 1970-2020 from the Clarivate Analytics WOS Core Collection using their in-house institutional disambiguation tool to identify publications with at least one author from a particular campus.

    2) Digital media affiliated with a particular institution. We assembled a dataset of 1,947,349 unique web-based digital media articles representing news articles, blog posts and other web content specifically mentioning any of the institutions by their official name, e.g. “University of California Los Angeles” or “UCLA”, accounting for the official abbreviations. These media articles were originally produced by 57,947 unique media sources, according to primary source data obtained from the Media Cloud project (MC) database, https://www.mediacloud.org/.

    We use both data sources to develop a co-occurrence framework for defining university-university relationships based upon research co-production (via collaboration among scholars affiliated with each university) and media article co-visibility over the period 2000-2020, by applying concepts and methods from network science, machine learning (NLP) and organizational science.

  14. Data from: Experts, Expertise and Citizen Science: A Case Study of Air...

    • beta.ukdataservice.ac.uk
    • datacatalogue.cessda.eu
    Updated 2025
    UK Data Service (2025). Experts, Expertise and Citizen Science: A Case Study of Air Quality Monitoring, 2021-2024 [Dataset]. http://doi.org/10.5255/ukda-sn-857606
    Explore at:
    Dataset updated
    2025
    Dataset provided by
    UK Data Service (https://ukdataservice.ac.uk/)
    DataCite (https://www.datacite.org/)
    Description

    Environmental controversies are often about knowledge and expertise as much as they are about politics, rights and life chances. The reason is that the evidence produced by the different groups involved is often part of the controversy, with disputes over what is known and not known, by whom, and with what degree of accuracy being a source of tension rather than consensus.

    Community groups are responding to these challenges through new forms of citizen science in which they collect new data that can be used to contest decisions that affect their lives and communities. In this project, we worked with one such group to monitor air quality and to improve their local environment. This involved supporting, and reporting on, their work to deploy monitoring equipment and build community networks as well as examining how these efforts are received by others. Interviews were conducted with a cross-section of actors and groups with a stake in the project. These included the local civil society group, policy makers and representatives of other relevant organisations.

    These interviews allowed participants to articulate their own perspectives and experiences, and enabled the project team to understand how different kinds of expertise are – and/or should be – valued within decision-making activities.

  15. Data from: Case study in public administration: a critical review of...

    • scielo.figshare.com
    xls
    Updated Jun 1, 2023
    Case study in public administration: a critical review of Brazilian scientific production [Dataset]. https://scielo.figshare.com/articles/dataset/Case_study_in_public_administration_a_critical_review_of_Brazilian_scientific_production/20020104
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    SciELO journals
    Authors
    Mariana Guerra; Adalmir de Oliveira Gomes; Antônio Isidro da Silva Filho
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This paper presents a critical review of 47 articles published between 2006 and 2011 to identify how case studies have been applied in Brazilian research on public administration. In addition to their theoretical and methodological characteristics, four further specific topics of interest were addressed: (a) what is meant by case study; (b) the relationship between the phenomenon of interest and the case under investigation; (c) the possibility of replication; and (d) how the supposed method contributes towards the development of the field of public administration. The main inconsistencies found were: the methodological descriptions are confusing; the results are inconsistent compared with data gathering procedures and data analysis techniques; a lack of information about the number of interviewed individuals; and no descriptions of research variables. The results suggest the reviewed case studies present methodological inconsistencies and limitations, which undermine their scientific value and relevance to academic work in Brazil.

  16. Google Data Analytics Case Study 2 - Andrew Oshobu

    • kaggle.com
    Updated Oct 14, 2021
    Andrew Oshobu (2021). Google Data Analytics Case Study 2 - Andrew Oshobu [Dataset]. https://www.kaggle.com/andrewoshobu/google-data-analytics-case-study-2-andrew-oshobu/code
    Explore at:
    Dataset updated
    Oct 14, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Andrew Oshobu
    Description

    Dataset

    This dataset was created by Andrew Oshobu

    Contents

  17. PRONTO heterogeneous benchmark dataset

    • zenodo.org
    • explore.openaire.eu
    txt
    Updated Aug 2, 2024
    + more versions
    Anna Stief; Ruomu Tan; Yi Cao; James R. Ottewill (2024). PRONTO heterogeneous benchmark dataset [Dataset]. http://doi.org/10.5281/zenodo.1341583
    Explore at:
    Available download formats: txt
    Dataset updated
    Aug 2, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Anna Stief; Ruomu Tan; Yi Cao; James R. Ottewill
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The PRONTO heterogeneous benchmark dataset is based on an industrial-scale multiphase flow facility. It includes data from heterogeneous sources, including process measurements, alarm records, high frequency ultrasonic flow and pressure measurements, an operation log and video recordings. The study collected data from various operational conditions with and without induced faults to generate a multi-rate, multi-modal dataset. The dataset is suitable for developing and validating algorithms for fault detection and diagnosis (FDD) and data fusion.

    When using the dataset please cite the following publication:

    A. Stief, R. Tan, Y. Cao, J. R. Ottewill, N. F. Thornhill, J. Baranowski, A heterogeneous benchmark dataset for data analytics: Multiphase flow facility case study, Journal of Process Control, 79 (2019) 41–55, DOI: https://doi.org/10.1016/j.jprocont.2019.04.009

    The dataset has been used in the following works:

    A. Stief, R. Tan, Y. Cao, J. R. Ottewill. Analytics of heterogeneous process data: Multiphase flow facility case study. IFAC-PapersOnLine, 51(18):363–368, 2018. DOI: https://doi.org/10.1016/j.ifacol.2018.09.327

    A. Stief, J. R. Ottewill, R. Tan, Y. Cao. Process and alarm data integration under a two-stage Bayesian framework for fault diagnostics. IFAC-PapersOnLine, 51(24):1220–1226, 2018. DOI: https://doi.org/10.1016/j.ifacol.2018.09.696

    A. Stief, J. R. Ottewill, J. Baranowski. Investigation of the diagnostic properties of sensors and features in a multiphase flow facility case study, in: 12th IFAC Symposium on Dynamics and Control of Process Systems (in press), 2019

    M. Lucke, X. Mei, A. Stief, M. Chioua, N. F. Thornhill. Variable selection for fault detection and identification based on mutual information of multi-valued alarm series, in: 12th IFAC Symposium on Dynamics and Control of Process Systems (in press), 2019

    R. Tan, T. Cong, N. F. Thornhill, J. R. Ottewill, J. Baranowski. Statistical monitoring of processes with multiple operating modes, in: 12th IFAC Symposium on Dynamics and Control of Process Systems (in press), 2019.

  18. Data Analytics Case Study – Case 1 Project

    • kaggle.com
    Updated Oct 8, 2022
    Anas Aljarrah (2022). Data Analytics Case Study – Case 1 Project [Dataset]. https://www.kaggle.com/cascert/data-analytics-case-study-case-1-project/discussion
    Explore at:
    Dataset updated
    Oct 8, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Anas Aljarrah
    Description

    How Does a Bike-Share Navigate Speedy Success?

    This is a case study for the Google Data Analytics Certificate. The project covers the business task, hypotheses, data pipeline, data visualization and insight finding. If you think this notebook is helpful or needs improvement, please upvote this project. Thank you! Should you have any suggestions or further questions, please don't hesitate to leave a comment.

    LinkedIn: /in/anas-aljarrah/

  19. Case Analysis Tracking System

    • datasets.ai
    • cloud.csiss.gmu.edu
    • +2 more
    Updated Sep 8, 2024
    + more versions
    National Archives and Records Administration (2024). Case Analysis Tracking System [Dataset]. https://datasets.ai/datasets/case-analysis-tracking-system
    Explore at:
    Dataset updated
    Sep 8, 2024
    Dataset authored and provided by
    National Archives and Records Administration (http://www.archives.gov/)
    Description

    CATS tracks Public and Federal Agency Reference Requests for OPF (Official Personnel Folder), EMF (Employee Medical Folder), and eOPF (electronic Official Personnel Folder) Records.

  20. Supporting Clean-Up of Contaminated Sites with Decision Analysis: A Case...

    • catalog.data.gov
    • datasets.ai
    Updated Dec 6, 2021
    U.S. EPA Office of Research and Development (ORD) (2021). Supporting Clean-Up of Contaminated Sites with Decision Analysis: A Case Study on Prioritization of Remediation Alternatives in Superfund [Dataset]. https://catalog.data.gov/dataset/supporting-clean-up-of-contaminated-sites-with-decision-analysis-a-case-study-on-prioritiz
    Explore at:
    Dataset updated
    Dec 6, 2021
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    The summary from the detailed analysis of the case study in EPA (1988b) is provided in Table 3 of the manuscript, and was used as the data source for the two datasets used in this study. These include a flat and a hierarchical structure of the five balancing criteria, shown in Table 4 and Table 5, respectively. Table 4 provides a comprehensive score for each balancing criterion, similar to the summary tables presented in the FS of Superfund sites (e.g., EPA 2016b, AECOM 2019). Table 5 uses the same information as Table 3, but in this case each piece of information is used to define multiple sub-criteria for each balancing criterion except cost. This leads to a much more elaborate information table in which the four remaining balancing criteria are characterized by 13 sub-criteria. It is important to note that the scoring provided in Table 4 and Table 5, with the exception of the cost (c_5), was derived from the author’s interpretation of the descriptive language of the detailed analysis for the hypothetical case study presented in Table A-7 in Appendix A of the guidance document of EPA (1988b). It should be noted that the analysis of the three remedy alternatives presented in this hypothetical case study is governed by site-specific characteristics and may not represent potential performance of these remediation alternatives for other sites. The intent of this exercise is to illustrate the flexibility and adaptability of the MCDA process to address both the main, overarching criteria, as well as sub-criteria that may have specific importance in the decision process for a particular site. Ultimately, the sub-criteria can be adapted to address specific stakeholder perspectives or technical factors that may be linked to properties unique to the contaminant or physical characteristics of the site.

    This dataset is associated with the following publication: Cinelli, M., M.A. Gonzalez, R. Ford, J. McKernan, S. Corrente, M. Kadziński, and R. Słowiński. Supporting contaminated sites management with Multiple Criteria Decision Analysis: Demonstration of a regulation-consistent approach. JOURNAL OF CLEANER PRODUCTION. Elsevier Science Ltd, New York, NY, USA, 316: 128347, (2021).

Data Analytics Case Study 1 Cyclistics Bike Share

My Analysis for the Google Capstone Data Analytics Certification Course
