100+ datasets found
  1. Sales Data Analysis Project

    • kaggle.com
    zip
    Updated Jun 1, 2024
    Cite
    Stina Tonia (2024). Sales Data Analysis Project [Dataset]. https://www.kaggle.com/datasets/stinatonia/2019-project-on-sales
    Explore at:
    Available download formats: zip (3818151 bytes)
    Dataset updated
    Jun 1, 2024
    Authors
    Stina Tonia
    Description

    This project analyzes sales data to identify trends, top-selling products, and revenue metrics for business decision-making. I completed this project, offered by MeriSKILL, to gain exposure to real-world projects and challenges, build industry experience, and develop my data analysis skills. A 2019 sales dashboard image accompanies the original description. More on this project is on Medium.
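
    The description sketches a standard sales-analysis workflow. A minimal pandas sketch of that workflow is shown below; the file name and column names ("Sales Data.csv", "Order Date", "Product", "Quantity Ordered", "Price Each") are assumptions for illustration, not taken from the dataset itself.

    # Hedged sketch: top-selling products and monthly revenue from a sales CSV.
    # File and column names are assumed, not confirmed by the dataset listing.
    import pandas as pd

    sales = pd.read_csv("Sales Data.csv", parse_dates=["Order Date"])
    sales["Revenue"] = sales["Quantity Ordered"] * sales["Price Each"]

    # Top-selling products by units sold
    top_products = (sales.groupby("Product")["Quantity Ordered"]
                         .sum()
                         .sort_values(ascending=False)
                         .head(10))

    # Monthly revenue trend for a dashboard-style view
    monthly_revenue = sales.groupby(sales["Order Date"].dt.to_period("M"))["Revenue"].sum()

    print(top_products)
    print(monthly_revenue)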

  2. COVID-19 data analysis project using MySQL.

    • kaggle.com
    zip
    Updated Dec 1, 2024
    Cite
    Shourya Negi (2024). COVID-19 data analysis project using MySQL. [Dataset]. https://www.kaggle.com/datasets/shouryanegi/covid-19-data-analysis-project-using-mysql
    Explore at:
    Available download formats: zip (2253676 bytes)
    Dataset updated
    Dec 1, 2024
    Authors
    Shourya Negi
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    This dataset contains detailed information about the COVID-19 pandemic. The inspiration behind this dataset is to analyze trends, identify patterns, and understand the global impact of COVID-19 through SQL queries. It is designed for anyone interested in data exploration and real-world analytics.
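
    Since the description centers on SQL queries, here is a hedged Python sketch of how such a query might be run once the data are loaded into MySQL, using the mysql-connector-python client. The connection details and the table/column names (covid_stats, country, report_date, new_cases, new_deaths) are placeholders, not the dataset's actual schema.

    # Illustrative only: a case-fatality trend query over a hypothetical schema.
    import mysql.connector

    conn = mysql.connector.connect(
        host="localhost", user="analyst", password="***", database="covid"  # placeholders
    )
    cur = conn.cursor()
    cur.execute(
        """
        SELECT country,
               report_date,
               SUM(new_deaths) / NULLIF(SUM(new_cases), 0) AS case_fatality_rate
        FROM covid_stats
        GROUP BY country, report_date
        ORDER BY country, report_date
        """
    )
    for country, report_date, cfr in cur.fetchall():
        print(country, report_date, cfr)
    conn.close()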

  3. Python Codes for Data Analysis of The Impact of COVID-19 on Technical...

    • dataverse.harvard.edu
    • figshare.com
    Updated Mar 21, 2022
    Cite
    Elizabeth Szkirpan (2022). Python Codes for Data Analysis of The Impact of COVID-19 on Technical Services Units Survey Results [Dataset]. http://doi.org/10.7910/DVN/SXMSDZ
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 21, 2022
    Dataset provided by
    Harvard Dataverse
    Authors
    Elizabeth Szkirpan
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Copies of Anaconda 3 Jupyter Notebooks and Python script for holistic and clustered analysis of "The Impact of COVID-19 on Technical Services Units" survey results. Data was analyzed holistically using cleaned and standardized survey results and by library type clusters. To streamline data analysis in certain locations, an off-shoot CSV file was created so data could be standardized without compromising the integrity of the parent clean file. Three Jupyter Notebooks/Python scripts are available in relation to this project: COVID_Impact_TechnicalServices_HolisticAnalysis (a holistic analysis of all survey data) and COVID_Impact_TechnicalServices_LibraryTypeAnalysis (a clustered analysis of impact by library type, clustered files available as part of the Dataverse for this project).
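
    As a rough illustration of the "holistic versus by library type" pattern described above, the pandas sketch below groups a cleaned survey file by a library-type column; the file and column names ("survey_clean.csv", "library_type", "remote_work_status") are assumptions, not the notebooks' actual identifiers.

    # Illustrative sketch only; not the released Jupyter Notebooks.
    import pandas as pd

    survey = pd.read_csv("survey_clean.csv")

    # Holistic view: summary of every question across all respondents
    holistic = survey.describe(include="all")

    # Clustered view: distribution of one (assumed) question by library type
    by_type = (survey.groupby("library_type")["remote_work_status"]
                     .value_counts(normalize=True))

    print(holistic)
    print(by_type)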

  4. GLobal Ocean Data Analysis Project (GLODAP) version 1.1 (NCEI Accession...

    • catalog.data.gov
    • data.cnra.ca.gov
    • +5more
    Updated Nov 1, 2025
    + more versions
    Cite
    (Point of Contact) (2025). GLobal Ocean Data Analysis Project (GLODAP) version 1.1 (NCEI Accession 0001644) [Dataset]. https://catalog.data.gov/dataset/global-ocean-data-analysis-project-glodap-version-1-1-ncei-accession-00016441
    Explore at:
    Dataset updated
    Nov 1, 2025
    Dataset provided by
    (Point of Contact)
    Description

    The GLobal Ocean Data Analysis Project (GLODAP) is a cooperative effort to coordinate global synthesis projects funded through NOAA/DOE and NSF as part of the Joint Global Ocean Flux Study - Synthesis and Modeling Project (JGOFS-SMP). Cruises conducted as part of the World Ocean Circulation Experiment (WOCE), Joint Global Ocean Flux Study (JGOFS) and NOAA Ocean-Atmosphere Exchange Study (OACES) over the decade of the 1990s have created an oceanographic database of unparalleled quality and quantity. These data provide an important asset to the scientific community investigating carbon cycling in the oceans.

  5. Covid - 19 Data Analysis Project using Python

    • kaggle.com
    zip
    Updated May 24, 2025
    Cite
    Nisheet Lakra (2025). Covid - 19 Data Analysis Project using Python [Dataset]. https://www.kaggle.com/datasets/nisheetlakra/covid-19-data-analysis-project-using-python
    Explore at:
    Available download formats: zip (1996732 bytes)
    Dataset updated
    May 24, 2025
    Authors
    Nisheet Lakra
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    This dataset provides a comprehensive, time-series record of the global COVID-19 pandemic, including daily counts of confirmed cases, deaths, and recoveries across multiple countries and regions. It is designed to support data scientists, researchers, and public health professionals in conducting exploratory data analysis, forecasting, and impact assessment studies related to the spread and consequences of the virus.
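
    A minimal pandas sketch of the exploratory analysis this description targets appears below; the file name and columns ("covid_19_data.csv", "Date", "Country", "Confirmed") are assumptions about the layout rather than the dataset's documented header.

    # Hedged sketch: daily new cases and a 7-day rolling average per country.
    import pandas as pd

    df = pd.read_csv("covid_19_data.csv", parse_dates=["Date"]).sort_values(["Country", "Date"])

    # Derive daily new confirmed cases from cumulative counts
    df["NewConfirmed"] = df.groupby("Country")["Confirmed"].diff().fillna(0)

    # Smooth reporting artifacts with a 7-day rolling mean
    df["NewConfirmed7d"] = (df.groupby("Country")["NewConfirmed"]
                              .transform(lambda s: s.rolling(7, min_periods=1).mean()))

    print(df.groupby("Country")["NewConfirmed7d"].max().nlargest(10))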

  6. Protected Areas Database of the United States (PAD-US) 3.0 Vector Analysis...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Oct 22, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Protected Areas Database of the United States (PAD-US) 3.0 Vector Analysis and Summary Statistics [Dataset]. https://catalog.data.gov/dataset/protected-areas-database-of-the-united-states-pad-us-3-0-vector-analysis-and-summary-stati
    Explore at:
    Dataset updated
    Oct 22, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    United States
    Description

    Spatial analysis and statistical summaries of the Protected Areas Database of the United States (PAD-US) provide land managers and decision makers with a general assessment of management intent for biodiversity protection, natural resource management, and recreation access across the nation. The PAD-US 3.0 Combined Fee, Designation, Easement feature class (with Military Lands and Tribal Areas from the Proclamation and Other Planning Boundaries feature class) was modified to remove overlaps, avoiding overestimation in protected area statistics, and to support user needs. A Python scripted process ("PADUS3_0_CreateVectorAnalysisFileScript.zip") associated with this data release prioritized overlapping designations (e.g. Wilderness within a National Forest) based upon their relative biodiversity conservation status (e.g. GAP Status Code 1 over 2), public access values (in the order of Closed, Restricted, Open, Unknown), and geodatabase load order (records are deliberately organized in the PAD-US full inventory with fee owned lands loaded before overlapping management designations and easements).

    The Vector Analysis File ("PADUS3_0VectorAnalysisFile_ClipCensus.zip"), an associated item of PAD-US 3.0 Spatial Analysis and Statistics ( https://doi.org/10.5066/P9KLBB5D ), was clipped to the Census state boundary file to define the extent and serve as a common denominator for statistical summaries. Boundaries of interest to stakeholders (State, Department of the Interior Region, Congressional District, County, EcoRegions I-IV, Urban Areas, Landscape Conservation Cooperative) were incorporated into separate geodatabase feature classes to support various data summaries ("PADUS3_0VectorAnalysisFileOtherExtents_Clip_Census.zip"). Comma-separated Value (CSV) tables ("PADUS3_0SummaryStatistics_TabularData_CSV.zip") summarizing "PADUS3_0VectorAnalysisFileOtherExtents_Clip_Census.zip" are provided as an alternative format and enable users to explore and download summary statistics of interest (Comma-separated Table [CSV], Microsoft Excel Workbook [.XLSX], Portable Document Format [.PDF] Report) from the PAD-US Lands and Inland Water Statistics Dashboard ( https://www.usgs.gov/programs/gap-analysis-project/science/pad-us-statistics ).

    In addition, a "flattened" version of the PAD-US 3.0 combined file without other extent boundaries ("PADUS3_0VectorAnalysisFile_ClipCensus.zip") allows for other applications that require a representation of overall protection status without overlapping designation boundaries. The "PADUS3_0VectorAnalysis_State_Clip_CENSUS2020" feature class ("PADUS3_0VectorAnalysisFileOtherExtents_Clip_Census.gdb") is the source of the PAD-US 3.0 raster files (associated item of PAD-US 3.0 Spatial Analysis and Statistics, https://doi.org/10.5066/P9KLBB5D ).

    Note, the PAD-US inventory is now considered functionally complete, with the vast majority of land protection types represented in some manner, while work continues to maintain updates and improve data quality (see inventory completeness estimates at: http://www.protectedlands.net/data-stewards/ ). In addition, changes in protected area status between versions of the PAD-US may be attributed to improving the completeness and accuracy of the spatial data more than to actual management actions or new acquisitions. USGS provides no legal warranty for the use of this data. While PAD-US is the official aggregation of protected areas ( https://www.fgdc.gov/ngda-reports/NGDA_Datasets.html ), agencies are the best source of their lands data.
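
    Once the CSV summary tables are extracted, they can be explored with ordinary tabular tools. The pandas sketch below is illustrative only: the file and column names ("State_GAPStatus.csv", "State_Name", "GAP_Sts", "GIS_Acres") are guesses at the release's schema, so consult the files inside "PADUS3_0SummaryStatistics_TabularData_CSV.zip" for the real ones.

    # Hedged sketch: acreage by GAP status per state from an assumed summary table.
    import pandas as pd

    stats = pd.read_csv("State_GAPStatus.csv")

    acres_by_status = (stats.pivot_table(index="State_Name",
                                         columns="GAP_Sts",
                                         values="GIS_Acres",
                                         aggfunc="sum")
                            .fillna(0))
    print(acres_by_status.head())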

  7. Walmart Products Dataset – Free Product Data CSV

    • crawlfeeds.com
    csv, zip
    Updated Dec 2, 2025
    Cite
    Crawl Feeds (2025). Walmart Products Dataset – Free Product Data CSV [Dataset]. https://crawlfeeds.com/datasets/walmart-products-free-dataset
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Dec 2, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Looking for a free Walmart product dataset? The Walmart Products Free Dataset delivers a ready-to-use ecommerce product data CSV containing ~2,100 verified product records from Walmart.com. It includes vital details like product titles, prices, categories, brand info, availability, and descriptions — perfect for data analysis, price comparison, market research, or building machine-learning models.

    Key Features

    Complete Product Metadata: Each entry includes URL, title, brand, SKU, price, currency, description, availability, delivery method, average rating, total ratings, image links, unique ID, and timestamp.

    CSV Format, Ready to Use: Download instantly - no need for scraping, cleaning or formatting.

    Good for E-commerce Research & ML: Ideal for product cataloging, price tracking, demand forecasting, recommendation systems, or data-driven projects.

    Free & Easy Access: Priced at USD $0.0, making it a great starting point for developers, data analysts or students.

    Who Benefits?

    • Data analysts & researchers exploring e-commerce trends or product catalog data.
    • Developers & data scientists building price-comparison tools, recommendation engines or ML models.
    • E-commerce strategists/marketers needing product metadata for competitive analysis or market research.
    • Students/hobbyists needing a free dataset for learning or demo projects.

    Why Use This Dataset Instead of Manual Scraping?

    • Time-saving: No need to write scrapers or deal with rate limits.
    • Clean, structured data: All records are verified and already formatted in CSV, saving hours of cleaning.
    • Risk-free: Avoid Terms-of-Service issues or IP blocks that come with manual scraping.
    • Instant access: Free and immediately downloadable.
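
    As a quick illustration of the price-comparison use case above, the pandas sketch below loads the CSV and summarizes prices by category; the file name and exact column names ("walmart_products.csv", "category", "brand", "price") are assumptions based on the field list, not the dataset's documented header.

    # Illustrative sketch only.
    import pandas as pd

    products = pd.read_csv("walmart_products.csv")

    # Average price per category
    avg_price = products.groupby("category")["price"].mean().sort_values(ascending=False)

    # Brands with the widest catalogue coverage
    top_brands = products["brand"].value_counts().head(10)

    print(avg_price.head(10))
    print(top_brands)
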
  8. Google Data Analytics Capstone Project

    • kaggle.com
    zip
    Updated Nov 13, 2021
    + more versions
    Cite
    NANCY CHAUHAN (2021). Google Data Analytics Capstone Project [Dataset]. https://www.kaggle.com/datasets/nancychauhan199/google-case-study-pdf
    Explore at:
    Available download formats: zip (284279 bytes)
    Dataset updated
    Nov 13, 2021
    Authors
    NANCY CHAUHAN
    Description

    Case Study: How Does a Bike-Share Navigate Speedy Success?

    Introduction

    Welcome to the Cyclistic bike-share analysis case study! In this case study, you will perform many real-world tasks of a junior data analyst. You will work for a fictional company, Cyclistic, and meet different characters and team members. In order to answer the key business questions, you will follow the steps of the data analysis process: ask, prepare, process, analyze, share, and act. Along the way, the Case Study Roadmap tables — including guiding questions and key tasks — will help you stay on the right path. By the end of this lesson, you will have a portfolio-ready case study. Download the packet and reference the details of this case study anytime. Then, when you begin your job hunt, your case study will be a tangible way to demonstrate your knowledge and skills to potential employers.

    Scenario

    You are a junior data analyst working in the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, your team wants to understand how casual riders and annual members use Cyclistic bikes differently. From these insights, your team will design a new marketing strategy to convert casual riders into annual members. But first, Cyclistic executives must approve your recommendations, so they must be backed up with compelling data insights and professional data visualizations.

    Characters and teams

    ● Cyclistic: A bike-share program that features more than 5,800 bicycles and 600 docking stations. Cyclistic sets itself apart by also offering reclining bikes, hand tricycles, and cargo bikes, making bike-share more inclusive to people with disabilities and riders who can’t use a standard two-wheeled bike. The majority of riders opt for traditional bikes; about 8% of riders use the assistive options. Cyclistic users are more likely to ride for leisure, but about 30% use them to commute to work each day.

    ● Lily Moreno: The director of marketing and your manager. Moreno is responsible for the development of campaigns and initiatives to promote the bike-share program. These may include email, social media, and other channels.

    ● Cyclistic marketing analytics team: A team of data analysts who are responsible for collecting, analyzing, and reporting data that helps guide Cyclistic marketing strategy. You joined this team six months ago and have been busy learning about Cyclistic’s mission and business goals — as well as how you, as a junior data analyst, can help Cyclistic achieve them.

    ● Cyclistic executive team: The notoriously detail-oriented executive team will decide whether to approve the recommended marketing program.

    About the company

    In 2016, Cyclistic launched a successful bike-share offering. Since then, the program has grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system anytime. Until now, Cyclistic’s marketing strategy relied on building general awareness and appealing to broad consumer segments. One approach that helped make these things possible was the flexibility of its pricing plans: single-ride passes, full-day passes, and annual memberships. Customers who purchase single-ride or full-day passes are referred to as casual riders. Customers who purchase annual memberships are Cyclistic members. Cyclistic’s finance analysts have concluded that annual members are much more profitable than casual riders. Although the pricing flexibility helps Cyclistic attract more customers, Moreno believes that maximizing the number of annual members will be key to future growth. Rather than creating a marketing campaign that targets all-new customers, Moreno believes there is a very good chance to convert casual riders into members. She notes that casual riders are already aware of the Cyclistic program and have chosen Cyclistic for their mobility needs. Moreno has set a clear goal: Design marketing strategies aimed at converting casual riders into annual members. In order to do that, however, the marketing analyst team needs to better understand how annual members and casual riders differ, why casual riders would buy a membership, and how digital media could affect their marketing tactics. Moreno and her team are interested in analyzing the Cyclistic historical bike trip data to identify trends

    Three questions will guide the future marketing program:

    1. How do annual members and casual riders use Cyclistic bikes differently?
    2. Why would casual riders buy Cyclistic annual memberships?
    3. How can Cyclistic use digital media to influence casual riders to become members?

    Moreno has assigned you the first question to answer: How do annual members and casual rid...
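
    As a hedged sketch of how the first question might be approached in Python (rather than the course's own tools), the snippet below compares ride counts and average ride length for members versus casual riders; the trip file name and column names ("divvy_trips.csv", "started_at", "ended_at", "member_casual") are assumptions, not part of this packet.

    # Illustrative sketch only.
    import pandas as pd

    trips = pd.read_csv("divvy_trips.csv", parse_dates=["started_at", "ended_at"])
    trips["ride_length_min"] = (trips["ended_at"] - trips["started_at"]).dt.total_seconds() / 60
    trips["weekday"] = trips["started_at"].dt.day_name()

    # Usage differences: ride counts and average length by rider type and weekday
    summary = (trips.groupby(["member_casual", "weekday"])
                    .agg(rides=("ride_length_min", "size"),
                         avg_minutes=("ride_length_min", "mean"))
                    .round(1))
    print(summary)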

  9. EMEC Wildlife Data Analysis Project - Cleansed Data

    • dtechtive.com
    • find.data.gov.scot
    zip
    Updated Jan 7, 2020
    Cite
    Marine Scotland (2020). EMEC Wildlife Data Analysis Project - Cleansed Data [Dataset]. https://dtechtive.com/datasets/19784
    Explore at:
    Available download formats: zip (4.8759 MB), zip (1.2569 MB), zip (0.0296 MB), zip (0.0665 MB)
    Dataset updated
    Jan 7, 2020
    Dataset provided by
    Marine Directorate (https://www.gov.scot/about/how-government-is-run/directorates/marine-scotland/)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    European Marine Energy Centre (EMEC) wildlife observation data has been collected from the marine renewable test sites at Billia Croo and Fall of Warness in Orkney. These data have been processed, cleansed, and used in reports prepared by EMEC for SNH and Marine Scotland.

  10. Data release for solar-sensor angle analysis subset associated with the...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Nov 27, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Data release for solar-sensor angle analysis subset associated with the journal article "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States" [Dataset]. https://catalog.data.gov/dataset/data-release-for-solar-sensor-angle-analysis-subset-associated-with-the-journal-article-so
    Explore at:
    Dataset updated
    Nov 27, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Western United States, United States
    Description

    This dataset provides geospatial location data and scripts used to analyze the relationship between MODIS-derived NDVI and solar and sensor angles in a pinyon-juniper ecosystem in Grand Canyon National Park. The data are provided in support of the following publication: "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States". The data and scripts allow users to replicate, test, or further explore results.

    The file GrcaScpnModisCellCenters.csv contains locations (latitude-longitude) of all the 250-m MODIS (MOD09GQ) cell centers associated with the Grand Canyon pinyon-juniper ecosystem that the Southern Colorado Plateau Network (SCPN) is monitoring through its land surface phenology and integrated upland monitoring programs. The file SolarSensorAngles.csv contains MODIS angle measurements for the pixel at the phenocam location plus a random 100-point subset of pixels within the GRCA-PJ ecosystem. The script files (folder: 'Code') consist of 1) a Google Earth Engine (GEE) script used to download MODIS data through the GEE javascript interface, and 2) a script used to calculate derived variables and to test relationships between solar and sensor angles and NDVI using the statistical software package 'R'.

    The file Fig_8_NdviSolarSensor.JPG shows NDVI dependence on solar and sensor geometry demonstrated for both a single pixel/year and for multiple pixels over time. (Left) MODIS NDVI versus solar-to-sensor angle for the Grand Canyon phenocam location in 2018, the year for which there is corresponding phenocam data. (Right) Modeled r-squared values by year for 100 randomly selected MODIS pixels in the SCPN-monitored Grand Canyon pinyon-juniper ecosystem. The model for forward-scatter MODIS-NDVI is log(NDVI) ~ solar-to-sensor angle. The model for back-scatter MODIS-NDVI is log(NDVI) ~ solar-to-sensor angle + sensor zenith angle. Boxplots show interquartile ranges; whiskers extend to 10th and 90th percentiles. The horizontal line marking the average median value for forward-scatter r-squared (0.835) is nearly indistinguishable from the back-scatter line (0.833).

    The dataset folder also includes supplemental R-project and packrat files that allow the user to apply the workflow by opening a project that will use the same package versions used in this study (e.g., the folders Rproj.user and packrat, and the files .RData and PhenocamPR.Rproj). The empty folder GEE_DataAngles is included so that the user can save the data files from the Google Earth Engine scripts to this location, where they can then be incorporated into the R processing scripts without needing to change folder names. To successfully use the packrat information to replicate the exact processing steps that were used, the user should refer to the packrat documentation available at https://cran.r-project.org/web/packages/packrat/index.html and at https://www.rdocumentation.org/packages/packrat/versions/0.5.0. Alternatively, the user may use the phenopix package documentation and the description/references provided in the associated journal article to process the data and achieve the same results using newer packages or other software programs.
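
    The two models above can be re-expressed in Python as a hedged sketch (the release's own scripts use R and remain the authoritative version); the column names ("ndvi", "solar_to_sensor_angle", "sensor_zenith", "scatter_direction") are placeholders for whatever SolarSensorAngles.csv actually uses.

    # Illustrative re-expression of the forward- and back-scatter models.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    angles = pd.read_csv("SolarSensorAngles.csv")
    angles["log_ndvi"] = np.log(angles["ndvi"])

    fwd = angles[angles["scatter_direction"] == "forward"]
    back = angles[angles["scatter_direction"] == "back"]

    # Forward scatter: log(NDVI) ~ solar-to-sensor angle
    m_fwd = smf.ols("log_ndvi ~ solar_to_sensor_angle", data=fwd).fit()

    # Back scatter: log(NDVI) ~ solar-to-sensor angle + sensor zenith angle
    m_back = smf.ols("log_ndvi ~ solar_to_sensor_angle + sensor_zenith", data=back).fit()

    print(m_fwd.rsquared, m_back.rsquared)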

  11. Water Analysis Project Summary Sheet Updated 4-14-2023

    • catalog.data.gov
    • data.wa.gov
    • +1more
    Updated Apr 22, 2023
    Cite
    data.wa.gov (2023). Water Analysis Project Summary Sheet Updated 4-14-2023 [Dataset]. https://catalog.data.gov/dataset/water-analysis-project-summary-sheet-2-02-2023
    Explore at:
    Dataset updated
    Apr 22, 2023
    Dataset provided by
    data.wa.gov
    Description

    Provides access to content and methods developed for the 2022 State of Salmon Water Analysis Project as produced under RCO contract #22-1587 with SBGH-Partners, dated April 18, 2022.

  12. Insurance Dataset

    • gts.ai
    json
    Updated Oct 16, 2022
    Cite
    GTS (2022). Insurance Dataset [Dataset]. https://gts.ai/case-study/insurance-dataset-annotation-services-for-precision-data-analysis/
    Explore at:
    Available download formats: json
    Dataset updated
    Oct 16, 2022
    Dataset provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    Authors
    GTS
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The Insurance Dataset project is an extensive initiative focused on collecting and analyzing insurance-related data from various sources.

  13. An experiment on the reliability analysis of megaproject sustainability

    • data.mendeley.com
    • narcis.nl
    Updated Jan 5, 2021
    Cite
    Zhen Chen (2021). An experiment on the reliability analysis of megaproject sustainability [Dataset]. http://doi.org/10.17632/gy2h2ybtjg.1
    Explore at:
    Dataset updated
    Jan 5, 2021
    Authors
    Zhen Chen
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Hypothesis: The reliability can be adopted to quantitatively measure the sustainability of mega-projects.

    Presentation: This dataset shows two scenario-based examples to establish an initial reliability assessment of megaproject sustainability. Data were generated from the author’s assumptions about the differences between scenarios A and B. There are two sheets in this Microsoft Excel file: a comparison between the two scenarios using a Fault Tree Analysis model, and a correlation analysis between reliability and unavailability.

    Notable findings: It has been found from this exploratory experiment that reliability can be used to quantitatively measure megaproject sustainability, and that there is a negative correlation between reliability and unavailability among the 11 related events associated with sustainability goals in the life-cycle of a megaproject.

    Interpretation: Results from data analysis using the two sheets can be useful to inform decision-making on megaproject sustainability. For example, the reliability of achieving sustainability goals can be enhanced by decreasing the unavailability, or failure, at individual work stages in megaproject delivery.

    Implication: This dataset file can be used to perform reliability analysis in other experiments to assess megaproject sustainability.
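
    A toy numerical sketch of the reliability logic described above is given below; the per-event unavailability values are invented for illustration and do not come from the Excel workbook.

    # Toy example: scenario comparison and the reliability/unavailability link.
    import numpy as np

    # Assumed unavailability of 11 events for scenario A; scenario B is assumed worse.
    unavail_a = np.array([0.02, 0.05, 0.03, 0.04, 0.01, 0.06, 0.02, 0.03, 0.05, 0.04, 0.02])
    unavail_b = unavail_a * 1.5

    # Series-style reliability: every event must succeed for the goal to be met.
    rel_a = np.prod(1 - unavail_a)
    rel_b = np.prod(1 - unavail_b)
    print(f"Scenario A reliability: {rel_a:.3f}, Scenario B reliability: {rel_b:.3f}")

    # Per-event reliability is 1 - unavailability, which is why the two measures
    # are negatively correlated.
    print(np.corrcoef(1 - unavail_a, unavail_a)[0, 1])  # -1.0 by construction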

  14. Wallmart Data Set for Project

    • kaggle.com
    zip
    Updated Jan 16, 2025
    Cite
    Sonal Anand (2025). Wallmart Data Set for Project [Dataset]. https://www.kaggle.com/datasets/sonalanand/wallmart-data-set-for-project
    Explore at:
    Available download formats: zip (125095 bytes)
    Dataset updated
    Jan 16, 2025
    Authors
    Sonal Anand
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Sonal Anand

    Released under Apache 2.0

    Contents

  15. Construction Data Analytics Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Oct 6, 2025
    Cite
    Growth Market Reports (2025). Construction Data Analytics Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/construction-data-analytics-market
    Explore at:
    Available download formats: pdf, csv, pptx
    Dataset updated
    Oct 6, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Construction Data Analytics Market Outlook

    According to our latest research, the global Construction Data Analytics market size reached USD 4.12 billion in 2024, with a robust growth trajectory fueled by the accelerating adoption of digital solutions across the construction sector. The market is expected to expand at a CAGR of 15.7% from 2025 to 2033, driving the market value to an estimated USD 14.33 billion by 2033. This significant growth is underpinned by the increasing demand for data-driven decision-making, enhanced project management efficiencies, and the necessity to mitigate risks in complex construction environments.

    One of the primary growth factors in the Construction Data Analytics market is the rapid digital transformation underway in the construction industry. As construction projects become more complex and the volume of data generated onsite increases, there is a growing need for sophisticated analytics platforms capable of aggregating, processing, and interpreting this information. Companies are leveraging data analytics to optimize resource allocation, streamline workflows, and improve overall project outcomes. The integration of analytics with Building Information Modeling (BIM), Internet of Things (IoT) sensors, and mobile technologies is empowering stakeholders to make real-time, informed decisions, thereby reducing delays and cost overruns. Furthermore, the global push towards smart cities and sustainable infrastructure is further propelling the adoption of advanced analytics tools within the construction sector.

    Another critical driver for the Construction Data Analytics market is the heightened focus on risk management and safety compliance. Construction remains one of the most hazardous industries, and the ability to proactively identify and address risks is paramount. Data analytics platforms are being deployed to analyze historical safety records, monitor real-time site conditions, and predict potential hazards before they escalate. This data-driven approach not only enhances worker safety but also ensures regulatory compliance and minimizes insurance liabilities for construction firms. As governments and regulatory bodies impose stricter safety mandates, the demand for robust analytics solutions is expected to surge, further bolstering market growth.

    Additionally, the increasing pressure on construction companies to deliver projects on time and within budget is catalyzing the adoption of data analytics. Delays and cost overruns are perennial challenges in the industry, often stemming from poor project management, supply chain disruptions, and unforeseen risks. Advanced analytics platforms enable stakeholders to gain granular visibility into project schedules, resource utilization, and supply chain performance. By harnessing predictive analytics and machine learning, companies can anticipate potential bottlenecks, optimize procurement strategies, and ensure timely project delivery. This growing emphasis on operational efficiency and transparency is a key factor driving the expansion of the Construction Data Analytics market globally.

    From a regional perspective, North America continues to command the largest share of the Construction Data Analytics market in 2024, accounting for approximately 38% of the global market value. This dominance is attributed to the early adoption of digital technologies, a strong presence of leading analytics vendors, and significant investments in infrastructure modernization. Meanwhile, the Asia Pacific region is witnessing the fastest growth, with a projected CAGR of 18.2% through 2033, driven by rapid urbanization, government-led smart city initiatives, and increasing digitization in emerging economies such as China and India. Europe also demonstrates steady growth, supported by stringent regulatory requirements and a strong focus on sustainable construction practices.

    Component Analysis

    The Component segment of the Construction Data Analytics market is bifurcated i

  16. UCI and OpenML Data Sets for Ordinal Quantification

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated Jul 25, 2023
    + more versions
    Cite
    Mirko Bunse; Alejandro Moreo; Fabrizio Sebastiani; Martin Senz (2023). UCI and OpenML Data Sets for Ordinal Quantification [Dataset]. http://doi.org/10.5281/zenodo.8177302
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 25, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Mirko Bunse; Alejandro Moreo; Fabrizio Sebastiani; Martin Senz
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    These four labeled data sets are targeted at ordinal quantification. The goal of quantification is not to predict the label of each individual instance, but the distribution of labels in unlabeled sets of data.

    With the scripts provided, you can extract CSV files from the UCI machine learning repository and from OpenML. The ordinal class labels stem from a binning of a continuous regression label.

    We complement this data set with the indices of data items that appear in each sample of our evaluation. Hence, you can precisely replicate our samples by drawing the specified data items. The indices stem from two evaluation protocols that are well suited for ordinal quantification. To this end, each row in the files app_val_indices.csv, app_tst_indices.csv, app-oq_val_indices.csv, and app-oq_tst_indices.csv represents one sample.

    Our first protocol is the artificial prevalence protocol (APP), where all possible distributions of labels are drawn with an equal probability. The second protocol, APP-OQ, is a variant thereof, where only the smoothest 20% of all APP samples are considered. This variant is targeted at ordinal quantification tasks, where classes are ordered and a similarity of neighboring classes can be assumed.

    Usage

    You can extract four CSV files through the provided script extract-oq.jl, which is conveniently wrapped in a Makefile. The Project.toml and Manifest.toml specify the Julia package dependencies, similar to a requirements file in Python.

    Preliminaries: You have to have a working Julia installation. We have used Julia v1.6.5 in our experiments.

    Data Extraction: In your terminal, you can call either

    make

    (recommended), or

    julia --project="." --eval "using Pkg; Pkg.instantiate()"
    julia --project="." extract-oq.jl

    Outcome: The first row in each CSV file is the header. The first column, named "class_label", is the ordinal class.
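
    As a hedged example of replicating one evaluation sample from the index files, the snippet below draws the rows listed in the first line of app_val_indices.csv from one extracted data set; the data file name ("fact.csv") and the assumption that the index files have no header row are placeholders to adjust against the extracted output.

    # Illustrative sketch only; "class_label" is the documented first column.
    import pandas as pd

    data = pd.read_csv("fact.csv")                          # placeholder name
    indices = pd.read_csv("app_val_indices.csv", header=None)

    sample_rows = indices.iloc[0].dropna().astype(int)      # first APP validation sample
    sample = data.iloc[sample_rows]

    # Quantification target: the label distribution of this sample
    print(sample["class_label"].value_counts(normalize=True).sort_index())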

    Further Reading

    Implementation of our experiments: https://github.com/mirkobunse/regularized-oq

  17. Reading Theta scores for Integrated Data Analysis

    • ldbase.org
    csv
    Updated Jun 3, 2024
    Cite
    Chris Schatschneider (2024). Reading Theta scores for Integrated Data Analysis [Dataset]. https://ldbase.org/datasets/e54f6aad-e0e2-4b40-9017-d4e347d9a497
    Explore at:
    Available download formats: csv
    Dataset updated
    Jun 3, 2024
    Authors
    Chris Schatschneider
    License

    Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
    License information was derived automatically

    Description

    This dataset contains the latent variable READ_THETA, which was estimated across 5 different projects using integrated data analysis.

  18. Data from: Global Ocean Data Analysis Project version 2.2020 (GLODAPv2.2020)...

    • s.cnmilf.com
    • catalog.data.gov
    Updated Oct 2, 2025
    + more versions
    Cite
    (Point of Contact) (2025). Global Ocean Data Analysis Project version 2.2020 (GLODAPv2.2020) (NCEI Accession 0210813) [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/global-ocean-data-analysis-project-version-2-2020-glodapv2-2020-ncei-accession-0210813
    Explore at:
    Dataset updated
    Oct 2, 2025
    Dataset provided by
    (Point of Contact)
    Description

    This dataset consists of the GLODAPv2.2020 data product composed of data from 946 scientific cruises covering the global ocean between 1972 and 2019. It includes full depth discrete bottle measurements of salinity, oxygen, nitrate, silicate, phosphate, dissolved inorganic carbon (TCO2), total alkalinity (TAlk), CO2 fugacity (fCO2), pH, chlorofluorocarbons (CFC-11, CFC-12, CFC-113, and CCl4), various isotopes and organic compounds. It was created by appending data from 106 cruises to GLODAPv2.2019 (Olsen et al., 2019, NCEI Accession 0186803). The data for salinity, oxygen, nitrate, silicate, phosphate, TCO2, TAlk, pH, CFC-11, CFC-12, CFC-113, and CCl4 were subjected to primary and secondary quality control. Severe biases in these data have been corrected for, and outliers removed. However, differences in data related to any known or likely time trends or variations have not been corrected for. These data are believed to be accurate to 0.005 in salinity, 1% in oxygen, 2% in nitrate, 2% in silicate, 2% in phosphate, 4 µmol kg-1 in TCO2, 4 µmol kg-1 in TAlk, and for the halogenated transient tracers: 5%.

  19. Coffee Shop Sales Analysis

    • kaggle.com
    Updated Apr 25, 2024
    Cite
    Monis Amir (2024). Coffee Shop Sales Analysis [Dataset]. https://www.kaggle.com/datasets/monisamir/coffee-shop-sales-analysis
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 25, 2024
    Dataset provided by
    Kaggle
    Authors
    Monis Amir
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)

    Description

    Analyzing Coffee Shop Sales: Excel Insights 📈

    In my first data analytics project, I uncover the secrets of a fictional coffee shop's success with a data-driven analysis. By analyzing a 5-sheet Excel dataset, I've uncovered valuable sales trends, customer preferences, and insights that can guide future business decisions. 📊☕

    DATA CLEANING 🧹

    • REMOVED DUPLICATES OR IRRELEVANT ENTRIES: Thoroughly eliminated duplicate records and irrelevant data to refine the dataset for analysis.

    • FIXED STRUCTURAL ERRORS: Rectified any inconsistencies or structural issues within the data to ensure uniformity and accuracy.

    • CHECKED FOR DATA CONSISTENCY: Verified the integrity and coherence of the dataset by identifying and resolving any inconsistencies or discrepancies.

    DATA MANIPULATION 🛠️

    • UTILIZED LOOKUPS: Used Excel's lookup functions for efficient data retrieval and analysis.

    • IMPLEMENTED INDEX MATCH: Leveraged the Index Match function to perform advanced data searches and matches.

    • APPLIED SUMIFS FUNCTIONS: Utilized SumIFs to calculate totals based on specified criteria.

    • CALCULATED PROFITS: Used relevant formulas and techniques to determine profit margins and insights from the data.

    PIVOTING THE DATA 𝄜

    • CREATED PIVOT TABLES: Utilized Excel's PivotTable feature to pivot the data for in-depth analysis.

    • FILTERED DATA: Utilized pivot tables to filter and analyze specific subsets of data, enabling focused insights, especially in the “PEAK HOURS” and “TOP 3 PRODUCTS” charts.
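
    For readers who prefer code to spreadsheets, a rough pandas analogue of the SUMIFS and PivotTable steps above might look like the sketch below; the workbook name and column names ("coffee_shop_sales.xlsx", "transaction_time", "product_type", "unit_price", "quantity") are assumptions, not the actual sheet headers.

    # Illustrative pandas analogue of the Excel workflow; not the original workbook logic.
    import pandas as pd

    sales = pd.read_excel("coffee_shop_sales.xlsx")
    sales["revenue"] = sales["unit_price"] * sales["quantity"]
    sales["hour"] = pd.to_datetime(sales["transaction_time"].astype(str)).dt.hour

    # "Peak hours": revenue pivoted by hour of day
    peak_hours = sales.pivot_table(index="hour", values="revenue", aggfunc="sum")

    # "Top 3 products": SUMIFS-style conditional totals per product type
    top3 = (sales.groupby("product_type")["revenue"].sum()
                 .sort_values(ascending=False)
                 .head(3))

    print(peak_hours)
    print(top3)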

    VISUALIZATION 📊

    • KEY INSIGHTS: Unveiled the grand total sales revenue while also analyzing the average bill per person, offering comprehensive insights into the coffee shop's performance and customer spending habits.

    • SALES TREND ANALYSIS: Used Line chart to compute total sales across various time intervals, revealing valuable insights into evolving sales trends.

    • PEAK HOUR ANALYSIS: Leveraged Clustered Column chart to identify peak sales hours, shedding light on optimal operating times and potential staffing needs.

    • TOP 3 PRODUCTS IDENTIFICATION: Utilized Clustered Bar chart to determine the top three coffee types, facilitating strategic decisions regarding inventory management and marketing focus.

    *I also used a Timeline to visualize chronological data trends and identify key patterns over specific times.

    While it's a significant milestone for me, I recognize that there's always room for growth and improvement. Your feedback and insights are invaluable to me as I continue to refine my skills and tackle future projects. I'm eager to hear your thoughts and suggestions on how I can make my next endeavor even more impactful and insightful.

    THANKS TO: WsCube Tech, Mo Chen, Alex Freberg

    TOOLS USED: Microsoft Excel

    #DataAnalytics #DataAnalyst #ExcelProject #DataVisualization #BusinessIntelligence #SalesAnalysis #DataAnalysis #DataDrivenDecisions

  20. Comprehensive Income by Age Group Dataset: Longitudinal Analysis of Gate, OK...

    • neilsberg.com
    Updated Aug 7, 2024
    + more versions
    Cite
    Neilsberg Research (2024). Comprehensive Income by Age Group Dataset: Longitudinal Analysis of Gate, OK Household Incomes Across 4 Age Groups and 16 Income Brackets. Annual Editions Collection // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/2ecf2ac3-aeee-11ee-aaca-3860777c1fe6/
    Explore at:
    Dataset updated
    Aug 7, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Oklahoma, Gate
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates Gate, OK household income by age. It can be used to understand the age-based income distribution of Gate households.

    Content

    The dataset includes the following sub-datasets, when applicable:

    Please note: The 2020 1-Year ACS estimates data was not reported by the Census Bureau due to the impact on survey collection and analysis caused by COVID-19. Consequently, median household income data for 2020 is unavailable for large cities (population 65,000 and above).

    • Gate, OK annual median income by age groups dataset (in 2022 inflation-adjusted dollars)
    • Age-wise distribution of Gate, OK household incomes: Comparative analysis across 16 income brackets

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability, and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Interested in deeper insights and visual analysis?

    Explore our comprehensive data analysis and visual representations for a deeper understanding of Gate income distribution by age. You can refer to the same here.
