6 datasets found
  1. Data from: Sales Performance

    Creating a dashboard to visualize the sales data for a fictional retail store.

    • kaggle.com
    Updated May 2, 2023
    Cite
    Babatunde Zenith (2023). Sales Performance [Dataset]. https://www.kaggle.com/datasets/babatundezenith/sales-viz/suggestions
    Explore at:
    Croissant — a format for machine-learning datasets; learn more at mlcommons.org/croissant.
    Dataset updated
    May 2, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Babatunde Zenith
    License

    Open Database License (ODbL) v1.0, https://www.opendatacommons.org/licenses/odbl/1.0/
    License information was derived automatically

    Description

    This fictional sales dataset was created with R to visualize trends in customer demographics, product performance, and sales over time. A link to my GitHub repository, containing all the code used to generate the data frame and all the preceding processing steps, can be found here.

  2. ‘Video Games Sales Dataset’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Dec 21, 2016
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2016). ‘Video Games Sales Dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-video-games-sales-dataset-1d34/779cd618/?iid=015-666&v=presentation
    Explore at:
    Dataset updated
    Dec 21, 2016
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Video Games Sales Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/sidtwr/videogames-sales-dataset on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    Motivated by Gregory Smith's web scrape of VGChartz Video Games Sales, this data set extends the number of variables with another web scrape from Metacritic. Unfortunately, there are missing observations, as Metacritic only covers a subset of the platforms, and a game may not have values for all of the additional variables discussed below. Complete cases are ~6,900.

    Content

    Alongside the fields Name, Platform, Year_of_Release, Genre, Publisher, NA_Sales, EU_Sales, JP_Sales, Other_Sales, and Global_Sales, we have:

    • Critic_score - Aggregate score compiled by Metacritic staff
    • Critic_count - The number of critics used in coming up with the Critic_score
    • User_score - Score by Metacritic's subscribers
    • User_count - Number of users who gave the User_score
    • Developer - Party responsible for creating the game
    • Rating - The ESRB ratings
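
    As a quick, hedged illustration in Python (the filename and exact column casing are assumptions about a local export of this dataset, not part of the original description):

      import pandas as pd

      # Placeholder filename for a local export of this dataset.
      df = pd.read_csv("video_games_sales.csv")

      # Metacritic user scores are sometimes the string "tbd"; coerce to numbers.
      df["User_score"] = pd.to_numeric(df["User_score"], errors="coerce")

      # Keep only complete cases; the description reports roughly 6,900.
      complete = df.dropna()
      print(f"{len(complete)} complete cases")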

    Acknowledgements

    This repository, https://github.com/wtamu-cisresearch/scraper, worked extremely well after a few adjustments!

    Inspiration

    It would be interesting to see machine learning techniques or further data visualisations applied to this data set.

    --- Original source retains full ownership of the source dataset ---

  3. StreamFuels: Continuously updated fuel sales datasets for forecasting,...

    • figshare.com
    Updated Jul 14, 2025
    Cite
    Lucas Gabriel Mendes Castro; André Gustavo da Rosa Ribeiro; Jean Paul Barddal; Alceu de Souza Britto Jr; Vinicius Mourao Alves Souza (2025). StreamFuels: Continuously updated fuel sales datasets for forecasting, classification, and pattern analysis [Dataset]. http://doi.org/10.6084/m9.figshare.29561942.v2
    Explore at:
    Available download formats: txt
    Dataset updated
    Jul 14, 2025
    Dataset provided by
    figshare
    Authors
    Lucas Gabriel Mendes Castro; André Gustavo da Rosa Ribeiro; Jean Paul Barddal; Alceu de Souza Britto Jr; Vinicius Mourao Alves Souza
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Description of the data and file structure

    We are providing five datasets in TSF (time series format). All files include the series name, start and end timestamps, product type, and the time series values. NOTE: these datasets contain values updated through May 2025. To obtain updated datasets, consider using the StreamFuels package, available at https://pypi.org/project/streamfuels/

    Files and variables

    • yearly_fuel_sales_by_state.tsf: 216 time series capturing the yearly historical sales of eight fuel products – ethanol, regular gasoline (gasoline-r), aviation gasoline (gasoline-a), liquefied petroleum gas (LPG), aviation kerosene (kerosene-a), illuminating kerosene (kerosene-i), fuel oil, and diesel – across the 27 Brazilian states. Of these, 135 series have records dating back to 1947, 27 begin in 1953, another 27 in 1959, and the remaining 27 in 1980.
    • monthly_oilgas_operations_by_state.tsf: 76 time series capturing the monthly records of five types of industrial operations – production, reinjection, flaring, self-consumption, and availability – for three products (natural gas, petroleum, and natural gas liquid (NGL)) across the 27 Brazilian states. Only natural gas includes all five operations; petroleum and NGL are limited to production. Of the total, 31 series start in 1997, while the remaining 45 date back to 2000.
    • yearly_fuel_sales_by_city.tsf: 29,282 time series capturing the yearly sales of eight fuels and asphalt (a petroleum derivative) across 5,325 Brazilian cities. Most series begin in 1990 or 1992, though some recent series begin in 2018.
    • fuel_type_classification.tsf: 14,032 time series, each with a fixed length of 12 observations (one year of sales) and eight possible class labels. It is the only labeled dataset of the five, with each series associated with a specific fuel type.
    • monthly_fuel_sales_by_state.tsf: 216 time series capturing the monthly historical sales of eight fuel products across Brazil's 27 states. All series date back to 1990.

    Code/software

    All code used to collect and preprocess the data is available at https://github.com/lucas-castrow/streamfuels/

    Access information

    Other publicly accessible locations of the data: https://github.com/lucas-castrow/datasets_streamfuels
    Data was derived from the following sources: https://www.gov.br/anp/en/ and https://www.gov.br/anp/pt-br/centrais-de-conteudo/dados-abertos/vendas-de-derivados-de-petroleo-e-biocombustiveis
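
    The TSF layout is not spelled out here; as a rough illustration, a minimal reader for the .tsf files above might look like the sketch below, assuming the common Monash-style layout (header lines starting with "@", comments with "#", and one "name:...:v1,v2,..." series per line after @data). Treat the layout details as assumptions; the StreamFuels package on PyPI is the supported way to obtain these series.

      def read_tsf(path):
          """Minimal .tsf reader sketch (assumes a Monash-style layout)."""
          attributes, series, in_data = [], [], False
          with open(path, encoding="utf-8") as f:
              for line in f:
                  line = line.strip()
                  if not line or line.startswith("#"):
                      continue  # skip blank lines and comments
                  if line.lower().startswith("@attribute"):
                      attributes.append(line.split()[1])  # attribute name
                  elif line.lower().startswith("@data"):
                      in_data = True
                  elif in_data:
                      *meta, values = line.split(":")
                      record = dict(zip(attributes, meta))
                      record["values"] = [float(v) if v != "?" else None
                                          for v in values.split(",")]
                      series.append(record)
          return series

      # Hypothetical usage:
      # for s in read_tsf("yearly_fuel_sales_by_state.tsf")[:3]:
      #     print(s.get("series_name"), len(s["values"]))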

  4. ‘Video Game Sales’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Nov 20, 2021
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Video Game Sales’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-video-game-sales-30b0/092867fa/?iid=010-909&v=presentation
    Explore at:
    Dataset updated
    Nov 20, 2021
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Video Game Sales’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/gregorut/videogamesales on 12 November 2021.

    --- Dataset description provided by original source is as follows ---

    This dataset contains a list of video games with sales greater than 100,000 copies. It was generated by a scrape of vgchartz.com.

    Fields include

    • Rank - Ranking of overall sales

    • Name - The game's name

    • Platform - Platform of the game's release (e.g., PC, PS4)

    • Year - Year of the game's release

    • Genre - Genre of the game

    • Publisher - Publisher of the game

    • NA_Sales - Sales in North America (in millions)

    • EU_Sales - Sales in Europe (in millions)

    • JP_Sales - Sales in Japan (in millions)

    • Other_Sales - Sales in the rest of the world (in millions)

    • Global_Sales - Total worldwide sales.

    The script used to scrape the data is available at https://github.com/GregorUT/vgchartzScrape; it is written in Python and uses BeautifulSoup. There are 16,598 records. Two records were dropped due to incomplete information.
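
    For example, a quick aggregation of these fields in Python with pandas (the filename is a guess for a local export of this dataset):

      import pandas as pd

      # Placeholder filename for a local export of the dataset.
      df = pd.read_csv("vgsales.csv")

      # Total worldwide sales (in millions of copies) by genre, best-selling first.
      by_genre = (df.groupby("Genre")["Global_Sales"]
                    .sum()
                    .sort_values(ascending=False))
      print(by_genre.head())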

    --- Original source retains full ownership of the source dataset ---

  5. GIS Data and Analysis for Cooling Demand and Environmental Impact in The...

    • zenodo.org
    • data.niaid.nih.gov
    Updated Apr 24, 2025
    Cite
    Simon van Lierde; Simon van Lierde (2025). GIS Data and Analysis for Cooling Demand and Environmental Impact in The Hague [Dataset]. http://doi.org/10.5281/zenodo.8344581
    Explore at:
    Dataset updated
    Apr 24, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Simon van Lierde; Simon van Lierde
    Area covered
    The Hague
    Description

    This dataset contains raw GIS data sourced from the BAG (Basisregistratie Adressen en Gebouwen; Registry of Addresses and Buildings). It provides comprehensive information on buildings, including detailed height data and administrative attributes, as well as geographic divisions within The Hague. Additionally, the dataset incorporates energy label data, offering insights into the energy efficiency and performance of these buildings. This combined dataset serves as the backbone of a Master's thesis in Industrial Ecology analysing residential and office cooling and its environmental impacts in The Hague, Netherlands. The codebase of this analysis can be found in this GitHub repository: https://github.com/simonvanlierde/msc-thesis-ie

    The dataset includes a background research spreadsheet containing supporting calculations, as well as GeoPackages with results from the cooling demand model (CDM) for various scenarios: status quo (SQ), 2030, and 2050 (Low, Medium, and High).

    Background research data

    The background_research_data.xlsx spreadsheet contains comprehensive background research calculations supporting the shaping of input parameters used in the model. It contains several sheets:

    • Cooling Technologies: Details the various cooling technologies examined in the study, summarizing their characteristics and the market penetration mixes used in the analysis.
    • LCA Results of Ventilation Systems: Provides an overview of the ecoinvent processes serving as proxies for the life-cycle impacts of cooling equipment, along with calculations of the weight of cooling systems and contribution tables from the LCA-based assessment.
    • Material Scarcity: A detailed examination of the critical raw material content in the material footprint of ecoinvent processes, representing cooling equipment.
    • Heat Plans per Neighbourhood: Forecasts of future heating solutions for each neighbourhood in The Hague.
    • Building Stock: Analysis of the projected growth trends in residential and office building stocks in The Hague.
    • AC Market: Market analysis covering air conditioner sales in the Netherlands from 2002 to 2022.
    • Climate Change: Computations of climate-related parameters based on KNMI climate scenarios.
    • Electricity Mix Analysis: Analysis of future projections for the Dutch electricity grid and calculations of life-cycle carbon intensities of the grid.

    Input data

    Geographic divisions

    • The outline of The Hague municipality through the Municipal boundaries (Gemeenten) layer, sourced from the Administrative boundaries (Bestuurlijke Gemeenten) dataset on the PDOK WFS service.
    • District (Wijken) and Neighbourhood (Buurten) layers were downloaded from the PDOK WFS service (from the CBS Wijken en Buurten 2022 data package) and clipped to the outline of The Hague.
    • The 4-digit postcode layer was downloaded from the PDOK WFS service (CBS Postcode4 statistieken 2020) and clipped to The Hague's outline. The postcodes within The Hague were subsequently stored in a CSV file.
    • The census block layer was downloaded from the PDOK WFS service (from the CBS Vierkantstatistieken 100m 2021 data package) and also clipped to the outline of The Hague.
    • These layers have been combined in the GeographicDivisions_TheHague GeoPackage.
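
    A sketch of the clipping steps described above, using geopandas (file, layer, and field names are placeholders; the original processing may have been done in a desktop GIS instead):

      import geopandas as gpd

      # Municipal boundaries from PDOK; filter to The Hague ('s-Gravenhage).
      # The file and the "naam" field are assumptions about the PDOK export.
      gemeenten = gpd.read_file("bestuurlijke_gemeenten.gpkg")
      den_haag = gemeenten[gemeenten["naam"] == "'s-Gravenhage"]

      # Clip a district layer to the municipal outline and save it.
      wijken = gpd.read_file("cbs_wijken_2022.gpkg")
      wijken_dh = gpd.clip(wijken, den_haag)
      wijken_dh.to_file("GeographicDivisions_TheHague.gpkg",
                        layer="wijken", driver="GPKG")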

    BAG data

    • BAG data was acquired through the download of a BAG GeoPackage from the BAG ATOM download page.
    • In the resulting GeoPackage, the Residences (Verblijfsobject) and Building (Pand) layers were clipped to match The Hague's outline.
    • The resulting residence data can be found in the BAG_buildings_TheHague GeoPackage.

    3D BAG

    • Due to limitations imposed by the PDOK WFS service, which restricts the number of downloadable buildings to 10,000, it was necessary to acquire 145 individual GeoPackages for tiles covering The Hague from the 3D BAG website.
    • These GeoPackages were merged using the ogr2ogr append function from the GDAL library in bash.
    • Roof elevation data was extracted from the LoD 1.2 2D layer from the resulting GeoPackage.
    • Ground elevation data was obtained from the Pand layer.
    • Both of these layers were clipped to match The Hague's outline.
    • Roof and ground elevation data from the LoD 1.2 2D and Pand layers were joined to the Pand layer in the BAG dataset using the BAG ID of each building.
    • The resulting data can be found in the BAG_buildings_TheHague GeoPackage.
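
    The merge-and-join steps might look roughly like this in Python (the ogr2ogr flags follow common GDAL usage, but tile paths, layer names, and the BAG ID and elevation columns are placeholders, not the thesis's actual code):

      import glob
      import subprocess

      import geopandas as gpd

      # Append each downloaded 3D BAG tile into one GeoPackage:
      # the first call creates the file, later calls use -update -append.
      tiles = sorted(glob.glob("tiles/*.gpkg"))
      subprocess.run(["ogr2ogr", "-f", "GPKG", "merged_3dbag.gpkg", tiles[0]],
                     check=True)
      for tile in tiles[1:]:
          subprocess.run(["ogr2ogr", "-update", "-append",
                          "merged_3dbag.gpkg", tile], check=True)

      # Join roof elevation onto the BAG Pand layer by BAG ID
      # (column names are assumptions; check your layers).
      pand = gpd.read_file("BAG_buildings_TheHague.gpkg", layer="pand")
      lod12 = gpd.read_file("merged_3dbag.gpkg", layer="lod12_2d")
      pand = pand.merge(lod12[["identificatie", "roof_elevation"]],
                        on="identificatie", how="left")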

    Energy labels

    • Energy labels were downloaded from the Energy label registry (EP-online) and stored in energy_labels_TheNetherlands.csv.

    UHI effect data

    • A bitmap with the UHI (urban heat island) effect intensity in The Hague was retrieved from the Dutch Natural Capital Atlas (Atlas Natuurlijk Kapitaal) and stored in UHI_effect_TheHague.tiff.

    Output data

    • The residence-level data joined to the building layer is contained in the BAG_buildings_with_residence_data_full GeoPackage.
    • The results for each building, according to different scenarios, are compiled in the buildings_with_CDM_results_[scenario]_full GeoPackages. The scenarios are abbreviated as follows:
      • SQ: Status Quo, covering the 2018-2022 reference period.
      • 2030: An average scenario projected for the year 2030.
      • 2050_L: A low-impact, best-case scenario for 2050.
      • 2050_M: A medium-impact, moderate scenario for 2050.
      • 2050_H: A high-impact, worst-case scenario for 2050.

  6. PUDL US Hourly Electricity Demand by State

    • zenodo.org
    • data.niaid.nih.gov
    Updated Sep 2, 2021
    Cite
    Ethan Welty; Ethan Welty; Zane Selvans; Zane Selvans; Yash Kumar; Yash Kumar (2021). PUDL US Hourly Electricity Demand by State [Dataset]. http://doi.org/10.5281/zenodo.5348396
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Sep 2, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ethan Welty; Ethan Welty; Zane Selvans; Zane Selvans; Yash Kumar; Yash Kumar
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Hourly Electricity Demand by State

    This archive contains the output of the Public Utility Data Liberation (PUDL) Project state electricity demand allocation analysis, as of the v0.4.0 release of the PUDL Python package. Here is the script that produced this output. It was run using the Docker container and processed data that are included in PUDL Data Release v2.0.0.

    The analysis estimates the total hourly electricity demand for each US state from two inputs: hourly electricity demand reported at the balancing authority and utility level in the FERC 714 (data archive), and service territories for utilities and balancing authorities inferred from the counties served by each utility and from the utilities that make up each balancing authority in the EIA 861 (data archive).

    We used the total electricity sales by state reported in the EIA 861 as a scaling factor to ensure that the magnitude of electricity sales is roughly correct, and obtained the shape of the demand curve from the hourly planning-area demand reported in the FERC 714. The scaling is necessary partly because of imperfections in the historical utility and balancing authority service territory maps that we were able to reconstruct from the EIA 861 Service Territories and Balancing Authority tables.

    The compilation of historical service territories based on the EIA 861 data is somewhat manual and could be improved, but overall the results seem reasonable. Additional predictive spatial variables will be required to obtain more granular electricity demand estimates (e.g. at the county level).
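
    The scaling step reduces to one multiplicative factor per state and year. A minimal pandas sketch of the idea, with illustrative file and column names rather than the actual PUDL internals (the script linked above is authoritative):

      import pandas as pd

      # hourly: state_id_fips, utc_datetime, demand_mwh (allocated from FERC 714)
      # annual_sales: state_id_fips, year, sales_mwh (reported in EIA 861)
      hourly = pd.read_csv("allocated_hourly_demand.csv",
                           parse_dates=["utc_datetime"])
      annual_sales = pd.read_csv("eia861_state_sales.csv")

      hourly["year"] = hourly["utc_datetime"].dt.year
      totals = (hourly.groupby(["state_id_fips", "year"], as_index=False)
                      ["demand_mwh"].sum()
                      .rename(columns={"demand_mwh": "allocated_mwh"}))

      # Scale so each state's annual total matches reported EIA 861 sales.
      factors = totals.merge(annual_sales, on=["state_id_fips", "year"])
      factors["factor"] = factors["sales_mwh"] / factors["allocated_mwh"]
      hourly = hourly.merge(factors[["state_id_fips", "year", "factor"]],
                            on=["state_id_fips", "year"])
      hourly["scaled_demand_mwh"] = hourly["demand_mwh"] * hourly["factor"]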

    FERC 714 Respondents

    The file ferc714_respondents.csv links FERC Form 714 respondents to what we believe to be their corresponding EIA utilities or balancing authorities.

    • eia_code: An integer ID reported in the FERC Form 714 corresponding to the respondent's EIA ID. In some cases this is a Utility ID and in others a Balancing Authority ID, but the type is not specified, so we had to infer which kind of entity was responding. Note that in many cases the same company acts as both a utility and a balancing authority, and the integer ID associated with the company is often the same in both roles, but it does not need to be.
    • respondent_type: Either balancing_authority or utility depending on which type of entity we believe was responding to the FERC 714.
    • respondent_id_ferc714: The integer ID of the responding entity within the FERC 714.
    • respondent_name_ferc714: The name provided by the respondent in the FERC 714.
    • balancing_authority_id_eia: If the respondent was identified as a balancing authority, the EIA ID for that balancing authority, taken from the EIA Form 861.
    • balancing_authority_code_eia: If the respondent was identified as a balancing authority, the EIA short code used to identify the balancing authority, taken from the EIA Form 861.
    • balancing_authority_name_eia: If the respondent was identified as a balancing authority, the name of the balancing authority, taken from the EIA Form 861.
    • utility_id_eia: If the respondent was identified as a utility, the EIA utility ID, taken from the EIA Form 861.
    • utility_name_eia: If the respondent was identified as a utility, the name of the utility, taken from the EIA 861.

    FERC 714 Respondent Service Territories

    The file ferc714_service_territories.csv describes the historical service territories for FERC 714 respondents for the years 2006-2019. For each respondent and year, their service territory is composed of a collection of counties, identified by their 5-digit FIPS codes. The file contains the following columns, with each row associating a single county with a FERC 714 respondent in a particular year:

    • respondent_id_ferc714: The FERC Form 714 respondent ID, which is also found in ferc714_respondents.csv
    • report_date: The first day of the year for which the service territory is being described.
    • state: Two letter abbreviation for the state containing the county, for human readability.
    • county: The name of the county, for human readability.
    • state_id_fips: The 2-digit FIPS state code.
    • county_id_fips: The 5-digit FIPS county code for use with other geospatial data resources, like the US Census DP1 geodatabase.
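
    As an illustration, the two CSVs can be joined on respondent_id_ferc714 to label each county-year with the responding entity (a hedged pandas sketch; reading FIPS codes as strings preserves their leading zeros):

      import pandas as pd

      respondents = pd.read_csv("ferc714_respondents.csv")
      territories = pd.read_csv("ferc714_service_territories.csv",
                                dtype={"state_id_fips": str,
                                       "county_id_fips": str})

      # Label each county-year with the entity that reported its demand.
      merged = territories.merge(respondents,
                                 on="respondent_id_ferc714", how="left")
      print(merged[["report_date", "state", "county",
                    "respondent_name_ferc714", "respondent_type"]].head())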

    State Hourly Electricity Demand Estimates

    The file demand.csv contains hourly electricity demand estimates for each US state from 2006-2019. It contains the following columns:

    • state_id_fips: The 2-digit FIPS state code.
    • utc_datetime: UTC time at hourly resolution.
    • demand_mwh: Electricity demand for that state and hour in MWh. This is an allocation of the electricity demand reported directly in the FERC Form 714.
    • scaled_demand_mwh: Estimated total electricity demand for that state and hour, in MWh. This is the reported FERC Form 714 hourly demand scaled up or down linearly such that the total annual electricity demand matches the total annual electricity sales reported at the state level in the EIA Form 861.
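
    Loading the file with pandas might look like the following (a consumer-side sanity check that annual scaled totals behave as described; column names are as documented above):

      import pandas as pd

      demand = pd.read_csv("demand.csv", parse_dates=["utc_datetime"])
      demand["year"] = demand["utc_datetime"].dt.year

      # Annual totals per state; scaled_demand_mwh should track EIA 861 sales.
      annual = (demand.groupby(["state_id_fips", "year"])
                      [["demand_mwh", "scaled_demand_mwh"]].sum())
      print(annual.head())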

    A collection of plots are also included, comparing the original and scaled demand time series for each state.

    Acknowledgements

    This analysis was funded largely by GridLab, and done in collaboration with researchers at the Lawrence Berkeley National Laboratory, including Umed Paliwal and Nikit Abhyankar.

    • Ethan Welty wrote the final code and most of the algorithms.
    • Yash Kumar did initial data explorations and geospatial analyses.

    The data screening methods were originally designed to identify unrealistic data in the electricity demand time series reported to EIA on Form 930, and have been applied here to data from the FERC Form 714.

    They are adapted from code published and modified by:

    And described at:

    The imputation methods were designed for multivariate time series forecasting.

    They are adapted from code published by:

    And described at:

    About PUDL & Catalyst Cooperative

    For additional information about this data and PUDL, see the following resources:

