Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
This fictional sales dataset was created using R code for the purpose of visualizing trends in customer demographics, product performance, and sales over time. A link to my GitHub repository, containing all the code used to generate the data frame and all the preceding processing steps, can be found here
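The linked repository is not reproduced here, so the following is only a guess at how such a fictional dataset might be generated; all field names and value ranges are assumptions (the original used R, this sketch uses Python):

```python
import random

# Hedged sketch: generate a small fictional sales table.
# Field names ("customer_gender", "sale_amount", etc.) are assumptions,
# not taken from the actual repository.
random.seed(42)  # fixed seed so the fictional data is reproducible
genders = ["F", "M"]
products = ["Widget", "Gadget"]
rows = [
    {
        "customer_gender": random.choice(genders),
        "customer_age": random.randint(18, 70),
        "product": random.choice(products),
        "sale_amount": round(random.uniform(5.0, 500.0), 2),
    }
    for _ in range(100)
]
```

With a fixed seed, the same fictional records are produced on every run, which makes downstream visualizations reproducible.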
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Video Games Sales Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/sidtwr/videogames-sales-dataset on 28 January 2022.
--- Dataset description provided by original source is as follows ---
Motivated by Gregory Smith's web scrape of VGChartz Video Games Sales, this data set simply extends the number of variables with another web scrape from Metacritic. Unfortunately, there are missing observations, as Metacritic only covers a subset of the platforms. Also, a game may not have all the observations of the additional variables discussed below. Complete cases number ~6,900.
Alongside the fields Name, Platform, Year_of_Release, Genre, Publisher, NA_Sales, EU_Sales, JP_Sales, Other_Sales, and Global_Sales, we have:
Critic_score - Aggregate score compiled by Metacritic staff
Critic_count - The number of critics used in coming up with the Critic_score
User_score - Score by Metacritic's subscribers
User_count - Number of users who gave the User_score
Developer - Party responsible for creating the game
Rating - The ESRB rating
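Given the missing Metacritic observations noted above, a quick way to count complete cases is to drop rows with any missing values. A minimal sketch with pandas, using a hypothetical two-row sample (field names follow the description; the records are invented):

```python
import io
import pandas as pd

# Invented two-row sample: Game A has full Metacritic coverage, Game B does not.
csv = io.StringIO(
    """Name,Platform,Critic_score,Critic_count,User_score,User_count,Developer,Rating
Game A,PS4,85,40,8.1,200,Dev X,E
Game B,PC,,,,,Dev Y,T
"""
)
df = pd.read_csv(csv)
complete = df.dropna()  # rows with no missing values; ~6,900 in the full dataset
```

On the real file, `len(complete)` would approximate the ~6,900 complete cases mentioned above.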
This repository, https://github.com/wtamu-cisresearch/scraper, worked extremely well after a few adjustments!
It would be interesting to see any machine learning techniques or continued data visualisations applied to this data set.
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description of the data and file structure
We are providing 5 datasets in TSF (time series format). All files include the series name, start and end timestamps, product type, and the time series values.
NOTE: These datasets have values updated until May 2025. To obtain updated datasets, consider using the StreamFuels package, available at https://pypi.org/project/streamfuels/
Files and variables
File: yearly_fuel_sales_by_state.tsf
Description: comprises 216 time series capturing the yearly historical sales of eight fuel products – ethanol, regular gasoline (gasoline-r), aviation gasoline (gasoline-a), liquefied petroleum gas (LPG), aviation kerosene (kerosene-a), illuminating kerosene (kerosene-i), fuel oil, and diesel – across 27 Brazilian states. Among these time series, 135 have records dating back to 1947, while 27 series begin in 1953, another 27 in 1959, and the remaining 27 in 1980.
File: monthly_oilgas_operations_by_state.tsf
Description: comprises 76 time series capturing the monthly records of five types of industrial operations – production, reinjection, flaring, self-consumption, and availability – for three products: natural gas, petroleum, and natural gas liquid (NGL), across the 27 Brazilian states. Only natural gas includes all five operations, while petroleum and NGL are limited to the production operation. Of the total, 31 series started in 1997, while the remaining 45 date back to 2000.
File: yearly_fuel_sales_by_city.tsf
Description: comprises 29,282 time series capturing the yearly sales of eight fuels and asphalt (a petroleum derivative) across 5,325 Brazilian cities. Most of the series begin in 1990 or 1992, though some recent series began in 2018.
File: fuel_type_classification.tsf
Description: comprises 14,032 time series, each with a fixed length of 12 observations (i.e., one year of sales) and eight possible class labels. Among the five datasets, it is the only one that is labeled, with each series associated with a specific fuel type.
File: monthly_fuel_sales_by_state.tsf
Description: comprises 216 time series capturing the monthly historical sales of eight fuel products across Brazil's 27 states. All the time series date back to 1990.
Code/software
All the code to collect and preprocess the data is available at https://github.com/lucas-castrow/streamfuels/
Access information
Other publicly accessible locations of the data:
https://github.com/lucas-castrow/datasets_streamfuels
Data was derived from the following sources:
https://www.gov.br/anp/en/
https://www.gov.br/anp/pt-br/centrais-de-conteudo/dados-abertos/vendas-de-derivados-de-petroleo-e-biocombustiveis
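The TSF files above are intended to be read with the StreamFuels package; as a fallback, a single data line can also be parsed by hand. This sketch assumes a Monash-style TSF convention (colon-separated series attributes followed by comma-separated values); the exact attribute order here is an assumption, not taken from the files themselves:

```python
# Minimal sketch of parsing one TSF data line, assuming the Monash-style
# convention: colon-separated attributes, then comma-separated values.
def parse_tsf_line(line, n_attrs=4):
    parts = line.strip().split(":")
    attrs = parts[:n_attrs]  # e.g. series name, start, end, product type (assumed order)
    values = [float(v) for v in parts[n_attrs].split(",")]
    return attrs, values

# Invented example line, not a real record from these datasets.
attrs, values = parse_tsf_line("diesel_SP:1990-01:2025-05:diesel:1.5,2.0,2.5")
```

For real use, prefer the StreamFuels package or a dedicated TSF loader, since header lines and attribute schemas vary between files.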
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Video Game Sales’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/gregorut/videogamesales on 12 November 2021.
--- Dataset description provided by original source is as follows ---
This dataset contains a list of video games with sales greater than 100,000 copies. It was generated by a scrape of vgchartz.com.
Fields include
Rank - Ranking of overall sales
Name - The game's name
Platform - Platform of the game's release (e.g., PC, PS4)
Year - Year of the game's release
Genre - Genre of the game
Publisher - Publisher of the game
NA_Sales - Sales in North America (in millions)
EU_Sales - Sales in Europe (in millions)
JP_Sales - Sales in Japan (in millions)
Other_Sales - Sales in the rest of the world (in millions)
Global_Sales - Total worldwide sales.
The script used to scrape the data is available at https://github.com/GregorUT/vgchartzScrape. It is written in Python and based on BeautifulSoup. There are 16,598 records; two records were dropped due to incomplete information.
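As a quick sanity check on the fields listed above, the dataset can be loaded and aggregated with pandas. The three rows below are invented for illustration, not real records:

```python
import io
import pandas as pd

# Invented three-row sample using the documented fields (sales in millions).
csv = io.StringIO(
    """Rank,Name,Platform,Year,Genre,Publisher,NA_Sales,EU_Sales,JP_Sales,Other_Sales,Global_Sales
1,Game A,PS4,2015,Action,Pub X,1.0,0.5,0.2,0.1,1.8
2,Game B,PC,2016,Action,Pub Y,0.5,0.4,0.0,0.1,1.0
3,Game C,PS4,2014,Sports,Pub Z,0.3,0.2,0.1,0.0,0.6
"""
)
df = pd.read_csv(csv)
# Total worldwide sales per genre, in millions of copies.
by_genre = df.groupby("Genre")["Global_Sales"].sum()
```

The same pattern works on the full 16,598-record file once downloaded from Kaggle.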
--- Original source retains full ownership of the source dataset ---
This dataset contains raw GIS data sourced from the BAG (Basisregistratie Adressen en Gebouwen; Registry of Addresses and Buildings). It provides comprehensive information on buildings, including advanced height data and administrative details. It also contains geographic divisions within The Hague. Additionally, the dataset incorporates energy label data, offering insights into the energy efficiency and performance of these buildings. This combined dataset serves as the backbone of a Master's thesis in Industrial Ecology, analysing residential and office cooling and its environmental impacts in The Hague, Netherlands. The codebase of this analysis can be found in this Github repository: https://github.com/simonvanlierde/msc-thesis-ie
The dataset includes a background research spreadsheet containing supporting calculations. It also presents geopackages with results from the cooling demand model (CDM) for various scenarios: Status quo (SQ), 2030, and 2050 scenarios (Low, Medium, and High).
The background_research_data.xlsx spreadsheet contains comprehensive background research calculations supporting the shaping of input parameters used in the model. It contains several sheets:
Geographic divisions
BAG data
3D BAG
Energy labels
UHI effect data
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Hourly Electricity Demand by State
This archive contains the output of the Public Utility Data Liberation (PUDL) Project state electricity demand allocation analysis, as of the v0.4.0 release of the PUDL Python package. Here is the script that produced this output. It was run using the Docker container and processed data that are included in PUDL Data Release v2.0.0.
The analysis uses hourly electricity demand reported at the balancing authority and utility level in the FERC 714 (data archive), and service territories for utilities and balancing authorities inferred from the counties served by each utility, and the utilities that make up each balancing authority in the EIA 861 (data archive), to estimate the total hourly electricity demand for each US state.
We used the total electricity sales by state reported in the EIA 861 as a scaling factor to ensure that the magnitude of electricity sales is roughly correct, and obtained the shape of the demand curve from the hourly planning area demand reported in the FERC 714. The scaling is necessary partly due to imperfections in the historical utility and balancing authority service territory maps that we have been able to reconstruct from the data reported in the EIA 861 Service Territories and Balancing Authority tables.
The compilation of historical service territories based on the EIA 861 data is somewhat manual and could be improved, but overall the results seem reasonable. Additional predictive spatial variables will be required to obtain more granular electricity demand estimates (e.g. at the county level).
FERC 714 Respondents
The file ferc714_respondents.csv links FERC Form 714 respondents to what we believe to be their corresponding EIA utilities or balancing authorities.
eia_code: An integer ID reported in the FERC Form 714 corresponding to the respondent's EIA ID. In some cases this is a Utility ID, and in others it is a Balancing Authority ID, but which one is not specified, so we have had to infer the type of entity responding. Note that in many cases the same company acts as both a utility and a balancing authority, and the integer ID associated with the company is often the same in both roles, but it does not need to be.
respondent_type: Either balancing_authority or utility, depending on which type of entity we believe was responding to the FERC 714.
respondent_id_ferc714: The integer ID of the responding entity within the FERC 714.
respondent_name_ferc714: The name provided by the respondent in the FERC 714.
balancing_authority_id_eia: If the respondent was identified as a balancing authority, the EIA ID for that balancing authority, taken from the EIA Form 861.
balancing_authority_code_eia: If the respondent was identified as a balancing authority, the EIA short code used to identify the balancing authority, taken from the EIA Form 861.
balancing_authority_name_eia: If the respondent was identified as a balancing authority, the name of the balancing authority, taken from the EIA Form 861.
utility_id_eia: If the respondent was identified as a utility, the EIA utility ID, taken from the EIA Form 861.
utility_name_eia: If the respondent was identified as a utility, the name of the utility, taken from the EIA 861.
FERC 714 Respondent Service Territories
The file ferc714_service_territories.csv describes the historical service territories for FERC 714 respondents for the years 2006-2019. For each respondent and year, their service territory is composed of a collection of counties, identified by their 5-digit FIPS codes. The file contains the following columns, with each row associating a single county with a FERC 714 respondent in a particular year:
respondent_id_ferc714: The FERC Form 714 respondent ID, which is also found in ferc714_respondents.csv.
report_date: The first day of the year for which the service territory is being described.
state: Two-letter abbreviation for the state containing the county, for human readability.
county: The name of the county, for human readability.
state_id_fips: The 2-digit FIPS state code.
county_id_fips: The 5-digit FIPS county code, for use with other geospatial data resources, like the US Census DP1 geodatabase.
State Hourly Electricity Demand Estimates
The file demand.csv contains hourly electricity demand estimates for each US state from 2006-2019. It contains the following columns:
state_id_fips: The 2-digit FIPS state code.
utc_datetime: UTC time at hourly resolution.
demand_mwh: Electricity demand for that state and hour in MWh. This is an allocation of the electricity demand reported directly in the FERC Form 714.
scaled_demand_mwh: Estimated total electricity demand for that state and hour, in MWh. This is the reported FERC Form 714 hourly demand scaled up or down linearly such that the total annual electricity demand matches the total annual electricity sales reported at the state level in the EIA Form 861.
A collection of plots is also included, comparing the original and scaled demand time series for each state.
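The linear scaling described for scaled_demand_mwh amounts to multiplying each hourly value by the ratio of annual EIA 861 sales to the annual sum of allocated FERC 714 demand. A minimal illustration with invented numbers (three "hours" standing in for a full year):

```python
# Sketch of the linear scaling described above: preserve the shape of the
# hourly demand curve while forcing its annual total to match reported sales.
def scale_demand(hourly_demand_mwh, annual_sales_mwh):
    total = sum(hourly_demand_mwh)  # annual total of allocated FERC 714 demand
    factor = annual_sales_mwh / total  # single linear scaling factor
    return [d * factor for d in hourly_demand_mwh]

# Invented numbers: allocated demand sums to 500 MWh, reported sales are 600 MWh.
scaled = scale_demand([100.0, 150.0, 250.0], annual_sales_mwh=600.0)
# scaled now sums to 600 MWh (up to floating point), with the same curve shape.
```

In the actual analysis the factor is computed per state and year, so each state's curve keeps its shape while its annual total matches the EIA 861 sales.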
Acknowledgements
This analysis was funded largely by GridLab, and done in collaboration with researchers at the Lawrence Berkeley National Laboratory, including Umed Paliwal and Nikit Abhyankar.
The data screening methods were originally designed to identify unrealistic data in the electricity demand time series reported to EIA on Form 930, and have been applied here to data from the FERC Form 714.
They are adapted from code published and modified by:
And described at:
The imputation methods were designed for multivariate time series forecasting.
They are adapted from code published by:
And described at:
About PUDL & Catalyst Cooperative
For additional information about this data and PUDL, see the following resources: