License: http://www.gnu.org/licenses/lgpl-3.0.html
On the official website the dataset is available through a SQL Server instance (localhost) and CSV files, to be used via Power BI Desktop running in the Virtual Lab (virtual machine). The first two steps of importing data were executed in the virtual lab, and the resulting Power BI tables were then copied into CSV files. Records were added up to the year 2022 as required.
This dataset is helpful if you want to work offline with Adventure Works data in Power BI Desktop and follow the lab instructions from the training material on the official website. It is also useful if you want to work through the Power BI Desktop Sales Analysis example from Microsoft's PL-300 learning path.
Download the CSV file(s) and import them into Power BI Desktop as tables. The CSVs are named after the tables created in the first two steps of importing data, as described in the PL-300 Microsoft Power BI Data Analyst exam lab.
License: https://creativecommons.org/publicdomain/zero/1.0/
This dataset was generated by respondents to a distributed survey via Amazon Mechanical Turk between 03.12.2016 and 05.12.2016. Thirty eligible Fitbit users consented to the submission of personal tracker data, including minute-level output for physical activity, heart rate, and sleep monitoring. Individual reports can be parsed by export session ID (column A) or timestamp (column B). Variation between outputs reflects the use of different types of Fitbit trackers and individual tracking behaviors and preferences.
This is the list of manipulations performed on the original dataset, published by Möbius.
The entire cleaning process and all rearrangements were performed in BigQuery, using SQL functions.
1) After I took a closer look at the source dataset, I realized that for my case study, I did not need some of the tables contained in the original archive. Therefore, I decided not to import
- dailyCalories_merged.csv,
- dailyIntensities_merged.csv,
- dailySteps_merged.csv.
as they proved redundant: their content can be found in the dailyActivity_merged.csv file.
In addition, the files
- minutesCaloriesWide_merged.csv,
- minutesIntensitiesWide_merged.csv,
- minuteStepsWide_merged.csv.
were not imported, as they present the same data contained in other files, just in a wide format. Hence, only the long-format files containing the same data were imported into the BigQuery database.
2) To be able to compare and measure the correlation among different variables based on hourly records, I decided to create a new table using a LEFT JOIN on the Id and ActivityHour columns. I repeated the same JOIN on the tables with minute records. Hence I obtained 2 new tables:
- hourly_activity.csv,
- minute_activity.csv.
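A minimal BigQuery SQL sketch of this kind of join, assuming the hourly tables were loaded under names like hourly_calories, hourly_intensities, and hourly_steps in a dataset called fitbit (these names, and the non-key columns, are illustrative rather than the exact ones used in the case study):

```sql
-- Combine the three hourly tables into one row per (Id, ActivityHour).
-- Dataset, table, and column names other than Id/ActivityHour are assumptions.
CREATE TABLE fitbit.hourly_activity AS
SELECT
  c.Id,
  c.ActivityHour,
  c.Calories,
  i.TotalIntensity,
  s.StepTotal
FROM fitbit.hourly_calories AS c
LEFT JOIN fitbit.hourly_intensities AS i
  ON c.Id = i.Id AND c.ActivityHour = i.ActivityHour
LEFT JOIN fitbit.hourly_steps AS s
  ON c.Id = s.Id AND c.ActivityHour = s.ActivityHour;
```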
3) To validate and convert most of the columns containing DATE and DATETIME values that had been imported as the STRING data type, I used the PARSE_DATE() and PARSE_DATETIME() functions. While importing the
- heartrate_seconds_merged.csv,
- hourlyCalories_merged.csv,
- hourlyIntensities_merged.csv,
- hourlySteps_merged.csv,
- minutesCaloriesNarrow_merged.csv,
- minuteIntensitiesNarrow_merged.csv,
- minuteMETsNarrow_merged.csv,
- minuteSleep_merged.csv,
- minuteSteps_merged.csv,
- sleepDay_merge.csv,
- weigthLog_Info_merged.csv
files to BigQuery, it was necessary to import the DATETIME and DATE type columns as STRING, because the original syntax used in the CSV files could not be recognized as a valid DATETIME data type, due to the "AM" and "PM" text at the end of the expression.
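For example, a BigQuery expression like the following can convert such a string column; the exact format string is an assumption about how the timestamps look in the CSVs (e.g. "4/12/2016 1:00:00 AM"):

```sql
-- Parse a 12-hour AM/PM timestamp stored as STRING into a DATETIME.
-- Table and column names, and the format string, are illustrative.
SELECT
  Id,
  PARSE_DATETIME('%m/%d/%Y %I:%M:%S %p', ActivityHour) AS ActivityHourParsed
FROM fitbit.hourly_calories;
```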
analyze the health and retirement study (hrs) with r

the hrs is the one and only longitudinal survey of american seniors. with a panel starting its third decade, the current pool of respondents includes older folks who have been interviewed every two years as far back as 1992. unlike cross-sectional or shorter panel surveys, respondents keep responding until, well, death do us part. paid for by the national institute on aging and administered by the university of michigan's institute for social research, if you apply for an interviewer job with them, i hope you like werther's original.

figuring out how to analyze this data set might trigger your fight-or-flight synapses if you just start clicking around on michigan's website. instead, read pages numbered 10-17 (pdf pages 12-19) of this introduction pdf and don't touch the data until you understand figure a-3 on that last page. if you start enjoying yourself, here's the whole book. after that, it's time to register for access to the (free) data. keep your username and password handy, you'll need it for the top of the download automation r script. next, look at this data flowchart to get an idea of why the data download page is such a righteous jungle. but wait, good news: umich recently farmed out its data management to the rand corporation, who promptly constructed a giant consolidated file with one record per respondent across the whole panel. oh so beautiful. the rand hrs files make much of the older data and syntax examples obsolete, so when you come across stuff like instructions on how to merge years, you can happily ignore them - rand has done it for you.

the health and retirement study only includes noninstitutionalized adults when new respondents get added to the panel (as they were in 1992, 1993, 1998, 2004, and 2010) but once they're in, they're in - respondents have a weight of zero for interview waves when they were nursing home residents; but they're still responding and will continue to contribute to your statistics so long as you're generalizing about a population from a previous wave (for example: it's possible to compute "among all americans who were 50+ years old in 1998, x% lived in nursing homes by 2010"). my source for that 411? page 13 of the design doc. wicked.
this new github repository contains five scripts:

1992 - 2010 download HRS microdata.R
- loop through every year and every file, download, then unzip everything in one big party

import longitudinal RAND contributed files.R
- create a SQLite database (.db) on the local disk
- load the rand, rand-cams, and both rand-family files into the database (.db) in chunks (to prevent overloading ram)

longitudinal RAND - analysis examples.R
- connect to the sql database created by the 'import longitudinal RAND contributed files' program
- create two database-backed complex sample survey objects, using a taylor-series linearization design
- perform a mountain of analysis examples with wave weights from two different points in the panel

import example HRS file.R
- load a fixed-width file using only the sas importation script directly into ram with SAScii (http://blog.revolutionanalytics.com/2012/07/importing-public-data-with-sas-instructions-into-r.html)
- parse through the IF block at the bottom of the sas importation script, blank out a number of variables
- save the file as an R data file (.rda) for fast loading later

replicate 2002 regression.R
- connect to the sql database created by the 'import longitudinal RAND contributed files' program
- create a database-backed complex sample survey object, using a taylor-series linearization design
- exactly match the final regression shown in this document provided by analysts at RAND as an update of the regression on pdf page B76 of this document

click here to view these five scripts

for more detail about the health and retirement study (hrs), visit:
- michigan's hrs homepage
- rand's hrs homepage
- the hrs wikipedia page
- a running list of publications using hrs

notes: exemplary work making it this far. as a reward, here's the detailed codebook for the main rand hrs file. note that rand also creates 'flat files' for every survey wave, but really, most every analysis you can think of is possible using just the four files imported with the rand importation script above. if you must work with the non-rand files, there's an example of how to import a single hrs (umich-created) file, but if you wish to import more than one, you'll have to write some for loops yourself.

confidential to sas, spss, stata, and sudaan users: a tidal wave is coming. you can get water up your nose and be dragged out to sea, or you can grab a surf board. time to transition to r. :D
License: http://opendatacommons.org/licenses/dbcl/1.0/
Objective:
Improve understanding of real estate performance.
Leverage data to support business decisions.
Scope:
Track property sales, visits, and performance metrics.
Step 1: Creating an Azure SQL Database
Action: Provisioned an Azure SQL Database to host real estate data.
Why Azure?: Scalability, security, and integration with Power BI.
Step 2: Importing Data
Action: Imported datasets (properties, visits, sales, agents, etc.) into the SQL database.
Tools Used: SQL Server Management Studio (SSMS) and Azure Data Studio.
Step 3: Data Transformation in SQL
Normalized Data: Ensured data consistency by normalizing the formats of dates and categorical fields.
Calculated Fields:
Time on Market: Used the DATEDIFF function to calculate the difference between listing and sale dates (see the SQL sketch after this step).
Conversion Rate: Aggregated sales and visits data using COUNT and SUM to calculate conversion rates per agent and property.
Buyer Segmentation: Identified first-time vs. repeat buyers using JOINs and COUNT functions.
Data Cleaning: Removed duplicates, handled null values, and standardized city names and property types.
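A minimal T-SQL sketch of the Time on Market and Conversion Rate calculations; the table and column names (Properties, Sales, Visits, ListingDate, SaleDate, AgentID, and so on) are assumptions for illustration rather than the project's actual schema:

```sql
-- Time on market: days between listing and sale, per sold property.
SELECT
    p.PropertyID,
    DATEDIFF(day, p.ListingDate, s.SaleDate) AS TimeOnMarketDays
FROM Properties AS p
JOIN Sales AS s
    ON s.PropertyID = p.PropertyID;

-- Conversion rate per agent: distinct sales divided by distinct visits.
SELECT
    v.AgentID,
    COUNT(DISTINCT s.SaleID) * 1.0 / COUNT(DISTINCT v.VisitID) AS ConversionRate
FROM Visits AS v
LEFT JOIN Sales AS s
    ON s.PropertyID = v.PropertyID
   AND s.AgentID = v.AgentID
GROUP BY v.AgentID;
```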
Step 4: Connecting Power BI to Azure SQL
Action: Established a live connection to Azure SQL Database in Power BI.
Benefit: Real-time data updates and efficient analysis.
Step 5: Data Modeling in Power BI
Relationships:
Defined relationships between tables (e.g., Sales, Visits, Properties, Agents) using primary and foreign keys.
Utilized active and inactive relationships for dynamic calculations like time-based comparisons.
Calculated Columns and Measures:
Time on Market: Created a calculated measure using DATEDIFF.
Conversion Rates: Used DIVIDE and CALCULATE for accurate per-agent and per-property analysis.
Step 6: Creating Visualizations
Key Visuals:
Sales Heatmap by City: Geographic visualization to highlight sales performance.
Conversion Rates: Bar charts and line graphs for trend analysis.
Time on Market: Boxplots and histograms for distribution insights.
Buyer Segmentation: Pie charts and bar graphs to show buyer profiles.
Step 7: Building Dashboards
Page 1: Overview (Key Metrics and Sales Heatmap).
Page 2: Performance Analysis (Conversion Rates, Time on Market).
Page 3: Buyer Insights (First-Time vs Repeat Buyers, Property Distribution).
Insight 1: Sales Performance by City
Certain cities account for the highest sales volume.
Other cities show low performance, requiring further investigation.
Insight 2: Conversion Rates
Identified the agent with the highest conversion rate.
Certain properties (e.g., luxury villas) outperform others in conversion.
Insight 3: Time on Market
Average time on market.
Insight 4: Buyer Trends
Repeat Buyers make up 60% of purchases.
First-Time Buyers prefer apartments over villas.
Recommendation 1: Focus on High-Performing Cities
Recommendation 2: Support Low-Performing Areas
Investigate challenges to develop targeted marketing strategies.
Recommendation 3: Enhance Conversion Rates
Train agents based on techniques used by top performers.
Prioritize marketing for properties with high conversion rates.
Recommendation 4: Engage First-Time Buyers
Create specific campaigns for apartments to attract first-time buyers.
Offer financial guidance programs to boost their confidence.
Summary:
Built a robust data solution from Azure SQL to Power BI.
Derived actionable insights that can drive real estate growth.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)
This dataset contains software metric and design pattern data for around 100,000 projects from the Maven Central repository. The data was collected and analyzed as part of my master's thesis "Mining Software Repositories for the Effects of Design Patterns on Software Quality" (https://www.overleaf.com/read/vnfhydqxmpvx, https://zenodo.org/record/4048275).
The included qualisign.* files all contain the same data in different formats:
- qualisign.sql: standard SQL format (exported using "pg_dump --inserts ..."),
- qualisign.psql: PostgreSQL plain format (exported using "pg_dump -Fp ..."),
- qualisign.csql: PostgreSQL custom format (exported using "pg_dump -Fc ...").
create-tables.sql has to be executed before importing one of the qualisign.* files. Once one of the qualisign.* files has been imported, create-views.sql can be executed to preprocess the data, thereby creating materialized views that are more appropriate for data analysis purposes.
Software metrics were calculated using CKJM extended: http://gromit.iiar.pwr.wroc.pl/p_inf/ckjm/
Included software metrics are (21 total):
- AMC: Average Method Complexity
- CA: Afferent Coupling
- CAM: Cohesion Among Methods
- CBM: Coupling Between Methods
- CBO: Coupling Between Objects
- CC: Cyclomatic Complexity
- CE: Efferent Coupling
- DAM: Data Access Metric
- DIT: Depth of Inheritance Tree
- IC: Inheritance Coupling
- LCOM: Lack of Cohesion of Methods (Chidamber and Kemerer)
- LCOM3: Lack of Cohesion of Methods (Constantine and Graham)
- LOC: Lines of Code
- MFA: Measure of Functional Abstraction
- MOA: Measure of Aggregation
- NOC: Number of Children
- NOM: Number of Methods
- NOP: Number of Polymorphic Methods
- NPM: Number of Public Methods
- RFC: Response for Class
- WMC: Weighted Methods per Class
In the qualisign.* data, these metrics are only available on the class level. create-views.sql additionally provides averages of these metrics on the package and project levels.
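As an illustration of the kind of aggregation create-views.sql performs, a package-level rollup could look like the following; the table and column names here are assumptions, since the actual schema is defined by create-tables.sql:

```sql
-- Hypothetical package-level rollup of class-level metrics.
-- Real table/column names are defined in create-tables.sql and may differ.
CREATE MATERIALIZED VIEW package_metrics AS
SELECT
    package_id,
    AVG(wmc) AS avg_wmc,
    AVG(cbo) AS avg_cbo,
    AVG(dit) AS avg_dit,
    AVG(loc) AS avg_loc
FROM classes
GROUP BY package_id;
```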
Design patterns were detected using SSA: https://users.encs.concordia.ca/~nikolaos/pattern_detection.html
Included design patterns are (15 total):
- Adapter
- Bridge
- Chain of Responsibility
- Command
- Composite
- Decorator
- Factory Method
- Observer
- Prototype
- Proxy
- Singleton
- State
- Strategy
- Template Method
- Visitor
The code to generate the dataset is available at: https://github.com/jaichberg/qualisign
The code to perform quality analysis on the dataset is available at: https://github.com/jaichberg/qualisign-analysis
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)
We present three defect rediscovery datasets mined from Bugzilla. The datasets capture data for three groups of open source software projects: Apache, Eclipse, and KDE. The datasets contain information about approximately 914 thousand defect reports over a period of 18 years (1999-2017), capturing the inter-relationships among duplicate defects.
File Descriptions
apache.csv - Apache Defect Rediscovery dataset
eclipse.csv - Eclipse Defect Rediscovery dataset
kde.csv - KDE Defect Rediscovery dataset
apache.relations.csv - Inter-relations of rediscovered defects of Apache
eclipse.relations.csv - Inter-relations of rediscovered defects of Eclipse
kde.relations.csv - Inter-relations of rediscovered defects of KDE
create_and_populate_neo4j_objects.cypher - Populates the Neo4j graph DB by importing all the data from the CSV files. Note that you have to set the dbms.import.csv.legacy_quote_escaping configuration setting to false to load the CSV files; see https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/#config_dbms.import.csv.legacy_quote_escaping
create_and_populate_mysql_objects.sql - Populates the MySQL RDBMS by importing all the data from the CSV files (a generic example of this kind of import is sketched after this file list)
rediscovery_db_mysql.zip - For your convenience, we also provide full backup of the MySQL database
neo4j_examples.txt - Sample Neo4j queries
mysql_examples.txt - Sample MySQL queries
rediscovery_eclipse_6325.png - Output of Neo4j example #1
distinct_attrs.csv - Distinct values of bug_status, resolution, priority, severity for each project
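A generic example of the kind of CSV import that create_and_populate_mysql_objects.sql performs; the real table definitions and column mappings live in that script, so the names below are placeholders:

```sql
-- Load one of the CSV files into a MySQL table.
-- Table name is a placeholder; see create_and_populate_mysql_objects.sql
-- for the actual DDL and column mapping. Add IGNORE 1 LINES if the CSV
-- contains a header row.
LOAD DATA LOCAL INFILE 'apache.csv'
INTO TABLE apache_defects
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';
```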
License: https://www.usa.gov/government-works/
This data set was retrieved from the TranStats webpage of the Bureau of Transportation Statistics of the US Department of Transportation. The data was cleaned and made ready for use in a university project whose goal was to compare different database engines in terms of performance, among other aspects.
NOTE: December 2020 was not included in the data set since it was not made available by the BTS as of today, 18th Feb 2021.
The data is split into CSV files for each year 2015 to 2020. flights.csv contains all the data in one file. The CSV files contain no headers for the columns. The headers are as follows:
'YEAR', 'MONTH', 'DAY_OF_MONTH', 'DAY_OF_WEEK', 'OP_UNIQUE_CARRIER', 'ORIGIN_CITY_NAME',
'ORIGIN_STATE_ABR', 'DEST_CITY_NAME', 'DEST_STATE_ABR', 'CRS_DEP_TIME', 'DEP_DELAY_NEW',
'CRS_ARR_TIME', 'ARR_DELAY_NEW', 'CANCELLED', 'CANCELLATION_CODE', 'AIR_TIME', 'DISTANCE'
NOTE: The headers were removed to make it easy to import the data into a SQL database (one possible staging schema and load is sketched below).
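One possible way to stage the data in SQL; the column types below are assumptions inferred from the column names, not taken from the original project:

```sql
-- Possible table definition for flights.csv (types are assumptions).
CREATE TABLE flights (
    YEAR              INT,
    MONTH             INT,
    DAY_OF_MONTH      INT,
    DAY_OF_WEEK       INT,
    OP_UNIQUE_CARRIER VARCHAR(10),
    ORIGIN_CITY_NAME  VARCHAR(100),
    ORIGIN_STATE_ABR  CHAR(2),
    DEST_CITY_NAME    VARCHAR(100),
    DEST_STATE_ABR    CHAR(2),
    CRS_DEP_TIME      INT,
    DEP_DELAY_NEW     DECIMAL(8,2),
    CRS_ARR_TIME      INT,
    ARR_DELAY_NEW     DECIMAL(8,2),
    CANCELLED         DECIMAL(3,2),
    CANCELLATION_CODE CHAR(1),
    AIR_TIME          DECIMAL(8,2),
    DISTANCE          DECIMAL(8,2)
);

-- MySQL-style bulk load; no header row needs to be skipped since the
-- headers were removed from the CSV files.
LOAD DATA LOCAL INFILE 'flights.csv'
INTO TABLE flights
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
```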
If you have any questions about how I retrieved/cleaned the data or anything about my project, feel free to check out my Github repository or shoot me a message.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)
We introduce a self-describing serialized format for bulk biomedical data called the Portable Format for Biomedical (PFB) data. The Portable Format for Biomedical data is based upon Avro and encapsulates a data model, a data dictionary, the data itself, and pointers to third party controlled vocabularies. In general, each data element in the data dictionary is associated with a third party controlled vocabulary to make it easier for applications to harmonize two or more PFB files. We also introduce an open source software development kit (SDK) called PyPFB for creating, exploring and modifying PFB files. We describe experimental studies showing the performance improvements when importing and exporting bulk biomedical data in the PFB format versus using JSON and SQL formats.
License: https://www.worldbank.org/en/about/legal/terms-of-use-for-datasets
This dataset contains information on international student mobility and economic indicators for multiple countries and territories during 2020-2023. In addition, it includes data on Importing Countries and Exporting Countries, which refer to countries that receive (import) or send (export) international students.
The data was obtained from official open-data sources, mainly the UNESCO Institute for Statistics (UIS) and the World Bank open data portal. The original files were downloaded as CSV directly from these platforms. After downloading, the datasets were merged, filtered, and cleaned using SQL and Excel (for example, removing duplicates and selecting specific years such as 2023 and 2024). No web scraping was used; all data comes from publicly available official databases.
Conclusions / Insights
Countries with negative average net flows, such as China and India, are net exporters of students, meaning they send more students abroad than they receive.
Countries with positive average net flows, such as the United States and other high-income countries, are net importers of students, attracting more international students than they send.
There appears to be a relationship between GDP per capita (PPP, 2023 USD) and student mobility patterns: countries with higher PPP tend to attract more international students, while countries with lower PPP tend to send more students abroad.
This dataset can be used to analyze trends in international student mobility, compare countries’ economic contexts, and identify patterns between student flows and national wealth.
The classification of countries as Importing or Exporting provides a quick way to group and compare countries in terms of international student dynamics (a minimal SQL sketch of this grouping is shown after these conclusions).
Practical application: Universities and educational institutions can use this data to better understand potential target markets, identify countries that send or receive more students, and develop strategies for recruitment and international collaborations.
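A minimal SQL sketch of that grouping, assuming a table with one row per country and an average net flow column (the table and column names are illustrative):

```sql
-- Classify countries as net importers or exporters of students
-- based on the sign of their average net flow.
SELECT
    country_name,
    avg_net_flow,
    CASE
        WHEN avg_net_flow > 0 THEN 'Importing'
        WHEN avg_net_flow < 0 THEN 'Exporting'
        ELSE 'Balanced'
    END AS mobility_role
FROM student_mobility
ORDER BY avg_net_flow DESC;
```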
License: https://creativecommons.org/publicdomain/zero/1.0/
Data was imported from the BAK file found here into SQL Server, and then individual tables were exported as CSVs. The Jupyter Notebook containing the code used to clean the data can be found here.
Version 6 includes some additional cleaning and structuring of issues noticed after importing the data into Power BI. Changes were made by adding code to the Python notebook to export a newly cleaned dataset, such as adding a MonthNumber column for sorting by month and, similarly, a WeekDayNumber column (a rough T-SQL analogue is sketched below).
Cleaning was done in Python, while SQL Server was also used to quickly inspect the data. Headers were added separately, ensuring no data loss. The data was cleaned for NaN and garbage values across the columns.
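The cleaning itself was done in the Python notebook; as a rough T-SQL analogue of the MonthNumber/WeekDayNumber derivation, assuming the standard AdventureWorks Sales.SalesOrderHeader table and its OrderDate column:

```sql
-- Derive month and weekday sort keys from a date column.
-- The weekday number returned by DATEPART depends on the DATEFIRST setting.
SELECT
    OrderDate,
    DATENAME(month, OrderDate)   AS MonthName,
    MONTH(OrderDate)             AS MonthNumber,
    DATENAME(weekday, OrderDate) AS WeekDayName,
    DATEPART(weekday, OrderDate) AS WeekDayNumber
FROM Sales.SalesOrderHeader;
```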
This dataset consists of basic statistics and career statistics provided by the NFL on their official website (http://www.nfl.com) for all players, active and retired.
All of the data was web scraped using Python code, which can be found and downloaded here: https://github.com/ytrevor81/NFL-Stats-Web-Scrape
Before we go into the specifics, it's important to note that in the basic statistics and career statistics CSV files, all players are assigned a 'Player_Id'. This is the same ID used by the official NFL website to identify each player. This is useful if, for example, you import these CSV files into a SQL database for an app.
The data pulled for each player in Active_Player_Basic_Stats.csv is as follows:
a. Player ID
b. Full Name
c. Position
d. Number
e. Current Team
f. Height
g. Height
h. Weight
i. Experience
j. Age
k. College
The data pulled for each player in Retired_Player_Basic_Stats.csv differs slightly from the previous data set. The data is as follows:
a. Player ID
b. Full Name
c. Position
f. Height
g. Height
h. Weight
j. College
k. Hall of Fame Status