This dataset was created by Nile Leggs
These datasets are used for the case study as the capstone project in Google Data Analytics course on Coursera
The datasets have a different name because Cyclistic is a fictional company. For the purposes of this case study, the datasets are appropriate and will enable you to answer the business questions. The data has been made available by Motivate International Inc. under this license.
This is public data that you can use to explore how different customer types are using Cyclistic bikes. But note that data-privacy issues prohibit you from using riders’ personally identifiable information. This means that you won’t be able to connect pass purchases to credit card numbers to determine if casual riders live in the Cyclistic service area or if they have purchased multiple single passes.
The following dataset was created for my capstone project as part of the "Google Data Analytics Certificate" course. The case study examines the usage patterns of different types of bike-share users (members and casual users) of the Cyclistic company in Chicago, US.
This finalized summary dataset was cleaned and analyzed using the R language in RStudio. It consists of the total number of rides and the average ride duration by user type for each day of the week, for the years 2019-2020.
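A minimal sketch of how a summary of this shape (ride count and average duration per user type and weekday) could be built in base R. The column names `usertype`, `day_of_week`, and `duration_min` and the sample rows are assumptions for illustration, not the actual schema or data of this dataset.

```r
# Hypothetical trip-level data; real column names and values may differ.
trips <- data.frame(
  usertype     = c("member", "member", "casual", "casual", "member", "casual"),
  day_of_week  = c("Mon", "Mon", "Sat", "Sat", "Tue", "Sat"),
  duration_min = c(10, 14, 30, 50, 8, 40)
)

# Count rides and average the duration within each (usertype, day_of_week) group.
n_rides <- aggregate(duration_min ~ usertype + day_of_week, data = trips, FUN = length)
avg_dur <- aggregate(duration_min ~ usertype + day_of_week, data = trips, FUN = mean)
names(n_rides)[3] <- "number_of_rides"
names(avg_dur)[3] <- "average_duration"

# Join the two summaries into one table keyed by user type and weekday.
summary_tbl <- merge(n_rides, avg_dur, by = c("usertype", "day_of_week"))
print(summary_tbl)
```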
I would like to thank Coursera for giving me the chance to learn R programming to apply data analytics on this particular case study.
This dataset was created by Mark Goddard
Capstone case study from Google Data Analytics Professional Certificate program.
This dataset was collected by Motivate International Inc. I've included only the last 12 months, from November 2020 to October 2021.
Welcome to the Cyclistic bike-share analysis case study! In this case study, you will perform many real-world tasks of a junior data analyst. You will work for a fictional company, Cyclistic, and meet different characters and team members. In order to answer the key business questions, you will follow the steps of the data analysis process: ask, prepare, process, analyze, share, and act. Along the way, the Case Study Roadmap tables — including guiding questions and key tasks — will help you stay on the right path.
You are a junior data analyst working in the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, your team wants to understand how casual riders and annual members use Cyclistic bikes differently. From these insights, your team will design a new marketing strategy to convert casual riders into annual members. But first, Cyclistic executives must approve your recommendations, so they must be backed up with compelling data insights and professional data visualizations.
Moreno, the director of marketing and your manager, has set a clear goal: Design marketing strategies aimed at converting casual riders into annual members. In order to do that, however, the marketing analyst team needs to better understand how annual members and casual riders differ, why casual riders would buy a membership, and how digital media could affect their marketing tactics. Moreno and her team are interested in analyzing the Cyclistic historical bike trip data to identify trends.
Moreno has assigned you the first question to answer: How do annual members and casual riders use Cyclistic bikes differently? You will produce a report with the following deliverables:
1. A clear statement of the business task
2. A description of all data sources used
3. Documentation of any cleaning or manipulation of data
4. A summary of your analysis
5. Supporting visualizations and key findings
6. Your top three recommendations based on your analysis
The link to the dataset was provided by the Coursera team in the case study description. The data itself was originally provided by Motivate International Inc. under a data license agreement. This is public data, licensed by Lyft Bikes and Scooters, LLC (“Bikeshare”), that can be used to explore how different customer types are using Cyclistic bikes.
The data is organized quarterly and yearly with several datasets to explore, so I chose the one for Q1-Q4 of 2019, and the link to the datasets is as follows: https://divvy-tripdata.s3.amazonaws.com/index.html
This dataset was created by Mohammed Mustafa
How Does a Bike-Share Navigate Speedy Success?
This is a case study for the Google Data Analytics Certificate. The project covers the business task, hypotheses, data pipeline, data visualization, and insights. If you think this notebook is helpful or needs improvement, please upvote this project. Thank you! Should you have any suggestions or further questions, please don't hesitate to leave a comment.
LinkedIn: /in/anas-aljarrah/
This dataset was created by R. Naga Amrutha
This dataset was created by Osman Mendoza
This data comes from the direct AWS link shared in the Google Data Analytics Capstone project course on Coursera. It is meant to be used for Track 1, Case Study 1: Cyclistic bike share. It was originally sourced from Divvy, a Chicago-based bike-sharing company, and made available for everybody to analyze.
https://creativecommons.org/publicdomain/zero/1.0/
Bellabeat is a small company that manufactures high-tech, health-focused smart devices and other wellness products for women around the world. Since Urška Sršen and Sando Mur founded the company in 2013, it has grown tremendously. They have now asked for an analysis of non-Bellabeat smart device usage and of how this data can be used to create new campaign strategies and drive future growth.
library(tidyverse)
library(lubridate)
library(dplyr)
library(ggplot2)
library(tidyr)
I utilized Fitbit Fitness tracker data, located here for this project.
activity <- read.csv("Fitabase_Data/dailyActivity_merged.csv")
calories <- read.csv("Fitabase_Data/dailyCalories_merged.csv")
sleep <- read.csv("Fitabase_Data/sleepDay_merged.csv")
weight <- read.csv("Fitabase_Data/weightLogInfo_merged.csv")
Using the View function, I can skim through the datasets and make sure everything was imported correctly. I will also use this time to see whether I need to clean the data in any way or format it differently.
View(activity)
View(calories)
View(sleep)
View(weight)
After viewing the datasets, I see that I need to convert the dates and times to a matching format across all the datasets.
# Parse each timestamp column, then store a common mm/dd/yy date string
sleep$SleepDay <- as.POSIXct(sleep$SleepDay, format = "%m/%d/%Y %I:%M:%S %p", tz = Sys.timezone())
sleep$date <- format(sleep$SleepDay, format = "%m/%d/%y")
activity$ActivityDate <- as.POSIXct(activity$ActivityDate, format = "%m/%d/%Y", tz = Sys.timezone())
activity$date <- format(activity$ActivityDate, format = "%m/%d/%y")
weight$Date <- as.POSIXct(weight$Date, format = "%m/%d/%Y %I:%M:%S %p", tz = Sys.timezone())
weight$time <- format(weight$Date, format = "%H:%M:%S")
weight$date <- format(weight$Date, format = "%m/%d/%y")
# ActivityDay is read in as text, so parse it before formatting
# (assumes the same %m/%d/%Y layout as ActivityDate)
calories$ActivityDay <- as.POSIXct(calories$ActivityDay, format = "%m/%d/%Y", tz = Sys.timezone())
calories$date <- format(calories$ActivityDay, format = "%m/%d/%y")
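As a small check of the idea behind this step, the sketch below parses two sample strings and confirms that they end up with the same date key. The sample values are assumptions inferred from the format strings used here, not rows copied from the Fitbit files, and the time zone is fixed to UTC only to keep the example reproducible.

```r
# Hypothetical raw values in the layouts implied by the format strings.
sleep_raw    <- "4/12/2016 12:00:00 AM"  # date plus 12-hour clock time
activity_raw <- "4/12/2016"              # date only

# Parse each string with the format that matches its layout.
sleep_dt    <- as.POSIXct(sleep_raw, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")
activity_dt <- as.POSIXct(activity_raw, format = "%m/%d/%Y", tz = "UTC")

# Reduce both to the same mm/dd/yy key so the tables can be compared or merged.
sleep_key    <- format(sleep_dt, format = "%m/%d/%y")
activity_key <- format(activity_dt, format = "%m/%d/%y")

print(identical(sleep_key, activity_key))
```

If a parse returns NA, the format string does not match the raw text, which is worth checking before any merge on the date column.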
Here I use the summary function to get the minimum, median, mean, and maximum for certain columns in the datasets (i.e., Total Steps, Calories, Active Minutes, Minutes Asleep, Sedentary Minutes).
activity %>%
  select(TotalSteps,
         TotalDistance,
         SedentaryMinutes, Calories) %>%
  summary()
activity %>%
  select(VeryActiveMinutes, FairlyActiveMinutes, LightlyActiveMinutes) %>%
  summary()
calories %>%
  select(Calories) %>%
  summary()
sleep %>%
  select(TotalSleepRecords, TotalMinutesAsleep, TotalTimeInBed) %>%
  summary()
weight %>%
  select(WeightKg, BMI) %>%
  summary()
ggplot(data = activity, aes(x = TotalSteps, y = Calories)) +
  geom_point(color = 'purple') + geom_smooth() + labs(title = "Total Steps vs. Calories")
- The second scatter plot showcas...
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by KB Gaiesky
Released under CC0: Public Domain
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
The datasets have a different name because Cyclistic is a fictional company. For the purposes of this case study, the datasets are appropriate and will enable you to answer the business questions. The data has been made available by Motivate International Inc. under the below license.
https://www.divvybikes.com/data-license-agreement
The dataset includes attributes such as start time, end time, start latitude and longitude, end latitude and longitude, and membership type.
This case study is part of the Google Data Analytics certificate.
As a junior data analyst, I have been assigned the task of answering the following question: "How do annual members and casual riders use Cyclistic bikes differently?"
To answer the assigned task, I have used 12 months of the data provided by the Cyclistic bike-share company.
One analysis was done in spreadsheets with the 2020-04 and 2020-05 data.
To adjust for outlier ride lengths (RL) like the maximum and minimum below:
Max RL: =MAX(N:N) gives 978:40:02
Min RL: =MIN(N:N) gives -0:02:56
I used TRIMMEAN to shave off the top and bottom of the dataset:
=TRIMMEAN(N:N,5%) gives 0:20:20
=TRIMMEAN(N:N,2%) gives 0:21:27
Otherwise, the average ride length for 2020-04 is 0:35:51.
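The same trimming idea can be sketched in R. The spreadsheet TRIMMEAN percentage is the total share of points excluded, split evenly between the two ends, whereas the `trim` argument of R's `mean()` is the share removed from each end. The helper name `trimmean_like` and the sample ride lengths (in seconds) are assumptions for illustration; only the two extreme values mirror the maximum and minimum reported above.

```r
# Hypothetical helper mimicking the spreadsheet behaviour: `percent` is the
# total fraction of points to exclude, half from the bottom and half from the top.
trimmean_like <- function(x, percent) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  k <- floor(n * percent / 2)          # points dropped from each end
  if (k > 0) x <- x[(k + 1):(n - k)]
  mean(x)
}

# Illustrative ride lengths in seconds: -176 s is -0:02:56 and
# 3523202 s is 978:40:02; the other values are made up.
ride_length_sec <- c(-176, 600, 900, 1200, 1500, 1800, 2100, 2400, 2700, 3523202)

plain_mean   <- mean(ride_length_sec)
trimmed_mean <- trimmean_like(ride_length_sec, 0.20)   # drops one point per end
r_builtin    <- mean(ride_length_sec, trim = 0.10)     # same trim, per-end fraction

print(c(plain = plain_mean, trimmed = trimmed_mean, builtin = r_builtin))
```

With these sample values, a single extreme ride dominates the plain mean while the trimmed mean stays near the typical ride length.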
The most common day of the week is Sunday (mode of DOW = 1). There are 61,148 rides by members and 23,628 by casual riders (COUNTIF over the member_casual column).
Pivot table 1 2020-04 member_casual AVERAGE of ride_length
The same calculations for 2020-05: Average RL 0:33:23; Max RL 481:36:53; Min RL -0:01:48; mode of DOW 7; COUNTIF member 113,365; COUNTIF casual 86,909; TRIMMEAN 0:25:22 and 0:26:59.
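For readers who want to reproduce these spreadsheet steps in R, here is a rough equivalent of MODE, COUNTIF, and the pivot-table average. The data frame is a tiny made-up sample, and the column names `day_of_week` and `ride_length_sec` are assumptions; only `member_casual` appears in the notes above.

```r
# Hypothetical sample; day_of_week uses 1 = Sunday ... 7 = Saturday.
rides <- data.frame(
  member_casual   = c("member", "casual", "member", "member", "casual", "member"),
  day_of_week     = c(1, 7, 1, 3, 1, 7),
  ride_length_sec = c(600, 1800, 900, 300, 2400, 1200)
)

# MODE equivalent: the most frequent day of week.
dow_counts <- table(rides$day_of_week)
mode_dow <- as.integer(names(dow_counts)[which.max(dow_counts)])

# COUNTIF equivalent: number of rides per rider type.
rider_counts <- table(rides$member_casual)

# Pivot-table equivalent: average ride length per rider type.
avg_by_type <- tapply(rides$ride_length_sec, rides$member_casual, mean)

print(mode_dow)
print(rider_counts)
print(avg_by_type)
```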
There are 4 pivot tables included in separate sheets for other comparisons.
I gathered this data using the sources provided by the Google Data Analytics course. All work seen is done by myself.
I want to explore the data further in SQL and Tableau.
This dataset was created by Hope Owens
This dataset was created by Andrew Oshobu