This data set includes information about individual rides made in a bike-sharing system covering the greater San Francisco Bay Area. Originally it contained 183,412 bike trips and 16 features. The cleaning process addressed the following:
- Incorrect datatypes for most of the columns
- Missing Values
- Inaccurate values in member birth year (e.g., the minimum year is 1878)
- Other gender
- Additional columns to represent age
- Calculating distance between stations
- Choosing necessary features
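
The distance step in the list above can be sketched as follows. This is a minimal sketch, assuming the distance between stations is the great-circle (haversine) distance between their coordinates; the source does not say which formula was used. The function name `haversine_km`, the Earth-radius constant, and the sample coordinates are illustrative assumptions, not values from the dataset.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))

# Hypothetical start and end station coordinates, for illustration only.
start = (37.7896, -122.4008)
end = (37.7764, -122.3950)
distance_km = haversine_km(*start, *end)
```

Applied to every trip, a function like this would produce the distance(km) feature from the start and end station coordinates.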
The cleaned dataset contains 15 features for 171,230 bike trips. The included features are:
- Trip Information: bike_id
- Station Information: start_station_name, end_station_name
- Member Information: user_type, member_gender, bike_share_for_all_trip

Derived Features/Variables:

- Trip Information: duration(min), start_hour, end_hour, start_day, end_day, day_of_month, week_in_month, and distance(km)
- Member Information: member_age
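
The derived time and age features listed above could be computed along these lines. This is a minimal sketch using only the standard library. The function `derive_trip_features`, the `reference_year` parameter, and the definition of week_in_month (days 1 to 7 are week 1, and so on) are assumptions made for illustration; the source does not specify how these columns were built.

```python
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def derive_trip_features(start_time, end_time, birth_year, reference_year):
    """Build the derived columns for one trip from its raw values."""
    duration_min = (end_time - start_time).total_seconds() / 60
    return {
        "duration(min)": round(duration_min, 2),
        "start_hour": start_time.hour,
        "end_hour": end_time.hour,
        "start_day": DAY_NAMES[start_time.weekday()],  # weekday(): Monday == 0
        "end_day": DAY_NAMES[end_time.weekday()],
        "day_of_month": start_time.day,
        # Assumed definition: days 1-7 -> week 1, days 8-14 -> week 2, ...
        "week_in_month": (start_time.day - 1) // 7 + 1,
        # Assumed definition: age relative to a chosen reference year.
        "member_age": reference_year - birth_year,
    }

# Made-up trip, for illustration only.
trip = derive_trip_features(
    datetime(2019, 2, 15, 8, 5, 0),
    datetime(2019, 2, 15, 8, 20, 30),
    birth_year=1990,
    reference_year=2019,
)
```

Filtering on member_age afterwards is one way implausible birth years, such as 1878, could be excluded.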

Key findings:

- Female riders spend, on average, a longer time per trip than male riders. Although there were hours when male riders surpassed female riders, this proved insignificant across all the trips made in a week.
- An average trip on weekends takes longer, since most weekday trips happen immediately before working hours, in between working hours, and immediately after working hours.
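
Comparisons like these come down to grouping trips and averaging their durations. The sketch below shows one way to do that with the standard library. The helper `mean_duration_by` is hypothetical, and the four rows are made-up values used only to exercise the code; they are not results from the dataset.

```python
from collections import defaultdict
from statistics import mean

WEEKEND = {"Saturday", "Sunday"}

def mean_duration_by(trips, key):
    """Average duration(min) per group, where key(trip) names the group."""
    groups = defaultdict(list)
    for trip in trips:
        groups[key(trip)].append(trip["duration(min)"])
    return {group: mean(values) for group, values in groups.items()}

# Made-up rows, for illustration only.
trips = [
    {"member_gender": "Female", "start_day": "Saturday", "duration(min)": 14.0},
    {"member_gender": "Male", "start_day": "Monday", "duration(min)": 9.0},
    {"member_gender": "Female", "start_day": "Tuesday", "duration(min)": 10.0},
    {"member_gender": "Male", "start_day": "Sunday", "duration(min)": 13.0},
]

by_gender = mean_duration_by(trips, lambda t: t["member_gender"])
by_day_type = mean_duration_by(
    trips, lambda t: "weekend" if t["start_day"] in WEEKEND else "weekday"
)
```

Passing a different `key` function, for example one that returns a (gender, start_hour) pair, gives the hour-by-hour breakdown with the same helper.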
For the presentation, I'll start with how people used the bike service by starting and ending hour, and show how it was used across the 7 days of the week. I will follow that with how each user type, Customers and Subscribers, used it by hour of the day and by day of the week. I will finish by showing which gender spends more time biking at which hour and on which day.