Hubway Capstone Project-- Demand Over Time of Day
Hubway is a bike-share program collectively owned by four metro Boston municipalities: Boston, Cambridge, Somerville, and Brookline. It is operated by Motivate, which manages similar initiatives in NYC, Portland, Chicago, Washington DC, and several other metro areas in Ohio, Tennessee, and New Jersey. Motivate is opening operations in San Francisco in June 2017. Hubway currently consists of 188 stations with 1,800 bikes.
For this project, I investigated shared data for the months of January, May, June, July, and October of 2015 and 2016.
The questions of concern were:
- How do riders use the bike-share service?
- Are the bikes used as a conveyance or for recreation?
- What type of customer uses the service?
Import Libraries
In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import datetime
import warnings
warnings.filterwarnings("ignore")
Read the data CSV file and create a data frame
In [2]:
# read_csv already returns a DataFrame, so no extra wrapping is needed
hubway_df = pd.read_csv('./hubway.csv')
hubway_df.info()
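As a side note on loading, `pd.read_csv` returns a DataFrame directly and can restrict the columns and parse timestamps while it reads. The sketch below illustrates this on a small in-memory sample standing in for `hubway.csv`; the sample rows and the category values `A` and `B` are invented for illustration, and only the column names come from this notebook.

```python
import io
import pandas as pd

# Hypothetical two-row sample standing in for hubway.csv
sample_csv = io.StringIO(
    "starttime,stoptime,end station category\n"
    "2016-06-01 08:01:12,2016-06-01 08:15:40,A\n"
    "2016-06-01 17:30:05,2016-06-01 17:52:59,B\n"
)

# Load only the columns of interest and parse the timestamps on the way in
sample_df = pd.read_csv(
    sample_csv,
    usecols=["starttime", "stoptime", "end station category"],
    parse_dates=["starttime", "stoptime"],
)
print(sample_df.dtypes)
```

Parsing on load would make the later `pd.to_datetime` calls unnecessary, at the cost of a slower read on a large file.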
Create dataframe of just the start times, stop times, and end station category
In [3]:
# .copy() gives an independent frame, so adding columns later does not touch hubway_df
dayhub = hubway_df[['starttime','stoptime', 'end station category']].copy()
dayhub.head()
Out[3]:
Convert the string objects to actual dates and times, then drop the date information and the seconds
In [4]:
dayhub['srt_time'] = pd.to_datetime(dayhub['starttime'])
#dayhub['srt_time'] = dayhub.index.map(lambda x: x.replace(second=0))
dayhub['stp_time'] = pd.to_datetime(dayhub['stoptime'])
#dayhub['stp_time'] = dayhub['stp_time'].values.astype('<M8[m]')
dayhub.info()
In [5]:
dayhub['Time'] = [d.time() for d in dayhub['stp_time']]
dayhub['Time'] = dayhub['Time'].map(lambda x: x.replace(second=0))
dayhub.tail(5)
Out[5]:
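The list comprehension and `map` above can also be written with pandas' vectorized `.dt` accessor. This is a minimal sketch of that alternative, not the method used in this notebook; the two sample timestamps and the names `stops` and `minute_of_day` are made up for illustration.

```python
import datetime
import pandas as pd

# Hypothetical stop times standing in for dayhub['stp_time']
stops = pd.Series(pd.to_datetime([
    "2016-06-01 08:15:40",
    "2016-06-01 17:52:59",
]))

# Floor each timestamp to the minute, then keep only the time of day
minute_of_day = stops.dt.floor("min").dt.time
print(minute_of_day.tolist())
```

The result is still a column of `datetime.time` objects with the seconds zeroed, so the later `value_counts()` step would work on it unchanged.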
Drop unneeded information leaving only the end station category and the time
In [6]:
# Pass axis by keyword; the positional form drop(labels, 1) no longer works in pandas 2.x
dayhub_vis = dayhub.drop(['starttime', 'stoptime', 'srt_time', 'stp_time'], axis=1)
dayhub_vis.tail()
Out[6]:
In [7]:
dayhub_vis.info()
In [8]:
hubway_demand = dayhub_vis['Time'].value_counts()
dayhub_dmnd = pd.DataFrame(hubway_demand)
dayhub_dmnd.columns = ['Demand']
dayhub_dmnd.tail()
Out[8]:
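Note that `value_counts()` orders rows by count, largest first, so `tail()` here shows the least busy minutes rather than the end of the day. A small sketch of how the counts could be put in chronological order, or aggregated to hourly totals, follows; the sample times and the names `per_minute` and `per_hour` are hypothetical.

```python
import datetime
import pandas as pd

# Hypothetical minute-truncated stop times standing in for dayhub_vis['Time']
times = pd.Series([
    datetime.time(8, 15), datetime.time(8, 15),
    datetime.time(17, 52), datetime.time(8, 40),
])

# Trips per minute, ordered by time of day rather than by count
per_minute = times.value_counts().sort_index()

# Coarser view: trips per hour of day
per_hour = times.map(lambda t: t.hour).value_counts().sort_index()
print(per_minute)
print(per_hour)
```

Hourly totals smooth out minute-to-minute noise, which may make the daily pattern easier to read than a per-minute line.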
In [10]:
ax = dayhub_dmnd.plot(kind='line', title="Total demand over the course of the day",
                      figsize=(15, 10), legend=True, fontsize=12)
plt.show()
Demand is clearly highest during rush hour, which points to a commuter-type customer.
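One way to put a number on the rush-hour observation is to compute the share of trips that end inside commuting windows. The sketch below assumes windows of 07:00-09:59 and 16:00-18:59 and uses invented hourly counts; neither the windows nor the numbers come from the Hubway data.

```python
import pandas as pd

# Hypothetical hourly demand (hour of day -> number of trips ending)
per_hour = pd.Series({7: 120, 8: 300, 9: 180, 12: 90, 17: 320, 18: 210, 22: 40})

# Assumed commuting windows: 07:00-09:59 and 16:00-18:59
rush_hours = [7, 8, 9, 16, 17, 18]

# Fraction of all trips that fall inside the assumed windows
rush_share = per_hour[per_hour.index.isin(rush_hours)].sum() / per_hour.sum()
print(f"Share of trips in assumed rush hours: {rush_share:.1%}")
```

With real data, a share well above the 25% that six of 24 hours would get under uniform demand would support the commuter interpretation.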