Real-time Bus Tracking using CrowdSourcing

R & D Project Report
Submitted in partial fulfillment of the requirements for the degree of Master of Technology
by
Deepali Mittal
153050016
under the guidance of
Prof. Bhaskaran Raman

Department of Computer Science and Engineering
Indian Institute of Technology, Bombay
Mumbai
April, 2016
Abstract

Buses are a major part of the public transport system. Bus services help reduce private car usage and fuel consumption. Most people use buses as their mode of transportation. When traveling by bus, however, travelers want to know the accurate arrival time of the bus so as to save time. Long waits at bus stops often discourage commuters and make them reluctant to take buses. Online bus timetables are available, but they provide very limited information and are not very accurate. We try to build a model that informs bus commuters of the exact location of the buses they are waiting for. We have simulated a model of a bus network and studied CrowdSourcing approaches to this problem. In one approach, users manually give the bus number and direction as input, while the other collects input from users automatically. Our model considered Participatory sensing with a varied number of passengers. According to our model, with around 600 passengers per day providing input, the estimated bus location is within 500 m of the actual location for 96 percent of queries. We also varied the share of passengers providing incorrect input from 0 percent to 25 percent. The average error with 0 and 25 percent of users providing wrong input is 244 m and 652 m respectively.
Contents

1 Introduction
1.1 Motivation
1.2 Problem Statement
1.3 Past/Related Work
1.4 Challenges
1.5 Approach
1.6 Result Summary
2 Simulator Design
2.1 State Transition
2.2 Event based triggering
3 User Inputs
3.1 Participatory Sensing
3.2 Opportunistic Sensing
4 Working of Simulator
4.1 Bus Spottings Capture
4.2 Bus Spottings converted to NowTime Spottings
4.3 Confidence Calculation
4.4 Computing Peaks
4.5 Computing BestPeak
5 Results and Error Analysis
5.1 Participatory Sensing
5.1.1 Plot of CDF of difference between actual bus position and estimated bus position
5.1.2 Plot of position confidence by varying number of passengers, all providing correct input
5.1.3 Plot of position confidence by varying number of passengers, some passengers providing incorrect direction input
6 Comparison of all Approaches
6.1 Comparison by varying number of passengers to get average error
7 Challenges and Future Work
8 Conclusion
References
List of Figures

1 Cycle of events in simulation
2 Transformation in time and space.[1]
3 Sample of time confidence calculated
4 Plot of CDF when varying number of users and all are providing correct input
5 Number of users=100 and all are providing correct input
6 Number of users=600 and all are providing correct input
7 Number of users=600 and all are providing correct input
8 Number of users=600 and 25 percent are providing incorrect input

List of Tables

1 Average error Vs Number of passengers
2 Average error Vs Number of passengers providing incorrect direction input
1 Introduction

Buses are a major part of the public transport system. Bus services help reduce private car usage and fuel consumption. Most people use buses as their mode of transportation. When traveling by bus, however, travelers want to know the accurate arrival time of the bus so as to save time. There are many scenarios where buses are late or cancelled, costing commuters both time and money. Long waits at bus stops often discourage commuters and make them reluctant to take buses. Online bus timetables are available, but they provide very limited information and are not very accurate. If accurate arrival times are provided to users, it will be beneficial in many ways and will attract more users to buses for their commute. Commuters can then schedule their journey better, e.g. by taking an alternative in case of extreme delay. We try to build a model that informs bus commuters of the exact location of the buses they are waiting for.

1.1 Motivation

Increased use of buses will result in decreased use of private transport, thereby reducing traffic congestion on roads. Providing accurate bus arrival times will improve the user experience and attract more users to buses. Nowadays, most bus operating companies publish their timetables on the web, but these provide very limited information and are not updated in a timely manner. Many factors can also delay a bus's arrival at a stop, such as traffic conditions and bad weather. One way to provide accurate locations is for the bus department to equip buses with GPS devices, but this incurs substantial cost. Another alternative is CrowdSourcing.
As most passengers are smartphone users nowadays, we can crowdsource information about the location of buses from those users and use that information to predict exact bus locations.

1.2 Problem Statement

To predict the locations of buses in Mumbai. In our project, we simulate a bus network and study the crowdsourcing approaches used to tackle this problem. The objectives of this project are to:

1. Find and understand crowdsourcing models that could be used to solve the problem.
2. Compare performance and analyse the behaviour of the system as various parameters change:
- Number of user inputs.
- Number of correct user inputs.
1.3 Past/Related Work

Predicting the current location of buses can improve the user experience during a journey. A lot of research is going on in the field of improving transport systems using crowdsourcing. Recently, some Android applications, such as My MTC in Chennai and Bangalore bus route timings, have gained popularity for their contribution. The CrowdSourcing approach is also used in Singapore to track bus locations.

1.4 Challenges

Almost every crowdsourced application faces the following challenges:

- A user can provide wrong manual input, e.g. in this case a user can give the wrong direction or bus number. In our model, we have considered wrong direction input and checked the resulting behaviour, explained later.
- Capturing the location requires the user to turn on GPS, which drains the phone's battery.
- Deciding incentives for every user providing correct input is also a challenge.
- Several instances of the same bus can be on a route at once, so a query must be mapped to the bus that will arrive soonest after the query time and is nearest to the stop from which the user queried. In our model, we address this by computing the confidence of a bus being present at different locations and then computing the best peak to answer the user's query, as explained in Sections 4.3, 4.4 and 4.5.

1.5 Approach

A crowdsourced application can be built in which users who see a bus at any bus stop report its location by giving the bus number and direction as input. Inputs from all users are collected, and we try to estimate the location of buses and provide results to users who query a particular bus's location and want to plan their journey accordingly. In our project we simulate a bus network to measure accuracy while varying parameters such as the number of passengers and the number of buses.

1.6 Result Summary

In our model, we have considered only one bus route in Mumbai, which has 61 bus stops and a total journey of 29 km.
We queried the system from a particular bus stop every 30 seconds for one day to get the bus location. We compared the results while varying the number of passengers from 100 to 1200. According to our model, 600 passengers traveling and providing input to the system give an accuracy of 96 percent with respect to the bus's actual location. We also varied the share of users providing wrong input from 0 to 25 percent and checked the average error. The average error when every user provides correct input is 244 m, and the average error when 25 percent of users provide wrong direction input is 652 m.

2 Simulator Design

We model a bus network across a city to understand the behaviour of crowdsourcing in answering the queries of different users. I started with an existing code-base, which was made for trains. In the previous code, the input was only 3 train routes, which were handled manually everywhere in the code; it now works for any number of input files. A consolidated csv file containing all Mumbai bus routes and their bus stops was provided, along with a file containing the latitude and longitude positions of all Mumbai bus stops. A Python script created a separate file for every bus route, fetching lat-long positions from the second file and using them to calculate the distance between consecutive bus stops. These files were the input to our simulator model. The lat-long positions of an entire bus route were then plotted on a map using GPSVisualizer, and the result was almost identical to the map already available on the web. The key components of the bus network are bus stops, routes, buses and passengers.

2.1 State Transition

State in the simulation contains information about the bus properties. The four states of the simulator are:

- Initialize: A new bus is added, initialized and ready to arrive at the initial bus stop to begin its journey.
- Arrival: The bus arrives at a bus stop and is ready to wait so that passengers can board.
- Waiting: The bus waits while passengers board. Users standing at the bus stop can also provide input at this point.
- Departure: The bus departs from the bus stop and is scheduled to arrive at the next bus stop (if one exists); otherwise a new bus is added.

The simulation begins at time t=0 and ends at t=86400 seconds, i.e. one day. A new bus is initialized every 30 minutes in both the up and down directions. Each state is mapped to a particular timestamp.
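
The route-file preparation described at the start of this section needs the distance between consecutive bus stops, computed from their latitude and longitude. The report does not say which distance formula the script used; the sketch below assumes the haversine great-circle formula. The function names and the tuple layout of a stop are hypothetical and chosen only for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat-long points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def consecutive_distances(stops):
    """stops: list of (name, lat, lon) tuples in route order (assumed layout).

    Returns one (from_name, to_name, distance_m) tuple per pair of neighbouring stops.
    """
    return [
        (a[0], b[0], haversine_m(a[1], a[2], b[1], b[2]))
        for a, b in zip(stops, stops[1:])
    ]
```

Straight-line distance between stops underestimates the road distance on winding routes, so a real preprocessing step might need road geometry instead.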
2.2 Event based triggering

Events change the state of buses. Each event has an event time, i.e. the time at which it will trigger. This is achieved with a priority queue, which always processes the event with the earliest event time. The list of events is:

- Initialize Bus: Schedules a new bus to arrive at the terminal stop after 0 seconds.
- Arrival of bus at bus stop: Schedules the bus to move to the waiting state at the bus stop.
- Bus waiting at bus stop: Schedules the bus to wait for some time at that bus stop and then depart. Passengers at that bus stop who want to board get into the bus and also provide input to the system. Other users standing at that bus stop (not boarding that bus) also provide input.
- Bus departing from bus stop: The time required for the bus to reach the next stop is calculated from the distance between the stops and the bus speed. A random delay of 2 minutes is also added to imitate natural behaviour. The arrival of the bus at the next stop is then scheduled.

Figure 1: Cycle of events in simulation
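
The event cycle above can be sketched with Python's `heapq` module as the priority queue. This is a minimal sketch for a single bus, not the simulator's actual code: the function name `simulate_bus` is hypothetical, the 4 m/s speed and 20 s stop delay are the values listed in Section 5, and the random delay is assumed to be drawn uniformly between 0 and 2 minutes because the report does not give its distribution.

```python
import heapq
import itertools
import random

BUS_SPEED_MPS = 4.0      # constant bus speed used in the experiments
STOP_WAIT_S = 20         # delay at every bus stop
MAX_EXTRA_DELAY_S = 120  # assumption: delay is uniform between 0 and 2 minutes


def simulate_bus(stop_gaps_m, start_time=0, seed=0):
    """Run one bus along a route.

    stop_gaps_m[i] is the distance in metres from stop i to stop i + 1.
    Returns the processed events as (time, event_name, stop_index) tuples.
    """
    rng = random.Random(seed)
    tie = itertools.count()  # tie-breaker so events with equal times keep insertion order
    queue = []
    log = []

    def schedule(time, name, stop):
        heapq.heappush(queue, (time, next(tie), name, stop))

    schedule(start_time, "initialize", 0)
    while queue:
        time, _, name, stop = heapq.heappop(queue)  # earliest event first
        log.append((time, name, stop))
        if name == "initialize":
            schedule(time, "arrival", stop)  # arrives at the first stop after 0 s
        elif name == "arrival":
            schedule(time, "waiting", stop)
        elif name == "waiting":
            schedule(time + STOP_WAIT_S, "departure", stop)
        elif name == "departure" and stop < len(stop_gaps_m):
            travel = stop_gaps_m[stop] / BUS_SPEED_MPS
            delay = rng.uniform(0, MAX_EXTRA_DELAY_S)
            schedule(time + travel + delay, "arrival", stop + 1)
    return log
```

Calling `simulate_bus([400, 800])` walks one bus through three stops. A fuller version would also schedule a new "initialize" event every 30 minutes in each direction and record user inputs during the waiting event.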
3 User Inputs

The entire calculation and system performance in any crowdsourcing application depend on user inputs. The idea is to gather data about bus spottings at different locations and times and then derive where the bus should be now (i.e. at the time of the query). Two crowdsourcing approaches to the problem are:

3.1 Participatory Sensing

This is the crowdsourcing model in which the user manually provides input to the system, i.e. active involvement of the user is necessary in data collection. In this model, users provide input when a bus arrives at a bus stop. Input can be provided by users boarding the bus as well as by users at the bus stop waiting for some other bus. Users provide input through an Android application. It uses the smartphone's GPS for location and time information in order to ensure correctness. According to recent studies, GPS has a location error of only 6 to 12 m and a time error of 100 ns. GPS is power consuming, so the user is required to turn it on only at the time of providing input. The user manually selects the route and direction of the spotted bus from a list shown based on the location. The user can still provide wrong input in terms of route and direction.

3.2 Opportunistic Sensing

This is a crowdsourcing model in which the user does not provide input manually, i.e. there is no active involvement of users. Users just need to open the application on their smartphones. The application remains on for the passenger's entire journey and provides input at a constant rate. GSM fingerprinting is used to get the input location and time, which is power efficient compared to GPS. The bus route can be determined by matching sensed cell tower IDs against cell tower IDs already stored in a database.
4 Working of Simulator

4.1 Bus Spottings Capture

Inputs provided by users are stored in a list. Each input consists of:

- UserID: The unique ID given to every user, so as to track all the inputs provided by that user.
- busstopname: In our model, users are assumed to provide input only at bus stops, so the location information of each input is the name of the stop at which the user provides it.
- time: The time at which the user provided the input.
- route and direction: In the Participatory sensing model, the user manually provides the bus number and the direction in which the bus is moving, i.e. up or down. In Opportunistic sensing, the route and direction are found from the entire trail of the user's input.

4.2 Bus Spottings converted to NowTime Spottings

Figure 2: Transformation in time and space.[1]

Transformation in terms of time: Raw user input has to be converted to nowtime, because the bus has moved since the time of input. User inputs are scattered throughout the route, each with its own timestamp. A confidence from the past is calculated and assigned to each reported spotting on the basis of how recent the input is, i.e. more recent inputs are assigned higher weight than older inputs. The distance along the route is also calculated, estimating the distance traveled by the bus in the interval (nowtime - input time) assuming an average bus speed. Figure 3 shows a snapshot of time confidence values for different spottings reported by users.
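
The time transformation of Section 4.2 can be illustrated with a short sketch. The report does not give the formula that turns the age of an input into a confidence, so the exponential decay and its half-life below are assumptions, as are the class and function names. The sketch also assumes that the stop position is measured in metres along the bus's direction of travel, and it ignores time spent waiting at stops.

```python
from dataclasses import dataclass

AVG_BUS_SPEED_MPS = 4.0  # average speed; 4 m/s is the value used in the experiments
HALF_LIFE_S = 600.0      # hypothetical: time confidence halves every 10 minutes


@dataclass
class Spotting:
    user_id: int
    stop_position_m: float  # distance of the reporting stop from the route start
    time_s: float           # time at which the input was given
    route: str
    direction: str          # "up" or "down"


def to_nowtime(spotting, now_s, route_length_m):
    """Project a spotting to the query time.

    Returns (estimated_position_m, time_confidence), or None when the bus
    would already have passed the end of the route.
    """
    age = now_s - spotting.time_s
    if age < 0:
        raise ValueError("spotting is later than the query time")
    position = spotting.stop_position_m + AVG_BUS_SPEED_MPS * age
    if position > route_length_m:
        return None
    confidence = 0.5 ** (age / HALF_LIFE_S)  # newer inputs get higher weight
    return position, confidence
```

With these assumed constants, a spotting made 600 seconds before the query is moved 2400 m along the route and keeps half of its confidence.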
Figure 3: Sample of time confidence calculated.

4.3 Confidence Calculation

Transformation in terms of space: The entire bus route is divided into 100 m segments. Each segment is assigned a weight based on the present location of the buses. This weight is called the position confidence of that segment. It is the probability that a bus exists in that particular segment at NowTime. Each input within 2000 m of a segment contributes a probability value, its overallconfidence, to that segment. Nearer points are assigned larger weight. overallconfidence is the product of the confidence calculated from time and the confidence calculated from distance.

4.4 Computing Peaks

After getting the confidence of a bus existing at each segment, we calculate the peaks where a bus can be present, i.e. every segment's confidence is compared to the confidence of its neighbouring segments. We compare against 10 neighbours in both the left and right directions. If a segment has more confidence than every segment it is compared with, it is considered a peak. According to our algorithm, for every bus instance on that route there exists a peak representing its estimated location.

4.5 Computing BestPeak

After getting the estimated location of every bus, we have to calculate the best peak so as to answer a user's query. The best peak is the location nearest to the querying user where a bus can be present at NowTime. The user thus gets to know the location of the bus that will arrive first at the bus stop from which the query was made.
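
Sections 4.3 to 4.5 can be sketched together as follows. The 100 m segments, the 2000 m range and the 10 neighbours per side come from the text; everything else is an assumption made for illustration. In particular, the distance confidence is assumed to fall off linearly, contributions are summed (giving an unnormalised score rather than a true probability), positions are assumed to be measured along the direction of travel, and the function names are hypothetical.

```python
import math

SEGMENT_M = 100   # segment length from the report
RANGE_M = 2000    # inputs within this distance of a segment contribute
NEIGHBOURS = 10   # neighbours compared on each side when finding peaks


def position_confidence(nowtime_spottings, route_length_m):
    """nowtime_spottings: list of (position_m, time_confidence) pairs.

    Returns one confidence score per 100 m segment of the route.
    """
    n_segments = math.ceil(route_length_m / SEGMENT_M)
    scores = [0.0] * n_segments
    for position, time_conf in nowtime_spottings:
        for seg in range(n_segments):
            centre = seg * SEGMENT_M + SEGMENT_M / 2
            gap = abs(centre - position)
            if gap <= RANGE_M:
                distance_conf = 1.0 - gap / RANGE_M  # assumed linear fall-off
                scores[seg] += time_conf * distance_conf  # overallconfidence
    return scores


def find_peaks(scores):
    """Indices of segments whose score beats all 10 neighbours on each side."""
    peaks = []
    for i, score in enumerate(scores):
        if score <= 0:
            continue
        lo = max(0, i - NEIGHBOURS)
        hi = min(len(scores), i + NEIGHBOURS + 1)
        if all(score > scores[j] for j in range(lo, hi) if j != i):
            peaks.append(i)
    return peaks


def best_peak(peaks, query_position_m):
    """Nearest peak at or before the query stop, i.e. the bus that arrives first."""
    upstream = [p for p in peaks if p * SEGMENT_M <= query_position_m]
    return max(upstream) if upstream else None
```

For two projected spottings at 1050 m and 5050 m on a 10 km route, `find_peaks` returns segments 10 and 50, and a query from a stop at 3000 m is answered with segment 10, the closest bus that has not yet reached the stop.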
5 Results and Error Analysis

The results of any crowdsourcing application depend mostly on the inputs provided by users. We varied some of the input parameters and analysed the behaviour of the system. The parameters are:

- We run the simulation for 86400 seconds, i.e. one day.
- We vary the number of passengers from 100 to 1200.
- One bus route with 61 bus stops is considered.
- Bus speed is constant at 4 m/s.
- Delay at every bus stop is 20 seconds.
- We query the system every 30 seconds from 2000 seconds to 86400 seconds.
- In the Participatory sensing model, we vary the percentage of users providing wrong direction input from 0 percent to 25 percent.

5.1 Participatory Sensing

5.1.1 Plot of CDF of difference between actual bus position and estimated bus position

Figure 4: Plot of CDF when varying number of users and all are providing correct input
According to Figure 4, when 100 passengers provide correct input, only 60 percent of all queries locate the bus with an error of less than 500 m, and this percentage increases with the number of passengers. As shown in Figure 4, if 600 passengers provide correct input, then 96 percent of all queries have an error of less than 500 m. The percentage stays roughly constant if we increase the number of passengers further. Therefore, 600 or more passengers give almost accurate results.

5.1.2 Plot of position confidence by varying number of passengers, all providing correct input

As shown in Figure 5 and Figure 6, the peaks in position confidence appear only in the Up direction, and their height, i.e. the probability of finding a bus in that segment, increases with the number of passengers. Position confidence is very low in Figure 5 compared to Figure 6.

Figure 5: Number of users=100 and all are providing correct input

Figure 6: Number of users=600 and all are providing correct input
5.1.3 Plot of position confidence by varying number of passengers, some passengers providing incorrect direction input

The difference can be seen in Figures 7 and 8: in Figure 8, many bus spottings carry the wrong direction, because of which our algorithm estimates buses to be traveling in the opposite direction.

Figure 7: Number of users=600 and all are providing correct input

Figure 8: Number of users=600 and 25 percent are providing incorrect input
6 Comparison of all Approaches

6.1 Comparison by varying number of passengers to get average error

As shown in Table 1, average error decreases as the number of passengers increases up to 600 and then levels off, because more passengers provide bus spotting input and our algorithm becomes more accurate.

No. of passengers     100      200     400     600  800
Average error (in m)  1732.71  718.01  272.86  244  249.67

Table 1: Average error Vs Number of passengers

As shown in Table 2, average error increases with the number of users giving incorrect direction input. Incorrect input produces many extra position confidence peaks, and our algorithm estimates buses to be traveling in the opposite direction.

% of users giving incorrect input  0    5    15   20   25
Average error (in m)               244  306  463  619  652

Table 2: Average error Vs Number of passengers providing incorrect direction input
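
The two statistics reported in Sections 5 and 6, the average error and the share of queries with an error below 500 m, can be computed from the per-query errors as in the sketch below. The function names are hypothetical, and any input values used with them here are for illustration only, not results from the simulation.

```python
def average_error(errors_m):
    """Mean of the per-query errors, in metres."""
    return sum(errors_m) / len(errors_m)


def fraction_within(errors_m, threshold_m=500.0):
    """Share of queries whose error is strictly below the threshold."""
    return sum(1 for e in errors_m if e < threshold_m) / len(errors_m)


def empirical_cdf(errors_m):
    """Sorted (error, cumulative fraction) points, suitable for plotting a CDF."""
    ordered = sorted(errors_m)
    n = len(ordered)
    return [(e, (i + 1) / n) for i, e in enumerate(ordered)]
```

Each error is the absolute difference between the actual and estimated bus position for one query, so one simulated day with a query every 30 seconds yields a few thousand values per configuration.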
7 Challenges and Future Work

Users can give wrong input in many ways, and simulating every possibility and finding a way to reduce the resulting errors is a challenge. Incorrect inputs from users lead to inaccurate results. If the user's GPS is on for a substantial amount of time, we could read the user's GPS trace to infer the direction even when a manual (and possibly wrong) direction is provided. But GPS is power consuming, and users may not want to keep GPS switched on for their whole journey.

In future, we would like to make the following improvements:

- Study the behaviour of the model for all bus routes.
- Study the behaviour when users give wrong input in terms of bus number.
- Study the behaviour of our model for Opportunistic sensing.
- The current system has no feature for providing incentives, so we can incorporate incentives and then analyse the behaviour of the system.
- Build a reputation system that can help reduce wrong inputs from users.
- Build an Android application that gives the current location of a bus from inputs provided by users.
8 Conclusion

Buses are a major part of the public transport system. However, buses often arrive late at bus stops for reasons such as traffic and bad weather. Passengers are therefore always interested in knowing the exact bus arrival time or location so that they can plan an alternative accordingly, saving time or money. Official timetables are provided online, but they are not updated in a timely manner and provide limited information. One possible solution is to install GPS devices on every bus to track its location, but this approach incurs heavy cost as there are many buses in Mumbai. Another approach is Crowdsourcing. We modeled a simulator of a bus network across a city and designed different crowdsourcing approaches to this problem. In one approach, passengers manually report the location, time, bus number and direction of buses passing by (Participatory Sensing). Another approach is to take input from users automatically without their active participation (Opportunistic Sensing). This information can then be processed to predict bus locations at the time a user queries about a particular bus. Our model considered Participatory sensing with a varied number of passengers. According to our model, with around 600 passengers per day providing input, the estimated bus location is within 500 m of the actual location for 96 percent of queries. We also varied the share of passengers providing incorrect input from 0 percent to 25 percent. The average error with 0 and 25 percent of users providing wrong input is 244 m and 652 m respectively. In the near future, we would like to check performance for the Opportunistic sensing model, add modules related to incentives so as to attract more users to provide input, build a reputation model to increase the number of correct inputs from users, and finally build an Android application.

Acknowledgement

I would like to thank Prof. Bhaskaran Raman for his regular directions, guidance and valuable feedback during my entire project work.

References

[1] Aurobindo Mondal. Real time train location using crowdsourcing. http://www.cse.iitb.ac.in/synerg/doku.php?id=public:students:aurobindo:home, October 2015.