Date of Award




Document Type


Degree Name

Doctor of Philosophy (PhD)


Department of Information Science

Content Description

1 online resource (xv, 201 pages) : color illustrations, color maps.

Dissertation/Thesis Chair

Catherine T Lawson

Committee Members

Mei-Hwa Chen, Feng Chen


GTFS, Machine Learning, Passenger Data, Weather Model, Bus travel, Bus stops, Traffic engineering

Subject Categories

Library and Information Science


According to Commuting in the United States 2009, 86.1% of Americans commuted by car, light truck, or van, and about three-quarters of these individuals drove alone, which contributes to traffic congestion and raises environmental and energy concerns. Transportation experts therefore encourage the public to use public transportation and recommend the development of Bus Rapid Transit (BRT). Currently, bus service restructuring and BRT plans are based on rider surveys, community meetings, and on-street interviews. These methods require large investments in manpower and material resources, and they can produce biased results. In this research, the author used machine learning, in which a computer program automatically analyzes a large body of data and identifies the most relevant information, to evaluate current bus station usage and determine potential BRT station locations. The station features considered include passenger activity (boarding and alighting), distance to the previous and next stations, and topography. The passenger data, collected by a local transit agency in Albany, NY, in 2008/2009, were classified by time period and weather condition. The author also studied General Transit Feed Specification (GTFS) data in depth and developed a GTFS data visualization website, built on the Google Maps API, to retrieve station distance and topography data. After different algorithms were tested, the Expectation-Maximization (EM) algorithm and K-Means were found to perform best for clustering the stations. While the machine learning strategy can produce a comprehensive evaluation of all stops, it is inadequate when a specific route is of interest. Therefore, based on K-Means, the author developed a BRT station selection tool that clusters the stops of a specific route.
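To make the clustering step concrete, the following is a minimal sketch of K-Means applied to per-stop features. It is not the author's selection tool: the abstract does not specify how features were encoded or how the algorithm was configured. The stop IDs, passenger counts, and distances are invented for illustration, topography is omitted, and the deterministic farthest-first seeding is a simplification chosen here in place of the usual random initialization so that the result is reproducible.

```python
import math

# Hypothetical stops on one route (all values invented for illustration):
# (stop_id, boardings, alightings, km_to_next_stop)
STOPS = [
    ("A", 120, 110, 0.30),
    ("B", 8, 5, 0.20),
    ("C", 95, 130, 0.40),
    ("D", 12, 9, 0.25),
    ("E", 6, 10, 0.20),
    ("F", 140, 125, 0.35),
]

def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for a constant column so we never divide by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def farthest_first(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the
    first point, then repeatedly add the point farthest from all seeds."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids

def kmeans(points, k, max_iter=100):
    """Plain Lloyd iteration: assign each point to its nearest centroid,
    recompute centroids as cluster means, stop when labels no longer change."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

features = standardize([list(s[1:]) for s in STOPS])
labels, centroids = kmeans(features, k=2)
cluster_of = {s[0]: lab for s, lab in zip(STOPS, labels)}
for stop_id, lab in cluster_of.items():
    print(stop_id, lab)
```

With these invented numbers, the high-activity stops (A, C, F) fall into one cluster and the low-activity stops (B, D, E) into the other. Standardizing first matters because raw passenger counts would otherwise dominate the much smaller distance values in the Euclidean metric.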