Penulis/Author |
FAHMI DZIKRULLAH (1) ; Ir. Noor Akhmad Setiawan, S.T., M.T., Ph.D., IPM. (2); Prof. Ir. Selo, S.T., M.T., M.Sc, Ph.D., IPU, ASEAN Eng. (3) |
Abstrak/Abstract |
The popularity of Bus Rapid Transit (BRT) has made Trans Jogja an alternative mass public transportation system for urban mobility. However, without monitoring the temporal patterns of passenger behavior in Trans Jogja with respect to supply and demand, the number of BRT users will decrease and the number of private vehicle users will increase, so traffic jams will remain difficult to avoid. The Smart Card Automated Fare Collection System (SCAFCS), currently used for e-ticketing on Trans Jogja public transport, can be used to analyze passenger patterns with data mining approaches. This paper preprocesses SCAFCS data with a data warehouse mechanism and implements the Hadoop platform for distributed computing to improve the scalability of K-Means++ clustering on large datasets; in this case, the Trans Jogja SCAFCS dataset is large (volume) and grows rapidly (velocity). The scalable K-Means++ algorithm generates five clusters, labeled Very Low, Low, Average, High, and Very High. The clusters were used to analyze passenger patterns along three dimensions: time (temporal); passenger segmentation (structure), to determine the variability of passengers based on the card they used; and transaction peaks by boarding location (spatial). The experiments compared the Sum of Squared Errors (SSE), the total squared distance between the points in the k clusters and their centroids, across three algorithms: simple K-Means, K-Means++, and K-Means++ implemented on the Hadoop platform for parallel and distributed computing. K-Means++ on the Hadoop platform produced a smaller SSE than the simple K-Means and K-Means++ algorithms, which indicates a better clustering by this measure. |
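As an illustration of the evaluation measure described in the abstract, the following is a minimal single-machine sketch of K-Means++ seeding, Lloyd iterations, and the SSE computation. It is not the authors' Hadoop implementation; the function names (`squared_distance`, `sse`, `kmeans_pp_init`, `lloyd`) and the toy data are hypothetical and chosen only for this example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def sse(points, centroids):
    """Sum of Squared Errors: each point's squared distance to its nearest centroid."""
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


def kmeans_pp_init(points, k, rng):
    """K-Means++ seeding: pick each new centroid with probability proportional
    to its squared distance from the nearest centroid chosen so far.
    Assumes the data contains at least k distinct points."""
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        weights = [min(squared_distance(p, c) for c in centroids) for p in points]
        centroids.append(rng.choices(points, weights=weights, k=1)[0])
    return centroids


def lloyd(points, centroids, iterations=10):
    """Standard K-Means refinement: assign points to the nearest centroid,
    then move each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            # Keep the old centroid if a cluster ends up empty.
            tuple(sum(coord) / len(members) for coord in zip(*members)) if members else c
            for members, c in zip(clusters, centroids)
        ]
    return centroids


# Hypothetical toy data: (hour of day, boardings) pairs, not taken from the paper.
points = [(6, 40), (7, 120), (8, 150), (12, 60), (13, 55), (17, 140), (18, 130), (22, 10)]
rng = random.Random(0)
seeds = kmeans_pp_init(points, 5, rng)
final = lloyd(points, seeds)
print("SSE after seeding:", sse(points, seeds))
print("SSE after refinement:", sse(points, final))
```

A lower SSE means the points lie closer to their assigned centroids, which is the basis on which the three algorithms are compared.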