Data Science Technology

About

As connected IoT devices and smart city infrastructure generate ever more data, and as computing power grows, Machine Learning (ML) and Artificial Intelligence (AI) methods have gained prominence in transportation research. The effectiveness of an ML or AI model depends heavily on the quality of its training data. However, because transportation systems are complex, acquiring enough data to develop such models for system analysis is challenging.

To tackle this obstacle, the research uses digital twin technology as a testbed to generate realistic data on human movement and system dynamics. Within a generative adversarial network (GAN) framework, the digital twin acts as the generator, producing synthetic data under various circumstances, such as weather events (hurricanes, wildfires, or floods) and human events (large gatherings, holiday travel). This augmentation enriches the datasets available to ML or AI models used in transportation system analysis. Furthermore, because the digital twin is agent-based, the research can quantify and control the biases introduced when generating synthetic data. This capability supports the development of responsible AI models that help public agencies and decision-makers make informed decisions.

The research focuses on three key areas: Data Privacy Control, Synthetic Data Generation, and Traffic Condition Prediction.

Data Privacy Control

The transportation system's data boom allows information to be shared among all participants. However, the privacy of users and operators must be protected when this data is analyzed. To this end, the study proposes a privacy control mechanism that enables secure data exchange. Techniques under study include Generative Adversarial Networks (GANs) and the Upper Confidence Bound (UCB) algorithm.
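As a rough illustration of how a UCB algorithm trades off exploration and exploitation, the sketch below implements the generic UCB1 bandit rule in plain Python. It is not the method of the publications listed in this section. Treating each arm as a candidate privacy setting, the Bernoulli reward, and the names `ucb1_select` and `run_ucb1` are all assumptions made for illustration only.

```python
import math
import random


def ucb1_select(counts, rewards, t, c=2.0):
    """Return the index of the arm with the highest upper confidence bound.

    counts[a]  : number of times arm a has been tried so far
    rewards[a] : total reward collected from arm a so far
    t          : current round (1-based)
    """
    # Try every arm once before the bound is well defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Empirical mean plus an exploration bonus that shrinks as an arm is tried more.
    scores = [
        rewards[a] / counts[a] + math.sqrt(c * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_ucb1(true_means, rounds=2000, seed=0):
    """Simulate UCB1 on hypothetical arms with Bernoulli rewards.

    true_means is an illustrative assumption: the unknown probability that
    a given (hypothetical) privacy setting yields an acceptable data exchange.
    """
    rng = random.Random(seed)
    counts = [0] * len(true_means)
    rewards = [0.0] * len(true_means)
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, rewards, t)
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        rewards[arm] += reward
    return counts


if __name__ == "__main__":
    # Three hypothetical settings; the last one has the highest success rate.
    print(run_ucb1([0.2, 0.5, 0.8]))
```

In this toy setting the rule concentrates its trials on the best-performing arm while still sampling the others occasionally, which is the behavior that makes UCB attractive when the payoff of each setting is initially unknown.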

  • He, Y. B., & Chow, J. Y. J. (2019). Optimal privacy control for transport network data sharing. Transportation Research Part C: Emerging Technologies. doi: 10.1016/j.trc.2019.07.010.

  • He, Y. B., Chow, J. Y. J., & Nourinejad, M. (2017). A privacy design problem for sharing transport service tour data. 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC). doi: 10.1109/itsc.2017.8317692.

Synthetic Data Generation

As IoT devices and smart city infrastructure generate more data, ML and AI methods have become popular in transportation research. However, collecting enough training data for these models is difficult because transportation systems are complex. To overcome this challenge, the research uses digital twin technology to generate realistic data on human movement and system dynamics. Integrated with the GAN method, the digital twin can create synthetic data under various circumstances, such as weather events and human events. The enriched dataset can then be used by ML and AI models that analyze transportation systems.

  • Coming soon.

Traffic Condition Prediction

Traffic prediction plays a crucial role in efficient hurricane evacuation. Accurate predictions of traffic states let traffic management agencies deploy resources and operational strategies safely and smoothly. However, accurate traffic prediction during evacuation remains challenging because of heterogeneous human behavior, the scarcity of traffic data during evacuations, and the uncertainty of hurricane events. The research presents a comprehensive modeling framework that uses Multilayer Perceptron (MLP) and Long Short-Term Memory (LSTM) models to capture long-term congestion patterns and short-term speed patterns during hurricane evacuation.
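To give a sense of why an LSTM suits short-term speed sequences, the sketch below writes out the standard LSTM cell recurrence for a scalar input and scalar state in plain Python. It is not the model from the research: the weights are untrained placeholders, the speed values are made up, and the names `lstm_step` and `encode_sequence` are hypothetical. A real model would use vector states and learned weights in a deep learning library.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One forward step of a scalar LSTM cell.

    params maps each gate name ("i", "f", "o", "g") to a tuple
    (input weight, recurrent weight, bias). All values are placeholders.
    """
    pre = {}
    for name in ("i", "f", "o", "g"):
        w_x, w_h, b = params[name]
        pre[name] = w_x * x + w_h * h_prev + b
    i = sigmoid(pre["i"])    # input gate: how much new information to write
    f = sigmoid(pre["f"])    # forget gate: how much old memory to keep
    o = sigmoid(pre["o"])    # output gate: how much memory to expose
    g = math.tanh(pre["g"])  # candidate value to write into memory
    c = f * c_prev + i * g   # updated cell state (the memory)
    h = o * math.tanh(c)     # updated hidden state (the output)
    return h, c


def encode_sequence(speeds, params):
    """Run the cell over a sequence of normalized speeds; return the final hidden state."""
    h, c = 0.0, 0.0
    for x in speeds:
        h, c = lstm_step(x, h, c, params)
    return h


if __name__ == "__main__":
    # Untrained, illustrative weights and a made-up normalized speed sequence.
    params = {"i": (0.5, 0.1, 0.0), "f": (0.3, 0.2, 1.0),
              "o": (0.4, 0.1, 0.0), "g": (0.8, 0.3, 0.0)}
    print(encode_sequence([0.9, 0.7, 0.4, 0.3], params))
```

The forget and input gates let the cell decide at each time step whether to carry earlier observations forward or overwrite them, which is what allows the hidden state to summarize a recent window of speed readings.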

  • He, Y. B., Jiang, Q., & Ma, J. (2023). Machine-learning models for traffic condition prediction and evacuation during the hurricane. IEEE ITS Magazine. [Under review]