Session: 17-01-01: Research Posters
Paper Number: 150735
150735 - Generation and Analysis of Electric Vehicle Synthetic Driving Data
The generation and analysis of synthetic driving data offer a powerful approach to advancing the development and optimization of Battery Electric Vehicles (BEVs). Collecting field data while driving electric vehicles is costly and time-consuming, and sharing that data raises privacy and security challenges. The resulting scarcity of field data degrades the performance of numerical operations: models trained on too little data are less accurate, generalize poorly, and predict less reliably. Synthetic data generation can offer significant advantages for driving evaluation. Where field data is limited and replicating varied scenarios is difficult and costly, synthetic data can improve the performance of numerical operations by increasing the diversity and quantity of the data. It also supports more varied and in-depth analysis during training, helping models perform well across different driving scenarios.
In this study, synthetic driving data was generated using the Gaussian copula, one of the most effective statistical methods available in the literature. The difference in feature correlations between the synthetic data and the field data was determined. Data pre-processing consisted of outlier cleaning, missing-data filling, and normalization, after which the training phase began. To determine the anomaly rates in the synthetic data, anomaly detection was performed with a Long Short-Term Memory (LSTM) autoencoder, a method suited to time series. The LSTM was configured with 128 hidden layers, a dropout rate of 0.2, the 'adam' optimizer, an initial learning rate of 0.01, and 200 epochs. Root mean square error (RMSE) was calculated to assess LSTM learning performance.
The field data, comprising 35,000 samples, was obtained by driving a battery electric vehicle in a suitable environment and recording the CAN bus during the drive. Both the field and synthetic data include vehicle speed, acceleration, pedal ratio, torque, and power.
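The pre-processing applied to these signals (outlier cleaning, missing-data filling, and normalization) is not specified in detail. As a rough illustration only, the sketch below assumes an interquartile-range rule for outliers, forward filling for gaps, and min-max scaling; all three choices and the function names are assumptions, not the study's documented procedure.

```python
from statistics import quantiles

def mask_outliers(values, k=1.5):
    """Replace values outside [Q1 - k*IQR, Q3 + k*IQR] with None (assumed rule)."""
    observed = [v for v in values if v is not None]
    q1, _, q3 = quantiles(observed, n=4)
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v if v is not None and lo <= v <= hi else None for v in values]

def fill_missing(values):
    """Forward-fill None entries; leading gaps take the first observed value."""
    last = next(v for v in values if v is not None)
    filled = []
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def min_max_normalize(values):
    """Scale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

def preprocess(values):
    """Outlier cleaning, then gap filling, then normalization."""
    return min_max_normalize(fill_missing(mask_outliers(values)))

# Hypothetical signal with one gap and one spike.
signal = [10.0, 12.0, None, 11.0, 500.0, 13.0, 12.0, 11.0, 10.0]
cleaned = preprocess(signal)
```

Forward filling suits a sampled time series such as a CAN bus log because it never uses future values, but interpolation would be an equally plausible choice.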
According to the results, the average correlation difference between field and synthetic data is 10.85%. In anomaly rate detection with the LSTM autoencoder, the RMSE for the field data is 0.23137. The per-feature anomaly rates in the field data are speed 4.48%, acceleration 12.42%, pedal 8.48%, torque 12.28%, and power 10.64%, with an overall anomaly rate of 15.24%. The RMSE for the derived synthetic data is 0.24464. The per-feature anomaly rates in the synthetic data are speed 6.23%, acceleration 5.54%, pedal 7.94%, torque 6.17%, and power 7.93%, with an overall anomaly rate of 14.26%.
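The abstract does not define how these metrics were computed. The sketch below shows one plausible reading, under stated assumptions: the correlation difference is taken as the mean absolute difference between pairwise Pearson coefficients of the two data sets, and a sample counts as anomalous when its reconstruction error exceeds the mean error plus two standard deviations. The `reconstructed` inputs stand in for the output of an autoencoder; the threshold rule and all names are hypothetical.

```python
from statistics import mean, pstdev

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def mean_correlation_difference(field, synthetic):
    """Mean absolute gap between pairwise feature correlations (assumed definition)."""
    names = list(field)
    diffs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r_field = pearson(field[a], field[b])
            r_synth = pearson(synthetic[a], synthetic[b])
            diffs.append(abs(r_field - r_synth))
    return sum(diffs) / len(diffs)

def rmse(actual, reconstructed):
    """Root mean square error between a signal and its reconstruction."""
    squared = [(a - r) ** 2 for a, r in zip(actual, reconstructed)]
    return (sum(squared) / len(squared)) ** 0.5

def anomaly_rate(actual, reconstructed, n_sigma=2.0):
    """Fraction of samples whose error exceeds mean + n_sigma * std (assumed rule)."""
    errors = [abs(a - r) for a, r in zip(actual, reconstructed)]
    threshold = mean(errors) + n_sigma * pstdev(errors)
    return sum(e > threshold for e in errors) / len(errors)

# Hypothetical reconstruction with one poorly reconstructed sample.
actual = [0.0] * 10
reconstructed = [0.1] * 9 + [5.0]
rate = anomaly_rate(actual, reconstructed)
```

`anomaly_rate` returns a fraction; multiplying by 100 gives a percentage comparable in form to the figures reported above. The result is sensitive to the threshold rule, so a different rule would give different rates.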
The results show that the data generation method produced synthetic driving data with characteristics similar to those of the field data. It therefore offers a practical alternative to lengthy and laborious machine learning methods. Appropriate metrics also show that the synthetic data resembles the field data while having a lower overall anomaly rate (14.26% versus 15.24%). The methodology preferred in the study has the potential to make significant contributions to automotive, aviation, and energy projects in the USA. Topics that require reliable data, such as energy efficiency in electric vehicles, carbon footprint in airlines, and grid support efficiency in wind turbines, could be supported by the outputs of this study.
Presenting Author: Onur Can Kalay Texas Tech
Presenting Author Biography: He is a post-doc researcher at Texas Tech University.
Authors:
Efe Savran Bursa Uludag University
Onur Can Kalay Texas Tech
Fatih Karpat Bursa Uludag University
Stephen Ekwaro-Osire Texas Tech
Paper Type
Poster Presentation