Classifying Vehicle Categories Based on Technical Specifications Using Random Forest and SMOTE for Data Augmentation

Dwi Sugianto
Tri Wahyuningsih

Abstract

This study investigates the use of machine learning to classify vehicles by their technical specifications with the Random Forest algorithm. The objective was to build a robust classification model that assigns vehicles to six classes: Hybrid, SUV, Sedan, Sports, Truck, and Wagon. The analysis used a dataset whose features include engine size, horsepower, weight, and fuel efficiency, with vehicle class as the target variable. To address class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to balance the training data. The model performed particularly well on Sedans, achieving perfect recall and a high F1-score, but struggled with underrepresented classes such as Hybrid and Wagon. Even with SMOTE applied, performance on minority classes remained suboptimal, which highlights the challenges of highly imbalanced datasets. The study contributes to the field of vehicle classification by demonstrating the use of Random Forest for this task and by providing insight into the difficulties posed by imbalanced class distributions. The findings underscore the importance of feature selection, especially of numerical attributes such as horsepower and engine size, for improving classification accuracy. The study also identified limitations, including potential biases in the dataset and the difficulty of improving performance for minority vehicle classes. Future research should explore alternative algorithms such as XGBoost or deep learning models, and consider expanding the dataset to include more diverse vehicle types. This work has practical implications for vehicle market segmentation, offering insights for manufacturers, dealerships, and analysts seeking to optimize vehicle classification and improve market targeting strategies.
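As a rough illustration of the oversampling step described above, the following is a minimal sketch of the SMOTE idea using only the Python standard library: each synthetic point is placed on the line segment between a minority-class sample and one of its nearest minority-class neighbours. The function name `smote_oversample`, the two features shown, and all sample values are hypothetical and chosen for illustration only; they are not taken from the study's dataset, and the abstract does not state which SMOTE implementation or parameters the authors used.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples
    and their k nearest minority-class neighbours (illustrative sketch)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical TRAINING rows only: (engine size in litres, horsepower).
train = {
    "Sedan": [(2.0, 150), (2.4, 170), (2.5, 180), (3.0, 220), (1.8, 130), (3.5, 260)],
    "Hybrid": [(1.5, 110), (1.4, 93), (2.0, 73)],
}

# Oversample every class up to the size of the largest class.
target = max(len(rows) for rows in train.values())
balanced = {
    label: rows + smote_oversample(rows, target - len(rows), k=2)
    for label, rows in train.items()
}
```

Because the synthetic points are interpolations of existing minority samples, they add no information beyond what those few samples already contain, which is consistent with the reported result that minority-class performance stayed weak after balancing. Oversampling is applied to the training split only, as in the abstract, so that evaluation data contains no synthetic rows.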

How to Cite
[1]
D. Sugianto and T. Wahyuningsih, “Classifying Vehicle Categories Based on Technical Specifications Using Random Forest and SMOTE for Data Augmentation”, Int. J. Appl. Inf. Manag., vol. 5, no. 4, pp. 179–191, Nov. 2025.