Classifying Vehicle Categories Based on Technical Specifications Using Random Forest and SMOTE for Data Augmentation
Abstract
This study applies machine learning to classify vehicles by their technical specifications using the Random Forest algorithm. The objective was to build a robust model that assigns each vehicle to one of six classes: Hybrid, SUV, Sedan, Sports, Truck, and Wagon. The dataset included features such as engine size, horsepower, weight, and fuel efficiency, along with the target variable, vehicle class. To address class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to balance the training data. The model performed particularly well on Sedans, achieving perfect recall and a high F1-score, but struggled with underrepresented classes such as Hybrid and Wagon. Even with SMOTE, the model’s performance on minority classes remained suboptimal, highlighting the challenges of highly imbalanced datasets. The study contributes to the field of vehicle classification by demonstrating the use of Random Forest for this task and by providing insight into the difficulties posed by imbalanced class distributions. The findings underscore the importance of feature selection, especially of numerical attributes such as horsepower and engine size, in improving classification accuracy. The study also identified limitations, including potential biases in the dataset and the difficulty of improving performance for minority vehicle classes. Future research should explore alternative algorithms such as XGBoost or deep learning models, and consider expanding the dataset to include more diverse vehicle types. The work has practical implications for vehicle market segmentation, offering insights for manufacturers, dealerships, and analysts seeking to refine vehicle classification and improve market targeting strategies.
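The abstract does not give the authors' implementation, so the following is only a minimal sketch of the oversampling idea behind SMOTE: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest same-class neighbours. The function name `smote_oversample`, the parameter defaults, and the toy feature values (engine size in litres, horsepower) are hypothetical and chosen for illustration. In practice a library implementation such as the `SMOTE` class in imbalanced-learn would normally be used instead.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, k=5, seed=0):
    """Simplified SMOTE: grow every minority class to the majority class size."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        if n == target or len(rows) < 2:
            continue  # majority class, or too few points to interpolate between
        for _ in range(target - n):
            base = rng.choice(rows)
            # up to k nearest same-class neighbours of the base point (itself excluded)
            neighbours = sorted(
                (r for r in rows if r is not base),
                key=lambda r: math.dist(base, r),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # position along the segment from base to neighbour
            X_out.append([b + gap * (o - b) for b, o in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


# Made-up training rows: [engine size (L), horsepower]
X_train = [[2.0, 150], [2.4, 170], [1.8, 140], [2.5, 180], [3.5, 280], [4.0, 300]]
y_train = ["Sedan", "Sedan", "Sedan", "Sedan", "SUV", "SUV"]

X_bal, y_bal = smote_oversample(X_train, y_train, k=1)
print(Counter(y_bal))
```

Only the training split is resampled here, matching the abstract's statement that SMOTE was applied to the training data; test data should keep its original class distribution so that recall and F1-score reflect real-world imbalance. Because neighbours are found by Euclidean distance, features on very different scales (such as litres and horsepower) would usually be standardized first.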
Article Details

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with International Journal for Applied Information Management agree to the following terms: Authors retain copyright and grant the International Journal for Applied Information Management right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike License (CC BY-SA 4.0) that allows others to share (copy and redistribute the material in any medium or format) and adapt (remix, transform, and build upon the material) the work for any purpose, even commercially, with an acknowledgement of the work's authorship and initial publication in International Journal for Applied Information Management. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in International Journal for Applied Information Management. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (see The Effect of Open Access).