Evaluating the Performance of Random Forest Algorithm in Classifying Property Sale Amount Categories in Real Estate Data


Les Endahti
Muhammad Shihab Faturahman

Abstract

This study evaluates the performance of the Random Forest algorithm in classifying property sale categories in real estate data. The dataset, comprising over one million records of property sales from 2001 to 2022, includes features such as sale amount, assessed value, sales ratio, property type, and residential type. The primary objective is to determine how well the algorithm predicts property sale categories and to assess how these predictions can aid market segmentation and property valuation. After preprocessing the data by removing irrelevant columns and handling missing values, we applied the Random Forest classifier to predict five key property types: 'Single Family', 'Residential', 'Condo', 'Two Family', and 'Three Family'. The model achieved an accuracy of 82.98%, with high recall for categories such as 'Single Family' and 'Condo', but lower recall for 'Residential', which we attribute to the diverse nature of that category. The findings suggest that the Random Forest algorithm performs well in predicting certain property types, but that improvements are needed for categories with more variation. The study highlights the importance of selecting relevant features: sale amount and assessed value were found to be the most influential in determining property type. Real estate professionals can use such models for more accurate market segmentation, leading to better pricing and marketing strategies. The study also acknowledges limitations, such as the complexity of the 'Residential' category and potential data imbalance. Future research could incorporate additional features, such as location-specific data or detailed property descriptions, and test alternative algorithms to further improve classification accuracy.
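The gap between overall accuracy and per-category recall noted in the abstract can be illustrated with a minimal Python sketch. The labels below are hypothetical toy data, not the study's records, and the helper names `accuracy` and `per_class_recall` are assumptions introduced only for illustration; they are not the authors' code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of records whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall per label: correctly predicted records / all records of that label."""
    support = Counter(y_true)  # number of true records per label
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per label
    return {label: hits[label] / support[label] for label in support}


# Hypothetical toy labels (NOT taken from the study's dataset):
# a dominant class predicted perfectly and a small class predicted poorly.
y_true = ["Single Family"] * 6 + ["Condo"] * 2 + ["Residential"] * 2
y_pred = ["Single Family"] * 6 + ["Condo"] * 2 + ["Single Family", "Residential"]

overall = accuracy(y_true, y_pred)
recalls = per_class_recall(y_true, y_pred)

print(f"accuracy: {overall:.2f}")
for label, value in recalls.items():
    print(f"recall[{label}]: {value:.2f}")
```

In this toy case the overall accuracy is high while recall for the smallest category is much lower, which is why per-category recall is worth reporting alongside a single accuracy figure when categories are imbalanced.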

Article Details

How to Cite
[1] L. Endahti and M. S. Faturahman, “Evaluating the Performance of Random Forest Algorithm in Classifying Property Sale Amount Categories in Real Estate Data,” Int. J. Appl. Inf. Manag., vol. 5, no. 4, pp. 192–202, Nov. 2025.
Section
Articles