Date of Award

12-2024

Document Type

Project

Degree Name

Master of Science in Information Systems and Technology

Department

Information and Decision Sciences

First Reader/Committee Chair

Shayo Conrad

Abstract

Real estate is a major asset class that involves constructing, buying, and selling property. Although platforms such as Zillow and Redfin have made buying and selling property online much easier, tools for finding distressed properties as investment opportunities remain lacking. This culminating experience project explored how machine learning can classify a property as distressed or non-distressed. The research questions are: Q1. How can Natural Language Processing methods like Latent Dirichlet Allocation (LDA) be leveraged to identify distressed and non-distressed real estate properties? (Tijare & Rani, 2020) and Q2. How can machine learning models help categorize real estate listings into distressed vs. non-distressed properties based on textual features? (Narozhnyi & Kharchenko, 2024). For Q1, the LDA model with four topics produced the highest coherence score, 94%, and identified the keywords associated with distressed properties. For Q2, the Multi-Layer Perceptron (MLP) model achieved the highest F1 score (94%) and a testing accuracy of 93%, and it predicted the probability that a property is distressed. We conclude that four-topic LDA modelling yields the most coherent topics, whose keywords signal distressed properties, and that the MLP is the best-performing of the models evaluated for predicting distressed properties. Future work could extend the study to other asset classes, such as multifamily, commercial, and mixed-use properties, and to additional geographical regions for a broader perspective.
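
For readers unfamiliar with the two classification metrics reported above, the following is a minimal sketch of how accuracy and F1 are computed for a binary distressed vs. non-distressed task. The function name `accuracy_and_f1` and the label lists are hypothetical illustrations, not the project's code or data, and the encoding 1 = distressed is an assumption.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, F1) for binary labels, where 1 = distressed (assumed encoding)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)

    # F1 is the harmonic mean of precision and recall,
    # which simplifies to 2*TP / (2*TP + FP + FN).
    denom = 2 * tp + fp + fn
    f1 = (2 * tp / denom) if denom else 0.0
    return accuracy, f1


# Hypothetical labels for eight listings (not the project's data).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.2f}, F1={f1:.2f}")
```

F1 ignores true negatives, so it can differ noticeably from accuracy when distressed listings are rare; reporting both, as the abstract does, gives a fuller picture.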
