Perfect Comps

Identifying Comparable Real Estate Properties using Machine Learning

Abstract

For a traditional, manual real estate appraisal, the appraiser is required to provide a number of comparable properties (the 'comps'). These comps act as a benchmark for the valuation and provide context in the final appraisal report. They are typically selected by hand from recent transactions within a ten-mile radius, a process biased by the appraiser's market knowledge and by the number of transactions in the area. To replace this process, we developed an automated comparable selection service that selects comps based on objective characteristics, without restricting itself to a small spatial and/or temporal slice of the market.
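As a rough illustration of what such feature-based selection could look like (the feature set, normalization, and distance metric below are assumptions for illustration, not the model the service actually uses), comps could be ranked by distance in a normalized feature space:

# Hypothetical sketch: rank candidate properties by similarity to a target
# property using objective characteristics. Feature names are assumptions.
import numpy as np

FEATURES = ["living_area_m2", "plot_area_m2", "build_year", "num_rooms"]

def select_comps(target, candidates, k=5):
    """Return the k candidate properties most similar to the target.

    `target` is a dict of feature values; `candidates` is a list of such dicts.
    """
    matrix = np.array([[c[f] for f in FEATURES] for c in candidates], dtype=float)
    # Normalize each feature so no single characteristic dominates the distance.
    mean, std = matrix.mean(axis=0), matrix.std(axis=0) + 1e-9
    normed = (matrix - mean) / std
    target_vec = (np.array([target[f] for f in FEATURES], dtype=float) - mean) / std
    # Euclidean distance in normalized feature space as a simple similarity proxy.
    distances = np.linalg.norm(normed - target_vec, axis=1)
    best = np.argsort(distances)[:k]
    return [candidates[i] for i in best]

Normalizing each feature keeps a single large-valued characteristic, such as plot area, from dominating the distance.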

Comparable selection has no objective ground truth, which complicates or even rules out many machine learning algorithms that might otherwise have been applied. Additionally, the service's output needs to be explainable to its users; it cannot be completely opaque. Finally, the service had to integrate with a streaming data platform, in which new data arrives continuously and must be incorporated into the service's model and output.

Our research phase focused on three questions: which algorithms can select comparable properties under the constraints of explainability and a streaming environment, how the output of the chosen algorithm can be explained to the user, and how a service can be built around the chosen model that consumes a stream of input data and generates a set of comparable properties on demand.

Our process followed an agile methodology, with two-week sprints in which we gradually expanded the service into a fully functional proof of concept. The main challenges arose while developing the logic to incrementally construct a database of real estate properties from the data stream; we resolved these by switching from a document store to an RDF database, which better matched the shape of the incoming data.
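As a minimal sketch of how such incremental construction could work (the rdflib library, the example vocabulary, and the message format are assumptions for illustration, not the project's actual stack), each streamed message can simply be appended to the graph as triples:

# Hypothetical sketch: fold streamed property updates into an RDF graph.
# The EX vocabulary and message fields are assumptions.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/realestate/")
graph = Graph()

def apply_update(message: dict) -> None:
    """Add the facts carried by one streamed message as triples."""
    prop = EX[message["property_id"]]
    graph.add((prop, RDF.type, EX.Property))
    for field, value in message.items():
        if field == "property_id":
            continue
        graph.add((prop, EX[field], Literal(value)))

# Partial updates about the same property accumulate in the graph over time.
apply_update({"property_id": "P123", "living_area_m2": 95})
apply_update({"property_id": "P123", "build_year": 1987})

Because an RDF graph is just a set of triples, a message that carries only part of a property's attributes can be merged directly, without the read-modify-write cycle a document store would require.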

The final product consists of several microservices, each of which handles part of the problem domain and can be scaled out independently. Users interact with the system through a REST API and a web front-end. The system was tested with both unit tests and end-to-end tests, while the model was refined by scoring its output on the closeness of features indicative of similarity.
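To give an idea of what the REST API could look like (the framework, route, and in-memory data below are illustrative assumptions, and the toy ranking merely stands in for the actual selection model), a comps endpoint might resemble:

# Hypothetical sketch of a comps endpoint; framework (FastAPI), route, and
# data are assumptions for illustration only.
from fastapi import FastAPI, HTTPException

app = FastAPI()

# Stand-in for the property database built from the data stream.
PROPERTIES = {
    "P123": {"living_area_m2": 95.0, "build_year": 1987},
    "P456": {"living_area_m2": 102.0, "build_year": 1990},
    "P789": {"living_area_m2": 60.0, "build_year": 2015},
}

@app.get("/properties/{property_id}/comps")
def get_comps(property_id: str, k: int = 2):
    """Return the k most comparable properties for the given property."""
    if property_id not in PROPERTIES:
        raise HTTPException(status_code=404, detail="unknown property")
    target = PROPERTIES[property_id]

    # Toy ranking: smaller summed absolute feature difference = more comparable.
    # The real service would delegate to the comparable-selection model instead.
    def distance(other):
        return sum(abs(target[f] - other[f]) for f in target)

    ranked = sorted(
        (pid for pid in PROPERTIES if pid != property_id),
        key=lambda pid: distance(PROPERTIES[pid]),
    )
    return {"property_id": property_id, "comps": ranked[:k]}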

The current model used by the service is fairly simple and can most likely be improved once more data becomes available. Additionally, because there is no ground truth, it will be important to tune both the comparable selection and the explanation metrics in response to user testing.