Digital Shark Expo

Samuele Da Mar

Project title

Semantic Data Engine (SDE)

Project aim

The primary objective is to implement a scalable ML system that serves features to an Automated Valuation System (AVS) for predicting property prices.

Project outline

Determining optimal property prices in the London real estate market is a complex challenge for agents and investors. This research applies Machine Learning Operations (MLOps) principles to develop a Semantic Data Engine (SDE) that streamlines feature engineering, a significant pain point for data scientists.

The SDE leverages an event-driven architecture using Apache Kafka and Apache Cassandra to efficiently process and transform raw real estate data into meaningful features for machine learning models. The project explores data pipelines, orchestration platforms, and feature stores to enhance productivity in a business context.
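As a rough illustration of the event-to-feature step described above, the following sketch turns one raw listing event into a feature row. The field names (property_id, price_gbp, floor_area_sqm, bedrooms, postcode), the PropertyFeatures class, and both functions are hypothetical and not taken from the project. A plain Python iterable stands in for a Kafka topic, and a dictionary stands in for a Cassandra table.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class PropertyFeatures:
    """Hypothetical feature row; the real SDE schema is not given in the source."""
    property_id: str
    price_per_sqm: float
    bedrooms: int
    outward_postcode: str


def event_to_features(raw_event: bytes) -> PropertyFeatures:
    """Transform one raw JSON listing event into model-ready features."""
    record = json.loads(raw_event)
    price = float(record["price_gbp"])
    area = float(record["floor_area_sqm"])
    return PropertyFeatures(
        property_id=str(record["property_id"]),
        price_per_sqm=round(price / area, 2),
        bedrooms=int(record["bedrooms"]),
        # Outward code is the first part of a UK postcode, e.g. "SW1A".
        outward_postcode=record["postcode"].split()[0].upper(),
    )


def process_stream(events, sink: dict) -> None:
    """Consume events one by one and upsert features keyed by property id.

    `events` stands in for a Kafka consumer loop and `sink` for a
    Cassandra table; both are assumptions made to keep the sketch runnable.
    """
    for raw in events:
        features = event_to_features(raw)
        sink[features.property_id] = asdict(features)
```

Keeping the transformation as a pure function of a single event means it can be tested without a broker or database, and the same function could in principle be reused behind a real consumer.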

System performance is evaluated with Locust, which load-tests the SDE under concurrent user requests. The implemented SDE demonstrates the successful application of MLOps principles and an event-driven architecture in the real estate domain.
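The idea behind load testing with concurrent requests can be sketched with the Python standard library alone. This is not Locust and not the project's test script: it is a simplified stand-in that uses a thread pool, and fetch_features is a hypothetical placeholder for an HTTP call to the feature-serving endpoint. The request count and concurrency level are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def fetch_features(property_id: str) -> dict:
    """Hypothetical stand-in for a request to the feature-serving endpoint."""
    time.sleep(0.001)  # simulate a small amount of service latency
    return {"property_id": property_id}


def timed_call(property_id: str) -> float:
    """Return the wall-clock latency of one request, in seconds."""
    start = time.perf_counter()
    fetch_features(property_id)
    return time.perf_counter() - start


def run_load_test(n_requests: int = 50, concurrency: int = 10) -> dict:
    """Issue n_requests across `concurrency` worker threads and summarise latency."""
    ids = [f"p-{i:03d}" for i in range(n_requests)]
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(timed_call, ids))
    # Nearest-rank style 95th percentile, clamped to the last sample.
    p95_index = min(len(latencies) - 1, int(0.95 * len(latencies)))
    return {
        "requests": len(latencies),
        "median_s": median(latencies),
        "p95_s": latencies[p95_index],
    }


if __name__ == "__main__":
    print(run_load_test())
```

A dedicated tool such as Locust adds what this sketch lacks, including user behaviour modelling, ramp-up control, and live reporting.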

The research methodology involved a literature review, project planning, data collection, and SDE development using technologies such as Kafka, Cassandra, and Mage. The project report discusses legal, social, ethical, and professional considerations and proposes future enhancements to the SDE.

Images

Snapshot of bert pipeline