Machine Learning

Real Estate RAG

Production-grade retrieval-augmented generation over Azerbaijani real-estate listings with cited answers.

Overview

An end-to-end Retrieval-Augmented Generation system that scrapes pasharealestate.az via Firecrawl, chunks pages with structure-aware overlap, and stores embeddings in PostgreSQL with pgvector. A single SQL CTE runs hybrid retrieval, fusing HNSW vector kNN with full-text search via Reciprocal Rank Fusion. Answers from Anthropic Claude are streamed over Server-Sent Events with strict per-claim [Sn] citations. The project ships with an offline evaluation harness that tracks recall and citation rate across query sets.
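To make the fusion step concrete, here is a minimal Python sketch of Reciprocal Rank Fusion over two ranked lists. It is an illustration only: the project performs the fusion inside a SQL CTE, and the function name, the chunk IDs, and the constant k = 60 (a commonly used default) are assumptions, not values taken from the project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=None):
    """Fuse several best-first ranked lists of chunk IDs into one ranking.

    A chunk at 1-based rank r in a list contributes 1 / (k + r) to its
    fused score; chunks missing from a list contribute nothing for it.
    k = 60 is an assumed default, not a value confirmed by the project.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by chunk ID for determinism.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused if top_n is None else fused[:top_n]


# Hypothetical chunk IDs standing in for the vector-search and
# full-text-search result lists.
vector_hits = ["c3", "c1", "c7"]
text_hits = ["c1", "c9", "c3"]
fused = reciprocal_rank_fusion([vector_hits, text_hits])
```

Because RRF uses only rank positions, it needs no normalization between vector distances and full-text relevance scores, which is one reason it suits hybrid retrieval. Chunks that appear in both lists accumulate two contributions and tend to rise to the top.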

Key highlights

Tech stack

Topics