WENYUFAN
RAG-Based User Content Summariser

PMTech · Hybrid FAISS + BM25 retrieval · Evidence-grounded summaries

The Brief

A hybrid retrieval system that combines vector search and keyword matching to produce evidence-grounded content summaries.

Category: Computer Science / Tech
Role: Team Lead
Year: 2025
Tech Stack: Python, FAISS, BM25, LLM, Django

Problem

Scattered user-generated content on platforms like Reddit makes information-seeking tedious and unreliable.

Significance

Users spend excessive time filtering through noisy content to find relevant, trustworthy information.

My Contribution

Led the development of a hybrid retrieval system that combines FAISS (vector similarity) and BM25 (keyword matching) to produce evidence-grounded summaries from diverse user-generated content (UGC) sources.
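
The hybrid retrieval idea can be illustrated with a minimal, dependency-free Python sketch. It is not the project's actual code, and several parts are assumptions made for illustration: the page does not say how the two rankings are merged, so reciprocal rank fusion is used here as one common option; the vector side is a toy character-trigram cosine similarity standing in for a FAISS index over learned embeddings; and every function name (`bm25_scores`, `trigram_vector`, `hybrid_search`) is hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of every document for the query (keyword side)."""
    tokenized = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    n = len(docs)
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def trigram_vector(text):
    """Assumed stand-in for an embedding: character-trigram counts."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_positions(scores):
    """Map each document index to its 1-based rank (best score first)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return {doc_id: rank for rank, doc_id in enumerate(order, start=1)}


def hybrid_search(query, docs, top_k=3, rrf_k=60):
    """Fuse keyword and vector rankings with reciprocal rank fusion (assumed)."""
    sparse = rank_positions(bm25_scores(query, docs))
    query_vec = trigram_vector(query)
    dense = rank_positions([cosine(query_vec, trigram_vector(d)) for d in docs])
    fused = {
        i: 1.0 / (rrf_k + sparse[i]) + 1.0 / (rrf_k + dense[i])
        for i in range(len(docs))
    }
    best = sorted(fused, key=fused.get, reverse=True)[:top_k]
    return [(i, docs[i]) for i in best]
```

In a pipeline like the one described, the passages returned by `hybrid_search` would be handed to the LLM as the evidence for the summary. Rank-based fusion is chosen in this sketch because BM25 scores and cosine similarities live on different scales, so combining ranks avoids having to normalise either score.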