Project: Evalgen

What is it?

  • Scrapes web pages and PDFs as input sources
  • Uses an LLM to generate questions, expected answers, and telecom category labels
  • Stores results in Firestore and serves them via a JSON API endpoint
  • Includes a clean web UI for browsing generated datasets
  • Deployed on GCP Cloud Run

The Evalgen interface

Backstory

Built during my internship at Capgemini/Telia, where the team needed evaluation data for an AI support assistant. The challenge: the assistant was being trained on a knowledge base of product documentation, but there was no structured dataset of the questions a real customer might ask, which made evaluation difficult.

Evalgen solves this by turning any web page or PDF into a set of plausible Q&A pairs, tagged with a broad support category. The outputs can feed directly into an evaluation framework like Langfuse to measure how well an agent handles different query types.
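The generation step can be sketched roughly like this: build a prompt that asks the LLM for structured JSON, then validate what comes back before storing it. Everything here (the category list, `QAPair`, the function names) is hypothetical illustration, not Evalgen's actual code:

```python
import json
from dataclasses import dataclass

# Hypothetical label set; the real categories were telecom/Telia-specific.
CATEGORIES = ["billing", "mobile", "broadband", "tv", "other"]

@dataclass
class QAPair:
    question: str
    answer: str
    category: str

def build_prompt(source_text: str, n: int = 5) -> str:
    """Ask the model for a JSON array of Q&A pairs grounded in the source text."""
    return (
        f"From the documentation below, write {n} questions a customer might ask, "
        "with concise expected answers. Respond with a JSON array of objects "
        f"with keys 'question', 'answer', and 'category' (one of {CATEGORIES}).\n\n"
        f"---\n{source_text}"
    )

def parse_response(raw: str) -> list[QAPair]:
    """Validate the model's JSON output; skip malformed or off-label items."""
    pairs = []
    for item in json.loads(raw):
        if item.get("category") in CATEGORIES and item.get("question") and item.get("answer"):
            pairs.append(QAPair(item["question"], item["answer"], item["category"]))
    return pairs
```

Validating against a fixed category list matters in practice: LLMs occasionally invent labels, and an evaluation framework downstream expects a closed set.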

The architecture is straightforward: a Python FastAPI app running on GCP Cloud Run, with Firestore as the storage backend. The scraping layer handles both HTML and PDF inputs. Results are available through a web UI and a JSON endpoint so they can be consumed by other tools in the pipeline.

The main technical challenge was IAM — making sure each GCP service (Cloud Run, Firestore, the scraping worker) had the right service account with the minimum necessary permissions. Getting that right without over-provisioning took most of the debugging time.
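The least-privilege setup boils down to a few gcloud commands. A hedged sketch, with placeholder project and account names; the Firestore role shown (`roles/datastore.user`, read/write without admin rights) is the standard minimal choice, but the actual roles used may have differed:

```shell
# Dedicated service account for the Cloud Run service (names are placeholders)
gcloud iam service-accounts create evalgen-run \
    --display-name="Evalgen Cloud Run service"

# Grant only Firestore read/write, not owner/editor on the project
gcloud projects add-iam-policy-binding MY_PROJECT \
    --member="serviceAccount:evalgen-run@MY_PROJECT.iam.gserviceaccount.com" \
    --role="roles/datastore.user"

# Deploy Cloud Run under that account instead of the default compute SA
gcloud run deploy evalgen \
    --service-account=evalgen-run@MY_PROJECT.iam.gserviceaccount.com
```

The trap this avoids is the default compute service account, which often carries broad Editor-level permissions; a per-service account keeps the blast radius small.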

Technical details

Stack

  • Python (FastAPI)
  • GCP Cloud Run
  • Firestore (document storage)
  • OpenAI API / Vertex AI (Q&A generation)
  • Web scraping (HTML + PDF)
  • IAM / Service Accounts