Multimodal RAG Console

1) Ingest Files

Supported formats: PDF, DOC/DOCX/TXT, images (PNG/JPG), and audio (MP3/WAV/M4A/OGG/FLAC/WebM).
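
One way to route uploads by type is a simple extension lookup. This is a minimal sketch, not the console's actual code: the extensions come from the list above, while `ROUTES`, `route_for`, and the route names are hypothetical.

```python
from pathlib import Path

# Hypothetical mapping from file extension to an ingestion route.
# Extensions mirror the supported-format list; route names are assumptions.
ROUTES = {
    ".pdf": "pdf",
    ".doc": "document", ".docx": "document", ".txt": "document",
    ".png": "image", ".jpg": "image",
    ".mp3": "audio", ".wav": "audio", ".m4a": "audio",
    ".ogg": "audio", ".flac": "audio", ".webm": "audio",
}


def route_for(filename: str) -> str:
    """Return the ingestion route for a file, or raise ValueError if unsupported."""
    ext = Path(filename).suffix.lower()  # case-insensitive match on the extension
    try:
        return ROUTES[ext]
    except KeyError:
        raise ValueError(f"unsupported file type: {ext or filename!r}") from None
```

For example, `route_for("talk.MP3")` resolves to the audio route, and an unlisted extension raises `ValueError` instead of being ingested silently.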

2) Build / Rebuild Index

Embeds text (including audio transcripts) with e5-small and writes a FAISS index.
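
The embed-then-index flow can be sketched with the standard library alone. Everything here is a stand-in and an assumption: `embed` is a toy hashed bag-of-words function in place of e5-small, and `FlatIndex` is a brute-force inner-product search in place of a FAISS index. Only the overall shape (embed each chunk, store the vectors, search by similarity) reflects the step described above.

```python
import hashlib
import math

DIM = 256  # toy vector size, chosen only for this illustration


def embed(text: str) -> list[float]:
    """Toy stand-in for e5-small: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


class FlatIndex:
    """Brute-force inner-product index, standing in for a FAISS index."""

    def __init__(self) -> None:
        self.vectors: list[list[float]] = []
        self.texts: list[str] = []

    def add(self, texts: list[str]) -> None:
        # Embed and store each chunk alongside its original text.
        for text in texts:
            self.vectors.append(embed(text))
            self.texts.append(text)

    def search(self, query: str, k: int = 1) -> list[tuple[float, str]]:
        # Score every stored vector against the query and keep the best k.
        q = embed(query)
        scored = [
            (sum(a * b for a, b in zip(q, v)), t)
            for v, t in zip(self.vectors, self.texts)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


def build_index(chunks: list[str]) -> FlatIndex:
    """Rebuild from scratch: embed every chunk into a fresh index."""
    index = FlatIndex()
    index.add(chunks)
    return index
```

A real build would swap `embed` for the e5-small encoder and `FlatIndex` for FAISS, then persist the index to disk. Rebuilding from scratch, as `build_index` does, keeps the index consistent with whatever files are currently ingested.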

3) Ask

The endpoint returns only the top-1 result. The response's summary mirrors hit.text.
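
A minimal sketch of that response shape, under stated assumptions: the `summary` field and `hit.text` come from the description above, while the `Hit` dataclass, its `score` and `source` fields, and `build_response` are hypothetical names used only for illustration.

```python
from dataclasses import asdict, dataclass


@dataclass
class Hit:
    text: str
    score: float
    source: str  # hypothetical field: where the chunk came from


def build_response(hits: list[Hit]) -> dict:
    """Keep only the best-scoring hit and copy its text into the summary."""
    if not hits:
        # Assumed behavior for an empty result set.
        return {"summary": "", "hit": None}
    top = max(hits, key=lambda h: h.score)
    return {"summary": top.text, "hit": asdict(top)}
```

Because the summary is a verbatim copy of the top hit's text, the answer shown is a retrieved passage and not a generated one.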

Answer

Documents