AgentIQX
Modern Multi-Agent Audio/Video/PDF Transcription & Summarization App

Tech Stack
| Layer | Technology |
|---|---|
| Frontend/UI | Gradio Blocks, Custom CSS (Glassmorphic, Dark/Cyan theme) |
| Transcription | OpenAI Whisper (local GPU/CPU) |
| Summarization / LLM | Ollama (TinyLlama, DeepSeek, Phi3) |
| Retrieval / Search | SentenceTransformers (all-MiniLM-L6-v2), FAISS |
| PDF Handling | PyPDF2, PyMuPDF |
| Text-to-Speech | gTTS (online; uses Google's TTS service), pyttsx3 (offline) |
| Email Automation | SMTP, dotenv credentials |
| Utilities | Python 3.11+, Logging, Chunking, FAISS Index Management |

Key Features
1. Transcribe Anything
- Upload PDFs, audio (mp3/wav/m4a), or video (mp4/mov).
- Audio and video are transcribed with Whisper; PDF text is extracted with both PyPDF2 and PyMuPDF, so a second method is available when one returns poor text.
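
A minimal sketch of how uploads might be routed to an extractor by file extension. The function names (`pick_extractor`, `extract_pdf_text`, `transcribe_media`, `extract_text`), the extension sets, and the "PyPDF2 first, then PyMuPDF" order are all assumptions for illustration, not AgentIQX's actual code. The third-party libraries are imported inside the functions so that only the backend that is needed gets loaded.

```python
from pathlib import Path

# Assumed extension groups; the real app may accept a different set.
AUDIO_VIDEO_EXTS = {".mp3", ".wav", ".m4a", ".mp4", ".mov"}
PDF_EXTS = {".pdf"}


def pick_extractor(path: str) -> str:
    """Return the extraction backend name for a file, based on its extension."""
    ext = Path(path).suffix.lower()
    if ext in AUDIO_VIDEO_EXTS:
        return "whisper"
    if ext in PDF_EXTS:
        return "pdf"
    raise ValueError(f"Unsupported file type: {ext or '(none)'}")


def extract_pdf_text(path: str) -> str:
    """Try PyPDF2 first; if it yields no text, try PyMuPDF (assumed order)."""
    from PyPDF2 import PdfReader  # lazy import

    text = "\n".join((page.extract_text() or "") for page in PdfReader(path).pages)
    if text.strip():
        return text

    import fitz  # PyMuPDF, lazy import

    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def transcribe_media(path: str, model_name: str = "base") -> str:
    """Transcribe audio or video with a locally loaded Whisper model."""
    import whisper  # lazy import; Whisper relies on ffmpeg to decode media

    return whisper.load_model(model_name).transcribe(path)["text"]


def extract_text(path: str) -> str:
    """Dispatch a file to the matching extractor and return its raw text."""
    if pick_extractor(path) == "whisper":
        return transcribe_media(path)
    return extract_pdf_text(path)
```

Rejecting unknown extensions up front keeps the error message close to the upload step instead of failing deep inside a library call.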
2. Flexible Summarization
- Generate concise, detailed, bulleted, numbered, or paragraph-style summaries.
- Focus on custom sections or areas of interest using local LLMs (Ollama).
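
The style and focus options above could be turned into a prompt roughly as follows. This is a sketch under stated assumptions: `build_summary_prompt` and `summarize` are hypothetical names, the prompt wording is invented for illustration, and the URL assumes an Ollama server running on its default local port with the `tinyllama` model already pulled.

```python
import json
import urllib.request

# Assumed default endpoint of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_summary_prompt(text: str, style: str = "bulleted",
                         detail: str = "concise", focus: str = "") -> str:
    """Compose a summarization prompt from the user's style choices."""
    lines = [f"Summarize the text below as a {detail}, {style} summary."]
    if focus.strip():
        lines.append(f"Focus on: {focus.strip()}.")
    lines.extend(["", "TEXT:", text.strip()])
    return "\n".join(lines)


def summarize(text: str, model: str = "tinyllama", **options: str) -> str:
    """Send the prompt to Ollama and return the generated summary text."""
    payload = json.dumps({
        "model": model,
        "prompt": build_summary_prompt(text, **options),
        "stream": False,  # ask for one JSON object instead of a token stream
    }).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Usage, assuming the server is up: `summarize(transcript, style="numbered", detail="detailed", focus="action items")`.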
3. Retrieval-Augmented QA (RAG)
- Ask questions about your documents.
- Relevant chunks retrieved via embeddings & FAISS.
- Answers are generated by the LLM and grounded in the retrieved chunks.
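
To make the retrieve-then-answer flow concrete, here is a small pure-Python sketch. It swaps FAISS for a brute-force cosine-similarity ranking over toy vectors so that it runs without extra dependencies; `cosine`, `top_k`, and `build_rag_prompt` are hypothetical helpers, and the prompt wording is an assumption.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 2) -> list[int]:
    """Return indices of the k chunks most similar to the query, best first."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


def build_rag_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a prompt that asks the LLM to answer from the retrieved chunks only."""
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"CONTEXT:\n{context}\n\nQUESTION: {question}\nANSWER:"
    )
```

In the real pipeline the vectors would come from the embedding model and the ranking from a FAISS index; the prompt-assembly step stays the same either way.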
4. Step-by-Step Explanations
- Automatically generate pedagogical rationales for any answer.
- Ideal for teaching, learning, or auditing LLM decisions.
5. Semantic Embedding & Chunk Management
- Text is split into word-overlapped chunks, embedded, and indexed with FAISS.
- Supports fast similarity search and RAG pipelines.
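
Word-overlapped chunking can be sketched in a few lines. The function name `chunk_words` and the default sizes are assumptions; the app's actual chunk size and overlap are not stated here.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into chunks of `chunk_size` words, each sharing `overlap` words with the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

Each chunk would then be embedded (for example with `SentenceTransformer("all-MiniLM-L6-v2").encode(chunks)`) and added to a FAISS index. The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk.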
6. Text-to-Speech (TTS)
- Convert any summary or transcript to speech with gTTS (needs network access) or pyttsx3 (offline).
- Audio is playable in-browser or downloadable as MP3.
7. Email Automation
- Send summaries or transcripts via SMTP with .env-based secure credentials.
- Live validation and status feedback in UI.
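
A sketch of the email step using only the standard library. `SENDER_EMAIL` and `SENDER_PASSWORD` are the variables from the `.env` file (assumed to be loaded into the environment already, for example via python-dotenv's `load_dotenv()`). The helper names, the validation regex, the subject line, and the SMTP host and port are assumptions; substitute your provider's settings.

```python
import os
import re
import smtplib
from email.message import EmailMessage

# Deliberately loose check: something@something.tld, no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_email(address: str) -> bool:
    """Cheap syntactic check suitable for live feedback in a UI."""
    return bool(EMAIL_RE.match(address.strip()))


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Create the email, refusing obviously malformed recipients."""
    if not is_valid_email(recipient):
        raise ValueError(f"Invalid recipient address: {recipient!r}")
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_summary(recipient: str, body: str,
                 host: str = "smtp.gmail.com", port: int = 587) -> None:
    """Send the summary over SMTP with STARTTLS (host and port are assumed defaults)."""
    sender = os.environ["SENDER_EMAIL"]
    password = os.environ["SENDER_PASSWORD"]
    msg = build_message(sender, recipient, "AgentIQX summary", body)
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(sender, password)
        server.send_message(msg)
```

Keeping message construction separate from sending makes the validation testable without a network connection.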
8. Glassmorphic, Responsive UI
- Smooth, intuitive tab navigation.
- Dark/cyan neon-glass theme with glowing buttons.
- Forward/backward navigation between workflow tabs.
9. Robust Logging & Error Handling
- Detailed logs for each agent.
- Error propagation surfaced to UI for easy debugging.
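
One possible shape for per-agent logging with errors surfaced to the UI is a decorator that returns a `(result, error_message)` pair. The decorator name `agent_step`, the logger naming scheme, and the stub agent are hypothetical.

```python
import functools
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)


def agent_step(name: str):
    """Wrap an agent function so it logs its progress and never raises into the UI."""
    logger = logging.getLogger(f"agent.{name}")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            logger.info("start")
            try:
                result = func(*args, **kwargs)
            except Exception as exc:
                logger.exception("failed")  # full traceback goes to the log
                return None, f"[{name}] {type(exc).__name__}: {exc}"
            logger.info("done")
            return result, ""  # empty string means "no error"
        return wrapper

    return decorator


@agent_step("summarizer")
def summarize_stub(text: str) -> str:
    """Stand-in for a real agent; fails on empty input."""
    if not text.strip():
        raise ValueError("empty input")
    return text[:20]
```

A UI callback can then show the second element of the tuple in a status box whenever it is non-empty.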
Workflow Overview
- Upload & Extract: PDF / Audio / Video → raw text via Whisper and multi-method extraction.
- Chunk & Embed: text is split into chunks, embedded (SentenceTransformers), and indexed (FAISS).
- Summarize: select style, level of detail, and focus areas → the local LLM returns a summary.
- TTS & Email: speak the summary using gTTS → email it via SMTP with secure validation.
- RAG Q&A: ask questions → relevant chunks are retrieved → an answer is generated.
- Explain Agent: generate a step-by-step rationale for any answer or context.
Advanced Features
- Cached embeddings & indices for speed.
- Lazy agent loading for efficient memory usage.
- Full error propagation to UI.
- Drop-in alternative models by editing a single config line.
- Privacy-first design: transcription, embedding, and LLM inference run locally; only gTTS speech synthesis and SMTP email use the network.
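
Lazy agent loading and a single-line model switch could look something like the sketch below. `LazyAgents`, `CONFIG`, and the `llm_model` key are hypothetical names chosen for illustration; the project's real config layout is not shown here.

```python
from typing import Any, Callable

# Hypothetical config: changing this one value swaps the LLM.
CONFIG = {"llm_model": "tinyllama"}


class LazyAgents:
    """Registry that builds each agent on first use and reuses it afterwards."""

    def __init__(self) -> None:
        self._factories: dict[str, Callable[[], Any]] = {}
        self._instances: dict[str, Any] = {}

    def register(self, name: str, factory: Callable[[], Any]) -> None:
        """Record how to build an agent without building it yet."""
        self._factories[name] = factory

    def get(self, name: str) -> Any:
        """Return the agent, constructing it only the first time it is requested."""
        if name not in self._instances:
            self._instances[name] = self._factories[name]()
        return self._instances[name]

    def loaded(self) -> list[str]:
        """Names of agents that have actually been constructed so far."""
        return sorted(self._instances)
```

With this pattern a heavy model such as Whisper is only loaded into memory if the user actually reaches the tab that needs it.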
Getting Started
```bash
# Create a fresh virtual environment and install dependencies
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Set up .env for email credentials (printf writes the \n as a real newline)
printf 'SENDER_EMAIL=your_email@domain.com\nSENDER_PASSWORD=yourpassword\n' > .env

# Launch AgentIQX
python app.py
```