The Core Stack: Audio → Transcription → Contextual Chat + MCP
BPCode.ai has developed an advanced solution that transforms long conversations into intelligent, context-aware chat interfaces capable of accessing enterprise systems in real time.
🧠 Modular AI Infrastructure Stack
We build custom, containerized AI systems for organizations that require secure, domain-adapted, and operational AI.
Our stack is open-source, modular, and production-ready — enabling agents that don’t just answer questions, but also act, integrate, and collaborate.
🧱 Architecture Overview
Our architecture is designed for maximum flexibility and privacy. Each service is independently containerized and communicates over a secure internal network.
| Layer | Component | Description |
|---|---|---|
| Frontend | Streamlit (development), Svelte app (production) | Streamlit enables rapid prototyping. Our production frontend is built in Svelte with login, roles, and secure session handling. |
| Agent Engine | Powerful UI Agent Generator (PUAG) | Visual builder for defining agent workflows, memory, tool calls, and conditional logic. Built on top of LangChain, but abstracted for usability. |
| Language Model (LLM) | Ollama (local) or cloud APIs | Deploy open-source models (e.g. Qwen, Mistral, LLaMA 3) locally via Ollama. Cloud APIs (OpenAI, Claude) are optional, controllable, and secure. |
| STT Engine | Modular speech-to-text interface | Pluggable STT engine supporting all major languages. Real-time or batch transcription with GPU acceleration. |
| Memory & RAG | PostgreSQL + pgvector | Semantic memory for chat history, document search, and retrieval-augmented generation (RAG). Supports multi-tenant indexing. |
| Tooling Layer | MCP (Model Context Protocol) | Unified protocol for integrating custom tools, APIs, business logic, and other agents. Tools are discoverable, declarative, and secure. |
| Infrastructure | Docker | Every service is containerized and managed independently. Only the frontend is exposed. Designed for air-gapped or hybrid deployments. |
🧰 MCP – Model Context Protocol
The Model Context Protocol (MCP) is our universal interface for tools and integrations.
Each MCP "tool" is:
- Self-described (name, description, parameters, response)
- Executable via API or function
- Discoverable by LLM agents at runtime
MCP tools can be:
- Internal code or business logic
- External APIs (e.g. Gmail, Office, Odoo, Slack)
- Embedded automation (data fetchers, generators)
- Other agents
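As an illustration of those three properties, a self-described tool can be sketched in a few lines of Python. This is a toy registry, not the actual MCP wire format, and the `fetch_invoice` tool is a hypothetical example:

```python
# Minimal sketch of a self-described, discoverable tool in the spirit of MCP.
# The descriptor fields (name, description, parameters) mirror the properties
# listed above; the registry and dispatch logic are illustrative only.

TOOL_REGISTRY = {}

def mcp_tool(name, description, parameters):
    """Register a function as a discoverable, self-described tool."""
    def decorator(fn):
        TOOL_REGISTRY[name] = {
            "name": name,
            "description": description,
            "parameters": parameters,   # JSON-Schema-style parameter spec
            "handler": fn,
        }
        return fn
    return decorator

@mcp_tool(
    name="fetch_invoice",             # hypothetical example tool
    description="Fetch an invoice record by its ID.",
    parameters={"invoice_id": {"type": "string", "required": True}},
)
def fetch_invoice(invoice_id):
    # In a real deployment this would call internal business logic or an API.
    return {"invoice_id": invoice_id, "status": "paid"}

def list_tools():
    """What an agent sees when discovering tools at runtime."""
    return [{k: v for k, v in t.items() if k != "handler"}
            for t in TOOL_REGISTRY.values()]

def call_tool(name, **kwargs):
    """Execute a registered tool by name."""
    return TOOL_REGISTRY[name]["handler"](**kwargs)
```

An agent first inspects `list_tools()` to learn what is available, then invokes `call_tool("fetch_invoice", invoice_id="INV-42")` with parameters matching the declared schema.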
🤖 A2A – Agent-to-Agent Collaboration
One of the most powerful features of the stack is Agent-to-Agent (A2A) interaction.
Any MCP-exposed service — including an agent — can be called by another agent.
How it works:
- Each agent can expose an MCP interface
- Other agents can invoke it as a tool
- Responses can be chained, delegated, or validated
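The three steps above can be sketched as follows. Class and method names are illustrative assumptions, not the stack's actual API:

```python
# Illustrative sketch of A2A delegation: a specialist agent exposes an
# MCP-style interface, and a general agent invokes it as a tool.
# Names here are assumptions, not the stack's actual API.

class ComplianceAgent:
    """Specialist agent exposed as a callable MCP-style tool."""
    mcp_name = "compliance_agent"
    mcp_description = "Reviews a contract clause for compliance issues."

    def run(self, clause: str) -> dict:
        # A real agent would consult an LLM; this stub flags one risky phrase.
        flagged = "unlimited liability" in clause.lower()
        return {"clause": clause, "flagged": flagged}

class LegalAssistantAgent:
    """General agent that delegates clause review to the specialist."""
    def __init__(self, tools: dict):
        self.tools = tools  # MCP-discovered tools, including other agents

    def review(self, clause: str) -> dict:
        # Step 2: invoke another agent exactly like any other tool.
        verdict = self.tools["compliance_agent"].run(clause)
        # Step 3: the caller can validate or enrich the response before use.
        verdict["reviewed_by"] = "legal_assistant"
        return verdict

tools = {ComplianceAgent.mcp_name: ComplianceAgent()}
assistant = LegalAssistantAgent(tools)
result = assistant.review("Supplier accepts unlimited liability.")
```

Because the specialist is addressed through the same tool interface as any other MCP service, the caller needs no knowledge of its internals, which is what makes chaining and delegation composable.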
Use Cases:
| Scenario | Example |
|---|---|
| Delegation | A legal assistant agent calls a compliance agent to review a clause. |
| Composition | A patient consultation agent calls a diagnosis agent with extracted symptoms. |
| Escalation | A general-purpose agent detects a finance query and delegates to a specialized finance agent. |
| Validation | One agent cross-checks another agent's output before presenting it to the user. |
This enables modular, intelligent ecosystems where agents collaborate to solve complex workflows — safely and transparently.
🧠 Powerful UI Agent Generator (PUAG)
Your team (or ours) can build, test, and deploy agent logic using a visual, modular interface.
Key features:
- Prompt design, memory configuration, fallback logic
- Tool assignment (MCP) and permission management
- Versioning and test harnesses
- Deployable as containers or callable services
All flows are compatible with A2A scenarios — meaning agents can be composed, extended, or overridden.
🗃️ Vector Memory with Postgres + pgvector
Our semantic memory layer powers:
- Context-aware chat with historical relevance
- Retrieval of internal documents, structured data, or case files
- RAG (Retrieval-Augmented Generation) for dynamic prompting
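For intuition, pgvector ranks rows by vector distance (e.g. its `<=>` cosine-distance operator). That ranking can be sketched in pure Python; the documents and embedding values below are toy assumptions:

```python
import math

# Toy sketch of the cosine-distance ranking pgvector performs with its
# <=> operator. Embeddings and documents here are made-up examples.

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Pretend these rows live in a Postgres table with a pgvector column.
documents = [
    {"text": "GDPR data-retention policy", "embedding": [0.9, 0.1, 0.0]},
    {"text": "Cafeteria menu",             "embedding": [0.0, 0.2, 0.9]},
]

def retrieve(query_embedding, k=1):
    """Rough equivalent of:
    SELECT text FROM docs ORDER BY embedding <=> %s LIMIT k;"""
    ranked = sorted(
        documents,
        key=lambda d: cosine_distance(d["embedding"], query_embedding),
    )
    return [d["text"] for d in ranked[:k]]

# A query embedding close to the compliance topic retrieves the policy doc.
retrieve([0.8, 0.2, 0.1])
```

In RAG, the retrieved texts are then injected into the prompt so the model answers from your own documents rather than its training data alone.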
Your data remains fully private, locally stored, and auditable.
🛡️ Security & Deployment
- Containerized: Every service is dockerized and isolated
- Internal networking: No exposed ports between components
- Private by design: No data leaves your infrastructure unless configured
- GDPR-ready: Built for compliance and traceability
- Air-gapped compatible: Runs in fully offline environments
⚙️ Example Configurations
| Domain | LLM | STT | MCP Tools / A2A | Description |
|---|---|---|---|---|
| Healthcare | Qwen 4B (local) | Catalan, Spanish | `get_radiograph`, `diagnose`, `report_agent` | A patient assistant agent delegates to diagnosis and reporting agents |
| Legal | Mistral 7B | Spanish, French | `fetch_law`, `check_clause`, `summary_agent` | The legal agent calls other agents for summarization and risk scoring |
| Finance | GPT-4 API | English | `odoo_agent`, `kpi_forecaster` | A dashboard agent delegates requests to specialized ERP and forecast agents |
| Education | LLaMA 3 | Arabic, English | `quiz_agent`, `tutor_agent` | A course coordinator agent assigns tasks to content- and student-level agents |
✅ Why This Stack?
- Composable & Modular: Swap or stack agents, tools, models, or UI layers
- Private & Compliant: Full control over infrastructure and data
- Domain-Specific: Each agent and flow is tailored to your workflows
- Production-Ready: From rapid prototype to secure deployment
- Ecosystem-Enabled: Build your own AI-powered network of cooperating agents
💼 Our Services
We provide full-lifecycle delivery:
- Architecture and infrastructure setup
- Model deployment and fine-tuning
- Agent design (flows, memory, delegation)
- MCP integration with your internal tools
- Frontend implementation (Streamlit, Svelte)
- Training, handoff, or managed support
We can deliver the full system or empower your team to run it in-house.