The Core Stack: Audio → Transcription → Contextual Chat + MCP

BPCode.ai has developed an advanced solution that transforms long conversations into intelligent, context-aware chat interfaces capable of accessing enterprise systems in real time.


🧠 Modular AI Infrastructure Stack

We build custom, containerized AI systems for organizations that require secure, domain-adapted, and operational AI.
Our stack is open-source, modular, and production-ready — enabling agents that don’t just answer questions, but also act, integrate, and collaborate.


🧱 Architecture Overview

Our architecture is designed for maximum flexibility and privacy. Each service is independently containerized and communicates over a secure internal network.

| Layer | Component | Description |
| --- | --- | --- |
| Frontend | Streamlit (development); Svelte app (production) | Streamlit enables rapid prototyping. The production frontend is built in Svelte with login, roles, and secure session handling. |
| Agent Engine | Powerful UI Agent Generator (PUAG) | Visual builder for defining agent workflows, memory, tool calls, and conditional logic. Built on top of LangChain, but abstracted for usability. |
| Language Model (LLM) | Ollama (local) or cloud APIs | Deploy open-source models (e.g. Qwen, Mistral, LLaMA 3) locally via Ollama. Cloud APIs (OpenAI, Claude) are optional, controllable, and secure. |
| STT Engine | Modular speech-to-text interface | Pluggable STT engine supporting all major languages. Real-time or batch transcription with GPU acceleration. |
| Memory & RAG | PostgreSQL + pgvector | Semantic memory for chat history, document search, and retrieval-augmented generation (RAG). Supports multi-tenant indexing. |
| Tooling Layer | MCP (Model Context Protocol) | Unified protocol for integrating custom tools, APIs, business logic, and other agents. Tools are discoverable, declarative, and secure. |
| Infrastructure | Docker | Every service is containerized and managed independently. Only the frontend is exposed. Designed for air-gapped or hybrid deployments. |

🧰 MCP – Model Context Protocol

The Model Context Protocol (MCP) is our universal interface for tools and integrations.

Each MCP "tool" is:

  • Self-described (name, description, parameters, response)
  • Executable via API or function
  • Discoverable by LLM agents at runtime

MCP tools can be:

  • Internal code or business logic
  • External APIs (e.g. Gmail, Office, Odoo, Slack)
  • Embedded automation (data fetchers, generators)
  • Other agents
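To make the self-description idea concrete, here is a minimal Python sketch of a tool registry in the spirit of MCP. The names (`Tool`, `ToolRegistry`, `fetch_invoice`) are illustrative only, not the official MCP SDK API:

```python
# Minimal sketch of self-described, discoverable tools (illustrative,
# not the official MCP SDK). Each tool carries its own metadata so an
# agent can discover and invoke it at runtime.
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Tool:
    name: str
    description: str
    parameters: Dict[str, Any]          # JSON-Schema-style parameter spec
    handler: Callable[..., Any]

    def describe(self) -> Dict[str, Any]:
        # What an agent sees when it discovers the tool at runtime.
        return {"name": self.name, "description": self.description,
                "parameters": self.parameters}


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def discover(self) -> List[Dict[str, Any]]:
        return [t.describe() for t in self._tools.values()]

    def invoke(self, name: str, **kwargs: Any) -> Any:
        return self._tools[name].handler(**kwargs)


registry = ToolRegistry()
registry.register(Tool(
    name="fetch_invoice",
    description="Fetch an invoice by id from the ERP",
    parameters={"invoice_id": {"type": "string"}},
    handler=lambda invoice_id: {"invoice_id": invoice_id, "status": "paid"},
))

print(registry.invoke("fetch_invoice", invoice_id="INV-42"))
# → {'invoice_id': 'INV-42', 'status': 'paid'}
```

Because every tool publishes its own name, description, and parameter schema, the LLM never needs hard-coded knowledge of the integration: it reads `discover()` output and chooses a tool at runtime.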

🤖 A2A – Agent-to-Agent Collaboration

One of the most powerful features of the stack is Agent-to-Agent (A2A) interaction.

Any MCP-exposed service — including an agent — can be called by another agent.

How it works:

  • Each agent can expose an MCP interface
  • Other agents can invoke it as a tool
  • Responses can be chained, delegated, or validated
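The three steps above can be sketched in a few lines of Python. All class and function names here are illustrative (there is no fixed A2A API); the point is that an agent's entry point is just another callable tool:

```python
# Sketch of A2A delegation: one agent is exposed as a tool and invoked
# by another. Names are illustrative, not a fixed API.
from typing import Callable, Dict


class Agent:
    def __init__(self, name: str, respond: Callable[[str], str]) -> None:
        self.name = name
        self._respond = respond
        self.tools: Dict[str, "Agent"] = {}

    def expose_as_tool(self) -> Callable[[str], str]:
        # The MCP-facing entry point: other agents call this like any tool.
        return self._respond

    def add_agent_tool(self, agent: "Agent") -> None:
        self.tools[agent.name] = agent

    def handle(self, query: str) -> str:
        # Naive routing for the sketch: delegate finance queries
        # to the specialist agent if one is registered.
        if "finance" in query.lower() and "finance_agent" in self.tools:
            return self.tools["finance_agent"].expose_as_tool()(query)
        return self._respond(query)


finance = Agent("finance_agent", lambda q: f"[finance] forecast for: {q}")
general = Agent("general_agent", lambda q: f"[general] answer to: {q}")
general.add_agent_tool(finance)

print(general.handle("What is our finance outlook for Q3?"))
# → [finance] forecast for: What is our finance outlook for Q3?
```

In production the routing decision is made by the LLM from the tools' self-descriptions rather than a keyword check, but the call chain (agent → tool interface → agent) is the same.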

Use Cases:

| Scenario | Example |
| --- | --- |
| Delegation | A legal assistant agent calls a compliance agent to review a clause. |
| Composition | A patient consultation agent calls a diagnosis agent with extracted symptoms. |
| Escalation | A general-purpose agent detects a finance query and delegates to a specialized finance agent. |
| Validation | One agent cross-checks another agent's output before presenting it to the user. |

This enables modular, intelligent ecosystems where agents collaborate to solve complex workflows — safely and transparently.


🧠 Powerful UI Agent Generator (PUAG)

Your team (or ours) can build, test, and deploy agent logic using a visual, modular interface.

Key features:

  • Prompt design, memory configuration, fallback logic
  • Tool assignment (MCP) and permission management
  • Versioning and test harnesses
  • Deployable as containers or callable services

All flows are compatible with A2A scenarios — meaning agents can be composed, extended, or overridden.


🗃️ Vector Memory with Postgres + pgvector

Our semantic memory layer powers:

  • Context-aware chat with historical relevance
  • Retrieval of internal documents, structured data, or case files
  • RAG (Retrieval-Augmented Generation) for dynamic prompting

Your data remains fully private, locally stored, and auditable.
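The retrieval step behind RAG is nearest-neighbor search over embeddings. Here is a self-contained Python sketch using toy 3-dimensional vectors (a real deployment uses model-generated embeddings and runs the search inside Postgres via pgvector; table and column names in the SQL are illustrative):

```python
# Semantic retrieval as pgvector performs it, sketched with toy
# 3-dimensional embeddings in plain Python (no database required here).
import math


def cosine_distance(a, b):
    # pgvector's <=> operator computes the same quantity in SQL.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)


documents = {
    "invoice policy": [0.9, 0.1, 0.0],
    "patient intake form": [0.1, 0.8, 0.2],
    "vacation guidelines": [0.2, 0.1, 0.9],
}

# Toy embedding of the query "how do we handle invoices?"
query = [0.85, 0.15, 0.05]
best = min(documents, key=lambda d: cosine_distance(documents[d], query))
print(best)  # → invoice policy

# The equivalent pgvector query (illustrative schema):
SQL = """
SELECT content
FROM documents
ORDER BY embedding <=> %(query_embedding)s
LIMIT 3;
"""
```

The retrieved passages are then injected into the prompt, which is what makes the chat context-aware without retraining the model.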


🛡️ Security & Deployment

  • Containerized: Every service is dockerized and isolated
  • Internal networking: No exposed ports between components
  • Private by design: No data leaves your infrastructure unless configured
  • GDPR-ready: Built for compliance and traceability
  • Air-gapped compatible: Runs in fully offline environments
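The isolation model above can be sketched as a compose file. Service and image names are illustrative; the structural point is that only the frontend publishes a port, while everything else communicates over an internal network:

```yaml
# Illustrative docker-compose sketch: only the frontend publishes a port.
services:
  frontend:
    image: bpcode/svelte-app        # hypothetical image name
    ports:
      - "443:443"                   # the only exposed port
    networks: [internal]
  agent-engine:
    image: bpcode/puag              # hypothetical image name
    networks: [internal]            # no ports section: unreachable from outside
  llm:
    image: ollama/ollama
    networks: [internal]
  db:
    image: pgvector/pgvector:pg16
    networks: [internal]

networks:
  internal:
    internal: true                  # no outbound access either (air-gap friendly)
```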

⚙️ Example Configurations

| Domain | LLM | STT | MCP Tools / A2A | Description |
| --- | --- | --- | --- | --- |
| Healthcare | Qwen 4B (local) | Catalan, Spanish | get_radiograph, diagnose, report_agent | A patient assistant agent delegates to diagnosis and reporting agents |
| Legal | Mistral 7B | Spanish, French | fetch_law, check_clause, summary_agent | The legal agent calls other agents for summarization and risk scoring |
| Finance | GPT-4 API | English | odoo_agent, kpi_forecaster | A dashboard agent delegates requests to specialized ERP and forecast agents |
| Education | LLaMA 3 | Arabic, English | quiz_agent, tutor_agent | A course coordinator agent assigns tasks to content- and student-level agents |

✅ Why This Stack?

  • Composable & Modular: Swap or stack agents, tools, models, or UI layers
  • Private & Compliant: Full control over infrastructure and data
  • Domain-Specific: Each agent and flow is tailored to your workflows
  • Production-Ready: From rapid prototype to secure deployment
  • Ecosystem-Enabled: Build your own AI-powered network of cooperating agents

💼 Our Services

We provide full-lifecycle delivery:

  • Architecture and infrastructure setup
  • Model deployment and fine-tuning
  • Agent design (flows, memory, delegation)
  • MCP integration with your internal tools
  • Frontend implementation (Streamlit, Svelte)
  • Training, handoff, or managed support

We can deliver the full system or empower your team to run it in-house.

Ready to Start AI Implementation?

We have the technical expertise. Whether you're exploring AI possibilities or have a specific project in mind, we’ll guide you through the next steps.

If you prefer, you can email us at:

info@bpcode.ai