Architecture Overview

System Design

┌─────────────────────────────────────────────────────────────────────────────────┐
│                           VIBE AI ARCHITECTURE                                  │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                 │
│  FRONTEND (Next.js 15 + TypeScript + React Query)                              │
│  ┌───────────────────────────────────────────────────────────────────────────┐ │
│  │  Dashboard  │  Agent Builder  │  Chat UI  │  Wallet  │  Marketplace       │ │
│  │  Autopilot  │  Memory Mgmt    │  Settings │  Bazaar  │  Creator Dashboard │ │
│  └─────────────────────────────────────┬─────────────────────────────────────┘ │
│                                        │ REST API + SSE Streaming                │
│                                        ▼                                        │
│  BACKEND (FastAPI + Python 3.11)                                               │
│  ┌───────────────────────────────────────────────────────────────────────────┐ │
│  │                                                                            │ │
│  │  ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌────────────┐          │ │
│  │  │  API       │  │  Services  │  │  Agent     │  │  Workers   │          │ │
│  │  │  Routes    │  │            │  │  Core      │  │  (Dramatiq)│          │ │
│  │            │  │  • Wallet  │  │            │  │            │          │ │
│  │  │  /agents   │  │  • Billing │  │  • Runner  │  │  • Async   │          │ │
│  │  │  /threads  │  │  • x402    │  │  • Tools   │  │  • Jobs    │          │ │
│  │  │  /wallet   │  │  • KB      │  │  • Memory  │  │  • Triggers│          │ │
│  │  │  /triggers │  │  • Goals   │  │  • Thread  │  │  • Goals   │          │ │
│  │  │  /goals    │  │  • Memory  │  │  • Version │  │            │          │ │
│  │  │  /market   │  │  • Trade   │  │            │  │            │          │ │
│  │  └────────────┘  └────────────┘  └────────────┘  └────────────┘          │ │
│  │                                                                            │ │
│  └───────────────────────────────────────────────────────────────────────────┘ │
│                                        │                                        │
│  ┌─────────────┬───────────────────────┼───────────────────┬─────────────┐     │
│  ▼             ▼                       ▼                   ▼             ▼     │
│  Supabase      Redis                Daytona               LLM            MCP   │
│  (Postgres)    (Cache/Pub)          (Sandbox)           Providers      Servers │
│  + Auth        + State              (Docker)            (LiteLLM)      (1000+)│
│  + Realtime    Management                               │                     │
│                                                          │                     │
│                                              ┌───────────┴───────────┐         │
│                                              │   HYBRID INFERENCE    │         │
│                                              │  ┌─────┐   ┌───────┐  │         │
│                                              │  │Centr│   │Decentr│  │         │
│                                              │  │alized│  │alized │  │         │
│                                              │  │OpenAI│  │Corten │  │         │
│                                              │  │Anthro│  │ sor   │  │         │
│                                              │  │Google│  │Ollama │  │         │
│                                              │  └─────┘   └───────┘  │         │
│                                              └───────────────────────┘         │
│                                                                                 │
│  DATA PROVIDERS (13 Providers)                                                 │
│  ┌───────────────────────────────────────────────────────────────────────────┐ │
│  │ Nansen │ DataAPI │ FourMeme │ PumpFun │ Telegram │ Twitter │ Yahoo        │ │
│  │ TopTraders │ Blokiments │ EVA AI │ LinkedIn │ Amazon │ Zillow │ ActiveJobs │ │
│  └───────────────────────────────────────────────────────────────────────────┘ │
│                                                                                 │
└─────────────────────────────────────────────────────────────────────────────────┘

Core Components

1. API Layer

| Module | Purpose | Key Endpoints |
|--------|---------|---------------|
| api.py | Main FastAPI app | Entry point, health checks |
| agent/api.py | Agent management | /agents/* |
| services/coinbase_payments_api.py | Wallet operations | /wallet/* |
| services/x402_api.py | x402 payments | /x402/* |
| services/marketplace_api.py | Marketplace | /marketplace/* |
| services/goal_api.py | Autonomous goals | /goals/* |
| triggers/api.py | Trigger management | /triggers/* |
| services/agent_memory_api.py | Memory operations | /memory/* |
| services/trade_log_api.py | Trade logging | /trade-log/* |
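Each module in the table above owns a URL prefix. As a hedged illustration of the registration pattern (a toy stdlib-only dispatcher, not FastAPI; all names here are made up for the sketch):

```python
# Hypothetical prefix → handler registry mirroring the module table above
from typing import Callable, Dict

ROUTES: Dict[str, Callable[[str], str]] = {}

def mount(prefix: str):
    """Register a handler under a URL prefix, as each api.py module does."""
    def deco(fn: Callable[[str], str]):
        ROUTES[prefix] = fn
        return fn
    return deco

@mount("/agents")
def agents_handler(path: str) -> str:
    return f"agents: {path}"

@mount("/wallet")
def wallet_handler(path: str) -> str:
    return f"wallet: {path}"

def dispatch(path: str) -> str:
    """Route a request path to the first matching prefix, else 404."""
    for prefix, fn in ROUTES.items():
        if path.startswith(prefix):
            return fn(path)
    return "404"
```

In the real backend, FastAPI's router mechanism does this matching; the sketch only shows why each module can stay self-contained behind its prefix.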

2. Agent Core

# backend/agent/run.py

from typing import AsyncIterator, List
# (project-internal imports for Tool, get_model, ToolManager, ThreadManager omitted)

class AgentRunner:
    """Core agent execution engine"""
    
    def __init__(
        self,
        agent_id: str,
        thread_id: str,
        model_name: str,
        tools: List[Tool]
    ):
        self.agent_id = agent_id
        self.thread_id = thread_id
        self.model = get_model(model_name)
        self.tool_manager = ToolManager(tools)
        self.thread_manager = ThreadManager(thread_id)
    
    async def run(self, input_message: str) -> AsyncIterator[str]:
        """Execute agent with streaming response"""
        
        # 1. Build context
        context = await self.build_context(input_message)
        
        # 2. LLM planning loop
        done = False
        while not done:
            # Get LLM response
            response = await self.model.generate(context)
            
            # Check for tool calls
            if response.tool_calls:
                for tool_call in response.tool_calls:
                    # Execute tool in sandbox
                    result = await self.execute_tool(tool_call)
                    context.append(result)
            else:
                # Final response
                yield response.content
                done = True

3. Tool System

backend/agent/tools/
├── __init__.py
├── browser_tool.py         # Browser automation
├── coinbase_payments_tool.py  # Wallet operations
├── data_providers_tool.py  # External APIs
├── sb_browser_tool.py      # Sandbox browser
├── sb_files_tool.py        # File operations
├── sb_shell_tool.py        # Shell commands
├── web_search_tool.py      # Web search
├── x402_tool.py            # x402 payments
└── utils/
    ├── custom_mcp_handler.py
    ├── dynamic_tool_builder.py
    └── mcp_tool_executor.py
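These modules all plug into the same dispatch shape. A minimal sketch of the pattern (simplified stand-ins; the repo's actual ToolManager and tool classes are more involved):

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict

@dataclass
class ToolCall:
    name: str
    arguments: dict

class ToolManager:
    """Registers async tools by name and routes ToolCalls to them."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Awaitable[str]]] = {}

    def register(self, name: str, fn: Callable[..., Awaitable[str]]) -> None:
        self._tools[name] = fn

    async def execute(self, call: ToolCall) -> str:
        # Unknown tool names return an error string rather than raising,
        # so the LLM loop can see the failure and recover.
        if call.name not in self._tools:
            return f"error: unknown tool '{call.name}'"
        return await self._tools[call.name](**call.arguments)

# Example: register a trivial tool and dispatch a call to it
async def echo(text: str) -> str:
    return text

manager = ToolManager()
manager.register("echo", echo)
result = asyncio.run(manager.execute(ToolCall("echo", {"text": "hi"})))
```

The same shape extends to MCP tools: `dynamic_tool_builder.py` can register generated wrappers under arbitrary names without the runner loop changing.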

4. Services

| Service | Purpose | File |
|---------|---------|------|
| Wallet | Coinbase Payments MCP | services/coinbase_payments_api.py |
| x402 | x402 payment protocol | services/x402_api.py |
| NOWPayments | Cryptocurrency payments | services/nowpayments_billing.py |
| Marketplace | Agent marketplace | services/marketplace_api.py |
| Knowledge Base | RAG storage | knowledge_base/api.py |
| Memory | Mem0-style memory | services/agent_memory_api.py |
| Goals | Autonomous goals | services/goal_api.py |
| Trade Log | Trading operations | services/trade_log_api.py |
| Dashboard | Mission control | services/dashboard_api.py |

Data Flow

Message Processing

1. User sends message
        │
        ▼
2. API receives request → Validates auth
        │
        ▼
3. ThreadManager loads context
        │
        ▼
4. AgentRunner builds full context:
   • System prompt
   • Conversation history
   • User memory (Mem0)
   • Knowledge base (RAG)
   • Available tools
        │
        ▼
5. LLM processes context → Returns response/tool calls
        │
        ├── If tool call:
        │       │
        │       ▼
        │   6a. ToolManager routes to appropriate tool
        │       │
        │       ▼
        │   6b. Tool executes in Daytona sandbox
        │       │
        │       ▼
        │   6c. Result added to context → Back to step 5
        │
        └── If final response:
                │
                ▼
            7. Stream response to client
                │
                ▼
            8. Save to database
                │
                ▼
            9. Extract memories (Mem0)
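The context-assembly step (4) is plain message construction. A sketch, with the Mem0 and RAG lookups replaced by already-fetched lists (function and field names here are illustrative):

```python
from typing import Dict, List

def build_context(
    system_prompt: str,
    history: List[Dict[str, str]],
    memories: List[str],
    kb_chunks: List[str],
    user_message: str,
) -> List[Dict[str, str]]:
    """Assemble the message list sent to the LLM (step 4 above)."""
    # Memories and retrieved KB chunks are folded into the system prompt
    # so the model treats them as trusted context, not user input.
    preamble = system_prompt
    if memories:
        preamble += "\n\nKnown about this user:\n" + "\n".join(f"- {m}" for m in memories)
    if kb_chunks:
        preamble += "\n\nRelevant documents:\n" + "\n".join(kb_chunks)
    return (
        [{"role": "system", "content": preamble}]
        + history
        + [{"role": "user", "content": user_message}]
    )

ctx = build_context(
    "You are a research analyst.",
    history=[{"role": "user", "content": "hi"}, {"role": "assistant", "content": "hello"}],
    memories=["prefers concise answers"],
    kb_chunks=["Q3 revenue grew 12%."],
    user_message="Summarize Q3.",
)
```

Tool results from step 6c are then appended to this same list before looping back to step 5.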

Database Schema

Core Tables

-- Agents
CREATE TABLE agents (
    agent_id UUID PRIMARY KEY,
    account_id UUID REFERENCES accounts(account_id),
    name VARCHAR(255),
    system_prompt TEXT,
    model_name VARCHAR(100),
    tools JSONB,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Threads
CREATE TABLE threads (
    thread_id UUID PRIMARY KEY,
    agent_id UUID REFERENCES agents(agent_id),
    account_id UUID REFERENCES accounts(account_id),
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Messages
CREATE TABLE messages (
    message_id UUID PRIMARY KEY,
    thread_id UUID REFERENCES threads(thread_id),
    role VARCHAR(20),  -- 'user', 'assistant', 'tool'
    content TEXT,
    tool_calls JSONB,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

Memory Tables

-- User memory (Mem0-style)
CREATE TABLE user_memory (
    id UUID PRIMARY KEY,
    account_id UUID REFERENCES accounts(account_id),
    content TEXT,
    embedding vector(1536),
    metadata JSONB,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Knowledge base chunks
CREATE TABLE knowledge_base_chunks (
    id UUID PRIMARY KEY,
    kb_id UUID REFERENCES knowledge_base(kb_id),
    content TEXT,
    embedding vector(1536),
    metadata JSONB
);
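Retrieval over these `embedding` columns is a nearest-neighbour search. In production that would be a pgvector query, but the ranking itself reduces to cosine similarity, sketched here in pure Python with toy 3-dimensional vectors (real embeddings are 1536-dimensional):

```python
import math
from typing import List, Tuple

def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query: List[float], rows: List[Tuple[str, List[float]]], k: int = 2):
    """Rank stored memory rows by similarity to the query embedding."""
    return sorted(rows, key=lambda r: cosine(query, r[1]), reverse=True)[:k]

rows = [
    ("likes coffee", [1.0, 0.0, 0.0]),
    ("prefers dark mode", [0.0, 1.0, 0.0]),
    ("drinks espresso daily", [0.9, 0.1, 0.0]),
]
# A coffee-flavoured query embedding surfaces the two coffee memories
best = top_k([1.0, 0.05, 0.0], rows, k=2)
```

In SQL this is what pgvector's distance operators compute server-side, with an index instead of a full sort.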

Configuration

Environment Variables

# Database & Auth
SUPABASE_URL=https://xxx.supabase.co
SUPABASE_SERVICE_ROLE_KEY=xxx
SUPABASE_ANON_KEY=xxx

# Redis
REDIS_URL=redis://localhost:6379
# or
REDIS_HOST=localhost
REDIS_PORT=6379

# LLM Providers (Centralized)
OPENAI_API_KEY=xxx
ANTHROPIC_API_KEY=xxx
GOOGLE_API_KEY=xxx
OPENROUTER_API_KEY=xxx

# LLM Providers (Decentralized)
CORTENSOR_API_KEY=xxx
CORTENSOR_NETWORK=testnet-1  # or mainnet
CORTENSOR_DEFAULT_MODEL=llama-3.1-70b

# Tools
TAVILY_API_KEY=xxx
FIRECRAWL_API_KEY=xxx

# Sandbox
DAYTONA_API_KEY=xxx
DAYTONA_SERVER_URL=xxx

# Payments - Coinbase Payments MCP
COINBASE_PAYMENTS_NETWORK=base
COINBASE_PAYMENTS_API_KEY=xxx

# Payments - NOWPayments
NOWPAYMENTS_API_KEY=xxx
NOWPAYMENTS_IPN_SECRET=xxx

# Payments - x402
X402_FACILITATOR_URL=https://x402.org/facilitator
X402_NETWORK=base-sepolia  # or base

# Data Providers (13 providers)
NANSEN_API_KEY=xxx
DATA_API_KEY=xxx
TELEGRAM_API_KEY=xxx
TWITTER_API_KEY=xxx
# ... and more

# Monitoring
SENTRY_DSN=xxx
LANGFUSE_SECRET_KEY=xxx
LANGFUSE_PUBLIC_KEY=xxx
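The Redis settings above accept either a full URL or separate host/port. A hedged sketch of how such a fallback might be resolved (the helper name is illustrative, not the repo's actual config code):

```python
import os

def resolve_redis_url(env: dict = None) -> str:
    """Prefer REDIS_URL; otherwise build one from REDIS_HOST/REDIS_PORT."""
    env = os.environ if env is None else env
    url = env.get("REDIS_URL")
    if url:
        return url
    host = env.get("REDIS_HOST", "localhost")
    port = env.get("REDIS_PORT", "6379")
    return f"redis://{host}:{port}"

# With only host/port set, a URL is synthesized
url = resolve_redis_url({"REDIS_HOST": "cache.internal", "REDIS_PORT": "6380"})
# → redis://cache.internal:6380
```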

Agent Configuration

# Agent config structure
agent:
  name: "Research Agent"
  system_prompt: |
    You are a research analyst...
  
  model:
    provider: "openai"
    name: "gpt-4o"
    temperature: 0.7
  
  tools:
    - web_search
    - browser
    - file_operations
    - code_execution
  
  knowledge_base:
    enabled: true
    sources:
      - "./docs/research.pdf"
  
  memory:
    enabled: true
    auto_extract: true
  
  triggers:
    - type: cron
      schedule: "0 9 * * *"
      action: "Generate daily report"

Scaling

Horizontal Scaling

                    Load Balancer
                         │
         ┌───────────────┼───────────────┐
         ▼               ▼               ▼
    ┌─────────┐    ┌─────────┐    ┌─────────┐
    │  API 1  │    │  API 2  │    │  API 3  │
    └─────────┘    └─────────┘    └─────────┘
         │               │               │
         └───────────────┼───────────────┘
                         ▼
                    ┌─────────┐
                    │  Redis  │
                    └─────────┘
                         │
    ┌───────────────┬────┴────┬───────────────┐
    ▼               ▼         ▼               ▼
┌────────┐    ┌────────┐ ┌────────┐    ┌────────┐
│Worker 1│    │Worker 2│ │Worker 3│    │Worker 4│
└────────┘    └────────┘ └────────┘    └────────┘

Performance Optimizations

| Optimization | Implementation |
|--------------|----------------|
| Caching | Redis for frequent queries |
| Pooling | Database connection pooling |
| Async | Fully async FastAPI |
| Batching | Batch LLM calls |
| CDN | Static asset caching |
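The caching row typically means a read-through cache with a TTL. An in-process sketch of that pattern (Redis plays this role across API instances in production; the clock is injected here only to keep the demo deterministic):

```python
import time
from typing import Any, Callable, Dict, Tuple

class TTLCache:
    """Read-through cache: serve fresh hits, recompute on miss or expiry."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        now = self.clock()
        hit = self._store.get(key)
        if hit and now - hit[0] < self.ttl:
            return hit[1]          # fresh → skip the expensive call
        value = compute()
        self._store[key] = (now, value)
        return value

# Deterministic demo with a fake clock
t = [0.0]
cache = TTLCache(ttl=60, clock=lambda: t[0])
calls = []
fetch = lambda: calls.append(1) or "result"
cache.get_or_compute("agents:list", fetch)   # miss → computes
cache.get_or_compute("agents:list", fetch)   # hit → cached
t[0] = 61.0
cache.get_or_compute("agents:list", fetch)   # expired → recomputes
```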

Hybrid Inference Architecture

VIBE AI lets users choose between centralized and decentralized LLM providers, so each workload can be tuned for cost, censorship resistance, or output quality according to user preferences.

Provider Types

| Type | Providers | Pros | Cons |
|------|-----------|------|------|
| Centralized | OpenAI, Anthropic, Google | Highest quality, fastest | Expensive, rate limits, centralized control |
| Decentralized | Cortensor Network | 50-80% cheaper, censorship-resistant, PoI validation | Slightly slower, newer |
| Local | Ollama | Free, private, offline | Requires hardware, slower |

Routing Logic

# backend/agent/inference/hybrid_router.py

class HybridInferenceRouter:
    """Routes inference to optimal provider based on user preferences"""
    
    async def route(self, task_type: str, user_prefs: dict) -> str:
        
        if user_prefs["strategy"] == "cost_optimized":
            # Prefer Cortensor for cost savings
            return "cortensor"
            
        elif user_prefs["strategy"] == "quality_optimized":
            # Prefer centralized for best quality
            return "anthropic" if task_type == "analysis" else "openai"
            
        elif user_prefs["strategy"] == "decentralized":
            # Always use decentralized providers; fall back to local Ollama
            # (cortensor_available: health check on the Cortensor gateway)
            return "cortensor" if self.cortensor_available() else "ollama"
            
        else:  # balanced
            # Smart routing based on task
            if task_type in ["simple_query", "bulk_processing"]:
                return "cortensor"  # Cost-effective
            elif task_type in ["complex_reasoning", "coding"]:
                return "anthropic"  # Quality-critical
            else:
                return "cortensor"  # Default to cost savings

Integration with Validation

For critical tasks, the hybrid router can combine providers with PoI/PoUW validation:

# Multi-provider validation for critical decisions
import asyncio

async def validated_inference(prompt: str, validation_level: str):
    
    if validation_level == "high":
        # Get responses from multiple providers
        responses = await asyncio.gather(
            anthropic.generate(prompt),      # Centralized
            cortensor.infer(prompt),          # Decentralized node 1
            cortensor.infer(prompt),          # Decentralized node 2
        )
        
        # PoI consensus validation
        result = poi_validator.validate(responses)
        
        if result.consensus_score >= 0.7:
            return result.selected_response
        else:
            raise LowConsensusError("Responses diverged")
    
    else:
        # Standard single-provider inference
        return await hybrid_router.generate(prompt)
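The PoI consensus step above reduces to scoring agreement among the gathered responses. A toy majority-vote sketch (the real validator presumably compares responses semantically, e.g. via embeddings, rather than by exact string match):

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ConsensusResult:
    selected_response: Optional[str]
    consensus_score: float

def validate(responses: List[str], threshold: float = 0.7) -> ConsensusResult:
    """Pick the most common response; score = its share of all responses."""
    top, count = Counter(responses).most_common(1)[0]
    score = count / len(responses)
    # Below the threshold, no single answer dominates → caller should raise
    return ConsensusResult(top if score >= threshold else None, score)

r = validate(["42", "42", "41"])   # score 2/3 ≈ 0.67, under the 0.7 threshold
```

With three providers as in `validated_inference`, a 2-of-3 agreement scores 0.67 and would be rejected at the 0.7 threshold, so in practice exact-match voting needs either more nodes or fuzzy comparison.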

Next: Tool System →