Forward Deployment Engineering Decisions and Tradeoffs
A granular decision framework for building AI products.
1. Web Application Decisions
1.1 Rendering Strategy
- CSR (Client-Side Rendering): Good for dashboards, internal tools. SEO is poor.
- SSR (Server-Side Rendering): Essential for dynamic content needing SEO (e.g., news feed). Higher server load.
- SSG (Static Site Generation): Best for blogs, docs. Fastest performance.
- ISR (Incremental Static Regeneration): Pre-rendered pages rebuilt on a timer or on demand. Suits content that changes occasionally.
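Conceptually, ISR is a stale-while-revalidate cache at the page level: serve the last rendered copy instantly, and regenerate once a revalidation window has passed. A minimal Python sketch of those semantics (the `ISRCache` name and `render_fn` hook are illustrative, not any framework's API):

```python
import time

class ISRCache:
    """Toy illustration of ISR semantics: serve the cached (possibly stale)
    page immediately, regenerating it after the revalidate window expires."""

    def __init__(self, render_fn, revalidate_seconds):
        self.render_fn = render_fn      # rebuilds a page's HTML from its path
        self.revalidate = revalidate_seconds
        self.store = {}                 # path -> (html, rendered_at)

    def get(self, path):
        now = time.monotonic()
        entry = self.store.get(path)
        if entry is None:
            # First request for this path: render synchronously (a cache miss).
            html = self.render_fn(path)
            self.store[path] = (html, now)
            return html
        html, rendered_at = entry
        if now - rendered_at > self.revalidate:
            # Stale: a real framework regenerates in the background and keeps
            # serving the stale copy; here we regenerate inline for simplicity.
            html = self.render_fn(path)
            self.store[path] = (html, now)
        return html

cache = ISRCache(render_fn=lambda p: f"<h1>{p}</h1>", revalidate_seconds=60)
page = cache.get("/blog/hello")  # → "<h1>/blog/hello</h1>"
```

The key tradeoff this captures: readers never wait on a rebuild, at the cost of briefly serving stale content.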
1.2 State Management
- Local State (useState): Simple UI interactions.
- Server State (React Query/SWR): Caching API responses. The standard for modern apps.
- Global State (Zustand/Redux): Complex client-side logic (e.g., audio player, canvas editor). Avoid if possible.
2. Backend & API Decisions
2.1 Language Choice
- Python (FastAPI/Flask): The default for AI. Great ecosystem (PyTorch, LangChain). Slower execution than Go/Rust.
- Node.js/TypeScript: Great for I/O heavy apps, real-time chat. Shares types with frontend.
- Go: High performance microservices. Less AI library support.
2.2 Communication Protocol
- REST: Standard, easy to cache, universally supported.
- GraphQL: Flexible data fetching, prevents over-fetching. Complexity overhead.
- gRPC: High performance internal microservices communication. Harder to debug.
- WebSockets: Essential for real-time AI streaming and chat.
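For one-way token streaming, Server-Sent Events (SSE) over plain HTTP is often enough; WebSockets earn their complexity when you need bidirectional messages (e.g. user interrupts mid-generation). A framework-agnostic sketch of SSE framing around a token iterator (`stream_tokens` is a hypothetical helper; the `data: ...\n\n` frame format is the SSE wire format):

```python
from typing import Iterator

def stream_tokens(model_output: Iterator[str]) -> Iterator[str]:
    """Wrap model tokens as Server-Sent-Event frames for an HTTP response."""
    for token in model_output:
        # Each SSE frame is "data: <payload>" terminated by a blank line.
        yield f"data: {token}\n\n"
    # A sentinel frame so the client knows generation is complete.
    yield "data: [DONE]\n\n"

frames = list(stream_tokens(iter(["Hel", "lo", "!"])))
# frames[0] == "data: Hel\n\n"; frames[-1] == "data: [DONE]\n\n"
```

In FastAPI or similar, a generator like this would back a streaming response so the first token reaches the user in milliseconds rather than after full generation.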
3. Database Decisions
3.1 Primary Database
- PostgreSQL: The default choice. Relational, robust, supports vector search (pgvector).
- MongoDB: Flexible schema for unstructured data. Good for rapid prototyping.
3.2 Vector Database
- pgvector (Postgres): Good enough for 90% of use cases. Simplifies stack.
- Pinecone/Weaviate: Managed, specialized for scale (100M+ vectors).
- Chroma/LanceDB: Good for local/embedded use cases.
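Whichever backend you choose, the core operation is the same: nearest-neighbor search over embedding vectors. A brute-force toy version makes the mechanics concrete (real engines use approximate indexes like HNSW or IVF instead of scanning everything):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query, index, k=2):
    """index: list of (doc_id, embedding). Returns the k most similar doc ids."""
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

# Tiny 2-D "embeddings" for illustration; real ones have hundreds of dimensions.
index = [("a", [1.0, 0.0]), ("b", [0.7, 0.7]), ("c", [0.0, 1.0])]
top_k([1.0, 0.1], index, k=2)  # → ["a", "b"]
```

pgvector exposes exactly this operation as a SQL operator, which is why it "simplifies the stack": retrieval becomes one more query against the database you already run.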
4. ML & LLM Decisions
4.1 Model Hosting
- SaaS APIs (OpenAI/Anthropic): Fastest time to market, strongest model quality. High cost at scale; data leaves your infrastructure.
- Open Source (Llama 3/Mistral): two deployment paths.
  - Self-hosted (vLLM/TGI): Full control, privacy. High ops burden (GPU management).
  - Managed OSS (Together/Anyscale): Good balance of cost and performance.
4.2 RAG Strategy
- Naive RAG: Retrieve top-k chunks. Simple, fast.
- Hybrid Search: Keyword + Vector. Better for specific terms.
- Agentic RAG: Agent decides when/what to search. Slower, higher quality.
- Graph RAG: Knowledge graph for reasoning. Complex to build.
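Hybrid search needs a way to merge the keyword ranking and the vector ranking into one list. Reciprocal Rank Fusion (RRF) is a common choice because it works on ranks alone, so the two retrievers' incomparable scores never need normalizing. A sketch (the doc ids are made up; `k=60` is the constant from the original RRF paper):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc ids into one fused ranking.

    rankings: list of lists, each ordered best-first (e.g. one from BM25,
    one from vector search). Documents that appear high in multiple lists
    accumulate the largest fused scores.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc3", "doc1", "doc7"]   # e.g. from BM25 / full-text search
vector_hits  = ["doc1", "doc5", "doc3"]   # e.g. from pgvector
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
# doc1 and doc3 rise to the top because both retrievers agree on them
```

This is why hybrid search helps with specific terms: an exact product code that embeddings blur together still ranks first in the keyword list, and fusion carries it into the final results.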
5. Agent Decisions
5.1 Architecture
- Single Agent: Simple tool use. Good for linear tasks.
- Multi-Agent (Orchestrator): Router agent delegates to specialists. Better for complex workflows.
- Hierarchical: Manager -> Worker structure; managers decompose tasks and delegate, workers execute. Adds coordination overhead and latency.
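The orchestrator pattern reduces to a routing decision plus dispatch. In production the router is usually an LLM call that classifies the task; in this sketch a keyword check stands in for that decision, and the specialist functions (`search_agent`, `code_agent`) are placeholders for real agents:

```python
def search_agent(task: str) -> str:
    # Placeholder specialist: would call retrieval tools, then an LLM.
    return f"search: {task}"

def code_agent(task: str) -> str:
    # Placeholder specialist: would call a code-generation model/tools.
    return f"code: {task}"

SPECIALISTS = {"search": search_agent, "code": code_agent}

def orchestrator(task: str) -> str:
    """Route a task to the right specialist. A real router would be an LLM
    classification call; keyword matching stands in for it here."""
    route = "code" if "implement" in task.lower() else "search"
    return SPECIALISTS[route](task)

orchestrator("Implement a rate limiter")  # dispatched to code_agent
```

The tradeoff named above falls out of the structure: every hop through the router adds a model call's worth of latency, in exchange for each specialist keeping a small, focused prompt.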
5.2 Memory
- Short-term (Context Window): Immediate conversation. Limited by context size/cost.
- Long-term (Vector DB): Semantic search over past conversations.
- Structured (SQL): Storing user facts/preferences explicitly.
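Structured memory is the simplest of the three to sketch: a key-value table of user facts, upserted so newer facts overwrite older ones. A stdlib `sqlite3` version (in-memory here for illustration; a real deployment would point at Postgres, and the table/column names are made up):

```python
import sqlite3

# In-memory DB keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE user_facts (
           user_id TEXT,
           key     TEXT,
           value   TEXT,
           PRIMARY KEY (user_id, key)
       )"""
)

def remember(user_id: str, key: str, value: str) -> None:
    # Upsert: a later fact about the same key replaces the earlier one.
    conn.execute(
        "INSERT INTO user_facts VALUES (?, ?, ?) "
        "ON CONFLICT(user_id, key) DO UPDATE SET value = excluded.value",
        (user_id, key, value),
    )

def recall(user_id: str, key: str):
    row = conn.execute(
        "SELECT value FROM user_facts WHERE user_id = ? AND key = ?",
        (user_id, key),
    ).fetchone()
    return row[0] if row else None

remember("u1", "preferred_language", "Python")
recall("u1", "preferred_language")  # → "Python"
```

Unlike vector memory, facts stored this way can be recalled exactly and updated deterministically, which is why explicit preferences belong here rather than in an embedding index.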