

Forward Deployment Engineering Decisions and Tradeoffs

A granular decision framework for building AI products.


1. Web Application Decisions

1.1 Rendering Strategy

  • CSR (Client-Side Rendering): Good for dashboards and internal tools. Poor SEO and slower first load.
  • SSR (Server-Side Rendering): Essential for dynamic content needing SEO (e.g., news feed). Higher server load.
  • SSG (Static Site Generation): Best for blogs, docs. Fastest performance.
  • ISR (Incremental Static Regeneration): Best of both worlds for content that changes occasionally.
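The tradeoffs above reduce to a small decision table. A toy Python sketch (the function name and inputs are illustrative, not from any framework):

```python
def choose_rendering(needs_seo: bool, content_changes: str, per_user: bool) -> str:
    """Map content characteristics to a rendering strategy.

    content_changes: "never", "occasionally", or "every_request"
    """
    if per_user and not needs_seo:
        return "CSR"  # dashboards, internal tools
    if content_changes == "never":
        return "SSG"  # blogs, docs: fastest
    if content_changes == "occasionally":
        return "ISR"  # static speed, periodic regeneration
    return "SSR"      # dynamic content needing SEO, higher server load

print(choose_rendering(needs_seo=True, content_changes="every_request", per_user=False))  # SSR
```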

1.2 State Management

  • Local State (useState): Simple UI interactions.
  • Server State (React Query/SWR): Caching API responses. The standard for modern apps.
  • Global State (Zustand/Redux): Complex client-side logic (e.g., audio player, canvas editor). Avoid if possible.

2. Backend & API Decisions

2.1 Language Choice

  • Python (FastAPI/Flask): The default for AI. Great ecosystem (PyTorch, LangChain). Slower execution than Go/Rust.
  • Node.js/TypeScript: Great for I/O heavy apps, real-time chat. Shares types with frontend.
  • Go: High performance microservices. Less AI library support.

2.2 Communication Protocol

  • REST: Standard, easy to cache, universally supported.
  • GraphQL: Flexible data fetching, prevents over-fetching. Complexity overhead.
  • gRPC: High performance internal microservices communication. Harder to debug.
  • WebSockets: Essential for bidirectional real-time chat; for one-way token streaming, SSE is a simpler alternative.
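Why streaming matters for AI products: the client renders each token as it arrives instead of waiting for the full completion. A minimal sketch of the server-side generator pattern (no real WebSocket/SSE transport, just the shape of it):

```python
import time
from typing import Iterator

def stream_tokens(text: str, delay: float = 0.0) -> Iterator[str]:
    """Yield a response token by token, as an LLM backend would push
    chunks over a WebSocket or SSE connection."""
    for token in text.split():
        time.sleep(delay)  # stand-in for generation latency
        yield token + " "

# Client side: append each chunk to the UI as it arrives.
received = "".join(stream_tokens("Streaming keeps perceived latency low"))
print(received.strip())
```

The perceived-latency win is the whole point: time-to-first-token is what users feel, not total generation time.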

3. Database Decisions

3.1 Primary Database

  • PostgreSQL: The default choice. Relational, robust, supports vector search (pgvector).
  • MongoDB: Flexible schema for unstructured data. Good for rapid prototyping.

3.2 Vector Database

  • pgvector (Postgres): Good enough for 90% of use cases. Simplifies stack.
  • Pinecone/Weaviate: Managed, specialized for scale (100M+ vectors).
  • Chroma/LanceDB: Good for local/embedded use cases.
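Whichever vector store you pick, the core operation is the same: rank stored embeddings by similarity to a query embedding. A dependency-free sketch of the ranking that pgvector's `<=>` (cosine distance) operator computes inside Postgres:

```python
import math

def cosine_sim(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, docs, k=2):
    """Return the k doc ids most similar to the query embedding."""
    ranked = sorted(docs.items(), key=lambda kv: cosine_sim(query, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
print(top_k([1.0, 0.1], docs, k=2))  # ['a', 'c']
```

At small scale this brute-force scan is fine; dedicated stores earn their keep with approximate indexes (HNSW/IVF) once exact scans get slow.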

4. ML & LLM Decisions

4.1 Model Hosting

  • SaaS APIs (OpenAI/Anthropic): Fastest time to market. Best performance. High cost at scale. Privacy concerns.
  • Open Source (Llama 3/Mistral):
    • Self-hosted (vLLM/TGI): Full control, privacy. High ops burden (GPU management).
    • Managed OSS (Together/Anyscale): Good balance of cost and performance.
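The SaaS-vs-self-hosted call is largely a break-even calculation: per-token API cost scales with traffic, GPU cost is fixed. A back-of-envelope sketch (all prices below are hypothetical placeholders, not real vendor pricing, and ops/engineering time is excluded):

```python
def monthly_cost_saas(tokens_per_month: int, price_per_1m_tokens: float) -> float:
    """Pay-per-token API cost: scales linearly with usage."""
    return tokens_per_month / 1_000_000 * price_per_1m_tokens

def monthly_cost_self_hosted(gpu_hourly: float, gpus: int, hours: float = 730) -> float:
    """Fixed GPU rental cost, independent of traffic."""
    return gpu_hourly * gpus * hours

# Hypothetical numbers — substitute your provider's real pricing.
saas = monthly_cost_saas(tokens_per_month=2_000_000_000, price_per_1m_tokens=3.0)
selfhost = monthly_cost_self_hosted(gpu_hourly=2.0, gpus=2)
print(f"SaaS: ${saas:,.0f}/mo  Self-hosted: ${selfhost:,.0f}/mo")
```

Below the break-even volume, SaaS wins on cost too; above it, self-hosting wins only if the ops burden is cheaper than the per-token delta.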

4.2 RAG Strategy

  • Naive RAG: Retrieve top-k chunks. Simple, fast.
  • Hybrid Search: Keyword + Vector. Better for specific terms.
  • Agentic RAG: Agent decides when/what to search. Slower, higher quality.
  • Graph RAG: Knowledge graph for reasoning. Complex to build.
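Why hybrid search beats pure vector search for specific terms: embeddings blur exact identifiers (error codes, SKUs, names) that lexical matching catches. A toy sketch blending the two scores (production systems typically use BM25 plus reciprocal rank fusion; the weighting here is illustrative, and embeddings are assumed unit-length):

```python
def keyword_score(query: str, doc_text: str) -> float:
    """Fraction of query terms appearing verbatim in the document."""
    q_terms = set(query.lower().split())
    return len(q_terms & set(doc_text.lower().split())) / len(q_terms)

def hybrid_rank(query, q_vec, docs, alpha=0.5):
    """Blend lexical and vector similarity; alpha weights the keyword side."""
    scored = []
    for doc_id, (text, vec) in docs.items():
        vec_score = sum(a * b for a, b in zip(q_vec, vec))  # dot product
        scored.append((alpha * keyword_score(query, text) + (1 - alpha) * vec_score, doc_id))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]

docs = {
    "kb_error": ("error code E1234 in payment logs", [0.1, 0.9]),
    "kb_guide": ("general troubleshooting guide", [0.9, 0.1]),
}
print(hybrid_rank("error E1234", [0.9, 0.1], docs))  # kb_error first: exact-term match wins
```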

5. Agent Decisions

5.1 Architecture

  • Single Agent: Simple tool use. Good for linear tasks.
  • Multi-Agent (Orchestrator): Router agent delegates to specialists. Better for complex workflows.
  • Hierarchical: Manager agents decompose tasks and delegate to workers; results roll back up.
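The orchestrator pattern in miniature: a router picks a specialist per task. In a real system the router is itself an LLM call returning a specialist name; keyword matching stands in for it here, and all names are illustrative:

```python
SPECIALISTS = {
    "code": lambda task: f"[coder] wrote a patch for: {task}",
    "search": lambda task: f"[researcher] found sources on: {task}",
    "data": lambda task: f"[analyst] ran a query for: {task}",
}

def route(task: str) -> str:
    """Orchestrator: delegate the task to the right specialist agent."""
    lowered = task.lower()
    if any(w in lowered for w in ("bug", "function", "refactor")):
        return SPECIALISTS["code"](task)
    if any(w in lowered for w in ("find", "look up", "research")):
        return SPECIALISTS["search"](task)
    return SPECIALISTS["data"](task)

print(route("fix the bug in the parser"))
```

The payoff is that each specialist keeps a small, focused prompt and toolset instead of one agent juggling everything.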

5.2 Memory

  • Short-term (Context Window): Immediate conversation. Limited by context size/cost.
  • Long-term (Vector DB): Semantic search over past conversations.
  • Structured (SQL): Storing user facts/preferences explicitly.
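Structured memory is the easiest of the three to sketch: explicit user facts in a SQL table, upserted as the agent learns them and read back at prompt-build time. A minimal version with stdlib sqlite3 (table name and schema are illustrative):

```python
import sqlite3

# In-memory DB for the sketch; a real agent would use Postgres/SQLite on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_facts ("
    "user_id TEXT, key TEXT, value TEXT, PRIMARY KEY (user_id, key))"
)

def remember(user_id: str, key: str, value: str) -> None:
    """Upsert a fact: new facts insert, repeated keys overwrite."""
    conn.execute("INSERT OR REPLACE INTO user_facts VALUES (?, ?, ?)",
                 (user_id, key, value))

def recall(user_id: str) -> dict:
    """Fetch all facts for a user, e.g. to inject into the system prompt."""
    rows = conn.execute("SELECT key, value FROM user_facts WHERE user_id = ?",
                        (user_id,)).fetchall()
    return dict(rows)

remember("u1", "preferred_language", "Python")
remember("u1", "timezone", "UTC+2")
print(recall("u1"))
```

Unlike vector memory, this gives exact, auditable recall, which is the right tradeoff for facts the agent must never fuzz (preferences, entitlements, IDs).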