Forward Deployment Engineering Decisions and Tradeoffs
For a Forward-Deployed LLM / AI Engineer.
4.0 Global Decision Framework (Use This For Every Layer)
4.0.1 Clarify the problem
- User problem
- Functional requirements
- Non-functional requirements (Latency, Scale, Availability, Data sensitivity, Consistency, Regulatory)
- Constraints (Team, Time, Infra)
4.0.2 Enumerate options
- At least 2 realistic options.
- Describe architecture shape and tech stack for each.
4.0.3 Compare on axes
- Complexity, Performance, Reliability, Developer velocity, Cost, Risk.
4.0.4 Decide, document, and define guardrails
- Capture decision in ADR.
- Define validation metrics and revisit conditions.
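The enumerate/compare/decide steps above can be sketched as a weighted scoring matrix. Axes mirror 4.0.3; the weights, option names, and scores below are illustrative placeholders, not a prescription:

```python
# Decision-matrix sketch: score each option 1-5 on the comparison axes
# (higher is better on every axis, so "complexity" reads as simplicity
# and "risk" as safety), weight the axes, and rank.

AXES = {"complexity": 0.2, "performance": 0.2, "reliability": 0.2,
        "velocity": 0.2, "cost": 0.1, "risk": 0.1}

def rank(options: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
    """Return (option, weighted score) pairs, best first."""
    totals = {
        name: sum(AXES[axis] * score for axis, score in scores.items())
        for name, scores in options.items()
    }
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

options = {
    "modular_monolith": {"complexity": 5, "performance": 3, "reliability": 4,
                         "velocity": 5, "cost": 5, "risk": 4},
    "microservices":    {"complexity": 2, "performance": 4, "reliability": 4,
                         "velocity": 2, "cost": 2, "risk": 3},
}
```

The ranked output (with the winning score) goes straight into the ADR alongside the revisit conditions.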
4.1 Frontend & Web Stack Decisions
4.1.1 Language: JS vs TypeScript
- TypeScript (Default): Long-lived product, multiple engineers, safety.
- JavaScript: Throwaway prototype, single dev.
4.1.2 React vs other UI frameworks
- React (Default): Largest ecosystem, hiring pool, and library support; standardize unless a team has strong reasons otherwise.
4.1.3 Next.js Rendering Mode
- CSR: Internal tools, heavy interactivity, SEO irrelevant.
- SSR: Personalized content, SEO important, dynamic data.
- SSG: Static content (docs, marketing), max speed.
- ISR: SSG speed + periodic updates (blogs, listings).
- Edge: Geo-aware, low latency personalization.
4.1.4 State Management & Data Fetching
- React Query / SWR: Server state (caching, retries).
- Context: Simple global state (theme, auth).
- Redux / Zustand: Complex global state, time-travel debugging.
4.1.5 Micro-Frontend vs Single Frontend
- Single Next.js (Default): Modular monolith.
- Micro-frontends: Only if multiple independent teams need independent deployment.
4.2 Mobile Decisions
4.2.1 Web app vs PWA vs Native
- Responsive Web: Occasional use, no native features.
- PWA: Installable, basic offline.
- Native/Hybrid: Push notifications, sensors, frequent use.
4.2.2 React Native vs Flutter vs Native
- React Native (Default): Share logic with web, JS/TS skills.
- Flutter: Pixel-perfect custom UI, willing to learn Dart.
- Native: Deep OS integration, max performance.
4.3 Backend & API Decisions
4.3.1 Language & Framework
- Python (FastAPI/Flask): ML/AI integrations, rapid iteration.
- Node.js: Strong JS team, real-time apps.
- Go: High performance, simple concurrency.
- Java/Kotlin: Enterprise ecosystem.
4.3.2 Monolith vs Microservices
- Monolith/Modular Monolith (Start here): Single deployment, clear boundaries.
- Microservices: Multiple teams, independent scaling needed.
4.3.3 API Style
- REST (Default): Resource-centric, broad compatibility.
- GraphQL: Complex client data needs, avoid over-fetching.
- gRPC: Internal service-to-service, performance.
- WebSockets: Real-time updates.
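REST's resource-centric shape can be shown with a toy stdlib-only router: (method, URL pattern) pairs map to handlers over nouns, not verbs. A real service would use FastAPI or Flask from 4.3.1; routes, data, and names here are illustrative:

```python
# Toy REST-style dispatcher: resources are URLs, operations are HTTP
# methods. Stdlib-only, for illustration.
import re

ROUTES = []

def route(method: str, pattern: str):
    def deco(fn):
        ROUTES.append((method, re.compile(f"^{pattern}$"), fn))
        return fn
    return deco

DB = {1: {"id": 1, "name": "Ada"}}  # in-memory stand-in for a real DB

@route("GET", r"/users/(?P<id>\d+)")
def get_user(id):
    user = DB.get(int(id))
    return (200, user) if user else (404, {"error": "not found"})

def dispatch(method: str, path: str):
    for m, rx, fn in ROUTES:
        match = rx.match(path)
        if m == method and match:
            return fn(**match.groupdict())
    return (404, {"error": "no route"})
```

GraphQL and gRPC replace this URL-per-resource shape with a typed schema; the tradeoff in 4.3.3 is mostly about who controls the response shape.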
4.3.4 API Gateway & Intermediaries
- API Gateway: Central auth, rate limiting, routing.
- BFF: Tailored endpoints for specific clients.
4.4 Data & Storage Layer Decisions
4.4.1 DB Type
- Relational (Postgres): Strong consistency, complex relations.
- Document (Mongo): Flexible schema.
- KV Store (Redis): Caching, sessions.
- Search (Elasticsearch): Full-text search, analytics.
- Time-series: Metrics, logs.
4.4.2 Caching Strategy
- Browser -> CDN -> Reverse Proxy -> App Cache -> DB Cache.
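At the app-cache layer in that chain, the usual pattern is cache-aside: check the cache, fall through to the DB on a miss, write back with a TTL. A dict stands in for Redis below; swap in redis-py's get/setex in production. The TTL and query function are placeholders:

```python
# Cache-aside sketch with TTL. CACHE maps key -> (expiry, value).
import time

CACHE: dict[str, tuple[float, object]] = {}
TTL_SECONDS = 60

def slow_db_query(key: str) -> str:
    return f"row-for-{key}"  # placeholder for the real DB call

def get_cached(key: str):
    entry = CACHE.get(key)
    if entry is not None and entry[0] > time.monotonic():
        return entry[1]                                   # cache hit
    value = slow_db_query(key)                            # miss: read through
    CACHE[key] = (time.monotonic() + TTL_SECONDS, value)  # write back
    return value
```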
4.5 Messaging & Async Decisions
4.5.1 Queue (RabbitMQ, SQS)
- Long-running tasks, decouple producer/consumer.
4.5.2 Event Stream (Kafka)
- Many consumers, replay capability, append-only data.
4.5.3 Sagas
- Distributed transactions across services.
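The saga pattern pairs every step with a compensating action and, on failure, unwinds completed steps in reverse. A minimal sketch (step names are illustrative):

```python
# Saga runner: each step is (action, compensation). If any action
# raises, run compensations for the already-completed steps in
# reverse order, then report failure.

def run_saga(steps) -> bool:
    done = []
    try:
        for action, compensation in steps:
            action()
            done.append(compensation)
    except Exception:
        for compensation in reversed(done):
            compensation()
        return False
    return True
```

In a real system each action/compensation is a call to a different service (reserve stock / release stock, charge card / refund card), coordinated via the queue or event stream above.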
4.6 Infra, DevOps & Deployment Decisions
4.6.1 PaaS vs Containers vs K8s vs Serverless
- PaaS (Render/Heroku): Early stage, focus on product.
- Containers (ECS/Cloud Run): More control, Dockerized.
- Kubernetes: Complex topology, flexible scaling, ops capability.
- Serverless: Event-driven, small functions.
4.6.2 Observability
- Logs (ELK/Loki), Metrics (Prometheus), Traces (OpenTelemetry).
- LLM Specifics: Log prompts, model versions, tool calls, user feedback.
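The LLM-specific fields above can be captured as one structured JSON record per call, emitted to whatever log backend is in use. Field names below are illustrative, not a standard schema:

```python
# One structured log record per LLM call: prompt, model version,
# tool calls, and a slot for user feedback to be filled in later.
import json
import time
import uuid

def llm_log_record(prompt, model, response, tool_calls=(), feedback=None) -> str:
    return json.dumps({
        "trace_id": str(uuid.uuid4()),   # join key for traces/spans
        "ts": time.time(),
        "model": model,                  # exact model/version string
        "prompt": prompt,                # or a hash, if prompts are sensitive
        "response": response,
        "tool_calls": list(tool_calls),
        "user_feedback": feedback,       # e.g. thumbs up/down
    })
```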
4.7 ML & LLM Integration Decisions
4.7.1 Levels of Sophistication
- Prompt-only: Generic tasks, small POC.
- RAG: Grounding in docs/DB, frequent updates.
- Fine-tuned: Labeled examples, style/domain specialization, cost/latency optimization.
- Custom: Large scale, special architecture.
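The RAG level's core move is a retrieval step before generation. A toy sketch, with bag-of-words cosine similarity standing in for a real embedding model and vector DB (see 4E.4):

```python
# Toy retrieval: rank docs by cosine similarity to the query, return
# top-k. Real systems use embeddings + a vector index instead.
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    qv = vectorize(query)
    ranked = sorted(docs, key=lambda d: cosine(qv, vectorize(d)), reverse=True)
    return ranked[:k]
```

The top-k documents are then injected into the prompt as grounding context; that is the whole difference between the prompt-only and RAG levels.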
4.7.2 Hosted vs Self-Hosted
- Hosted (OpenAI/Anthropic): Best quality, simple ops.
- Self-Hosted (Llama/Qwen): Control, privacy, cost at scale.
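Because self-hosted engines like vLLM expose an OpenAI-compatible API, one client can target either option by switching the base URL and model name. A request-building sketch (URLs and model names are illustrative; no network call is made):

```python
# Build an OpenAI-format chat request for either a hosted provider or
# a self-hosted OpenAI-compatible server (e.g. vLLM on localhost).

def chat_config(provider: str) -> dict:
    if provider == "hosted":
        return {"base_url": "https://api.openai.com/v1",
                "model": "gpt-4.1"}
    if provider == "self_hosted":
        return {"base_url": "http://localhost:8000/v1",
                "model": "Qwen/Qwen2.5-7B-Instruct"}
    raise ValueError(f"unknown provider: {provider}")

def chat_request(provider: str, messages: list[dict]) -> dict:
    cfg = chat_config(provider)
    return {"url": cfg["base_url"] + "/chat/completions",
            "json": {"model": cfg["model"], "messages": messages}}
```

Keeping this switch in one place makes the hosted-vs-self-hosted decision reversible, which matters when cost at scale tips the tradeoff later.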
4.8 Agentic System Decisions
4.8.1 Single vs Multi-Agent
- Single: Linear/simple tasks.
- Multi-agent: Complex tasks, specialized roles (Planner, Executor, Critic).
4.8.2 Autonomy Level
- Low (Assistive): Human approval required before actions execute; use for high-risk or user-facing actions.
- High: End-to-end automation; acceptable for low-risk internal tools.
4.8.3 Tool Selection
- Few, powerful, safe tools > many tiny tools.
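"Few, powerful, safe" in practice: a small registry where each tool declares whether it needs human approval, tying tool selection to the autonomy levels in 4.8.2. Tool names and behaviors below are illustrative:

```python
# Minimal tool registry: risky tools are gated behind explicit
# approval; safe tools run directly.

TOOLS: dict[str, dict] = {}

def tool(name: str, requires_approval: bool = False):
    def deco(fn):
        TOOLS[name] = {"fn": fn, "requires_approval": requires_approval}
        return fn
    return deco

@tool("search_docs")
def search_docs(query: str) -> str:
    return f"results for {query!r}"  # placeholder for real search

@tool("delete_record", requires_approval=True)
def delete_record(record_id: int) -> str:
    return f"deleted {record_id}"    # placeholder for real mutation

def call_tool(name: str, approved: bool = False, **kwargs):
    entry = TOOLS[name]
    if entry["requires_approval"] and not approved:
        raise PermissionError(f"{name} needs human approval")
    return entry["fn"](**kwargs)
```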
4.9 Cross-cutting Decisions
4.9.1 Security & Compliance
- Data classification, Encryption, Auth, Access Control, Residency.
4.9.2 Cost vs Time-to-Market vs Quality
- Time-to-Market: PaaS, Hosted LLMs, Monolith.
- Cost: Smaller models, Caching, Self-hosting.
- Quality: SOTA LLMs, Agents, RAG.
4.10 React Hooks & Frontend Concept Decisions
4.10.1 useState vs useReducer
- useState: Simple, local state.
- useReducer: Complex state, transitions.
4.10.2 useEffect
- Sync with external systems. Avoid for derived state.
4.10.3 useMemo & useCallback
- Expensive computations, stable callbacks for memoized children.
4.10.4 useRef
- Mutable values without re-renders (DOM refs, timers).
4.10.5 Component Patterns
- Container/Presenter, Custom Hooks, Compound Components.
4.11 Deployment Decisions (Cloud, Local, Services)
4.11.1 Cloud Provider Selection
- AWS: Industry standard, vast ecosystem. Best for enterprise scale.
- GCP: Strong AI/ML offerings (Vertex AI), Kubernetes (GKE) leadership.
- Azure: Enterprise integration, OpenAI partnership.
- Vercel/Netlify: Frontend-first, serverless, best DX for Next.js.
4.11.2 Local Deployment & Testing
- Docker Compose: Orchestrate multi-container apps locally.
- Minikube/Kind: Local Kubernetes testing.
- LocalStack: Mock AWS services locally.
- Ollama/LocalAI: Run LLMs locally for dev/test without API costs.
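A local dev stack matching this doc's defaults (app + Postgres + Redis + Ollama) can be wired up in one Compose file. A sketch; image tags, ports, and service names are assumptions to adjust per project:

```yaml
# docker-compose.yml sketch for local development.
services:
  app:
    build: .
    ports: ["8000:8000"]
    depends_on: [db, cache]
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: dev   # dev-only credential
  cache:
    image: redis:7
  llm:
    image: ollama/ollama       # local LLM, no API costs
    ports: ["11434:11434"]
```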
4.11.3 Service Mesh & Networking
- Istio/Linkerd: Traffic management, mTLS, observability in K8s.
- Nginx/Traefik: Ingress controllers, reverse proxies.
4.11.4 Content Delivery
- Cloudflare: CDN, DDoS protection, Edge workers.
- AWS CloudFront: Deep integration with S3/AWS services.
4E (Extended) – LLMs, Inference, Agents, Cloud & GPU Decisions
4E.1 LLM Choices
Hosted:
- GPT-4.1 / GPT-5: General assistants.
- o3-mini / reasoning models: Complex reasoning & planning.
Open-source:
- DeepSeek-R1 (and distilled variants): Reasoning.
- Qwen 2.5/3: Strong coding & reasoning.
- LLaMA, Mistral, Codestral: Good alternatives.
4E.2 Inference & Serving
Local dev:
- Ollama.
- llama.cpp.
- vLLM on Colab Pro for more serious tests.
Production:
- vLLM: Primary serving engine (OpenAI-compatible).
- TGI / SGLang: Alternatives.
4E.3 Agent & Orchestration Frameworks
- LangChain: Building blocks for LLM apps.
- LangGraph: Multi-step/multi-agent workflows as graphs.
- LlamaIndex: Strong for RAG/data indexing.
- AutoGen/CrewAI: Alternative multi-agent frameworks.
- OpenAI Assistants/Agents: API-native orchestration when using OpenAI stack.
4E.4 Vector Databases
- Local: Chroma, FAISS, or a local Postgres with pgvector (pgvector is a Postgres extension, not SQLite).
- Small prod: Postgres + pgvector.
- Larger: Qdrant, Weaviate, Milvus.
- SaaS: Pinecone, Turbopuffer, etc.
4E.5 Observability for LLMs
- LangFuse / LangSmith: Tracing, evaluation.
- Prometheus + Grafana: Metrics.
- Loki/ELK: Logs.
- OpenTelemetry: End-to-end tracing.
4E.6 Cloud & GPU Hosting
GPU as a service:
- OpenAI/Anthropic: Fully managed.
- Modal/Replicate/Banana: Managed GPU containers.
- Runpod/Lambda/Vast.ai: Bare-metal GPU rental.
- AWS/GCP GPUs: Enterprise-level infra.