How much does an AI chatbot cost?
A production AI chatbot typically costs $20,000 to $150,000 to build and $200 to $3,000 per month to run. The build cost depends on the integration depth (basic OpenAI wrapper vs RAG over corporate knowledge vs multi-tool agent); the run cost depends on query volume and which LLM you call. A buyer-grade chatbot with RAG, evals, and observability commonly lands at $60,000 to $100,000 build and $500 to $1,500 monthly run cost at moderate volume.
The longer answer
AI chatbot pricing splits into build cost and run cost. Build cost varies far more than buyers expect, because the amount and kind of engineering work changes substantially with integration depth.
Build cost bands
Basic LLM wrapper ($15,000-$40,000). A chatbot that wraps OpenAI or Anthropic with a prompt template and a simple UI. No RAG, no domain-specific knowledge, no tool use. Right fit for general-purpose customer-service triage where the goal is volume reduction rather than domain expertise. Build time: 3-6 weeks.
RAG-based chatbot ($40,000-$100,000). A chatbot that retrieves from a vector database of corporate documents (policy manuals, product catalogs, support tickets, contracts) and grounds the LLM answer in retrieved context with verifiable citations. Build time: 6-12 weeks. This is the dominant production AI pattern in 2026.
Multi-tool agent ($80,000-$200,000). A chatbot that calls tools (CRM lookup, order status, billing API, scheduling) and orchestrates multi-step workflows. Build time: 8-16 weeks. Higher risk because the failure modes of autonomous tool use are non-obvious.
Custom fine-tuned model ($120,000-$400,000). For use cases that require a specific output format, domain jargon, or a compliance posture that prompt engineering cannot deliver. The price includes fine-tuning data preparation, training runs, and ongoing eval and monitoring infrastructure. Build time: 12-24 weeks.
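To make the RAG band above concrete, here is a minimal sketch of the retrieve-then-ground step. It is an illustration under loud assumptions, not a production design: a toy word-count similarity stands in for an embedding model and vector database, no LLM is called, and every name (DOCS, retrieve, build_prompt) is hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical corpus: document id -> text. A real system stores embeddings
# in a vector database instead of raw strings in a dict.
DOCS = {
    "refund-policy": "Refund policy: purchases can be refunded within 14 days.",
    "shipping": "Standard shipping takes 3 to 5 business days.",
    "warranty": "Hardware carries a one year limited warranty.",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, docs, k=2):
    """Return up to k document ids with non-zero similarity, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), doc_id)
              for doc_id, text in docs.items()]
    scored.sort(reverse=True)
    return [doc_id for score, doc_id in scored[:k] if score > 0]

def build_prompt(query, docs, k=2):
    """Assemble a prompt that asks the model to cite the retrieved sources."""
    ids = retrieve(query, docs, k)
    context = "\n".join(f"[{doc_id}] {docs[doc_id]}" for doc_id in ids)
    return (
        "Answer using only the sources below and cite their ids in brackets.\n"
        f"Sources:\n{context}\n"
        f"Question: {query}"
    )

print(build_prompt("What is the refund policy?", DOCS))
```

The prompt carries the source ids, so an answer can be checked against the documents it cites. Much of the build cost in this band goes into replacing the toy pieces with real ingestion, embeddings, and evals.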
Run cost components
Three line items add up to the monthly run cost.
LLM API calls. Anthropic Claude or OpenAI GPT-4-class models cost $3-$15 per million input tokens and $15-$75 per million output tokens at retail; cheaper Haiku or GPT-4o-mini class models cost a fraction of that. A moderately busy chatbot handling 5,000 queries per day at an average of 2,000 tokens per query runs $200-$2,000/month in LLM costs, depending on the model.
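The arithmetic behind that estimate can be sketched as follows. The split of the 2,000 tokens into 1,600 input and 400 output is an assumption made for illustration (the figure above does not break it down), and the function name is hypothetical.

```python
def monthly_llm_cost(queries_per_day, input_tokens, output_tokens,
                     input_price_per_m, output_price_per_m, days=30):
    """Estimate monthly LLM API spend in dollars.

    Prices are dollars per million tokens; token counts are per query.
    """
    monthly_queries = queries_per_day * days
    input_cost = monthly_queries * input_tokens / 1_000_000 * input_price_per_m
    output_cost = monthly_queries * output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost

# 5,000 queries/day, assumed 1,600 input + 400 output tokens per query,
# priced at the low end of the retail range ($3 input / $15 output).
cost = monthly_llm_cost(5_000, 1_600, 400, 3.0, 15.0)
print(f"${cost:,.0f}/month")  # $1,620/month
```

Output tokens cost several times more than input tokens, so the input/output split moves the total considerably. It is worth measuring on real traffic before budgeting.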
Vector database. pgvector on existing Postgres is free; managed Pinecone / Weaviate / Qdrant runs $50-$500/month for typical chatbot scales.
Observability and evals. Langfuse, Helicone, or similar tooling runs $50-$300/month. Worth every dollar; running AI in production without observability is operationally negligent.
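As a rough cross-check, the three line items above can be summed into a monthly range. The sketch below uses only the figures quoted in this section, with $0 as the vector database floor for the pgvector-on-existing-Postgres case; the names are illustrative.

```python
# Monthly (low, high) ranges in dollars, taken from the line items above.
RUN_COST_RANGES = {
    "llm_api": (200, 2000),
    "vector_db": (0, 500),        # 0 assumes pgvector on existing Postgres
    "observability": (50, 300),
}

def monthly_run_cost_range(items):
    """Sum the low ends and the high ends of each line item separately."""
    low = sum(lo for lo, _ in items.values())
    high = sum(hi for _, hi in items.values())
    return low, high

low, high = monthly_run_cost_range(RUN_COST_RANGES)
print(f"${low:,} to ${high:,} per month")  # $250 to $2,800 per month
```

That total sits inside the $200 to $3,000 monthly range quoted at the top of the answer.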
The cost-of-inaction math
A well-built customer-service chatbot that handles 50% of a 200-ticket-per-day support queue replaces or augments roughly two full-time staff, or about $200,000+/year combined at U.S. fully loaded cost. Against that saving, a $60,000 build amortizes in less than four months. AI is one of the few engineering categories where the math is usually in favor of building.
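The payback figure can be reproduced with a short calculation. This sketch treats the $200,000/year as the total annual saving and optionally subtracts a monthly run cost; the $1,000 run cost in the second call is an assumed mid-range value, and the function name is hypothetical.

```python
def payback_months(build_cost, annual_savings, monthly_run_cost=0.0):
    """Months until cumulative net savings cover the build cost."""
    net_monthly = annual_savings / 12 - monthly_run_cost
    if net_monthly <= 0:
        raise ValueError("run cost meets or exceeds savings; no payback")
    return build_cost / net_monthly

# $60,000 build against $200,000/year in staff cost.
print(round(payback_months(60_000, 200_000), 1))         # 3.6
# Same build, net of an assumed $1,000/month run cost.
print(round(payback_months(60_000, 200_000, 1_000), 1))  # 3.8
```

Either way the result stays under four months at this ticket volume. At lower volumes the saving shrinks and the payback period stretches accordingly.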
Common follow-up questions
Can I build a chatbot for $5,000?
Not a production-grade one. $5,000 buys a prototype that demonstrates the idea. A chatbot with proper RAG, evals, observability, prompt versioning, and integration with your existing systems starts at $20,000.
How long until the chatbot pays for itself?
For customer-service use cases handling moderate volume (100+ tickets/day), 3-9 months is typical. For internal-knowledge chatbots, the math is harder to quantify but the answer is usually "fast" because the alternative (staff hours spent searching documentation) is expensive.
Should I use OpenAI, Anthropic, or self-host?
Default to commercial APIs (Anthropic, OpenAI) unless you have specific compliance, latency, or cost reasons to self-host. Self-hosting adds 20-40% to engagement cost in the first year and only pays back at significant volume or with strict data-residency requirements.
If this answer is useful and you have a real engagement in mind, the contact form routes directly to the principal. James Henderson is the single engineer who scopes, writes, and supports every engagement end-to-end.