Why does AI customer service always miss the point? (A plain-language technical explanation)

Customer Service 2026-05-11 · Satsuma Creative · 9 min read

AI customer service misses the point not because AI isn't smart enough, but because it's being used wrong. This article explains the technical reasons why a generic LLM dropped into customer service makes things up, and how RAG and custom knowledge bases can fix it.

TL;DR

Last week I stress-tested three AI customer service systems with four tricky questions,and all of them made things up(invented refund timelines, mixed up products, made off-hand promises)
It's not that the LLM isn't strong enough — it'smissing information — without your company's dedicated KB, a generic model can only make things up
RAG Fixes 80% of off-topic answers, but it'll still fabricate (the remaining 20% needs ACTION routing + strict system prompt constraints)
The reason SaaS players can't do "customization" well isstructural: their business model depends on "one service for many customers"
Missing the point is a business choice, not a technical limitation

Or: why is ChatGPT so brilliant, but turns dumb the moment you put it on a website?

Last week I stress-tested three AI customer service systems

I asked three "well-known" AI customer service bots the same question:

"I'm moving next month — do I need to reapply for your internet service, or can I just transfer it?"

The first answered: "Hello, based on your question, we do offer moving services. Please see our website for details." (Empty answer, didn't address anything.)

The second answered: "Per regulations, moving qualifies for a 20% discount." (It pulled an old ADSL-era promo rule — the company doesn't even offer that anymore.)

The third answered: "You just need to notify us 30 days in advance." (That's the company's "cancellation" terms, not "moving" terms. It conflated the two.)

All three missed the point. But none of these are bad AI — the engines behind them are probably GPT-4 or Claude.

So where's the problem?

AI isn't unable to answer — it's been put in the wrong position to answer

Wiring a generic large language model (LLM) directly into your website as customer service is like sticking a brand-new hire at the support deskwithout ever handing them the product manual。

What would they do?

Two options: - The honest version: "I don't know, let me check" (but most models are trained to "be actively helpful," so they don't pick this one) - The made-up version: "Based on common sense, it should be..." ← This is what "missing the point" really looks like

The technical term is "hallucination," but it's really just AI's habitual gamble when it's uncertain。

Why can't this bug be fixed? (Short answer: it's not a bug)

An LLM's training objective is to "generate text that sounds plausible," not to "only speak when it knows the answer." So when you ask about something it hasn't seen, it defaults to mimicking similar responses it has seen.

This is exactly the same mechanism as a junior salesperson bluffing in front of a customer and tossing out random answers —the mechanism is identical。

It's not that the LLM isn't strong enough — information is missing

You'll say: just feed it more data, right? The question is:how do you feed it?

Method A: Stuff all your data into every conversation prompt

The brute-force approach. Copy and paste the company's entire FAQ, product manual, and SOPs to the start of the prompt, every conversation.

Problems: - LLMs have a context window limit (GPT-4 is around 128K tokens — a single product manual fills half of that) - Every conversation carries it = you pay token costs every time,costs explode - Overly long contexts cause the model to "forget" the middle (the famous "lost in the middle" effect)

Method B: Dynamically retrieve relevant content per question = RAG

「Retrieval-Augmented Generation" (Retrieval-Augmented Generation), or RAG for short.

Flow: 1. Take all company documentssplit into small chunks(200–500 characters each) 2. Convert each chunk into a "vector" (a string of numbers representing meaning) 3. When a visitor asks a question,convert the question into a vector too 4. Find the 5–10 most similar chunks (this is cosine similarity) 5. Feed only these chunks + the original question to the LLM 6. The LLM answers based on these chunks

→ Context is always precise, always sufficient, always cheap.

And most importantly:The only data the LLM can see is your company's real content. No matter how much it wants to "make things up," it can only assemble from what your company provided — it won't stitch things together on its own.

What RAG still doesn't solve: it'll still fabricate

Most open-source RAG tutorials stop here. But once you actually build customer service, you'll find:Even with RAG, the LLM still improvises.

Example: - Question: "How much is our Plan A?" - The most relevant chunk RAG retrieves: "Plan A features..." (no price listed) - The LLM reads the chunk and still makes it up: "Plan A starts at NT$3,000." ← Customer is toast

To fully solve this, RAG alone isn't enough. You also needthree more things:

1. Hard-code "don't answer what isn't written" into the system prompt

Not "please try to answer based on the data," but:

「Only answer from the knowledge base below. For anything not written, always reply "I'm not sure about this — let me get a human to help." Don't fill in gaps with common sense.」

Get this sentence right, and the LLM's error rate drops 80%.

2. Design ACTION tags to force routing

Make the LLM output a tag before every response (e.g. [ANSWER] / [UNKNOWN] / [HANDOFF]), then write the actual reply. The system routes behavior based on the tag — for example, [UNKNOWN] goes straight to a human,preventing the LLM from freely falling back to fabrication。

3. Show citations

Attach "from knowledge base entry #X" to every answer. This isn't just for the user —it constrains the LLM: knowing it'll be traceable, it stops talking nonsense.

Why can't most AI customer service SaaS do this?

When I lay out these four things — RAG + system prompt + ACTION tag + citation — you might think: "Why don't AI customer service vendors do this?"

Honestly,technically they all could. The problem isn't technical — it's thebusiness model。

The core of the SaaS model is "one product sold to 1,000 customers," so:

The system prompt must be generic → no way to write taboos and tone for each customer
The knowledge base format must be standardized → it takes whatever the customer uploads, with no deep interviews
ACTION tags must be defaults → can't set "escalate when a competitor's name appears" for a specific customer
Customer persona design?SaaS is a tool — it doesn't write personas

This isn't a flaw — it's the nature of the SaaS model. A tool priced at NT$3,000/month can't run 10 interviews, write a custom persona, and design taboo terms for you. The economics don't work.

→ So if what you need is "an AI with a name, a personality, that speaks in your company's voice, and won't talk nonsense," the SaaS routestructurally can't deliver。

So how do you actually do it?

Break it down:

Task	SaaS Model	Custom Model
Organize company knowledge base	You upload the FAQ yourself	We interview + organize
AI persona design	Generic assistant	Customized for your brand, with a name
Write taboo rules	Generic	Aligned with your internal SOPs
ACTION protocol	None	Full workflow protocol
Continuously feed new knowledge	You upload it yourself	Monthly / quarterly audit + gap-filling

→ This isn't "SaaS + deeper customization,"it's a different deliverable。

SaaS sells a tool. The custom model sells "a digital employee who speaks for your company」。

In-house live demo: Xiao-Ai

After all that talk, examples are quickest.

Satsuma's own site has an AI customer service in the bottom right corner calledXiao-Ai. She's read Satsuma's: - Service offerings (the 5 areas of TVC / Social / Media / Web / AI Coworker) - Past work (Sat Online / Mani Mani Club / 81KeysRetro and more) - Founder background (20 years in the game marketing industry) - Engagement process / pricing policy

Things she can roughly answer: - "How much for a TVC spot?" → gives a range directly + guides toward a brief - "I want to build AI customer service, budget NT$1M" → auto-detects hot-lead signalsand pops up a "Fill in engagement form" button - "Do you guys do TVC ads?" → detects the language and switches to English

What shewon'tdo: - "Do you do e-commerce?" → not in the knowledge base, she'll say directly: "I'm not sure about this — would you like to leave your email and a Satsuma specialist will reach out?" ← She won't make things up

You can try her yourself — give her a hard time.

👉 Go try Xiao-Ai

Conclusion: Missing the point is a business choice, not a technical limitation

Back to those three AI customer service systems at the start. They miss the point not because precision is technically out of reach,but because their business model doesn't need that level of precision。

If you just want to "stick an AI on the website to look on-trend," SaaS is enough. If what you need is "an AI coworker who speaks for your brand, doesn't talk nonsense, and gets to know your company better over time," that's a different path.

Neither path is right or wrong, butpicking the wrong one wastes three months and a budget。

If you've thought it through, let's talk

Satsuma's AI Coworker offering takes the custom route —we don't take SaaS-level standardized projects。

Still evaluating? → Come chat with Xiao-Ai
Serious about doing this? → Fill in the engagement form, and a 30-minute brief is on us

Further reading: - What is an AI Coworker? Full introduction → - Satsuma Creative — Integrated marketing creative agency

Satsuma Creative

Integrated marketing creative agency. We believe AI shouldn't be sold as a tool — it should be cultivated as a coworker.