// use cases · customer support

Support at scale,
data in the EU.

Deflect tickets and resolve conversations across chat, email and voice, grounded in your help center, with customer data that never leaves the EU.

// how it works

Resolve more, with your own knowledge.

Retrieval, reasoning, tools and voice from a single OpenAI-compatible endpoint — across every channel, and only inside the EU.

step 01

Ground in your help center

qwen3-embedding

Retrieve from your knowledge base, docs and past tickets so answers are accurate and on-policy — grounded in your reality, not a generic model's.
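A minimal sketch of that retrieval step, assuming the endpoint serves qwen3-embedding through the standard OpenAI embeddings route. The helper names (`cosine`, `top_k`, `make_embedder`) are illustrative, not part of any SDK, and a production setup would normally use a vector store rather than an in-memory sort.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: str, passages: list[str],
          embed: Callable[[list[str]], list[Vector]], k: int = 3) -> list[str]:
    """Return the k passages whose embeddings are closest to the query."""
    query_vec, *passage_vecs = embed([query, *passages])
    ranked = sorted(zip(passages, passage_vecs),
                    key=lambda pair: cosine(query_vec, pair[1]), reverse=True)
    return [passage for passage, _ in ranked[:k]]


def make_embedder(client, model: str = "qwen3-embedding"):
    """Wrap an OpenAI-style client as an `embed` function.

    Assumption: the endpoint exposes the embeddings route for this model.
    """
    def embed(texts: list[str]) -> list[Vector]:
        response = client.embeddings.create(model=model, input=texts)
        return [item.embedding for item in response.data]
    return embed
```

With an OpenAI client in hand, `top_k(question, passages, make_embedder(client))` returns the passages to place in the system prompt.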

step 02

Answer & act

deepseek-v4-flash

Resolve tickets and chats with tool calling — look up orders, check accounts, take actions — and escalate cleanly to a human when it's the right call.
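A sketch of what those tools could look like, using the OpenAI function-calling schema. The tool names, their parameters and the placeholder handlers (`lookup_order`, `escalate_to_human`) are hypothetical; real handlers would call your order system and helpdesk.

```python
import json

# Hypothetical tools in the OpenAI function-calling schema.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "lookup_order",
            "description": "Fetch the status of an order by its ID.",
            "parameters": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
                "required": ["order_id"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "escalate_to_human",
            "description": "Hand the conversation to a human agent.",
            "parameters": {
                "type": "object",
                "properties": {"reason": {"type": "string"}},
                "required": ["reason"],
            },
        },
    },
]


def lookup_order(order_id: str) -> dict:
    # Placeholder: a real handler would query your order system.
    return {"order_id": order_id, "status": "shipped"}


def escalate_to_human(reason: str) -> dict:
    # Placeholder: a real handler would open a handoff in your helpdesk.
    return {"escalated": True, "reason": reason}


HANDLERS = {"lookup_order": lookup_order, "escalate_to_human": escalate_to_human}


def run_tool_call(name: str, arguments: str) -> str:
    """Run one tool call requested by the model and return a JSON result string."""
    handler = HANDLERS.get(name)
    if handler is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        return json.dumps(handler(**json.loads(arguments)))
    except (TypeError, json.JSONDecodeError) as exc:
        # Malformed or mismatched arguments are reported back to the model.
        return json.dumps({"error": str(exc)})
```

In the OpenAI Python SDK, each entry in `reply.choices[0].message.tool_calls` carries `function.name` and `function.arguments`; the string returned here goes back to the model as a `tool` message with the matching `tool_call_id`.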

step 03

Across every channel

kokoro

Chat, email or voice: transcribe with whisper-large-v3 and speak with kokoro for voicebots and IVR. One stack, every customer touchpoint.
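One voice turn might look like the sketch below, assuming the endpoint mirrors the OpenAI audio routes for transcription and speech. `voice_turn` and its `answer` callback are illustrative names, and the voice identifier is left as a parameter because voice names depend on the deployment.

```python
def voice_turn(client, audio_file, answer, voice: str) -> bytes:
    """Transcribe a caller's audio, get a text answer, and synthesize the reply.

    `client` is an OpenAI-style client; `answer` is any function that maps the
    caller's text to reply text (for example, a chat completion call).
    """
    transcript = client.audio.transcriptions.create(
        model="whisper-large-v3",
        file=audio_file,
    )
    reply_text = answer(transcript.text)
    speech = client.audio.speech.create(
        model="kokoro",
        voice=voice,      # assumption: pass a voice name your deployment lists
        input=reply_text,
    )
    return speech.read()  # raw audio bytes to play back to the caller
```

Keeping `answer` as a plain function means the same grounded chat logic can serve chat, email and voice.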

// drop-in

Change one line. Keep your helpdesk.

Grounded answers with tools, in one chat completion. Change the base URL and key, and your support bot runs on private EU models.

support.py
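from openai import OpenAI

client = OpenAI(
    api_key="sk-...",
    base_url="https://api.helmcode.com/v1",  # one line changes
)

# kb_context (str), conversation (list of chat messages) and tools
# (list of OpenAI-format tool schemas) come from your application

# answer from your help center, with tools to take action
reply = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[
        # blank line keeps the instruction separate from the retrieved passages
        {"role": "system", "content": "Answer from the help center.\n\n" + kb_context},
        *conversation,
    ],
    tools=tools,            # look up orders, check accounts
)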

// why helmcode

Support that keeps customer data home.

Every ticket and call is full of customer PII. Closed APIs want all of it — the messages, and the recordings.

01

Customer PII never stored.

Tickets, chats and recordings are never logged, and never train a model. The personal data in a conversation stays exactly where it should.

02

Processed in the EU.

Conversations and customer data stay on EU infrastructure, not on US hyperscalers subject to the CLOUD Act. Built for GDPR and the EU AI Act.

03

Lower cost per interaction.

Every ticket, message and minute is included. Limits are requests per minute (RPM) and concurrency per key, never total tokens, so cost per interaction drops as volume grows.
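The arithmetic behind that claim, as a sketch. The fee below is a made-up figure for illustration, not a published price, and achievable volume is still bounded by the per-key rate and concurrency limits.

```python
def cost_per_interaction(flat_monthly_fee: float, interactions: int) -> float:
    """Spread a flat monthly fee over the month's interactions."""
    if interactions <= 0:
        raise ValueError("interactions must be positive")
    return flat_monthly_fee / interactions


# Hypothetical fee, for illustration only.
FEE = 1000.0
for volume in (10_000, 50_000, 200_000):
    print(volume, cost_per_interaction(FEE, volume))
```

Because the numerator is fixed, every extra interaction lowers the average.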

04

Chat + voice, one API.

Text and voice — STT, LLM and TTS — behind a single OpenAI-compatible endpoint. One stack for tickets, chat and voicebots.

05

Grounded in your help center.

Answers retrieved from your own knowledge base, with citations — fewer hallucinations, fewer wrong answers, fewer escalations.
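One simple way to get citations is to number the retrieved passages in the system prompt and ask the model to cite them. A sketch, assuming each passage is a dict with hypothetical `title` and `text` keys:

```python
def build_kb_context(passages: list[dict]) -> str:
    """Format retrieved passages as numbered sources the model can cite as [n]."""
    lines = ["Cite sources as [n]. If the sources do not cover the question, say so."]
    for n, passage in enumerate(passages, start=1):
        lines.append(f"[{n}] {passage['title']}: {passage['text']}")
    return "\n".join(lines)
```

The returned string can be used as the retrieved context in the system message, and the `[n]` markers in the reply map back to help center articles.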

06

Drops into your stack.

OpenAI-compatible chat, tools and audio. Change the base URL and key; your helpdesk, CCaaS or bot framework keeps working.

In production across
  • Contact center / BPO
  • E-commerce & retail
  • Telco
  • Banking & fintech
  • Public sector
  • AI-native products

// support faq

Support, answered.

What CX and engineering teams ask before automating support on their own data.

Can it answer from our own help center?

Yes. Ground answers with retrieval (qwen3-embedding + rerank) over your knowledge base, docs and past tickets, so responses are accurate and on-policy — with citations.

Can it take actions, not just reply?

Yes. With tool calling it can look up orders, check accounts and trigger workflows, then escalate to a human when needed — the same JSON tools you already use with OpenAI.

Does it support voice?

Yes. Transcribe calls with whisper-large-v3 and synthesize replies with kokoro (sub-second, 67 voices) for voicebots and IVR — all from the same API.

Do you store tickets or call recordings?

No. Zero logs — conversations, transcripts and recordings are never persisted and never train a model.

How does this lower cost per interaction?

Flat-rate pricing with no token caps means every deflected ticket and message is included — so as volume grows, your cost per interaction falls instead of rising.

Can it run on-premise for regulated support?

Yes. Run on a dedicated GPU or fully on-premise inside your own datacenter — the same API and code, with customer data that never leaves your network.
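To keep the code identical across hosted, dedicated and on-premise deployments, the endpoint can be read from the environment. A sketch; `HELMCODE_BASE_URL` and `HELMCODE_API_KEY` are hypothetical variable names, not an official convention.

```python
import os


def client_settings(env=None) -> dict:
    """Read endpoint settings from the environment.

    HELMCODE_BASE_URL and HELMCODE_API_KEY are hypothetical variable names.
    The hosted URL is the default; an on-premise deployment overrides it.
    """
    env = os.environ if env is None else env
    return {
        "base_url": env.get("HELMCODE_BASE_URL", "https://api.helmcode.com/v1"),
        "api_key": env["HELMCODE_API_KEY"],
    }
```

`OpenAI(**client_settings())` then builds the client, and only the environment differs between deployments.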

// get started

START BURNING TOKENS

Skip the AI infra work. Deploy your first private inference endpoint today.

Flat rate. EU data. OpenAI API compatible.