Configuration & self-hosting
AgentData reads non-secret settings from config.yaml. Environment variables override those settings and hold all secrets. This page is the reference for self-hosted deployments.
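
The precedence described above can be sketched in a few lines. This is only an illustration, not AgentData's actual loader: the function name `resolve_setting`, the lowercase key convention for config.yaml, and the sample values are all assumptions, and a plain dictionary stands in for the parsed file.

```python
import os
from typing import Any, Mapping, Optional


def resolve_setting(
    name: str,
    file_config: Mapping[str, Any],
    environ: Optional[Mapping[str, str]] = None,
) -> Any:
    """Return the environment value for `name` if set, else the config-file value."""
    env = os.environ if environ is None else environ
    if name in env:
        # Environment variables win over config.yaml.
        return env[name]
    # Assumption: config.yaml keys are the lowercased variable name.
    return file_config.get(name.lower())


# Stand-in for the parsed contents of config.yaml (hypothetical keys and values).
file_config = {"federation_enabled": "false", "aws_region": "eu-west-1"}
# Stand-in for the process environment.
env = {"FEDERATION_ENABLED": "true"}

print(resolve_setting("FEDERATION_ENABLED", file_config, env))  # true
print(resolve_setting("AWS_REGION", file_config, env))          # eu-west-1
```

Passing the environment as a parameter keeps the lookup easy to test without touching the real process environment.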
Which key does what
The short version:
- ANTHROPIC_API_KEY powers every text-LLM call: classification, query planning, learn/teach, translation.
- Embeddings use a separate key. Anthropic has no embeddings API, so semantic recall needs an OpenAI or Voyage key (or a local embeddings server). The two keys are unrelated, and embeddings are optional: without them, recall falls back to word-overlap matching.
Swap the cloud LLM for an on-prem/OpenAI-compatible one to keep data in your network — see Security and Self-hosting.
Secrets (set via environment variables)
| Env var | Used for | Required? |
|---|---|---|
| DATABASE_URL | Postgres connection | Yes |
| SECRET_KEY | JWT signing | Yes (prod) |
| FERNET_KEY | Encrypt source credentials/tokens at rest | Yes (prod) |
| ANTHROPIC_API_KEY | All LLM text calls | Yes, for any AI feature |
| EMBEDDING_OPENAI_API_KEY | Embeddings only (semantic recall) | Optional |
| GOOGLE_CLIENT_ID | Google sign-in | Optional |
| CUBEJS_API_SECRET | Sign Cube API tokens (federation) | If using Cube + Trino |
| MCP_API_KEY | Legacy shared MCP key | Optional (per-user keys / OAuth preferred) |
| MAIL_SMTP_*, MAIL_FROM | Invite / welcome email | Optional |
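
A deployment script may want to check the required secrets before starting the service. The sketch below is one way to do that, based only on the "Required?" column above. The function `missing_secrets`, the `production` flag, and the sample connection string are hypothetical and not part of AgentData.

```python
from typing import List, Mapping

# Taken from the "Required?" column of the secrets table.
ALWAYS_REQUIRED = ("DATABASE_URL",)
PROD_REQUIRED = ("SECRET_KEY", "FERNET_KEY")


def missing_secrets(environ: Mapping[str, str], production: bool) -> List[str]:
    """List required secret variables that are unset or empty."""
    required = ALWAYS_REQUIRED + (PROD_REQUIRED if production else ())
    return [name for name in required if not environ.get(name)]


# Hypothetical environment with one production secret missing.
sample_env = {
    "DATABASE_URL": "postgresql://user:password@db.internal:5432/agentdata",
    "SECRET_KEY": "replace-with-a-long-random-value",
}

print(missing_secrets(sample_env, production=True))   # ['FERNET_KEY']
print(missing_secrets(sample_env, production=False))  # []
```

In a real deployment the mapping passed in would be `os.environ`.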
Behaviour & wiring
| Env var | Meaning |
|---|---|
| LLM_PROVIDER / LLM_BASE_URL / LLM_API_KEY | Swap Anthropic for an OpenAI-compatible or on-prem LLM. Default: anthropic. |
| EMBEDDINGS_PROVIDER / EMBEDDINGS_MODEL / EMBEDDINGS_BASE_URL | Provider is openai, voyage, or "" (off); a base URL points at a local embeddings server (no key). |
| EMBEDDING_VOYAGEAI_KEY | Key when EMBEDDINGS_PROVIDER=voyage. |
| CUBE_API_URL / TRINO_URL | Federation engine URLs (private hosts). |
| FEDERATION_ENABLED | true/false: Cube + Trino vs the in-process shim. |
| APP_URL / MCP_PUBLIC_URL | Public UI URL (email links) / public backend URL (OAuth callbacks). |
| CORS_ORIGINS / REQUIRE_AUTH | CORS allowlist; authentication gate (set REQUIRE_AUTH=true in production). |
| AWS_REGION | Bedrock / Athena region, if used. |
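
The embeddings variables interact with the two embeddings keys from the secrets table, so it can help to see the combinations spelled out. The following is a rough sketch of that mapping as read from the tables on this page. The function `embeddings_backend` is hypothetical, and the precedence it uses (a base URL wins over the provider name) is an assumption rather than documented behaviour.

```python
from typing import Mapping, Optional, Tuple


def embeddings_backend(environ: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Return (mode, name of the key variable needed), or None when no key is needed."""
    # Assumption: a base URL means a local embeddings server, which needs no key.
    if environ.get("EMBEDDINGS_BASE_URL"):
        return ("local", None)
    provider = environ.get("EMBEDDINGS_PROVIDER", "")
    if provider == "":
        # Embeddings are off; recall uses word-overlap matching instead.
        return ("off", None)
    if provider == "openai":
        return ("openai", "EMBEDDING_OPENAI_API_KEY")
    if provider == "voyage":
        return ("voyage", "EMBEDDING_VOYAGEAI_KEY")
    raise ValueError(f"unknown EMBEDDINGS_PROVIDER: {provider!r}")


print(embeddings_backend({"EMBEDDINGS_PROVIDER": "voyage"}))
print(embeddings_backend({"EMBEDDINGS_BASE_URL": "http://embeddings.internal:8080"}))
print(embeddings_backend({}))
```

The host name in the second call is a placeholder for a private embeddings server.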
Graceful degradation
If a feature's key is missing, AgentData degrades rather than crashing: the classifier falls back to heuristics, query_nl returns 503, and embeddings fall back to word-overlap matching. Every LLM call is cost-tracked per tenant.
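
To give a feel for what word-overlap matching means in practice, here is a minimal sketch that ranks candidates by Jaccard overlap of their word sets. The scoring AgentData actually uses is not specified on this page, so the tokenizer, the Jaccard measure, and the function names are assumptions chosen for illustration.

```python
import re
from typing import List, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query: str, candidate: str) -> float:
    """Jaccard overlap: shared words divided by all distinct words."""
    q, c = tokens(query), tokens(candidate)
    if not q or not c:
        return 0.0
    return len(q & c) / len(q | c)


def recall(query: str, candidates: List[str], top_k: int = 3) -> List[Tuple[float, str]]:
    """Return up to top_k candidates that share at least one word, best first."""
    scored = [(overlap_score(query, text), text) for text in candidates]
    matches = [pair for pair in scored if pair[0] > 0]
    return sorted(matches, key=lambda pair: pair[0], reverse=True)[:top_k]


# Hypothetical stored items.
candidates = [
    "revenue by region per month",
    "customer churn rate",
    "monthly active users",
]
for score, text in recall("monthly revenue by region", candidates):
    print(f"{score:.2f}  {text}")
# 0.50  revenue by region per month
# 0.17  monthly active users
```

The example also shows the limitation of this fallback: "month" and "monthly" do not match, which is the kind of gap semantic embeddings are meant to close.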