№ 13 · AI products

Bacardi Insights

Analytical chatbot for the C-level: ask in natural language about P&L and market share; get back executed SQL, charts, and an executive summary.

Live · 2026 · Bacardi · LangGraph · MLflow · FastAPI · React/TS

Summary

A multi-agent analytics platform for the Bacardi C-suite. Natural-language questions on financial performance and market share resolve through a LangGraph supervisor, a three-tier cache, and deterministic chart rendering — and ship with the SQL, the trace, and the routing rationale visible. Built to beat black-box copilots on transparency and speed.

Details

My role: AI product engineer
Period: 2026
Status: Live
Stack: LangGraph, MLflow, FastAPI, React/TS, Databricks, Genie Spaces, Vector Search, Lakebase, Recharts, ChatDatabricks

Context

C-suite users at a global spirits group needed answers on financial performance and market share without the 24-to-72-hour analyst loop. The in-house alternative was a copilot deployment optimised for breadth and brand familiarity; what it could not offer was legibility. Executives accept AI in the workflow only when they can see why a number is what it is. A black-box answer in a quarterly review is not an answer; it is a liability.

The first-generation chatbot proved the appetite but exposed the architectural cost: a monolithic agent, a frontend orchestrating multi-step plans, twenty-six phases of accumulated workarounds, and three dead code paths. Latency was unpredictable, the rendering layer was fragile, and adding a domain meant editing four files in two repositories.

v2 had to keep the wins (golden cache, fallback layer, Recharts rendering) and discard the orchestration pattern entirely. The brief: an analytics platform with five modules, a clean LangGraph spine, and transparency as a first-class product feature, not a debug tool.

Architecture

A single Databricks App ships backend and frontend in one deploy. The backend is a LangGraph StateGraph with five specialist nodes; the frontend renders Recharts deterministically and exposes the trace, the SQL, and the routing decision next to every answer.

Supervisor split into three traceable nodes — rewrite, intent, routing — instead of a single opaque step.
Tier 1: Vector Search-backed golden SQL cache matched on cosine similarity, with an LLM arbitrator in the gray zone.
Tier 2: text-to-SQL with a five-stage validator enforcing CTE aggregation followed by FULL OUTER JOIN over the financial mart.
Tier 3: Genie Spaces fallback for the long tail that fits neither the cache nor the validator.
LLM layer with `ChatDatabricks.with_fallbacks([Sonnet → Haiku → backup models])` — resilience at library level.
State persisted in Lakebase Postgres via the LangGraph checkpointer; survives proxy timeout and browser refresh.
MLflow `langchain.autolog()` instruments every node span for LLM auditability.
React/TypeScript frontend renders Recharts deterministically — same data, same chart, always.
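
The tier boundaries above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's code: only the 0.90 and 0.80 thresholds come from this write-up, while `Route`, `route_question`, and the `arbitrator` callable are hypothetical names, and the assumption that an arbitrator rejection falls through to text-to-SQL is mine. The Tier 3 Genie fallback is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable

# Thresholds quoted in this case study; everything else is illustrative.
DIRECT_HIT = 0.90
GRAY_ZONE_FLOOR = 0.80


@dataclass(frozen=True)
class Route:
    tier: str       # "golden_cache" or "text_to_sql"
    rationale: str  # human-readable reason, suitable for showing next to the answer


def route_question(similarity: float, arbitrator: Callable[[], bool]) -> Route:
    """Pick a tier from the best cosine similarity against the golden cache.

    `arbitrator` stands in for the LLM call that judges whether the cached
    question is a true paraphrase; it is invoked only inside the gray zone.
    """
    if similarity >= DIRECT_HIT:
        return Route("golden_cache", f"cosine {similarity:.2f} >= {DIRECT_HIT:.2f}")
    if similarity >= GRAY_ZONE_FLOOR:
        if arbitrator():
            return Route("golden_cache", f"cosine {similarity:.2f} in gray zone, arbitrator accepted")
        # Assumption: a rejected paraphrase is treated like a cache miss.
        return Route("text_to_sql", f"cosine {similarity:.2f} in gray zone, arbitrator rejected")
    return Route("text_to_sql", f"cosine {similarity:.2f} < {GRAY_ZONE_FLOOR:.2f}")
```

Returning the rationale with the tier keeps the routing decision displayable, which is the point of exposing it beside each answer.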

Key decisions

Multi-agent over monolith. Specialist nodes per domain instead of a single agent. Orchestration belongs server-side, not in a React component: the migration removes multi-step plans from the frontend and gives each node a bounded, testable responsibility.
Three-tier cache with gray-zone arbitration. Cosine ≥0.90 executes directly; the 0.80-0.90 band fires an LLM arbitrator; <0.80 falls through to text-to-SQL. Most queries never reach Genie, and paraphrases in the gray zone are never silently classified as misses.
`ChatDatabricks` + `.with_fallbacks()` + autolog. Replaces roughly three hundred lines of custom retry, logging, and parsing code. Resilience lives at the library layer, not sprinkled across nodes: a single swap absorbs the entire concern and shrinks the maintenance surface.
Validator-enforced SQL pattern. The critic enforces CTE aggregation followed by FULL OUTER JOIN over the financial mart. It is a guardrail in code, not in a prompt: it blocks silently multiplied rows before a wrong number ever reaches an executive screen.
Job-based execution plus Lakebase checkpointer. Survives the ninety-second proxy timeout and a browser refresh, and enables HITL via `interrupt_before` once the flow needs it. State persistence stops being a frontend workaround and becomes a backend primitive.
Deterministic chart selection in two places. Backend `chart_spec.py` and frontend `chart-selector.ts`, with parity enforced by tests. Same data, same chart, always: no client-side surprises during an executive review.
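
To make the last decision concrete, here is a hedged sketch of what a deterministic selector on the backend side could look like. The rules, the chart names, and the `select_chart` function are assumptions for illustration only; the actual contents of `chart_spec.py` are not reproduced here. What matters is the property: the choice is a pure function of the result set, with no randomness and no model call.

```python
from typing import Any, Mapping, Sequence

# Hypothetical column names treated as a time axis.
TIME_COLUMNS = {"date", "month", "quarter", "year", "period"}


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def select_chart(rows: Sequence[Mapping[str, Any]]) -> str:
    """Return a chart type from the shape of the result set alone."""
    if not rows:
        return "empty"
    columns = list(rows[0].keys())
    numeric = [c for c in columns if all(_is_number(r[c]) for r in rows)]
    categorical = [c for c in columns if c not in numeric]
    if len(rows) == 1 and len(numeric) == 1:
        return "kpi"    # a single figure
    if numeric and any(c.lower() in TIME_COLUMNS for c in categorical):
        return "line"   # a measure over a time axis
    if numeric and len(categorical) == 1:
        return "bar"    # a measure per category
    return "table"      # anything else is shown as-is
```

A parity test would then feed the same fixtures to this function and to the TypeScript selector and compare the returned chart types.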

Lessons learned

Replacing a custom `call_llm()` helper with `ChatDatabricks` plus `.with_fallbacks()` and autolog deleted roughly three hundred lines of retry, logging, and parsing — one library swap, an entire concern absorbed.
Phase 26 was the lesson. The v1 frontend orchestrated plan→step→synthesise loops because the backend could not. LangGraph eliminates the need: the clean rewrite was cheaper than the retrofit because the v1 orchestration was load-bearing tech debt.
Transparency as differentiator. Showing the SQL, the routing decision, and the per-node trace is what wins against an internal copilot of comparable raw capability. Executives accept "the model" when they can see the work and reject it when they cannot.
Empirical Gap discipline. Some pieces are logically complete but cannot be validated without production access; marking them explicitly prevents shipped work from looking unfinished.
Package conflict caught early. `databricks_langchain` (not `langchain_databricks`) is the namespace compatible with the Vector Search pin — codified as a permanent decision so it never recurs.
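
For readers unfamiliar with the fallback pattern mentioned in the first lesson, its behaviour can be pictured as an ordered try-next loop. The sketch below is plain Python and deliberately does not use the `ChatDatabricks` or LangChain API; `call_with_fallbacks`, `primary`, and `secondary` are hypothetical stand-ins.

```python
from typing import Callable, Sequence


def call_with_fallbacks(prompt: str, models: Sequence[Callable[[str], str]]) -> str:
    """Try each model in order and return the first successful response."""
    errors: list[Exception] = []
    for model in models:
        try:
            return model(prompt)
        except Exception as exc:  # a real chain would narrow the exception types
            errors.append(exc)
    raise RuntimeError(f"all {len(models)} models failed: {errors!r}")


def primary(prompt: str) -> str:
    # Simulated outage of the preferred endpoint.
    raise TimeoutError("primary endpoint timed out")


def secondary(prompt: str) -> str:
    return f"secondary answered: {prompt}"
```

Calling `call_with_fallbacks("net sales?", [primary, secondary])` returns the secondary model's answer. Keeping this loop in one library-level wrapper, instead of repeating it in each node, is what removes the custom retry code.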

Status & roadmap

Current state: Backend and frontend feature-complete on the five-domain scope; LangGraph multi-agent spine running, three-tier cache live, and the ChatDatabricks fallback chain in production covering Sonnet, Haiku, and backup models.
Next steps: Activation of HITL via the `interrupt_before` checkpoint to confirm high-impact steps, production deployment of the four-LLM endpoint smoke test, and promotion of the MLflow Experiment to the production workspace with observability aligned to the platform team.