Ask in Plain Language, Get Trustworthy Answers — With Evidence
AI Business Intelligence (AI BI) lets executives, analysts, and frontline teams ask questions in natural language and get grounded, cited answers—plus automated insights, narratives, and actions—on top of your governed data.
SolveForce builds AI BI as a system: governed metrics → semantic layer → guarded RAG over docs & dashboards → NL→SQL with safety rails → anomaly & forecast services—wired to SIEM/SOAR so you can prove accuracy and access controls.
Connective tissue:
🏛️ Data platform → /data-warehouse • 🔄 Pipelines → /etl-elt
🧭 Governance → /data-governance • 🔐 Access → /iam / /ztna
🧠 Retrieval → /vector-databases • 🤫 Privacy → /dlp
📊 Evidence/Automation → /siem-soar
🎯 Outcomes (Why SolveForce AI BI)
- Executive speed — ask “What drove margin delta last quarter?” and get quantified, cited answers in seconds.
- Analyst leverage — NL→SQL on governed metrics; auto-joins and filters that respect row/column security.
- Intelligent alerts — anomaly detection + narratives with “why” factors and links to source.
- Planning & what-if — scenario modeling on curated data with traced assumptions.
- Trust by design — cite-or-refuse answers; if evidence is insufficient, the system says so.
🧭 Scope (What We Build & Operate)
- Semantic layer & metrics — business definitions, dimensions, time-grain, and calc logic; dbt/metric store/LookML equivalents.
- NL→SQL & analysis — LLM planner + SQL generator in a sandbox with guardrails, cost limits, and safe templates.
- Guarded RAG — vector search over curated docs, dashboards, runbooks with pre-filters (labels/ACLs/region) before ANN. → /vector-databases
- Insight services — anomalies, seasonality, forecasts, driver attribution (Shapley/SHAP-style), cohort & funnel patterns.
- Narratives & actions — executive summaries, data stories, and triggered workflows (tickets, alerts) under policy.
- Access & privacy — SSO/MFA, RLS/CLS, DLP/tokens for PII/PHI/PAN; audit to SIEM. → /iam • /dlp • /siem-soar
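To make the "pre-filters before ANN" idea concrete, here is a minimal sketch of a guarded retrieval step: ACL, residency, and sensitivity-label filters run before any similarity ranking, so out-of-scope documents never enter the candidate set. All names (`DOCS`, `guarded_search`, the label tiers) are illustrative, not a specific product API.

```python
import math

# Illustrative document records: each carries ACL groups, a sensitivity
# label, a residency region, and a precomputed embedding vector.
DOCS = [
    {"id": "margin-runbook", "groups": {"finance"}, "label": "internal",
     "region": "us", "vec": [0.9, 0.1]},
    {"id": "board-deck-q3", "groups": {"exec"}, "label": "confidential",
     "region": "us", "vec": [0.8, 0.3]},
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def guarded_search(query_vec, user_groups, user_region, max_label, k=5):
    """Apply label/ACL/region pre-filters BEFORE ranking by similarity."""
    rank = {"public": 0, "internal": 1, "confidential": 2}
    candidates = [
        d for d in DOCS
        if d["groups"] & user_groups             # ACL pre-filter
        and d["region"] == user_region           # residency pre-filter
        and rank[d["label"]] <= rank[max_label]  # label pre-filter
    ]
    scored = sorted(candidates, key=lambda d: cosine(query_vec, d["vec"]),
                    reverse=True)
    return [d["id"] for d in scored[:k]]

# A finance analyst cleared for "internal" sees only the runbook,
# even though the board deck is semantically closer to some queries.
print(guarded_search([1.0, 0.0], {"finance"}, "us", "internal"))
```

In production the same pre-filters would be pushed down into the vector database as metadata filters rather than applied in application code.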
🧱 Building Blocks (Spelled Out)
- Governed core
- Warehouse/lakehouse tables with freshness SLAs; lineage & contracts. → /data-warehouse • /etl-elt
- Semantic layer (metrics, dims, scopes), versioned in Git; PR reviews.
- Retriever & planner
- Hybrid search (keyword + vector) over docs/dashboards with ontology synonyms; label/ACL pre-filters.
- NL intent → metric mapping → query plan → parameterized SQL generation with allow-listed functions.
- Execution safety
- Read-only role; result row/column masking; query budget & timeouts; auto-sampling for heavy scans.
- Refusal ledger when evidence is insufficient or user lacks rights.
- Answer composer
- Inline citations to tables/dashboards/docs; chart suggestions; attach SQL & warehouse profile (bytes scanned, slot secs) for transparency.
- Observability & eval
- Question/answer store, human votes, offline eval sets, prompt A/B, drift monitors; precision@k and refusal correctness tracked.
- Privacy & residency
- DLP labels, tokenization; region-pinned indices; redaction templates for narratives.
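The execution-safety block above can be sketched as a pre-flight SQL validator: single read-only statement, allow-listed functions only, and a mandatory row limit. The keyword lists and `validate_sql` helper are assumptions for illustration; a real deployment would also enforce the read-only role and timeouts at the warehouse.

```python
import re

# Illustrative guardrails for generated SQL.
ALLOWED_FUNCS = {"sum", "avg", "count", "min", "max", "date_trunc"}
WRITE_KEYWORDS = re.compile(
    r"\b(insert|update|delete|merge|drop|alter|create|grant|truncate)\b",
    re.IGNORECASE)

def validate_sql(sql: str, max_rows: int = 10_000) -> str:
    """Reject anything that is not a single, bounded SELECT using
    allow-listed functions; append a LIMIT if the planner omitted one."""
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:
        raise ValueError("multiple statements are not allowed")
    if WRITE_KEYWORDS.search(stripped):
        raise ValueError("write/DDL keywords are not allowed")
    if not stripped.lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    for func in re.findall(r"([a-z_]+)\s*\(", stripped, re.IGNORECASE):
        if func.lower() not in ALLOWED_FUNCS:
            raise ValueError(f"function not on allow-list: {func}")
    if not re.search(r"\blimit\s+\d+\b", stripped, re.IGNORECASE):
        stripped += f" LIMIT {max_rows}"
    return stripped

print(validate_sql("SELECT sum(revenue) FROM marts.margin GROUP BY region"))
```

Running in a sandboxed, read-only role means even a validator miss cannot mutate data; the validator mainly bounds cost and blast radius.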
🧩 Reference Architectures (Choose Your Fit)
A) Executive Q&A (C-suite Copilot)
- NL→SQL with semantic layer + guarded RAG over board decks; “driver analysis” cards; exportable briefing pack with citations.
B) Analyst Workspace (NL→SQL + Notebooks)
- NL→SQL → review/approve SQL in sandbox → commit to notebook; lineage auto-links; query cost & runtime surfaced.
C) Operational Insights & Alerts
- Streaming anomalies on KPIs with narratives (“Cart conversion −2.1% driven by mobile iOS vX in region Y”); ticket to owner with links.
D) Self-Service for GTM (Sales/Marketing)
- Scoped metric views with RLS; campaign lift, cohort, LTV segmentation; one-click chart → deck narrative.
E) Finance & Supply Chain Planning
- What-if levers (price, demand, lead time) on curated cubes; scenario comparison with assumptions log.
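The what-if pattern in architecture E reduces to applying levers to a curated baseline while recording every assumption for traceability. A minimal sketch, with hypothetical metric names and multiplicative levers:

```python
def run_scenario(base, levers, assumptions_log):
    """Apply multiplicative levers to baseline figures and log each
    assumption alongside the result so the scenario is auditable."""
    result = dict(base)
    for metric, factor in levers.items():
        result[metric] = round(base[metric] * factor, 2)
        assumptions_log.append(f"{metric} scaled by {factor}")
    return result

log = []
baseline = {"demand_units": 10_000, "unit_price": 25.0, "lead_time_days": 14}
scenario = run_scenario(baseline, {"unit_price": 1.05, "demand_units": 0.97}, log)
print(scenario["unit_price"], scenario["demand_units"])  # 26.25 9700.0
print(log)
```

Comparing scenarios is then a matter of diffing result dicts plus their assumption logs, which is what the "scenario comparison with assumptions log" bullet implies.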
📐 SLO Guardrails (Experience & Quality)
| Domain | SLO / KPI | Target |
|---|---|---|
| Q&A | Answer latency (p95) | ≤ 2–6 s (incl. retrieval & compose) |
| Q&A | Precision@K (gold Q/A) | ≥ 92–95% |
| Q&A | Citation coverage | = 100% |
| Q&A | Refusal correctness | ≥ 98% |
| Dashboards | Load time (p95) | ≤ 2–5 s in-region |
| Data | Freshness (hot marts) | ≤ 15–60 min |
| Access | RLS/CLS policy errors | = 0 in prod |
| Cost | $/question (p50) | Budgeted per domain via cache/sharding |
SLO breaches auto-open tickets and trigger SOAR playbooks (fall back to a canned view, throttle heavy queries, rebuild the affected embedding shard). → /siem-soar
🔒 Controls & Anti-Patterns
- Controls: RLS/CLS, label pre-filters, read-only service role, query allow-list, PII tokenization, refusal ledger, cite-or-refuse policy.
- Anti-patterns: blind NL→SQL without semantic layer; exposing unrestricted ad-hoc writeback; letting LLM “invent” metrics; sending raw PII to external models.
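The cite-or-refuse policy above can be sketched as a final compose step: an answer ships only with at least one citation to a governed source, and every refusal is appended to a ledger for audit. `compose_answer` and its field names are illustrative assumptions.

```python
MIN_CITATIONS = 1  # policy: no answer leaves without evidence

def compose_answer(draft, citations, user_has_access, ledger):
    """Cite-or-refuse: return the answer with inline sources, or a
    refusal recorded in the ledger when evidence or rights are missing."""
    if not user_has_access:
        ledger.append({"reason": "insufficient_rights"})
        return {"answer": None,
                "refusal": "You lack access to the underlying data."}
    if len(citations) < MIN_CITATIONS:
        ledger.append({"reason": "insufficient_evidence", "draft": draft})
        return {"answer": None,
                "refusal": "Not enough evidence to answer this question."}
    return {"answer": f"{draft} [sources: {', '.join(citations)}]",
            "refusal": None}

ledger = []
print(compose_answer("Margin fell 1.8 pts on EMEA mix shift",
                     ["marts.margin_q3", "dash/exec-margin"], True, ledger))
```

The refusal ledger doubles as an evaluation asset: refusal correctness (see the SLO table) is scored against it.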
📊 Observability & Evidence
- Question log (user, policy, dataset scope), retrieval set, SQL plan, bytes scanned, answer + citations, refusal reason.
- Dashboards: freshness, precision@k, refusal rate, coverage of defined metrics, $/question, cache hit-rate.
- All events → SIEM; SOAR playbooks to auto-quarantine bad prompt templates, revoke access, or pin heavy queries. → /siem-soar
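Two of the tracked quality metrics are simple to state precisely. A minimal sketch of how precision@k and refusal correctness could be computed over an offline eval set (the gold-set shapes here are assumptions):

```python
def precision_at_k(retrieved, relevant, k=5):
    """Fraction of the top-k retrieved items that appear in the gold
    relevant set for the question."""
    top = retrieved[:k]
    if not top:
        return 0.0
    return sum(1 for doc in top if doc in relevant) / len(top)

def refusal_correctness(results):
    """Share of eval questions where the system refused exactly when
    the gold label says it should have (no evidence or no rights)."""
    correct = sum(1 for r in results if r["refused"] == r["should_refuse"])
    return correct / len(results)

# Two of the top three retrieved docs are in the gold set.
print(precision_at_k(["a", "b", "c"], {"a", "c"}, k=3))
```

Averaging precision@k across the gold Q/A set and trending it per release is what the ≥ 92–95% SLO target refers to.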
💸 FinOps for AI BI (Cost That Behaves)
- Per-domain query budgets, slot/time caps, and anomaly alerts.
- Caching layers (semantic result cache, chart cache, vector cache) with TTLs and invalidation on data change.
- Unit economics: $/question, $/dashboard load, $/GB scanned surfaced to owners. → /finops
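The semantic result cache bullet can be sketched with a TTL cache keyed by (normalized question, dataset version); bumping the version on data change invalidates stale answers without explicit eviction. The class and its interface are illustrative.

```python
import time

class SemanticResultCache:
    """TTL cache keyed by (normalized question, dataset version).
    Invalidation on data change = publishing a new dataset version."""

    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self.store = {}

    def _key(self, question, dataset_version):
        return (question.strip().lower(), dataset_version)

    def get(self, question, dataset_version):
        entry = self.store.get(self._key(question, dataset_version))
        if entry and time.monotonic() - entry["at"] < self.ttl:
            return entry["answer"]
        return None

    def put(self, question, dataset_version, answer):
        self.store[self._key(question, dataset_version)] = {
            "answer": answer, "at": time.monotonic()}

cache = SemanticResultCache(ttl_seconds=60)
cache.put("what drove margin delta?", "v42", "Mix shift in EMEA")
print(cache.get("What drove margin delta?", "v42"))  # hit: case-normalized
print(cache.get("What drove margin delta?", "v43"))  # miss: new data version
```

Every cache hit avoids a warehouse scan, which is how the cache directly lowers the $/question unit economics surfaced to owners.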
🛠️ Implementation Blueprint (No-Surprise Rollout)
1) Protect surface & KPIs — list metrics, owners, SLOs; map to tables & dashboards.
2) Semantic layer — define metrics/dims in Git; tests; PR approvals.
3) Pipelines — curate marts; contracts & DQ tests; lineage in catalog. → /etl-elt • /data-governance
4) Retriever — build vector indices with label/ACL pre-filters; ontology synonyms (acronyms). → /vector-databases
5) NL→SQL sandbox — allow-lists, parameterized templates, budget/timeouts, RLS/CLS enforcement.
6) Composer — answer with charts + citations; refusal ledger when insufficient evidence.
7) Privacy & access — SSO/MFA; ZTNA for private apps; DLP for narratives. → /iam • /ztna • /dlp
8) Observability — eval sets, precision@k, $/question; logs to SIEM; SOAR guardrails. → /siem-soar
9) Pilot & rings — exec Q&A → analyst workspace → GTM ops; A/B prompts; publish wins & gaps.
✅ Pre-Engagement Checklist
- 📈 KPI list and metric definitions (owner, calc, SLA).
- 🗂️ Source tables/marts; lineage & DQ posture; freshness SLOs.
- 🔐 RLS/CLS rules; PII/PHI/PAN labels; DLP/tokens.
- 👤 Identity model (SSO/MFA), groups/roles; ZTNA scope.
- 🧠 Embedding/LLM choices; vector index shards; ontology synonyms/acronyms.
- 💸 Budget targets ($/question, $/dashboard); cache policy; concurrency goals.
- 📊 SIEM/SOAR destinations; evaluation set and acceptance criteria.
🔄 Where AI BI Fits (Recursive View)
1) Grammar — data rides /connectivity & /networks-and-data-centers.
2) Syntax — curated truth in /data-warehouse via /etl-elt feeds AI BI.
3) Semantics — /data-governance + /cybersecurity preserve truth & privacy.
4) Pragmatics — /solveforce-ai retrieves with guardrails and cites or refuses.
5) Foundation — consistent terms via /primacy-of-language.