AI Risk Toolkit (Hallucinations, Bias & Guardrails)
Practical tools for managing AI risk in credit: how to catch hallucinations, correct for bias, and build guardrails before automated decisions go wrong.
Preventing Hallucinations, Bias & Other AI Failure Modes
A practical, wall-worthy handbook for deploying AI safely across the credit lifecycle—complete with a workflow map, copy-paste prompts, and implementation checklists.
✕No speculation: disallow “best guess” fields without evidence.
3) Answerability & refusal rules (Guardrail)
✓Add a first-step classifier: “Answerable from current sources?” → if not, return a NEEDS_DATA status listing the missing fields.
✓Set confidence thresholds to trigger human review or require more data.
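The first-step gate can be sketched as a deterministic pre-check that runs before any generation. This is a minimal sketch; the field names and source shape are illustrative assumptions, not a prescribed schema.

```python
def answerability_gate(required_fields, sources):
    """First-step classifier: can the question be answered from current sources?

    `required_fields` and the dict shape of `sources` are illustrative.
    Returns a NEEDS_DATA status with the missing fields, or ANSWERABLE.
    """
    missing = [f for f in required_fields
               if f not in sources or sources[f] is None]
    if missing:
        return {"status": "NEEDS_DATA", "missing": missing}
    return {"status": "ANSWERABLE"}
```

Because the gate is plain code rather than model output, a refusal is guaranteed whenever required inputs are absent, regardless of how confidently the model would have answered.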
4) Live validation hooks (Production)
✓Post-gen checks: totals, date ranges, currency units, customer ID matching.
✓Block release if validation fails; show user the failing check.
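The post-generation hooks above can be implemented as a small pure function that returns the failing checks (empty list means release is allowed). A minimal sketch, assuming illustrative field names for the generated output and the reference record:

```python
from datetime import date

def post_generation_checks(output, reference):
    """Deterministic post-gen checks on a generated credit summary.

    Field names (`line_items`, `as_of`, etc.) are illustrative assumptions.
    Returns the list of failing checks; release is blocked if non-empty.
    """
    failures = []
    # Totals: line items must sum to the stated total (cent tolerance).
    if abs(sum(output["line_items"]) - output["total"]) > 0.01:
        failures.append("TOTAL_MISMATCH")
    # Date range: as-of date must fall inside the reporting window.
    if not (reference["period_start"] <= output["as_of"] <= date.today()):
        failures.append("DATE_OUT_OF_RANGE")
    # Currency units must match the source record.
    if output["currency"] != reference["currency"]:
        failures.append("CURRENCY_MISMATCH")
    # Customer ID matching: never release data for the wrong account.
    if output["customer_id"] != reference["customer_id"]:
        failures.append("CUSTOMER_ID_MISMATCH")
    return failures
```

Surfacing the returned list directly in the UI satisfies the “show user the failing check” rule without extra plumbing.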
Bias & Fairness Controls (B2B Credit)
What to guard against
!Proxy bias: Using non-financial attributes (e.g., location/site reviews) that proxy company size/sector in harmful ways.
!Data gaps bias: Thin-file SMEs vs well-reported enterprises; older vs younger firms.
!News imbalance: Negative media gets over-weighted relative to audited financials.
Mitigations
✓Separate signals from decisions: AI surfaces evidence; policy engine applies approved rules.
✓Group-level evaluation: compare approval/limit outcomes across segments (size, industry) for unexplained disparities.
✓Use feature whitelists and weight caps; log every feature contributing to the recommendation.
✓Counterfactual tests: swap sensitive proxies (e.g., HQ region) while holding financials constant—expect similar outcomes.
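The counterfactual test in the last bullet can be automated: re-score the same record with only the sensitive proxy swapped and flag any value that moves the score beyond a tolerance. A sketch under illustrative assumptions (a scalar `score_fn`, a 0.05 tolerance):

```python
def counterfactual_check(score_fn, record, proxy_field, alt_values, tolerance=0.05):
    """Swap a sensitive proxy (e.g. HQ region) while holding financials constant.

    Returns the proxy values that shift the score by more than `tolerance`;
    an empty list means the model passed this counterfactual test.
    """
    base = score_fn(record)
    flagged = []
    for value in alt_values:
        variant = {**record, proxy_field: value}  # financials unchanged
        if abs(score_fn(variant) - base) > tolerance:
            flagged.append(value)
    return flagged
```

Running this over a sample of records per release gives a cheap regression signal for proxy bias before any group-level statistics are computed.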
Compliance posture — document what’s automated vs advisory
Document that AI outputs are advisory unless a policy rule triggers an automated hold. Maintain an audit trail: inputs → features → model version → explanation → decision owner.
Guardrails: Data, Policy & UX Patterns
Data Controls
✓Source tagging: source_ref for every fact (ERP row ID, bureau report ID, doc page).
✓Freshness SLAs: define max age per field (e.g., financials ≤ 12 months, liens ≤ 7 days).
✓Schema validation: reject incomplete or malformed retrieval chunks.
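Freshness SLAs are easiest to enforce as a per-field max-age table checked at retrieval time. A minimal sketch; the SLA values mirror the examples above and the field names are illustrative:

```python
from datetime import date, timedelta

# Max age per field (illustrative, matching the SLA examples above).
FRESHNESS_SLA = {
    "financials": timedelta(days=365),
    "liens": timedelta(days=7),
}

def stale_fields(field_dates, today=None):
    """Return the fields whose last-updated date breaches its freshness SLA."""
    today = today or date.today()
    return [field for field, updated in field_dates.items()
            if field in FRESHNESS_SLA and today - updated > FRESHNESS_SLA[field]]
```

A retrieval chunk carrying any stale field can then be rejected the same way a schema failure is, so stale and malformed data flow through one rejection path.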
Policy Controls
✓Thresholds: auto-hold when exposure ≥ limit or when risk score crosses defined gates.
✓Exception workflow with named approver and expiry date.
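The threshold rule above is the policy engine’s core: a deterministic function that decides when AI output stops being advisory. A sketch with an illustrative risk gate:

```python
def policy_decision(exposure, limit, risk_score, risk_gate=0.8):
    """Deterministic policy gate: auto-hold on threshold breach.

    The 0.8 risk gate is an illustrative default; real gates come from
    approved credit policy, not from the model.
    """
    if exposure >= limit or risk_score >= risk_gate:
        return "AUTO_HOLD"
    return "ADVISORY"
```

Keeping this logic outside the model is what makes the earlier “separate signals from decisions” mitigation auditable: the hold is reproducible from logged inputs alone.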
UX Controls
✓Show confidence & sources inline; one-click open to evidence.
✓“Why” panel with ranked reasons and their weights.
!Force user acknowledgment when acting on low-confidence suggestions.
Release Gates
✓Block deployment if eval metrics regress > defined deltas.
✓Shadow mode first; compare AI vs human outcomes before go-live.
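The regression gate can be expressed as a comparison of current eval metrics against the last released baseline, with a per-metric allowed delta. A minimal sketch; metric names and deltas are illustrative:

```python
def release_gate(current, baseline, max_regression):
    """Block deployment if any eval metric regresses beyond its allowed delta.

    `current`/`baseline` map metric name -> score (higher is better);
    `max_regression` maps metric name -> largest tolerated drop.
    """
    regressions = {metric: baseline[metric] - current[metric]
                   for metric in max_regression
                   if baseline[metric] - current[metric] > max_regression[metric]}
    return ("BLOCK", regressions) if regressions else ("ALLOW", {})
```

Returning the offending metrics (not just a boolean) gives the release log the same evidence trail the rest of the toolkit insists on.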
Testing & Evaluation Suite
Offline Evals
✓Factuality@K: % of claims with supporting sources.
✓Conformance: JSON schema pass rate.
✓Numerical accuracy: calc parity vs reference engine.
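The Factuality@K metric reduces to a simple ratio over extracted claims: the share that carry at least one supporting `source_ref`. A sketch, assuming claims are dicts in the shape used by the prompt patterns later in this guide:

```python
def factuality_at_k(claims):
    """Fraction of claims with at least one supporting source_ref.

    Assumes each claim is a dict like {"claim": "...", "source_ref": "ERP:..."}.
    """
    if not claims:
        return 0.0
    supported = sum(1 for c in claims if c.get("source_ref"))
    return supported / len(claims)
```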
Online Evals
✓Intervention rate: how often humans override.
✓Alert precision/recall: true vs false risk alerts.
✓Fairness: disparity in approvals/limits across segments.
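The fairness check can be monitored as a disparity ratio: lowest segment approval rate over highest. A sketch in the style of a four-fifths check; the segment names and any alert threshold you pair it with are illustrative:

```python
def approval_disparity(outcomes):
    """Ratio of lowest to highest approval rate across segments.

    `outcomes` maps segment -> (approved_count, total_count).
    A value near 1.0 means parity; teams often alert below ~0.8
    (an illustrative threshold, not a legal standard).
    """
    rates = {seg: approved / total
             for seg, (approved, total) in outcomes.items()}
    return min(rates.values()) / max(rates.values())
```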
Grounded Prompt Patterns (Copy & Paste)
Evidence-Only Answer
System: You are a credit analyst. Only answer from the provided sources. If insufficient, reply with {"status":"NEEDS_DATA","missing":["field1","field2"]}.
User: Using the retrieved docs and records (cited as source_ref ids), produce:
{
"summary": "...",
"facts_used": [{"claim":"...", "source_ref":"ERP:INV-102358"}],
"risk_reason_codes": ["DBT_TREND","LIEN_RECENT"],
"confidence": 0.78
}
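A reply to the Evidence-Only prompt should be machine-checked before anything is shown to a user: accept the `NEEDS_DATA` refusal, otherwise require a `source_ref` on every fact. A sketch of such a validator, assuming the JSON shapes above:

```python
import json

def parse_evidence_only(raw):
    """Validate a model reply to the Evidence-Only prompt.

    Accepts either the NEEDS_DATA refusal or a grounded answer in which
    every entry of facts_used carries a source_ref; raises otherwise.
    """
    data = json.loads(raw)
    if data.get("status") == "NEEDS_DATA":
        return data
    for fact in data["facts_used"]:
        if not fact.get("source_ref"):
            raise ValueError(f"uncited claim: {fact['claim']!r}")
    return data
```

This pairs the prompt with the “block release if validation fails” hook from earlier, so an uncited claim never reaches the analyst silently.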
Numerical Conformance
System: Do not perform free-form math. Delegate all calculations to the tools provided and cite their outputs. Reject if tools are missing.
User: Compute DSCR and Altman Z from these inputs via the calculator tool, return JSON:
{"dscr": number, "altman_z": number, "calc_source":"TOOL:FIN-CALC-1"}
Bias-Aware Recommendation
System: Provide an advisory-only recommendation. Use only approved features: {financials, payment history, liens, verified trade refs}. Disallow unapproved proxies.
User: Recommend a credit limit with:
{"limit_proposed": number, "rationale":["..."], "features_used":["DSCR","DBT_Trend","Lien_30d"], "source_refs":["BUREAU:123","ERP:CUST-778"], "confidence": 0.72}
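The approved-features constraint in this prompt is only trustworthy if it is also enforced in code on the way out. A sketch of that check; the whitelist contents are illustrative:

```python
# Illustrative whitelist; in production this comes from approved policy config.
APPROVED_FEATURES = {"DSCR", "DBT_Trend", "Lien_30d", "Trade_Refs"}

def enforce_whitelist(recommendation):
    """Reject a recommendation that used any feature outside the approved list."""
    unapproved = set(recommendation["features_used"]) - APPROVED_FEATURES
    if unapproved:
        raise ValueError(f"unapproved features: {sorted(unapproved)}")
    return recommendation
```

Logging both the declared `features_used` and this check’s outcome covers the earlier mitigation of logging every feature contributing to the recommendation.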
Incident Runbook (Hallucination or Bias Suspected)
Detect
!Spike in overrides or user “source missing” flags.
!Schema failures or validation errors post-gen.
Contain
✓Switch affected feature to advisory-only; enable manual holds.
✓Pin model version; route to fallback rule set.
Remediate
✓Patch retrieval filters/freshness; add missing calculators or sources.