The methodology

Why it won’t hallucinate.

The defensibility doesn’t come from a smarter LLM. It comes from architecture: five reasoning stages where the model plans and writes, but never retrieves, never calculates, and never validates its own output.

Three principles

Privacy-first

Evidence stays in your account. Documents are encrypted at rest and in transit. AI analysis is transient: model providers retain no data beyond the API call.

Citation-grounded

Every factual claim in a draft traces to a verbatim quote with page number and source document identifier. The system refuses to generate a claim it cannot verify against uploaded evidence.

Jurisdiction-aware

Financial calculations use state-specific rule packs with statutory citations. Support figures cite ORS Chapter 25 and the Oregon Child Support Guidelines — not generic formulas.

Five-stage process

Comprehensiveness

Ingest

Every uploaded page is parsed into semantic chunks. Tables are extracted separately. Text is embedded with a legal-domain vector model (voyage-law-2) and indexed with BM25 for hybrid retrieval. Pages are preserved with byte offsets so every citation is traceable.

Tools: PDF parsing (PyMuPDF, pdfplumber), pytesseract (OCR), voyage-law-2 embeddings, BM25 index
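To make the byte-offset idea concrete, here is a minimal sketch of how a page might be split into chunks that remember where they came from. It assumes plain paragraph splitting as a stand-in for semantic chunking, and the names Chunk and chunk_page are hypothetical, not the product's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    page: int
    start: int  # byte offset into the page's UTF-8 text (inclusive)
    end: int    # byte offset (exclusive)
    text: str


def chunk_page(doc_id: str, page: int, page_text: str) -> list[Chunk]:
    """Split one page into paragraph chunks, keeping UTF-8 byte offsets.

    Assumption: paragraphs are separated by a blank line. A real pipeline
    would use a semantic splitter; only the offset bookkeeping matters here.
    """
    raw = page_text.encode("utf-8")
    chunks: list[Chunk] = []
    cursor = 0
    for para in raw.split(b"\n\n"):
        start, end = cursor, cursor + len(para)
        cursor = end + 2  # step over the two-byte blank-line separator
        if para.strip():
            chunks.append(Chunk(doc_id, page, start, end, para.decode("utf-8")))
    return chunks
```

Because each chunk carries its document, page, and byte range, a later stage can slice the stored page bytes at [start:end] and recover exactly the text the chunk was built from.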

Structured pursuit

Extract

A reasoning plan is built before any retrieval. The system decomposes the question into a tree of sub-questions, each with a retrieval strategy. The plan is the contract: every output section traces to a node in the plan.

Tools: Claude Sonnet (planning only), structured output schema, no evidence access at this stage
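One way to picture "the plan is the contract" is a small tree structure plus a check that every output section points at a node in it. The sketch below is illustrative only: PlanNode, its fields, and untraced_sections are hypothetical names, and the strategy strings are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    node_id: str
    question: str
    strategy: str  # hypothetical labels, e.g. "bm25", "vector", "sql"
    children: list["PlanNode"] = field(default_factory=list)


def node_ids(node: PlanNode) -> set[str]:
    """Collect the ids of a node and all of its descendants."""
    ids = {node.node_id}
    for child in node.children:
        ids |= node_ids(child)
    return ids


def untraced_sections(plan: PlanNode, sections: dict[str, str]) -> list[str]:
    """Return titles of output sections whose node id is not in the plan.

    `sections` maps a section title to the plan node id it claims to answer.
    """
    known = node_ids(plan)
    return [title for title, nid in sections.items() if nid not in known]
```

An empty result means every section traces to the plan; anything returned is a section with no sub-question behind it.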

Grounding

Analyze

Hybrid retrieval executes the plan. BM25 and vector search surface candidate passages. A reranker scores relevance. Numerical values are fetched by deterministic SQL queries against the evidence registry, so the LLM never reads raw PDFs and never does arithmetic.

Tools: Hybrid retrieval (BM25 + voyage-law-2), Cohere reranker, deterministic SQL for numbers
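The two halves of this stage can be sketched in a few lines. The page does not say how the BM25 and vector rankings are merged, so reciprocal rank fusion is used here as one common choice, not as the product's method. The table evidence_values and its columns are likewise hypothetical; amounts are stored as integer cents so the database, not the model, does the arithmetic.

```python
import sqlite3


def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists of chunk ids with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) per item; ties break by chunk id.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def total_cents(conn: sqlite3.Connection, doc_id: str, label: str) -> int:
    """Sum a labeled amount for one document with a parameterized query."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount_cents), 0) FROM evidence_values "
        "WHERE doc_id = ? AND label = ?",
        (doc_id, label),
    ).fetchone()
    return row[0]
```

The fused list would then go to the reranker, while any figure that appears in a draft comes from a query like total_cents rather than from generated text.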

Honest uncertainty

Synthesize

Retrieved evidence is classified: corroborated, contradicted, single-sourced, or gap. Missing facts are marked [GAP] — never papered over. Contradictions are flagged by priority class before any draft is generated.

Tools: Evidence reconciliation engine, contradiction classifier (3 priority classes), gap detection
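The four evidence states lend themselves to a small deterministic rule. The sketch below assumes each fact arrives as a list of (source id, normalized value) pairs; that input shape and the names Status, classify, and render are assumptions made for illustration. The three priority classes are not defined on this page, so they are left out.

```python
from enum import Enum


class Status(Enum):
    CORROBORATED = "corroborated"
    CONTRADICTED = "contradicted"
    SINGLE_SOURCED = "single-sourced"
    GAP = "gap"


def classify(observations: list[tuple[str, str]]) -> Status:
    """Classify one fact from its (source_id, value) observations."""
    if not observations:
        return Status.GAP
    values = {value for _, value in observations}
    if len(values) > 1:
        return Status.CONTRADICTED  # sources disagree on the value
    sources = {source for source, _ in observations}
    return Status.CORROBORATED if len(sources) > 1 else Status.SINGLE_SOURCED


def render(fact: str, observations: list[tuple[str, str]]) -> str:
    """Show a missing fact as [GAP] instead of silently omitting it."""
    status = classify(observations)
    if status is Status.GAP:
        return f"[GAP] {fact}"
    return f"{fact}: {observations[0][1]} ({status.value})"
```

Running the classification before drafting means a contradiction or gap is already recorded in the evidence state when the model starts writing.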

Defensibility

Draft

Drafts are generated only from the reconciled evidence state. A deterministic Python validator checks every generated quote against source bytes — and can fail the run if a quote doesn't match. Every output is marked [DRAFT — ATTORNEY REVIEW REQUIRED].

Tools: Claude Sonnet (synthesis only), Python citation validator, DOCX/PDF/MD export, UTCR formatting
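A quote check of this kind needs no model at all. The following is a minimal sketch, assuming each citation stores a document id, page number, and byte range, and that page bytes are available in a dictionary keyed by (doc_id, page). Citation, CitationError, and validate are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    doc_id: str
    page: int
    start: int  # byte offset into the page's UTF-8 text (inclusive)
    end: int    # byte offset (exclusive)
    quote: str


class CitationError(ValueError):
    """Raised when a quoted passage does not match its source bytes."""


def validate(citations: list[Citation], pages: dict[tuple[str, int], bytes]) -> None:
    """Fail the run on the first quote that is not byte-identical to its source."""
    for c in citations:
        source = pages.get((c.doc_id, c.page))
        if source is None:
            raise CitationError(f"unknown source: {c.doc_id} p.{c.page}")
        if source[c.start:c.end] != c.quote.encode("utf-8"):
            raise CitationError(f"quote mismatch: {c.doc_id} p.{c.page}")
```

Comparing bytes rather than asking a model whether the quote looks right keeps the check deterministic: the same draft and the same evidence always pass or fail the same way.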