End-to-End System Design and Reliability Audit
Explore the process of designing and auditing comprehensive Claude-powered AI systems. Understand how to choose agent architectures, handle partial failures, manage provenance for fair lending compliance, and evaluate quality metrics. This lesson guides you through scenario-based engineering decisions and a self-evaluation checklist to ensure production-ready reliability.
We'll cover the following:
- The scenario brief
- Step 1: Choose the architecture shape
- Step 2: Design the specialist agents
- Step 3: Design the tools
- Step 4: Design conflict resolution and escalation
- Step 5: Batch processing and context management
- Step 6: Provenance and the underwriter packet
- Step 7: Quality metrics
- The architecture sketch
- Self-evaluation checklist
- What’s next?
The scenario studio presents a realistic engineering brief, the kind of problem that appears in a certification exam question or a real architectural review. The goal is not to find the one correct answer, but to demonstrate that every major design decision can be justified using the patterns from this course: where guardrails belong, when to escalate, how to handle partial failures, what provenance to attach, and how to measure quality. After working through the scenario and the architecture, the self-evaluation checklist at the end of this lesson gives you a structured way to audit any Claude-powered system design.
The scenario brief
The following brief captures the organizational context, current state, stakeholder requirements, and operating constraints that the architecture must address. Read through it completely before moving to Step 1.
Organization: A mid-size commercial lending company processes 400–600 loan application packages per month. Each package is a set of documents: income statements, tax returns, bank statements, property appraisals, and a cover letter from the applicant.
Current state: A team of five underwriters reviews each package manually. Each underwriter takes three to four hours per package. The backlog grows during peak season. The organization wants to automate the initial extraction and risk-flagging stage to reduce the time each underwriter spends on routine data gathering while keeping humans responsible for all credit decisions.
Requirements from stakeholders:
Extract structured data from each document type: income figures, asset values, liability amounts, employment status, and property details.
Flag risk indicators: debt-to-income ratio above the threshold, income source not verifiable from the documents, appraisal date more than 90 days old, and missing required documents.
Escalate any package where extraction confidence is low, where two documents provide conflicting values for the same field, or where a required document is absent.
Produce a structured summary packet for the underwriter that shows all extracted fields, their confidence levels, their source documents, and any risk flags, with enough provenance that the underwriter can verify any value without rereading the full package.
Measure extraction ...
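To make the flagging and escalation requirements concrete before the design steps, here is a minimal Python sketch of how they might be encoded as plain rules over extracted fields. It is an illustration under stated assumptions, not the course's reference implementation: the names `ExtractedField`, `Package`, `risk_flags`, and `escalation_reasons` are hypothetical, the debt-to-income and confidence thresholds are placeholder values the brief does not specify, and treating all five listed document types as required is also an assumption. Only the 90-day appraisal limit comes from the brief.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Set

# Placeholder values: the brief names these rules but not their numbers.
MAX_DEBT_TO_INCOME = 0.45    # hypothetical risk threshold
MIN_CONFIDENCE = 0.80        # hypothetical cutoff for "low" confidence
MAX_APPRAISAL_AGE_DAYS = 90  # stated in the brief
# Assumption: every document type listed in the brief is required.
REQUIRED_DOCS = frozenset({
    "income_statement", "tax_return", "bank_statement",
    "property_appraisal", "cover_letter",
})


@dataclass(frozen=True)
class ExtractedField:
    name: str          # e.g. "annual_income"
    value: float
    confidence: float  # 0.0 to 1.0, as reported by the extraction step
    source_doc: str    # provenance: which document the value came from
    page: int          # provenance: where the underwriter can verify it


@dataclass
class Package:
    documents: Set[str]
    fields: List[ExtractedField]
    appraisal_date: Optional[date] = None
    income_verifiable: bool = True


def risk_flags(pkg: Package, today: date) -> List[str]:
    """Return the risk indicators from the brief that apply to this package."""
    # If documents disagree, the last value wins here; the disagreement
    # itself is surfaced separately by escalation_reasons().
    values = {f.name: f.value for f in pkg.fields}
    income = values.get("annual_income")
    debt = values.get("annual_debt_payments")
    flags = []
    if income and debt is not None and debt / income > MAX_DEBT_TO_INCOME:
        flags.append("debt_to_income_above_threshold")
    if not pkg.income_verifiable:
        flags.append("income_source_not_verifiable")
    if pkg.appraisal_date is not None:
        if (today - pkg.appraisal_date).days > MAX_APPRAISAL_AGE_DAYS:
            flags.append("appraisal_older_than_90_days")
    if REQUIRED_DOCS - pkg.documents:
        flags.append("missing_required_documents")
    return flags


def escalation_reasons(pkg: Package) -> List[str]:
    """Return why a package needs human attention; empty means no escalation."""
    reasons: List[str] = []
    first_seen = {}
    for f in pkg.fields:
        if f.confidence < MIN_CONFIDENCE:
            reasons.append(f"low_confidence:{f.name}")
        if f.name in first_seen and first_seen[f.name] != f.value:
            conflict = f"conflict:{f.name}"
            if conflict not in reasons:
                reasons.append(conflict)
        first_seen.setdefault(f.name, f.value)
    for doc in sorted(REQUIRED_DOCS - pkg.documents):
        reasons.append(f"missing_document:{doc}")
    return reasons


# Example package: conflicting income values, one low-confidence field,
# a stale appraisal, and no cover letter.
pkg = Package(
    documents={"income_statement", "tax_return", "bank_statement",
               "property_appraisal"},
    fields=[
        ExtractedField("annual_income", 120_000, 0.95, "tax_return", 1),
        ExtractedField("annual_income", 135_000, 0.90, "income_statement", 2),
        ExtractedField("annual_debt_payments", 66_000, 0.70, "bank_statement", 4),
    ],
    appraisal_date=date(2024, 1, 10),
)
flags = risk_flags(pkg, today=date(2024, 6, 1))
reasons = escalation_reasons(pkg)
print(flags)
print(reasons)
```

Keeping these checks as deterministic code rather than asking the model to apply them is one possible design choice: the model extracts values and reports confidence, while the thresholds stay auditable and easy to change. Every `ExtractedField` carries its source document and page, which is the kind of provenance the summary packet requirement asks for.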