ATM System Design

Table of Contents

Clarify requirements the way interviewers expect
High-level architecture: what components exist and why
Data model: ledger-first thinking (not “balance as a field”)
Walkthrough 1: balance inquiry (step-by-step)
Walkthrough 2: cash withdrawal (step-by-step)
Withdrawal state machine and reconciliation logic
Interbank routing and settlement
Fraud and abuse prevention
Consistency vs availability: realistic trade-offs
Reliability, observability, and auditability
How interviewers evaluate your ATM design
Final takeaway


In this blog, learn how to ace the ATM System Design interview by designing for ACID correctness, safe cash withdrawal workflows, and robust failure handling with reconciliation, interbank routing, and fraud prevention.


Jan 29, 2026

An ATM network looks simple from the user’s perspective: insert card, enter PIN, withdraw cash. But from a System Design interview perspective, it’s one of the cleanest tests of “real-world correctness.” You’re building a distributed system that touches a core ledger and physical hardware, and you don’t get to hand-wave failure cases away. If the system dispenses cash, the account must be debited. If cash is not dispensed, the account must not be debited (or must be reversed). Everything else is implementation detail.

This is also why ATM design is different from most web systems. In many consumer applications, eventual consistency is acceptable and failures are mostly “retry later.” In an ATM, the failure modes are messier: the network can drop after authorization, the cash dispenser can jam, the ATM can reboot mid-transaction, and foreign cards add multiple external dependencies. A strong answer shows you understand how to design for safe outcomes even when the world is unreliable.

In this blog, we’ll build an interview-ready ATM architecture, explain how transactions stay ACID-correct, introduce a withdrawal state machine (with reversal and reconciliation), cover interbank routing and settlement, and show how to think about fraud prevention and observability.


Interview signal: ATM design is not about microservices diagrams. It’s about transaction correctness under hardware and network ambiguity.

Clarify requirements the way interviewers expect#

In ATM interviews, you want to start by clarifying scope and then locking the guarantees. The scope is usually a “bank-operated ATM network” with a core banking system behind it, but you should explicitly confirm which features are in scope: withdrawals, balance inquiry, deposits, receipts, and foreign card support. You can mention deposits briefly (they often settle asynchronously), but the main focus should be withdrawals and balance inquiries because they stress correctness.

Once scope is clear, define the invariants. For withdrawals, the invariant is: a customer should not lose money without receiving cash, and the bank should not dispense cash without debiting the account. The hard part is that the ATM is a physical device and your backend doesn’t directly observe “cash delivered.” It receives a confirmation signal that can be delayed, lost, or incorrect. That’s why the withdrawal workflow must be modeled as a state machine with reconciliation.

Also, be explicit about non-functional constraints: PCI compliance, encryption, HSM usage for PIN verification, strict auditability, and extremely conservative failure handling. A good answer doesn’t claim “five nines everywhere.” Instead, it identifies which parts must be strongly consistent (ledger writes) and which can degrade gracefully (receipt printing, UI hints).

Category	Requirement	Notes (what interviewers listen for)
Functional	Authenticate user (card + PIN)	PIN verification via HSM, never plaintext
Functional	Balance inquiry	Read path can be optimized, but staleness must be bounded and holds must be reflected
Functional	Cash withdrawal	Two-phase workflow with hold/commit and dispense confirmation
Functional	Deposits	Often asynchronous credit with later verification
Functional	Foreign card support	Requires network routing + auth vs settlement separation
Non-functional	ACID correctness	Ledger is the source of truth
Non-functional	Security + compliance	PCI DSS, encryption, tamper resistance
Non-functional	Auditability	Immutable logs + dispute workflows
Non-functional	Fault tolerance	Safe handling of network and hardware failures

Common pitfall: Treating “withdrawal” as a single database update. In reality, the cash dispenser forces a multi-step transaction with uncertain outcomes.

Summary:

Confirm scope (withdrawal + balance inquiry are the core).
State invariants (no cash without debit, no debit without cash).
Call out compliance and auditability as first-class requirements.
Treat hardware and network ambiguity as expected, not rare.

High-level architecture: what components exist and why#

An ATM system is a layered network: edge devices (ATMs) connect to a bank switch, which routes requests to authorization and transaction processing, which then interacts with the core ledger. The architecture is intentionally conservative: the ledger is centralized or single-writer to preserve correctness, while the edge can be distributed to scale.

At the edge, the ATM is effectively a specialized client with hardware peripherals: card reader, PIN pad, receipt printer, cash dispenser, and sensors. It must encrypt sensitive inputs and operate safely under timeouts. The ATM does not “decide” financial truth; it requests authorization and follows backend instructions.

In the middle, you typically have an ATM switch (or network gateway) that terminates device connections, applies routing and basic policy, and forwards to internal services. For foreign cards, the switch routes to external card networks. Behind the switch sits the Transaction Processing System (TPS) which owns the withdrawal workflow state machine and coordinates with the core ledger.

Component	Responsibility	Why it exists
ATM device	UI + card/PIN capture + dispense + sensors	Hardware boundary; must fail safely
Secure comms module	Encrypt PIN block + session keys	Protect secrets over hostile networks
ATM switch / gateway	Routing, throttling, protocol translation	Central control point for edge traffic
Authentication service / HSM	PIN verification, key management	Compliance + cryptographic trust anchor
Transaction Processing System (TPS)	Orchestrate withdrawals, idempotency, state machine	Prevent double-debits and handle ambiguity
Core ledger / banking DB	Authoritative balances and postings	ACID correctness and auditability
Reconciliation service	Resolve uncertain outcomes, post reversals	Hardware/network failures are normal
Fraud/risk engine	Score and block suspicious withdrawals	Prevent losses and abuse
Audit log pipeline	Immutable event recording	Disputes, compliance, forensics

Interview signal: Strong candidates separate the “routing plane” (switch) from the “money plane” (ledger + TPS), and they never let the ATM directly mutate balances.

Summary:

Keep the core ledger strongly consistent and authoritative.
Put orchestration and idempotency in TPS, not in the ATM.
Use an ATM switch for routing and external network integration.
Treat reconciliation and audit logging as core services.

Data model: ledger-first thinking (not “balance as a field”)#

The safest way to think about bank accounts is: the ledger is the truth, and balance is a derived value (or a carefully maintained cached value) from ledger postings. In interviews, you don’t need to design a full accounting system, but you should demonstrate ledger-first reasoning: every withdrawal produces entries that are auditable and immutable.

For ATM withdrawals, it’s also common to model “holds” (authorizations) separately from “posted” transactions. A hold reduces available balance immediately, preventing double-withdrawal, while the final posting occurs after dispense confirmation. This split is essential when hardware is involved.
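To make the ledger-first idea concrete, here is a minimal sketch of deriving "current" and "available" balances from immutable entries plus holds. The class names, the signed-amount convention, and the hold state names are assumptions for illustration, not a prescribed schema.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LedgerEntry:
    entry_id: str
    account_id: str
    amount: Decimal  # assumed convention: credits positive, debits negative


@dataclass(frozen=True)
class Hold:
    hold_id: str
    account_id: str
    amount: Decimal
    state: str  # hypothetical states: "ACTIVE", "RELEASED", "CAPTURED"


def current_balance(entries, account_id):
    """Sum of posted ledger entries; the ledger is the source of truth."""
    return sum(
        (e.amount for e in entries if e.account_id == account_id),
        Decimal("0"),
    )


def available_balance(entries, holds, account_id):
    """Posted balance minus funds reserved by active holds."""
    held = sum(
        (h.amount for h in holds
         if h.account_id == account_id and h.state == "ACTIVE"),
        Decimal("0"),
    )
    return current_balance(entries, account_id) - held


entries = [
    LedgerEntry("e1", "acct-1", Decimal("500.00")),
    LedgerEntry("e2", "acct-1", Decimal("-120.00")),
]
holds = [Hold("h1", "acct-1", Decimal("100.00"), "ACTIVE")]

print(current_balance(entries, "acct-1"))           # 380.00
print(available_balance(entries, holds, "acct-1"))  # 280.00
```

In practice a bank would likely maintain the available balance incrementally rather than re-summing entries on every read, but the derived value should always be reproducible from the postings.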

You also need idempotency keys and transaction identifiers. The ATM generates a transaction_id that is stable across retries. The TPS uses it to ensure a retried request returns the same result rather than creating a second withdrawal.
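The idempotency behavior described above can be sketched as follows. This is an in-memory illustration only: `WithdrawalService`, its fields, and the result shape are hypothetical, and a real TPS would more plausibly enforce this with a unique constraint on the transaction ID in a durable store.

```python
class WithdrawalService:
    """Toy TPS that returns the stored result when a transaction_id is retried."""

    def __init__(self):
        self._results = {}      # transaction_id -> result of the first attempt
        self.holds_created = 0  # counts side effects, to show retries add none

    def request_withdrawal(self, transaction_id, account_id, amount_cents):
        # A retry with a known transaction_id returns the original outcome.
        if transaction_id in self._results:
            return self._results[transaction_id]

        self.holds_created += 1
        result = {
            "transaction_id": transaction_id,
            "account_id": account_id,
            "amount_cents": amount_cents,
            "status": "FUNDS_HELD",
            "hold_id": f"hold-{self.holds_created}",
        }
        self._results[transaction_id] = result
        return result


service = WithdrawalService()
first = service.request_withdrawal("txn-42", "acct-1", 20_000)
retry = service.request_withdrawal("txn-42", "acct-1", 20_000)
print(first == retry, service.holds_created)  # True 1
```

A stricter version would also reject a retry whose amount or account differs from the first request under the same ID.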

Common pitfall: Only storing “balance” and updating it in place. Interviewers want to hear “ledger entries + holds + immutable audit trail.”

Summary:

Use ledger entries for auditability and correctness.
Use holds to reserve funds before dispensing cash.
Use stable transaction IDs for idempotency across retries.

Walkthrough 1: balance inquiry (step-by-step)#

A balance inquiry looks simple, but it’s a great place to demonstrate safe read design. The ATM needs a quick response, but the bank must not show a balance that’s wildly wrong. In most systems, balance inquiry reads from a strongly consistent source (or a read replica with bounded staleness if the bank accepts that trade-off).
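The bounded-staleness trade-off can be expressed as a small routing rule. This is a sketch under assumptions: the function name, the lag measurement, and the threshold are hypothetical, and the acceptable bound is a business decision rather than a technical constant.

```python
def choose_balance_source(replica_lag_ms, max_staleness_ms):
    """Read from the replica only when its measured lag is within the bound.

    An unknown lag (None) is treated as unsafe, so the read goes to the primary.
    """
    if replica_lag_ms is not None and replica_lag_ms <= max_staleness_ms:
        return "replica"
    return "primary"


print(choose_balance_source(50, 200))    # replica
print(choose_balance_source(500, 200))   # primary
print(choose_balance_source(None, 200))  # primary
```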

The flow starts when the ATM reads the card and prompts for PIN. The PIN is encrypted on the device and sent to the bank’s authentication service (often backed by an HSM). After successful authentication, the ATM sends a balance inquiry request with the authenticated session token and account identifier.

The TPS verifies authorization, checks account status (active, not blocked), and fetches the balance. This may come from a cached available balance maintained by ledger postings, or computed from ledger entries. The TPS returns the balance plus metadata like “available” vs “current,” which matters when holds exist.

What to say in the interview: “Balance inquiry is read-heavy, but I still treat it as sensitive. I validate session, account status, and return available vs current balance to reflect holds.”

Summary:

Authenticate first, then read balance.
Return available/current balance to account for holds.
Prefer correctness over aggressive caching for financial reads.

Walkthrough 2: cash withdrawal (step-by-step)#

Cash withdrawal is the heart of the ATM question because it combines a ledger update with a physical action. The key design insight is that you cannot commit the final debit until you have strong evidence that cash was actually dispensed. But you also can’t dispense cash without reserving funds first, or you risk overdrawing the account.

The correct mental model is: authorize and hold funds → attempt dispense → confirm dispense → commit posting. If the dispense is known to have failed after funds are held, you reverse the hold. If the outcome is unknown (for example, the confirmation never arrives), you keep the hold and reconcile rather than guess. This avoids “customer lost money without cash,” which is the worst-case outcome.

The withdrawal flow begins similarly with authentication. The user requests an amount, and the TPS checks limits (daily withdrawal limit, ATM limits, available balance). If approved, the TPS creates a hold and returns a dispense authorization to the ATM. The ATM attempts to dispense cash and uses sensors to confirm whether cash was presented and taken. Only then does the ATM send a dispense confirmation back to the TPS, which commits the final debit and clears the hold.
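The flow above can be sketched in a few lines. Everything here is illustrative: the `Account` class, the `dispense` callback, and the status strings are assumptions, and amounts are integer cents to avoid floating-point rounding. The unknown-outcome branch keeps the hold in place so that reconciliation can decide later.

```python
from enum import Enum


class DispenseResult(Enum):
    CONFIRMED = "confirmed"  # sensors report cash presented and taken
    FAILED = "failed"        # dispenser reports that no cash left the machine
    UNKNOWN = "unknown"      # timeout or lost confirmation


class Account:
    def __init__(self, posted_cents):
        self.posted_cents = posted_cents
        self.held_cents = 0

    @property
    def available_cents(self):
        return self.posted_cents - self.held_cents


def withdraw(account, amount_cents, daily_remaining_cents, dispense):
    """Hold funds, attempt the dispense, then commit, reverse, or defer."""
    if amount_cents > daily_remaining_cents or amount_cents > account.available_cents:
        return "DECLINED"

    account.held_cents += amount_cents  # reserve funds before any cash moves
    result = dispense(amount_cents)

    if result is DispenseResult.CONFIRMED:
        account.held_cents -= amount_cents    # clear the hold
        account.posted_cents -= amount_cents  # post the final debit
        return "COMMITTED"
    if result is DispenseResult.FAILED:
        account.held_cents -= amount_cents    # release the hold, no debit
        return "REVERSED"
    return "NEEDS_RECONCILIATION"             # hold stays until resolved


acct = Account(posted_cents=50_000)
print(withdraw(acct, 20_000, 100_000, lambda amt: DispenseResult.CONFIRMED))  # COMMITTED
print(acct.posted_cents, acct.held_cents)  # 30000 0
```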

Interview signal: The phrase “commit only after dispense confirmation” is a strong indicator you understand hardware-driven transaction design.

Summary:

Reserve funds first using a hold.
Dispense cash, then confirm via sensors.
Commit the debit only after confirmation.
Reverse holds when the dispense is known to have failed; reconcile when the outcome is uncertain.

Withdrawal state machine and reconciliation logic#

The withdrawal state machine is where you demonstrate senior-level rigor. The system must behave correctly under retries, timeouts, partial failures, and ATM crashes. A state machine makes these cases explicit and auditable.

A good state machine begins after authentication. Once the user requests a withdrawal, the TPS creates a transaction record with a stable txn_id. It then transitions through “funds held,” “dispense in progress,” “dispense confirmed,” and “committed.” The reversal path exists for any failure after funds are held but before commit. This reversal is not a best-effort cleanup; it is a first-class transaction outcome.

The most important point is why “cash dispense confirmation” changes commit behavior. If you commit the debit before dispense confirmation, you risk charging the customer even if the dispenser jams or the ATM loses power. If you dispense before holding funds, you risk giving cash without debit. The state machine enforces the safe ordering.

State	Meaning	Key data	Next states
AUTHENTICATED	User verified	session_id, account_id	→ WITHDRAWAL_REQUESTED
WITHDRAWAL_REQUESTED	Amount chosen	txn_id, amount	→ FUNDS_HELD, DECLINED
FUNDS_HELD	Funds reserved	hold_id, hold_amount	→ DISPENSE_REQUESTED, REVERSED
DISPENSE_REQUESTED	ATM instructed to dispense	atm_command_id	→ DISPENSING, REVERSED
DISPENSING	Hardware in progress	dispenser_status	→ DISPENSE_CONFIRMED, DISPENSE_FAILED
DISPENSE_CONFIRMED	Cash confirmed delivered	sensor_evidence	→ COMMITTED
COMMITTED	Final debit posted	ledger_entry_id	terminal
REVERSED	Hold released / debit reversed	reversal_entry_id	terminal
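One way to enforce the transitions in the table is an explicit allow-list that rejects anything else. This is a sketch, not a complete workflow engine. The table names DECLINED and DISPENSE_FAILED as targets without giving them rows, so their entries below are assumptions: DECLINED is treated as terminal and DISPENSE_FAILED is assumed to lead to REVERSED.

```python
TRANSITIONS = {
    "AUTHENTICATED": {"WITHDRAWAL_REQUESTED"},
    "WITHDRAWAL_REQUESTED": {"FUNDS_HELD", "DECLINED"},
    "FUNDS_HELD": {"DISPENSE_REQUESTED", "REVERSED"},
    "DISPENSE_REQUESTED": {"DISPENSING", "REVERSED"},
    "DISPENSING": {"DISPENSE_CONFIRMED", "DISPENSE_FAILED"},
    "DISPENSE_CONFIRMED": {"COMMITTED"},
    "DISPENSE_FAILED": {"REVERSED"},  # assumption: not listed as a row in the table
    "COMMITTED": set(),               # terminal
    "REVERSED": set(),                # terminal
    "DECLINED": set(),                # assumption: terminal
}


class InvalidTransition(Exception):
    """Raised when a requested state change is not in the allow-list."""


def advance(current, target):
    """Return the new state, or raise if the transition is not allowed."""
    if target not in TRANSITIONS.get(current, set()):
        raise InvalidTransition(f"{current} -> {target} is not allowed")
    return target


state = "WITHDRAWAL_REQUESTED"
for nxt in ["FUNDS_HELD", "DISPENSE_REQUESTED", "DISPENSING",
            "DISPENSE_CONFIRMED", "COMMITTED"]:
    state = advance(state, nxt)
print(state)  # COMMITTED
```

Because FUNDS_HELD cannot jump straight to COMMITTED, a bug or a replayed message cannot post a debit without passing through dispense confirmation.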

Common pitfall: Treating “timeout” as failure and immediately reversing, without considering that cash might have been dispensed. Strong designs reconcile using device logs and sensor evidence.

Reconciliation is the safety net for ambiguous outcomes. If the TPS doesn’t receive dispense confirmation, it marks the transaction as “needs reconciliation” rather than guessing. A reconciliation job periodically checks ATM device journals, dispenser counters, and any late confirmations. If it determines cash was dispensed, it commits. If it determines cash was not dispensed, it reverses. If it can’t determine, it escalates to manual dispute workflows.
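The reconciliation decision can be sketched as a function over whatever evidence is available. The parameter names and the way each signal is interpreted are assumptions for illustration: a dispenser counter change equal to the requested amount counts as evidence of a dispense, a change of zero counts as evidence against, and anything else is treated as inconclusive.

```python
from enum import Enum


class Decision(Enum):
    COMMIT = "commit"
    REVERSE = "reverse"
    ESCALATE = "escalate"


def reconcile(amount_cents, journal_dispensed=None,
              counter_delta_cents=None, late_confirmation=False):
    """Decide the outcome of a transaction stuck without dispense confirmation.

    journal_dispensed: True/False from the ATM journal, or None if unavailable.
    counter_delta_cents: cash that left the dispenser, or None if unavailable.
    late_confirmation: True if a delayed confirmation eventually arrived.
    """
    signals = []
    if late_confirmation:
        signals.append(True)
    if journal_dispensed is not None:
        signals.append(journal_dispensed)
    if counter_delta_cents == amount_cents:
        signals.append(True)
    elif counter_delta_cents == 0:
        signals.append(False)

    if signals and all(signals):
        return Decision.COMMIT
    if signals and not any(signals):
        return Decision.REVERSE
    # No evidence, or conflicting evidence: hand off to manual dispute handling.
    return Decision.ESCALATE


print(reconcile(20_000, journal_dispensed=True, counter_delta_cents=20_000))  # Decision.COMMIT
print(reconcile(20_000, journal_dispensed=True, counter_delta_cents=0))       # Decision.ESCALATE
```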

Summary:

Model withdrawal as a persisted state machine with explicit transitions.
Hold funds before dispense; commit only after dispense confirmation.
Treat ambiguous outcomes as “reconcile,” not “guess.”
Reversal is a first-class outcome, not an afterthought.

Interbank routing and settlement#

Foreign card processing adds a second distributed system: the card network and the issuing bank. When a user uses Bank B’s card at Bank A’s ATM, Bank A is the acquirer, Bank B is the issuer, and the network (Visa/Mastercard/etc.) routes authorization messages between them. This introduces latency, dependencies, and different phases of money movement.

In these systems, it’s critical to separate authorization from settlement. Authorization is a real-time decision: does the issuer approve the withdrawal and place a hold? Settlement is the later process of actually moving funds between banks and reconciling fees. Your ATM system must handle authorization synchronously (because the user is waiting) while settlement happens asynchronously in batch cycles.
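To illustrate why settlement can run in batches, here is a simplified netting sketch. It assumes each cleared record is a tuple of (acquirer, issuer, amount in cents) for a committed withdrawal and that the issuer owes the acquirer the dispensed amount. Fees and network-specific settlement rules are deliberately left out.

```python
from collections import defaultdict


def net_settlement(cleared):
    """Compute each bank's net position for one settlement cycle.

    Positive means the bank is owed money; negative means it owes money.
    """
    net = defaultdict(int)
    for acquirer, issuer, amount_cents in cleared:
        net[acquirer] += amount_cents  # acquirer dispensed its own cash
        net[issuer] -= amount_cents    # issuer reimburses the acquirer
    return dict(net)


batch = [
    ("BankA", "BankB", 20_000),  # Bank B's card used at Bank A's ATM
    ("BankB", "BankA", 5_000),   # Bank A's card used at Bank B's ATM
    ("BankA", "BankB", 10_000),
]
print(net_settlement(batch))  # {'BankA': 25000, 'BankB': -25000}
```

Three real-time authorizations collapse into a single interbank transfer, which is why the user-facing path only needs the authorization decision.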

Network dependencies also change failure handling. If the card network is down, you may need to decline foreign transactions, route to a backup network, or degrade with conservative limits. The key is to avoid dispensing cash when you cannot obtain a valid authorization response from the issuer.

Interview signal: Saying “authorization is synchronous, settlement is asynchronous” shows you understand real payment rails and why the core ledger can’t be your only dependency.

Summary:

Route foreign transactions through networks to the issuer.
Separate authorization (real-time) from settlement (batch).
Decline safely if you can’t get issuer authorization.

Fraud and abuse prevention#

Fraud prevention in ATM networks is both digital and physical. On the digital side, attackers try to brute-force PINs, exploit stolen cards, or automate withdrawals. On the physical side, skimmers, compromised ATMs, and cash-out attacks are real. A strong interview answer treats fraud controls as layered: edge controls, risk scoring, and circuit breakers.

Start with velocity controls: per-card withdrawal frequency, per-account daily limits, and per-ATM anomaly thresholds. Add geo and behavior signals: withdrawals in two distant locations within minutes, repeated declines, unusual withdrawal patterns, or abnormal ATM error rates. The risk engine can return “allow,” “deny,” or “step-up” (for example, requiring additional verification at a branch, since an ATM offers few step-up options of its own).
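A per-card velocity control can be sketched as a sliding-window counter. The class name, the limits, and the use of plain numeric timestamps in seconds are assumptions for illustration; a production system would keep this state in a shared store rather than in process memory.

```python
from collections import defaultdict, deque


class VelocityLimiter:
    """Allow at most max_events per card within a sliding time window."""

    def __init__(self, max_events, window_seconds):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # card_id -> timestamps of allowed events

    def allow(self, card_id, now):
        window = self._events[card_id]
        # Drop timestamps that have aged out of the window.
        while window and now - window[0] >= self.window_seconds:
            window.popleft()
        if len(window) >= self.max_events:
            return False
        window.append(now)
        return True


limiter = VelocityLimiter(max_events=3, window_seconds=60)
print([limiter.allow("card-1", t) for t in (0, 10, 20, 30)])  # [True, True, True, False]
print(limiter.allow("card-1", 61))                            # True
```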

You also need circuit breakers. If an ATM is suspected of tampering or a region is under attack, you can disable withdrawals, reduce limits, or require online authorization only (no offline fallback). These controls protect the bank even if they temporarily degrade customer experience.
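A per-ATM circuit breaker might look like the sketch below. The thresholds, method names, and the automatic cooldown are assumptions. For suspected tampering, a bank would more plausibly require a manual reset after inspection instead of reopening on a timer.

```python
class AtmCircuitBreaker:
    """Block withdrawals after consecutive failures until a cooldown elapses."""

    def __init__(self, failure_threshold, cooldown_seconds):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._consecutive_failures = 0
        self._opened_at = None  # None means the breaker is closed

    def record_success(self):
        self._consecutive_failures = 0

    def record_failure(self, now):
        self._consecutive_failures += 1
        if self._consecutive_failures >= self.failure_threshold:
            self._opened_at = now  # trip (or re-trip) the breaker

    def withdrawals_allowed(self, now):
        if self._opened_at is None:
            return True
        if now - self._opened_at >= self.cooldown_seconds:
            # Cooldown elapsed: close the breaker and start counting afresh.
            self._opened_at = None
            self._consecutive_failures = 0
            return True
        return False


breaker = AtmCircuitBreaker(failure_threshold=3, cooldown_seconds=300)
for t in (0, 1, 2):
    breaker.record_failure(t)
print(breaker.withdrawals_allowed(3))    # False
print(breaker.withdrawals_allowed(302))  # True
```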

What to say in the interview: “Fraud controls are layered. I enforce limits early, score risk centrally, and use circuit breakers to protect the system during attacks.”

Summary:

Use velocity and anomaly detection to prevent cash-out attacks.
Add tamper signals and ATM health monitoring for skimming.
Apply circuit breakers for suspected compromise or instability.

Consistency vs availability: realistic trade-offs#

ATM systems strongly prefer consistency for ledger writes, but you still need to discuss availability realistically. A common architecture is a single-writer core ledger (or a strongly consistent cluster) that processes all withdrawals for an account. This avoids split-brain balances and simplifies ACID guarantees. The trade-off is that multi-region active-active writes are difficult without heavy coordination.

For global banks, the common compromise is: route transactions to the “home region” for the account or use a partitioned ledger where each account has a single authoritative shard. Reads can be served from replicas, but writes must go to the authoritative shard. If a region is down, you may deny withdrawals rather than risk inconsistency. That’s a business decision, and in interviews, you should state it clearly.
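The "single authoritative shard per account" rule can be sketched as deterministic routing plus a safe decline. Hash-based placement is an assumption here (a bank could equally use a directory lookup or a home-region field), and the function names are hypothetical.

```python
import hashlib


def shard_for(account_id, num_shards):
    """Map an account to exactly one authoritative shard, deterministically."""
    digest = hashlib.sha256(account_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route_withdrawal(account_id, num_shards, healthy_shards):
    """Route the write to the account's shard, or decline if it is unavailable.

    There is no fallback to another shard: writing elsewhere would risk two
    sources of truth for the same balance.
    """
    shard = shard_for(account_id, num_shards)
    if shard not in healthy_shards:
        return ("DECLINE", shard)
    return ("ROUTE", shard)


target = shard_for("acct-1", 8)
print(route_withdrawal("acct-1", 8, healthy_shards=set(range(8))))             # routes to the home shard
print(route_withdrawal("acct-1", 8, healthy_shards=set(range(8)) - {target}))  # declines
```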

You can also mention limited offline modes (allowing small withdrawals without online authorization), but those are risky and heavily constrained. Most interview settings assume online authorization is required.

Common pitfall: Saying “active-active multi-region with strong consistency” without explaining coordination and failure modes. Interviewers will push on split brain and double-dispense risk.

Summary:

Prefer single-writer or strongly consistent ledger for withdrawals.
Use replicas for reads, but keep writes authoritative.
Be explicit about the business choice: deny vs risk inconsistency.

Reliability, observability, and auditability#

ATM systems must be observable and auditable because disputes are inevitable. Customers will claim “I didn’t get the cash,” or “I was charged twice,” and regulators will demand traceability. That means every step must produce immutable logs: requests, responses, holds, dispense commands, sensor confirmations, reversals, and reconciliation outcomes.

Observability also protects operations. You monitor ATM health (cash levels, dispenser errors), backend latency, authorization failure rates, fraud blocks, and reconciliation queue sizes. For the ledger and TPS, you monitor lock contention, transaction timeouts, and idempotency hit rates (retries). For the network, you monitor routing failures and third-party network availability.

A strong answer includes SLOs that reflect user experience and safety: p95 authorization latency, dispense-confirm latency, reconciliation completion time, and fraud false positive rate. You also want “audit completeness” metrics: every committed debit should have a corresponding dispense confirmation or a reconciliation decision.
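The "audit completeness" idea can be checked mechanically. This sketch assumes you can list transaction IDs for committed debits, dispense confirmations, and reconciliation decisions; the function names and inputs are hypothetical.

```python
def audit_gaps(committed, confirmed, reconciled):
    """Committed debits with neither a dispense confirmation nor a reconciliation decision."""
    evidenced = set(confirmed) | set(reconciled)
    return sorted(set(committed) - evidenced)


def audit_completeness(committed, confirmed, reconciled):
    """Fraction of committed debits that are backed by evidence (1.0 is the target)."""
    total = len(set(committed))
    if total == 0:
        return 1.0
    return 1.0 - len(audit_gaps(committed, confirmed, reconciled)) / total


committed = ["t1", "t2", "t3"]
confirmed = ["t1"]
reconciled = ["t3"]
print(audit_gaps(committed, confirmed, reconciled))  # ['t2']
```

Any non-empty result is worth alerting on, since it means money moved without recorded evidence of why.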

Interview signal: “Immutable logs + reconciliation jobs + dispute workflows” is what turns your design into a bank-grade system rather than a demo.

Summary:

Log every step immutably for audit and disputes.
Monitor both software and hardware health.
Track reconciliation backlog and idempotency retries as key safety signals.

How interviewers evaluate your ATM design#

Interviewers are looking for more than a component diagram. They want to see that you prioritize correctness under ambiguity and that you understand why hardware changes transaction semantics. Your strongest signals are: using holds, committing only after dispense confirmation, modeling a state machine, and having reconciliation logic for uncertain outcomes.

They also expect you to take security seriously without drifting into vague statements. Mention HSMs for PIN verification, encryption in transit, PCI compliance constraints, and strict access controls. For foreign cards, they expect you to describe authorization vs settlement and external network dependencies.

Finally, they want trade-offs. If you claim everything is strongly consistent and highly available across regions, they’ll ask how you avoid double withdrawals during partitions. A strong answer chooses safety and explains why.

What strong answers sound like: “I treat withdrawal as a state machine with holds and commit-after-dispense. I design for at-least-once messaging with idempotency, and I reconcile ambiguous outcomes using device journals and immutable logs.”

Summary:

Lead with correctness: holds, commit-after-dispense, reconciliation.
Show security depth: HSM, encryption, compliance.
Explain interbank routing and settlement phases.
Make realistic consistency/availability trade-offs.

Final takeaway#

ATM System Design forces you to design a distributed system where mistakes are expensive and visible. The right approach is conservative: strong ledger correctness, idempotent transaction handling, explicit state machines, and safe failure recovery when hardware is involved. If you can explain why dispense confirmation gates commit, how reversals and reconciliation work, and how you audit every step for disputes, you’ll give a Staff-level answer.

Happy learning!

Written By:

Zarish Khalid


Data model: core entities

Entity	Key fields	Purpose
Account	account_id, status, available_balance_cache	Account metadata + fast reads
Ledger entry	entry_id, account_id, amount, type, timestamp	Immutable postings for audit
Authorization hold	hold_id, account_id, amount, state, expires_at	Reserve funds before dispense
ATM transaction	txn_id, atm_id, account_id, amount, state	State machine + idempotency anchor
Dispenser event	txn_id, sensor_status, cash_presented	Hardware evidence for reconciliation

Walkthrough 1: balance inquiry steps

Step	Component	Action	Outcome
1	ATM	Capture card + PIN, encrypt PIN block	Secure auth request prepared
2	Switch	Route request to auth service	Correct internal routing
3	Auth/HSM	Verify PIN	Authenticated session or denial
4	TPS	Validate session + account status	Authorization enforced
5	Ledger	Read available/current balance	Correct balance computed
6	ATM	Display/print balance	User receives result

Walkthrough 2: cash withdrawal steps

Step	Component	Action	Outcome
1	ATM	Auth + withdrawal request	Start transaction
2	TPS	Check limits + available balance	Prevent overdraft
3	TPS/Ledger	Create hold (reserve funds)	Funds held, not posted
4	ATM	Dispense attempt	Hardware action
5	ATM sensors	Confirm cash presented/taken	Evidence for commit
6	TPS/Ledger	Post final debit + release hold	Money moves permanently
7	ATM	Print receipt	Non-critical side effect

Interbank routing and settlement: phases

Phase	What happens	Parties	Typical timing
Authorization	Approve/decline + hold funds	ATM bank ↔ network ↔ issuer bank	Seconds
Dispense + confirm	Cash delivery confirmation	ATM ↔ ATM bank TPS	Seconds
Posting	Final debit/clearing hold	Issuer ledger	Seconds–minutes
Settlement	Interbank fund transfer + fees	Banks + network settlement	Hours–days

Fraud and abuse prevention: controls

Control	Signal	Response
Velocity checks	Too many withdrawals in short window	Decline or reduce limits
Geo anomaly	Impossible travel pattern	Block + alert
PIN brute force	Multiple incorrect PIN attempts	Card capture / temporary lock
Skimming signals	ATM hardware tamper alerts	Disable ATM, dispatch service
Risk scoring	Suspicious pattern across accounts	Require manual review or deny
Circuit breaker	Network/issuer instability	Conservative declines to avoid loss

Consistency vs availability: options

Approach	Consistency	Availability	Notes
Single-writer ledger	Strong	Lower during regional outages	Common for correctness
Multi-region active-active	Hard to guarantee	Higher	Rare, complex
Read replicas for balance	Mostly strong (bounded staleness)	Higher	Common optimization
Offline withdrawals	Weak	Higher	Rare, tightly limited

Reliability, observability, and auditability: metrics

Metric	What it reflects	Target / action
p95 auth latency	Customer wait time	< 2–3 seconds
Withdrawal success rate	Core availability	> 99.9% (excluding fraud declines)
Reconciliation backlog	Risk of unresolved disputes	Near zero steady-state
Idempotency replay rate	Network instability indicator	Track and alert on spikes
Dispense failure rate	Hardware reliability	Alert per ATM model/region
Fraud block rate + FP rate	Safety vs customer friction	Monitor drift over time