Agentic Lending and the 5× Problem
The napkin math no one has done: agentic collections needs 60–120 governed tool calls per account daily — the 5–10× multiplier no legacy LMS is built for.
The conference circuit has a favourite line right now. It is some variant of: “AI will transform collections.” It is delivered with confidence, usually at the end of a slide deck, and it gets nods from rooms full of CFOs and Heads of Operations who have been told the same thing seventeen different ways for two years.
It is almost true. The part everyone skips is the per-call accounting underneath it.
We have started doing that accounting. The number is sobering enough that we think it deserves its own post — because it implies a procurement question that almost no one is currently asking, and an architectural ceiling that almost no LMS in production today is built to clear.
This is the napkin math.
Human collections, baseline
Take a single overdue account. A human collections agent — a real person, paid hourly, trained on your SOPs — will do something like this in the course of a few minutes:
- Open the account in the LMS.
- Check the dues, the last payment, the bounce status.
- Skim the notes from the previous interaction. Look for a promise-to-pay.
- Decide the next action: call, WhatsApp, escalate to recovery.
- Add a note. Move on.
Each of those steps is, in systems terms, a tool call. Some are reads (fetch the loan, fetch the customer, fetch the notes). Some are writes (post the note, update a flag, log the outreach). When we instrument human collections work in production LMS environments, the typical range is 8–15 tool calls per account, per touchpoint. Most of the actual context-stitching happens off-platform — in the agent’s head, in a paper notebook, in a side conversation with a supervisor.
At a portfolio of 10,000 active overdue accounts, with one touchpoint per day, that is 80,000 to 150,000 tool calls per day. Backend systems have been comfortably absorbing that load for two decades.
Agentic collections, instrumented
Now replace the human agent with a software agent. The work it does looks superficially the same — decide what action to take on this account today — but the systems trace looks nothing like it.
A governed lending agent cannot decide on intuition. It has to fetch and verify the things a human agent could simply recall. It has to log evidence for every decision because nothing happens off-platform — there is no notebook, no side conversation, no implicit context. And it has to emit events so that the rest of the platform stays consistent with whatever it just did.
A representative agentic touchpoint looks something like this. We are spelling it out because the cumulative count is the whole point of the post.
01 Fetch customer 360
02 Fetch loan 360
03 Fetch DPD trajectory (last 30 days)
04 Fetch bounce history
05 Fetch outstanding charges
06 Fetch consent state for outreach
07 Fetch active grievances / complaints
08 Fetch active maker-checker requests on this account
09 Validate eligibility for the proposed action against policy
10 Validate consent freshness for chosen outreach channel
11 Generate draft outreach content
12 Run grounding check against policy text
13 Run hallucination check against canonical loan state
14 Score outreach for hardship signals (escalation gate)
15 Persist the planned action
16 Trigger the outreach itself (which is more calls — write to channel)
17 Capture delivery receipt
18 Log evidence: prompt version, model version, tool transcript
19 Emit lifecycle event for the touchpoint
20 Verify event was consumed by audit trail
21 Verify write-back to canonical loan state
…
Twenty-one is roughly the floor. With consent flows that span multiple channels, with policy checks that branch on tenant configuration, with maker-checker on any action that crosses a policy boundary, the realistic ceiling is somewhere in the 60–120 range per touchpoint. Call it a clean 5–10× multiplier over the human baseline.
That is not a stretch number. That is what governed agentic operations look like when you actually instrument them. We have been measuring this on internal prototypes for nine months.
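The trace above can be sketched as a literal tool-call plan. The names (`fetch_customer_360`, `ToolCall`, and so on) are illustrative, not a real API; the point of the sketch is that every step a human would do implicitly becomes an explicit, logged call, and the count adds up fast.

```python
from dataclasses import dataclass

@dataclass
class ToolCall:
    name: str   # illustrative tool name, e.g. "fetch_loan_360"
    kind: str   # "read" | "check" | "write"

# The floor-case touchpoint from the trace above, one entry per step.
TOUCHPOINT_FLOOR = [
    ToolCall("fetch_customer_360", "read"),
    ToolCall("fetch_loan_360", "read"),
    ToolCall("fetch_dpd_trajectory", "read"),
    ToolCall("fetch_bounce_history", "read"),
    ToolCall("fetch_outstanding_charges", "read"),
    ToolCall("fetch_consent_state", "read"),
    ToolCall("fetch_active_grievances", "read"),
    ToolCall("fetch_maker_checker_requests", "read"),
    ToolCall("validate_action_eligibility", "check"),
    ToolCall("validate_consent_freshness", "check"),
    ToolCall("generate_outreach_draft", "check"),
    ToolCall("ground_against_policy", "check"),
    ToolCall("check_hallucination", "check"),
    ToolCall("score_hardship_signals", "check"),
    ToolCall("persist_planned_action", "write"),
    ToolCall("trigger_outreach", "write"),
    ToolCall("capture_delivery_receipt", "write"),
    ToolCall("log_evidence", "write"),
    ToolCall("emit_lifecycle_event", "write"),
    ToolCall("verify_audit_consumption", "check"),
    ToolCall("verify_state_writeback", "check"),
]

print(len(TOUCHPOINT_FLOOR))  # 21 calls before any branching
```

Every branch — a second outreach channel, a tenant-specific policy check, a maker-checker round trip — appends to this list, which is how 21 becomes 60–120.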
What 5–10× does to a backend
Run the multiplication. 10,000 active overdue accounts × one touchpoint per day × 80 calls per touchpoint = 800,000 tool calls per day. Quadruple the touchpoint frequency — which is plausible the moment you let an agent re-evaluate accounts as new signals arrive, instead of in a once-a-day batch — and you are at 3.2 million tool calls per day on a portfolio that was previously absorbing 100K.
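The multiplication in full, so the jump is visible in one place:

```python
ACCOUNTS = 10_000  # active overdue accounts in the example portfolio

# Human baseline: 8-15 tool calls per account, one touchpoint per day.
human_low = ACCOUNTS * 1 * 8
human_high = ACCOUNTS * 1 * 15

# Agentic: ~80 calls per touchpoint, once a day...
agentic_daily = ACCOUNTS * 1 * 80
# ...and with event-driven re-evaluation, call it four touchpoints a day.
agentic_reactive = ACCOUNTS * 4 * 80

print(human_low, human_high)           # 80000 150000
print(agentic_daily)                   # 800000
print(agentic_reactive)                # 3200000
print(agentic_reactive // human_low)   # 40x the low human baseline
```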
This is the part of the conference-circuit pitch that gets glossed over. The pitch is a feature claim. The implication is a backend claim. If your LMS was sized for human collections — and almost every LMS in production today was — then the 5–10× multiplier is not a slope you can climb. It is a wall.
The wall has four layers.
Latency budget. A human-driven LMS can comfortably take 800ms to return a Customer 360 call. The agent operating on top of it cannot. If a single touchpoint involves 80 reads and 20 writes, an 800ms tail latency means each touchpoint takes 80 seconds end to end. That is not a portfolio you can run agentically. P99 read latency under 200ms is not an aspiration; it is the price of admission.
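The latency arithmetic, assuming 100 calls per touchpoint executed sequentially at the tail (real execution parallelises some reads, which buys back margin but does not change the conclusion):

```python
CALLS_PER_TOUCHPOINT = 100  # per the touchpoint example in this post

def touchpoint_seconds(tail_latency_ms: float) -> float:
    # Worst case: every call lands at the tail, executed one after another.
    return CALLS_PER_TOUCHPOINT * tail_latency_ms / 1000

print(touchpoint_seconds(800))  # 80.0 seconds - unworkable
print(touchpoint_seconds(200))  # 20.0 seconds - workable, with headroom
```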
Idempotency. Every tool call must be safe to retry. Agentic execution involves retries — networks fail, models time out, downstream services hiccup. An LMS where re-issuing a “post note” call produces two notes is an LMS that cannot be safely operated by an agent. Every write needs an idempotency key, and the platform needs to enforce it at the edge.
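A minimal sketch of what "enforce it at the edge" means, using the "post note" example. The store and its API are hypothetical; in a real platform the dedupe map is durable, not in-memory, but the contract is the same: a retry with the same key returns the original result instead of creating a duplicate.

```python
import uuid

class NoteStore:
    """Hypothetical write endpoint that dedupes on an idempotency key."""

    def __init__(self):
        self._notes: list[str] = []
        self._seen: dict[str, int] = {}  # idempotency key -> note id

    def post_note(self, idempotency_key: str, text: str) -> int:
        # Retried call with the same key: return the original note id,
        # do not write a second note.
        if idempotency_key in self._seen:
            return self._seen[idempotency_key]
        note_id = len(self._notes)
        self._notes.append(text)
        self._seen[idempotency_key] = note_id
        return note_id

store = NoteStore()
key = str(uuid.uuid4())  # the agent generates one key per intent, not per attempt
first = store.post_note(key, "PTP recorded")
retry = store.post_note(key, "PTP recorded")  # network retry, same key
assert first == retry and len(store._notes) == 1
```

The operative design choice is that the key is generated per intent, so however many times the agent's runtime retries, the platform sees one logical write.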
Eventing. State changes must fan out predictably to every consumer that needs to know. Agents read from canonical state and write to canonical state, but the audit trail, the dashboards, the maker-checker queue, and the downstream notification systems all need to see the change. Polling-based or batch-based eventing falls apart at agentic load.
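A toy illustration of push-based fan-out, with lists standing in for the audit trail and dashboard consumers. Topic names and the `EventBus` API are invented for the sketch; the property that matters is that every subscriber sees every state change without polling.

```python
from collections import defaultdict
from typing import Callable

class EventBus:
    """Push-based fan-out: consumers are called on publish, not on poll."""

    def __init__(self):
        self._subs: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subs[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for handler in self._subs[topic]:
            handler(event)

bus = EventBus()
audit_trail: list[dict] = []
dashboard: list[dict] = []
bus.subscribe("collections.touchpoint", audit_trail.append)
bus.subscribe("collections.touchpoint", dashboard.append)

bus.publish("collections.touchpoint",
            {"loan_id": "L-123", "action": "whatsapp_outreach"})
assert audit_trail == dashboard and len(audit_trail) == 1
```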
Identity. Agents are not API keys behind a service account. They are first-class principals — separable, scoped, revocable. RBAC has to be granular enough that an agent can be authorised to read Customer 360 but not authorised to disburse. Maker-checker has to recognise an agent as the maker, with a human as the obligatory checker on any policy-crossing action. If your IAM was designed for human users and CRON jobs, it is not designed for what is coming.
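What "first-class principal" looks like in miniature. Scope strings and field names here are made up for illustration; the point is that the agent carries its own identity, its own scopes, and its own revocation switch, rather than hiding behind a shared service account.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Principal:
    """A principal in its own right: separately scoped, separately revocable."""
    principal_id: str
    kind: str                       # "human" | "agent"
    scopes: frozenset = field(default_factory=frozenset)
    revoked: bool = False

def authorise(principal: Principal, scope: str) -> bool:
    # Revocation is checked before scope: a revoked agent can do nothing.
    return not principal.revoked and scope in principal.scopes

collections_agent = Principal(
    principal_id="agent:collections-01",
    kind="agent",
    scopes=frozenset({"customer360:read", "notes:write"}),
)

assert authorise(collections_agent, "customer360:read")
assert not authorise(collections_agent, "disbursement:write")  # scoped out
```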
What this implies for procurement
If you are at a lender currently evaluating LMS or AI-servicing vendors, the question on most RFPs is the wrong one. “Do you support AI?” is a question every vendor will answer “yes” to. It does not separate the platforms that can carry the agentic load from the platforms that will collapse under it.
The questions that do separate them are uncomfortable to ask, but they are the right questions:
- What is your P99 read latency for a Customer 360 call under load? Show us the percentile distribution from a real production tenant.
- Is every write idempotent at the platform level, or is idempotency the integrating system’s responsibility?
- How does eventing fan out today — push, pull, batch? At what rate does it fall behind?
- Are agents first-class principals in your IAM, or do they piggy-back on a generic service account?
- What is the audit overhead, in milliseconds, per write? At 5–10× the call volume, this overhead is the new bottleneck.
A vendor that hesitates on any of these is telling you something important about whether they have done the same napkin math we just walked through.
The rails matter more than the locomotive
It is tempting to focus on the model — the agent itself, the prompts, the orchestration framework. We do not blame anyone for that focus. Models are the visible thing. Models are what the conference deck is about.
But the model is the locomotive. The 5–10× problem is a rails problem. You can put a faster locomotive on narrow-gauge track and watch it derail at the first curve. The locomotive does not care that the track was built for a different era. The track will care, and so will the cargo.
Lokta Core is the rebuild for the rails. The agent surface — underwriting copilot, collections, borrower, portfolio — is ready to ship under engagement scope, sequenced to the lender’s priorities. The foundations underneath it — canonical model, governed APIs, audit on every mutation, identity that scopes service principals, P99 latency that holds under sustained agentic load — are what we have been quietly putting in place for two years.
If you are a CTO sizing this architectural ask for your own platform, we would welcome the conversation. The math is the same wherever you build it. The hard part is admitting what the math implies.
Ashok Auty is the co-founder of Lokta and co-creator of Apache Fineract. He has spent two decades building lending infrastructure and the last nine months instrumenting what governed agentic operations actually cost a backend.