The Bottleneck Was Never the Code
Your engineering team uses AI every day. Your delivery velocity hasn’t moved. That is the correct outcome for what you bought.
In a regulated bank, writing new code is roughly 25-30% of an engineer’s actual workday. The rest is navigating the CAB submission process, writing compliance documentation, generating release artifacts, responding to audit questionnaires, attending architecture review boards, waiting for environment approvals, drafting incident postmortems — and the meeting tax that doesn’t appear on any productivity dashboard. Sprint planning. Discovery workshops. Design reviews. PI planning ceremonies. CAB meeting attendance. Architecture review board prep and attendance. Risk review meetings. Governance checkpoints. Decision gates with ten stakeholders who each have veto authority. A senior engineer at a mid-size regional bank can spend 20 hours a week in rooms before writing a single line of code.
You bought AI for that 25-30%. The other 70-75% is still fully manual. The CAB backlog is unchanged. The meeting calendar is unchanged.
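The arithmetic behind that claim fits in a few lines. This is a back-of-the-envelope sketch, not a measurement: the 25% coding share is the low end of the estimate above, while the function name and the speedup values are assumptions chosen purely for illustration.

```python
def overall_time_saved(coding_share: float, coding_speedup: float) -> float:
    """Fraction of the total workday saved when only the coding share gets faster.

    coding_share: fraction of the day spent writing code (0.25 in the example below).
    coding_speedup: hypothetical multiplier on coding speed (2.0 means twice as fast).
    """
    if not 0.0 <= coding_share <= 1.0:
        raise ValueError("coding_share must be between 0 and 1")
    if coding_speedup <= 0:
        raise ValueError("coding_speedup must be positive")
    # Non-coding work is untouched; only the coding slice shrinks.
    remaining = (1.0 - coding_share) + coding_share / coding_speedup
    return 1.0 - remaining


if __name__ == "__main__":
    for speedup in (1.5, 2.0, 10.0):  # illustrative values only
        saved = overall_time_saved(0.25, speedup)
        print(f"coding {speedup}x faster -> {saved:.1%} of the workday saved")
```

Under these assumptions, doubling coding speed saves 12.5% of the day, and even a tenfold speedup saves 22.5%. The ceiling is the coding share itself.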
The METR study found developers using AI finished tasks 19% slower while believing afterward that they had been 20% faster, a 39-point perception gap. (Before starting, they had forecast a 24% speedup.) Jellyfish analyzed 20 million pull requests and found that while 75% of engineers use AI tools, most organizations show no measurable delivery gains. These numbers don’t describe a broken tool. They describe a misaligned investment.
You Don’t Have One AI Problem
Your engineering department doesn’t have “an AI for engineering” problem. At a bank running regulated financial workloads, it has at minimum fourteen distinct workflows:
- Incident triage and postmortem documentation
- Change Advisory Board (CAB) submission prep and attendance
- Release notes generation
- Code review and quality analysis
- Security and SAST scanning
- Architecture decision records
- Tech debt scoring and prioritization
- Compliance questionnaire response
- Onboarding runbook execution
- Test generation
- Sprint planning, discovery workshops, design reviews, PI planning ceremonies
- Architecture review board prep and attendance
- Risk and governance checkpoint meetings
- Audit artifact generation
Each of those routes to a different investment decision: automate it, build something custom, buy a market solution, hire to fill the gap, or explicitly wait. When you license a platform to solve all fourteen, you optimize for the demo — which is always code generation — and ignore everything else.
If an accounts receivable team has eight AI problems, an engineering department inside a regulated bank has fourteen. Bundling all fourteen into a Copilot Enterprise contract is a vendor’s dream and a delivery team’s stagnation.
The Rule Already Written in Your Binders
The operating principle is this: don’t invest in a workflow you can’t describe. Inputs, outputs, exceptions, ownership. One sentence of plain English per workflow, or you’re not making an investment decision — you’re making a purchase.
Here’s the banking irony. You already have this documentation. Your CAB runbook. Your incident response playbook. Your SDLC policy. Your change management procedure. These are workflow descriptions written for examiners and auditors. Every input is captured. Every output is defined. Every exception is catalogued. Accountability is assigned on paper.
You use them to pass regulatory exams. You don’t use them to make AI investment decisions. You use the vendor demo. Then you wonder why the backlog didn’t move.
That inversion — compliance documentation sitting in binders while AI investment decisions get made off demo videos — is the institutional failure mode. You already have the spec. You’re just not reading it as a spec.
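One way to read a runbook as a spec is to capture each workflow as a small structured record and check whether any part of the description is still blank. The sketch below is a minimal illustration under that assumption. The class name, field names, and example values are hypothetical and not taken from any real runbook.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowSpec:
    """Hypothetical record for one workflow; field names are illustrative."""
    name: str
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)
    exceptions: list[str] = field(default_factory=list)
    owner: str = ""

    def missing_fields(self) -> list[str]:
        """Return the parts of the description that are still empty."""
        checks = {
            "inputs": self.inputs,
            "outputs": self.outputs,
            "exceptions": self.exceptions,
            "owner": self.owner.strip(),
        }
        return [label for label, value in checks.items() if not value]

    def is_investable(self) -> bool:
        """Ready for an investment decision only when fully described."""
        return not self.missing_fields()


cab_prep = WorkflowSpec(
    name="CAB submission prep",
    inputs=["change record", "risk assessment", "rollback plan"],
    outputs=["completed CAB submission document"],
    exceptions=["emergency change"],  # example value, not from a real runbook
    owner="release manager",          # example value
)
vague = WorkflowSpec(name="AI for engineering")

print(cab_prep.is_investable())  # True
print(vague.missing_fields())    # ['inputs', 'outputs', 'exceptions', 'owner']
```

A workflow that fails this check is a purchase waiting to happen, not an investment decision.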
Five Levers, Applied to Your Sprint Board
You really only have five options with any workflow: automate it, build something with AI, buy a solution, hire for it, or wait. The right answer changes per workflow. Here’s what each looks like inside a regulated bank engineering department.
Automate — when the routine case dominates and mistakes are cheap to catch
The clear wins in bank engineering are not code generation. They’re structured documents and meeting artifacts with high repetition and cheap verification:
- CAB submission documents. The inputs — change record, risk assessment, rollback plan — are defined. The output template is fixed. The pattern is identical across hundreds of changes per quarter. A bad output is caught in the CAB meeting, not in production.
- Release notes from commit history. Deterministic transformation, zero creativity required, saves a developer who wants to go home 30-45 minutes per release.
- Incident postmortem first drafts. Structured format, repeatable inputs from your incident management tooling, saves 2-3 hours per incident.
- Sprint planning input packages. Aggregate ticket history, velocity data, and dependency maps into a pre-structured briefing before the ceremony starts — so the 3-hour meeting doesn’t spend its first hour aligning on context.
- Discovery and design review pre-reads. Pull the relevant architecture diagrams, prior ADRs, API contracts, and risk assessments into a structured document before the meeting, so the 10 stakeholders spend the hour deciding, not searching.
- ARB prep packages. Synthesize the proposed design against existing standards, flag conflicts, pre-populate the review template. Cut the prep time from a day to an hour.
These are the workflows that should have been automated first. Nobody is doing them. Everyone is arguing about Copilot seat utilization instead.
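To make "deterministic transformation" concrete, here is a minimal sketch of the release notes case. It assumes commit subjects follow a Conventional Commits-style prefix such as "feat:" or "fix:", which your repositories may not. The function name, the heading mapping, and the sample commits are all hypothetical.

```python
from collections import defaultdict

# Assumed mapping from commit prefixes to release note headings.
SECTIONS = {"feat": "Features", "fix": "Fixes", "docs": "Documentation"}


def build_release_notes(version: str, commit_subjects: list[str]) -> str:
    """Group commit subject lines under headings by their prefix."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for subject in commit_subjects:
        prefix, sep, rest = subject.partition(":")
        kind = prefix.split("(")[0].strip().lower()  # drop an optional "(scope)"
        if sep and kind in SECTIONS:
            grouped[SECTIONS[kind]].append(rest.strip())
        else:
            # Anything without a recognized prefix is kept verbatim.
            grouped["Other"].append(subject.strip())

    lines = [f"Release {version}"]
    for heading in [*SECTIONS.values(), "Other"]:
        if grouped[heading]:
            lines.append("")
            lines.append(f"{heading}:")
            lines.extend(f"- {item}" for item in grouped[heading])
    return "\n".join(lines)


if __name__ == "__main__":
    subjects = [
        "feat(payments): add idempotency key to transfer endpoint",
        "fix: correct rounding in interest accrual job",
        "Bump base image",
    ]
    print(build_release_notes("2.4.0", subjects))
```

The subjects could come from something like `git log --pretty=format:%s` over the release range. The same input always produces the same notes, which is what makes a bad output cheap to spot.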
Don’t automate when the exception is where the value lives. A lot of bad AI demos show you the routine case. Production traffic turns out to be mostly exceptions. The accuracy number is terrible. Nobody lied. The buyer just bought the wrong thing.
Build — when the workflow is specific to your institution
Tech debt scoring against your risk model is not for sale. Regulatory criticality weighting (debt in the core banking layer versus debt in the mobile notification service) is your judgment, your risk appetite, your examiner exposure. No vendor builds this for your institution’s specific risk model. The same goes for architecture decision records aligned to your governance framework, your approved vendor catalog, your data classification tiers.
If you can’t explain it to a vendor in a 30-minute demo, and the workflow requires knowing your internal systems to do it well, you’re in build territory. The question is not “can we buy this” but “can we define what good looks like?” If you can’t define that, you don’t have a build mission. You have a construction project with no blueprint, which is exactly how you end up among the 88% of proofs of concept that, in Sol Rashidi’s experience, never reach production.
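As a rough illustration of what "defining good" could look like for the tech debt case, the sketch below weights each debt item by the regulatory criticality of the system it lives in. Every number here is invented: the weights, the system names, the impact scale, and the scoring formula are assumptions, and a real version would come from your own risk model.

```python
from dataclasses import dataclass

# Hypothetical criticality weights; a real table would come from the bank's risk model.
CRITICALITY_WEIGHT = {
    "core_banking": 5.0,
    "payments": 4.0,
    "mobile_notifications": 1.0,
}


@dataclass
class DebtItem:
    title: str
    system: str
    remediation_days: float  # estimated effort to fix
    impact: int              # engineering-assessed impact, 1 (low) to 5 (high)


def priority(item: DebtItem) -> float:
    """Weighted impact per day of remediation effort; higher means fix sooner."""
    weight = CRITICALITY_WEIGHT.get(item.system, 1.0)  # unlisted systems default to 1.0
    return weight * item.impact / max(item.remediation_days, 0.5)


backlog = [
    DebtItem("Flaky push retry logic", "mobile_notifications", remediation_days=2, impact=4),
    DebtItem("Unsupported TLS library", "core_banking", remediation_days=5, impact=4),
]
for item in sorted(backlog, key=priority, reverse=True):
    print(f"{priority(item):.1f}  {item.title}")
```

With these made-up inputs, the core banking item ranks first despite taking longer to fix, because the criticality weight outweighs the extra effort. The point is that the weights encode your judgment, which is why nobody can sell them to you.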
Buy — when the market is mature and the workflow is general
SAST scanners, SCA tools, code quality gates — the market is mature. Don’t build a vulnerability scanner. The “buy primitives” play works here: Copilot for autocomplete, SonarQube AI for quality gates, Snyk for dependency scanning. These are building blocks. Buy them. The rule is: buy the building block, own the workflow that wraps it.
In the age of agentic workflows, misalignment between how the vendor shaped the workflow and how your team actually works is more expensive to fix than it looks. A five-year contract for a tool category that will look different in 12 months is not a strategy. Prototype narrowly. Preserve the right to switch.
Hire — when the workflow requires standard-setting that doesn’t exist yet
“We need an AI engineering lead” is not a job description. Every interviewer reads it differently. The candidate pool is flooded with AI-generated resumes and people claiming skills they don’t have.
“We need someone to define what a production-safe agentic workflow looks like for regulated financial workloads, given our current SDLC maturity, our SOX boundary, and our model risk management obligations under SR 11-7” — that’s a job description. Without that specificity, you’re hunting a purple unicorn: domain expert in core banking, AI builder, systems architect, compliance fluent, change leader. That person exists sometimes. More often, the market clears while you figure out what you wanted.
Wait — when higher-priority workflows deserve your change management capacity first
Your CI/CD pipeline that’s been green for 18 months is not broken. Banking has finite change management capacity. The aircraft carrier doesn’t turn fast — that’s not a weakness to apologize for, it’s an operational constraint to sequence around. The highest-value workflows go first. Everything else gets an explicit queue position, not “let’s see if we get to it.”
Waiting is not doing nothing on AI. It means not starting at the bottom of the priority list just to say you’re doing AI.
The Map Your Vendor Won’t Draw
Two axes: workflow specificity (general → bank-specific) × market maturity (immature → mature). Four quadrants, four clear answers.

- General + Mature → Buy. Code review AI, SAST, Copilot autocomplete, standard help desk. The decision is done. Stop deliberating.
- General + Immature → Prototype or wait. AI sprint planning, AI retrospective analysis, AI standup summarization. The category is still defining itself. Don’t sign a three-year contract for something that will look completely different in 12 months.
- Bank-Specific + Mature Primitives → Buy building blocks, own the workflow. Your risk-aware change management pipeline. Your compliance-gate-integrated release process. The primitives exist. The assembled workflow specific to your institution doesn’t.
- Bank-Specific + Thin Market → Build. Tech debt prioritization against regulatory criticality. ADR alignment to examiner expectations. Audit artifact generation tied to your specific control framework. Nobody sells this. Build it and own the category.
Hiring cuts across all four quadrants. If nobody on your team can define what good looks like for a workflow — if it requires institutional trust, framing judgment, and standards that don’t exist yet — your next investment is a person, not a platform.
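The map reduces to a small decision function. The sketch below is a simplification that treats both axes as yes/no answers, where real workflows sit on a spectrum. The names `Lever`, `choose_lever`, and `can_define_good` are made up for illustration.

```python
from enum import Enum


class Lever(Enum):
    BUY = "buy"
    PROTOTYPE_OR_WAIT = "prototype or wait"
    BUY_BLOCKS_OWN_WORKFLOW = "buy building blocks, own the workflow"
    BUILD = "build"
    HIRE = "hire"


def choose_lever(bank_specific: bool, market_mature: bool, can_define_good: bool = True) -> Lever:
    """Map the two axes to a quadrant; hiring overrides when nobody can define 'good'.

    For bank-specific workflows, market_mature means mature primitives exist.
    """
    if not can_define_good:
        return Lever.HIRE
    if bank_specific:
        return Lever.BUY_BLOCKS_OWN_WORKFLOW if market_mature else Lever.BUILD
    return Lever.BUY if market_mature else Lever.PROTOTYPE_OR_WAIT


# Example workflows from the quadrants above: (bank_specific, market_mature)
examples = {
    "SAST scanning": (False, True),
    "AI standup summarization": (False, False),
    "Risk-aware change management pipeline": (True, True),
    "Tech debt prioritization against regulatory criticality": (True, False),
}
for workflow, (specific, mature) in examples.items():
    print(f"{workflow}: {choose_lever(specific, mature).value}")
```

The function is trivial on purpose. The hard part is answering the three questions honestly for each workflow, one at a time.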
What’s Actually Happening in the 88%
Sol Rashidi deployed AI in 200+ enterprise environments over 13 years. Across those deployments, 88% of proofs of concept never reached production, and 70% of failures were human and organizational, not technical. I wrote about the implications of that data in “Automating Chaos Produces Automated Chaos”. The short version is that the tool almost never fails. The organization around it does.
In bank engineering, the pattern is consistent: the platform was bought, not the workflow. The engineering team gets the tool. Nobody defines success criteria per workflow. Adoption is measured by seat utilization, not outcome. Six months later: 40% of seats active, no evidence that delivery velocity improved, vendor renewal approaching.
That’s not the Jellyfish paradox playing out as a statistic. That’s a specific decision: the CIO approved a category, not a workflow. The team built what the executive wanted. The executive didn’t know how to evaluate whether it was good. Nobody lied. The buyer just bought the wrong thing.
CodeRabbit’s analysis of 470 GitHub PRs found roughly 1.7× more issues in AI-generated pull requests than in human-written ones. That is the downstream consequence: AI generates at speed, no quality bar was defined, and issues compound into the review queue. Faster code generation feeding a clogged review pipeline is not a productivity gain. It’s a new bottleneck.
I’ve seen both failure modes. In “When Your Team Starts Building”, five engineers each independently hit the same wall: they automated their individual work but the team’s shared processes stayed manual. In “From Paperclip to Production”, the question isn’t whether open-source SDLC platforms work; it’s whether you’ve done the organizational work to absorb them. The technology is rarely the problem.
The Part I Can’t Make Easy
This framework is harder to apply in a bank than it sounds. Procurement cycles are measured in quarters. Vendor due diligence requires security assessments, data handling reviews, compliance sign-off. Sometimes “wait” is enforced by the institution before you decide it yourself.
The answer isn’t to bypass those controls. A workflow description — inputs, outputs, exceptions, success criteria, ownership — is exactly what makes those controls move faster. A procurement review that arrives with documented workflow specs and measurable outcomes clears faster than one that arrives with a demo and a pilot proposal. The compliance infrastructure you maintain to satisfy examiners is the specification document you’ve been ignoring.
This connects to the broader point in “Architecture Without Architects”: the artifacts you produce for regulatory reasons are also the artifacts that inform better technical decisions. You’re already doing the work. You’re just not applying it offensively.
Use it offensively.
Your engineering team doesn’t need an AI strategy. It needs a decision for each workflow — and the discipline to make them in the right sequence. The backlog will follow.
The organizational discipline required to make these decisions is the same one explored in “Automating Chaos Produces Automated Chaos”, which maps Sol Rashidi’s 200+ deployments to banking’s specific constraints.
Find me on X @orestesgarcia or LinkedIn.