Pattern · AI × Vertical × Data Moat · 16 min · Updated Apr 19, 2026

Vertical AI Wrapper SaaS — When Depth Beats Breadth

You rent the model. You own the data, the workflow, and the evals. That is where the moat lives.

Every founder building on Claude or GPT asks the same question. What stops a competitor from copying my product with the same API and the same prompts? This pattern is the answer. The four companies that crossed $100M in ARR between 2023 and 2026 — Harvey, Cursor, Glean, Perplexity — did not win by having a better model. They won with proprietary data the model did not have, workflows it could not replicate, and evals it could not pass without them. Meanwhile, hundreds of thin ChatGPT wrappers shipped in 2023 and died in 2024 when the API commoditized. This page names the difference between the two groups.

13 products observed · 4 succeeded · 3 partial / acquired · 6 failed / silent
Built from public data — not from founder blueprints
This pattern is extracted exclusively from publicly observable product outcomes (YC, Product Hunt, editorial coverage). If you generate a blueprint on PlanMySaaS, your idea stays private by default — never extracted, never aggregated.
What is this pattern, really?
Vertical AI Wrapper is a recipe — a strategy founders can adopt for their own SaaS idea. The 13 companies listed below are cooks who tried this recipe. Some made the dish work. Some burned it. The page shows you why.
Read this page as: "If I take this approach for my idea, here is the recipe, here is who tried it, here is what they learned, and here is the exact six-week order I should run." You are not reading a company biography. You are reading a recipe + a record of every cook who tried it. New to the concept? Read the "What is a Pattern?" primer →
Pattern DNA
The five invariants that define this pattern. Remove any one and the pattern collapses into something else.
01
The foundation model is rented, not trained
The product uses Claude, GPT, or Gemini through an API. The founder does not train a new base model or host their own. Training from scratch costs millions of dollars and rarely works for small teams in 2026. The moat sits on top of the model, not inside it. Every attempt to 'build our own LLM' in the 2023-2024 wave ran out of money before shipping a competitive product.
02
The vertical is narrow by design
The product serves one specific domain. Legal diligence. Radiology notes. JEE tutoring. Founder planning. Enterprise search inside one company. Not 'AI for everything'. Breadth kills depth. It also kills the unit economics this pattern depends on. The founders who stayed narrow for 18 to 24 months consistently outperformed the ones who expanded in year one.
03
Proprietary data is the actual moat
Hand-curated data the foundation model does not have. Past-year questions with verified step-by-step solutions. Court rulings tagged by doctrine. Medical terms with the precision the model lacks. This takes expert time, not compute power. That is why a competitor cannot copy it in a weekend — and that is what makes the moat real.
04
The interface is an opinionated workflow, not a chat box
The UI is the domain workflow. A document review pane. A code completion surface. A blueprint generator. A search-and-cite box. Free-text chat is a fallback, nothing more. A product that ships only a chat box is two clicks away from being replaced by ChatGPT itself. The workflow is what tells the user 'this is the domain' — not the model under the hood.
05
Evals exist before the UI does
An evaluation suite of 200 to 500 hand-scored domain examples is built before any user sees the product. The team can measure accuracy, hallucination rate, and quality on a rubric designed with a domain expert. When a new model version launches, the eval tells the team whether to upgrade, skip, or re-prompt. Founders who skip this step ship on vibes — and vibes break in week three. Every vertical AI winner we studied had evals from day one.
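What that eval harness looks like in practice is simpler than it sounds. Here is a minimal sketch — the JSONL path, the call_model stub, and the substring-match scorer are all assumptions to be replaced with your provider SDK and the rubric your domain expert designs:

```python
import json

def call_model(prompt: str) -> str:
    """Placeholder — wrap your actual provider SDK call here."""
    raise NotImplementedError

def score(expected: str, actual: str) -> float:
    """Score one answer. Start crude; replace with your expert rubric
    (partial credit, hallucination flags, citation checks, etc.)."""
    return 1.0 if expected.strip().lower() in actual.lower() else 0.0

def run_eval(path: str = "evals/gold_examples.jsonl") -> float:
    """Run the model over every hand-scored example; return mean accuracy."""
    scores = []
    with open(path) as f:
        for line in f:
            ex = json.loads(line)  # {"input": "...", "expected": "..."}
            scores.append(score(ex["expected"], call_model(ex["input"])))
    return sum(scores) / len(scores)

if __name__ == "__main__":
    print(f"Domain eval accuracy: {run_eval():.1%}")
```

Twenty lines is enough to start. The value is not the code — it is the 200 to 500 hand-scored examples the code runs over.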
Why this pattern wins — and where it breaks
The same wedge that produced the four successes also produced the six failures. The delta is in execution discipline.
Why it works
Foundation models commoditize faster than vertical data
Between 2023 and 2026, the price of GPT-4-class intelligence fell by roughly 95 percent. The model that cost ten dollars per million tokens in 2023 cost under fifty cents by 2026. Vertical data did not commoditize at the same rate. A legal corpus, a verified medical dataset, or a curated past-year-questions bank is as valuable today as when it was built — and each new verified entry widens the gap.
Users in a vertical trust specialists, not generalists
A partner at a law firm will not paste a confidential contract into general ChatGPT. She will pay two thousand dollars a month to Harvey. A radiologist will not dictate into a consumer AI. He will use Nabla or Abridge. The trust premium for a vertical-labelled product is real, measurable, and sticky.
Quality stops being a matter of opinion
In a generic wrapper, 'the answers are good' is a feeling. In a vertical AI product with a proper eval suite, it is a number the whole team can see on a dashboard. That shift — from opinion to measurement — changes everything downstream. Hiring decisions get easier. Model upgrades get safer. Marketing claims become defensible. Investor conversations stop being hand-wavy. The eval suite is not just an engineering tool; it is an organizational alignment tool.
Price per outcome beats price per seat
Harvey charges per matter and per user seat. Cursor charges per developer per month. Vertical winners price against the business value delivered — a contract drafted, a bug shipped, a lesson completed — not against seats alone. That alignment keeps willingness-to-pay healthy even when the underlying model cost falls.
Unit economics improve with scale, not degrade
The top two thousand requests in every vertical follow a long-tail distribution. Caching them, routing easy queries to small models, and reserving large models for genuinely hard cases drives marginal cost toward zero. Founders who architect this from month one end up with forty to seventy percent gross margins; founders who assume cost scales linearly hit a wall by month twelve.
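The cache-plus-routing architecture is a few dozen lines at its core. A minimal sketch, assuming a call_model stub, an in-memory dict standing in for Redis, and a crude length-based difficulty gate — every one of those is a placeholder for your real provider call and a tuned classifier:

```python
import hashlib

_cache: dict[str, str] = {}   # swap for Redis or similar in production
_hits = _total = 0

def is_easy(query: str) -> bool:
    """Crude difficulty gate; in practice a tuned heuristic or classifier."""
    return len(query) < 200

def call_model(model: str, query: str) -> str:
    """Placeholder — wrap your actual provider SDK call here."""
    raise NotImplementedError

def answer(query: str) -> str:
    """Serve from cache first; route easy queries to a small model,
    reserving the large model for genuinely hard cases."""
    global _hits, _total
    _total += 1
    key = hashlib.sha256(query.strip().lower().encode()).hexdigest()
    if key in _cache:
        _hits += 1
        return _cache[key]
    model = "small-model" if is_easy(query) else "large-model"
    _cache[key] = call_model(model, query)
    return _cache[key]

def cache_hit_rate() -> float:
    """The metric the dashboard section below says to watch weekly."""
    return _hits / _total if _total else 0.0
```

The hit-rate counter is not decoration — it feeds the single most important unit-economics metric on this page.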
Why it fails
Wrapping ChatGPT with a new UI and calling it a product
The most common failure mode of 2023 and 2024. A founder builds a slick interface, writes a clever prompt, charges twenty dollars a month. The moat is zero. When OpenAI ships a native feature, a competitor clones the prompt in a weekend, or the user tries the free tier directly, the product evaporates. Dozens of AI resume builders, AI email assistants, and AI copywriters died this way.
Treating the model's output as ground truth
Products that ship without evals cannot tell when a model version changes behavior. Users notice drift before the founder does. In verticals where accuracy matters — legal, medical, financial, educational — a single high-profile hallucination kills trust in a week. You cannot recover. Evals are not nice to have; they are the product's insurance policy.
Generic scope creep toward 'AI for everything'
A successful JEE tutor tries to add NEET, then CAT, then classroom management. A successful legal tool tries to add HR policy. Each expansion halves the curation pace and dilutes positioning. Winners stay narrow for eighteen to thirty-six months before adjacent moves. Losers spread out in year one and lose their wedge.
Per-seat pricing against high-value outcomes
A lawyer generates value measured in thousands of dollars per hour. Charging that lawyer twenty dollars a month for AI assistance leaves ninety percent of the value on the table. Vertical winners price against the outcome — matter, document, lesson, feature — not against seats. Founders who anchor on consumer SaaS pricing in high-value B2B verticals underprice themselves for years.
Single-provider dependency
OpenAI changed pricing four times between 2023 and 2026. Anthropic deprecated older model variants. Gemini added and removed capabilities. Products locked to one provider without an abstraction layer suffered margin shocks and capability gaps. Route through an abstraction, keep a second vendor on standby, and cache aggressively.
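The abstraction layer does not need to be elaborate. A minimal sketch, assuming hypothetical call_primary and call_standby wrappers around whichever two SDKs you pick — the point is that the rest of the codebase imports one function, so swapping providers is a one-line change:

```python
class AllProvidersFailed(Exception):
    pass

def call_primary(prompt: str) -> str:      # e.g. your Anthropic wrapper
    raise NotImplementedError

def call_standby(prompt: str) -> str:      # e.g. your OpenAI wrapper
    raise NotImplementedError

PROVIDERS = [call_primary, call_standby]   # order = routing preference

def complete(prompt: str) -> str:
    """Single entry point the rest of the codebase imports.
    Tries each provider in order; the fallback fires automatically."""
    errors = []
    for provider in PROVIDERS:
        try:
            return provider(prompt)
        except Exception as err:
            errors.append(err)
    raise AllProvidersFailed(errors)
```

Test the standby path monthly, as above — a fallback that has never fired in anger will not work in an actual incident.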
Unit economics ladder
This is where most teams lose. Every row below is a lever you can actually pull; the LLM cost ceiling is the line you cannot cross.
Unit-Economics Ladder — per paying user per month:
  • Plan price (startup B2B): Rs 4,000
  • After GST removal (18%): Rs 3,390
  • After payment fees (~2-3%): Rs 3,305
  • LLM cost ceiling (well-routed): Rs 600
  • Vector DB + infra + storage: Rs 200
  • Observability + evals tooling: Rs 100
  • Gross margin floor (disciplined): Rs 2,405

The B2B startup tier is shown here for illustration — enterprise contracts range from Rs 20,000 to Rs 5,00,000 per month with proportionally larger margins. Consumer tiers (like Rs 299-999) are possible but require obsessive caching and small-model routing to stay profitable. The lever that matters most is the cache hit rate: above 40 percent, the gross margin is healthy at any price point; below 25 percent, no price point saves you.
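The ladder above is plain arithmetic, reproduced here as a sketch so you can swap in your own tier price and cost ceilings (the 2.5% fee figure is the midpoint of the ~2-3% range above):

```python
plan_price = 4000                        # Rs, startup B2B tier
after_gst  = plan_price / 1.18           # remove 18% GST   -> ~3390
after_fees = after_gst * (1 - 0.025)     # ~2.5% payment fees -> ~3305
costs = 600 + 200 + 100                  # LLM ceiling + infra + evals tooling
margin = after_fees - costs
print(f"Gross margin floor: Rs {margin:.0f} ({margin / plan_price:.0%} of list)")
# -> Gross margin floor: Rs 2405 (60% of list)
```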

Deep dive
Why 2024 ended the thin-wrapper era — and what replaced it
The clearest story in AI SaaS between 2022 and 2026 is the collapse of thin-wrapper pricing power and the rise of vertical-depth moats. The curve that defined the window is the foundation-model cost curve itself.

GPT-4 cost roughly $30 per million tokens when it launched in March 2023. By late 2026, the same power costs under 50 cents. That is a 98% price drop in under four years. Any founder who priced their product assuming LLM cost would stay flat was wrong from day one. Most 2023 AI wrappers priced this way. That is why most of them are gone.

The price collapse did two things. It opened the market — millions of small businesses and solo users could now afford AI features. It also erased the obvious moat of 2022-23, which was 'we have access to GPT-4'. Everyone had access. The wrappers that survived did one of two things. They moved up-market into verticals where the outcome was valuable (legal drafting, clinical notes, enterprise search). Or they moved down-market into volume businesses with strict cost discipline (consumer subscriptions under a dollar a day, cached heavily).

The survivors share three traits. First, they spent the first two months curating proprietary data — hand-labelled examples, domain-expert reviews, workflow-specific edge cases. Second, they built evaluation suites before they built the UI. Third, they priced against the value of the outcome, not the cost of the tokens. Harvey charges per matter. Cursor charges per developer. Glean charges per enterprise seat in six figures. None of these prices track token prices.

The losers made the opposite choices. They treated the model as the product. They priced against consumer SaaS norms, often $20 a month. They leaned on the cleverness of a single prompt that leaked the week it was released. When OpenAI shipped native memory, custom GPTs, or deeper Claude artifacts, the wrappers built on those exact features disappeared without a funeral. The lesson is not that wrappers are bad. The lesson is that thin wrappers are bad. A vertical wrapper — with data, evals, and workflow — is exactly the layer the market rewarded.

Looking into 2027 and beyond, the pattern continues to strengthen. Foundation-model providers are shifting toward platform business models — Anthropic's Skills, OpenAI's GPT Store, Google's Agent Builder. Each move commoditizes the horizontal AI layer further. Vertical AI companies now have two moats to build. One is classical data and workflow depth. The other is the strategic choice to be a specialist a platform cannot economically replicate. For Indian founders, the opportunity is especially sharp. Exam prep, legal aid, agricultural advisory, and small-business accounting are large, underserved, and linguistically complex enough that no global platform will build them first. The window is wide open.

Outcome distribution in the public sample
Read this as a shape signal, not a probability. Founder execution is still the dominant variable — the pattern only tells you what most people missed.
Public Sample — 13 products using this pattern. Outcomes read from public coverage (YC, Product Hunt, Inc42, YourStory). Directional only.
  • Succeeded: 4 of 13 (31%)
  • Partial / acquired: 3 of 13 (23%)
  • Failed / silent: 6 of 13 (46%)
Win rate is directional — founder execution remains the dominant variable.
Founders who tried this recipe
These companies adopted the strategy described above. Some made the dish work, some burned it. The "what worked" and "what missed" columns are the shortest honest summary of each cook's experience — read them as lessons, not as histories.
Product
Outcome
What worked
What missed
Harvey
Succeeded
Narrow legal focus + law-firm specific workflows + enterprise contracts in the Rs 40-200 lakh annual range + raised at $1.5B valuation by 2024
Enterprise-only positioning leaves the solo-lawyer and small-firm segment to competitors; consumer-legal still unclaimed
Cursor (Anysphere)
Succeeded
Deep IDE-native integration + VS Code fork for full UX control + 100M+ ARR by 2024 + developer-love moat that GitHub Copilot struggles to match
Compute cost heavy; racing Microsoft, GitHub, and every VC-funded coding-assistant startup in parallel
Glean
Succeeded
Enterprise search vertical; connectors to Google Drive, Slack, Notion, Jira; $2.2B valuation by 2024 + proprietary permissions and ranking layer the LLM alone could not build
Heavy enterprise sales cycle; hard for solo founders to replicate the distribution moat without a big go-to-market team
Perplexity
Succeeded
AI-native search experience + source citation by default + $500M+ valuation and strong consumer retention; an alternative to Google for research-heavy use cases
Monetization path still evolving; advertising versus subscription tension ongoing; distribution cost against Google remains the open question
Jasper
Partial
Early winner of AI copy wave; crossed $125M ARR by 2022; built a real brand and a paid community before ChatGPT existed
ChatGPT commoditized the core use case in 2023; reported major churn and valuation cut; still active but the moat turned out to be narrower than the pricing assumed
Copy.ai
Partial
Similar early trajectory to Jasper; pivoted toward workflow automation and B2B after 2023 to survive the generic-copy commoditization
Struggled to re-establish clear differentiation post-ChatGPT; pivot is still in progress
PlanMySaaS
Active
Narrow vertical (founder planning) + proprietary 8-stage pipeline + pattern library as compounding moat; see this very page as evidence the architecture is self-aware
Still early; distribution and founder awareness are the current bottleneck; category education remains an ongoing cost
JeeBolo (sibling pattern, voice-first)
Active
Narrow vertical (JEE aspirants) + voice-first + proprietary past-year-question bank + Hinglish STT tuning as a deep data moat
Early stage; sees this pattern as one of two active DNA threads (vertical + voice-first) it must balance
Generic 'AI resume builder' wrappers (2023-2024)
Failed
Low launch friction; rode the AI hype wave to initial traction in months one to three
No moat beyond the prompt; LinkedIn and OpenAI shipped free alternatives; most were dead or abandoned by mid-2024
Generic 'AI email assistant' wrappers (2023-2024)
Failed
Clear consumer pain point; initial Product Hunt traction was real
Gmail's native Smart Compose and later Gemini integration crushed the use case; no proprietary data to defend against platform-native features
Generic 'AI note-taker' wrappers (2023-2024)
Failed
Active user bases in first year; many crossed ten thousand users
Otter, Fireflies, and Notion's native AI commoditized the category; transcript-plus-summary is not a defensible moat without deeper vertical data
Several ChatGPT-clone chat apps (2023)
Failed
Launched within days of GPT-3.5 public access; rode search traffic for 'ChatGPT alternative'
OpenAI made ChatGPT free; Google and Anthropic released direct consumer products; the wrapper layer disappeared
Elicit (research AI)
Partial
Narrow academic-research vertical; proprietary paper corpus; strong retention among PhD students and researchers
Monetization difficult in academic market; pricing resistance remains the core challenge; product quality is not the issue
When to use this pattern — and when not to
A short sanity-check before you commit four months. If you match more of the right column than the left, pick a different pattern.
Use when
  • You have real depth or privileged access in one specific domain — a profession, a geography, an exam, a workflow
  • Users in that domain pay willingly for specialist tools today (existing paid SaaS, consultants, training courses)
  • You can curate or source proprietary data the foundation model does not have
  • The workflow has a clear measurable outcome — a contract, a lesson, a ticket resolved — that can anchor per-outcome pricing
  • You are comfortable staying narrow for eighteen to twenty-four months before considering adjacent verticals
Do not use when
  • The idea is 'AI for everything' or 'a better ChatGPT' — no vertical depth, no defensible moat
  • The only differentiation is the prompt — prompts are not defensible, they leak in a week
  • The target is consumer impulse purchase at under fifty rupees a month — foundation-model cost will eat margin
  • The foundation-model provider (OpenAI, Anthropic, Google) is likely to ship a native feature that matches your product in the next twelve months
  • You cannot invest four to eight weeks in upfront data curation and eval setup before shipping
Anti-patterns · Self-diagnostic
Red flags to check in your own product
Each anti-pattern below is a specific mistake founders in this pattern repeat. If the symptom matches your product, act on the fix immediately — these compound in cost every week they go uncorrected.
The thin wrapper trap
Symptom
You can describe your entire product in one prompt. The user could get the same result by pasting the prompt into ChatGPT directly.
Why it hurts
There is no moat. The moment OpenAI or Anthropic ships a native feature matching your product, you disappear. Pricing power evaporates quickly as users realize the substitute.
Fix
Add proprietary data the model does not have, add evaluation suites that define quality, and wrap the interaction in a workflow that shapes toward your domain outcome.
Prompt-as-the-product
Symptom
The marketing page highlights 'powered by GPT-4' or 'built on Claude'. The differentiation story is the model.
Why it hurts
Customers do not buy access to the model — they have their own. What they buy is your data, your workflow, and your reliability. Leading with the model commoditizes your own positioning.
Fix
Lead with the outcome — 'draft an M&A side letter in thirty seconds', 'answer a JEE doubt in ten'. Model names belong in engineering changelogs, not on landing pages.
Demo-driven development
Symptom
The product looks great in a ninety-second demo and collapses in the first week of real use. Hallucinations surface; edge cases explode.
Why it hurts
Without evals, the founder has no systematic way to know when quality drifted. Users notice before the team does; trust erodes silently.
Fix
Build an eval set of at least 200 real-world examples in the first month. Run it weekly. Publish the accuracy number internally every Monday. Halt feature work when it regresses.
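A sketch of what "halt feature work when it regresses" can mean mechanically — the 2-point tolerance is an assumption to tune, and the gate assumes you already record a baseline score:

```python
def eval_gate(current: float, baseline: float, tolerance: float = 0.02) -> None:
    """Run after the weekly eval; halts the pipeline on regression."""
    if current < baseline - tolerance:
        raise SystemExit(
            f"Eval regressed: {current:.1%} vs baseline {baseline:.1%}. "
            "Halt feature work and investigate before shipping."
        )
    print(f"Eval OK: {current:.1%} (baseline {baseline:.1%})")
```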
Consumer pricing in B2B verticals
Symptom
The product serves a lawyer or doctor or CFO, but charges twenty dollars a month because that is the SaaS norm.
Why it hurts
The user's hourly rate is a hundred to a thousand times the subscription. You leave nine-tenths of the economic value on the table and cannot fund the depth that makes the product defensible.
Fix
Anchor pricing against the user's hourly rate or per-outcome value. If a lawyer saves an hour, charge meaningful money. Enterprise pricing is not arrogance; it is arithmetic.
Horizontal expansion before vertical saturation
Symptom
The team is already adding a second vertical in month four. The first vertical is still not dominant.
Why it hurts
Each new vertical halves your curation velocity and dilutes the positioning. The pattern winners stayed narrow for eighteen to thirty-six months. Premature breadth is the most common startup death.
Fix
Write down the specific vertical in one sentence. Ship only what moves its metrics. Adjacent verticals are a next-year conversation, not a next-quarter one.
Single-provider lock-in
Symptom
The product hard-codes one model from one provider. A price or availability change would require a week of engineering.
Why it hurts
Foundation-model providers have changed pricing, deprecated models, and removed capabilities without much notice. Lock-in is a compounding risk that shows up exactly when you cannot afford it.
Fix
Route through an abstraction layer from day one. Keep one primary and one fallback provider live. Cache aggressively. Test the fallback monthly so it works in an actual incident.
Ignoring hallucination cost in regulated verticals
Symptom
The product ships in legal, medical, or financial without a human-in-the-loop review step. 'The model is usually right' is the operating principle.
Why it hurts
Usually is not a number. In regulated verticals, a single public hallucination triggers reputational and sometimes legal harm. You do not recover commercial trust easily.
Fix
Require a human-review step for high-stakes outputs until your eval shows domain-specific accuracy above an agreed threshold. Publish the number. Be the product that never embarrasses its user.
Treating fundraising as product-market fit
Symptom
The team celebrates a funding round before fifty paying customers exist. Hiring accelerates; burn expands.
Why it hurts
AI has attracted a lot of capital and a lot of capital theatre. A round is not a contract with users. Burn without PMF shortens the runway to the cliff. Several 2023 AI wrappers raised at unicorn valuations and shut down within eighteen months.
Fix
Define PMF in terms of paid retention, not total users or funding. Keep the team small until retention proves the wedge. Funding expands good businesses; it does not create them.
Same DNA, different domains
This pattern has at least eight viable verticals. Once you ship in one, about 60% of the blueprint carries over to the next — new persona, new retrieval corpus, same core loop.
Variant 01
Legal diligence and contract review
Clause-level classifier + precedent citation + law-firm workflow integration
Rs 40,000-2,00,000 per month per firm; per-matter add-ons
Variant 02
Clinical notes for doctors
Voice capture + SOAP-format structuring + EHR integration + specialty-specific templates
Rs 3,000-8,000 per doctor per month
Variant 03
Sales call intelligence
Call recording + objection classifier + deal-stage predictor + CRM writeback
Rs 2,500-6,000 per rep per month
Variant 04
Developer coding assistant
IDE-native + codebase-aware completions + repository context + refactor suggestions
Rs 1,500-3,000 per developer per month
Variant 05
Exam-prep tutoring (JEE, NEET, CAT, CLAT)
Past-year-question grounding + step-by-step explanations + vernacular support (see voice-first-vernacular-saas for the India-specific cousin)
Rs 99-999 per month depending on exam tier
Variant 06
SaaS founder planning
Structured multi-stage pipeline + pattern library + domain-verified examples (yes, PlanMySaaS itself)
Rs 0-2,000 per month, credit-metered
Variant 07
Academic research synthesis
Paper corpus + citation tracking + hypothesis generation + systematic-review workflow
Rs 500-3,000 per month (researcher); Rs 20,000+ (institutional)
Variant 08
Customer support agent (autonomous)
Knowledge-base grounded + ticket-aware + escalation-respecting + analytics dashboard
Rs 20-80 per resolved conversation or Rs 30,000+ per month
Six-week founder playbook
The exact order in which the four successful products validated the wedge before building product surface area. Run this once, week by week, before you commit to the full blueprint.
01
Week 1 — Pick a vertical sharp enough to name in one sentence
Not 'AI for lawyers' — 'AI that drafts M&A side letters for mid-market law firms'. Not 'AI for doctors' — 'AI that generates SOAP notes for Indian pediatricians in Hindi'. The sharper the sentence, the stronger the wedge. If you cannot say it in one sentence, the vertical is too broad.
02
Week 2 — Curate 500 gold-standard examples by hand
Before writing any code beyond a starter. Real inputs and real outputs from your chosen vertical, verified by a domain expert. These examples become your eval set, your few-shot prompt library, and your onboarding material in one. Solo founders often skip this step and regret it by week ten.
03
Week 3 — Build evals before you build UI
Run your chosen foundation model on the 500 examples. Measure accuracy, hallucination rate, and domain-specific quality with a rubric you designed with the expert. Establish the baseline number. This becomes your North Star metric and the gate for every future model change, prompt change, or feature addition.
04
Week 4 — Ship an opinionated workflow, never a chat box
The UI should shape the interaction toward the outcome. A contract review tool shows clauses in a left pane with a comment thread on the right — not a chat window. A tutor shows a question + streaming solution + citation chip — not a text input. A chat box says 'this is a ChatGPT clone'. A workflow says 'this is the domain'.
05
Week 5 — Price per outcome or per high-value seat
If the user pays the product instead of a human specialist, anchor pricing against the specialist's rate, not consumer SaaS benchmarks. Lawyers charge by the hour; price against the hour saved. Doctors charge by the visit; price against the visit documented. Founders who anchor on consumer pricing leave eight to ten times margin on the table.
06
Week 6 — Run the eval weekly and publish the number internally
Models change. Prompts drift. New features introduce regressions. The team should see the accuracy number every Monday. Any drop below a threshold halts feature work until it is investigated. This single discipline is what keeps vertical AI companies from becoming the next cautionary tale.
Dashboard · What to measure
Metrics to track weekly
The scoreboard for this pattern. Publish these numbers internally every Monday. Any drop below target triggers investigation, not feature work.
Metric
Weekly domain-eval score
Target
85%+ accuracy on a 200-example eval set, stable or rising week over week
Why it matters
This is the single most important health metric. It tells you whether the product is getting better, staying the same, or silently drifting. Publish it every Monday.
Metric
Cache hit rate
Target
40%+ by week twelve
Why it matters
The top two thousand queries in any vertical follow a long-tail distribution. Caching them is what makes unit economics work at scale. If this number is below 25%, no pricing saves you.
Metric
Cost per active user per month (LLM + STT + infra)
Target
Less than 15% of gross revenue per user
Why it matters
Unit-economics health in one number. If this crosses 20%, you are one LLM price change away from unprofitability. Track it weekly and route small models aggressively.
Metric
Thumbs-down rate on answers
Target
Under 15% of rated interactions
Why it matters
Leading indicator of trust erosion. When it climbs, the eval score is about to fall. Triage thumbs-downs manually for the first six months — every one contains signal.
Metric
Time-to-outcome
Target
Under 10 seconds for common queries, under 30 seconds for complex ones
Why it matters
Latency kills retention faster than accuracy does. Users who wait longer than 10 seconds stop using the product even if the answer is correct.
Metric
Monthly retention (D30)
Target
35% or higher for consumer; 85% or higher for B2B
Why it matters
The ultimate scoreboard. Everything else leads to this. Below target for three months running, the wedge is wrong; pivot or kill.
Metric
Gross margin per plan tier
Target
60%+ at startup tier, 80%+ at enterprise tier
Why it matters
Patterns that fall below these thresholds do not raise growth capital on acceptable terms. Rebuild the cost structure before going to market.
Glossary
Terms used on this page
New to the category? These are the seven terms that appear throughout the pattern. Read them once and the rest of the page is faster to scan.
Foundation model
A large pre-trained AI model provided via API — Claude, GPT, Gemini, Llama. Vertical AI wrappers rent these rather than train their own.
Eval (evaluation suite)
A set of hand-scored domain-specific input-output examples used to measure product quality. Running the eval weekly is the standard practice for catching regressions.
RAG (retrieval-augmented generation)
A technique where proprietary documents are retrieved at query time and fed to the model as context. Most vertical AI wrappers rely on RAG heavily, not on fine-tuning.
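A toy sketch of the loop, with keyword-overlap retrieval standing in for embeddings and a vector DB, and call_model as a placeholder for your provider call:

```python
def call_model(prompt: str) -> str:
    raise NotImplementedError            # your provider call

def retrieve(query: str, corpus: list[dict], k: int = 3) -> list[dict]:
    """Toy keyword overlap; production systems use embeddings + a vector DB."""
    terms = set(query.lower().split())
    ranked = sorted(corpus, key=lambda d: -len(terms & set(d["text"].lower().split())))
    return ranked[:k]

def rag_answer(query: str, corpus: list[dict]) -> str:
    context = "\n\n".join(d["text"] for d in retrieve(query, corpus))
    return call_model(
        f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    )
```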
Moat
The defensible barrier a competitor must overcome to replicate your product. In vertical AI, the moat is typically data curation, domain evals, and workflow depth — not the model itself.
Workflow UI
An opinionated user interface shaped around the domain outcome — a document review pane, a coding surface, a lesson player — as opposed to a generic chat box.
Prompt engineering
The craft of designing the instructions sent to the foundation model. Prompts alone are not a moat — they leak in a week — but disciplined prompt engineering is a real skill that underlies quality.
Per-outcome pricing
Billing anchored to the business value produced — per contract drafted, per bug fixed, per lesson completed — rather than per seat or per token.
Generate a blueprint on this pattern

Describe your idea. We will ground it in this pattern.

The blueprint wizard will inherit the constraints on this page — evals before UI, proprietary data curation in the first two weeks, caching-first architecture, per-outcome pricing — and flag them in the product-analysis stage.

Get Started Free
100 free credits. No card. Your blueprint stays private.
Related patterns
Founders who study this pattern usually need one of these next. Some combine directly with it; others are the retention mechanism it depends on.
Voice-First Vernacular Micro-SaaS
The India-specific variant of this pattern — same data-moat logic, applied through voice and vernacular instead of desktop workflow
WhatsApp-Native SaaS
A distribution layer that often sits on top of this pattern — many vertical AI products ship a WhatsApp channel as a lightweight entry point
Frequently asked questions
Answers to the questions founders raise after reading a pattern page. Also indexed as structured data for search engines.
Can I really compete against OpenAI or Google with a wrapper?
Not with a thin wrapper. With a vertical AI wrapper that has proprietary data, domain evals, and an opinionated workflow — yes. Harvey, Cursor, and Glean all run on foundation models that anyone can access, and each crossed a hundred million dollars in ARR because what they built on top of the model is what competitors cannot quickly copy.
What is the difference between a wrapper and a vertical AI company?
A wrapper is a thin UI over a prompt. A vertical AI company has three things a wrapper does not — curated proprietary data, a domain evaluation suite, and a workflow shaped around a specific user and outcome. The moat is in those three layers, not in the model underneath.
Do I need to fine-tune my own model eventually?
Probably not. Most successful vertical AI companies through 2026 did not fine-tune base models — they invested in retrieval over proprietary data, prompt engineering, and evaluation. Fine-tuning is an option later, when a specific behavior cannot be prompted cheaply and you have tens of thousands of real interactions to learn from. Start without it.
How do I build evals as a solo founder without a data team?
Hand-curate 100 to 500 domain-specific examples with one expert. Build a simple scoring rubric in a spreadsheet. Run your model on them weekly and record the numbers. This is not a technology problem — it is a discipline problem. The founders who keep the spreadsheet alive end up with the vertical; the ones who skip it end up with noise they cannot diagnose.
Does this pattern work at low consumer prices like Rs 99 per month?
Yes, but the unit economics get tight. Cache aggressively — the top two thousand queries usually cover eighty percent of volume. Route easy questions to small models and reserve large models for genuinely complex queries. If the cache hit rate is above forty percent and the LLM cost stays under twenty-five rupees per active user per month, this pattern scales at Rs 99. Below that, the math breaks.
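To make the threshold concrete, here is the arithmetic under illustrative assumptions — the query volume, per-query costs, and 80/20 routing split are all placeholders for your own numbers:

```python
price     = 99      # Rs per month, consumer tier
queries   = 300     # assumed monthly queries per active user
cache_hit = 0.40    # the target named above
rs_small, rs_large = 0.05, 0.50   # assumed Rs per uncached query by route
uncached = queries * (1 - cache_hit)
llm_cost = uncached * (0.8 * rs_small + 0.2 * rs_large)  # 80/20 routing split
print(f"LLM cost per active user: Rs {llm_cost:.0f}/month")  # ~Rs 25
```

At these assumptions the cost lands right at the Rs 25 threshold; drop the cache hit rate to 25% and it climbs past Rs 31, and the math breaks exactly as the answer warns.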
What happens when OpenAI ships a feature that matches my product?
This is the single most common failure mode. The defense is vertical depth — proprietary data and workflow that the foundation model does not have. If OpenAI shipping a native 'chat with your data' feature kills your product, your moat was only distance, not depth. Move toward a vertical where the data and workflow have genuine complexity the platform is unlikely to invest in.
How much proprietary data do I really need before shipping?
Between 500 and 2,000 hand-verified examples for most verticals. This is not a theoretical number — it is the threshold where evals become statistically useful and where prompt engineering with retrieval starts producing stable quality. You can ship earlier with fewer, but you will iterate blind until you cross this bar.
Should I hire a domain expert full-time or work with a contractor?
Contractor first, full-time later. A two-to-three-month contractor relationship with a practitioner in your vertical lets you ingest the first 1,000-2,000 verified examples without the cost of a full salary and before you know whether the wedge will hold. Hire full-time after the evals stabilize and paid retention clears month three — that is when the hire pays for itself.
How do I explain the moat to investors who say 'isn't this just a ChatGPT wrapper'?
Show them three things in ninety seconds. First, the eval score on your domain with and without your data layer — the gap is the moat. Second, a side-by-side output of your product and raw ChatGPT on a hard vertical query — the quality gap is visceral. Third, the per-outcome pricing and the gross margin math — the business model gap is the financial moat. Investors who still think it is a wrapper after those three proofs are not your investors.
What about compliance — HIPAA in medical, SOC 2 in enterprise, DPDP in India?
Compliance is a moat in itself for regulated verticals. Most thin wrappers never ship compliance because it takes three to six months and real effort. If your vertical is medical, legal, financial, or enterprise, compliance becomes a deepening of the wedge rather than a cost. Budget it as a feature, not an overhead.
Is it smarter to run multiple providers (Claude + GPT + Gemini) or commit to one?
Abstract through a router from day one, but default to one primary with one standby live in production. Multi-provider routing is a real edge when the cost difference matters — you can route easy queries to the cheaper provider and reserve the premium model for hard cases. The discipline is the abstraction; the complexity is kept inside the router, not spread across the codebase.
Sources and transparency
Every claim on this page points back to a public source you can open and read yourself. No opt-in or paid founder blueprint is used to build this library.
Public sources used
  • YC public batch directory entries W23, S23, W24, S24 for AI vertical wrappers
  • Public ARR disclosures from Harvey, Cursor (Anysphere), Glean, Perplexity, Jasper
  • Contrary Research and SaaStr reports on vertical AI 2023-2026
  • TechCrunch and The Information coverage of AI SaaS shutdowns and pivots
  • Product Hunt AI category launches 2023-2026 (observed outcomes only)
  • Disclosed pivot announcements from Jasper, Copy.ai, and other early-wave AI companies
Found a source we missed or a claim that needs sharpening? This page updates as new public evidence appears. If you know a company that adopted this pattern and was not listed above, or if a claim here no longer matches 2026 reality, drop a note from the contact page. We read every correction.
Back to Pattern Library