Case study · accessibility SaaS · CloverX Limited

WCAG 2.2 launched mid-build.
We shipped for it from day one, for free.

A free accessibility platform that catches what most free tools miss: a dual-engine scanner, contrast checker, interactive WCAG 2.2 checklist, and a personal dashboard. I built it solo across design, React, and Node.js at CloverX Limited, the agency behind the product. It's live in beta at a11y.cloverxlimited.com.

Role: Designer + Full-stack Engineer · solo across the stack
Agency: CloverX Limited (where I built this)
Stack: React · Node.js · axe-core · HTML_CodeSniffer
Status: Live · beta · a11y.cloverxlimited.com

The problem

The accessibility tooling market is split, and neither side serves builders

In 2023, accessibility tooling was split into two camps with nothing in between. Free scanners were noisy and surface-level. They'd flag missing alt text, basic contrast, maybe a heading-order issue, and then stop. Paid tools started at $500 a month and assumed you had a dedicated a11y team to read eighty-page audit reports. Builders shipping fast on solo or small teams sat in the gap. They knew they needed accessibility but couldn't afford the serious tools or trust the free ones.

Then WCAG 2.2 landed in October 2023 and the whole market lagged. A year after the spec dropped, most free tools were still scanning 2.1 criteria, and paid tools quietly bumped their pricing. The gap only got wider. CloverX a11y was built to close it: free, serious, and on WCAG 2.2 from day one.

Market gap

"Free tools miss what actually fails."

Most free scanners check alt-text and basic contrast, then stop. Focus management, keyboard nav, ARIA semantics, and dynamic content stay invisible, so builders ship with false confidence.

Pricing wall

"Paid tools start at $500 a month."

Serious accessibility tooling assumes a dedicated a11y team and a four-figure budget. Small teams and solo builders are stuck in the gap, knowing they need accessibility but unable to afford it.

Spec lag

"WCAG 2.2 is invisible to most tools."

WCAG 2.2 landed in October 2023. A year later, most tools were still scanning 2.1 criteria. The new rules (focus appearance, target size, dragging movements) were a feature gap nobody had closed for free.


The constraint

WCAG 2.2 was still being finalized while I was building for it

The spec was about 80% locked when I started. The remaining 20% (focus appearance, target size, dragging movements) was still being negotiated by the W3C while I was implementing it. Every other free tool was still on WCAG 2.1. Building for 2.2 from day one meant chasing a moving target, but that was also the whole point of the product. The real constraint was threefold: building for a standard that wasn't fully stable, doing it alone across both design and engineering, and keeping a live product available to early beta users the whole time.


The decisions

Three calls that shaped the build

These are the points where I had credible options to choose between. For each one I've laid out what was on the table, why I picked what I did, and what it cost.

Decision 01

Run two scan engines in parallel.

The options on the table

  1. Use axe-core alone. Industry-standard, well-documented, but misses categories of issues it doesn't cover.
  2. Use HTML_CodeSniffer alone. Strong on certain WCAG criteria axe misses, weaker on dynamic content and ARIA.
  3. Run both engines in parallel, deduplicate findings, surface a unified view with per-engine attribution.

Why this one

Two engines weren't just about more coverage. They gave more data points for knowing what actually needs fixing. When both engines flag the same violation in different language, the user gets a triangulated read on the issue. When only one catches something, the dashboard shows which engine found it, which turns into its own learning surface, so builders end up understanding their codebase better than they would with a black-box tool. For a free product whose whole value prop is "trust this more than the lazy alternative," giving people more diagnostic information was worth it.

The trade-off

It cost real time. Adding the second engine extended the build by weeks rather than days, and the deduplication logic was non-trivial on its own: the same violation would come back from both engines with different wording, severity, and DOM paths. I'd still do it, because the value prop falls apart without it.
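
A minimal sketch of what that deduplication step could look like, assuming each engine's raw output has already been mapped into a shared shape. The `Finding` and `MergedFinding` types, the `normalizeSelector` helper, and the severity scale are illustrative assumptions, not the code that shipped.

```typescript
type Engine = "axe-core" | "HTML_CodeSniffer";
type Severity = "minor" | "moderate" | "serious" | "critical";

// Hypothetical common shape both engines are mapped into before merging.
interface Finding {
  engine: Engine;
  wcagCriterion: string; // e.g. "1.4.3"
  selector: string; // CSS path to the offending element
  severity: Severity;
  message: string;
}

interface MergedFinding {
  wcagCriterion: string;
  selector: string;
  severity: Severity; // highest severity reported by any engine
  engines: Engine[]; // per-engine attribution
  messages: string[]; // each engine's own wording
}

const SEVERITY_RANK: Record<Severity, number> = {
  minor: 0,
  moderate: 1,
  serious: 2,
  critical: 3,
};

// Simplified normalization so "DIV>A.btn" and "div > a.btn" compare equal.
// Lowercasing also folds class-name case, which a real tool may not want.
function normalizeSelector(selector: string): string {
  return selector
    .toLowerCase()
    .replace(/\s*>\s*/g, " > ")
    .replace(/\s+/g, " ")
    .trim();
}

function dedupeFindings(findings: Finding[]): MergedFinding[] {
  const merged = new Map<string, MergedFinding>();
  for (const f of findings) {
    const selector = normalizeSelector(f.selector);
    const key = `${f.wcagCriterion}|${selector}`;
    const existing = merged.get(key);
    if (!existing) {
      merged.set(key, {
        wcagCriterion: f.wcagCriterion,
        selector,
        severity: f.severity,
        engines: [f.engine],
        messages: [f.message],
      });
      continue;
    }
    // Same criterion on the same element: keep both voices, take the worse severity.
    if (!existing.engines.includes(f.engine)) existing.engines.push(f.engine);
    if (!existing.messages.includes(f.message)) existing.messages.push(f.message);
    if (SEVERITY_RANK[f.severity] > SEVERITY_RANK[existing.severity]) {
      existing.severity = f.severity;
    }
  }
  return Array.from(merged.values());
}
```

Keying on criterion plus normalized selector is the simplest choice that keeps attribution intact. Findings that only one engine reports pass through with a single entry in `engines`.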

Decision 02

Build for WCAG 2.2 from day one, even with the spec still moving.

The options on the table

  1. Ship a WCAG 2.1 scanner first, add 2.2 once the spec was fully locked.
  2. Wait until WCAG 2.2 stabilized fully (estimated mid-2024) before launching anything.
  3. Build for 2.2 from day one, accepting that some criteria would need rewrites as the spec finalized.

Why this one

Every other free tool was still on WCAG 2.1 in late 2023. Building for 2.2 was the whole point. Without it, CloverX would have just been a worse axe-core wrapper. The 2.2 spec was about 80% locked when I started, and the moving 20% (focus appearance, target size, dragging movements) was the part I had to chase. I built around the stable rules first and treated the moving ones as iterations rather than blockers.

The trade-off

Two criteria needed full content rewrites once the final spec dropped, so the interactive checklist got updated after launch with corrected examples and fix code. That was manageable. The alternative was shipping into a market that didn't need another 2.1 tool.

Decision 03

Ship AI fix value without paying for AI inference per scan.

The options on the table

  1. Integrate a paid LLM API and generate the fix code inline. Best UX, but every scan costs money, which is unsustainable for a free product.
  2. Ship rule-based fix snippets only. Cheap to run, but generic: the same fix template for every instance of a violation, regardless of context.
  3. Ship tuned prompts users copy into their own AI chat, whether that's Claude, ChatGPT, or whatever LLM they already use. The product writes the prompt and the user's own account runs the inference.

Why this one

A free tool that calls a paid LLM API on every finding bleeds money on each scan, and that bleed scales with usage, which is exactly the wrong direction. Copy-paste prompts flip that. The user runs the inference on their own LLM account, one they already pay for and already know, and the prompt comes tuned with the specific violation, the offending code, and the relevant WCAG criterion, so the fix that comes back is far more likely to compile and to address the actual violation. Operating cost stays flat no matter the scan volume, and the user still gets contextual AI fixes that work with the tools they already have.
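
As a rough illustration, a prompt template of this kind might be assembled as below. This is a sketch under assumptions: the `PromptInput` fields, the wording of the prompt, and the `buildFixPrompt` name are hypothetical and are not taken from the product.

```typescript
// Hypothetical input: the three ingredients named above, plus an optional stack hint.
interface PromptInput {
  criterionId: string; // e.g. "2.5.8"
  criterionName: string; // e.g. "Target Size (Minimum)"
  description: string; // what the scan engine reported
  offendingHtml: string; // the markup that triggered the finding
  framework?: string; // e.g. "React"; defaults to plain HTML
}

// Builds plain text the user copies into their own LLM chat.
// No network call happens here, so the cost per scan stays at zero.
function buildFixPrompt(input: PromptInput): string {
  const stack = input.framework ?? "plain HTML";
  return [
    "You are helping fix a web accessibility violation.",
    "",
    `WCAG criterion: ${input.criterionId} ${input.criterionName}`,
    `What the scanner reported: ${input.description}`,
    `Stack: ${stack}`,
    "",
    "--- offending markup ---",
    input.offendingHtml.trim(),
    "--- end markup ---",
    "",
    "Return the corrected markup only, followed by one sentence explaining the change.",
  ].join("\n");
}
```

Because the function is pure string assembly, it can run entirely in the browser next to a copy button.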

The trade-off

We don't capture the AI interaction, so we don't know whether a fix actually worked, can't tune prompts based on outcomes, and can't close the loop on what produced good code. Some users also miss the copy-paste affordance at first and expect the fix to appear inline. I'd still take that over either burning money on every scan or shipping generic fixes that didn't solve anyone's actual violation.


What I built

Every layer of the stack, designed and engineered

What shipped, roughly in the order a user runs into it.

  • Dual-engine scanner

    axe-core and HTML_CodeSniffer running in parallel inside a Node.js worker pool. Findings deduplicated and unified into a single results view, with per-engine attribution preserved for debugging.

  • Crawler + queue + worker orchestration

    Multi-page crawl up to 500 pages. Anonymous quick-scans run in one queue, authenticated multi-page scans in another. Built on Node.js with a custom job runner and no third-party queue dependency.

  • Interactive WCAG 2.2 checklist

    Each of the nine success criteria added in WCAG 2.2 is documented with a fail example, a fix example, and copy-pasteable code. The only free tool that does this for 2.2.

  • Contrast checker

    AA and AAA results for any color pair, with large-text and graphical-element thresholds. Designed to be the link colleagues actually paste in Slack.

  • Personal scan history dashboard

    Auth-gated. Re-run any scan, share results, export to CSV or JSON, compare runs over time. The reason to make an account, once CloverX has earned it.

  • Honest error and loading states

    No fake progress bars and no error codes. Loading shows which page the crawler is on, and errors explain what went wrong in plain language and what to try next.

  • AI-assisted fix prompts (BYO LLM)

    15+ violation types ship with a tuned prompt template users copy into their own AI chat, whether Claude, ChatGPT, or another LLM. Operating cost stays flat no matter the scan volume, users get contextual fixes, and with no API bills to cover, the tool can stay free.

  • Accessible-by-default component library

    The same token-driven design system approach I use across projects, except here every state of every component is documented for accessibility, covering keyboard, focus, and screen-reader behaviour. It made sense to eat our own dog food on an accessibility product.
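
For the contrast checker listed above, the underlying calculation is the contrast-ratio formula from the WCAG definitions of relative luminance and contrast ratio. A minimal sketch follows; the function names and the result shape are illustrative assumptions, while the thresholds are the published ones (4.5:1 and 3:1 for AA, 7:1 and 4.5:1 for AAA, 3:1 for non-text elements).

```typescript
type Rgb = [number, number, number];

// Accepts "#rrggbb" or "rrggbb"; shorthand and alpha are out of scope for this sketch.
function parseHex(hex: string): Rgb {
  const match = /^#?([0-9a-f]{6})$/i.exec(hex.trim());
  if (!match) throw new Error(`Expected a 6-digit hex color, got "${hex}"`);
  const value = parseInt(match[1], 16);
  return [(value >> 16) & 0xff, (value >> 8) & 0xff, value & 0xff];
}

// Relative luminance as defined in WCAG 2.x.
function relativeLuminance([r, g, b]: Rgb): number {
  const linearize = (channel: number): number => {
    const s = channel / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// Ratio of the lighter to the darker color, from 1 to 21.
function contrastRatio(a: string, b: string): number {
  const la = relativeLuminance(parseHex(a));
  const lb = relativeLuminance(parseHex(b));
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

interface ContrastResult {
  ratio: number;
  aaNormalText: boolean; // SC 1.4.3, 4.5:1
  aaLargeText: boolean; // SC 1.4.3, 3:1
  aaaNormalText: boolean; // SC 1.4.6, 7:1
  aaaLargeText: boolean; // SC 1.4.6, 4.5:1
  aaNonText: boolean; // SC 1.4.11, 3:1 for graphical elements
}

function checkContrast(foreground: string, background: string): ContrastResult {
  const ratio = contrastRatio(foreground, background);
  return {
    ratio,
    aaNormalText: ratio >= 4.5,
    aaLargeText: ratio >= 3,
    aaaNormalText: ratio >= 7,
    aaaLargeText: ratio >= 4.5,
    aaNonText: ratio >= 3,
  };
}
```

For example, `checkContrast("#767676", "#ffffff")` passes AA for normal text but fails AAA, while `#777777` on white falls just short of AA for normal text.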


What I did

Every layer of the stack

Discovery & UX

Scoped the entire problem space

There was no PM and no separate research team. I scoped the WCAG 2.2 coverage, audited every competing tool, and wrote the spec for what "free a11y tooling" should mean.

Front-end

React + the scan engines

I built the dual-engine scanner (axe-core + HTML_CodeSniffer running in parallel) and the entire React UI for the scan results dashboard.

Back-end

Node.js API + worker pool

Crawler, queue, and orchestration for scans up to 500 pages. Designed for both anonymous quick-scans and authenticated history.
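
A dependency-free job runner with separate queues can be quite small. The sketch below is one way to structure it, assuming an in-memory queue with a fixed concurrency limit. The `JobQueue` class, the concurrency values, and the `capCrawl` helper are hypothetical; only the 500-page cap and the two-queue split come from the description above.

```typescript
type Job<T> = () => Promise<T>;

// In-memory FIFO queue that runs at most `concurrency` jobs at once.
class JobQueue {
  private pending: Array<() => void> = [];
  private active = 0;

  constructor(private readonly concurrency: number) {}

  enqueue<T>(job: Job<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const run = (): void => {
        this.active++;
        // Promise.resolve().then(job) also catches jobs that throw synchronously.
        Promise.resolve()
          .then(job)
          .then(resolve, reject)
          .finally(() => {
            this.active--;
            this.next();
          });
      };
      this.pending.push(run);
      this.next();
    });
  }

  private next(): void {
    while (this.active < this.concurrency && this.pending.length > 0) {
      const run = this.pending.shift();
      if (run) run();
    }
  }
}

const MAX_PAGES = 500; // crawl cap from the case study

// Assumed concurrency values, chosen only for illustration.
const quickScanQueue = new JobQueue(4); // anonymous single-page scans
const crawlQueue = new JobQueue(2); // authenticated multi-page scans

// Deduplicate discovered URLs and enforce the page cap before queueing.
function capCrawl(urls: string[]): string[] {
  return Array.from(new Set(urls)).slice(0, MAX_PAGES);
}
```

Keeping the two queues separate means a long authenticated crawl cannot starve anonymous quick-scans, since each queue has its own concurrency budget.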

Design system

Tokens + components

The same token discipline I bring to other projects, with every state of every component documented inline and accessible by default. Again, eating our own dog food on an accessibility product.


Design

Six screens that show the full product surface


The texture

The layout I spent the most time on is one most teams skip

More than the scan itself, what I obsessed over was how a scan looks when it reaches someone through a shared link. That link gets pasted into Slack, a Jira ticket, or an email, and the person opening it might not have run the scan, or even be a developer. In the first five seconds they need to know what URL was scanned, when, how serious the findings are, and what to do next.

Most scanners drop the recipient into the same raw results page the original scanner saw, with the same dense table, no scan-meta header, and no context strip. The shared-scan view in CloverX got its own layout: a context strip with the URL and scan date, a severity summary card, a plain-language "what this means" explainer, and the full findings filterable from there. It's the same data the scanner saw, recomposed for someone arriving cold.
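
To make the recomposition concrete, here is a sketch of deriving the context strip and severity summary from the same scan data. The `ScanResult` and `SharedScanHeader` shapes, the severity scale, and the headline wording are assumptions for illustration only.

```typescript
type Severity = "minor" | "moderate" | "serious" | "critical";

// Hypothetical shape of a stored scan.
interface ScanResult {
  url: string;
  scannedAt: string; // ISO 8601 timestamp
  findings: Array<{ severity: Severity }>;
}

// What someone arriving cold sees before the full findings table.
interface SharedScanHeader {
  url: string;
  scannedOn: string; // date only, YYYY-MM-DD
  counts: Record<Severity, number>;
  headline: string; // plain-language "what this means"
}

function buildSharedScanHeader(scan: ScanResult): SharedScanHeader {
  const counts: Record<Severity, number> = { critical: 0, serious: 0, moderate: 0, minor: 0 };
  for (const finding of scan.findings) counts[finding.severity]++;

  const blocking = counts.critical + counts.serious;
  let headline: string;
  if (scan.findings.length === 0) {
    headline = "No automated issues found. Manual checks are still recommended.";
  } else if (blocking > 0) {
    headline = `${blocking} of ${scan.findings.length} issues are likely to block some users. Start with those.`;
  } else {
    headline = `${scan.findings.length} lower-severity issues found. None are likely to block users outright.`;
  }

  return {
    url: scan.url,
    scannedOn: scan.scannedAt.slice(0, 10),
    counts,
    headline,
  };
}
```

The findings themselves are untouched. Only the header is computed, which is why the shared view can stay in sync with the scanner's own results page.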

On paper it was just one extra layout, but it's what decided whether a shared scan landed as something useful or as noise in someone's day.


Impact

What's shipped, beta-tested, and live in the wild

4 · Free tools shipped
scanner · contrast · checklist · dashboard

2 · Scan engines running in parallel
more coverage than any free tool

9 · WCAG 2.2 criteria covered interactively
only free tool that does this

15+ · Quick fixes and AI prompts
built around fixing issues, not just finding them

The accessibility tooling market is split between cheap but shallow and deep but expensive. CloverX a11y shows a third option is possible: free and serious. The dual-engine scanner finds violations either engine alone would miss, and the interactive WCAG 2.2 checklist covers criteria no other free tool handles. Every finding ships with a fix, plus a tuned prompt the user pastes into their own LLM, which keeps the operating cost flat as scan volume grows. That's what makes it free for users and still sustainable to run.

Why the designer-who-codes matters. The whole premise depended on design and engineering being a single decision rather than two handed-off ones. A free tool can't afford a translation layer between what we said it does and what it actually does. Working across the stack kept the UX promise and the scan-engine output locked together, and the same person who wrote the empty-state copy also wrote the Node.js worker that produced the data behind it.

What's next. A CI integration. The fastest accessibility fixes happen before a PR merges, so the goal is to surface violations in GitHub Actions output instead of at QA time. It's the same product, just one layer earlier in the build pipeline.
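
One possible shape for that integration is to print each finding as a GitHub Actions workflow command, which the runner turns into an annotation on the job. This is a sketch of an unbuilt feature, so the `CiFinding` type and the idea of mapping a finding to a source file are assumptions. The `::error` syntax and the escaping rules follow GitHub's documented workflow-command format.

```typescript
// Hypothetical finding shape for CI output; file and line are optional because
// a URL-based scan may not map cleanly back to source files.
interface CiFinding {
  criterion: string; // e.g. "1.4.3"
  message: string;
  file?: string;
  line?: number;
}

// Message data: percent and newlines must be escaped.
function escapeData(value: string): string {
  return value.replace(/%/g, "%25").replace(/\r/g, "%0D").replace(/\n/g, "%0A");
}

// Property values additionally escape ":" and ",".
function escapeProperty(value: string): string {
  return escapeData(value).replace(/:/g, "%3A").replace(/,/g, "%2C");
}

// Formats one finding as an `::error` workflow command.
function toAnnotation(finding: CiFinding): string {
  const props: string[] = [];
  if (finding.file !== undefined) props.push(`file=${escapeProperty(finding.file)}`);
  if (finding.line !== undefined) props.push(`line=${finding.line}`);
  props.push(`title=${escapeProperty(`WCAG ${finding.criterion}`)}`);
  return `::error ${props.join(",")}::${escapeData(finding.message)}`;
}

// Returns the lines a CI step would print, one per finding.
function toAnnotations(findings: CiFinding[]): string[] {
  return findings.map(toAnnotation);
}
```

A CI step would write these lines to standard output and exit non-zero when any finding is present, so the PR check fails before merge.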