AI commercial infrastructure for early-stage founders

An AI-assisted campaign-ready contact list for the entire UK architect market

9,182 verified contacts from 30,001 ARB records, structured for direct CRM import.

A UK construction technology founder | 2026 | Build | Build sprint
AI | Ops | Operating System | Construction

TL;DR

Tincture processed the full UK Architects Registration Board register, 30,001 records in all, through five enrichment layers (Companies House, website discovery, email generation and selection, SMTP verification, and AI organisation classification) to produce a structured delivery pack with a 9,182-contact verified core send file ready for CRM import. The AI classification layer distinguished core architecture practices from in-house employers at scale, protecting the send list from bounce-rate risk.

The brief

What did the client need?

The client needed a clean, campaign-ready list of every registered UK architect with verified direct email addresses, ready to load straight into a cold-outreach platform without any manual work. The source material was the Architects Registration Board register: the official list of every registered architect in the country, which is public, but raw. One row per person, occasional emails, enough company context to segment intelligently before sending. The deliverable was a single CSV file.

The constraints

What made this hard?

The ARB register is person-led: 30,001 records, one per registered architect, with patchy contact data. Most records include a name, registration number, and practice name — and that's about it. Some have websites. Almost none have email addresses you can use directly. Getting from "here is a list of names" to "here is a campaign-ready contact list" meant enrichment at every layer: finding the practice website, inferring the email pattern for that domain, verifying deliverability before selection, and then making an explicit decision about which contacts belong in the core send file and which to hold back.

The other constraint was the noise in the dataset. Not every registered architect works at an architecture practice. Some are in-house architects at universities, property developers, large retailers, or public-sector bodies: real ARB registrants, but not outreach-viable in the same way as a sole practitioner or a small studio. Distinguishing them at 30,000-record scale required an organisation categorisation layer that doesn't come with the raw data.

The approach

How did Tincture frame the problem?

I started with the full ARB register and kept every record, including the ones without an email, because the output is a market record as much as a send list. From there came five layers of enrichment, with AI doing the interpretive work that would otherwise have needed manual review at scale.

The build

What was shipped?

Companies House. I grouped architects into practice rollups by normalised practice name and ran each rollup against Companies House for company number, status, SIC code, and accounts category. This gave context on the organisation behind each practice — and a size proxy from the count of ARB-registered architects within each rollup.
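As a rough illustration of the rollup step, here is a minimal Python sketch of grouping records by a normalised practice name and counting registrants per rollup as the size proxy. The normalisation rules, the suffix list, and the field names (`practice_name`, `registration_number`) are assumptions for illustration; the actual rules used in the build are not described here, and the Companies House lookup itself is left out.

```python
import re
from collections import defaultdict

# Hypothetical list of legal suffixes to strip; the real rules are not specified.
_SUFFIXES = {"ltd", "limited", "llp", "plc"}


def normalise_practice_name(name: str) -> str:
    """Lowercase, treat '&' as 'and', drop punctuation and trailing legal suffixes."""
    cleaned = name.lower().replace("&", " and ")
    cleaned = re.sub(r"[^a-z0-9\s]", " ", cleaned)
    tokens = cleaned.split()
    while tokens and tokens[-1] in _SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def build_rollups(records):
    """Group registration numbers under their normalised practice name."""
    rollups = defaultdict(list)
    for rec in records:
        key = normalise_practice_name(rec["practice_name"])
        if key:  # records with no usable practice name stay out of rollups
            rollups[key].append(rec["registration_number"])
    return dict(rollups)


# Invented sample rows, not real ARB data.
sample = [
    {"registration_number": "A001", "practice_name": "Smith & Jones Architects Ltd"},
    {"registration_number": "A002", "practice_name": "Smith and Jones Architects Limited"},
    {"registration_number": "A003", "practice_name": "Harbour Studio LLP"},
]

rollups = build_rollups(sample)
# Size proxy: number of registered architects in each rollup.
size_proxy = {name: len(members) for name, members in rollups.items()}
```

Each rollup key would then be the thing looked up against Companies House, once per practice rather than once per architect.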

Website discovery. Websites came from three sources in priority order: the ARB detail page, shared practice-level evidence (if one architect at a practice had a website and another didn't, the gap was filled from the shared record), and scored domain analysis for anything still missing. No guessing; every promoted website carried a source and confidence label.
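The priority order above can be sketched as a simple fallback chain. This is an illustrative Python sketch only: the source labels, the confidence labels, and the `min_score` threshold are assumed names and values, not the ones used in the build.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class WebsiteChoice:
    url: str
    source: str      # where the website came from
    confidence: str  # hypothetical label scheme: high / medium / low


def pick_website(
    arb_url: Optional[str],
    practice_url: Optional[str],
    scored_domain: Optional[Tuple[str, float]],
    min_score: float = 0.8,  # assumed cut-off for scored domain analysis
) -> Optional[WebsiteChoice]:
    """Return the first available website in priority order, with its provenance."""
    if arb_url:
        return WebsiteChoice(arb_url, "arb_detail_page", "high")
    if practice_url:
        return WebsiteChoice(practice_url, "shared_practice_evidence", "medium")
    if scored_domain is not None:
        domain, score = scored_domain
        if score >= min_score:
            return WebsiteChoice(domain, "scored_domain_analysis", "low")
    # Nothing strong enough: leave the website empty rather than guess.
    return None
```

Returning `None` when no source clears the bar is what keeps the "no guessing" rule enforceable: a record either carries a labelled website or carries none.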

Email candidate generation and selection. Candidates came from ARB pages, practice websites, role inboxes, and inferred patterns from verified domain evidence. Where a practice had multiple confirmed emails at the same domain, I applied the pattern across other records at that domain. Selection was conservative: direct and person-specific emails over role inboxes, verified domains over mismatched ones, strong pattern evidence over inference alone.
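To make the pattern step concrete, here is a minimal Python sketch of inferring a domain's email pattern from confirmed addresses and applying it to another person at the same domain. The pattern catalogue and the `min_evidence` threshold of two are assumptions chosen to mirror "multiple confirmed emails"; the real candidate generator is not described in that detail.

```python
from collections import Counter

# Hypothetical catalogue of common local-part patterns.
PATTERNS = {
    "first.last": lambda f, l: f"{f}.{l}",
    "firstlast": lambda f, l: f"{f}{l}",
    "flast": lambda f, l: f"{f[0]}{l}",
    "first": lambda f, l: f,
}


def infer_pattern(confirmed, min_evidence=2):
    """confirmed: list of (first_name, last_name, email) tuples for one domain.

    Returns the pattern name backed by at least `min_evidence` confirmed
    emails, or None when the evidence is too thin to generalise.
    """
    counts = Counter()
    for first, last, email in confirmed:
        f, l = first.strip().lower(), last.strip().lower()
        if not f or not l:
            continue
        local = email.split("@", 1)[0].lower()
        for name, build in PATTERNS.items():
            if build(f, l) == local:
                counts[name] += 1
                break
    if not counts:
        return None
    name, seen = counts.most_common(1)[0]
    return name if seen >= min_evidence else None


def apply_pattern(pattern, first, last, domain):
    """Generate a candidate address; it still needs verification before use."""
    local = PATTERNS[pattern](first.strip().lower(), last.strip().lower())
    return f"{local}@{domain}"


# Invented example names on a reserved example domain.
confirmed = [
    ("Ada", "Moss", "ada.moss@example.org"),
    ("Ben", "Cole", "ben.cole@example.org"),
]
pattern = infer_pattern(confirmed)
candidate = apply_pattern(pattern, "Cara", "Reid", "example.org")
```

A generated address is only a candidate at this point; it would still go through verification before it could be selected.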

SMTP verification. Every candidate went through SMTP-level resolution, which was particularly relevant on hard-to-call domains. Final statuses simplified to four practical CRM categories: valid, risky, invalid, and unknown. Invalid contacts were quarantined; risky and unknown went into review files rather than the core send.
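The status simplification and routing can be pictured as two small lookups. In this Python sketch the raw verifier statuses on the left are hypothetical (they depend on whichever verification tool is used), and treating catch-all domains as risky is an assumption; only the four CRM categories and their destinations come from the description above.

```python
# Hypothetical raw verifier outcomes mapped onto the four CRM categories.
RAW_TO_CRM = {
    "deliverable": "valid",
    "catch_all": "risky",        # assumption: catch-all domains count as risky
    "undeliverable": "invalid",
    "timeout": "unknown",
}

# Where each CRM category is sent, following the routing described above.
CRM_TO_DESTINATION = {
    "valid": "send_candidate",
    "risky": "review",
    "unknown": "review",
    "invalid": "quarantine",
}


def simplify_status(raw_status: str) -> str:
    """Collapse a raw verifier status; anything unrecognised becomes 'unknown'."""
    return RAW_TO_CRM.get(raw_status, "unknown")


def destination(crm_status: str) -> str:
    """Return the holding area for a contact with the given CRM status."""
    return CRM_TO_DESTINATION[crm_status]
```

Defaulting unrecognised statuses to `unknown` errs towards review rather than sending, which matches the conservative stance of the selection step.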

AI classification. The organisation categorisation layer — sorting practices into core architecture practices, public-sector employers, universities, property developers, in-house architect employers, and unknown — ran through an AI classification pipeline. At 30,000 records, doing this by rule alone would miss too many edge cases: a practice called "London Studios" could be an architecture firm or a film production company, and getting that wrong at volume corrupts the core send file. The AI layer parsed company names, SIC codes, accounts categories, and website content together to make the call, with confidence scores attached so anything uncertain went to review rather than the send file. The point was to protect the core outreach pool from contacts that look like architects but aren't outreach-viable — a Live Nation architect, for example, is a real ARB registrant, but Live Nation is not an architecture practice.
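The confidence-gated routing described above can be sketched as follows. This Python sketch does not call any model: it assumes a classifier has already returned a category and a confidence score, and the `threshold` of 0.85 and the destination names are illustrative assumptions rather than the values used in the pipeline.

```python
from enum import Enum


class OrgCategory(Enum):
    CORE_ARCHITECTURE_PRACTICE = "core_architecture_practice"
    PUBLIC_SECTOR_EMPLOYER = "public_sector_employer"
    UNIVERSITY = "university"
    PROPERTY_DEVELOPER = "property_developer"
    IN_HOUSE_ARCHITECT_EMPLOYER = "in_house_architect_employer"
    UNKNOWN = "unknown"


def route_organisation(category: OrgCategory, confidence: float,
                       threshold: float = 0.85) -> str:
    """Only confident core-practice calls stay eligible for the core send file.

    Everything else, including low-confidence core calls, goes to review.
    """
    if category is OrgCategory.CORE_ARCHITECTURE_PRACTICE and confidence >= threshold:
        return "core_send_eligible"
    return "organisation_context_review"
```

Under this sketch a name like "London Studios" classified as a core practice at 0.6 confidence would land in review, not in the send file.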

The delivery wasn't one file. It was a structured pack: a core send file, a full email-qualified file, an organisation-context review file, a role-inbox review file, a domain-mismatch review file, a risky and catch-all hold file, and a quarantine file with instructions not to import it.
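One way to picture how a single contact ends up in exactly one primary file is a precedence check like the Python sketch below. The order of the checks, the field names, and the file names are all assumptions for illustration; the pack's actual assignment logic is not spelled out here, and the full email-qualified file is omitted because it spans several of these segments.

```python
def assign_file(contact: dict):
    """Return a primary file for a contact, or None if it has no email.

    Expects hypothetical keys: email, status, is_role_inbox,
    domain_matches_website, org_category.
    """
    if not contact.get("email"):
        return None  # kept in the market record, but not in any email file
    status = contact["status"]
    if status == "invalid":
        return "quarantine"
    if status in ("risky", "unknown"):
        return "risky_catch_all_hold"
    if contact["is_role_inbox"]:
        return "role_inbox_review"
    if not contact["domain_matches_website"]:
        return "domain_mismatch_review"
    if contact["org_category"] != "core_architecture_practice":
        return "organisation_context_review"
    return "core_send"


# Invented example contact on a reserved example domain.
example = {
    "email": "cara.reid@example.org",
    "status": "valid",
    "is_role_inbox": False,
    "domain_matches_website": True,
    "org_category": "core_architecture_practice",
}
primary_file = assign_file(example)
```

Checking deliverability first and organisation type last means a contact only reaches the core send file after passing every other gate.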

The outcome

What were the results?

9,182 core contacts

Verified direct emails, matched domains, core architecture practices only — ready for direct CRM import.

14,820 email-qualified

Full email-qualified file including non-core organisation contexts, held for extended campaigns.

30,001 records processed

Every registered UK architect in the ARB register, delivered across a structured seven-file pack.

A further 3,455 catch-all or risky contacts sit in the review file for later consideration; 1,784 valid-email contacts where the email domain doesn't match the practice website are in a separate review pack; 1,901 invalid contacts are in quarantine.

The organisation categorisation means the first CRM import can go straight into the safest segment without any manual review. Everything else is documented, labelled, and held for whenever the client wants to extend the list.

The takeaway

What's the transferable principle?

The useful thing about building a contact list at this scale is that it forces you to make every edge-case decision explicit — and documented. Catch-all domains, role inboxes, in-house employers, domain mismatches: these are the things that quietly inflate bounce rates and get sending domains flagged if they go straight into an import unchecked. The work is in the labelling as much as the scraping, and the review files are what stop the good contacts getting tainted by the bad ones.
