Methodology
Last updated: 7 June 2026
Reports look like a single confident verdict. Underneath, they're a chain of data pulls, aggregations, and AI-written narrative — each with its own confidence level. This page documents the chain so you can decide how much to trust each part.
The short version
- Market data (listings, occupancy, prices, seasonality) comes from Inside Airbnb — actual scrapes of Airbnb's public site, refreshed monthly. Across 107 cities, we have ~1.5M active listings.
- Regulation data is hand-curated from official council and city government publications. We disclose the source per city.
- Property prices use country-specific government registries where available (HM Land Registry for the UK, DVF for France, INE for Spain). Where no granular source exists, we fall back to OECD country-level house-price indexes.
- Per-property comparables are computed live from listings within 500m of the address you submit.
- Narrative (the "verdict" section, the per-tab summaries) is written by a large language model (Claude Opus) given ONLY the numbers above as input. The LLM doesn't invent data; it interprets what we've already computed.
What "Inside Airbnb" data actually is
Inside Airbnb is a non-commercial project that scrapes Airbnb's public city pages monthly. They publish snapshots as open data (Creative Commons). We ingest the full snapshot per city — typically 50,000 to 200,000 listings — and recompute aggregates from scratch on each ingest.
Per listing, we get: location, room type, bedroom count, listed price, number of reviews, last-review date, and Airbnb's estimated annual revenue (when available). For listings where Airbnb doesn't publish its own estimate, we estimate occupancy and revenue with the standard review-count × average stay × nightly price model.
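As a rough illustration of the review-count × average stay × nightly price model, here is a minimal Python sketch. The function name, the 3-night average stay, the 50% review rate (the share of stays that leave a review), and the 70% occupancy cap are all illustrative assumptions, not the values used in our pipeline.

```python
from dataclasses import dataclass

DAYS_PER_YEAR = 365


@dataclass
class Estimate:
    booked_nights: float
    occupancy: float       # fraction of the year booked, 0..1
    annual_revenue: float  # same currency as nightly_price


def estimate_from_reviews(
    reviews_last_12m: int,
    nightly_price: float,
    avg_stay_nights: float = 3.0,  # ASSUMPTION: average length of stay
    review_rate: float = 0.5,      # ASSUMPTION: share of stays that leave a review
    max_occupancy: float = 0.7,    # ASSUMPTION: cap on estimated occupancy
) -> Estimate:
    """Estimate occupancy and revenue from review activity (hypothetical helper)."""
    if review_rate <= 0:
        raise ValueError("review_rate must be positive")
    # Scale reviews up to stays, then stays to nights, and apply the cap.
    estimated_stays = reviews_last_12m / review_rate
    booked_nights = min(
        estimated_stays * avg_stay_nights,
        max_occupancy * DAYS_PER_YEAR,
    )
    return Estimate(
        booked_nights=booked_nights,
        occupancy=booked_nights / DAYS_PER_YEAR,
        annual_revenue=booked_nights * nightly_price,
    )
```

Under these assumptions, a listing with 40 reviews in the last 12 months at 100 per night comes out at 240 booked nights, or roughly 66% occupancy. The cap exists because heavily reviewed listings would otherwise exceed a plausible ceiling.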
What this means for accuracy: the numbers are point-in-time snapshots, not real-time. Occupancy can lag reality by 2–4 weeks. A listing that just got de-listed is still in the dataset until the next scrape.
What we calculate vs what's estimated
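Per-city market data (median occupancy, ADR (average daily rate) percentiles, seasonality curves, supply trends) is aggregated directly from real listings, with no city-level modelling on top. Listing-level occupancy can itself come from the review-based estimate described above. We show the listing count behind each number so you can gauge confidence.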
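To show why the listing count matters, here is a small sketch of an aggregate that always carries its sample size. The function name and the choice of the "inclusive" quantile method are assumptions for illustration; our production reducer may differ.

```python
from statistics import median, quantiles


def summarise(values: list[float]) -> dict:
    """Return the median and quartiles together with the sample size behind them."""
    if len(values) < 2:
        # statistics.quantiles() needs at least two data points.
        return {
            "n": len(values),
            "median": values[0] if values else None,
            "p25": None,
            "p75": None,
        }
    p25, _, p75 = quantiles(values, n=4, method="inclusive")
    return {"n": len(values), "median": median(values), "p25": p25, "p75": p75}
```

A median computed from 12 listings and one computed from 12,000 look identical on the page, so reporting `n` next to the figure is what lets you tell them apart.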
Per-property comparables (when you submit an address) are likewise real — actual listings within 500m of your coordinates.
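A 500m radius filter can be sketched with the haversine great-circle distance. The listing shape (dicts with "lat" and "lon" keys) and the function names are hypothetical; a production system would more likely use a spatial index than a linear scan.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def comparables(listings: list[dict], lat: float, lon: float, radius_m: float = 500.0) -> list[dict]:
    """Keep only the listings within radius_m of the submitted coordinates."""
    return [
        item for item in listings
        if haversine_m(lat, lon, item["lat"], item["lon"]) <= radius_m
    ]
```

For example, a listing 0.003 degrees of latitude north of the address is about 334m away and counts as a comparable, while one 0.01 degrees north (about 1.1km) does not.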
Year-on-year deltas ("supply +12% YoY") compare the current snapshot to the Inside Airbnb snapshot from roughly 365 days earlier, computed with the same reducer (aggregation code). When no snapshot exists for that exact date, we accept the nearest one within ±60 days.
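The baseline selection and the delta itself can be sketched as follows. The function names are hypothetical, and returning None when no snapshot falls inside the window is an assumption about how a missing baseline would be handled.

```python
from datetime import date, timedelta
from typing import Optional


def pick_baseline(current: date, available: list[date], tolerance_days: int = 60) -> Optional[date]:
    """Pick the snapshot closest to 365 days before `current`, within the tolerance."""
    target = current - timedelta(days=365)
    candidates = [d for d in available if abs((d - target).days) <= tolerance_days]
    if not candidates:
        return None  # ASSUMPTION: no baseline means no YoY figure is shown
    return min(candidates, key=lambda d: abs((d - target).days))


def yoy_pct(current_value: float, baseline_value: float) -> Optional[float]:
    """Percentage change from baseline to current; None if the baseline is zero."""
    if baseline_value == 0:
        return None
    return (current_value - baseline_value) / baseline_value * 100
```

With a current snapshot dated 1 June 2026, a snapshot from 20 May 2025 (12 days off target) qualifies as a baseline, while one from 1 March 2025 (92 days off) does not. A supply count moving from 1,000 to 1,120 listings gives +12%.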
Long-let yields and rental-vs-Airbnb premiums are estimated from OECD rental indexes (country-level) + per-city Airbnb medians. We flag these as "estimate" in the UI.
The AI narrative — what the LLM does (and doesn't do)
The LLM (Claude Opus 4.7 by default) receives a structured JSON snapshot of every number for the city + address you queried. It does not browse the web, look up market data, or invent figures. Its job is to interpret the numbers we've already computed and produce a plain-English verdict, summary, and risk callout.
We use a multi-specialist pattern: separate LLM calls for Market Analysis, Risk Analysis, Neighbourhood Analysis, and a final Synthesizer that combines them. Each specialist's output is validated against a strict schema; if it fails, that section gracefully falls back to a placeholder rather than fabricating.
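The validate-or-fall-back step might look like the sketch below. The schema fields, the placeholder wording, and the function name are hypothetical; the point is only that output failing the schema is replaced wholesale rather than repaired or guessed at.

```python
import json

# HYPOTHETICAL schema: field name -> required Python type.
MARKET_SCHEMA = {"summary": str, "confidence": float, "risks": list}

# HYPOTHETICAL placeholder shown when a specialist's output is rejected.
PLACEHOLDER = {"status": "unavailable", "summary": "This section could not be generated."}


def parse_specialist_output(raw: str, schema: dict) -> dict:
    """Return the parsed output if it matches the schema exactly, else a placeholder."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return dict(PLACEHOLDER)
    # Strict: must be an object with exactly the expected keys, no more, no fewer.
    if not isinstance(data, dict) or set(data) != set(schema):
        return dict(PLACEHOLDER)
    for field, expected_type in schema.items():
        if not isinstance(data[field], expected_type):
            return dict(PLACEHOLDER)
    return {"status": "ok", **data}
```

Rejecting extra keys as well as missing ones is a deliberate choice in this sketch: a section with an unexpected field is treated as untrustworthy rather than silently trimmed.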
Limitations to bear in mind:
- LLMs can phrase things confidently when the underlying data is noisy. We try to surface confidence ranges, but read the numbers underneath the narrative — they're the ground truth.
- The LLM knows nothing about you. The verdict applies to the property in general, not your personal situation.
- The LLM knows nothing about events after its training cutoff. If a regulation changed last week, the LLM won't know — but our hand-curated regulation data will.
Regulation data — how it's collected
For each covered city we manually review the local council's short-term-rental rules: night caps, licensing requirements, planning consent, host-presence rules. We cite the official source link in the report.
Regulations change. We refresh the dataset quarterly and on tip-offs from users. If you spot something outdated, email [email protected] — we usually update within 48 hours.
The verdict label
Each report ends in a labelled verdict: Strong buy, Buy, Watch, Wait, or Avoid. This is a heuristic, not a rating. It's computed from yield + risk score + market trend + regulation level, weighted to favour properties that score well on multiple axes simultaneously.
The label isn't investment advice (see Terms of service, clause 3). It's our best compression of the underlying data into one word. Read the rest of the report to understand why we landed there.
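To illustrate what "favouring properties that score well on multiple axes" can mean, here is one possible scoring scheme: a geometric mean, which penalises a weak axis more than a simple average does. This is not our actual formula. The 0 to 1 score scale (higher is better on every axis, including risk and regulation), the use of a geometric mean, and every threshold below are assumptions made up for this example.

```python
from math import prod

# HYPOTHETICAL cut-offs, checked from highest to lowest.
THRESHOLDS = [(0.80, "Strong buy"), (0.65, "Buy"), (0.50, "Watch"), (0.35, "Wait")]


def verdict(yield_score: float, risk_score: float, trend_score: float, regulation_score: float) -> str:
    """Map four 0..1 scores (higher is better) to a verdict label."""
    scores = [yield_score, risk_score, trend_score, regulation_score]
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("scores must be in [0, 1]")
    # Geometric mean: one very low score drags the combined score down sharply.
    combined = prod(scores) ** (1 / len(scores))
    for cutoff, label in THRESHOLDS:
        if combined >= cutoff:
            return label
    return "Avoid"
```

Under this scheme, a property scoring 0.6 on all four axes lands on "Watch", while one scoring 1.0, 1.0, 0.2, 0.2 has the same simple average but lands on "Wait".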
What we explicitly don't include
- Buyer's mortgage suitability — we don't know your deposit, income, or credit history.
- Tax position — depends on personal circumstances. We surface high-level rules (e.g. Section 24 in the UK) but the number that lands in your bank account depends on you.
- Off-market or future regulation changes — we report what's public today.
- Property condition or refurbishment costs — visit the property; we can't.
How often the data refreshes
- Inside Airbnb snapshots: monthly per city, ingested within 1 week of publication
- Year-on-year deltas: re-computed mid-week, separate from the full ingest
- Property-price registries: quarterly (source release schedules vary; HM Land Registry, for example, publishes Price Paid Data monthly)
- Regulation data: quarterly review + ad-hoc on news of changes
- OG images + narrative summaries: auto-regenerated after each market data refresh
Auditability
Each report shows the snapshot date the market data was pulled from, and the list of data sources behind each section. If you want the raw aggregations behind a number, email [email protected] with the report ID and we'll send them.
Open data we built on
- Inside Airbnb — listings + occupancy
- HM Land Registry Price Paid Data — UK transactions
- DVF (Demande de Valeurs Foncières) — French transactions
- OECD — country-level house-price + rental indexes
- OpenStreetMap — schools, transport, amenities (via Overpass)
- data.police.uk — UK crime stats
We're committed to giving back: corrections to our regulation dataset are published in plain text on request. If you find a mistake in our methodology, tell us; we'll fix it and credit you on this page.