Source-of-truth document

Methodology

Where haio’s numbers come from, how the price estimate is built, and what this dataset cannot tell you. Written so a journalist can cite a figure on any property page without legal or factual exposure, and so a first-time buyer can understand — and decide whether to trust — the estimate they are looking at.

Retrieved 2026-06-25 · coverage refreshed hourly

1. Coverage

1.1 Data sources

haio is a join over four public Singapore datasets. Nothing here is proprietary; every figure on the site can be reproduced from the sources below.

Source	Covers	Last refresh	Rows last run
URA (Urban Redevelopment Authority)	Private resale caveats — condo, apartment, EC, private landed. 1995–present.	2026-06-25	8,952
HDB via data.gov.sg	HDB resale transactions, full historical depth. Daily delta from 2026-06-02 onwards.	2026-06-25	233,431
SLA OneMap	Landed property registry, polygon-keyed addresses.	2026-06-25	93,090
MAS (Monetary Authority of Singapore)	SORA daily snapshots (1m / 3m / 6m), used for mortgage rate display only.	—	—

— means the source had not yet written a watermark at page-build time. All four datasets are public and free; haio adds no proprietary data on top.

1.2 Coverage by tenure

Live counts from the address-keyed properties and transactions tables. Pulled at page build, refreshed hourly.

HDB: 9,976 blocks; 974,548 txns
Condo / Apartment / EC: 3,127 projects; 598,259 txns
Landed: 97,065 addresses; 82,274 txns

Corpus total: 1,655,081 transactions across 110,168 addresses.

1.3 Update cadence

The database refreshes daily, in an overnight window (Singapore time). Each source updates incrementally.
SORA refreshes daily. URA caveats land roughly weekly behind URA’s own release schedule (caveats are lodged retrospectively).
HDB resale carries full historical depth, with new transactions added daily.

2. How the estimate works

haio values a home the way a careful human valuer does: find recent sales of genuinely similar properties nearby, then adjust each one for the ways it differs from the home you’re looking at. We don’t guess a price from a black box, and we don’t learn a number from features. Every estimate is built from real recorded transactions and a chain of adjustments you could check by hand.

The unit that does the work is PSF (price per square foot). A $2.4m sale of an 800 sqft flat and a $3.6m sale of a 1,200 sqft flat are both $3,000 PSF; reducing every sale to PSF lets us compare homes of different sizes on the same footing. For landed homes, the comparable unit is PSF of land, because what is being traded is mostly the plot.
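As a minimal sketch of the arithmetic above (the helper name psf is illustrative and not taken from haio's code):

```typescript
// Price per square foot: the common unit for comparing sales of different sizes.
// `psf` is a hypothetical helper name used only for this illustration.
function psf(priceSgd: number, areaSqft: number): number {
  if (!(areaSqft > 0)) {
    throw new RangeError("area must be a positive number of square feet");
  }
  return priceSgd / areaSqft;
}

const smallFlatPsf = psf(2_400_000, 800);   // 3000
const largeFlatPsf = psf(3_600_000, 1_200); // 3000
```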

2.1 Choosing the comparables

A comparable (“comp”) is a past sale similar enough to inform the subject’s price. Two things make a comp trustworthy: proximity (the same micro-market — prices change street by street) and recency (the market moves, so last quarter beats five years ago). We never mix tenure cohorts (HDB, condo and landed are separate markets) and never mix property types (a terrace is not compared to a detached house).

Rather than draw a fixed radius, the landed model widens its search in ordered steps and stops as soon as it has enough genuine comps:

the same named estate (e.g. Serangoon Gardens, Frankel);
the same subzone (a URA-defined planning sub-area);
adjacent subzones;
the wider planning area.

On top of that ladder, any sale within 300 metres of the subject is always allowed in, even if it sits across an estate boundary: a house two streets away is a better comp than one at the far end of the same label. The search also holds two hard limits: the comp’s plot size must be within about ±25% of the subject’s, and it should have sold within roughly the last 18 months. If a pocket of the market is so thin that these filters return too few sales, the model loosens them one notch at a time and records every step it took, because how far it had to reach is itself a signal of how much to trust the answer (see §2.4). For HDB and condo pages, the same idea runs in a simpler form: same district and tenure, similar floor area, most-recent sales first.
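The widening search described above might look roughly like the sketch below. All record shapes, field names and the minComps threshold are assumptions made for illustration (the text does not say how many comps count as "enough"), and the one-notch-at-a-time loosening of the size and recency limits is left out.

```typescript
// Hypothetical record shapes; every field name here is an assumption.
interface LandedSale {
  estate: string;
  subzone: string;
  planningArea: string;
  plotSqft: number;
  monthsSinceSale: number;
  metresFromSubject: number;
}

interface SubjectHome {
  estate: string;
  subzone: string;
  adjacentSubzones: string[];
  planningArea: string;
  plotSqft: number;
}

type Rung = (sale: LandedSale, subject: SubjectHome) => boolean;

// The four rungs of the ladder, narrowest first.
const LADDER: Rung[] = [
  (sale, subject) => sale.estate === subject.estate,
  (sale, subject) => sale.subzone === subject.subzone,
  (sale, subject) => subject.adjacentSubzones.indexOf(sale.subzone) !== -1,
  (sale, subject) => sale.planningArea === subject.planningArea,
];

function findComps(
  subject: SubjectHome,
  sales: LandedSale[],
  minComps: number, // "enough" is not quantified in the text; the caller chooses
): { comps: LandedSale[]; rungReached: number } {
  // Limits: plot within ±25% of the subject, sold within the last 18 months.
  const eligible = sales.filter(
    (sale) =>
      Math.abs(sale.plotSqft - subject.plotSqft) <= 0.25 * subject.plotSqft &&
      sale.monthsSinceSale <= 18,
  );
  let comps: LandedSale[] = [];
  let rungReached = 0;
  for (let step = 0; step < LADDER.length; step++) {
    rungReached = step;
    const openRungs = LADDER.slice(0, step + 1);
    comps = eligible.filter(
      (sale) =>
        sale.metresFromSubject <= 300 || // always admitted, whatever the label
        openRungs.some((matches) => matches(sale, subject)),
    );
    if (comps.length >= minComps) break; // stop as soon as there are enough
  }
  return { comps, rungReached };
}
```

Returning rungReached alongside the comps keeps a record of how far the search had to widen, which is the signal the confidence score in §2.4 draws on.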

2.2 Adjusting each comparable

No two homes are identical, so a raw comp PSF is rarely the right answer for the subject. We apply hedonic adjustments: “hedonic” just means decomposing a price into the value of its individual characteristics, then correcting for the ones that differ. Each factor below is estimated from the transaction history itself (a regression across the whole market), not hand-set, so it is defensible to a professional valuer.

Time. The market has moved since the comp sold. We carry its price to today using a quarterly price index, a measure of how landed (or condo, or HDB) prices have drifted, built from our own transaction record. A comp that sold when the index was 150 and is being used when the index is 165 is scaled up by 165÷150, a factor of 1.10. This is why a slightly older comp is still safe to use: we are not pretending it sold today, we are explicitly re-pricing it to today.
Size. Bigger plots usually sell for a lower PSF, because land gets cheaper per foot as the parcel grows. We measure exactly how much from the data (an elasticity, typically a 10–30% drop in PSF for a doubling of land area) and scale the comp toward the subject’s size.
Floor and stack (condos). Higher floors and better-facing stacks command a premium; URA records the floor as a 5-storey band, and the comparable mix reflects it.
Tenure and remaining lease. A 99-year leasehold loses value as its lease runs down; a freehold does not. Within the leasehold cohort we re-price a comp to the subject’s remaining years using Bala’s Table, the standard SLA/URA leasehold curve (see §4). Freehold and 999-year homes need no lease adjustment.
Age, structure and rebuilds (landed). A freshly rebuilt 3-storey house is worth more than a tired single-storey one on the same plot. We separate the land value from the building value, depreciate the building over its economic life, and credit recent reconstructions. A&A works that refresh a house without rebuilding it do not reset its age.

After these adjustments, each comp is no longer “what that house sold for”; it is “what that sale implies the subject is worth today, in today’s market.”
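The time and size adjustments can be sketched as two small functions. The function names are hypothetical, and the power-law form of the size adjustment is an assumption: the text gives only the size of the effect per doubling of land, not the functional form of the regression.

```typescript
// Re-price a comp's PSF from the index level at its sale date to today's level.
function timeAdjust(compPsf: number, indexAtSale: number, indexToday: number): number {
  return compPsf * (indexToday / indexAtSale);
}

// Scale a comp's PSF toward the subject's plot size.
// ASSUMED form: PSF falls by `dropPerDoubling` (e.g. 0.2 for 20%) each time the
// plot doubles, i.e. PSF is proportional to size^log2(1 - dropPerDoubling).
function sizeAdjust(
  compPsf: number,
  compSqft: number,
  subjectSqft: number,
  dropPerDoubling: number,
): number {
  const exponent = Math.log(1 - dropPerDoubling) / Math.LN2;
  return compPsf * Math.pow(subjectSqft / compSqft, exponent);
}

// A comp at $2,000 PSF, sold at index 150 and re-priced to index 165,
// then scaled from a 4,000 sqft plot to a 4,400 sqft subject.
const repriced = timeAdjust(2000, 150, 165);
const adjustedPsf = sizeAdjust(repriced, 4000, 4400, 0.2);
```

With identical plot sizes the size factor is exactly 1, so that adjustment leaves the comp untouched.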

2.3 The estimate and the confidence band

We combine the adjusted comps into a single number using a weighted median, not a simple average. The median is robust: one freak sale can’t drag the estimate the way it would drag a mean. The weights give more say to comps that are closer, more recent, and closer in size — a sale next door last month counts for more than one across the planning area two years ago.
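A weighted median can be computed by sorting the adjusted comps and walking up the cumulative weight until half of the total is reached. The sketch below takes the weights as given; how haio derives them from distance, recency and size is not specified here, and the tie-breaking convention (the lower weighted median) is an assumption.

```typescript
interface WeightedComp {
  psf: number;    // adjusted comp PSF
  weight: number; // larger for closer, more recent, more similar-sized comps
}

// Lower weighted median: the smallest PSF at which the cumulative weight
// reaches half of the total weight.
function weightedMedian(comps: WeightedComp[]): number {
  const sorted = comps.filter((c) => c.weight > 0).sort((a, b) => a.psf - b.psf);
  if (sorted.length === 0) {
    throw new RangeError("need at least one comp with positive weight");
  }
  const total = sorted.reduce((sum, c) => sum + c.weight, 0);
  let cumulative = 0;
  let last = Number.NaN;
  for (const comp of sorted) {
    cumulative += comp.weight;
    last = comp.psf;
    if (cumulative >= total / 2) return comp.psf;
  }
  return last; // only reached through floating-point rounding
}

// Three ordinary sales and one freak sale with a small weight.
const estimatePsf = weightedMedian([
  { psf: 2900, weight: 1 },
  { psf: 3000, weight: 3 },
  { psf: 3100, weight: 1 },
  { psf: 9000, weight: 0.5 },
]); // 3000
```

In this made-up example the weighted mean would be pulled up to roughly 3,545 by the single 9,000 PSF outlier, while the weighted median stays at 3,000.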

Around the point estimate we show a range, not a single figure. The width of that band is the spread of the adjusted comps themselves: when the comps agree closely, the band is tight; when they disagree, or when there are few of them, the band is wide. The band is honest by construction. It gets narrower where the evidence is strong and wider where it is thin, rather than being a cosmetic fixed ±percent.
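One way to turn the spread of the adjusted comps into a band is to take weighted quantiles either side of the median. The text does not say which spread statistic haio uses, so the 25th and 75th percentiles below are purely an assumption, as are all the names.

```typescript
interface AdjustedComp {
  psf: number;
  weight: number;
}

// Smallest PSF at which the cumulative weight reaches a share q of the total.
function weightedQuantile(comps: AdjustedComp[], q: number): number {
  const sorted = comps.filter((c) => c.weight > 0).sort((a, b) => a.psf - b.psf);
  if (sorted.length === 0) {
    throw new RangeError("need at least one comp with positive weight");
  }
  const total = sorted.reduce((sum, c) => sum + c.weight, 0);
  let cumulative = 0;
  let last = Number.NaN;
  for (const comp of sorted) {
    cumulative += comp.weight;
    last = comp.psf;
    if (cumulative >= q * total) return comp.psf;
  }
  return last; // only reached through floating-point rounding
}

// ASSUMPTION: an interquartile band around the weighted median.
function bandFromComps(comps: AdjustedComp[]): { low: number; mid: number; high: number } {
  return {
    low: weightedQuantile(comps, 0.25),
    mid: weightedQuantile(comps, 0.5),
    high: weightedQuantile(comps, 0.75),
  };
}
```

Because the band is read straight off the comps, closely agreeing comps produce a narrow range and scattered comps a wide one, with no fixed percentage involved.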

2.4 The confidence score

Every landed estimate also carries a 0–100 confidence score. It goes down when the model had to reach far for comps (loosening the search in §2.1), when there were few comps, when the comps disagreed with each other, or when key facts (like a home’s age) had to be inferred rather than known. A number is far more useful — and more honest — when it arrives with a measure of how much to trust it.

We hold ourselves to a hard test: the estimate must be measurably more accurate when it says it is confident. On a held-out test of 2025 landed sales the model never saw during tuning, its highest-confidence estimates landed within about 6–7% of the eventual sale price (median), versus a wider error on its low-confidence ones: the score tracks reality. Overall median error across all landed estimates was around 9%. That is an honest estimate of central value, not an appraisal: roughly half of homes will transact within ~9% of our number and half outside it, and individual homes, especially unusual ones, can differ more.
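The accuracy figures above are median absolute percentage errors. A check of this kind could be sketched as follows; the record shape, the function names and the confidence cut-off of 70 are all assumptions, since the text does not say where "high confidence" begins.

```typescript
interface HeldOutSale {
  estimate: number;   // model estimate made before the sale
  salePrice: number;  // eventual recorded price
  confidence: number; // 0-100 confidence score attached to the estimate
}

function median(values: number[]): number {
  if (values.length === 0) throw new RangeError("median of an empty list");
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Median of |estimate - sale price| / sale price, as a fraction (0.09 = 9%).
function medianAbsPctError(sales: HeldOutSale[]): number {
  return median(sales.map((s) => Math.abs(s.estimate - s.salePrice) / s.salePrice));
}

// Split at an ASSUMED cut-off and compare the two groups.
function errorByConfidence(
  sales: HeldOutSale[],
  cutoff = 70,
): { high: number; low: number } {
  return {
    high: medianAbsPctError(sales.filter((s) => s.confidence >= cutoff)),
    low: medianAbsPctError(sales.filter((s) => s.confidence < cutoff)),
  };
}
```

The score passes this kind of test when the high group's error comes out clearly below the low group's.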

3. What the estimate can and can’t say

The method above is only as good as the evidence underneath it. Where the evidence is thin, we widen the band, lower the confidence score, or decline to estimate — we would rather show you nothing than show you a confident wrong number. A clearly-uncertain estimate is the product working, not failing.

Thin markets. Rare enclaves and Good Class Bungalow areas transact a handful of times a year. With one or two comps, the band is wide and the confidence score is low by design; sometimes a single comp is all that exists, and we say so rather than manufacture precision.
Brand-new launches. A project with no resale history has no like-for-like comps; new-sale prices behave differently from resale, and our error is higher there. We flag these rather than pretend otherwise.
Unusual properties. An oversized plot, a corner site, an unusual structure, or a home whose age or lease we can only infer will all widen the band: the further a property sits from its neighbours’ profile, the less a comparable-sales method can pin it down.
Properties with no transaction history at all. Some addresses exist in the registry but have never lodged a caveat (see §7.1). For those we can confirm the address exists, not what it is worth.

The estimate is a transparent, reproducible read on central value from public transaction data. It is not a formal valuation or an appraisal, and it is not financial advice. For a binding number — a mortgage, a sale, a dispute — commission a licensed valuer.

3b. The comparables table on each page

So you can audit the estimate yourself, every property page shows the actual comparable sales behind it (source: lib/data/comparables.ts). The table is deliberately plain. It shows what URA / HDB recorded, filtered only by the criteria below, with no price filter and no outlier trimming:

Same tenure cohort and property type as the subject — never mixed.
Nearby, per the proximity ladder in §2.1.
Similar in size to the subject.
Transacted recently, most recent first; up to 12 shown by default.

The table shows the raw recorded prices. The estimate in §2 is what those same sales imply after the time, size, tenure and structure adjustments, so a comp’s headline price and its contribution to the estimate can legitimately differ.

4. Lease decay

Leasehold property value decays with remaining tenure. We surface a projected residual value on leasehold pages using the Bala’s Table coefficients, the standard curve published by SLA / URA for leasehold valuation.

Implementation: lib/calculators/lease-decay.ts. Eleven anchor points from 99 years remaining (multiplier 1.000) down to 0 years (multiplier 0.000); intermediate years are linearly interpolated between adjacent anchors.

Applied only when tenure_type is 99 or 999 AND lease_start is populated. Freehold properties and properties with an unknown lease start show no decay projection.
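The interpolation and the applicability rule could be sketched as below. This is not the contents of lib/calculators/lease-decay.ts: the function names, the string encoding of tenure_type and the treatment of lease_start as a calendar year are assumptions, and the anchors are passed in as a parameter because the actual Bala's Table coefficients are not reproduced here.

```typescript
type Anchor = [number, number]; // [years remaining, multiplier]

// Linear interpolation between adjacent anchors, clamped at both ends.
function leaseMultiplier(yearsRemaining: number, anchors: Anchor[]): number {
  if (anchors.length === 0) throw new RangeError("at least one anchor is required");
  const sorted = anchors.slice().sort((a, b) => a[0] - b[0]); // ascending by years
  const lowest = sorted[0];
  const highest = sorted[sorted.length - 1];
  if (yearsRemaining <= lowest[0]) return lowest[1];
  if (yearsRemaining >= highest[0]) return highest[1]; // e.g. a 999-year lease
  for (let i = 1; i < sorted.length; i++) {
    const [x0, y0] = sorted[i - 1];
    const [x1, y1] = sorted[i];
    if (yearsRemaining <= x1) {
      return y0 + ((yearsRemaining - x0) / (x1 - x0)) * (y1 - y0);
    }
  }
  return highest[1];
}

interface LeaseFields {
  tenure_type: "99" | "999" | "freehold" | null; // encoding is an assumption
  lease_start: number | null;                    // assumed to be a calendar year
}

// Returns null when no decay projection should be shown.
function decayMultiplier(home: LeaseFields, asOfYear: number, anchors: Anchor[]): number | null {
  if (home.lease_start === null) return null;
  if (home.tenure_type !== "99" && home.tenure_type !== "999") return null;
  const termYears = home.tenure_type === "99" ? 99 : 999;
  const yearsRemaining = home.lease_start + termYears - asOfYear;
  return leaseMultiplier(yearsRemaining, anchors);
}

// PLACEHOLDER anchors for illustration only. The end points (99 -> 1.000,
// 0 -> 0.000) come from the text; the middle point is invented.
const exampleAnchors: Anchor[] = [[99, 1.0], [49, 0.6], [0, 0.0]];
```

With these placeholder anchors, a 99-year lease that started in 2000 has 74 years left in 2025 and interpolates to a multiplier of about 0.8; the real table would give a different figure.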

5. Affordability: LTV / TDSR / MSR

The affordability calculator implements the MAS envelope as published in MAS Notice 645 (TDSR) and the corresponding MSR framework for HDB / EC purchases. Source: lib/calculators/affordability.ts.

TDSR cap: 55% of gross monthly income, inclusive of all recurring debt obligations.
MSR cap: 30% of gross monthly income, HDB and EC only. The tighter of TDSR and MSR binds.
Stress-test rate: 4.00% p.a. — MAS-mandated medium-term rate. Used for the TDSR / MSR repayment calculation regardless of the live SORA rate, exactly as banks underwrite.
Default LTV: 75% — standard first-loan LTV on a private property or an HDB resale bank loan. Lower (55% or 45%) tiers apply where the loan tenor extends past age 65, or beyond 25 (HDB) / 30 (private) years; surface those through the calculator’s tenor input.
Default tenor: 30 years. Cap is 25 (HDB) / 30 (private) by regulation; the calculator enforces this.

This is a structural calculator, not a credit decision. Banks add their own underwriting (credit history, employment type, foreign-income haircut, etc.) on top.
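The income side of the calculation can be sketched as follows. It is an illustration of the rules listed above, not the contents of lib/calculators/affordability.ts: the names are hypothetical, monthly compounding of the 4.00% rate is an assumption, and the LTV and tenor caps are left out for brevity.

```typescript
type PropertyClass = "hdb" | "ec" | "private";

const TDSR_CAP = 0.55;    // share of gross monthly income, all debts included
const MSR_CAP = 0.30;     // share of gross monthly income, HDB and EC only
const STRESS_RATE = 0.04; // per annum, used for the affordability check

// Room for a new mortgage repayment: the tighter of TDSR and MSR binds.
function maxMonthlyRepayment(
  grossMonthlyIncome: number,
  otherMonthlyDebt: number,
  propertyClass: PropertyClass,
): number {
  const tdsrRoom = TDSR_CAP * grossMonthlyIncome - otherMonthlyDebt;
  const msrRoom = propertyClass === "private" ? Infinity : MSR_CAP * grossMonthlyIncome;
  return Math.max(0, Math.min(tdsrRoom, msrRoom));
}

// Present value of a level monthly repayment: the largest loan it can service.
// ASSUMPTION: the annual rate is applied as annualRate / 12 per month.
function maxLoan(monthlyRepayment: number, tenorYears: number, annualRate = STRESS_RATE): number {
  const r = annualRate / 12;
  const n = tenorYears * 12;
  if (r === 0) return monthlyRepayment * n;
  return (monthlyRepayment * (1 - Math.pow(1 + r, -n))) / r;
}

// Household earning $10,000 a month with $500 of other debt, buying an HDB flat.
const repaymentRoom = maxMonthlyRepayment(10_000, 500, "hdb"); // 3000 (MSR binds)
const loanCeiling = maxLoan(repaymentRoom, 25);
```

For the same household buying a private property, MSR does not apply and the TDSR room of $5,000 a month becomes the binding limit.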

5b. Mortgage rates

The Mortgage calculator quotes a monthly payment built from a live reference rate plus a bank spread. Two distinct rates do different jobs — one for the payment shown, one for the affordability check — and conflating them is a common source of confusion. Source: lib/calculators/mortgage.ts.

5b.1 SORA — the live reference

SORA (Singapore Overnight Rate Average) replaced SIBOR in 2024 as the MAS-endorsed reference for SGD floating-rate loans. The calculator uses the published 3-month compounded SORA from www.mas.gov.sg/statistics/sgs-rates, updated each business day. The headline rate you see at the top of the calculator is SORA + spread; this is what your actual monthly payment is computed against.

5b.2 MAS 4% stress test — the affordability gate

For TDSR / MSR (see §5), MAS requires banks to underwrite at a medium-term floor of 4.00% p.a., regardless of the live SORA. This is a regulatory minimum to keep borrowers solvent if rates rise. haio applies the same 4% rate inside the affordability check, exactly as a bank would. That means the maximum loan you qualify for is computed at 4%, while the monthly payment shown is computed at the live SORA + spread, which is usually lower.

5b.3 Bank spread variability

The spread over SORA (typically 50–100 basis points) varies by bank, by loan size, by lock-in tenor, and by whether the loan is for a private property, HDB resale, or commercial. The calculator uses a neutral default; real bank packages can come in tighter or wider. Treat the payment shown as directional — for a binding quote, speak to a mortgage broker or the lender directly.
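The two rates can be seen side by side in a short sketch using the standard level-payment amortisation formula. The function names are hypothetical, monthly compounding is an assumption, and the SORA and spread values in the usage lines are placeholders rather than real quotes.

```typescript
// Level monthly payment on an amortising loan.
// ASSUMPTION: the annual rate is applied as annualRate / 12 per month.
function monthlyPayment(principal: number, annualRate: number, tenorYears: number): number {
  const r = annualRate / 12;
  const n = tenorYears * 12;
  if (r === 0) return principal / n;
  return (principal * r) / (1 - Math.pow(1 + r, -n));
}

// Payment shown to the user: live 3-month compounded SORA plus a bank spread.
function displayedPayment(
  principal: number,
  soraPct: number,   // e.g. 2.5 for 2.50% p.a.
  spreadBps: number, // e.g. 75 for 0.75% p.a.
  tenorYears: number,
): number {
  const annualRate = soraPct / 100 + spreadBps / 10_000;
  return monthlyPayment(principal, annualRate, tenorYears);
}

// Placeholder inputs: a $500,000 loan over 30 years.
const shownToUser = displayedPayment(500_000, 2.5, 75, 30);
const usedForStressTest = monthlyPayment(500_000, 0.04, 30);
```

Whenever SORA plus the spread is below 4%, the payment shown is lower than the one used in the affordability gate.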

6. Stamp duty

Implemented per IRAS-published bands. Source: lib/calculators/stamp-duty.ts.

6.1 BSD — Buyer’s Stamp Duty (residential)

Band of purchase price	Rate
First $180,000	1%
Next $180,000	2%
Next $640,000	3%
Next $500,000	4%
Next $1,500,000	5%
Remainder	6%
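The bands are marginal: each rate applies only to the slice of the price that falls inside its band. A minimal sketch of that calculation, using the table above (the names are hypothetical and any rounding of the final amount is not modelled):

```typescript
// [band width in SGD, rate in percent]; the last band is open-ended.
const BSD_BANDS: Array<[number, number]> = [
  [180_000, 1],
  [180_000, 2],
  [640_000, 3],
  [500_000, 4],
  [1_500_000, 5],
  [Infinity, 6],
];

function buyersStampDuty(price: number): number {
  let remaining = price;
  let duty = 0;
  for (const [width, ratePct] of BSD_BANDS) {
    if (remaining <= 0) break;
    const slice = Math.min(remaining, width); // part of the price inside this band
    duty += (slice * ratePct) / 100;
    remaining -= slice;
  }
  return duty;
}

// $1,000,000: 1,800 + 3,600 + 19,200
const dutyOnOneMillion = buyersStampDuty(1_000_000); // 24600
```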

6.2 ABSD — Additional Buyer’s Stamp Duty

Buyer	1st property	2nd property	3rd+ property
Singapore Citizen	0%	20%	30%
Singapore PR	5%	30%	35%
Foreigner	60% on every purchase
Entity / Trust	65% on every purchase
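The ABSD table reduces to a lookup by buyer profile and property count. The sketch below mirrors the table above; the type and function names are hypothetical, and remission schemes are not modelled.

```typescript
type BuyerProfile = "citizen" | "pr" | "foreigner" | "entity";

// Rates in percent for the 1st, 2nd and 3rd-or-later residential property.
const ABSD_PCT: Record<BuyerProfile, [number, number, number]> = {
  citizen: [0, 20, 30],
  pr: [5, 30, 35],
  foreigner: [60, 60, 60],
  entity: [65, 65, 65],
};

// propertyCount: 1 for a first residential property, 2 for a second, 3+ beyond that.
function absdRatePct(buyer: BuyerProfile, propertyCount: number): number {
  const column = Math.min(Math.max(Math.floor(propertyCount), 1), 3) - 1;
  return ABSD_PCT[buyer][column];
}

function absd(price: number, buyer: BuyerProfile, propertyCount: number): number {
  return (price * absdRatePct(buyer, propertyCount)) / 100;
}

// A PR buying a second property at $1,000,000.
const absdForPrSecondHome = absd(1_000_000, "pr", 2); // 300000
```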

6.3 SSD — Seller’s Stamp Duty

If the property is resold within:

1 year: 12% of resale price
2 years: 8%
3 years: 4%
Beyond 3 years: nil

Source: IRAS Singapore. Rates correct as of the last review on 2026-06-02. ABSD remission schemes (matrimonial, mixed-nationality) are not modelled.

7. Known data limitations

What this dataset cannot tell you. Read this before citing a figure.

7.1 Cadastral-only landed addresses

Of the 97,065 landed addresses we surface, 50,905 (52%) come from the SLA polygon registry alone: we have a postal address but no transactional evidence (no caveat ever lodged). These pages carry a cadastral_only flag and explicitly say so. Treat them as “this address exists”, not “this address is on the market”.

7.2 Private leasehold reclassification

Prior to commit e803379 (2026-06-02), private condo rows defaulted to tenure_type = freehold when URA’s caveat had no explicit tenure field. This systematically misclassified an unknown number of leasehold projects. The ETL now derives tenure_type and remaining_lease directly from URA’s parquet feed. If you are citing a tenure label, prefer a page retrieved after 2026-06-02.

7.3 Sparse HDB floor area

HDB area_sqft is missing on a non-trivial share of older resale rows. When a transaction has no area, no PSF can be computed; charts that aggregate PSF silently drop these. Where all rows in scope lack area, the chart falls back to absolute price and labels the axis accordingly.
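The drop-then-fall-back behaviour could be sketched like this. Only area_sqft is a name taken from the text; the other field names, the return shape and the axis labels are assumptions for illustration.

```typescript
interface ResaleRow {
  price: number;
  area_sqft: number | null; // null when the source row has no floor area
}

interface ChartSeries {
  metric: "psf" | "price";
  axisLabel: string; // hypothetical label text
  values: number[];
}

function chartSeries(rows: ResaleRow[]): ChartSeries {
  // Rows without a usable area cannot yield a PSF and are left out.
  const psfValues: number[] = [];
  for (const row of rows) {
    if (row.area_sqft !== null && row.area_sqft > 0) {
      psfValues.push(row.price / row.area_sqft);
    }
  }
  if (psfValues.length > 0) {
    return { metric: "psf", axisLabel: "Price per sq ft (S$)", values: psfValues };
  }
  // No row in scope has an area: fall back to absolute price.
  return { metric: "price", axisLabel: "Price (S$)", values: rows.map((row) => row.price) };
}
```

A consequence worth keeping in mind when citing a PSF chart: the number of points plotted can be smaller than the number of transactions in scope.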

7.4 Floor band granularity

URA publishes the floor of a caveat as a 5-storey band (e.g. 06–10), not an exact floor. We surface the band as-is. Two units on different floors of the same band are indistinguishable to us.

7.5 HDB watermark seeded 2026-06-02

The historical HDB resale corpus is in the database. The daily delta-write watermark was seeded at 2026-06-02T04:00:58Z to keep the cron memory-bounded. Any HDB row with updated_at earlier than that came in via backfill; later rows came in via the daily delta.

7.6 What we don’t have

No rental yields are computed (rental data for HDB exists in our ETL but is not surfaced on property pages). No predicted appreciation. No future-supply impact modelling beyond the raw upcoming-supply registry. No agent or developer ratings. No listings; haio is a transaction-history site, not a marketplace.

8. En-bloc likelihood

On a private condo page we may surface an en-bloc (collective-sale) likelihood tier. It answers one narrow question: relative to other private condos, how does this development rank for the chance of a successful collective sale over roughly the next 5 years?

8.1 A tier, never a percentage

We show a relative tier (Lower, Moderate, or Higher likelihood) and never a number. The honest reason: our model produces a literal probability, but that probability is not yet trustworthy on its own (it is inflated in the mid-range and moves with the property cycle). What is reliable is the ranking: the top tier genuinely concentrates the developments that go on to transact collectively. So we surface the rank-honest tier and deliberately withhold the raw percentage. A tier is a ranking signal, not a probability and not a prediction of any specific outcome.

8.2 How the tier is derived

A gradient-boosted model scores each eligible private condo from structural factors — building age, remaining lease, Master-Plan plot ratio (development headroom), unit count, and recent collective-sale activity nearby. Each development is then placed by its percentile within the scored universe: the top decile is Higher, the next two deciles are Moderate, and the bottom 70% is Lower. The card lists the two or three factors pushing a given development up its ranking.
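The percentile-to-tier step can be sketched on its own, independent of the model that produces the scores. The names below are hypothetical, and ties are broken by sort order, which is an assumption.

```typescript
type Tier = "Higher" | "Moderate" | "Lower";

interface ScoredDevelopment {
  id: string;
  score: number; // raw model score; only its rank is used
}

// Top decile -> Higher, next two deciles -> Moderate, bottom 70% -> Lower.
function assignTiers(universe: ScoredDevelopment[]): Record<string, Tier> {
  const ranked = universe.slice().sort((a, b) => b.score - a.score);
  const n = ranked.length;
  const tiers: Record<string, Tier> = {};
  ranked.forEach((dev, i) => {
    // Integer comparisons avoid floating-point edge cases at the cut points.
    if (i * 10 < n) tiers[dev.id] = "Higher";
    else if (i * 10 < 3 * n) tiers[dev.id] = "Moderate";
    else tiers[dev.id] = "Lower";
  });
  return tiers;
}
```

Because only the rank enters the tier, rescaling or recalibrating the raw scores leaves every development's tier unchanged as long as the ordering holds.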

8.3 Grounded in real past deals

The model is trained and validated against actual historical collective-sale outcomes from public collective-sales data. Where we have them, the card names a few comparable past deals (development + year) so you can see the basis. When the model has too little structural data for a development, it abstains — we show no tier rather than a guess.

8.4 Limits

A real en-bloc needs the requisite owner consent (80% or 90% by share value depending on the development’s age) and a buyer willing to pay — neither of which a structural model can see. Treat the tier as a starting signal for further research, not as advice or a forecast. It is refreshed periodically as new collective-sale outcomes and URA Master-Plan data come in.

9. How to cite haio

haio is a free public resource. We’d rather you cite the underlying URA / HDB / SLA / MAS sources where possible — haio is a convenience layer over them, not a separate authority. When a figure comes directly from a haio computation (e.g. our price estimate, our comparables ordering, our lease-decay projection), please attribute it.

Suggested format

Source: haio (https://haio.sg), based on URA / HDB / SLA /
MAS public data, retrieved 2026-06-25.

Per-property pages have stable URLs of the form /{tenure}/{slug}, for example /condo/the-orie or /hdb/ang-mo-kio-ave-3-block-123. Linking to a property page is the best way to let a reader audit the number you’re quoting.