The Theory of Real Estate Gravity

There is a particular kind of idea that gets killed at the dinner-party stage. You say it out loud, someone smart leans back, and they tell you it's cute. “Property has gravity — parcels pull on each other like masses. Cute. But that's never going to hold.”

And they're probably right. It's a corollary, a borrowed picture, an analogy dressed up in physics it didn't earn.

But here is the question I keep coming back to, the one that almost nobody asks before they wave the idea off: Have you ever actually run the equation? Have you made the observation?

No. Almost no one has. Because until very recently, you couldn't. The data didn't exist in a form you could touch, the compute was too expensive, and the patience required — the willingness to learn in sequence, one compounding step at a time — wasn't worth it for a metaphor.

That constraint is gone. So I ran it.

What follows is not a proof. I want to be precise about that from the first paragraph, because the entire project lives or dies on its honesty. This is an experiment. It's a platform for noticing. The argument isn't “real estate obeys gravity.” The argument is something smaller and, I think, much more interesting: a market event is now measurable at the parcel level, and once something is measurable, you are obligated to measure it before you tell me it's cute.

A platform for noticing

Start with what's genuinely new, because it isn't the metaphor — the gravity analogy is old enough to have grandchildren. What's new is the platform underneath it.

For most of the history of commercial real estate, the binding constraint on insight was never access to information. It was the cost of aggregating it. The same public records, the same permits, the same business filings, the same demographic releases, the same traffic counts — they were sitting in plain sight, available to everyone, and therefore useful to almost no one, because no human and no small team could hold enough of them in view at once to see the shape they made.

AI changes exactly one thing, but it's the thing that matters: it collapses the cost of aggregation to nearly zero. It lets you take the ordinary, public, democratically available information that everybody has and assemble it at a scale where you can take a measurement. You can do for a local market roughly what a physicist does when measuring gravity at small scales: stack enough careful observations that a signal too faint to feel becomes a number you can write down.

That is the whole trick. Not prediction. Not magic. Noticing, at volume.

The numbers are almost embarrassingly modest, and I lead with them on purpose, because the modesty is the point.

~5,000 · Parcels in the predictive surface

11M+ · Predictive calculations

~$2,509 · Total build cost

< 3 months · Since March 3, 2026

I tell you the cost not to brag but to make a claim about the future: if a solo operator can stand this up for the price of a used car in a single quarter, then the era in which aggregation was a moat is over. What's left as a moat is method and integrity — the discipline of how you measure and whether you'll tell the truth about what you found.

The method is sequential. The system learns in sequence and it learns by different sequences — it changes the values, re-weights the inputs, and re-runs, over and over, so that it is continuously taking the temperature of the market in every facet at once. Each pass is a compounding step on the last. None of it is a single oracle pronouncement; all of it is an accumulation of measurable, auditable observations. That posture has a house style, and the style is non-negotiable: provenance is the product. No mock data, ever. Every forecast is walk-forward and pre-registered — we write down what we expect before we look, so we can't quietly move the goalposts after.

With that on the table, we can finally talk about gravity without lying to you.

Borrowing Newton on purpose

Newton's law of universal gravitation is the cleanest sentence in the history of science:

Newton's law of universal gravitation: F = G · (m₁ · m₂) / r², where F is the force, G is the gravitational constant, m₁ and m₂ are the two masses, and r is the distance between them.

Two bodies attract in proportion to the product of their masses and in inverse proportion to the square of the distance between them. Big things pull hard. Distance kills the pull, fast.

The real-estate corollary writes itself, which is exactly why it's suspicious. A parcel has a mass — some accumulated weight of what it is and what's happening on it. That mass exerts a field on the parcels around it. And the field weakens with distance: a transaction next door moves you more than the identical transaction a mile away.

So far, so cute. The dinner-party skeptic is still right. An analogy that feels true is the most dangerous kind, because it asks for your assent before it's earned a single observation.

The way you make an analogy earn its keep is to refuse to let it stay poetic. You force it to become arithmetic. You define your terms so tightly that they can be wrong, you assign them numbers, you run them against held-out reality, and you keep the receipts. That's where Mass and Field stop being a picture and start being a measurement.

Mass and Field

In the GRAVITY engine, every parcel resolves to a signed tuple: (Mass, Field). Both terms are normalized onto a −100 to +100 scale, and the sign carries as much information as the magnitude.

Mass is intrinsic. It's the weighted accumulation of a parcel's own signals — what is physically and economically true of that specific piece of ground: the strength of what occupies it, the density and recency of activity on it, the entitlements, the rent it can command, the tenants whose presence is itself a signal. Mass answers: how heavy is this body in its own right?

Field is relational. It's the net influence the parcel sits inside — the sum of every neighbor's mass, bleeding across boundaries and attenuated by distance. Field answers: what is the gravitational neighborhood doing to this body, regardless of the body itself?

Feel the field

Interactive: drag the heavy parcel and every cell around it re-colors by field = Σ mass / r², warm where the pull is strong, cool where distance has killed it.

The reason the tuple is signed is the most important design decision in the whole engine. Mass and Field can each be positive (attractive, appreciating, accreting) or negative (repulsive, declining, shedding). A dark anchor is negative mass — a heavy body, but one that pulls value down. And critically, Mass and Field can disagree. A parcel can have weak intrinsic mass but sit in a strongly positive field (the rising-tide pad site). Or strong mass trapped in a souring field (the good building on the wrong block). Those divergences — where the two numbers point in opposite directions — are where the most actionable signal lives, because they are exactly the things a single-parcel appraisal mindset cannot see.

Schematically, for a parcel p with neighbors j, the field is built from three terms:

Field, schematic form:

Field(p) = Σⱼ [ Aₓ · Mass(j) / d(p,j)^α ] + C(cluster) + G(base)

G — base gravity: the parcel's own mass converted into the floor of its score. The body's own weight.
Aₓ — adjacent bleed: the coefficient governing how much a neighbor's mass leaks across the boundary, divided by distance raised to a falloff exponent α. This is the inverse-square family — the “distance kills the pull” term.
C — cluster bonus: the super-additive term. When several masses concentrate, the combined field is worth more than the sum of the parts — the way a galaxy is more than a list of stars. This is the term that makes corridors and nodes behave like nodes.

I'm showing you the shape, not the constants. The exponents and coefficients are tuned, versioned, and auditable; I won't assert specific numbers in prose because the moment a number leaves its audit trail it becomes a claim I can't stand behind. Provenance is the product.
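To make the shape concrete without pretending to know the constants, here is a minimal Python sketch of the schematic. Everything numeric in it is an invented placeholder: `BLEED`, `ALPHA`, `BASE_WEIGHT`, `CLUSTER_RADIUS`, `CLUSTER_BONUS`, and `MIN_DIST` are hypothetical names and values, the planar coordinates are an assumption, and the cluster term is one simple reading of "super-additive" (a bonus for each additional positive neighbor inside a radius). It is not the engine's tuned form.

```python
import math
from dataclasses import dataclass

@dataclass
class Parcel:
    pid: str
    x: float      # planar position in arbitrary units (assumption)
    y: float
    mass: float   # signed intrinsic score on the -100..+100 scale

# Placeholder constants for illustration only; not the engine's tuned values.
BLEED = 0.5           # stands in for the adjacent-bleed coefficient A
ALPHA = 2.0           # falloff exponent; 2.0 is the textbook inverse square
BASE_WEIGHT = 0.2     # stands in for G: share of the parcel's own mass used as a floor
CLUSTER_RADIUS = 1.5  # neighbors inside this distance count toward the cluster term
CLUSTER_BONUS = 2.0   # stands in for C: bonus per additional close positive neighbor
MIN_DIST = 0.1        # avoids division by zero for co-located parcels

def field(p, parcels):
    """Schematic Field(p): distance-attenuated bleed + cluster bonus + base."""
    bleed = 0.0
    close_positive = 0
    for j in parcels:
        if j.pid == p.pid:
            continue
        d = max(math.hypot(p.x - j.x, p.y - j.y), MIN_DIST)
        bleed += BLEED * j.mass / d ** ALPHA
        if d <= CLUSTER_RADIUS and j.mass > 0:
            close_positive += 1
    cluster = CLUSTER_BONUS * max(close_positive - 1, 0)
    base = BASE_WEIGHT * p.mass
    # Clamp onto the same -100..+100 scale the tuple uses.
    return max(-100.0, min(100.0, bleed + cluster + base))

pad = Parcel("pad", 0.0, 0.0, 5.0)
anchor = Parcel("anchor", 1.0, 0.0, 80.0)
junior_a = Parcel("junior_a", 0.0, 1.0, 30.0)
junior_b = Parcel("junior_b", -1.0, 0.0, 30.0)

print(field(pad, [pad, anchor]))                      # anchor alone
print(field(pad, [pad, anchor, junior_a, junior_b]))  # anchor plus two juniors
```

With these placeholders, the second reading exceeds the first by more than the two juniors' bleed alone, because the cluster term adds on top. That is the only behavior the sketch is meant to show.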

The quants, and why these and not the others

This is the section the skeptic should actually fight me on, because this is where the choices are. Every one of these methods beat out a more obvious candidate, and the reasons are the argument.

The spatial kernel — why inverse-square, not Gaussian. The bleed term needs a function that says how influence falls off with distance. The lazy default is a Gaussian kernel — a smooth bell that's easy to fit. We use a power-law (inverse-square family) instead, because a Gaussian has thin tails: it decides, essentially, that beyond a certain radius a neighbor's mass simply doesn't exist. Real markets don't work that way. A regional draw — a stadium, a mall, a hospital — exerts a weak but real pull across distances a Gaussian would zero out. Power-law decay preserves the fat tail. It lets distant heavy bodies still matter a little, which is empirically how anchors actually behave.
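The thin-tail point is easy to check numerically. The sketch below compares a power-law kernel with a Gaussian one at growing distances. The function names, `alpha=2.0`, `sigma=1.0`, and the reference distance `d0` are illustrative assumptions, not the engine's settings.

```python
import math

def power_law(d, alpha=2.0, d0=1.0):
    """Influence relative to the reference distance d0, decaying as (d0/d)**alpha."""
    return (d0 / max(d, d0)) ** alpha

def gaussian(d, sigma=1.0):
    """Gaussian kernel with bandwidth sigma."""
    return math.exp(-(d * d) / (2.0 * sigma * sigma))

for d in (1, 2, 5, 10, 20):
    print(f"d={d:>2}  power-law={power_law(d):.6f}  gaussian={gaussian(d):.3e}")
```

At ten units of distance the power-law kernel still passes through one percent of the pull, while the Gaussian has fallen below anything a floating-point score would register as signal.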

Time — why a Fibonacci-staged decay, not a single exponential. Signals age. The standard tool is exponential decay, which has exactly one knob: a half-life, a single constant rate at which the past stops mattering. The problem is that markets have memory at multiple timescales at once — last week's comp dominates, but the last cycle still echoes, and a single exponential is forced to pick one rate and betray the other. The THETA engine uses a Fibonacci-staged decay across 167 parameters precisely to approximate a heavier-tailed, multi-horizon memory cheaply: recent observations carry the most weight, while a fatter tail of older signal survives instead of being crushed to zero. It's a way of admitting that the market remembers on more than one clock.
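One way to picture a staged decay is sketched below. This is an illustrative reading only: the text above does not spell out how THETA stages its weights, so the scheme here (age buckets whose edges are Fibonacci numbers, with the weight multiplied by a fixed `ratio` at each stage) is an assumption, and `fib_edges`, `staged_weight`, and `exp_weight` are hypothetical names.

```python
def fib_edges(max_age):
    """Stage boundaries 1, 2, 3, 5, 8, ... extended until they pass max_age."""
    edges = [1, 2]
    while edges[-1] <= max_age:
        edges.append(edges[-1] + edges[-2])
    return edges

def staged_weight(age, ratio=0.5, max_age=1000):
    """Weight of a signal `age` periods old: ratio**stage, stage set by Fibonacci edges."""
    for stage, edge in enumerate(fib_edges(max_age)):
        if age < edge:
            return ratio ** stage
    return 0.0  # beyond the last stage

def exp_weight(age, half_life=1.0):
    """Single-rate exponential decay for comparison."""
    return 0.5 ** (age / half_life)

for age in (0, 1, 3, 8, 34):
    print(f"age={age:>2}  staged={staged_weight(age):.6f}  exponential={exp_weight(age):.3e}")
```

Both curves agree at age 1, but because each stage is wider than the last, the staged weight falls roughly like a power of age rather than exponentially. That is the multi-horizon memory the paragraph describes: recent signal dominates and old signal is discounted without being erased.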

Coupling — why an Ising model, not pairwise correlation. Here's the cross-disciplinary borrow I'm proudest of and most cautious about. The naive way to capture “parcels move together” is correlation — a static, pairwise number. But correlation can't generate collective behavior; it can only describe it after the fact. The ZETA Prime engine uses an LLM-conditioned Ising model instead. The Ising model comes from statistical physics: imagine each body as a spin that wants to align with its neighbors, with a coupling strength J and a “temperature” T that controls how much disorder the system tolerates. Its defining property — the one correlation can never reproduce — is the phase transition: at a critical temperature, a system of locally-coupled bodies will suddenly, collectively flip from disordered to ordered. That is the mathematical signature of a submarket “tipping” — the moment everyone seems to move at once. The LLM's job is to read unstructured signal and set the local fields and couplings; the Ising lattice turns those local states into emergent, market-wide behavior with a critical point you can watch approach.
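The phase transition can be seen in a few lines of plain Python. This is a textbook Metropolis simulation of a two-dimensional Ising lattice, not the ZETA Prime engine: there is no LLM conditioning and no local field, and `simulate`, its lattice size, and its sweep count are illustrative assumptions. For coupling J = 1 on a square lattice the exact critical temperature is about 2.27.

```python
import math
import random

def simulate(temp, size=16, sweeps=200, coupling=1.0, seed=0):
    """Metropolis dynamics on a size x size periodic lattice; returns |mean spin|."""
    rng = random.Random(seed)
    spins = [[1] * size for _ in range(size)]  # start fully ordered
    for _ in range(sweeps * size * size):
        i, j = rng.randrange(size), rng.randrange(size)
        neighbors = (spins[(i + 1) % size][j] + spins[(i - 1) % size][j]
                     + spins[i][(j + 1) % size] + spins[i][(j - 1) % size])
        # Energy change from flipping spin (i, j) under E = -J * sum of s_a * s_b.
        d_energy = 2.0 * coupling * spins[i][j] * neighbors
        if d_energy <= 0 or rng.random() < math.exp(-d_energy / temp):
            spins[i][j] = -spins[i][j]
    total = sum(sum(row) for row in spins)
    return abs(total) / (size * size)

for temp in (1.0, 2.0, 3.0, 5.0):
    print(f"T={temp:.1f}  |magnetization|={simulate(temp):.2f}")
```

Below the critical temperature the lattice holds its alignment; above it the order dissolves. Nothing in the update rule mentions a collective state, which is the point: the tipping behavior emerges from purely local coupling.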

Temperature — why GARCH, not a constant. That temperature T in the Ising field is not a constant, because market volatility isn't constant — it clusters. Big moves follow big moves; quiet follows quiet. We estimate T with a GARCH process (generalized autoregressive conditional heteroskedasticity), the standard tool for series where today's variance depends on yesterday's shocks and yesterday's variance. The alternative — assuming a fixed variance — would tell the Ising field the market is equally hot in a frenzy and a freeze, which is exactly backwards. GARCH lets the system literally take the temperature, and feed that temperature into how readily the field tips.
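The recursion behind that sentence is short. Below is a minimal GARCH(1,1) variance path in Python, assuming hypothetical parameters `omega`, `alpha`, and `beta` and a made-up return series. How the conditional variance is mapped onto the Ising temperature is not specified above, so the sketch stops at the variance itself.

```python
def garch_variance(returns, omega=0.05, alpha=0.10, beta=0.85):
    """Conditional variance path: var[t] = omega + alpha * r[t-1]**2 + beta * var[t-1]."""
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite long-run variance")
    var = omega / (1.0 - alpha - beta)  # start at the long-run variance
    path = [var]
    for r in returns[:-1]:
        var = omega + alpha * r * r + beta * var
        path.append(var)
    return path

# Illustrative series: quiet, one large shock, then quiet again.
returns = [0.1, 0.1, 3.0, 0.1, 0.1]
print([round(v, 4) for v in garch_variance(returns)])
```

The variance jumps the period after the shock and stays elevated afterwards, decaying at rate `beta` rather than snapping back. That persistence is the volatility clustering a fixed variance cannot represent.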

Regimes — why a 5-state Markov chain, not change-point detection. Change-point detection tells you where the breaks were — useful, but backward-looking. We use a 5-state Markov chain (think: dormant → accelerating → expanding → peaking → contracting) because a Markov model gives you the thing a broker actually needs: forward transition probabilities. Not “a break happened in Q3,” but “given where this submarket is, here's the probability distribution over where it goes next.” Five states is the deliberate balance — enough to be expressive, few enough to estimate honestly without overfitting.
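A forward transition probability is just repeated multiplication by a transition matrix. The sketch below uses the five state names from the paragraph above; the matrix `P` is entirely hypothetical and chosen only so that each row sums to one.

```python
STATES = ["dormant", "accelerating", "expanding", "peaking", "contracting"]

# Hypothetical transition matrix: row i is P(next state | current state i).
P = [
    [0.80, 0.15, 0.03, 0.01, 0.01],
    [0.05, 0.60, 0.30, 0.03, 0.02],
    [0.01, 0.04, 0.65, 0.25, 0.05],
    [0.01, 0.02, 0.07, 0.55, 0.35],
    [0.30, 0.05, 0.02, 0.03, 0.60],
]

def step(dist, matrix):
    """Advance a probability distribution over states by one period."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]

def forecast(state, horizon, matrix=P):
    """Distribution over states `horizon` periods ahead, starting from `state`."""
    dist = [1.0 if s == state else 0.0 for s in STATES]
    for _ in range(horizon):
        dist = step(dist, matrix)
    return dict(zip(STATES, dist))

for name, prob in forecast("expanding", horizon=4).items():
    print(f"{name:<13}{prob:.3f}")
```

The output is the whole distribution over next states, not a single label, which is what separates a forward-looking regime model from a record of where the breaks were.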

Clustering — why SIGMA lets the clusters emerge. The SIGMA engine refuses to use zip codes, council districts, or named “submarkets” as its units, because administrative boundaries are not economic boundaries. It lets clusters form from the data — density and graph-community structure — so that a node is defined by how parcels actually behave together, not by a line a county drew in 1974.
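As a toy version of clusters that emerge rather than being assigned, the sketch below groups points into connected components of a "within radius" graph. It is a deliberate simplification: SIGMA's actual density and graph-community method is not specified here, and `emergent_clusters`, the coordinates, and the radius are illustrative assumptions.

```python
import math

def emergent_clusters(points, radius):
    """Group point indices into connected components of the 'within radius' graph."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        frontier = [seed]
        members = [seed]
        while frontier:
            i = frontier.pop()
            near = [j for j in unvisited if math.dist(points[i], points[j]) <= radius]
            for j in near:
                unvisited.remove(j)
            frontier.extend(near)
            members.extend(near)
        clusters.append(sorted(members))
    return sorted(clusters)

# Two groups that no boundary line was told about.
parcels = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (10.0, 10.0), (10.4, 10.0)]
print(emergent_clusters(parcels, radius=0.6))
```

Points 0 and 2 are farther apart than the radius, yet they land in the same group because point 1 links them. Membership comes from how the points sit relative to each other, not from any label attached in advance.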

Demand — why OMEGA fingerprints, not category labels. OMEGA encodes 230 tenant “fingerprints” — each tenant's real requirement profile as a vector — and matches them against parcels' field signatures across 1.5 million matches. “Fast-casual restaurant” is a label; a fingerprint is the actual co-tenancy, density, visibility, and field profile a specific concept needs. Labels group; fingerprints match.
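The difference between grouping and matching can be sketched with vectors and a similarity score. Cosine similarity is used here as a stand-in; the text above does not say which measure OMEGA uses, and the four feature slots, the parcel names, and every number are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def best_matches(fingerprint, parcels, top_n=3):
    """Rank parcels by similarity to a tenant fingerprint, best first."""
    scored = [(cosine(fingerprint, vec), pid) for pid, vec in parcels.items()]
    return sorted(scored, reverse=True)[:top_n]

# Hypothetical feature order: co-tenancy, density, visibility, field.
tenant = [0.9, 0.6, 0.8, 0.7]
parcels = {
    "corner_pad": [0.8, 0.7, 0.9, 0.6],
    "mid_block": [0.2, 0.9, 0.1, 0.3],
    "dark_box": [0.7, 0.5, 0.8, -0.6],
}
for score, pid in best_matches(tenant, parcels):
    print(f"{pid:<11}{score:.3f}")
```

A category label would put all three parcels in the same bucket or none of them. The vector comparison ranks them, and lets a negative field reading count against an otherwise good fit.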

And stitching it together: DELTA ingests (108+ sources across ten categories), THETA handles time, GRAVITY handles space, SIGMA finds the groups, ZETA holds the entity portfolios (244,277 of them), ZETA Prime runs the dynamics, OMEGA carries demand, and SID orchestrates the whole fabric.

A model you can't be wrong with isn't a model; it's a horoscope.

Every one of those choices is a place I could be wrong. That's the feature.

What the measurements say so far

So — does any of it work? Here the discipline matters more than the result, so let me state the discipline first: every number below comes from walk-forward, pre-registered evaluation. The model is only ever scored on parcels and time windows it was not allowed to see during training, and the expectations were written down in advance.
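For readers who have not met the terms, here is a minimal sketch of both halves of that discipline. `walk_forward_splits` yields training windows that always end before the test window begins. `preregister` is one hypothetical way to make "written down in advance" checkable, by hashing the stated expectations before any outcome is loaded. Both function names and the period labels are assumptions for illustration, not the project's actual tooling.

```python
import hashlib
import json

def walk_forward_splits(periods, min_train=3, horizon=1):
    """Yield (train, test) pairs where test always lies strictly after train."""
    for end in range(min_train, len(periods) - horizon + 1):
        yield periods[:end], periods[end:end + horizon]

def preregister(expectations):
    """Fingerprint a forecast before outcomes are seen, so later edits are detectable."""
    payload = json.dumps(expectations, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

periods = ["t1", "t2", "t3", "t4", "t5", "t6"]
for train, test in walk_forward_splits(periods):
    print(f"train={train}  test={test}")

# Record the digest first; score against outcomes only afterwards.
print(preregister({"metric": "top-1% precision", "expected_at_least": 0.5}))
```

The split never shuffles time, so the model cannot learn from a future it is later scored on, and the digest pins the expectation to what was claimed before the scoring.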

0.84 · ROC-AUC, 12-month horizon

58% · Top-1% precision (up from 45.5%)

V3 · Current predictive surface

On that basis, the current V3 predictive surface posts a ROC-AUC of 0.84 at a 12-month horizon. In plain English: hand the model a parcel that's going to do something and one that isn't, and 84% of the time it ranks the right one higher. That is not a coin flip and it is not clairvoyance; it is a real, measured edge on out-of-sample ground.

Reading the curve

ROC-AUC stands for Receiver Operating Characteristic — Area Under the Curve. The curve plots how many real movers the model catches (true positives) against how many false alarms it raises (false positives) as you loosen its threshold.

How to read it

A blind coin-flip is the diagonal line — 0.50. A perfect oracle hugs the top-left corner — 1.00. The area underneath is the score: 0.84 means that 84% of the time, a parcel that moves is ranked above one that doesn't.
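That ranking interpretation can be computed directly, without drawing the curve at all. The sketch below counts, over every (mover, non-mover) pair, how often the mover gets the higher score, with ties counting half. The scores and labels are made-up toy data, not model output.

```python
def roc_auc(scores, labels):
    """Probability that a random positive outranks a random negative; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data: label 1 means the parcel moved, 0 means it did not.
scores = [0.9, 0.8, 0.3, 0.2]
labels = [1, 0, 1, 0]
print(roc_auc(scores, labels))  # 3 of the 4 mover/non-mover pairs are ranked correctly: 0.75
```

The pairwise count is quadratic in the number of parcels, which is fine for a toy and for a few thousand parcels; larger evaluations usually use a rank-based formula that gives the same number.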

More telling than the headline is the top-1% precision: 58%, up from 45.5%. Of the parcels the model is most confident about — the very top of the distribution, the ones it would actually stake a recommendation on — well over half pay off. And that number moved: it climbed from the mid-forties to the high-fifties as the engine learned in sequence, as new factors were added and re-weighted and re-run. The forecast got better over time, on held-out data, which is the only kind of “better” that counts.
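Top-1% precision is a simpler calculation than its name suggests: sort by score, keep the top one percent, and count the share that were real movers. A minimal sketch on toy data follows; the function name and the integer `top_percent` parameter are illustrative choices.

```python
def precision_at_top(scores, labels, top_percent=1):
    """Share of true movers among the top `top_percent` percent of parcels by score."""
    k = max(1, len(scores) * top_percent // 100)  # at least one parcel
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k

# Toy data: 200 parcels, so the top 1% is the two highest-scored.
scores = list(range(200))
labels = [0] * 200
labels[199] = 1  # the single highest-scored parcel moved; the runner-up did not
print(precision_at_top(scores, labels))  # 1 hit out of 2 picks: 0.5
```

On a surface of roughly 5,000 parcels the same cut keeps about 50, so a 58% reading corresponds to roughly 29 of those 50 paying off.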

The improvement curve is itself the evidence for the central claim. If this were a coincidence dressed as a model, adding data and compounding steps would not systematically raise out-of-sample precision. It would wander. Instead it climbed. The platform is, in the most literal sense, learning to notice.

The unprovable, and why that's permission, not a problem

Now the philosophical objection, because it's the honest one: you'll never prove this. You will never demonstrate that real estate “obeys” gravity the way you'd prove a theorem.

Correct. And I want to argue that this is not the indictment it sounds like — that some of the most load-bearing ideas in all of mathematics are things we cannot prove and use anyway.

Take Goldbach's conjecture: every even number greater than two is the sum of two primes. It is so intuitive that it feels like it must be a definition. It has been verified by computer for every even number up to four quintillion (4 × 10¹⁸) without a single exception. Nearly every working mathematician believes it. And it has never been proven. We agree with it on the strength of overwhelming measurement, not proof.

Or the twin prime conjecture — that there are infinitely many primes two apart. Believed, partially attacked, fundamentally open.

And then the deepest cut of all: Gödel's incompleteness theorems, which establish that any consistent formal system rich enough to express arithmetic contains statements that are true but unprovable within it, truths the system can never reach from the inside. Unprovability isn't a failure mode at the edge of mathematics. It's a permanent, structural feature of any system worth having.

So when the skeptic says “you can't prove real estate has gravity,” my answer is: of course not. Nobody has proved Goldbach's conjecture either, and we rely on it anyway. The honest stance toward a deep, intuitive, well-measured regularity is not to demand a proof you know can't come. It's to adopt the experimental mindset: we're going to test this theory. We're going to make measurable, auditable observations, write down our expectations in advance, score ourselves on data we couldn't see, and let the precision curve tell us whether the noticing is real.

That's all this is. A platform for noticing, run with the integrity of an experiment that's allowed to fail.

The future pipeline

Everything above is built and measured. This section is explicitly not — it's the pipeline of candidate models I think are worth exploring next, none of them currently in the canon. Bonus tracks, offered in the same experimental spirit.

Hawkes processes (self-exciting point processes). The natural formalization of cascades: a model in which each event raises the probability of subsequent nearby events, with that excitement decaying over time. This is the rigorous version of “one deal triggers the next.”
Graph neural networks / learned message-passing. The adjacent-bleed term is, today, hand-shaped. A GNN on the parcel adjacency graph would learn how influence propagates from neighbor to neighbor — field propagation as trainable message-passing, generalizing the fixed kernel.
Spectral graph theory (the graph Laplacian). Compute the eigenmodes of the parcel network and you get its “resonant frequencies” — which clusters of parcels vibrate together, which submarkets are coupled at a structural level invisible to a map.
Diffusion / heat-equation models on the graph. A PDE describing how a shock spreads and dissipates through space and time — letting you watch a signal radiate outward and cool, rather than treating each timestep as static.
Percolation theory. The mathematics of when scattered development suddenly connects into a contiguous corridor — an explicit tipping threshold for “node” becoming “district.”
Optimal transport. A principled way to measure how demand flows between submarkets — the natural mathematical home for the ECHO engine's aggregation of collective demand across many users' searches.
Causal inference / synthetic control. The leap from “these moved together” to “this caused that,” by constructing counterfactual parcels — what would this block have done had the anchor never opened?
Renormalization-group / multi-scale coarse-graining. A physicist's method for cleanly aggregating parcel-level signal up to corridor and metro scale without double-counting — making the model say consistent things at every zoom level.
Survival / hazard models. Time-to-event: not just whether a parcel transacts, but when — the hazard rate of a deal as a function of its mass and field.
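To give the first candidate on that list some shape, here is the standard exponential-kernel Hawkes intensity in Python. It is a sketch of the textbook formula only; nothing like it is built into the platform yet, and the parameter values (`mu`, `alpha`, `beta`) and event times are invented.

```python
import math

def hawkes_intensity(t, events, mu=0.1, alpha=0.5, beta=1.0):
    """lambda(t) = mu + sum over past events t_i of alpha * exp(-beta * (t - t_i))."""
    return mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events if ti < t)

deal_times = [1.0, 1.5, 4.0]  # hypothetical deal dates in arbitrary time units
for t in (0.5, 1.6, 3.0, 4.1, 8.0):
    print(f"t={t:.1f}  intensity={hawkes_intensity(t, deal_times):.3f}")
```

Each past deal lifts the rate of the next one, and the lift fades at rate `beta`. With no recent deals the intensity settles back to the baseline `mu`.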

I'm listing these in the open because the point of the whole enterprise is that the noticing is shared and auditable. These are the next experiments, not secrets.

Two examples, made concrete

Example one — positive mass, radiating a positive field. A grocery-anchored center renews its anchor and signs two junior boxes. That center is heavy — high positive Mass. Watch what happens to the pad site three doors down, a parcel with low intrinsic mass of its own. Its Mass barely moves. But its Field lights up positive: the anchor's mass bleeds across the short distance (small d, large Aₓ·Mass(j)/d^α term), and as the second and third tenants commit, the cluster bonus C kicks in super-additively. The result is a parcel that looks unremarkable in isolation and lights up green in the field — a watch-or-buy signal that a parcel-by-parcel appraisal would structurally never produce, because the value isn't on the parcel. It's in the field the parcel is standing in.

Example two — negative mass, and a divergence worth money. A big box goes dark. That's a heavy body with negative Mass — it pulls the neighborhood down, and every parcel around it sees its Field sag, even though nothing changed on those parcels at all. Most readers stop there and avoid the block. But run the full tuple. Find a nearby parcel whose present Field is deeply negative — yet whose demographics are strong, whose visibility is excellent, and whose profile matches an OMEGA fingerprint that fits the dark box's footprint almost exactly. Now you have a divergence: a strongly negative Field masking a strongly positive latent Mass. The model flags precisely that disagreement between the two numbers — the coiled spring the gloom is hiding. That is the trade the field-blind appraiser and the field-only momentum-chaser both miss, and it falls out of the math for free.

When the two numbers disagree

Every parcel reads as a signed tuple — Mass (its own weight) and Field (the neighborhood's pull), each from −100 to +100. When they point the same way, the story is obvious. When they diverge, that's the trade.
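Reading the tuple can be reduced to a small decision rule. The sketch below labels a (Mass, Field) pair as aligned, divergent, field-led, mass-led, or quiet. The label names and the `strong=25.0` cutoff are hypothetical; they illustrate the logic of the two examples and are not the engine's thresholds.

```python
def read_tuple(mass, field, strong=25.0):
    """Label a (Mass, Field) pair; both values live on the -100..+100 scale."""
    for name, value in (("mass", mass), ("field", field)):
        if not -100.0 <= value <= 100.0:
            raise ValueError(f"{name} must lie in [-100, 100]")
    heavy = abs(mass) >= strong
    pulled = abs(field) >= strong
    if heavy and pulled:
        # Both readings are strong: same sign is the obvious story, opposite is the trade.
        return "aligned" if mass * field > 0 else "divergent"
    if pulled:
        return "field-led"  # weak own mass, strong neighborhood pull
    if heavy:
        return "mass-led"   # strong own mass, quiet neighborhood
    return "quiet"

print(read_tuple(5, 60))    # the pad site near the renewed anchor
print(read_tuple(70, -55))  # strong parcel sitting in a souring field
print(read_tuple(70, 60))   # both numbers agree
```

The first call returns "field-led", the second "divergent", and the third "aligned". The first two are the cases a single-number score would flatten away.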

The pebble in the moving river

Here's the picture to leave you with, and it's a correction of the textbook one.

Drop a pebble in a still pond and you get clean concentric ripples — the schoolbook image of a field radiating from a source. Tidy. Symmetric. Predictable. If markets were still ponds, you wouldn't need any of this; you'd just measure the splash and read off the rings.

But a market is not a still pond. It's a moving river. Drop the pebble and the ripples are immediately bent, stretched, and carried by a current that was already there. The same pebble, dropped at two different points in the river, produces two completely different downstream patterns — because the medium it lands in has its own motion, its own structure, its own field.

Fig. Same pebble, two media — clean concentric rings in a still pond; bent and carried downstream in a moving river.

And this is the part the gravity analogy gets exactly, beautifully right: every event in the river has its own mass and casts its own field, and every one of those fields rides the same moving water. A transaction isn't an isolated splash. It's a body with mass, dropped into a medium that is itself made of other bodies' fields, all of them in motion. Which is why you cannot reason about a parcel alone, and why the only honest method is to keep taking the temperature — every facet, every pass, in sequence and by different sequences, forever.

Is it proven? No. Is it cute? Sure. But I ran the equation, and I made the observation, and the precision curve went up on data the model wasn't allowed to see.

It is not a crystal ball. It is a platform for noticing — built on data and integrity, and nothing else.

Have you run the equation yet?
