Links Patent Analysis Last updated 06 · 04 · 2026 ⏱ 22 min read

Patent US9953049B1: How Google Measures Your Distance from the Sites It Trusts

Originally published 27 · 03 · 2026 · By Alejandro Meyerhans

The Seed Distance PageRank Patent

The 2024 Google API leak revealed an attribute called PagerankNS — a 16-bit integer stored on every single page in Google's index. Most people assumed the "NS" stood for "namespace." The strongest reading is that NS stands for NearestSeeds — a naming convention that maps directly to a 2006 patent by Google engineer Nissan Hajaj. If that reading is correct, it strongly suggests that Google has been running a seed-distance variation of PageRank since at least 2006 — measuring proximity to trusted sites rather than counting raw citations.

What this analysis draws from — and what it doesn't prove

This article is built on three sources: the patent text (the legal document filed with the USPTO), the 2024 Google API documentation leak, and observable ranking behavior. The patent proves Google designed a seed-distance ranking system. The API leak shows attribute names consistent with it. Neither proves this exact formula runs in production today — and the patent itself describes its approach as "one possible variation of PageRank." I'll be explicit throughout about where evidence ends and inference begins.

If you've read my analysis of the Reference Contexts patent, that system evaluates what is written around your links. If you've read the Historical Data patent, that one monitors when your links appeared. This patent answers a more fundamental question: how far is your page from the sites Google already trusts? And the answer carries exponential consequences.

The Honest Hedge

Every analysis has a threshold where certainty ends and inference begins. Here's where that line falls for this patent:

What We Know (From the Patent Text)

Google filed and was granted a patent for ranking pages based on shortest distances from hand-curated seed sites, using exponential decay scoring.
The system uses k nearest seeds (k = 3–6), assigns link lengths based on outdegree properties, and pages unreachable from any seed receive no score.
The patent names Google Directory and The New York Times as example seeds.
The parent application was filed in 2006; the continuation was granted in 2018. Year 4 maintenance fees were paid.
The patent describes itself as describing "one possible variation of PageRank" — not a confirmed replacement of the original.

What We Infer

The API attribute PagerankNS (NearestSeeds) stored as uint16 strongly suggests this patent's seed-distance model is reflected in production scoring.
The downstream signal authorityPromotion in CompressedQualitySignals suggests seed distance feeds the Q* composite score — meaning it may contribute to the final ranking computation.
The homepagePagerankNs attribute implies site-level seed distance scoring exists alongside page-level scoring.
Patent continuations and fee payments suggest ongoing legal value — though this proves portfolio maintenance, not necessarily production relevance.

What We Don't Know

The exact seed list — it's presumably evolved significantly since Google Directory was discontinued in 2011.
The current value of k (3? 6? dynamic?).
Whether seeds are global or topic-sensitive (query-specific micro-seed clusters).
Whether Graph Neural Networks or SpamBrain pre-processing have superseded the explicit shortest-path formula while preserving the seed-distance concept.
Whether modern "distance" includes entity co-occurrence (unlinked mentions) alongside hyperlink paths.
The exact weighting of seed distance within the Q* composite score.
A maintenance fee reminder has been mailed — if fees aren't paid, the patent expires. This doesn't mean Google would stop using the technology, but it would mean competitors could freely implement it.

Patent Metadata

📄 US9953049B1 — Producing a Ranking for Pages Using Distances in a Web-Link Graph

Patent Number: US 9,953,049 B1
Common Name: Seed Distance PageRank / "NearestSeeds PageRank"
Official Title: Producing a ranking for pages using distances in a web-link graph
Inventors: Nissan Hajaj
Assignee: Google LLC
Filed: September 21, 2006 (original application); October 19, 2015 (continuation)
Granted: April 24, 2018
Status: Active — At Risk (year 4 fees paid; maintenance fee reminder mailed)
Patent Family Chain: US9165040B1 (parent, granted 2015) → US9953049B1 (this patent, granted 2018)
Forward Citations: 3 patents cite this one directly; 11 patent families cite the family — including US11522758B1
Non-Patent Citations: 747 — one of the most-cited patents in Google's ranking portfolio
Classification: G06F16/951 — Indexing; Data structures therefor; Storage structures (Web crawling)
PDF: Download full patent (PDF)

What This Patent Does (Plain English)

Classic PageRank (the 1998 version) asked: "How many pages link to this page, and how important are they?" This patent describes a variation of that approach — one that asks: "How far is this page from the websites Google has hand-selected as trustworthy?"

Here's what the system does:

Select seed sites — Human reviewers at Google curate a set of trusted "seed" websites. The patent names Google Directory and The New York Times as examples. Seeds must be reliable, diverse across fields of public interest, and well-connected with many outgoing links.
Assign link lengths — Every link on the web is assigned a "length" based on the source page's properties. A page with 5 outgoing links creates short links. A page with 5,000 outgoing links creates very long links. More outgoing links = each individual link is worth less.
Compute shortest paths — For every page on the web, the system calculates the shortest path from the k nearest seed sites (typically k = 3–6). This requires shortest-path computation in a weighted directed graph — the same class of graph traversal used in GPS navigation.
Apply exponential decay — The ranking score is proportional to e^−D(p), where D(p) is the distance from the kth nearest seed. This is an exponential function — every additional hop away from a seed site doesn't just reduce score linearly; it collapses it exponentially.
Produce ranking — Pages are ranked by their distance scores. Pages unreachable from any seed get no score at all — they are unscored by this mechanism.

Styled diagram showing the five-step Seed Distance PageRank scoring flow: Select Seed Sites, Assign Link Lengths, Compute Shortest Paths, Apply Exponential Decay, Ranking Score — The five-step scoring pipeline. What makes this system different from classic PageRank is the shift from counting links to measuring distance — and the exponential penalty for each additional hop.

Link Length: Why Outdegree Destroys Value

Every link in the web graph is assigned a "length" — a numerical value representing how much distance that link adds between the source and destination. This is the mechanism that makes editorial links from focused pages mathematically superior to links from directories, link roundups, and resource pages with hundreds of outgoing links.

Here's the formula:

length(q → p) = α + log(|outgoing links from q|)

Where α = −log(d) ≈ 0.16 (d is the 0.85 damping factor). The patent's "log" is base-unspecified — the natural log (ln) reading is algebraically consistent with the e^−D(p) decay formula and is used for all worked examples in this analysis. The length of a link is a logarithmic function of the source page's outdegree — more outgoing links = longer (weaker) link.

Here's what this means in practice:

Source Page Type	Outgoing Links	Link Length	Relative Value
Editorial article mentioning you in context	5	1.77	Strongest — genuine endorsement
Curated resource page	20	3.16	Strong — selective curation
Blog post with many outlinks	100	4.77	Medium — diluted
Link roundup	500	6.38	Weak — mass listing
Directory page	5,000	8.68	Nearly worthless — commodity link

These values are worked examples using the patent's formula with a natural-log reading. The actual ranking score depends on weighted shortest paths through multiple links, the kth nearest seed calculation, and optional link weights — not on a single link's outdegree in isolation.

WHY THIS MATTERS

Under this patent's model, a link from a page with 5,000 outgoing links creates roughly 5× more mathematical distance than a link from a page with 5 outgoing links. That's not a small difference — it means the link from the directory page adds substantially more "distance" between you and the seed sites. And because the final score uses exponential decay, that extra distance has devastating downstream effects.

2026 reality: outdegree in a mega-menu world

This patent's formula counts all outgoing links on a page — including navigation, footers, and sidebars. Modern web pages routinely have 200+ links in their DOM from mega-menus alone. Google's Reasonable Surfer patent (US7716225B1) and subsequent DOM segmentation models isolate the main content block, likely calculating effective outdegree from editorial links in the body text — not raw HTML link count. The practitioner takeaway is the same (fewer editorial links on the page = stronger signal), but the raw numbers in the table above assume the patent's simplest reading, not Google's likely modern refinement.

Styled diagram showing how link length increases with outdegree — an editorial article with 5 outlinks creates a strong short link while a directory page with 5000 outlinks creates a nearly worthless long link — Link length as a function of outdegree. The patent explicitly models what link builders have known intuitively: a link from a focused editorial page is worth more than a link from a page that links to everyone.

Exponential Decay: The Distance Cliff

This is the mechanism that makes seed distance so consequential. The ranking score for any page is calculated as:

score(p) ∝ e^−D(p)

Where D(p) is the shortest distance from the kth nearest seed to page p. The score decays exponentially with distance — not linearly.

The exponential function creates a cliff, not a slope. The following is a worked example — not a universal lookup from the patent. It assumes each hop passes through a low-outdegree editorial page (roughly 6 outlinks, link length ≈ 2.0):

Hops from Seed	Score (e^−D)	% of 1-Hop Value
1 hop	0.135	100%
2 hops	0.018	13%
3 hops	0.0025	1.8%
4 hops	0.00033	0.25%
5+ hops	0.000045	≈ 0%

Look at that cliff. Two hops from a seed site and you're already at 13% of the maximum value. Three hops and you're under 2%. This is why a single link from a major publication can outperform dozens of links from sites further down the chain — and it's why link chains (whether PBN chains or otherwise) face mathematical headwinds that no amount of volume can overcome.

PRACTITIONER NOTE

This doesn't mean sites three or four hops from a seed are worthless. I wouldn't call sites that are three, four hops away from seed dead — not at all. The aggregate of many medium-distance links still contributes meaningfully to your overall profile. But this table explains why the last link sometimes unlocks everything: you already had 50 links that built the foundation, and the 51st link from a seed-adjacent site was the one piece your profile was missing. That one gap was a much bigger deficiency than missing the other 50 combined.

Styled diagram showing exponential decay in seed distance scoring — 1 hop retains 100% of value, 2 hops drops to 13%, 3 hops to 1.8%, and 5+ hops to approximately zero — The exponential decay cliff. Each hop compounds distance penalties — and because link length is also determined by outdegree, the real-world decay is even steeper than these best-case numbers suggest.

Seed Selection and Tuning: Google's Trust Anchor Network

The entire system rests on one critical foundation: the seed set. These are the "trust anchors" from which all distance is measured. The patent is unusually specific about what makes a good seed:

Reliable — consistently high-quality over time
Diverse — covering "a wide range of fields of public interests," not all in ONE vertical
Well-connected — seeds need large numbers of outgoing links to "facilitate identifying other useful and high-quality pages"
Manually selected — "selecting the seeds involves a human manually identifying these high-quality pages"

The patent gives two specific examples: Google Directory (the curated web directory Google discontinued in 2011) and The New York Times. These are exactly the kinds of sites you'd expect — broad editorial authority, high trust, deep link graphs reaching into every corner of the web.

SEED TUNING — THE FEEDBACK LOOP

The patent describes an iterative refinement process: the ranking results are fed back to "evaluate the quality and the contribution of the seeds, and then modify the list of seeds and/or the weights of the seeds based on this information." In other words, the seed set isn't static — Google continuously evaluates which seeds are producing useful distance signals and adjusts the set accordingly. Seeds that stop connecting to high-quality pages can be demoted or removed. This is an evolving trust network, not a fixed list.

2026 reality: topical seeds and the nofollow barrier

Two structural shifts since 2006 complicate the patent's literal model. First, most major Seed Sites (Forbes, NYT, universities, health portals) now default to rel="nofollow" or rel="sponsored" on outbound links — meaning the pure hyperlink-distance calculation the patent describes faces broken bridges at exactly the points where trust should flow most freely. Second, Google likely uses topic-sensitive seed clusters rather than a single universal list. Your "distance" to a seed isn't absolute — it's relative to the query topic. A DR 30 niche-specific journal that's a topical seed for your vertical may carry more weight than a global news seed linking to a tangentially related page. The patent describes a single global seed list; modern Google likely runs thousands of contextual micro-seed networks.

The k-Nearest Seeds

The system doesn't measure distance from just one seed — it uses the kth nearest seed, where k is typically 3–6. This means your page needs to be close to multiple seed sites, not just one. A single link from The New York Times is valuable, but a page reachable from three different seed sites through short paths is structurally stronger than a page reachable from only one.

This is why link diversity isn't just a best practice — it's architecturally built into the scoring function. Google's system literally measures proximity to k different trust anchors. A diverse link profile that connects you to multiple corners of the seed network will structurally tend to outscore a concentrated profile from a single direction.

The Unreachable Problem

Pages that cannot be reached from any seed through any path receive no score at all. They are unscored by this process. This has implications for brand-new websites, orphan pages, and sites in corners of the web that aren't connected to the broader link ecosystem. The patent states: "not all the pages in the set of pages receive ranking scores through this process. A page that cannot be reached by any of the seed pages will not be ranked."

Seed Distance SEO Implications: What This Means for Your Link Profile

1. The "51st Link" Effect Is Real

I've seen this pattern hundreds of times: a site builds 50 good links and ranks respectably but not competitively. Then they land one placement on Healthline, or Forbes, or a major industry publication — and everything unlocks.

Here's why: SEO is a competition. You're not ranked on an absolute scale — you're ranked relative to the sites around you. If your competitors have built volume but lack seed proximity, that single seed-adjacent link gives you a structural advantage they can't replicate with quantity. You now have something they don't. And because the scoring is exponential, that one link doesn't just add to your profile — it transforms the math. The 50 foundation links that were "doing nothing" suddenly click, because the distance calculation changed. The foundation was always there; the bridge to the seed network was the missing piece.

2. Digital PR Is the Seed Distance Accelerator

Digital PR is amazing — but in isolation, it's nowhere nearly as amazing as when you combine it with everything else. A single high-trust placement shortens your distance to the seed network dramatically. But if that's your only link building strategy, you're missing the aggregate volume that medium-distance links provide. The ideal strategy is both: PR placements for seed proximity plus sustained outreach for breadth.

3. Outdegree Is Your Silent Enemy

Every additional outgoing link on your source page makes your link longer in the graph. A link from a page with 5 outlinks is ~5× shorter than a link from a page with 5,000. When evaluating backlink opportunities, how many other links does this page have? is one of the most underrated questions in link building. Guest posts on exclusive placement pages > guest posts on pages citing 50 other sources.

4. Brand Consensus Works Through Seed Distance

When your brand is mentioned and linked from multiple seed-adjacent publications — industry journals, major media, academic institutions — you're not just building "brand awareness." You're shortening your graph distance to multiple seeds simultaneously. Brand consensus isn't a vague metric — it's the structural result of being close to many trust anchors at once.

5. LLMs Inherit Seed Trust

Imagine an article on Bank of America's website that mentions the best lenders in a given state and links to you. What source will an LLM never question as trustable? Bank of America. The seed distance concept doesn't just apply to traditional rankings — it extends to how AI systems select which sources to cite. Seed-adjacent sources get cited in AI-generated answers without friction, because the trust signal is baked into the graph structure that LLMs are trained on or retrieve from.

RELATED PATENTS

US8577893B1 (Reference Contexts) — Evaluates the editorial context around your links. Seed Distance measures how far you are; Reference Contexts measures how naturally your links are embedded. Both feed into link quality scoring.

US7346839B2 (Historical Data) — Monitors the temporal patterns of your link acquisition. Seed Distance is spatial; Historical Data is temporal. Together they form the positional and velocity dimensions of link quality.

2026 reality: SpamBrain and entity co-occurrence

Two more forces reshape how seed distance works in practice. First, SpamBrain — Google's neural spam detector — pre-processes the link graph before any distance calculation runs. If SpamBrain identifies unnatural link patterns (tiered chains, PBN clusters, sudden link velocity spikes), it can sever nodes from the graph entirely. Your mathematical distance may look excellent in third-party tools, but Google's graph may have already amputated the bridges that created it. Second, modern Google doesn't limit "distance" to hyperlinks. The API's repositoryWebrefEntityJoin system extracts entity co-occurrence from raw text — no <a href> required. If a Seed Entity (like The New York Times or WHO) mentions your brand in running text, Google's neural models can shorten your effective distance without a literal link existing. The patent describes a hyperlink-only graph; the 2026 reality includes entity-level proximity as a parallel distance signal.

Google API Leak Cross-Reference: PagerankNS

The 2024 Google API leak — first reported by Rand Fishkin and investigated by Mike King at iPullRank — revealed five attributes that align directly with this patent's mechanisms:

Patent Mechanism	API Attribute	API Evidence
NearestSeeds distance scoring	`PagerankNS` (uint16, 0–65535) in `GDocumentBase`	✅ STRONG MATCH — NS = NearestSeeds
Exponential decay from seed sites	`pagerankWeight` (float) in `AnchorsAnchor`	✅ STRONG MATCH — per-anchor weight consistent with distance-based decay
Link length = α + log(outdegree)	`outdegree` + `outsites` in `AnchorsAnchorSource`	✅ STRONG MATCH — stores outdegree data the formula requires
—	`scaledSelectionTierRank` (0–32767) — normalized tier position	🔶 API EXTENDS
—	`authorityPromotion` in `CompressedQualitySignals`	🔶 API EXTENDS

✅ STRONG MATCH = the API attribute name and data type align closely with the patent mechanism, but the API doesn't confirm the formula is implemented exactly as filed. 🔶 API EXTENDS = the API reveals a signal not described in the patent but consistent with its architecture. 📄 PATENT ONLY = the patent describes a concept with no matching API attribute discovered in the leak.

THE NAMING IS THE STRONGEST CLUE

PagerankNS — where NS most likely stands for NearestSeeds. This is the strongest available clue — not proof — that Google's production PageRank incorporates a seed-distance model. The API gives us attribute names and data types, not scoring formulas. The patent gives us the formulas. Together they form a strong but still incomplete evidentiary chain. The API also reveals authorityPromotion — a quality boost in the Q* composite score that appears derived from seed distance — suggesting seed distance doesn't just affect PageRank but likely feeds the master quality signal.

INFERENCE VS. CONFIRMATION

The API leak provides attribute names and data types — not the actual scoring formulas. The patent provides the formulas. Together they form a strong evidentiary chain. Neither alone is proof; together, they're as close to proof as we get in SEO. The API also reveals homepagePagerankNs — a site-level seed distance score — which the patent doesn't explicitly describe but is a logical extension of the per-page distance calculation.

Citation Network

Patent Family Chain

US9165040B1 (parent, filed 2006, granted 2015) → US9953049B1 (this patent, continuation, granted 2018)

Forward Citations (Patents Citing This Patent)

Patent	Relevance
US20160335257A1	Extends graph-based ranking methodology
US10692298B2	Builds on distance-based page quality assessment
US20230153924A1	Recent (2023) patent application citing seed distance concepts

Seed Distance: What Doesn't Matter as Much as SEOs Think

The nature of this patent is simple: trust radiates from known sources, and it decays with distance. That's not an algorithm quirk — that's how human trust actually works. You trust your doctor's recommendation more than a stranger's. You trust the stranger your doctor recommended more than someone five referrals removed. The web graph works the same way.

The flavor — the specific math, the kth nearest seed, the logarithmic link length formula — that was the 2006 approach, refined through 2018. Google has likely evolved the implementation. The seed set has certainly changed since Google Directory shut down. The k value may be dynamic. The decay function may have been modified.

But the architecture — measuring distance from trust anchors rather than counting raw citations — that isn't going anywhere. It's too fundamental. It's the difference between asking "how many people vouch for you?" and "how close are you to people we already trust?" The second question is harder to game, more resistant to manipulation, and more aligned with how authority actually propagates through networks.

The SEO industry still obsesses over Domain Rating as if it's the signal. It isn't. DR measures the aggregate power of all incoming links — it's a useful proxy, but it doesn't measure seed distance. Two DR 60 sites can have wildly different distances from the seed network. DR is like measuring someone's height. Seed distance is like measuring their relationship to power. Both are real — but don't confuse one for the other.

Here's what I tell clients: if you're worried about which links to build next, stop wondering about DR and start asking — how close is this site to the sites Google already trusts? That's the question this patent's architecture rewards — and the API naming suggests it's closer to Google's actual scoring than raw link counts are.

Frequently Asked Questions

What does patent US9953049B1 actually do?

It describes a distance-measurement variation of PageRank. Instead of asking "how many pages link to you?", the system measures how far your page is from a hand-curated set of trusted seed sites — and applies exponential decay to the score. Closer pages get exponentially higher scores.

What is PagerankNS in the Google API leak?

PagerankNS is a uint16 integer (0–65535) stored on every document in Google's index. NS most likely stands for NearestSeeds — strongly suggesting that Google's production PageRank incorporates a seed-distance model rather than relying solely on the 1998 citation-counting approach. It appears to feed into authorityPromotion, which is part of the Q* composite quality score.

Does seed distance make PBNs (private blog networks) useless?

Not directly. Seed distance penalizes chained PBNs — where private sites link to each other in chains, adding hops and compounding distance. A PBN using expired or auctioned domains that are already close to the seed network may retain some of the previous owner's graph position — though this patent doesn't address domain ownership transfers, and SpamBrain's ability to detect unnatural link patterns in acquired domains adds significant uncertainty to this strategy.

Why does one Forbes link sometimes outperform 50 niche blog links?

Profile completion. If you already have 50 good mid-distance links, your aggregate foundation is solid but you're missing seed proximity. One Forbes link closes that gap — you were at 99% and it fills the final 1%. That last 1% was a bigger deficiency than the other 50 combined. Forbes alone without the foundation wouldn't rank you either — it's the combination that unlocks competitive positions.

How does seed distance affect AI Overviews and LLMs?

LLMs inherit graph-based trust signals. When an LLM selects which sources to cite in generated answers, it leans toward sources that come from seed-adjacent sites — because those are the sources with the strongest structural trust markers. Being one hop from Bank of America or Healthline doesn't just help your traditional rankings; it makes you a more likely citation target in AI-generated responses.

Is Domain Rating (DR) the same as seed distance?

No. DR measures the aggregate strength of all incoming links to a domain — it's a third-party metric from Ahrefs that approximates PageRank. Seed distance measures proximity to Google's hand-curated trust anchors. Two DR 60 sites can have very different seed distances. Think of DR as measuring how many people vouch for you, and seed distance as measuring how close those people are to actual power. Both are real signals, but they measure different things.

What happens to pages that aren't connected to any seed site?

They receive no score at all. The patent explicitly states that "a page that cannot be reached by any of the seed pages will not be ranked." This affects brand-new websites, orphan pages without inbound links, and sites in isolated corners of the web. Getting your first link from a seed-connected domain isn't just helpful — it's the entry ticket to being scored at all.