Patent US8661029B1: NavBoost — Google's Click-Based Ranking System

NavBoost is Google's system for measuring user satisfaction through click behavior and feeding it back into search rankings. The patent itself does not use the names NavBoost or CRAPS. Those labels come from later DOJ testimony and the 2024 API leak, which revealed the internal codename CRAPS (Click Rate Adjusted by Position and Style). This analysis treats the patent as the mechanism most closely aligned with those systems. Filed in 2006, it describes the core mechanism: the LC ratio (Long Click ratio, the proportion of quality clicks to total clicks), click weighting functions, three-level aggregation (base, language, country), IRBoost (Information Retrieval Boost, how click signals modify a page's ranking score), and anti-spam safeguards. Google has filed five continuations of this patent, the latest granted in 2023: seventeen years of legal protection for the same foundational idea.

The Honest Hedge

Every analysis has a threshold where certainty ends and inference begins. Here's where that line falls for this patent:

What We Know (From the Patent Text)

The Long Click (LC) ratio mechanism is explicitly described in the patent. Click weighting (continuous and discontinuous), three-level aggregation (base/language/country), the Information Retrieval Boost (IRBoost), anti-spam safeguards, and viewing length differentiators are all documented. This patent has been continued five times through 2023, and the core mechanism remains intact across all six patents in the chain. The continuation chain shows Google kept legally protecting this mechanism through 2023 — separate corroboration that NavBoost-like systems remain production-relevant comes from the DOJ testimony and leaked API fields documented in the companion deep dive.

What We Infer

The specific weight values (−0.1, 0.5, 1.0, 0.9) given in the patent are examples, not confirmed production values. The actual weights today are likely tuned through machine learning rather than hand-set. The smoothing factors, multiplier M, threshold X, and combination weights X₁, X₂, X₃ are not publicly known. The anti-spam behavioural model has likely evolved well beyond what this patent describes.

What We Don't Know

The exact current production weights. Whether neural approaches have supplemented the explicit formulas. How NavBoost interacts with newer NLP-based ranking systems. Whether the three-level aggregation hierarchy now includes additional dimensions beyond country and language. The relative weight of NavBoost versus other ranking systems in 2026.



Patent Metadata

📄 US8661029B1 — Modifying Search Result Ranking Based on Implicit User Feedback

Patent Number: US 8,661,029 B1
Common Name: NavBoost (internal codename: CRAPS, Click Rate Adjusted by Position and Style)
Official Title: Modifying search result ranking based on implicit user feedback
Assignee: Google LLC (originally filed as Google Inc.)
Inventors: Hyung-Jin Kim, Simon Tong, Noam M. Shazeer, Michelangelo Diligenti
Filed: November 2, 2006 (Application US 11/556,143)
Granted: February 25, 2014
Status: Active; expires January 31, 2031 (M1553 yr 12 paid)
Patent Family Chain: US8661029B1 → US9235627B1 → US9811566B1 → US10229166B1 → US11188544B1 → US11816114B1
Forward Citations: Google Patents lists dozens of direct citations to US8661029B1; the broader family has a substantially larger citation footprint (exact counts vary depending on whether family-level and continuation citations are included)
Classification: G06F 16/24578 (Query processing with adaptation to user needs using ranking)

Look at that filing date: November 2, 2006. Google was eight years old. YouTube had been acquired just weeks earlier. And they were already patenting a system to use your clicks as a ranking signal. The patent family chain shows five continuations spanning through November 2023 — that's seventeen years of legal protection for the same specification text. Active patent family and confirmed production deployment are not the same thing, but DOJ testimony and the API leak — covered in the companion deep dive — provide that separate production evidence.

The citation footprint is significant. Dozens of patents cite this document directly, and the broader family has a substantially larger reach. To put that in context, the Entity Scoring patent (US10235423B2) has 38 forward citations. NavBoost's reach is considerably wider. It's one of the most connected patents in Google's ranking infrastructure.

This article covers the patent mechanism — what the patent text describes and what it means. For how NavBoost operates in production today — informed by the 2024 API leak, sworn DOJ testimony, and practitioner case studies — see the companion deep dive: How NavBoost Really Works.


What This Patent Does (Plain English)

Here's the core problem. Google ranks search results using hundreds of signals — links, content relevance, freshness, authority. But all of those signals are predictions about what users want. NavBoost is different. It measures what users actually do when they see those results.

The system works in five stages:

  1. Track clicks — When a user clicks a search result, record the query (Q), the document (D), the time spent (T), the user's language (L), and their country (C)
  2. Measure dwell time — Time from when the user clicks through to when they return to the search results page
  3. Weight the click — Apply a weighting function that translates dwell time into a quality score: longer views get higher weights, shorter views get lower (or even negative) weights
  4. Calculate the LC ratio — For each query-document pair, compute the Long Click (LC) ratio: the proportion of weighted long clicks to total clicks
  5. Modify rankings — Feed the LC ratio into an IRBoost (Information Retrieval Boost) function that amplifies or dampens the original ranking scores

The result: documents that users consistently find valuable rise in rankings. Documents that trigger pogo-sticking — click, quick return, click something else — sink.
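To make stages 1 and 2 concrete, here is a minimal sketch of what a logged click and its dwell time could look like. The `ClickRecord` class, its field names, and the use of plain seconds are assumptions for illustration; the patent only says the query, document, time, language, and country are recorded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClickRecord:
    """One logged result selection. Field names are illustrative, not from the patent."""
    query: str                    # Q
    document: str                 # D
    language: str                 # L
    country: str                  # C
    clicked_at: float             # seconds: when the user clicked through
    returned_at: Optional[float]  # seconds: when the user came back to the SERP (None = never)


def dwell_time(record: ClickRecord) -> Optional[float]:
    """Time T between click-through and return; None when the user never returned."""
    if record.returned_at is None:
        return None
    return max(0.0, record.returned_at - record.clicked_at)


click = ClickRecord("best crm for small business", "example.com/crm", "en", "US", 100.0, 147.0)
print(dwell_time(click))  # 47.0
```

A record with no return timestamp yields `None` rather than a duration, which leaves room for the separate "last click" treatment discussed later.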

Here's what this looks like in the actual patent. FIG. 2 shows the engine interaction — how user click data flows through the ranking system:

[Figure: FIG. 2 and FIG. 3 from US Patent 10,229,166 B1. Engine interaction flow: the Scoring Engine (2020) and Indexing Engine (2010) feed into the Ranking Engine (2030), which produces Ranking Result 1 and Result 2 (2040), tracked by a Tracking Component (2060) that generates Result Selection Logs, which feed back into the Rank Modifier Engine (2070). FIG. 3 shows the distributed architecture between Client System and Server System.]
FIG. 2 from US Patent 10,229,166 B1 — the engine interaction flow. The Rank Modifier Engine (2070) receives Result Selection Logs and feeds adjustments back into the ranking pipeline.

Let me translate that to human.

[Figure: NavBoost click signal flow. SERP Presented → User Clicks Result → Dwell Time Measured → Weighting Function, branching into Short Click (-0.1), Medium Click (0.5), Long Click (1.0) → Weighted Click Sum (WC) → LC Ratio = WC / Total Clicks → IRBoost Applied to Rankings.]
The same pipeline, translated. Your clicks become weighted signals, aggregated into an LC ratio, and fed back to modify rankings.

The LC Ratio: Core Mechanism

What the NavBoost LC Ratio Measures

The proportion of "good" (long) clicks to total clicks for a specific query-document pair. The patent calls this the LC click fraction.

LC = #WC(Q,D) / [#C(Q,D) + S₀]
Long Click Fraction — US8661029B1, LCC_BASE
  • #WC(Q,D) = Weighted click count for query Q and document D (sum of all weighted views)
  • #C(Q,D) = Total click count for that query-document pair
  • S₀ = Smoothing factor — a guard against noise from rare queries
Patent Language — Smoothing

"The smoothing factor S0 can be chosen such that, if the number of samples for the query is low, then the click fraction will tend toward zero. If #C is much larger than S0, then the smoothing factor will not be a dominant factor."

This is elegant engineering. For a query with millions of clicks, S₀ is negligible and the LC ratio reflects pure user data. For a rare query with 3 clicks, S₀ pulls the ratio toward zero — don't trust thin data, default to existing signals. The smoothing factor is the system's way of saying: "I need enough evidence before I override what my other signals are telling me."
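A short sketch makes the smoothing effect visible. The formula is the one quoted above; the value S₀ = 100 is purely an assumption for illustration, since the patent does not publish production values.

```python
def lc_ratio(weighted_clicks: float, total_clicks: int, smoothing: float) -> float:
    """LC = #WC(Q,D) / (#C(Q,D) + S0), as given in the patent text quoted above."""
    if total_clicks < 0 or smoothing < 0:
        raise ValueError("click counts and smoothing must be non-negative")
    denominator = total_clicks + smoothing
    return weighted_clicks / denominator if denominator > 0 else 0.0


S0 = 100.0  # hypothetical smoothing factor, not a known production value

# Rare query: 3 clicks, all with weight 1.0. Smoothing dominates.
rare = lc_ratio(weighted_clicks=3.0, total_clicks=3, smoothing=S0)

# Popular query: 1,000,000 clicks, weighted sum 700,000. Smoothing is negligible.
popular = lc_ratio(weighted_clicks=700_000.0, total_clicks=1_000_000, smoothing=S0)

print(f"rare query LC:    {rare:.3f}")
print(f"popular query LC: {popular:.3f}")
```

Without smoothing, the rare query would score a perfect 1.0 on three clicks. With it, the ratio stays near zero until the evidence accumulates, while the popular query lands almost exactly on its raw 0.70.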

NavBoost Click Scoring in Practice

Take a query like "best CRM for small business." Google serves ten results. Over thousands of searches, it tracks every click and every return. If 70% of users who click on result #4 stay for 3+ minutes and never come back to the SERP, that's an extremely high LC ratio. If 80% of users who click result #1 bounce back within 10 seconds, that's a terrible LC ratio — despite being in position one.

Over time, result #4 rises and result #1 falls. NavBoost has spoken. The users voted, and the users won.

Key Insight

The LC ratio is computed per query-document pair, not per ranking position. The patent is explicit: "the measure of relevance can be independent of relevance for other document results returned in response to the search query." A page in position 8 with a high LC ratio can rise, even if it gets far fewer raw clicks than position 1. It's not how many clicks; it's how good the clicks are.


Click Weighting: Continuous vs. Discontinuous

The raw dwell time — "user stayed for 47 seconds" — isn't used directly. The patent describes two approaches to converting dwell time into a weight:

Continuous Click Weighting Functions

A smooth function where the weight increases continuously with dwell time. Think of a sigmoid curve — short visits produce near-zero weight, the weight rises smoothly through the middle range, and very long visits asymptotically approach a maximum weight.

The patent includes these exact curves. FIG. 4A shows the full click weighting workflow, while FIG. 4B and 4C show two different weighting approaches:

[Figure: FIG. 4A, 4B, and 4C from US Patent 10,229,166 B1. FIG. 4A shows the workflow: track selections, weight views based on viewing length, combine weighted views, determine relevance, output to ranking engine. FIG. 4B shows a continuous sigmoid-like Weight vs. Time curve. FIG. 4C shows a step-function Weight vs. Time curve with discrete thresholds.]
FIG. 4A–4C from US Patent 10,229,166 B1 — the click weighting workflow and two curve types. FIG. 4B is the continuous sigmoid; FIG. 4C is the step function with discrete short/medium/long thresholds.
What This Means for Humans

Look at the two curves. FIG. 4B (the smooth sigmoid) means dwell time can be scored on a continuous scale: 47 seconds isn't a binary "long" or "short," it's a precise point on a curve. FIG. 4C (the step function) shows the simplified version: hard thresholds that bucket your visit into short, medium, or long. In practice, Google likely uses some combination. The takeaway: under this patent, how long a user stays on your page is measured and weighted, not just whether they bounced.
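A continuous weighting curve of the FIG. 4B kind can be sketched with a logistic function. This is an illustration of the curve's shape only: the logistic form, the 60-second midpoint, and the steepness value are all assumptions, not values from the patent.

```python
import math


def continuous_weight(
    dwell_seconds: float,
    midpoint: float = 60.0,    # hypothetical: dwell time that earns half the maximum weight
    steepness: float = 0.08,   # hypothetical: how sharply the curve rises around the midpoint
    max_weight: float = 1.0,
) -> float:
    """Logistic curve: near zero for short visits, approaching max_weight for long ones."""
    return max_weight / (1.0 + math.exp(-steepness * (dwell_seconds - midpoint)))


for seconds in (5, 47, 60, 300):
    print(f"{seconds:>3}s -> weight {continuous_weight(seconds):.3f}")
```

The point of the sketch is that a 47-second visit gets its own weight, somewhere below the midpoint's 0.5, instead of being forced into a bucket.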

Discontinuous Click Weighting: Short, Medium, and Long Clicks

The patent describes explicit categories with fixed weights:

Category | Weight | Signal
Short click | −0.1 | User quickly returned to SERP: negative relevance signal
Medium click | 0.5 | User spent some time: potentially good page
Long click | 1.0 | User spent significant time: strong relevance signal
Last click | 0.9 | User never returned to SERP: likely satisfied

Notice the short click weight is negative. A short click doesn't just fail to help — it actively hurts the document's score. Every pogo-stick subtracts from the weighted click sum. This means a page that generates a mix of long views and short bounces is worse off than a page with fewer total clicks but consistently long views.
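The step-function version can be sketched as follows. The four weights are the patent's example values from the table above; the 10-second and 60-second cut-offs are hypothetical, since the patent's examples do not fix them, and treating "never returned" as `None` is a modeling choice made here.

```python
from typing import Iterable, Optional

SHORT_BELOW_S = 10.0   # hypothetical cut-off between short and medium clicks
LONG_FROM_S = 60.0     # hypothetical cut-off between medium and long clicks


def discrete_weight(dwell_seconds: Optional[float]) -> float:
    """Map a dwell time to the patent's example weights; None means the user never returned."""
    if dwell_seconds is None:
        return 0.9    # last click
    if dwell_seconds < SHORT_BELOW_S:
        return -0.1   # short click: subtracts from the weighted sum
    if dwell_seconds < LONG_FROM_S:
        return 0.5    # medium click
    return 1.0        # long click


def weighted_click_sum(dwells: Iterable[Optional[float]]) -> float:
    """#WC: the sum of weights over all clicks for one query-document pair."""
    return sum(discrete_weight(d) for d in dwells)


def mean_weight(dwells: list) -> float:
    """Unsmoothed weighted clicks per click, for comparing pages of different volume."""
    return weighted_click_sum(dwells) / len(dwells)


mixed_page = [120.0] * 5 + [3.0] * 5   # ten clicks: five long, five short
steady_page = [120.0] * 4              # four clicks, all long

print(f"mixed:  {mean_weight(mixed_page):.2f} per click")
print(f"steady: {mean_weight(steady_page):.2f} per click")
```

The mixed page collects more clicks, but each short click pulls its per-click average down, so the steadier page ends up with the better ratio.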

Critical Insight

The "last click" weight (0.9) being slightly lower than the "long click" weight (1.0) is counterintuitive. The patent explains: if a user makes the last click after clicking other results first, that last click is "considered as less indicative of a good page and given only a moderate weight." The system accounts for the journey, not just the destination.


Three-Level Click Aggregation

Here's where the patent reveals its real sophistication. Google doesn't just calculate one LC ratio per query-document pair. It calculates three, at progressively finer granularity:

LCC_BASE = #WC(Q,D) / [#C(Q,D) + S₀]
Level 1: Global — all clicks for this query-document pair
LCC_LANG = #WC(Q,D,L) / [#C(Q,D,L) + S₁]
Level 2: Language — clicks filtered by user language
LCC_COUNTRY = #WC(Q,D,L,C) / [#C(Q,D,L,C) + S₂]
Level 3: Country — clicks filtered by country and language

Then combines them:

LCC_FINAL = X₁·LCC_COUNTRY + X₂·LCC_LANG + X₃·LCC_BASE
Weighted combination of all three aggregation levels

Why three levels? Because search intent varies by location. A query like "football" produces different user behavior in London versus Dallas. British users clicking on the Premier League stay — American users bounce. Without language-and-country-level aggregation, the global signal would be noise. The three-level hierarchy lets Google weight local user behavior more heavily while still benefiting from global click data.
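Here is a minimal sketch of the three-level combination using the formulas above. Every number in it is invented for illustration: the click counts, the smoothing factors (S₀, S₁, S₂), and the mixing weights (X₁, X₂, X₃) are assumptions, because the patent does not disclose production values.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LevelCounts:
    weighted_clicks: float  # #WC at this aggregation level
    total_clicks: int       # #C at this aggregation level


def click_fraction(counts: LevelCounts, smoothing: float) -> float:
    """Smoothed click fraction #WC / (#C + S); assumes smoothing > 0."""
    return counts.weighted_clicks / (counts.total_clicks + smoothing)


def lcc_final(
    base: LevelCounts,
    lang: LevelCounts,
    country: LevelCounts,
    smoothing: Tuple[float, float, float] = (100.0, 50.0, 20.0),  # hypothetical S0, S1, S2
    mix: Tuple[float, float, float] = (0.5, 0.3, 0.2),            # hypothetical X1, X2, X3
) -> float:
    """LCC_FINAL = X1*LCC_COUNTRY + X2*LCC_LANG + X3*LCC_BASE."""
    s0, s1, s2 = smoothing
    x1, x2, x3 = mix
    lcc_base = click_fraction(base, s0)
    lcc_lang = click_fraction(lang, s1)
    lcc_country = click_fraction(country, s2)
    return x1 * lcc_country + x2 * lcc_lang + x3 * lcc_base


# Invented counts for a Premier League page on the query "football", seen from the UK.
global_counts = LevelCounts(weighted_clicks=4000.0, total_clicks=10_000)
english_counts = LevelCounts(weighted_clicks=3500.0, total_clicks=8_000)
uk_counts = LevelCounts(weighted_clicks=1800.0, total_clicks=2_000)

print(f"LCC_FINAL (UK): {lcc_final(global_counts, english_counts, uk_counts):.3f}")
```

With these made-up numbers the country-level fraction is much higher than the global one, and because X₁ is the largest mixing weight in this sketch, the final value is pulled well above the global figure.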

[Figure: Three-level click aggregation hierarchy, drawn as concentric layers: outermost LCC_BASE (Query + Document pair, smoothing S₀), middle LCC_LANG (filtered by language, smoothing S₁), innermost LCC_COUNTRY (filtered by country, smoothing S₂), all feeding into the LCC_FINAL formula.]
The aggregation hierarchy. Each level adds geographic and linguistic precision. The smoothing factors (S₀, S₁, S₂) allow each level to degrade gracefully when data is sparse.
SEO Implication

This is why the same page can rank differently for the same query in different countries. It's not just about hreflang tags or server location — Google is measuring whether users in each country actually find your page useful and adjusting accordingly.


IRBoost: How Click Signals Modify Rankings

The LC ratio doesn't replace the traditional Information Retrieval (IR) score — the base ranking score computed from links, content relevance, and other signals. It modifies it. The patent describes multiple possible boosting transforms. One example is a linear form:

IRBoost = 1 + min(K, M × max(0, LCC − X))
One of several IR Score Boosting Transforms — US8661029B1
  • M = Multiplier controlling the magnification of the boost
  • X = Threshold below which the LCC doesn't produce a boost
  • K = Cap limiting the maximum boost
  • LCC = The Long Click Computed fraction (the final weighted LC ratio after three-level aggregation)

The patent also describes a sigmoid-style transform and an exponential form. The important mechanism is not one specific fixed formula; it is that the click-derived relevance measure can be converted into a boost applied to IR scores. In the linear form above, if LCC is at or below threshold X, the boost factor is 1 (no change). If LCC exceeds X, the boost grows by M for each unit of LCC above X until it reaches its ceiling of 1 + K. The transform is designed so that low LC ratios produce minimal or no effect: the system trusts its other signals when click data doesn't show a clear positive signal.
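The linear transform is simple enough to run by hand. The sketch below implements the formula as written; the values chosen for M, X, and K are assumptions for illustration only, since the real values are not public.

```python
def ir_boost(lcc: float, multiplier: float, threshold: float, cap: float) -> float:
    """Linear form: IRBoost = 1 + min(K, M * max(0, LCC - X))."""
    return 1.0 + min(cap, multiplier * max(0.0, lcc - threshold))


# Hypothetical parameters, not known production values.
M, X, K = 2.0, 0.3, 0.5

for lcc in (0.2, 0.5, 0.9):
    boost = ir_boost(lcc, multiplier=M, threshold=X, cap=K)
    print(f"LCC {lcc:.1f} -> IR score multiplied by {boost:.2f}")
```

With these parameters, an LCC of 0.2 sits below the threshold and leaves the IR score untouched, 0.5 earns a partial boost, and 0.9 hits the cap, so the multiplier never exceeds 1 + K.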

Patent Language — Boosting

"Such transforms can cause lower LCC values (e.g., those below a threshold) to have basically no boosting factor, while allowing the boosting to magnify for higher LCC values."

Translation

LCC values are the final Long Click Computed fractions: the LC ratio after it's been calculated at all three aggregation levels (base, language, country) and combined into a single number. A low LCC means users aren't consistently staying on your page. A high LCC means they are. The IRBoost formula says: if users aren't staying (low LCC), don't change the ranking. If users are staying (high LCC), boost the page's score proportionally. It's a one-way ratchet: NavBoost rewards, but it doesn't punish through this formula. The negative signal comes from short clicks dragging down the LC ratio itself.

This is conservative design. NavBoost doesn't override the entire ranking — it nudges. When click data says "users clearly prefer this page," the nudge becomes significant. When click data is ambiguous, NavBoost stays quiet. Same meal, but someone's adjusting the seasoning based on what the diners actually ate.


Viewing Length Differentiators

What counts as a "long click" isn't universal. The patent describes two viewing length differentiators that adjust the thresholds:

Click Thresholds by Search Query Category

The patent distinguishes between:

  • Navigational queries ("BMW") — users want a specific page. Short dwell might indicate success, not failure
  • Informational-quick queries ("George Washington's birthday") — the answer is found in seconds. A 15-second visit is a success
  • Informational-slow queries ("Hilbert transform tutorial") — users need time to absorb. A 15-second visit is a failure
Patent Language — Query Categories

"A person may only need a small amount of time on a page to gather the information they seek when the query is 'George Washington's Birthday', but that same user may need a good deal more time to assess a result when the query is 'Hilbert transform tutorial'."

The thresholds for short/medium/long are adjusted per query type. According to the patent, query categories can be identified through regression analysis of historical click data and traditional clustering techniques (for example, K-means on average dwell times).
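One way to picture per-category thresholds is a lookup table keyed by query category. The three categories come from the patent's examples above; the threshold values in seconds and the `bucket` helper are hypothetical.

```python
from enum import Enum
from typing import Dict, Tuple


class QueryCategory(Enum):
    NAVIGATIONAL = "navigational"
    INFORMATIONAL_QUICK = "informational-quick"
    INFORMATIONAL_SLOW = "informational-slow"


# (short_below, long_from) in seconds. All values are invented for illustration.
THRESHOLDS: Dict[QueryCategory, Tuple[float, float]] = {
    QueryCategory.NAVIGATIONAL: (3.0, 10.0),
    QueryCategory.INFORMATIONAL_QUICK: (5.0, 15.0),
    QueryCategory.INFORMATIONAL_SLOW: (30.0, 120.0),
}


def bucket(dwell_seconds: float, category: QueryCategory) -> str:
    """Classify a dwell time as short, medium, or long for the given query category."""
    short_below, long_from = THRESHOLDS[category]
    if dwell_seconds < short_below:
        return "short"
    if dwell_seconds >= long_from:
        return "long"
    return "medium"


print(bucket(15.0, QueryCategory.INFORMATIONAL_QUICK))  # long
print(bucket(15.0, QueryCategory.INFORMATIONAL_SLOW))   # short
```

The same 15-second visit counts as a success for a quick factual query and as a failure for a tutorial query, which is exactly the distinction the patent's George Washington and Hilbert transform examples draw.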

Click Weight Adjustments by User Type

The patent also adjusts for individual user behavior:

Patent Language — User Types

"Computer savvy users often click faster than less experienced users, and thus users can be assigned different weighting functions based on their click behavior. These different weighting functions can even be fully user specific (a user group with one member)."

A user who always clicks position one regardless of relevance gets their clicks downweighted. A user who reads the snippets carefully and clicks selectively gets their clicks upweighted. The system literally measures how discriminating each user is as a relevance assessor.


Anti-Spam Safeguards

The patent dedicates significant attention to click fraud prevention. Two main objectives:

  1. Democracy in votes — One vote per cookie/IP for a given query-URL pair
  2. Behavioral modeling — Entirely remove data from cookies or IPs that don't look natural

The detection signals include:

  • Abnormal distribution of click positions
  • Unusual click durations
  • Suspicious clicks-per-minute/hour/day rates
  • Unnatural distribution of user agents and cookie ages
  • Traffic from regions with historically high spam activity

If a user doesn't conform to the behavioral model, their click data is entirely disregarded. If a query appears to be spammed, the click signals for that query are not used at all.
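The two objectives can be sketched as a small filter. The `Click` structure and the idea of passing in a precomputed set of blocked cookies are assumptions: the patent describes the behavioral model's signals but not how its output is represented, and keeping the first click of a duplicate set is a choice made here for simplicity.

```python
from typing import FrozenSet, Iterable, List, NamedTuple


class Click(NamedTuple):
    cookie_id: str       # stand-in for the cookie/IP identity
    query: str
    url: str
    dwell_seconds: float


def filter_clicks(
    clicks: Iterable[Click],
    blocked_cookies: FrozenSet[str] = frozenset(),
) -> List[Click]:
    """Drop all clicks from blocked cookies, then keep one vote per (cookie, query, url)."""
    seen = set()
    kept: List[Click] = []
    for click in clicks:
        if click.cookie_id in blocked_cookies:
            continue  # behavioral model flagged this cookie: disregard its data entirely
        vote_key = (click.cookie_id, click.query, click.url)
        if vote_key in seen:
            continue  # this cookie already voted on this query-URL pair
        seen.add(vote_key)
        kept.append(click)
    return kept


raw = [
    Click("a", "best crm", "example.com/crm", 180.0),
    Click("a", "best crm", "example.com/crm", 200.0),  # repeat vote, dropped
    Click("b", "best crm", "example.com/crm", 4.0),
    Click("bot", "best crm", "example.com/crm", 600.0),  # blocked cookie, dropped
]
clean = filter_clicks(raw, blocked_cookies=frozenset({"bot"}))
print(len(clean))  # 2
```

Only the clicks that survive this kind of filtering would feed the weighted click counts, which is why inflating clicks from a single source has little effect on the LC ratio.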

[Figure: NavBoost Click Data Filtering Pipeline. Raw Clicks pass through a Behavioral Model Check, Cookie/IP Deduplication, and a Spam Region Filter to produce Clean Click Data for the LC Ratio Calculation.]
The three-stage filtering pipeline that ensures only genuine user signals enter NavBoost's LC ratio calculation. Each filter stage rejects manipulated or duplicate clicks before they can influence rankings.
The Takeaway

Click manipulation is not a viable strategy. The patent describes a behavioral model that profiles normal user behavior and filters everything that deviates. This has been on record since the 2006 filing and has been carried through five continuations (six patents in total) over seventeen years. The safeguards are likely far more sophisticated today than what the patent describes.


SEO Implications

If NavBoost is "more positive on clicks by itself than the rest of ranking," then every SEO recommendation must be filtered through one question: does this make users stay?

NavBoost Signal | What Drives It | What You Should Do
Long clicks (+) | Users stay on the page because it answers their query thoroughly | Create genuinely comprehensive content. Not word count, but query completeness. Answer the question and the questions behind the question.
Short clicks (−) | Users bounce back immediately: mismatched expectations, slow load, poor experience | Write titles and descriptions that accurately represent page content. Ensure fast load times. Don't bait-and-switch.
Last clicks (+) | Users find their answer and stop searching | Be the terminus of the search journey. Provide definitive answers that make further searching unnecessary.
Position independence | LC ratio is calculated per Q-D pair, not per position | Even if you're ranking in position 7, a high LC ratio will pull you up over time. Focus on quality for the users who do click.
Country-level signals | LCC_COUNTRY can receive the strongest weighting when sufficient local data exists | Localize content for your target markets. What satisfies users in the US may not satisfy users in Germany.

Core Web Vitals are not described in this patent, but page experience clearly affects the behaviour this patent measures. If a page renders so slowly that users return to the SERP before consuming the content, that registers as a short click — a negative signal — not because the content was bad, but because the user never saw it. In that sense, speed is not just a separate technical factor; it is a satisfaction prerequisite. This is why page speed connects to NavBoost indirectly — it determines whether users can form honest click opinions. For the full analysis of how Google patents page speed as a ranking gateway, see the Page Speed patent analysis.


Continuation Chain & Beyond the Patent

This patent is the first in a six-patent continuation chain. The specification text — the mechanism described above — is identical across all six patents. The continuations add new claims (legal scope), not new technical mechanisms. The chain:

  1. US8661029B1 — This patent (filed 2006, granted 2014)
  2. US9235627B1 — First continuation (granted 2016)
  3. US9811566B1 — Second continuation (granted 2017)
  4. US10229166B1 — Third continuation (granted 2019)
  5. US11188544B1 — Fourth continuation (granted 2021)
  6. US11816114B1 — Latest continuation (granted 2023)

Six patents over seventeen years. The continuation chain shows Google kept legally protecting this mechanism — and while legal protection alone doesn't prove current production deployment, the DOJ testimony and API leak (covered in the companion deep dive) provide that separate evidence.

Beyond the Patent Text

This article covers the patented mechanism — what the patent describes and what it means. NavBoost in production today extends well beyond what's documented here. The 2024 API leak revealed production attribute names (goodClicks, badClicks, lastLongestClicks, crapsData) that map to this patent's mechanisms. Sworn DOJ testimony confirmed a 13-month click data window and that BERT does not replace NavBoost. For the full cross-source analysis — API leak mappings, DOJ corroboration, practitioner case studies, and production-level theory — see How NavBoost Really Works.


Citation Network

Forward Citations

Dozens of patents cite this document directly, and the broader patent family has a considerably larger citation footprint.



Frequently Asked Questions

What is NavBoost and what does patent US8661029B1 describe?

NavBoost is Google's system for modifying search rankings based on user click behaviour. The patent itself titles the mechanism "Modifying search result ranking based on implicit user feedback" — the names NavBoost and CRAPS come from later DOJ testimony and the 2024 API leak. This patent describes how Google tracks clicks, measures dwell time, weights those dwell times into a relevance score (the LC ratio), and feeds that score back into the ranking engine via IRBoost.

How does the LC ratio work?

The LC ratio measures the proportion of weighted clicks to total clicks for a given query-document pair: LC = Weighted Clicks / (Total Clicks + Smoothing Factor). Long clicks add the most to the weighted count and short clicks subtract from it, so higher ratios indicate users consistently find the document relevant. The smoothing factor prevents noise from rare queries.

What are long clicks and short clicks?

Short clicks (negative weight) indicate the user quickly bounced back. Medium clicks (moderate weight) suggest potential relevance. Long clicks (high weight) indicate sustained engagement. Last clicks (user doesn't return) signal query satisfaction. The thresholds are adjusted by query category and user type.

How many continuations does this patent have?

Five continuations spanning 2016–2023 (US9235627B1, US9811566B1, US10229166B1, US11188544B1, US11816114B1). The specification text is identical across all six patents — only the claims differ. The latest was granted November 2023.

What can I do to improve my NavBoost signals?

Write accurate titles/descriptions (reduce pogo-sticking). Create genuinely comprehensive content (increase dwell time). Ensure fast page load so users can form honest click opinions: a page that renders too slowly produces short clicks before users see the content. Don't attempt click manipulation; the patent's anti-spam safeguards are designed to detect and filter it.

How does NavBoost relate to Google's other ranking systems?

NavBoost is a rank modifier that adjusts IR scores from other systems. It works alongside Entity Scoring, Passage Ranking, and quality systems. The IRBoost formula applies the LC ratio on top of existing scores.