Recently, a significant leak of internal Google algorithms has come to light, which has major implications for the world of search engine optimization (SEO). This leak, revealing more than 14,000 potential ranking factors, provides an unprecedented glimpse into Google's tightly guarded algorithm. For anyone involved in SEO, this information presents new opportunities and insights to optimize online visibility.
Key points from Google's algorithm
Google has a nearest-seed variant of PageRank (now deprecated). It is called pageRank_NS and is linked to document understanding.
Google has mentioned seven different types of PageRank, one of which is the famous ToolBarPageRank.
Google has a specific method for identifying the following business models: news, YMYL, personal blogs (small blogs), e-commerce, and video sites. It is unclear why Google specifically filters personal blogs.
The main components of Google's algorithm appear to be navBoost, NSR, and chardScores.
Google uses site-wide authority and some site-wide signals, including traffic from Chrome browsers.
Google uses page embeddings, site embeddings, site focus, and site radius in its scoring function.
Google records poor clicks, good clicks, clicks, last longest clicks, and site-wide impressions.
Google utilizes Page Quality
Google has something called 'pageQuality' (PQ). One of the most interesting aspects of this measurement is that Google uses an LLM to estimate the "effort" for article pages. This value seems useful for Google to determine whether a page can be easily copied.
Conclusion:
Resources, images, videos, unique information, and depth of information stand out as ways to score high on "effort" calculations. Coincidentally, these elements have also been proven to satisfy users.
Topic borders and topic authority exist
'Topic authority' is a concept that the algorithm leak supports. The files confirm what many in the SEO community had already suspected: siteFocusScore, siteRadius, siteEmbeddings, and pageEmbeddings are used for ranking.
siteFocusScore:
Indicates how strongly a site focuses on a specific topic.
siteRadius:
Measures how far page embeddings deviate from the site embedding. In plain language, Google creates a topical identity for your website, and each page is measured against that identity.
siteEmbeddings:
Compressed site/page embeddings.
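To make the idea of a site embedding and a radius more concrete, here is a minimal sketch. It is not Google's calculation, which the leak does not describe. It assumes the site embedding is simply the average of the page embeddings and that the "radius" is the average cosine distance of pages from that average. The function names, the toy three-dimensional vectors, and the URLs are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def site_centroid(page_vectors):
    """Assumed stand-in for a site embedding: the mean of all page vectors."""
    count = len(page_vectors)
    return [sum(column) / count for column in zip(*page_vectors)]


def topical_spread(page_vectors):
    """Return each page's cosine distance to the centroid and the mean distance.

    The mean distance is a hypothetical proxy for a site radius:
    the smaller it is, the more tightly the pages cluster around one topic.
    """
    centroid = site_centroid(page_vectors)
    distances = [1.0 - cosine_similarity(v, centroid) for v in page_vectors]
    return distances, sum(distances) / len(distances)


# Toy embeddings (made up): two SEO pages and one off-topic recipe page.
pages = {
    "/seo/link-building": [0.9, 0.1, 0.0],
    "/seo/keyword-research": [0.8, 0.2, 0.1],
    "/recipes/apple-pie": [0.0, 0.1, 0.9],
}

distances, radius = topical_spread(list(pages.values()))
for url, distance in zip(pages, distances):
    print(f"{url}: distance to site centroid = {distance:.3f}")
print(f"mean distance (radius proxy) = {radius:.3f}")
```

In this toy example the recipe page sits furthest from the centroid, and leaving it out shrinks the mean distance. Real embeddings have hundreds of dimensions, but the comparison works the same way.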
Why is this interesting?
If you understand how embeddings work, you can optimize your pages to deliver content in a way that enhances Google's understanding.
'Topic focus' is mentioned directly in the leak. We do not know why it is referenced, but we do know that each website is assigned a number for it.
Deviation from the subject is measured, which means that the concept of topical boundaries and contextual bridging has some potential support outside of patents.
It seems that topical identity and topical measurements in general are a focus for Google.
Google essentially creates an identity for your website. Each new page is then measured against this identity. If the topic of a page is far removed from the subject of your website, it negatively impacts the metrics mentioned above.
Also, try blocking search engines from accessing pages that have no relation to the subject of your website. In theory, this will positively influence your siteFocusScore and siteRadius, and thus lead to better rankings.
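One common way to block crawlers from off-topic sections is a robots.txt rule. The sketch below uses Python's standard urllib.robotparser to check that a set of rules blocks what you intend before you deploy it. The robots.txt content, the paths, and the example.com URLs are hypothetical. Keep in mind that robots.txt controls crawling only. A page that should disappear from the index usually needs a noindex directive instead.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers out of two off-topic sections.
ROBOTS_TXT = """\
User-agent: *
Disallow: /recipes/
Disallow: /off-topic/
"""


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt rules let user_agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


off_topic_allowed = is_crawlable(ROBOTS_TXT, "https://example.com/recipes/apple-pie")
on_topic_allowed = is_crawlable(ROBOTS_TXT, "https://example.com/seo/link-building")

print("off-topic page crawlable:", off_topic_allowed)
print("on-topic page crawlable:", on_topic_allowed)
```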
Measuring image quality
'ImageQualityClickSignals' indicates that image quality is measured based on click behavior (usefulness, presentation, attractiveness, engagement). These signals are considered Search CPS Personal data.
Host NSR
Host NSR is the site rank calculated for host-level (website) sitechunks. This value encodes nsr, site_pr, and new_nsr. It is important to note that nsr_data_proto appears to be the latest version. However, this is not mentioned in the current PDF. During a measurement, a website is divided into random parts. These pieces are referred to as 'sitechunks.' Google treats these segments of your domain separately, measures them, and assigns a value to them.
Old pages or blogs that have not been read for years are also included in this measurement! Blocking search engines can provide a solution. Has a page had enough time to rank? Are there no internal or external links pointing to it, and are the user statistics poor? Then deny Google access to these URLs! This way, you prevent a negative impact on your website.
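The checklist above (enough time to rank, no links, poor user statistics) can be turned into a simple filter over your own analytics export. This is a minimal sketch. The PageStats fields and every threshold are assumptions chosen for illustration, not values taken from the leak.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PageStats:
    url: str
    published: date
    inbound_links: int  # internal plus external links pointing to the page
    clicks: int         # clicks over the reporting period
    impressions: int    # impressions over the reporting period


def is_prune_candidate(page, today, min_age_days=365, max_clicks=10, max_impressions=100):
    """Flag a page only if it is old enough, unlinked, and performing poorly.

    All thresholds are hypothetical defaults; tune them to your own site.
    """
    old_enough = (today - page.published).days >= min_age_days
    unlinked = page.inbound_links == 0
    poor_stats = page.clicks <= max_clicks and page.impressions <= max_impressions
    return old_enough and unlinked and poor_stats


today = date(2024, 6, 1)
pages = [
    PageStats("/blog/old-news-2015", date(2015, 3, 1), 0, 2, 40),
    PageStats("/blog/evergreen-guide", date(2016, 5, 1), 12, 900, 15000),
    PageStats("/blog/published-last-month", date(2024, 5, 1), 0, 0, 5),
]

candidates = [p.url for p in pages if is_prune_candidate(p, today)]
print(candidates)
```

The newest page is not flagged even though its numbers are poor, because it has not yet had enough time to rank.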
In conclusion, here are a number of tips to keep in mind. Some have already been mentioned in the text, some have not.
Invest in a well-designed site with an intuitive architecture to optimize for NavBoost.
Remove or block pages that are not topically relevant.
Optimize your headings around search queries and ensure that the paragraphs under the headings clearly and concisely answer these queries.
Write more content that can earn more impressions and clicks.
Update the content of outdated pages regularly.
Consistent posting is important for the quality score of your website.
Value growth in impressions, because it is a good sign.
Remove poorly performing pages if the user statistics are poor and there are no links pointing to the page.

