James Dooley: Advanced on-page strategies, NLP and semantic SEO. Today I'm joined by Charles Floate, who is going to take a deep dive into semantic and on-page strategies.
Charles Floate, how important is on-page SEO?
Charles Floate: It is the foundation of being able to get crawled and indexed.
A lot of people treat on-page SEO as an afterthought because they think links and user engagement are such overwhelming authority signals that they will rank regardless.
However, Google still needs to understand what your pages and websites are about. It needs to understand the context, the intent and the queries you are trying to rank that page for.
The main way you do that initially, at least on first crawl and index, is through your on-page SEO.
That includes your meta title, body content, tags, schema, site structure, internal linking and sitemap setup.
All of these things are massively important for defining your site score, your site focus score, your topical authority, your topical bubble and how all of those documents connect.
Internal linking does not always mean physical internal links either. Google has a good way of scoring all the documents on your website and understanding how focused they are.
James Dooley: With regards to the site radius and keeping things on point, how important is the content brief for a single page?
How important is the heading hierarchy before you even pass it to the content writer, so they cover the right entity attributes and questions around the topic?
Charles Floate: If you create a page that says nothing similar to the other pages ranking in the SERP, there is a very low chance you will override the consensus and rank at the top.
Your page will likely be suppressed, rank very low, or not index at all.
You need to at least match what is already there and use a structure similar to the other ranking pages.
Most people think Google’s algorithm is much smarter than it actually is, even in English.
You do not want crazy long H2s covering loads of different things. You want headings to be clear, structured and formatted so they break the page into specific sections.
You also need to match the consensus of the information Google is looking for within those headings.
Do not start with filler content. Answer the heading straight away with factual information.
Imagine every heading on your website could trigger a featured snippet in the SERP.
That featured snippet needs to answer the heading immediately. You do not want it filled with fluff.
Google’s algorithm is still poor at fully reading and understanding content.
Kyle Roof is a big proponent of this. He says you are better off matching the entities Google expects to find on your page than trying to create a unique story.
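The heading advice above can be sketched as a quick audit. This is a minimal illustration using only Python's standard library, not a tool mentioned in the conversation: the `audit_headings` name, the 10-word threshold for an overly long heading, and the naive split on ". " to find the first sentence are all assumptions made for the example.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingOutline(HTMLParser):
    """Collect each heading together with the text that follows it."""

    def __init__(self):
        super().__init__()
        self.sections = []        # each entry: [tag, heading_text, body_text]
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self.sections.append([tag, "", ""])
            self._in_heading = True

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self._in_heading = False

    def handle_data(self, data):
        if not self.sections:
            return  # text before the first heading is ignored
        index = 1 if self._in_heading else 2
        self.sections[-1][index] += data


def audit_headings(html, max_heading_words=10):
    """Hypothetical helper: flag long headings and show what answers them.

    max_heading_words is an arbitrary threshold chosen for illustration.
    """
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    report = []
    for tag, heading, body in parser.sections:
        heading = " ".join(heading.split())
        body = " ".join(body.split())
        report.append({
            "tag": tag,
            "heading": heading,
            "too_long": len(heading.split()) > max_heading_words,
            # Naive sentence split; good enough for a manual review.
            "first_sentence": body.split(". ")[0] if body else "",
        })
    return report
```

Reading the `first_sentence` column next to each heading is a quick way to spot sections that open with filler rather than a direct answer; judging whether the sentence is actually factual still needs a human.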
James Dooley: For anyone listening, what does NLP mean within semantic SEO and why is it important?
Charles Floate: NLP means natural language processing.
It looks at how humans interact, speak and write. It also looks at whether content appears to be machine-generated or human-generated.
Previously, NLP was more about how Google’s algorithm interpreted words and the positioning of those words next to each other.
Now it is much more about experience, personalisation, the author behind the content and related signals.
Google is looking for personalised information, fact checking, verifiable statistics and information it can trust.
The content needs to match consensus, but it also needs unique information gain and believable support.
James Dooley: So if you are putting information on the page, it needs to answer the topic but also explain why you are saying it.
Is that where people talk about information gain?
It is not just getting AI to write something generic with no data, survey, third-party source or reason behind it.
Charles Floate: Information gain is slightly separate.
Information gain is the difference between the SERP consensus and the new information you add.
For example, if all the top 10 results cover what something is, how it works, who invented it and why it matters, and you cover all of that too, you have matched the consensus.
But if you also add a section about the companies involved, or the influencers currently shaping that space, that would be information gain.
When you are optimising the actual content, it is about making it believable.
You are not just making generic statements. You are explaining why the statement is true and giving Google reasons to trust it.
Google is not always fact-checking every statement directly.
A lot of the time it is comparing consensus against background information and looking for reasons to believe the statement.
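The distinction between matching consensus and adding information gain can be sketched with plain set arithmetic. This is only an illustration under stated assumptions: topics are hand-labelled strings, a topic counts as "consensus" when at least 60% of the ranking pages cover it, and "gain" means a topic no ranking page covers. The function name and the threshold are hypothetical, and nothing here claims to reflect how Google measures either concept.

```python
def consensus_and_gain(serp_topics, page_topics, min_share=0.6):
    """Return (consensus topics the page is missing, topics that are new).

    serp_topics: list of topic lists, one per ranking page (hand-labelled).
    page_topics: topics covered by your own page.
    min_share:   assumed share of ranking pages needed to call a topic consensus.
    """
    counts = {}
    for topics in serp_topics:
        for topic in set(topics):          # count each page at most once
            counts[topic] = counts.get(topic, 0) + 1

    consensus = {
        topic for topic, count in counts.items()
        if count / len(serp_topics) >= min_share
    }
    page = set(page_topics)

    missing = consensus - page             # consensus you have not matched yet
    gain = page - set(counts)              # topics no ranking page covers
    return missing, gain


# Example based on the scenario described above.
top_results = [
    ["what it is", "how it works", "who invented it", "why it matters"],
    ["what it is", "how it works", "who invented it", "why it matters"],
    ["what it is", "how it works", "why it matters"],
]
my_page = ["what it is", "how it works", "who invented it",
           "why it matters", "companies involved"]
missing, gain = consensus_and_gain(top_results, my_page)
```

In this example `missing` is empty, because every consensus topic is covered, and `gain` contains only "companies involved".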
James Dooley: With on-page SEO strategies, some people say Googlebot only crawls a certain amount of a page on the first visit.
Some people mention 23 kilobytes, 30 kilobytes or crawl time.
What is your take on how much Google renders and sees when it first visits a page?
Charles Floate: Google announced a few months ago that it processes around two megabytes for HTML files, but that is stripped down.
Most SEOs look at a website and think Google processes everything exactly as they see it. It does not.
You need to look at the page with JavaScript disabled, scripts disabled and CSS disabled.
That is closer to what Googlebot processes and sees in the rendered output.
Google will still look at certain CSS elements and how they affect content positioning. If your content is tiny and unreadable, it may be discounted.
But for NLP and semantic understanding, Google is mostly processing a default text-based experience: the text on the page, and maybe image assets and embeds.
If your content fits within the two megabyte HTML limit after everything else is stripped away, Google can crawl and index it.
Most pages should be comfortably under that limit unless you are dealing with a huge 38,000-word guide.
There is not a tiny file-size chunk that Google is limited to. Google wants as much useful information as possible.
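A rough way to approximate that stripped-down view is to drop script and style content and measure the HTML size. The sketch below uses only Python's standard library. It treats "around two megabytes" as exactly 2 × 1024 × 1024 bytes and measures the raw HTML rather than the stripped text, which is the more conservative reading; both choices are assumptions, as are the helper names. It does not reproduce what Googlebot actually does.

```python
from html.parser import HTMLParser

# Assumption: "around two megabytes" taken as 2 MiB for illustration.
HTML_LIMIT_BYTES = 2 * 1024 * 1024
SKIPPED_TAGS = {"script", "style"}


class VisibleText(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def summarise_page(html, limit_bytes=HTML_LIMIT_BYTES):
    """Hypothetical helper: report raw HTML size and the text-only view."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    size = len(html.encode("utf-8"))
    return {
        "html_bytes": size,
        "within_limit": size <= limit_bytes,
        "text": " ".join(parser.chunks),
    }
```

Running `summarise_page` on a saved copy of a page gives a text-only view to read through, plus a simple check on whether the HTML is anywhere near the assumed limit.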
James Dooley: Is it important to have the most important n-grams, topics and entities higher up the page?
You mentioned excerpts and summaries. Where should they sit, and how important is that for semantic SEO?
Charles Floate: Anything above the fold that the user and Google can see is generally treated as higher-probability text for understanding the document.
Google will take the whole document into account, but the initial understanding and processing are influenced heavily by what appears higher up the page.
The lower something appears, the less likely a user is to see it, and the less important it may become in the document’s meaning.
This is why some cloaked websites use very long pieces of content for Google while users see something completely different.
You need to make sure your key query appears within the above-the-fold content.
If it does not, Google may not see the page as primarily about that topic.
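A simple check for this is to see whether the key query appears in the opening stretch of the page text. The sketch below is an approximation: "above the fold" depends on the device and layout, so the first 150 words are used as a stand-in, and both that cutoff and the `query_in_opening` name are assumptions made for the example.

```python
import re

WORD_PATTERN = re.compile(r"[a-z0-9']+")


def query_in_opening(text, query, first_n_words=150):
    """Return True if the query appears as a phrase in the first N words.

    first_n_words is an assumed proxy for above-the-fold content.
    Matching is case-insensitive and ignores punctuation, so
    "on-page" and "on page" are treated the same.
    """
    words = WORD_PATTERN.findall(text.lower())
    opening = " ".join(words[:first_n_words])
    target = " ".join(WORD_PATTERN.findall(query.lower()))
    # Pad with spaces so only whole-word phrase matches count.
    return bool(target) and f" {target} " in f" {opening} "
```

The text passed in should be the page's plain text in reading order, for example the output of a text-only extraction, so that navigation and scripts do not push the real opening out of the window.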
James Dooley: Some people in Kyle Roof’s community call that the above-the-fold centrepiece annotation.
It is the core focus part of the page, and it is important to get the main terms in there.
Anyone watching this, I hope you liked this episode on advanced on-page strategies, NLP and semantic SEO.
Charles Floate, it has been an absolute pleasure.