Summary of SEO 101 Episode 392
Google causes concern for many as it turns off the indexation request tool to make infrastructure updates. John and Ross share details of some recently publicized Google bugs and speculate that the updates are partly a response to them. Ross discusses Cloudflare’s Automatic Platform Optimization (APO) for WordPress and the results on his own site, followed by an interesting review of Gary Illyes’s comments on how Caffeine’s indexing infrastructure works. Local SEO news follows with some significant updates, then John Mueller’s announcement of mobile-only indexation coming in 2021.
NOTE: If you would like to receive our show notes straight to your inbox, sign up for our SEO 101 Podcast Show Notes Newsletter, available at SEO101Radio.com.
NON SEO NEWS
DEXTER IS BACK !!!!!
Listener Question: Josh Rowe
“Google confirmed the ‘Request Indexing’ feature in Google Search Console will be down for some time due to infrastructure changes. Does anyone have any more info on what’s being changed?”
- Google said the issue was two-fold: one with canonicalization and the other with mobile-indexing.
- The issues are taking some time to resolve
- Google said the canonical issue started as early as September 20th, not September 22nd or 23rd like we thought
- Google said it only impacted about 0.02% of its index
- As of October 2nd, Google had restored about 10%, but based on what we are seeing, it has restored a lot more since then.
- The mobile-indexing issue happened way earlier, but Google said the issue “spiked” around the middle of this past week
- It impacted a much larger piece of the index, around 0.2%
- As of October 2nd, Google had restored about 25% of it, and it continues to restore more.
Glenn’s write-up on the canonical issue – ‘Click here‘
What Caffeine Does:
- crawler information goes to protocol buffer >
- buffer then normalizes HTML (it’s mostly broken after all) using an HTML Lexer >
- Then the HTML page structure is normalized (H1, H2, H3, etc.), which includes reviewing the styling to try to determine the “relative importance” (Gary Illyes) of each tag (this is also done for other formats like PDFs, spreadsheets, Word docs, Lotus files, etc.) >
- Meta Tag examination (first the meta robots) and then others.
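To make the last two steps a little more concrete, here is a rough sketch in Python using the standard library's html.parser. It is only an illustration, not Google's code: the class name PageSketch and the function summarize are made up for this example, and it uses only the heading tag level as a stand-in for "relative importance" (the styling review Gary mentions is not modelled).

```python
from html.parser import HTMLParser


class PageSketch(HTMLParser):
    """Hypothetical illustration only; not how Caffeine is implemented."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.robots = []    # directives found in <meta name="robots">
        self.outline = []   # (level, text) pairs for each heading
        self._level = None  # heading level currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.robots += [d.strip().lower() for d in content.split(",") if d.strip()]
        elif tag in self.HEADINGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._level is not None:
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def summarize(html):
    """Return the robots directives and heading outline for one page."""
    parser = PageSketch()
    parser.feed(html)
    parser.close()
    return {
        "robots": parser.robots,
        "outline": parser.outline,
        "indexable": "noindex" not in parser.robots,
    }
```

Calling summarize() on a page's HTML returns the robots directives first and the heading outline second, which mirrors the order described above (meta robots, then everything else).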
He also discusses how they parse PDFs. Because PDF is a binary format that is difficult to process, they have to license the parser from Adobe to convert it from PDF to HTML. The same thing happens for other binary formats they can index.
NOTE: If they find iframes, spam, and other out-of-place things in the header (the HTML head section), the HTML Lexer closes the header right before those tags and starts the body from there.
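Here is a small Python sketch of that "close the header early" behaviour, again built on the standard library's html.parser. It is an assumption-laden illustration rather than Google's lexer: the HeadSplitter class and split_head function are hypothetical names, and the list of tags allowed in the head is simply the metadata elements from the HTML standard.

```python
from html.parser import HTMLParser

# Elements the HTML standard permits inside <head>.
HEAD_ALLOWED = {"title", "meta", "link", "style", "script", "base", "noscript", "template"}


class HeadSplitter(HTMLParser):
    """Hypothetical sketch: end the head at the first tag that does not belong there."""

    def __init__(self):
        super().__init__()
        self.state = "before_head"
        self.head_tags = []
        self.body_tags = []

    def handle_starttag(self, tag, attrs):
        if tag == "html":
            return
        if tag == "head" and self.state == "before_head":
            self.state = "in_head"
            return
        if self.state == "in_head" and tag in HEAD_ALLOWED:
            self.head_tags.append(tag)
            return
        # Anything else ends the head (or means there never was one).
        self.state = "in_body"
        if tag != "body":
            self.body_tags.append(tag)

    def handle_endtag(self, tag):
        if tag == "head" and self.state == "in_head":
            self.state = "in_body"


def split_head(html):
    """Return (tags treated as head, tags treated as body)."""
    parser = HeadSplitter()
    parser.feed(html)
    parser.close()
    return parser.head_tags, parser.body_tags
```

In this sketch, a meta tag that appears after a stray iframe ends up in the body list even though the page author wrote it inside the head, which is why out-of-place tags near the top of a page are worth cleaning up.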
“Collapser” is used to verify 404 pages to see if they are valid or soft 404s.
“For example, where you are writing an article about error pages, in general, and you can’t for your life get it indexed… and that’s sometimes because our error page handling systems misdetect your article based on the keywords that you used as a soft error page. And, basically, it prompts Caffeine to stop processing those pages.” Gary Illyes (17:30 in this episode)
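To see how an article about error pages could trip a keyword-based check, here is a deliberately naive Python sketch. Everything in it is an assumption made for illustration: the phrase list, the threshold, and the function name looks_like_soft_404 are invented, and Google has not said which signals its error page handling actually uses.

```python
import re

# Hypothetical phrase list; the real signals are not public.
ERROR_PHRASES = ("page not found", "404", "error page", "no longer available")


def looks_like_soft_404(status_code, text, threshold=3):
    """Naive check: a 200 page whose text is full of error-page wording."""
    if status_code != 200:
        # A real 404 (or any other non-200 status) is not a *soft* 404.
        return False
    lowered = text.lower()
    hits = sum(len(re.findall(re.escape(phrase), lowered)) for phrase in ERROR_PHRASES)
    return hits >= threshold
```

A legitimate article such as "How to design a helpful 404 error page" would repeat those phrases several times and be flagged by this check, which is the kind of misdetection Gary describes.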
LOL: they joke about the “magic PR meta tag” that will get your site ranked after Martin insists the secret meta must be the Keyword tag.
LOCAL SEO NEWS
End of Show Notes
If you have any questions you would like to share with Ross and John, please feel free to post them on the SEO 101 Facebook Group. And, if you enjoy SEO 101 on WebmasterRadio.FM please consider supporting the hosts with feedback on Apple Podcasts, Stitcher, or your favourite podcast stream.