A practice website using WordPress on Lightsail

Problem: Page indexing issues detected in willaford.com

What is the problem?

I received the following mail for each domain routing to this website (.com, .net, .me, .org).

Screenshot of email from google.com indicating page indexing issues.
  • Duplicate without user-selected canonical
  • Page with redirect
  • Duplicate, Google chose different canonical than user
  • Alternate page with proper canonical tag
  • Not found (404)

What have I done about it?

I have started this post. Before starting, I created a template post from the problem-solving template. I practiced using “copy all blocks” in WordPress when creating this post.

What do I need to do about it?

  • I need to click the link to Fix Page indexing issues and figure out the rest of this list

As I write this post, I have yet to investigate. I suspect that the recent update to certificates had side effects. Specifically, the main URL changed from willaford.com to www.willaford.com. This site is referenced on four domains. I used all defaults when setting up Google Analytics and search indexing, without specifying a canonical URL. I’ve clicked the link and see a help reference, so I am reading the reference, starting with the non-experts usage guide. It is unhelpful for the specific issues, but nice to know that Google sent me four error mails even though “if your site has fewer than 500 pages, you probably don’t need to use this report.” The dashboard is more helpful, telling me that 12 pages are indexed and 50 pages are not indexed.

Screenshot of the Google Indexing dashboard showing 50 not indexed, 12 indexed
  • Determine which pages are indexed and which are not indexed

I wonder what is behind the ? icon; the tooltip is unhelpful. Survey the dashboard, find the link that says “view data about indexed pages” and click it. Examples include the main site, about, contact, and posts, all listed behind wordpress.willaford.com and www.willaford.com. None are listed behind willaford.<domain>.
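To make the duplication concrete, here is a minimal Python sketch that lists the hostnames under which one page might be reachable. The four TLDs and the wordpress.willaford.com subdomain come from this post; pairing every TLD with both an apex and a www form is my assumption for illustration, and I have not confirmed which of these actually resolve.

```python
from itertools import product

TLDS = ["com", "net", "me", "org"]         # the four domains named in this post
PREFIXES = ["", "www."]                    # apex and www forms (assumed for every TLD)
EXTRA_HOSTS = ["wordpress.willaford.com"]  # subdomain seen in the indexed-pages list


def candidate_urls(path):
    """Return every URL variant of `path` across the assumed hostnames."""
    hosts = [f"{prefix}willaford.{tld}" for tld, prefix in product(TLDS, PREFIXES)]
    hosts += EXTRA_HOSTS
    return [f"https://{host}{path}" for host in hosts]


for url in candidate_urls("/about/"):
    print(url)
```

Without a canonical tag that names one of these, a crawler is left to guess which variant is the real page, which would fit the “Duplicate without user-selected canonical” message.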

  • Find out how to confirm the premise of what is unindexed.

Premise: .me, .com, .net, .org need some configuration to be indexed. How do I do that? Starting with this table, what happens if I follow “Not Started”?

Screenshot of Why Pages aren't indexed

Clicked the link, found another graph with a help link.

Aha! “if you think that Google has chosen the wrong URL as canonical, you can explicitly mark the canonical for this page.” Uh-oh! If you use…WordPress…go down the rabbit hole for setting canonical elements on WordPress.

This will take some time. Search for anything that could possibly sound like a how-to. You will see results from everybody with something to sell, and you will see results from everyone referencing somebody else’s content, seeking to monetize their content with an affiliate link to something to sell. If it is for WordPress, then the links will include plugins. What I am looking for is instructions for how to set it for the site, preferably directly or, as a second choice, with an OEM plugin rather than a third-party one. Found a 2019 reference to editing the header.php file directly (with instructions for multi-site and single site). Found lots of references to various plugins. Lots of references to Yoast.

  • What is Yoast?

Yoast (https://yoast.com/) has WordPress options ranging from Free, to $$, to $$$. Let’s start with Free and see how many hooks are attached. First question: is Yoast lying to me (and themselves)? My identity and user data have value. Any plugin that requires identification and/or user data is by definition Not Free. The privacy policy shows an EU-based company, so GDPR is in place, but it doesn’t specify what data is collected when using the Free plugin. Public references seem clean (https://en.wikipedia.org/wiki/Yoast_SEO and others).

  • Install Yoast Free plugin

Search for plugins in the WordPress config, install, activate. Success so far, no requests for personal info. Following the link to configuration…yep, there’s the upsell for Premium in header, footer, and sidebar, but the center block still seems clean. A bit wordy and markety, but not skeevy. Personal Preferences has the winner, and I am smiling now: “If you’re using Yoast SEO free, usage tracking is only on when you explicitly opt-in.” Following basic configuration, everything says “good” but nothing references canonical. How do I know if it worked?

  • Figure out if it did anything

Using the Google URL Inspector on https://willaford.com/about, I find that “URL is not on Google.” Search help on Yoast and find that I can set the canonical per page or post while editing. Test on hello-world (set at www.willaford…) and, inspecting source after reload, find that it is successful. Test on “about” and find that the overall site is vending wordpress.willaford.com.

  • Change some stuff and do it again

Going to each page (not post, page) and setting www.willaford.com as canonical. I will use an earlier post today as the test case. Before:

<link rel="canonical" href="https://wordpress.willaford.com/2023/06/11/todo-list-june-2023/">
<link rel="canonical" href="https://wordpress.willaford.com/contact/">

After: success for named pages, but no dice for the full site.

<link rel="canonical" href="https://www.willaford.com/contact/">
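Rather than eyeballing the page source each time, the check could be scripted. The following is a minimal Python sketch using only the standard library; the class and function names are my own, and it works on an HTML string that you have already saved or fetched some other way.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, so split before matching
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html_text):
    """Return a list of canonical URLs declared in the given HTML text."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonicals


page = '<head><link rel="canonical" href="https://www.willaford.com/contact/"></head>'
print(find_canonicals(page))
```

An empty list would mean the page declares no canonical at all, and more than one entry would suggest that a theme and a plugin are both emitting the tag.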
  • Back to the top: why not simply change header.php?

Seems like there should be a better way. When I ran bncert I got certificates for all of the domains, so I think they should be covered with no need to rerun. For future WordPress installs I need to think about canonical on initial config (this wasn’t in mind when I created this the first time). So, using the AWS console, navigate to Lightsail and connect to the WordPress instance. Then find header.php. Ugh, skip that. The header is per theme [really?!]. OK, now I’m into the config files and want to avoid direct edit. Maybe I can do this with WP-CLI, but first, what does the config file say?

define( 'WP_HOME', 'https://' . $_SERVER['HTTP_HOST'] . '/' );
define( 'WP_SITEURL', 'https://' . $_SERVER['HTTP_HOST'] . '/' );

That is particularly unhelpful, unless you know what the HTTP_HOST is. OK, not really so unhelpful, this is simply telling me that I’m getting defaults. Let’s change this and see what happens. Backup wp-config.php. Publish this work-in-progress post. <PAUSE HERE> From the console, snapshot the instance. <COMING BACK to EDITING> From the command line:

sudo wp option update home 'https://www.willaford.com'
sudo wp option update siteurl 'https://www.willaford.com'

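Command succeeds, but nothing changed, most likely because the WP_HOME and WP_SITEURL constants defined in wp-config.php take precedence over the database options that wp option update writes. It turns out that Jetpack is giving the same error as Google, and gives simple, direct, specific feedback on how to fix it: edit wp-config.php.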


Before:

define( 'WP_HOME', 'https://' . $_SERVER['HTTP_HOST'] . '/' );
define( 'WP_SITEURL', 'https://' . $_SERVER['HTTP_HOST'] . '/' );

After:

define( 'WP_HOME', 'https://www.willaford.com' );
define( 'WP_SITEURL', 'https://www.willaford.com' );


screenshot of site settings showing update URL
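One way to picture the difference between the two wp-config.php variants is a small Python model. This is a simplified sketch and not WordPress's actual code: the function names are mine, and it only captures two ideas, that a defined constant wins over the database option and that the original constant was rebuilt from whatever Host header each request carried.

```python
def dynamic_constant(http_host):
    """Mirrors the original config: 'https://' . $_SERVER['HTTP_HOST'] . '/'"""
    return "https://" + http_host + "/"


def effective_home(wp_home_constant, db_home_option):
    """Constant wins when defined (not None); otherwise use the database option."""
    value = wp_home_constant if wp_home_constant is not None else db_home_option
    return value.rstrip("/")  # drop the trailing slash so values compare cleanly


db_option = "https://www.willaford.com"  # what `wp option update home` wrote

# Original config: the home URL follows the requested hostname
for host in ["willaford.com", "www.willaford.com", "wordpress.willaford.com"]:
    print(host, "->", effective_home(dynamic_constant(host), db_option))

# Edited config: the home URL is fixed regardless of the requested hostname
print("fixed ->", effective_home("https://www.willaford.com", db_option))
```

In this model the database option never gets a say while a constant is defined, which would explain why the wp option update commands succeeded without changing what the site served.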

And checking again for the canonical on this post:

<link rel="canonical" href="https://www.willaford.com/2023/06/11/problem-page-indexing-issues-detected-in-willaford/" class="yoast-seo-meta-tag"> 
  • Retest — are we done yet? Check on Google, Check on Yoast, Check on Jetpack

Restarted validation on Google. Will recheck later, but Google Site Kit requires a reconnect.
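While waiting on Google, the canonical values already copied out of page source can be sanity-checked locally. A minimal Python sketch follows; the expected host is the one chosen in this post, and the function name is hypothetical.

```python
from urllib.parse import urlsplit

EXPECTED_HOST = "www.willaford.com"  # the canonical host chosen in this post


def off_canonical(urls, expected_host=EXPECTED_HOST):
    """Return the URLs that are not https or whose host differs from expected_host."""
    flagged = []
    for url in urls:
        parts = urlsplit(url)
        if parts.scheme != "https" or parts.hostname != expected_host:
            flagged.append(url)
    return flagged


seen = [
    "https://wordpress.willaford.com/contact/",
    "https://www.willaford.com/contact/",
]
print(off_canonical(seen))
```

Anything the function returns is a page still advertising a hostname other than the chosen one.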

Deactivating Jetpack. I have no plans to pay, there is no way to dismiss the upsells, and the upsells obscure the basics. Feels like Norton and makes me wonder if the cure is worse than the disease. No, I don’t want your protection just because I want your SEO. No, I don’t want your threats. What, I can’t dismiss? Now you’re the threat (no, this isn’t awareness marketing, it is a threat. I presume that any software I’m loading is a threat and presume that your company is a threat until proven otherwise. If you assume I don’t already know that and try to convince me that <something else is a threat>, then I presume you are indeed the attack vector). Deactivate. Uninstall. Blocklist from list of companies I want to do business with.



