
How Does Google Treat Subdomains For SEO?


Time and time again, Google has shown that they treat subdomains very differently from root domains, in some cases treating them as completely different sites. For SEO purposes, it’s generally recommended to use a subfolder instead of a subdomain.

Subdomain vs. Subfolder

A subdomain is a string of characters that precedes the root domain and uses a period to separate them. A subfolder comes after the domain suffix and is separated by a forward slash. You can have multiple subdomains or subfolders, and you’ll frequently see them combined.

Examples:

  • Blog.chrisberkley.com is a subdomain
  • Chrisberkley.com/posts/ is a subfolder
  • Blog.chrisberkley.com/posts/ is a subdomain with a subfolder.
  • First.blog.chrisberkley.com is two subdomains (“first” and “blog”)
  • First.blog.chrisberkley.com/posts/recent/ is two subdomains (“first” and “blog”) with two subfolders (“posts” and “recent”).

Did You Know?

In the URL www.chrisberkley.com, “www” is technically a subdomain. It’s true!

Why Use Subdomains?

There are legitimate reasons that necessitate the use of subdomains, and they are not always avoidable.

Technical Limitations

Sometimes there are technical infrastructure limitations that prevent the use of a subfolder. In large organizations with big sites, it’s common for access to the root domain to be limited, with subdomains used instead for ease of management.

This may include piecing together multiple CMSs. If the core site is hosted on one CMS like Magento or Sitecore, but the blog is hosted on WordPress, it can be difficult (or impossible) to make them work together on the root domain.

Organizational Control

Large organizations often have multiple divisions that operate independently. Such is the case with universities, where individual colleges need to have edit access to their own sites (School of Nursing, School of Engineering, etc.). The same is true for other national organizations like banking institutions.

It’s a lot easier to spin up a separate site on a subdomain and grant a team of people edit access to that particular subdomain. You wouldn’t want the School of Nursing making edits that ended up taking down the root domain for the entire university.

International

Sometimes organizations will create international subdomains like fr.chrisberkley.com or en.chrisberkley.com. There’s no inherent SEO benefit to including a country code in the subdomain, but it may come down to organizational structure or technical limitations. In a perfect world, you’d place those in subfolders (chrisberkley.com/fr/ or chrisberkley.com/en/) and implement hreflang. Alas, we don’t live in a perfect world and that isn’t always possible.

How Google Treats Subdomains

Working with subdomain-heavy clients, my firsthand experience is that Google treats subdomains as separate sites. One client had two divisions of their company, with one set up on a subdomain and the other on the root domain. They had some content overlap and we sometimes saw their pages swap places in search results.

It’s my belief that subdomains don’t inherit domain authority or site equity from the root domain. WordPress.com has a domain authority of 94. If subdomains inherited that value, wouldn’t it make sense to set up free blogs on their platform (which uses subdomains) and immediately benefit from the SEO value?

Secondly, Google’s own Search Console requires you to set up separate profiles for subdomains. That’s another good indicator that they value subdomains differently.

That doesn’t mean subdomains inherit ZERO equity from their root domains. They may inherit a greatly reduced amount. OR, Google may adjust the amount of equity they inherit on a case-by-case basis. Since WordPress.com has thousands of low-authority blogs on subdomains, Google may devalue their subdomains more than other sites that only have a handful.

Google has stated that their search engine is indifferent to subdomains vs. subfolders, but the SEO community has repeatedly found that to be false. Industry thought-leader Moz moved their content from a subdomain to a subfolder and saw measurable increases just as a result of that move.

Questions? Comments? Leave them here or reach out to me on Twitter: @BerkleyBikes.

How Long For Content To Rank?


The number one struggle I face when pitching clients and showing them the value of SEO is that it takes time. Whereas pay-per-click advertising and social media can be spun up quickly and provide a return on investment almost immediately, SEO is an annuity investment.

To make a relevant analogy: you can’t invest money in the stock market today and expect dividends tomorrow. The money you invest today is done so with the understanding that it will provide value later. SEO is similar.

Nevertheless, that’s a real problem because when clients are making a significant investment in SEO, they want to see results. That’s why I prepare clients by telling them “some of the work we do isn’t going to yield results right away. It’s going to take 6-12 months.”

This is especially true with publishing new content. Ahrefs did a study about how long it takes to rank in Google. They looked at the average age of pages ranking in positions 1-10, and the overall takeaway was that higher positions typically featured pages that have been live for several years. They also noted that higher authority sites took less time to rank well, which is a no-brainer. If there’s one single graph that shows their findings best, it’s this one:

That’s helpful, but does their large-scale study align with actual firsthand findings? Sure, there’s value in a larger data sample, but anecdotal data would certainly help reinforce those findings.

Fortunately I have that data. Across multiple clients in multiple industries, I can highlight examples of pages that rank well for target keywords, but didn’t reach full potential until months after they were published. I’m sharing these examples so that both consultants and clients can form realistic expectations for SEO campaigns, which is something I believe this industry can and should do a much better job at.

Example #1

Client Industry: Construction

Type of page: WordPress blog post

This particular page targeted “rental cost” keywords which are fairly low volume but highly relevant in the client’s industry. The client was hesitant to discuss pricing, but competitors were doing it, so we pushed them to create their own page. Not only does it drive meaningful traffic, but it has resulted in ~3 leads per month since it was published 16 months ago.

Example #2

Client Industry: Web hosting

Type of page: Resource center pages

These two pages were both created as part of a large content initiative – more than 120 pages of long form content over a one year period. Notably, they both saw steady growth and then marked increases in January 2018, possibly as a result of an algorithm update.

 

Example #3

Client Industry: Healthcare

Type of page: Core site page

This page saw long periods of inactivity in the very competitive healthcare space, before eventually moving into ranking positions that drive meaningful amounts of traffic (this is also a result of other improvements made to the site during that time).

Example #4

Client Industry: Local retail

Type of page: WordPress blog post

This example comes from a mom & pop retail store. A blog post that I wrote eventually moved into top ranking positions for some industry head terms, outranking even the brands that the retailer sold in their store. Unfortunately, the business owners did not continue digital marketing efforts after I left my position there, and the content did not retain its visibility in search results.

Example #5

Client Industry: Digital marketing

Type of page: WordPress blog post

The last example comes from my own website (which has lower site authority than any of my clients). While not initially a large traffic source, an analytics blog post I wrote moved into top positions (including the answer box) over a period of one year.

Summary

The key takeaway here is that firsthand data supports the study that Ahrefs did – that content may take months or more to move into top ranking positions, especially for competitive keywords. Site authority absolutely helps – two of the sites included here had domain authority ratings between 50 and 80, which is a rough indicator that they’re authoritative, especially in their respective industries.

With some of the examples, we did employ other tactics like building internal and external links. All pages were submitted to Google Search Console after publishing to make sure they got crawled as soon as possible. It’s also obvious that none of these pages existed in a vacuum, meaning other marketing (and SEO) initiatives could’ve contributed to better rankings. Nevertheless, there is a clear pattern showing that even highly optimized content on authoritative sites doesn’t always achieve top rankings immediately, and SEO continues to require patience.

How To: Optimize WordPress Posts & Pages For SEO


WordPress is a brilliant CMS that offers a plethora of SEO functionality out of the box. But like any piece of technology, default settings won’t be enough to truly maximize its potential. This post will show you how to optimize a WordPress post (or page) for SEO purposes.

The WordPress SEO Plugins

While WordPress is good out of the box, it needs an SEO plugin to take it to the next level. The gold standards are either Joost de Valk’s Yoast SEO Plugin or All In One SEO Pack by Michael Torbert. Both add critical functionality for SEO purposes, so make sure you have one installed.

Content

No amount of optimization will help if you’re targeting topics with low or non-existent search volume. The same can be said for high volume (and high competition) topics. You have to pick topics and themes that are realistic and within your wheelhouse to achieve SEO success.

First we’ll start with the post content itself, focusing on how to structure the page with H headings and overall content length.

H Headings & Page Structure

Start by adding a post title. In many WordPress themes, the post title will also be present on the page as an H1 heading. Pages should only have one H1 heading and it needs to be keyword-rich and descriptive of the post’s content. The H1 is the first text a visitor sees when they hit the page.

In addition to H1 headings, it’s increasingly important to structure pages with additional, nested H headings like H2s, H3s, H4s, etc. These should also be keyword-rich and describe the subsequent paragraph. On this very page you’ll see a clear structure where paragraphs are ordered and grouped by similarity and marked up with a clear hierarchy of H headings.

If you know your subject matter and audience well, developing a hierarchy of H headings may be second nature to you. If not, performing keyword research can typically reveal different subtopics and then you can apply common sense to order them in the method that makes the most sense for visitors.
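
As a rough sketch, the heading hierarchy on a page like this one boils down to something like the following HTML (indentation added only to show the nesting):

<h1>How To: Optimize WordPress Posts & Pages For SEO</h1>
  <h2>Content</h2>
    <h3>H Headings & Page Structure</h3>
    <h3>Content Length</h3>
  <h2>Video, Images & Media</h2>
    <h3>Image Optimization</h3>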

Ordered and Unordered Lists (Bullet Points)

To break up content and make it more digestible, use ordered lists (numbered lists) and unordered lists (bullet points) where applicable. Using these with a keyword-rich H heading may result in securing a featured snippet (answer box) in search results.

  • Anytime you’re describing steps, consider using an ordered list.
  • If you’re listing several things using commas, try bullet points instead.

This is not only helpful for SEO, it helps readers digest a page more easily.

Content Length

Content length is much debated and the honest answer to “what’s the right length” is that there isn’t one. If the content is engaging, people will read it. Know your audience, write quality content and you’ll succeed.

With that being said, 250-300 words is commonly considered the absolute minimum for SEO purposes. Less than that and search engines may deem the content thin. It will be incredibly difficult to add a meaningful structure of H headings to a page with 300 words.

I recommend content that’s a minimum of 500-700 words. In many cases, long form content can do wonders for SEO and when I say long form I mean 1,000 words or more. Most of my successful posts are detailed how-tos in excess of 1,000 words. Your mileage may vary – put your focus on writing good content and worry less about the length.

Video, Images & Media

Video, images and media are also great ways to break up text-based content and provide additional value for visitors. Would the topic you’re discussing be more easily understood if a visual were added? In many cases, yes.

Here I’ll discuss ways to optimize media for SEO, and also for visitors with disabilities or impairments, who may not be able to consume images, video or audio.

Image Optimization

Images can be improved for SEO by using filenames, alt text and by optimizing image sizes (for site speed). Because search engines can’t visually determine the contents of an image, these optimizations allow them to understand image content, helping the page rank better and helping images to rank in image search results. Additionally, visitors with visual impairments may not be able to see images, so these optimizations help them consume and understand multimedia content.

Image Filenames

Including keywords in filenames can have an impact. It’s not huge, but every bit helps. Use descriptive keywords in filenames when possible, but don’t start keyword stuffing – make them descriptive and methodical.

Image Alt Text

Include image alt text when possible. The alt text is never seen by visitors unless A) the image doesn’t load or B) the visitor is impaired and the alt text is read to them.

Both of these scenarios help visitors understand the content of the image, even if it can’t be seen. For that reason, make your image alt text descriptive of what’s in the image and avoid keyword stuffing.

wordpress seo image alt text

The alt text for the image immediately above: wordpress seo image alt text
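
In the page’s HTML, that translates to an image tag roughly like this (the filename is hypothetical):

<img src="wordpress-seo-image-alt-text.png" alt="wordpress seo image alt text" />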

Image Size Optimization

Your images should only be as large as they need to be. Often, GIANT images are scaled down to a much smaller size with HTML. The problem is, if you have a giant image with an enormous file size, browsers have to load the entire image, even if it’s being displayed at a much smaller size. That slows down page speed, especially if there are multiple large images on the page.

Make the image as big as it needs to be. If the image will be displayed at 900 pixels wide, then make it 900 pixels wide. Secondly, use JPG images instead of PNGs – JPGs are significantly smaller in file size. If you don’t have an image editing program, you can do it right in WordPress from the Media Library menu.

Featured Images

Add a featured image. The featured image will be used as the default image when a page or post is shared on social media, although this can be changed for different social networks.

Video

Similar to images, video content also has opportunities for on-page optimization. Video content is equally hard for search engines to understand, so we optimize by adding context in other ways.

Embedding

Embedding video content on WordPress posts or pages is quite easy, especially for YouTube, Wistia and Vimeo. With any of these three, you can simply drop the URL into WordPress’ WYSIWYG editor and it will automatically embed the video. Embedding videos on-site is a great way to get more views and provide a superior user experience.

Schema

When you do embed video content, make sure you add Schema as well. If you’re using Wistia, you’re in luck, because Wistia embeds Video Schema by default using Javascript (read more about Wistia videos & schema here).

YouTube and Vimeo users are not as fortunate, however, and must add Schema manually, preferably using custom fields. JSON-LD is Google’s preferred format for Schema, and creating it is not difficult at all. Schema gives search engines additional information about videos, such as the video’s title, description, length, upload date, etc. Schema is the only way for search engines to get information about video contents.

<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "VideoObject",
"name": "Contact Form 7 Goal Conversion Tracking Google Tag Manager",
"description": "Follow this 10 minute guide to set up Google Analytics goal conversion tracking for Contact Form 7 submissions using Google Tag Manager.
If you have a WordPress website and you use the Contact Form 7 plugin, you can use Google Tag Manager to create events and set up Goal Conversions in Google Analytics. Then you can attribute form submissions to different marketing channels and campaigns that you're running.
This guide not only shows you how to track submissions, but also ensures that you're only tracking successful submissions where mail is actually sent. It also allows you to specify which forms you want to track, based on the form ID built into the Contact Form 7 shortcode. 
***Links***
Written how-to guide: https://chrisberkley.com/blog/contact-form-7-event-tracking-google-tag-manager/
Javascript code for Tag #1: 
https://chrisberkley.com/wp-content/uploads/2017/11/wpcf7mailsent-javascript.txt
Troubleshooting your setup:
https://chrisberkley.com/blog/troubleshooting-contact-form-tracking-with-gtm/",
"thumbnailUrl": "https://i.ytimg.com/vi/oTZG7A3RjT8/maxresdefault.jpg",
"uploadDate": "2017-11-19",
"duration": "PT10M1S",
"embedUrl": "https://www.youtube.com/embed/oTZG7A3RjT8"
}
</script>

Transcripts

Transcripts can be really critical. Not only do they give impaired users a full transcript of the video’s content, but they can be keyword-rich and help a page rank if the video is especially relevant to the target keywords.

I don’t always include transcripts, but often recommend including them in an accordion drop-down, so as not to disrupt the flow of existing text on the page. If the page doesn’t have much additional text, transcripts can easily be adapted into blog posts.

Meta Data

Meta data is still really important for SEO. Both Yoast’s plugin and All In One SEO make it very easy to add a title tag and meta description, even warning you if you approach character limits.

Title Tags

Using your chosen SEO plugin, write and add an optimized title tag. Shoot for 45-60 characters. Excessively long titles will be truncated in search results.

I prefer to include the target keyword at the beginning and then include branding at the end. Title tags should grab the searcher. I’m a fan of using question-based title tags if they’re relevant. Here’s the title tag for this post:

How To Optimize WordPress Posts & Pages For SEO | Chris Berkley
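
In the page’s HTML source, that renders as a title tag in the <head>:

<title>How To Optimize WordPress Posts & Pages For SEO | Chris Berkley</title>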

Meta Descriptions

Meta descriptions should be up to 230 characters and describe the page’s contents – be as descriptive as possible. Meta descriptions are the key to encouraging searchers to click through from search results and can have a big impact on click through rates.

Tell searchers what value the page will provide and what they’ll find. Include branding if possible. End with a CTA telling them what to do once they land on the page.

Here’s the meta description for this page:

Optimizing WordPress posts and pages is critical for SEO. Follow this comprehensive guide to make sure your content is FULLY optimized, using all of WordPress’ advanced functionality.
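
In the HTML, the SEO plugin outputs that description as a meta tag roughly like this:

<meta name="description" content="Optimizing WordPress posts and pages is critical for SEO. Follow this comprehensive guide to make sure your content is FULLY optimized, using all of WordPress' advanced functionality." />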

Social Markup

SEO plugins make it easy to add Open Graph and Twitter Card markup to the page. These meta tags are specifically for social media and add rich snippets when URLs are included in social posts.

Even without social markup, most social networks will pull the page title, description and image to create a rich snippet. However, these aren’t always optimal – they frequently pull the wrong or completely irrelevant images. Optimizing this markup allows you to customize titles, descriptions and images for use on social media.

Open Graph Markup

Open Graph is a standard markup most notably used by Facebook, LinkedIn and Pinterest. The two SEO plugins I mentioned before automate the creation of Open Graph markup using the title tag, meta description and featured image that you’ve added to the page. However, they also allow you to customize these fields specifically for social media – this is especially easy using the Yoast plugin.

Say you wanted to add a catchier title/description/image for use on social media. You can do that without changing the title tag & meta description that Google uses, so your SEO efforts aren’t impacted.
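
As a rough sketch, the resulting Open Graph tags look something like this (the page and image URLs here are hypothetical):

<meta property="og:title" content="How To Optimize WordPress Posts & Pages For SEO" />
<meta property="og:description" content="Optimizing WordPress posts and pages is critical for SEO. Follow this comprehensive guide to make sure your content is FULLY optimized." />
<meta property="og:url" content="https://chrisberkley.com/blog/optimize-wordpress-posts-pages-seo/" />
<meta property="og:image" content="https://chrisberkley.com/wp-content/uploads/featured-image.jpg" />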

Twitter Cards

Rather than use Open Graph markup like most other networks, Twitter elected to create its own (very similar) markup called Twitter Cards. You can customize these too just like Open Graph markup.
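
The Twitter Card tags follow the same pattern with a slightly different syntax – something like:

<meta name="twitter:card" content="summary_large_image" />
<meta name="twitter:title" content="How To Optimize WordPress Posts & Pages For SEO" />
<meta name="twitter:description" content="Optimizing WordPress posts and pages is critical for SEO." />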

Linking

Internally linking pages is really important for both visitors (they can find related content) and search engines (can crawl the site more easily). Internal links can be added in a number of ways, but some are more valuable than others.

Links In The Body Copy

Body copy links are arguably the most valuable, assuming they’re done naturally and in moderation. No one likes a page where every other sentence is a link – it’s incredibly distracting and results in a poor user experience. Use links where they fit naturally.

Internal Links

Internal links (links from one page on your site to another page on your site) are valuable for helping visitors find related content and improving the ability for search engines to crawl the site. I recommend setting these to open in the current tab.

External Links

Linking out to other sites is fine too. If there’s a page on another site that would provide value to your visitors, link out to it. I recommend opening these in new tabs, to encourage visitors to stay longer on your site.

Anchor Text

Anchor text is the phrase that gets hyperlinked to another page. You should aim to use keyword-optimized anchor text, especially for internal links (keyword-rich anchor text is not as necessary for external links).

There are several links within this article that link out to other related topics, using anchor text keywords relevant to those topics.
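
For example, a body copy link with keyword-rich anchor text looks like this in HTML (the URL is hypothetical):

<a href="https://chrisberkley.com/blog/xml-sitemaps/">XML sitemaps</a>

Here, “XML sitemaps” is the anchor text.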

Categories & Tags

Use Categories and Tags methodically. Keyword stuffing them has no benefit for SEO purposes. Instead, they should be used to help visitors browse the site to discover related content. Additionally, Categories & Tags have a ton of value for search engines as they make it easy to crawl the site and find additional pages.

Categories & Tags are the first line of defense against island pages and semi-automate internal linking. However, blog category pages typically contain dynamic content (unless setup otherwise) and typically don’t present much value for ranking purposes.

Build out some pre-determined Categories & Tags and stick to them, adding new ones as you go. Avoid using the same categories as tags and vice versa. Think of Tags as sub-categories. Below is a sample diagram of a Category-Tag structure.

Authors

Adding author details can lend credibility to the post by establishing authority. WordPress editors have the option of changing the author at the bottom of the post. Don’t ever leave the post author as “Admin.”

Authors should have photos & biographies describing who they are. There’s no inherent SEO value here (not anymore), but it shows readers who actually wrote the content. I always include links to my Twitter page for people to ask questions about my content.

Technical

Schema

Schema (Structured Data) helps search engines crawl and index web pages by identifying specific pieces of content. There are many types (I won’t describe them all), which can be found at Schema.org.

A few common types are:

  • Video
  • Product
  • Person
  • Location

I recommend following Torquemag’s guide to setting up custom WordPress fields for Schema. You can also read more about Schema and Structured Data with Google’s developer documentation.
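
As a rough sketch, a simple Person Schema block might look like this (the profile URLs are an assumption):

<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "Person",
"name": "Chris Berkley",
"url": "https://chrisberkley.com",
"sameAs": ["https://twitter.com/BerkleyBikes"]
}
</script>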

URL Structure

When you save your post or page as a draft, you’ll see that WordPress automatically takes the post title (H1) and uses it for the URL as well. You may choose to edit the URL, but make sure it’s still keyword-rich. You want your most valuable keywords in the URL.

If you’re creating a page (not a post) you’ll see that you have the option of selecting a parent page. Should you add one? It comes down to site structure and strategy. If the page you’re creating falls naturally as a child page to another page, then take advantage of it.

Adding a child/parent page isn’t a silver bullet for SEO. It’s part of a bigger SEO strategy centered on how content is structured on your site. If you have a careful hierarchy built out, adding URLs that reflect the site structure is icing on the cake.

The Difference

Following these steps can be the difference between content that ranks and content that doesn’t. Content has become increasingly important, especially as backlinks have become less influential as a ranking factor.

Checklist

If the number of steps seems intimidating, download this checklist and integrate these steps into your content publishing process.

Download The Checklist

How To Use IMPORTXML & Google Sheets to Scrape Sites


IMPORTXML is a very helpful function that can be used in Google Sheets to effectively crawl and scrape website data in small quantities (especially useful for grabbing titles and meta descriptions, etc.). It can be faster and more convenient than using Screaming Frog or other tools, especially if you only need to pull data for a handful of URLs. This post will show you how to use IMPORTXML with XPath to crawl website data including: metadata, Open Graph markup, Twitter Cards, canonicals and more.

Skip Ahead: Get the free template.

Setting Up The IMPORTXML Formula

This is the IMPORTXML formula:

=IMPORTXML(url,xpath_query)

You can see there are two parts and they’re both quite simple:

The first half of the formula just indicates what URL is going to be crawled. This can be an actual URL – but it’s much easier to reference a cell in the spreadsheet and paste the URL there.

The second half of the formula is going to use XPath to tell the formula what data is going to be scraped. XPath is essentially a language that is used to identify specific parts of a document (like a webpage). Subsequent paragraphs will provide different XPath formulas for different pieces of information you might want to scrape.
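
For example, assuming the URL you want to crawl has been pasted into cell B6 (as in the template linked at the end of this post), pulling that page’s title tag looks like this:

=IMPORTXML(B6,"//title/text()")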

Crawling Metadata with IMPORTXML

The following XPath formulas will scrape some of the most commonly desired SEO data like metadata, canonical tags, and H headings. Note that you can scrape any level of H heading by replacing the “h1” with whichever heading you want to scrape (h2, h3, etc.)

Title Tags: //title/text()
Meta Descriptions: //meta[@name='description']/@content
Canonical Tags: //link[@rel='canonical']/@href
H1 Heading(s): //h1/text()
H2 Heading(s): //h2/text()

Social Markup

While social markup has no immediate SEO benefit, it is very important for sites that have active audiences on social media, and implementation of social markup often falls under the umbrella of SEO because of its technical nature. The following XPath formulas will allow you to scrape Open Graph and Twitter Card markup.

Open Graph Markup

Open Graph is used by Facebook, LinkedIn and Pinterest, so all the more reason to make sure it’s implemented correctly.

OG Title: //meta[@property='og:title']/@content
OG Description: //meta[@property='og:description']/@content
OG Type: //meta[@property='og:type']/@content
OG URL: //meta[@property='og:url']/@content
OG Image: //meta[@property='og:image']/@content
OG Site Name: //meta[@property='og:site_name']/@content
OG Locale: //meta[@property='og:locale']/@content

Twitter Card Data

Twitter Card markup is only for….Twitter. Still important though!

Twitter Title: //meta[@name='twitter:title']/@content
Twitter Description: //meta[@name='twitter:description']/@content
Twitter Image: //meta[@name='twitter:image']/@content
Twitter Card Type: //meta[@name='twitter:card']/@content
Twitter Site: //meta[@name='twitter:site']/@content
Twitter Creator: //meta[@name='twitter:creator']/@content

Limitations

Unfortunately, IMPORTXML & Sheets cannot be used to scrape large quantities of data at scale, or it will stop functioning. For more than a handful of URLs, it’s recommended to use a more robust program like Screaming Frog (Screaming Frog does not have a URL limit when using it in list mode).

IMPORTXML Google Sheets Template

You can see how this works firsthand by making a copy of this Sheets Scraper Template and entering the URL of your choice in cell B6. To add additional URLs, copy & paste row 6, then enter a different URL.

Questions? Contact me here or reach out on Twitter!

WWW vs. non-WWW For SEO


There is no SEO benefit to WWW URLs vs non-WWW URLs. Best practice is to pick one as the preferred version and use server-side redirects to ensure all visitors (human and search engine) end up on one single preferred version of the URL.

What Is WWW?

First, let’s start with URL structure. Take https://www.chrisberkley.com as an example.

In that URL, there are three parts:

  • Protocol
  • Subdomain
  • Domain name

Protocol is a topic for another time, but WWW is technically a subdomain. Websites often use multiple subdomains for different purposes: one for email, one for intranet access, etc. The www subdomain has traditionally been used as the designated subdomain for public-facing websites.

Which Is Better For SEO?

As noted, there is no benefit either way for SEO purposes. You don’t actually need the www subdomain – it’s perfectly fine not to use it, and there is zero functional difference. However, you DO need to pick one version and use it consistently.

Server-Side Redirects

Once a preferred version has been chosen, the other version needs to be 301-redirected at the server level. If it isn’t, it might result in:

  1. Non-preferred URLs returning 404 errors.
  2. The website rendering pages in both variations.

Configuring the server to redirect non-preferred versions to preferred versions ensures that ALL URLs will be redirected automatically.
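
As a rough illustration, on an Apache server this can be handled with mod_rewrite in the .htaccess file – this sketch assumes www over HTTPS is the preferred version:

# Redirect non-www requests to the www version with a 301
RewriteEngine On
RewriteCond %{HTTP_HOST} ^chrisberkley\.com$ [NC]
RewriteRule ^(.*)$ https://www.chrisberkley.com/$1 [R=301,L]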

Configuring Google Search Console

Additionally, it’s recommended to configure Search Console to indicate the preferred version as well. In the top right corner, click the gear icon and select Site Settings. There you’ll see the option to set a preferred version of the URL:

What Are XML Sitemaps? How To Use Them for SEO


XML Sitemaps are critical to help search engines crawl websites, but I frequently see clients with critical errors in their XML sitemaps. That’s a problem because search engines may ignore sitemaps if they repeatedly encounter URL errors when crawling them.

What Is An XML Sitemap?

An XML Sitemap is an XML file that contains a structured list of URLs that helps search engines crawl websites. It’s designed explicitly for search engines – not humans – and acts as a supplement. Whereas web crawlers like Googlebot will crawl sites and follow links to find pages, the XML sitemap can act as a safety net to help Googlebot find pages that aren’t easily accessed by crawling a site (typically called island pages, if there are no links built to them).
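
At its simplest, an XML sitemap is just a list of URLs wrapped in a bit of XML – something like this (the URLs are hypothetical):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://domain.com/blog/</loc>
  </url>
  <url>
    <loc>https://domain.com/contact/</loc>
  </url>
</urlset>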

Where Do XML Sitemaps Live?

The XML sitemap lives in the root folder, immediately after the domain, and often follows a naming convention such as domain.com/sitemap.xml. A Sitemap declaration should also be placed in the robots.txt file so that Google can easily discover it when it crawls the robots.txt file.
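
The robots.txt declaration is a single line, for example:

Sitemap: https://domain.com/sitemap.xml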

What URLs Should Be Included In An XML Sitemap?

URLs included in the XML sitemap should be URLs that are intended to be crawled, indexed and ranked in search results. URLs should meet the following specific criteria in order to be included:

  • Only 200 OK URLs: no 404s, 301s, etc.
  • Pages do not contain a noindex tag
  • Pages are not canonicalized elsewhere
  • Pages are not blocked by robots.txt

HTTP Status Codes

Sitemap URLs should return clean 200 status codes. That means no 301 or 302 redirects, 404 errors, 410 errors or otherwise. Google won’t index pages that return 404 errors, and if Googlebot does encounter a 301 redirect, it will typically follow it and find the destination URL, then index that.

If you have 404 errors, first ask why: was a page’s URL changed? If so, consider redirecting that URL by locating the new URL. Take that new URL and make sure that is included in the sitemap.

If there are 301s or 302s, follow them to the destination URL (which should be a 200) and replace the redirected URL in the sitemap.

Noindexed & Disallowed Pages

If a page has a noindex tag, then it’s clearly not intended to be indexed, so it’s a moot point to include it in the XML sitemap. Similarly, if a page is blocked from being crawled with robots.txt, those URLs should not be included either.

If you DO have noindexed or disallowed pages in your XML sitemap, re-evaluate whether they should be blocked. It may be that you have a rogue robots.txt rule or noindex tags that should be removed.

Non-Canonical URLs

If a page in the sitemap has a canonical tag that points to another page, then remove that URL and replace it with the canonicalized one.

Does Every Clean 200 Status URL Need To Be Included?

In short, no. Especially on very large sites, it may make sense to prioritize the most important pages and include those in the XML Sitemap. Lower priority, less important pages may be omitted. Just because a page is not included in the XML sitemap does not mean it won’t get crawled and indexed.

Sitemap Limits & Index Files

An XML sitemap can contain a maximum of 50,000 URLs or reach a maximum file size of 10MB. Sitemaps that exceed these limits may get partially crawled or ignored completely. If a site has more than 50,000 URLs, you’ll need to create multiple sitemaps.

These additional sitemaps may be located using a sitemap index file. It’s basically a sitemap that has other sitemaps linked inside it. Instead of including multiple sitemaps in the robots.txt file, only the index file needs to be included.

If there ARE too many URLs to fit into one sitemap, URLs should be carefully and methodically structured in hierarchical sitemaps. In other words, group site sections or subfolders in the same sitemap so that Google can get a better understanding of how URLs interrelate. Is this required? No, but it makes sense to be strategic.
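
As a rough sketch, a sitemap index file looks something like this (the child sitemap names are hypothetical):

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://domain.com/sitemap-blog.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://domain.com/sitemap-products.xml</loc>
  </sitemap>
</sitemapindex>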

Types of XML Sitemaps

In addition to creating sitemaps for pages, sitemaps can (and should) be created for other media types including images, videos, etc.

Dynamic vs. Static

Depending on the CMS and how it’s configured, the sitemap may be dynamic, meaning it will automatically update to include new URLs. If it’s configured correctly, it will exclude all the aforementioned URLs that shouldn’t be included. Unfortunately, dynamic sitemaps do not always operate that way.

The alternative is a static sitemap, which can easily be created using the Screaming Frog SEO spider. Static sitemaps offer greater control over what URLs are included, but do not automatically update to include new URLs. In some cases I’ve recommended clients utilize static sitemaps if a dynamic sitemap cannot be configured to meet sitemap criteria. When that happens, I set a reminder to provide an updated sitemap, typically on a quarterly basis, or more often if new pages are frequently added to the site.

Submission to Webmaster Tools

Once an XML sitemap has been created and uploaded, it should always be submitted to Google Search Console and Bing Webmaster Tools to ensure crawlers can access it (in addition to the robots.txt declaration).

In Google Search Console

Navigate to Crawl > Sitemaps and at the top right you’ll see an option to Add/Test Sitemap. Click that and you can submit your sitemap’s URL to be crawled.

In Bing Webmaster Tools

From the main dashboard, navigate down to the sitemaps section and click “Submit a Sitemap” at the bottom right. There you can enter your sitemap’s URL.

Finding Pages With Embedded Wistia Videos


Wistia is a great platform for hosting videos on your site with tons of functionality including the ability to embed videos on pages and optimize them using built-in calls-to-action and pop-ups.

Recently I encountered a scenario where I wanted to find every website page that had a Wistia video on it. Going into Wistia’s back end revealed that the client had ~200 videos, but I had no idea where they were actually placed on the site, and wanted to ensure they were being used to full capacity.

With YouTube, you can simply run a Screaming Frog crawl and do a custom extraction to pull out all the embed URLs. From there you can determine which video is embedded based on that URL. However, the way Wistia embeds videos is not conducive to identifying which video is where, based on an embed URL. I couldn’t find any distinguishing characteristics that would help me identify which video was which.

How can such an advanced video platform be so incredibly difficult?

That’s mostly because Wistia relies heavily on Javascript. As Mike King notes in his article The Technical SEO Renaissance, right clicking a page and selecting “view page source” won’t work because you’re not looking at a computed Document Object Model. In layman’s terms, you’re looking at the page before it’s processed by the browser and content rendered via Javascript won’t show up.

Using Inspect Element is the only way to really see what Wistia content is on the page. Doing that will show you much more information, including the fact that Wistia automatically adds and embeds video Schema when you embed a video. This is awesome and saves a ton of work over manually adding Schema like you have to do with YouTube videos.

The video Schema contains critical fields like the video’s name and description. These are unique identifying factors that we can use to determine which video is placed where, but how can it be done at scale when we don’t even know which pages have videos and which don’t?

Finding Wistia Schema With Screaming Frog

Screaming Frog is one answer. Screaming Frog doesn’t crawl Javascript by default, but as of July 2016, DOES have the capability to do so if you configure it (you’ll need the paid version of the tool).

Go into Configuration > Spider > Rendering and select Javascript instead of Old AJAX Crawling Scheme. You can also uncheck the box that says Enable Rendered Page Screenshots, as this will create a TON of image files and take unnecessarily long to complete.

Setting Up a Custom Extraction

Next you will need to set up a Custom Extraction, which can be done by going to Configuration > Custom > Extraction. I’ve named mine Wistia Schema (not required), set the extraction type to regex, and added the following regular expression:

<script type="application\/ld\+json">\{"@context":"http:\/\/schema.org\/","\@id":"https:\/\/fast.wistia.net\/embed.*"\}<\/script>

This will ensure you grab the entire block of Schema, which can be manipulated in Excel later to separate different fields into individual columns, etc.

Then set Screaming Frog to list mode (Mode > List) and test the crawl with a page that you know has a Wistia video on it. By going into the Custom Extraction report, you should see your Schema appear in the Extraction column. If not, go back and make sure you’ve configured Screaming Frog correctly.

Screaming Frog Memory and Crawl Limits

The only flaw in this plan is that Screaming Frog needs a TON of memory to crawl pages with Javascript. Close any additional programs that you don’t need open so that you can reduce the overall memory your computer uses and dedicate more of it to Screaming Frog. With large sites, you may run out of memory and Screaming Frog may crash.

Takeaways

  • Wistia uses Javascript liberally.
  • Schema is embedded automatically, using Javascript.
  • Schema can be crawled and extracted with Screaming Frog, but it’s a memory hog so larger sites might be a no-go.

Questions? Tweet at me: @BerkleyBikes or comment here!

Google My Business Posts


A few weeks ago Google rolled out a post feature for its My Business listings. Now you can create Facebook-like posts in the back end of the Google My Business interface that will display an image, description and website link in a box below your Google My Business listing’s knowledge graph. First I’ll show you how to create & optimize these, then I’ll discuss where I foresee them being most useful.

Creating Google My Business Posts

First log into your Google My Business platform and select the location you want to create a post for (if you have more than one). So far posts have to be manually created for each location, so it’s not easy to roll them out to hundreds of listings. The post you create will only show up for the listing you create it for.

Once you’ve selected your location, click on the “Posts” option on the left nav and you’ll see a box in which you can write a post. You’ll also see previous posts located underneath (this particular post is expired, I’m not sure how long they stay there for).

Once you click into the post editor, it’ll look like this. The interface is admittedly clunky.

If you click on that big gray box, it’ll let you upload a photo and prompt you to crop it into a rectangular shape. (You would think the Photo Guidelines linked at the bottom would provide criteria for sizing, aspect ratio, etc. It does not.) Ideally your image should be engaging and grab attention. You may opt to include text in the image – this reminds me a lot of a Google AdWords Display ad, which may hint at the future of this functionality.

Then you can add a description – you have between 100-300 words.

There are really two types of posts – events and non-events. Non-event posts last a week, while event posts will prompt you to enter start/end dates and will stay up for the entire duration of the event.

You can also add one of several preset call-to-action buttons for people to click on (I’ve chosen ‘Learn More’) and add a URL. I highly recommend tagging this URL, just like you should tag the landing page URLs in your GMB listings. Otherwise, it’ll come through as organic, but you may not know whether it was from a normal SERP or the post itself.

You can use Google’s URL builder – be sure to tag the medium as organic (these URLs should only be accessible from an organic search). The source is up to you, but I’ve been using g-local-post as my source (to differentiate from g-local as my source in the listing URLs themselves).
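
For example, a tagged post URL might look something like this (the page and campaign values are hypothetical):

https://chrisberkley.com/blog/sample-post/?utm_source=g-local-post&utm_medium=organic&utm_campaign=gmb-posts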

Then you can preview your post and if it looks good, publish it.

Now you’ll see your post as a small box at the bottom of your branded knowledge graph. Despite the fact that I’ve done everything Google requested, the image is cut off and the description cut short. Hopefully this product evolves a bit and remedies some of those issues.

You might think “I wonder if they look better on mobile?” – the answer is no (see below). If there’s more than one post, you do see a carousel (whereas desktop only displays one post at a time). On mobile, Google does allow you to click on a tab and see the posts by themselves, but who’s realistically going to do that?

Takeaways

The GMB Post format and interface are clunky. The images almost never show up as intended, making them ineffective. Their usefulness is also limited by where they appear. The only time these posts will show up is in a knowledge graph, which typically indicates a branded search took place.

The chance they’d show up for a non-branded search is very limited, so they’re not much use to drive new organic traffic. If anything, they may steal traffic away from the GMB listings themselves, so be aware of that.

While my examples used blog posts, this is probably poor usage. These types of posts would be much better suited to location-specific events that someone searching for a particular location would want to know about.

It’s sort of like free display ads – I wouldn’t be surprised if Google eventually monetizes this with advertising, the way they added and monetized the local map pack with ads.

Questions? Comments? Tweet at me (@BerkleyBikes) or drop a comment here!

Bounce Rates: Are They Bad?


Let’s talk about bounce rates. The reason we’re talking about them is because they’re largely misunderstood and less scrupulous (less informed?) marketers than myself routinely make claims about lowering bounce rate…as if that’s universally a positive thing. It may be. Or it might not matter. Read on and I’ll explain why.

What Is Bounce Rate?

Before we really get into it, let’s recap what bounce rate is. Bounce rate is the percentage of visitors who have one single Google Analytics hit during their site session, before leaving. Very often, it is incorrectly stated that bounce rate is the percentage of people who view one page before leaving, but that’s not necessarily true. A pageview is a type of hit, but not all hits are pageviews (I’ll explain why in a few paragraphs).

Bounce rate is one of the metrics in the Google Analytics default report (the first screen you see when you login). I absolutely revile this report. I know its intentions are good, but I’ve seen far too many reports recapping these numbers and nothing else (to be fair, there was a time when a less-worldly version of myself created said reports).

The default report is especially worthless because it looks at bounce rate across the entire site and that’s an utterly stupid way to use that metric. Different types of pages or marketing channels will have vastly different bounce rates and you shouldn’t roll them into one site-wide number.

The red/green colored text leads you to believe that a lower bounce rate is good and a higher bounce rate is bad, but consider this: What if someone lands on a location page where they get directions or a phone number? In that scenario, they might be considered a bounce, but in reality, they completed an action that takes them one step closer to being a customer.

Or, what if the page is lead-gen oriented, contains a form and gives visitors all the information they came for, without the need to click on another page? That may also be considered a bounce, but if the visitor converts into a lead, who cares?

What if the page has a baking recipe on it, and visitors spend 20+ minutes with the page open while they use that recipe, then close the browser when they’re done? Possibly still a bounce.

I could go on, but I won’t because I’ve made my point – there are plenty of scenarios where bounces are not something to be concerned about.

When Is Bounce Rate Bad?

If your site makes money from ad revenue, bounce rate could certainly be an issue. Most ad-based sites rely on multiple pageviews per session so they can cycle more ads and increase the likelihood that one of them is relevant and gets clicked. Ever wonder why “22 Photos of Cats in Boxes” takes 24 pages to finish reading? Ad revenue, that’s why.

Similarly, if you have a high bounce rate on gateway pages that are supposed to funnel traffic to other pages on the site, that could be an issue. Even then, it’s highly dependent on the site/design/business model/industry.

In other words, yes, a high bounce can be a problem, but should be evaluated on a case-by-case basis and with plenty of scrutiny. Looking at a site-wide bounce rate is a complete waste of time, and an exercise in futility.

Interaction vs. Non-Interaction Hits

Earlier I said bounce rate is based on the number of hits and not pageviews. As noted, pageviews are a type of hit, but there are also many others including form submissions, click-to-call, click-to-get-directions, video plays, etc. These actions can be set up as interaction hits or non-interaction hits.

An interaction hit will affect the bounce rate. A person who views one page and fills out a contact form will count as two hits – not a bounce.

A non-interaction hit does not affect the bounce rate. You might want to track video views, but if they’re less of a priority than form fills, you can track them as non-interaction hits. A person who views one page and watches a video that’s configured as a non-interaction hit, will still count as a bounce.
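
As a rough sketch, if you’re sending events with analytics.js directly (rather than through Tag Manager), a non-interaction event looks something like this – the category, action and label values here are hypothetical:

ga('send', 'event', 'Videos', 'play', 'Homepage Hero Video', {'nonInteraction': true});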

The choice between interaction and non-interaction hits gives you the flexibility to adjust the way bounce rate is calculated, for better data. But that’s not the end of it by any means.

Multi-Touch, Multi-Channel Conversion Funnels

So a single pageview is OK as long as there’s also an interaction hit, but still bad if there isn’t, right? Well…no. That bounce visit could still serve an important role in the customer’s path to conversion. It’s common for marketers to analyze and report on data using Google Analytics’ Last Non-Direct Click attribution model (guilty as charged) and when that happens, it’s easy to over-emphasize bounce rate.

Let me back up again. Google Analytics lets you apply several different attribution models in order to give attribution (credit) to different channels based on their position in the conversion path. You can read more about it here, but the default attribution model is last non-direct click, where the last channel before the conversion receives attribution, unless that channel is direct.

We know a prospective customer may visit a website multiple times before converting. We also know that customer may arrive at the site via a number of different digital channels. The diagram below shows a three-touch conversion path (AdWords > SEO > Direct). While the initial touch was an AdWords ad click, organic will get attributed the conversion because it’s the last non-direct touchpoint.

But without careful analysis (and in part due to the last non-direct click attribution model), these three sessions are likely to be viewed from a siloed perspective, without making the connection they all played an important role.

Without looking at the full conversion path, this is going to look great for SEO and not so great for PPC efforts. An SEO-specific report is likely to ignore both the paid search and direct sessions that contributed to the conversion. A report specific to paid search may look at the ad spend as ineffective, since it didn’t convert in that session.

I realize including multiple channels may have you thoroughly confused, so let’s simplify it further and say all three visits were organic. This way there’s no question that organic should be attributed the conversion. Let’s also assign an arbitrary number of pageviews to each visit, as seen below.

There’s no question about which channel gets the conversion…but it doesn’t change the fact that if analysis is being done at the session-level, these three sessions will not be linked together.

The first visit will be considered a problem because it was a bounce, while the last visit will be celebrated because it resulted in the conversion. Realistically, the last visit may not have happened if it wasn’t preceded by the initial bounce. The middle visit is somewhere in between: not as good as a visit that converts, but at least it’s not a bounce.

It’s possible that all three of these visits played a critical role in the customer journey and discrediting any one of them could result in lost conversions. Similarly, spending time trying to lower bounce rates may be futile if those bounces play a bigger role in a longer path to conversion.

Takeaways

Bounce rate is incredibly complex and boiling it down to good or bad is very, very difficult. As an SEO consultant, I rarely ever focus on bounce rate because I understand this complexity. Bounce rate is not something I report on, and lowering it is never an objective of my campaigns.

SEO projects should be focused on driving relevant traffic that eventually converts into leads, sales and customers, not driving down bounce rates in order to achieve a perceived industry standard or arbitrary metric.

Questions? Comments? Tweet at me (@BerkleyBikes) or drop a comment here!

Google’s Mobile-First Algorithm: What Does It Mean?


Before you read any of this, understand that what’s in this post is not guaranteed to be fact. Much of this information is anecdotal based on phenomena I’ve encountered. In my defense, almost anything you read about this topic is somewhat anecdotal – none of us have a picture perfect idea of how Google’s internal processes work. I welcome and encourage you to leave questions and comments.

In November of 2016 Google’s Webmaster Blog announced the long term goal of moving toward a mobile-first search index. This could have a fundamental impact on search rankings, although it remains to be seen whether it’ll truly be impactful, or go the way of the first mobile-friendly algorithm update, which had minimal impact overall.

How It All Works

First, let’s do a basic recap of Google’s process for crawling and indexing web pages. While we don’t know the intimate details, we at least have a high-level understanding of how the process works.

Google’s web crawlers (bots) periodically crawl websites and index the content. They follow links to find additional pages and take into account on-page optimization like titles, H headings, body content, images, video, etc.

There are two versions of Googlebot – one for desktop and one for mobile. Similarly, there are two different indexes (also desktop & mobile). At present, the desktop index is the primary index to determine where a page will rank in search results. Traditionally, desktop has been the primary traffic source and was responsible for the majority of searches. Now Google is claiming the split has shifted in favor of mobile.

Exactly how Google rectifies differences between the two indices is not well known. For example, a page that provides a phenomenal desktop experience could easily provide a very poor mobile experience if it’s not responsive. When that’s the case, what does Google do? Does it index and rank the page in desktop search results, but not on mobile?

Doing so would provide a very inconsistent searching experience across devices. Imagine if you ran a search on your phone, replicated it later on desktop, and were presented with a completely different set of results. That would be confusing, right?

Mobile Penalizes Desktop

It has been my experience that even though the mobile index is not the primary, it does directly influence desktop rankings. Several months after going responsive, an education client I worked with saw big increases in desktop rankings after the second mobile algorithm update in 2016 (in addition to big increases on mobile).

This backs the theory that, in an attempt to present a consistent user experience, Google links the two indices, and that mobile, the “secondary” index, can have a significant impact on desktop. In effect, we believed their non-responsive site was limited on desktop because their mobile experience was poor. When the algorithm rolled out, we saw big increases on mobile, and in order to preserve the cross-device user experience, that necessitated big increases on desktop too.

Influential Pages Can Overcome Algorithms

It has also been my experience that a poor mobile experience does not automatically disqualify a page from ranking well on mobile OR desktop, if the page is sufficiently relevant or influential. Large swathes of IRS.gov are not mobile responsive but continue to rank well on both desktop and mobile.

They have to – it’s a critical government website that millions of people rely on. Despite the poor user experience, Google can’t penalize these pages too much. It would create serious issues if they did and searchers couldn’t find them. So Google continues to rank them, which may be a function of how frequently they’re cited on other sites (how many backlinks they have).

Making Pages Mobile-Friendly

How does Google rectify the fact that it’s ranking non-responsive pages? In September  2016, Google quietly updated Chrome with a feature that gave users the option to “make a page mobile-friendly.” Perhaps recognizing that the user experience of these pages was lackluster, Google offered users of its browser the option to change that.

An un-responsive IRS page. Note the “Make page mobile-friendly” CTA:

After clicking “Make page mobile-friendly:”

It’s a band-aid fix to an underlying problem, but it does work well on the client side. As a site owner, I might not be so convinced, depending on how Chrome renders the site. If important CTAs or contact forms are relocated in a manner that’s not optimal, it could impact conversion rates.

Moving to a Mobile-First Index

It’s clear that Google has been incrementally pushing sites to provide a better experience. The mobile-first index is just the next step in a long series of steps to provide a better user experience.

What’s the impact going to look like?

It’s hard to say. This is a huge shift – it may be rolled out in phases like the two mobile-friendly algorithm updates were. The first phase may be less impactful to test the waters.

Google’s John Mueller did state that mobile pages should have all the same features as desktop pages, which makes a strong case for a responsive site rather than dedicated m. sites or dynamic serving sites, but it’s impossible to know for sure.

Despite the initial Fall 2016 announcement, Gary Illyes has indicated the timeframe for release is now sometime in 2018. So there is time to make changes, not that Google has provided any formal criteria for us to follow.

AMP Pages

Perhaps most perplexing is that Google has simultaneously been pushing AMP pages over the past year – AMP pages are a lightweight HTML framework that results in lightning fast load times, at the expense of more advanced functionality (CSS and Javascript execution have significant constraints).

Will Google bias its own products and let AMP pages pass the test? Or will reduced functionality preclude them from ranking as well as responsive desktop pages? I’m curious to see how Google will accommodate these pages.

Takeaways

We know very little, so it’s difficult to determine how this update will affect dynamic or dedicated m. sites, which seem to be the most at-risk. Responsive sites could see a negative impact, if mobile functionality or navigation isn’t consistent with desktop. Despite being the secondary index, desktop may play a larger role than mobile did. It would be smart for Google to weight the secondary desktop index heavier than it did the secondary mobile index, at least when the update initially rolls out.

Questions? Comments? Tweet at me (@BerkleyBikes) or drop a comment here!