Finding Pages With Embedded Wistia Videos

By August 19, 2017Technical SEO, Video

Wistia is a great platform for hosting videos on your site with tons of functionality including the ability to embed videos on pages and optimize them using built-in calls-to-action and pop-ups.

Recently I encountered a scenario where I wanted to find every website page that had a Wistia video on it. Going into Wistia’s back end revealed that the client had ~200 videos, but I had no idea where they were actually placed on the site, and wanted to ensure they were being used to full capacity.

With YouTube, you can simply run a Screaming Frog crawl and do a custom extraction to pull out all the embed URLs. From there you can determine which video is embedded based on that URL. However, the way Wistia embeds videos is not conducive to identifying which video is where, based on an embed URL. I couldn’t find any distinguishing characteristics that would help me identify which video was which.

How can such an advanced video platform be so incredibly difficult?

That’s mostly because Wistia relies heavily on Javascript. As Mike King notes in his article The Technical SEO Renaissance, right clicking a page and selecting “view page source” won’t work because you’re not looking at a computed Document Object Model. In layman’s terms, you’re looking at the page before it’s processed by the browser and content rendered via Javascript won’t show up.

Using Inspect Element is the only way to really see what Wistia content is on the page. Doing that will show you much more information, including the fact that Wistia automatically adds and embeds video Schema when you embed a video. This is awesome and saves a ton of work over manually adding Schema like you have to do with YouTube videos.

The video Schema contains critical fields like the video’s name and description. These are unique identifying factors that we can use to determine which video is placed where, but how can it be done at scale when we don’t even know which pages have videos and which don’t?

Finding Wistia Schema With Screaming Frog

Screaming Frog is one answer. Screaming Frog doesn’t crawl Javascript by default, but as of July 2016, DOES have the capability to do so if you configure it (you’ll need the paid version of the tool).

Go into Configuration > Spider > Rendering and select Javascript instead of Old AJAX Crawling Scheme. You can also uncheck the box that says Enable Rendered Page Screenshots, as this will create a TON of image files and take unnecessarily long to complete.

Setting Up a Custom Extraction

Next you will need to setup a Custom Extraction which can be done by going to Configuration > Custom > Extraction. I’ve named mine Wistia Schema (not required) and set the extraction type to regex, then added the following regular expression:

This will ensure you grab the entire block of Schema, which can be manipulated in Excel later to separate different fields into individual columns, etc.

Then set Screaming Frog to list mode (Mode > List) and test the crawl with a page that you know has a Wistia video on it. By going into the Custom Extraction report, you should see your Schema appear in the Extraction column. If not, go back and make sure you’ve configured Screaming Frog correctly.

Screaming Frog Memory and Crawl Limits

The only flaw in this plan is that Screaming Frog needs a TON of memory to crawl pages with Javascript. Close any additional programs that you don’t need open so that you can reduce the overall memory your computer uses and dedicate more of it to Screaming Frog. With large sites, you may run out of memory and Screaming Frog may crash.

Takeaways

  • Wistia uses Javascript liberally.
  • Schema is embedded automatically, using Javascript.
  • Schema can be crawled and extracted with Screaming Frog, but it’s a memory hog so larger sites might be a no-go.

Questions? Tweet at me: @BerkleyBikes or comment here!

Chris Berkley

About Chris Berkley

Chris is a digital marketing consultant specializing in SEO and Analytics across industries including healthcare, education, finance and others.

Leave a Reply