Wistia is a great platform for hosting videos on your site with tons of functionality including the ability to embed videos on pages and optimize them using built-in calls-to-action and pop-ups.
Recently I encountered a scenario where I wanted to find every website page that had a Wistia video on it. Going into Wistia’s back end revealed that the client had ~200 videos, but I had no idea where they were actually placed on the site, and wanted to ensure they were being used to full capacity.
With YouTube, you can simply run a Screaming Frog crawl and do a custom extraction to pull out all the embed URLs. From there you can determine which video is embedded based on that URL. However, the way Wistia embeds videos is not conducive to identifying which video is where, based on an embed URL. I couldn’t find any distinguishing characteristics that would help me identify which video was which.
How can such an advanced video platform be so incredibly difficult?
Using Inspect Element is the only way to really see what Wistia content is on the page. Doing that will show you much more information, including the fact that Wistia automatically adds and embeds video Schema when you embed a video. This is awesome and saves a ton of work over manually adding Schema like you have to do with YouTube videos.
The video Schema contains critical fields like the video’s name and description. These are unique identifying factors that we can use to determine which video is placed where, but how can it be done at scale when we don’t even know which pages have videos and which don’t?
Finding Wistia Schema With Screaming Frog
Setting Up a Custom Extraction
Next you will need to setup a Custom Extraction which can be done by going to Configuration > Custom > Extraction. I’ve named mine Wistia Schema (not required) and set the extraction type to regex, then added the following regular expression:
This will ensure you grab the entire block of Schema, which can be manipulated in Excel later to separate different fields into individual columns, etc.
Then set Screaming Frog to list mode (Mode > List) and test the crawl with a page that you know has a Wistia video on it. By going into the Custom Extraction report, you should see your Schema appear in the Extraction column. If not, go back and make sure you’ve configured Screaming Frog correctly.
Screaming Frog Memory and Crawl Limits
- Schema can be crawled and extracted with Screaming Frog, but it’s a memory hog so larger sites might be a no-go.
Questions? Tweet at me: @BerkleyBikes or comment here!