How to use XPath expressions to enhance your SEO and content strategy

30-second summary:

As Google increasingly favors sites with content that exudes expertise, authority, and trustworthiness (E-A-T), it's critical that SEOs and marketers produce content that's not just well written, but that also demonstrates expertise.
How do you understand which topics and concerns matter most to your customer base?
Can you use Q&As to inform content strategies?
XPath notation can be your treasure trove.
Catalyst's Organic Search Manager, Brad McCourt, shares a detailed guide on using XPath notation and your favorite crawler to quickly obtain those Q&As in an easy, digestible format.

As Google increasingly favors sites with content that exudes expertise, authority, and trustworthiness (E-A-T), it's critical that SEOs and marketers produce content that's not just well written, but that also demonstrates expertise. One way to demonstrate expertise on a subject or product is to answer common customer questions directly in your content.

But how do you identify what those questions are? How do you understand which topics and concerns matter most?

The good news is that they're hiding in plain sight. Chances are, your customers have been shouting at the top of their keyboards in the Q&A sections of websites like Amazon.

These sections are a treasure trove of (mostly) serious questions that real customers have about the products you're selling.

How do you use these Q&As to inform content strategies? XPath notation is your answer.

You can use XPath notation and your favorite crawler to quickly obtain the Q&As in an easy, digestible format. XPath spares you from clicking through endless screens of questions by automating the collection of important insights for your content strategy.

What is XPath?

XML Path Language (XPath) is a query language developed by the W3C to navigate XML documents and select specified nodes of data.

The notation XPath uses is called an "expression." Using these expressions, you can effectively pull any data you need from a website, as long as there's a consistent structure between webpages.

This means you can use this language to pull any publicly accessible data in the source code, including questions from multiple Amazon Q&A pages.

This article isn't meant to be a comprehensive tutorial on XPath; for that, there are plenty of resources from the W3C. However, XPath is simple enough to learn with only a basic understanding of the structure of XML and HTML documents. This is what makes it such a powerful tool for SEOs regardless of coding prowess.
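To make the idea concrete, here is a minimal sketch of an XPath-style query using Python's standard library (`xml.etree.ElementTree` supports a limited subset of XPath, and the sample markup below is invented purely for illustration):

```python
import xml.etree.ElementTree as ET

# Invented sample document for illustration only.
catalog = """<catalog>
  <book genre="fiction"><title>Book A</title></book>
  <book genre="reference"><title>Book B</title></book>
  <book genre="fiction"><title>Book C</title></book>
</catalog>"""

root = ET.fromstring(catalog)
# ".//book[@genre='fiction']/title" walks the tree and selects the
# <title> node of every <book> whose genre attribute is "fiction".
titles = [t.text for t in root.findall(".//book[@genre='fiction']/title")]
print(titles)
```

The same idea scales up: as long as every page wraps the data you want in the same tag and attribute structure, one expression pulls it from all of them.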

Let's walk through an example to show you how.

Using XPath to pull customer questions from Amazon

Prerequisite: Pick your web crawler

While most of the big names in web crawling – Botify, DeepCrawl, OnCrawl – offer the ability to extract data from the source code, I will be using ScreamingFrog in the example below.

ScreamingFrog is by far the most cost-effective option, allowing you to crawl up to 500 URLs without buying a license. For larger projects you can buy a license, which will let you crawl as many URLs as your RAM can handle.

Step one: Collect the URLs to crawl

For our example, let's pretend we're doing research on the topics we should include in our product pages and listings for microspikes. For those unaware, microspikes are an accessory for your boots or shoes. They give you extra grip in wintry conditions, so they're particularly popular among cold-weather hikers and runners.

Example for finding details using Amazon


Here we have a list of 13 question-and-answer pages for the top microspike listings on Amazon. Unfortunately, there's some manual work required to create the list.

List of questions - XPath and creating content

The simplest way is to search for the topic (that is, microspikes) and pull links to the top products listed. If you have each product's ASIN (Amazon Standard Identification Number) handy, you can also generate the URLs using the above format, swapping out the ASIN.
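If you already have a list of ASINs, generating the URLs can be scripted. A sketch, using placeholder ASINs and assuming Amazon's historical /ask/questions/asin/ URL pattern (which may change over time):

```python
# Placeholder ASINs for illustration; the URL pattern below is an
# assumption based on Amazon's historical Q&A page structure.
asins = ["B01MRRCJ8K", "B074KJXJ2V", "B08ABC1234"]

qa_urls = [f"https://www.amazon.com/ask/questions/asin/{asin}" for asin in asins]
for url in qa_urls:
    print(url)
```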

Step two: Determine the XPath

From here, we need to determine the XPath.

In order to figure out the correct XPath notation to use to pull in the desired text, we have two main options:

      View the source code and determine the XPath manually
      View the rendered source code and copy the XPath directly from Chrome's Inspect Element tool

Copy XPath

You'll find that the expression needed to locate all questions on an Amazon Q&A page is:

//span[@class="a-declarative"]

Here is the XPath notation broken down:

      // is used to locate all instances of the following expression.
span is the specific tag we're trying to locate. //span will locate every single span tag in the source code. There are over 300 of these, so we'll need to be more specific.
@class specifies an attribute: //span[@class] will ensure that only span tags with an assigned class attribute are located.
@class="a-declarative" dictates that //span[@class="a-declarative"] only locates span tags where the class attribute is set to "a-declarative" – that is, the tags wrapping each question.

There is an extra step required to return the inner text of each located tag, but ScreamingFrog does the heavy lifting for us.
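If you want to sanity-check the expression outside of a crawler, you could evaluate it against a simplified stand-in for the page source. A sketch using Python's standard library (real Amazon markup is far more complex, and `ElementTree` needs a leading `.` on the path):

```python
import xml.etree.ElementTree as ET

# Simplified, invented stand-in for an Amazon Q&A page; real markup differs.
page = """<div>
  <span class="a-declarative">Do these run true to size?</span>
  <span class="navigation">Next page</span>
  <span class="a-declarative">Can I wear them on pavement?</span>
</div>"""

root = ET.fromstring(page)
# Reading .text on each located tag mirrors the crawler option to
# extract the text found within the located element.
questions = [s.text for s in root.findall(".//span[@class='a-declarative']")]
print(questions)
```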

It's important to note that this will only work for Amazon Q&A pages. If you wanted to pull questions from, say, Quora, TripAdvisor, or any other website, the expression would need to be adjusted to locate the specific entity you wish to collect on a crawl.

Step three: Configure your crawler

Once you have this all set, you can then go into ScreamingFrog:

Configuration -> Custom -> Extraction

Configure your crawler

This will take you to the Custom Extraction screen.

Custom extraction screen

This is where you can:

Give the extraction a name to make it easier to find after the crawl, especially if you're extracting more than one entity. ScreamingFrog allows you to extract multiple entities during a single crawl.
Choose the extraction method. In this article, it's all about XPath, but you also have the option of extracting data via CSSPath or regex notation.
Place the desired XPath expression in the "Enter XPath" field. ScreamingFrog will even check your syntax for you, providing a green checkmark if everything checks out.
Select what you want extracted, be it the full HTML element or the HTML found within the located tag. For our example, we want to extract the text in between any span tags with a class attribute set to "a-declarative", so we select "Extract Text."

We can then click OK.

Step four: Crawl the desired URLs

Now it's time to crawl our list of Amazon Q&A pages for microspikes.

First, we'll need to switch the Mode in ScreamingFrog from "Spider" to "List."

Then, we can either add our set of URLs manually or upload them from Excel or another supported format.

After we confirm the list, ScreamingFrog will crawl each URL we provided, extracting the text between all tags containing the class attribute set to "a-declarative."

To see the data collected, you just need to select "Custom Extraction" in ScreamingFrog.

Run the desired URLs

At first glance, the output might not look that exciting.

However, that is only because a lot of unneeded whitespace is included with the data, so you might see some columns that appear blank if they aren't expanded to fully display their contents.

Once you copy and paste the data into Excel or your spreadsheet program of choice, you can finally see the data that has been extracted. After some clean-up, you get the final result:
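The clean-up step can also be scripted instead of done by hand. A sketch, assuming the extraction column was exported as a plain list of strings (the sample rows are invented, but stray whitespace, blanks, and duplicates are typical scraping artifacts):

```python
# Invented sample of raw extraction output.
raw = [
    "  Will these fit a size 11 boot?  ",
    "",
    "Will these fit a size 11 boot?",
    "Do they come with a carry bag?\n",
]

cleaned, seen = [], set()
for cell in raw:
    text = " ".join(cell.split())  # collapse newlines and extra spaces
    if text and text.lower() not in seen:  # drop blanks and duplicates
        seen.add(text.lower())
        cleaned.append(text)

print(cleaned)
```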

Final list of questions created using XPath

The result is 118 questions that real customers have asked about microspikes, in an easily accessible format. With this data at your fingertips, you're now ready to incorporate this research into your content strategy.

Content strategies

Before diving into content strategies, a quick word to the wise: you can't just crawl, scrape, and publish content from another website, even if it is publicly accessible.

First, that would be plagiarism, and you should expect to be hit with a DMCA notice. Second, you're not fooling Google. Google knows the original source of the content, and it is extremely unlikely your content is going to rank well – defeating the purpose of this whole strategy.

Instead, this data can be used to inform your strategy and help you produce high-quality, unique content that users are searching for.

Now, how do you get started with your analysis?

I recommend first categorizing the questions. For our example, there were many questions about:

Sizing: What size microspikes are needed for specific shoe/boot sizes?
Proper use: Can microspikes be used in stores, on slippery roofs, while fishing, mowing lawns, or for walking on plaster?
Features: Are they adjustable, what type of material, do they come with a carrying case?
Concerns: Are they comfortable, do they damage your shoes, do they damage the type of flooring/surface you're on, how durable are they?
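A first pass at this categorization can even be automated with simple keyword matching. A sketch; the keyword lists are assumptions chosen for the microspikes example, and anything unmatched falls into an "Other" bucket for manual review:

```python
# Keyword lists are illustrative assumptions, not an exhaustive taxonomy.
CATEGORIES = {
    "Sizing": ["size", "fit"],
    "Proper use": ["roof", "fishing", "lawn", "plaster"],
    "Features": ["adjustable", "material", "case"],
    "Concerns": ["comfortable", "damage", "durab"],
}

def categorize(question: str) -> str:
    """Return the first category whose keywords appear in the question."""
    q = question.lower()
    for category, keywords in CATEGORIES.items():
        if any(k in q for k in keywords):
            return category
    return "Other"

print(categorize("What size should I order for a men's 10.5 boot?"))
print(categorize("Will these damage hardwood floors?"))
```

For a few hundred questions, even this crude approach gets you a rough topic breakdown in seconds; the "Other" bucket tells you which categories you are still missing.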

This is invaluable insight into the potential concerns customers may have before purchasing microspikes.

From here, you can use this information to:

1. Enhance existing content on your product and category pages

Incorporate the topics into the product or category descriptions, pre-emptively answering questions shoppers may have.

For our example, we'd want to make it abundantly clear how sizing works – including a sizing chart and specifically mentioning the types of footwear the product may or may not be compatible with.

2. Build out a short on-page FAQ section featuring original content that answers commonly asked questions

Make sure to implement FAQPage markup for a better chance of appearing in features like People Also Ask sections, which are increasingly taking up real estate in the search results.

For our example, we can answer commonly asked questions about comfort, damage to footwear, durability, and adjustability. We could also address whether the product comes with a carrying case and how best to store the product for travel.

3. Produce a product guide, incorporating answers to popular questions surrounding a product or category

Another strategy is to produce an extensive one-stop product guide showcasing specific use cases, sizing, limitations, and features. For our example, we could create specific content for each use case, like hiking, running in icy conditions, and more.

Even better, incorporate videos, photos, charts, and featured products with a clear path to purchase.

Using this approach, your end product will be content that shows expertise and authority on a subject and, most importantly, addresses customer concerns and questions before they even think to ask. This will help prevent your customers from having to do additional research or contact customer service. Thanks to your informative and helpful content, they will be more ready to make a purchase.

Furthermore, this approach also has the potential to lower product return rates. Informed customers are less likely to purchase the wrong product based on assumed or incomplete information.


Amazon is just the tip of the iceberg here. You can realistically apply this strategy to any website that has publicly accessible data to extract, be that questions from Quora about a product category, TripAdvisor reviews about hotels, music venues, and attractions, or even discussions on Reddit.

The more informed you are about what your customers expect when visiting your website, the better you can serve those expectations, encourage purchases, decrease bounces, and improve organic search performance.

Brad McCourt is an Organic Search Manager at Catalyst's Boston office.
