How ImportXML can be used in tracking performance of Authors in your blog
In Google Analytics with the help of custom variables you can track performance of the content written by the authors/contributors in your company website’s blog. Using the new Universal Analytics, then try custom dimensions.
Why should I track the performance of authors? To know how each author is performing, identify the person who is sending the most traffic (or conversions or time on website or whatever you define as performance) to the website, and to understand how or why he is more successful than the others.
Due to some reasons, if you aren’t able to do it by the methods above, then I have a very simple solution. In Google Analytics, filter your Landing Pages report to get data on just the blogs, download it and add a new column for “Author”. Go to each blog post and look at the “by” line. Fill the author column! Now that’s the simplest solution. But it will take days and weeks of job, and if you are a website like Moz, then probably an year. So let’s think of a way to automate the same process.
The magic is with the ImportXML function in Google Drive. I promise this function is going to be a huge time saver for lots of your reportings.
The function format is =IMPORTXML(“URL”,”Xpath_Query”). Well, at first I thought Xpath needs some serious coding knowledge. But it’s the opposite, if you know basic HTML then you are passed in Xpath!
Here are a few examples:
=IMPORTXML(“https://www.google.com", “//a”) — This will return the anchor text of all the hyperlinks in Google’s homepage. Note: Copy pasting won’t work. You need to write it in Google Spreadsheet.
Now let’s say I need to get the list of all the urls on the homepage of Medium. We can use
=IMPORTXML(“https://medium.com", “//a/@href”). This will pull the hyperlinks of all the anchor text in the homepage.
Back to the main topic of the post. How do I find out authors for each blog post in my website?
Let’s take a url from Medium. Every blog post in Medium has a byline with the author name and his profile link along with the publishing date.
If you check this with inspect element in Chrome you can see that the author name is inside the “span” class — “avatar-span avatar-span—iconWithText” which again is inside the “a” class. So the formula to get the author name is,
=IMPORTXML(“https://medium.com/@alexandrakmeyer/nasa-identifies-a-mars-rover-archiving-bug-with-keen-io-9a2a34fa8db3", “//span[@class=’avatar-span avatar-span—iconWithText’]”)
which means to pull what is inside the specified span class. Here it is the author name. You can access the ImportXML Sample Worksheet here.
=IMPORTXML(“https://medium.com/@alexandrakmeyer/nasa-identifies-a-mars-rover-archiving-bug-with-keen-io-9a2a34fa8db3", “//a[@class=‘avatar avatar — iconWithText avatar — inline link link — secondary’]”)
Since the same info is available inside the “a” class — ‘avatar avatar — iconWithText avatar — inline link link — secondary’ we can get the author name using the formula above as well.
To get the link of the author, try
=IMPORTXML(“https://medium.com/@alexandrakmeyer/nasa-identifies-a-mars-rover-archiving-bug-with-keen-io-9a2a34fa8db3", “//a[@class=’avatar avatar—iconWithText avatar—inline link link—secondary’]/@href”)
Try this for all the blog posts written in your company blog, there is no limit in the new Google drive for the importxml function.
The video below shows how I applied ImportXML to around 15 blog posts in Medium.
Now you have the blog post analytics data and the author name of each blog to measure each writer’s performance.
How do you measure the performance of the content written by the authors/contributors in your company website’s blog. Please let me know by using the comments section.