The work of a search engine optimization (SEO) consultant revolves around one central theme:
Data.
Especially keyword data.
We collect it from a variety of second- and third-party sources, perhaps even via self-made tracking tools, then crunch the numbers to deliver valuable insights to our bosses, clients, or prospects.
However, only running a few tools and employing some analytical magic is not going to cut it.
We also need to be thoughtful about how we interpret data from keyword tools and deal with any inaccuracies or inconsistencies.
Just like any software program, each keyword tool has a characteristic mechanism in place for collecting, aggregating, and manipulating data.
These inner workings also determine how each tool handles queries and presents its output keyword data.
An essential part of a marketer’s job function is to validate whether the data values stored for these keywords are represented in a consistent and unambiguous form.
In other words: is the keyword data I am working with accurate?
The simple answer:
No.
Comparing the values that different tool providers report for the same set of keywords already reveals large inconsistencies, not only in the values themselves but also in whether and how the data is presented at all.
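To make this kind of comparison concrete, here is a minimal sketch of how you might quantify the disagreement between tools. The keyword set, tool names, and volume figures below are hypothetical placeholders, not measured data from this study:

```python
# Hypothetical monthly search-volume estimates for the same keywords,
# as three different (unnamed) tools might report them.
volumes = {
    "running shoes": {"tool_a": 110_000, "tool_b": 74_000, "tool_c": 135_000},
    "seo audit":     {"tool_a": 2_900,   "tool_b": 1_600,  "tool_c": 4_400},
}

def relative_spread(estimates):
    """Spread of the estimates relative to their mean: 0.0 means all tools agree."""
    values = list(estimates.values())
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean

for keyword, estimates in volumes.items():
    print(f"{keyword}: spread = {relative_spread(estimates):.0%}")
```

A spread of 50% or more on the same keyword, which is common in practice, is a quick signal that at least one tool's estimate cannot be taken at face value.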
This study, by my company, OAK, attempts to find clarity by exploring data accuracy and reliability with respect to second- and third-party keyword tool data.
Specifically, this study examines the following topics:
The primary purpose of this study is to grow awareness about the complexity surrounding keyword data values and tool providers’ data collecting and processing mechanisms.
Let us begin at the beginning: Google Search Console.
It is a second-party tool from Google that collects behavioral data for a single domain or entity and, after manipulation, injects the data into the front-end interface.
The mere fact that Google collects and processes the data might make you wonder: how close to reality are the projected data values?
This question poses an immediate challenge: Search Console data is not 100% validatable.
Luckily, Google is transparent to a point and provides various explanations for why your data values might not reflect reality or add up as you would expect.
A few of them are:
Unfortunately, only the G-giant has access to the exact data values, which means verifying the accuracy of Search Console data is a troublesome process.
The reliability of keyword data increases, however, with third-party tools.
These are tools like SEMrush, Ahrefs, Keywordtool.io, Searchvolume.io, and many others.
To find answers, this study explores the mechanics these keyword tools apply.
Unfortunately, the companies running these tools disclose little to no information about how they collect, aggregate, or manipulate their data.
It seems fair.
A chef doesn’t just give away her or his world-famous recipe. Hence we attempt to generate insights with the help of the following approaches:
In general, there are five kinds of resources through which keyword tools accumulate their data:
Keyword data is gathered directly from Google’s keyword database through the Google Ads API.
As is the case with Search Console, Google Ads first manipulates the data before injecting it into the database.
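One visible form of this manipulation is that Google Ads reports averaged and rounded volumes rather than raw counts; in the Keyword Planner UI, volumes often appear as coarse ranges such as "1K-10K". The sketch below illustrates that bucketing idea. The boundary values are my own assumptions for illustration, not Google's actual thresholds:

```python
# Illustrative range-style bucketing, loosely modeled on what the Keyword
# Planner UI displays. Bucket boundaries are assumptions, not Google's.
BUCKETS = [(0, "0-10"), (10, "10-100"), (100, "100-1K"),
           (1_000, "1K-10K"), (10_000, "10K-100K"), (100_000, "100K-1M")]

def bucketize(raw_volume):
    """Map a raw monthly volume onto the coarsest matching range label."""
    label = BUCKETS[0][1]
    for lower, name in BUCKETS:
        if raw_volume >= lower:
            label = name
    return label

print(bucketize(7_400))  # → "1K-10K"
```

The practical consequence: two keywords with genuinely different demand (say, 1,200 vs. 9,800 searches) can land in the same bucket, and any third-party tool built on this feed inherits that loss of precision.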
Clickstream is nothing more than data derived from consumers’ online surfing behavior.
Aggregators gather this data in a variety of ways.
Large aggregators that were active until recently include Jumpshot and Hitwise.
Where do they get their data?
The aggregators then sell the data to keyword tools such as Ahrefs, SEMrush, and Moz, among others.
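The general idea behind clickstream-based volume estimates is extrapolation: counts observed in a monitored panel of users are scaled up to the whole search population. Here is a simplified sketch; the panel size, population figure, and keyword counts are all hypothetical assumptions:

```python
# Simplified clickstream extrapolation. All numbers are hypothetical.
panel_searches = {"running shoes": 220, "seo audit": 6}  # searches seen in the panel
panel_size = 2_000_000       # assumed number of monitored users
population = 250_000_000     # assumed total search population

scale = population / panel_size  # 125x extrapolation factor

estimates = {kw: round(count * scale) for kw, count in panel_searches.items()}
print(estimates)
```

Note how fragile the low end is: the "seo audit" estimate rests on just six observed searches, so a single panelist more or fewer shifts the projection by over a hundred searches. This sampling noise is one structural reason tools built on clickstream data disagree on long-tail keywords.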
#seo #tools #dataanalysis