
Google’s Search Quality Evaluators Guidelines Cliff Notes + Audiobook


Search Quality Evaluators Guidelines

Google gives all of the criteria that it rates sites on when determining website quality & how it will affect rankings. The only problem? The document is really long – 164 pages long. You can skim it, but to get the meat and potatoes I’ve put together this Google’s Search Quality Evaluators Guidelines Cliff Notes + Audio Version, for those who, like me, wish they could get the information into their brain in as little time as possible.

Download the (digitally produced) audio version: Google’s Search Quality Evaluators Guidelines Audio Version

Section 1 – Introduction to Page Quality Rating

Section 1 just goes through the general approach for search quality evaluators – that they should be representative of the locale they are searching and rating for, and should understand some basics about the internet and browsers before getting started.

Frodo from Lord of the Rings “All right, then, keep your secrets” SEO meme – when Google shares the exact criteria it uses to rank page quality, but the PDF is 164 pages long

Section 2 – Important Definitions

Section 2 starts by helping determine why a particular page was created. “Most pages are created to be helpful for users, thus having a beneficial purpose. Some pages are created merely to make money, with little or no effort to help users. Some pages are even created to cause harm to users. The first step in understanding a page is figuring out its purpose.”

  • The PQ Rating is what search quality raters give to pages – based on how well each page serves its purpose.
  • First it says to understand the purpose – then to rate the page on how well it accomplishes that.
  • If the page doesn’t accomplish that purpose, and instead harms or deceives users, or makes money without helping users – it should get a low PQ rating.

Sharing information about a product, entertaining, informing on topics, selling, allowing users to upload content, etc. – these can all be examples of a site’s beneficial purpose.

A. “Some types of pages could potentially impact the future happiness, health, financial stability, or safety of users. We call such pages “Your Money or Your Life” pages, or YMYL.”

Legal, financial, health, and government pages – and anything else strongly affecting people’s health or well-being – all fall under this category, and raters should hold them to much stronger quality standards.

B. The next part shows how to determine which part of a web page is the ‘Main Content’ or MC, which is supplementary content or SC, and which parts of the page are ads.

C. The document also goes over determining the purpose of the website overall – and looking for independent sources to see what they have to say about the site, including independent rating sites and professional societies. It asks that independent research be done for every site, big or small: for big ones, perhaps looking for journalistic awards or what trade organizations have to say; for others, looking for expert sources and outside accounts.

D. It goes over finding the homepage, and notes that certain types of sites have more than one homepage.

E. Sites should be very clear about who created the site and why – through “About Us” and “Contact Us” pages. For shopping sites, it says to check out their privacy and return policies, and customer service pages.

F. Look through customer reviews on Yelp, the BBB, and Amazon – and check for more information on Wikipedia and bigger news sites like the New York Times. You’re looking for any evidence of malfeasance.

One way it suggests to search for reputation information is to use advanced search queries to separate out information that the company has put out itself, and focus purely on the third-party examples:

  • [ibm -site:ibm.com]: A search for IBM that excludes pages on ibm.com.
  • [“ibm.com” -site:ibm.com]: A search for “ibm.com” that excludes pages on ibm.com.
  • [ibm reviews -site:ibm.com]: A search for reviews of IBM that excludes pages on ibm.com.
  • [“ibm.com” reviews -site:ibm.com]: A search for reviews of “ibm.com” that excludes pages on ibm.com.
  • For content creators, try searching for their name or alias.
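For illustration (this is my own sketch, not something in the guidelines), here’s how you might generate those reputation-research queries for any company – the `reputation_queries` helper and its structure are hypothetical:

```python
# Minimal sketch: build the reputation-research queries shown above for any
# brand/domain pair. The helper name and structure are my own, not Google's.
def reputation_queries(name: str, domain: str) -> list[str]:
    """Return search queries that exclude the company's own pages."""
    return [
        f"{name} -site:{domain}",              # brand mentions elsewhere
        f'"{domain}" -site:{domain}',          # domain mentions elsewhere
        f"{name} reviews -site:{domain}",      # third-party reviews of the brand
        f'"{domain}" reviews -site:{domain}',  # third-party reviews of the domain
    ]

print(reputation_queries("ibm", "ibm.com"))
# ['ibm -site:ibm.com', '"ibm.com" -site:ibm.com', ...]
```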

Section 3 – Overall Page Quality

A. The Search Quality Evaluators Guidelines PDF then invites evaluators to rate the page with a slider. It appears, according to the PDF, that Google uses all kinds of subcontractors and teams to get this work done, so it wants consistency in criteria and general context across local markets and searches.

“On Page Quality rating tasks, you will use the Page Quality sliding scale (slider) to assign the overall PQ rating. The slider looks like this:”

Google search evaluators slider + example process

  1. Websites or pages without any beneficial purpose, including pages that are created with no attempt to help users, or pages that potentially spread hate, cause harm, or misinform or deceive users, should receive the Lowest rating. No further assessment is necessary.
  2. Otherwise, the PQ rating is based on how well the page achieves its purpose using the criteria outlined in the following sections on Lowest, Low, Medium, High, and Highest quality pages.
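To make that two-step flow concrete, here’s a minimal sketch of the decision logic as I read it – the `Page` fields and the purpose-achievement score are my own illustration, not Google’s implementation:

```python
from dataclasses import dataclass

@dataclass
class Page:
    has_beneficial_purpose: bool
    is_harmful_or_deceptive: bool
    purpose_achievement: float  # 0.0-1.0: how well the page serves its purpose

def assign_pq_rating(page: Page) -> str:
    # Step 1: no beneficial purpose, or harmful/deceptive -> Lowest, stop here.
    if not page.has_beneficial_purpose or page.is_harmful_or_deceptive:
        return "Lowest"
    # Step 2: otherwise, map purpose achievement onto the five-point PQ scale.
    scale = ["Lowest", "Low", "Medium", "High", "Highest"]
    return scale[min(4, int(page.purpose_achievement * 5))]

print(assign_pq_rating(Page(True, False, 0.9)))  # -> "Highest"
```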

B. Page Quality Rating Recap – where does the site land on these factors? Purpose; E-A-T (expertise, authoritativeness, and trustworthiness); quality and amount of main content (“MC”); website information and who is responsible for the MC; website reputation; and the reputation of whoever is responsible for the MC.

C. The guidelines then go into E-A-T more thoroughly – why expertise, authoritativeness, and trustworthiness are important for everything from medical, news, science, financial, and legal content to even remodeling and hobbies – because bad advice in these areas can strongly and negatively affect people’s lives.

D. What doesn’t require formal E-A-T according to Google’s search quality team? Detailed reviews of products, restaurants, movies, and life experiences. No penalty for you if you don’t have formal expertise in one of these areas. Goodie. Thanks, Google.

E. YMYL or ‘Your Money or Your Life’ topics can be shared about in a personal way – whether you’re sharing a family experience or giving an everyday person’s account. Raters are encouraged to think about the topic of the page, and whether an average person could give that kind of account accurately.

Section 4 – What constitutes a ‘high quality page’?

A. Here are some examples of things that constitute a high quality page according to Google’s search quality evaluator guidelines:

  • High level of expertise, authoritativeness, and trustworthiness (E-A-T)
  • A satisfying amount of high quality MC (Main Content), including a descriptive or helpful title. High quality MC takes a significant amount of at least one of the following: time, effort, expertise, and talent/skill. Use the functionality on the page: does the calculator work, does the video play, does the game work, etc.
  • Satisfying website information and/or information about who is responsible for the website. If the page is primarily for shopping or includes financial transactions, then it should have satisfying customer service information.
  • Positive website reputation for a website that is responsible for the MC on the page. Positive reputation of the creator of the MC, if different from that of the website.

B. Examples of high quality pages include:

  • A news article from a site that’s won Pulitzer Prizes
  • A news site that’s won 10 Pulitzer Prizes
  • An article written by the largest newspaper in the state of MN
  • A Naval Observatory clock page that’s easy to use and functional
  • A well-known satire site with a good reputation – a cute & funny article
  • An “About Us” page on a restaurant website, with clear other info – what you’d expect
  • News and updates that were timely within the past year of checking
  • A well-known, reputable merchant with customer service information
  • A shopping site with its own product line, a good reputation, and evidence of deeper expertise

Section 5 – What constitutes the ‘highest quality pages’?

A. To be considered ‘very high quality,’ a site needs a ‘comprehensive’ amount of MC and a very positive reputation – prestigious awards help establish a ‘very positive reputation’ – or it has to be highly popular and well-loved, as well as focused on helping users.

B. Formal expertise is important for medical, financial, and legal advice – which is the root of E-A-T. Examples?

  • News – award-winning
  • Government agencies – comprehensive information about a park
  • Snopes – reputation, well-known in its field
  • Software information from the company that made the software
  • Credit info site – verified using Wikipedia as a source
  • Well-known music sites with a good reputation and tons of high-quality MC
  • Many more – as long as there’s authority in that field, high-quality MC, etc.

 

Section 6 – What constitutes ‘low quality pages’?

What is a low quality page?

  • Distracting ads
  • No SSL certificate on a site selling things (a quick check is sketched after this list).
  • An exaggerated title: “If pages do not live up to the exaggerated or shocking title or images, the experience leaves users feeling surprised and confused.”
  • The author is hard to find, with no good reason for the anonymity.
  • No E-A-T, or low trustworthiness
  • Not enough MC
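As promised above, here’s a quick way to sanity-check the SSL point – a minimal sketch assuming the `requests` package is installed; the domain is just a placeholder:

```python
import requests  # third-party package: pip install requests

def serves_valid_https(domain: str) -> bool:
    """True if the site responds over HTTPS with a valid certificate."""
    try:
        # requests verifies the TLS certificate by default; a bad or
        # self-signed certificate raises SSLError.
        requests.get(f"https://{domain}", timeout=10)
        return True
    except requests.exceptions.SSLError:
        return False  # certificate problem
    except requests.exceptions.RequestException:
        return False  # no HTTPS at all, or site unreachable

print(serves_valid_https("example.com"))
```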

For a ‘Your Money or Your Life’ website, even a mildly mixed reputation across ratings around the internet makes a ‘Low’ rating reasonable, according to the raters’ guidelines!

Section 7 – What constitutes the ‘lowest quality pages’?

What is a lowest quality page like?

  • Pages intended to scam people
  • Pages that potentially spread hate, cause harm, or misinform or deceive users
  • Content lacks any real purpose whatsoever
  • Gibberish content
  • Content that’s been copied, or copied then ‘search and replace’ changed.

The raters’ guidelines give this kind of suggestion for how to search for copied content (see the sketch after this list):

Search Evaluators Guidelines – How to find copied content

  • Not enough main content
  • Obstructed main content
  • Hacked sites
  • Abandoned or unmaintained sites
  • Websites that encourage harm: “pages that promote hate or violence against a group of people based on criteria including—but not limited to—race or ethnicity, religion, gender, nationality or citizenship, disability, age, sexual orientation, socio-economic status, political beliefs, veteran status, victims of atrocities, etc. Websites advocating hate or violence can cause real world harm.”
  • A highly negative reputation… like a Better Business Bureau rating of F
  • Conspiratorial content: “the moon landings were faked, carrots cure cancer, and the U.S. government is controlled by lizard people.”
  • Fake websites – for instance, a site pretending to be a celebrity’s official website.
  • Pages that have ads that look like Main Content.
  • Ads as main navigation items
  • Any page designed to trick people into clicking links.
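And here’s the copied-content check referenced above, sketched as code – my own rendering of the technique, which boils down to searching for a distinctive sentence from the page’s Main Content in quotes:

```python
def copied_content_query(sentence: str) -> str:
    """Wrap a distinctive sentence from the page in quotes for an exact-match search."""
    return f'"{sentence.strip()}"'

# If the exact sentence turns up on older or more authoritative pages,
# the content may have been copied.
print(copied_content_query("Our patented process keeps gutters clean for 20 years."))
# -> "Our patented process keeps gutters clean for 20 years."
```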

Section 8 – What constitutes the ‘medium quality pages’?

What is a medium quality page?

  • Nothing wrong, but nothing special.
  • Has some strong elements, and some low quality elements.

Section 9 – Page Quality Rating Tasks

This section goes over some finer details about making ratings.

For instance, it suggests that raters ‘shouldn’t struggle’ and should rate quickly – if they are torn between two ratings they should choose the lower, and if they are torn between 3? Choose the middle one. They use the slider to rate, and are given a comment box to explain their rating.

  • They are sometimes asked to rate PDFs, JPEGs, or PNGs.
  • The guidelines suggest using the send-to-device feature to look at the page on their phones.
  • If there is an interstitial page – if they can easily get past it, fine. But if it’s hard or impossible to skip, it should factor into their rating.
  • They are also supposed to parse out the creator of the site versus the creator of a particular piece of content. Parsing out the reputation of the author of individual pieces of content, it says, is particularly important on YouTube and forums.

Section 10 – Page Criteria for some specific types of content

Wikipedia and other encyclopedias are to be rated by the reputation of the site first, but also on more granular page-level attributes, since clear editorial oversight isn’t possible in some cases.

“A Wikipedia article on a non-YMYL topic (example) with a satisfying amount of accurate information and trustworthy external references can usually be rated in the High range.”

Pages with error messages, or with ‘deliberately low MC,’ should be analyzed for intent… Perhaps a 404 page had care put into it; on the other hand, maybe it doesn’t really make a strong attempt to help people. Low-content pages could also just be ad pages.

Custom, well-thought-out 404 pages with lots of next steps are generally seen as high quality.

Rate forum and Q&A pages from the point of view of a user who visits the page, rather than a participant involved in the discussion.

Watch out for important YMYL information that could mislead in a time of urgent need, but accept high-quality information stated from the point of view of a non-expert.

Expert answers (the bar is looser for non-YMYL sites) – and multiple well-thought-out personal experiences, for reviews and the like – are considered high quality in these cases.

Section 11 – Page Quality Rating FAQs

Here are some important questions answered in this section:

Are we just giving High quality ratings to pages that “look” good?

No! The goal is to do the exact opposite. These steps are designed to help you analyze the page without using a superficial “does it look good?” approach.

You talked about expertise when rating MC. Does expertise matter for all topics? Aren’t there some topics for which there are no experts?

Remember that we are not just talking about formal expertise. High quality pages involve time, effort, expertise, and talent/skill. Sharing personal experience is a form of everyday expertise.

Pretty much any topic has some form of expert, but E-A-T is especially important for YMYL pages.

 


Part 2: Understanding Mobile User Needs


Search Quality Evaluators - Mobile Devices

Section 12 – Do sites (and web search result blocks) fully satisfy user intent?

Making tasks simple is the key to making sure people are supported on their mobile phones.

“Entering data may be cumbersome: typing is difficult on mobile smartphones, and when users speak to their phones instead of typing, voice recognition may not always be accurate.

  Small screen sizes make it difficult to use some phone features, apps, and webpages.

  Some webpages are difficult to use on a mobile phone. Website navigation can be difficult as menus and navigation links may be small. Webpages may require left-to-right scrolling to read text. Images may not fit on the screen. In addition, many mobile devices cannot access webpages with Flash or other similar features.

  Internet connectivity can be slow and inconsistent for mobile users going in and out of networks. App opening, recognition of voice commands, and webpage load times can be very slow on a mobile phone.”

User intent becomes more important on mobile, and so do the locale they are searching in and their physical location:

User Intent: When a user types or speaks a query, he or she is trying to accomplish something. We refer to this goal as the user intent.

Locale: All queries have a locale, which is the language and location for the task. Locales are represented by a two-letter country code. For a current list of country codes, click here. We sometimes refer to the locale as the task location.

User Location: This tells us where the user is located, and should be inferred from the map provided.

 

Quality evaluators are asked to determine whether, for a given type of search, someone might be looking in the surrounding cities as well as the one they are in.

Then it talks about when people enter an explicit location – in which case, of course, their intent is much stronger for that particular location.

Queries with multiple meanings – they categorize these into ‘dominant interpretation’, ‘common interpretations’, and ‘minor interpretations’:

Dominant, Common, Minor Interpretations

 

Sometimes a particular query can have multiple intents – for instance, ‘Wal-Mart’ could mean the user wants to find a Wal-Mart near them, or that they want to shop on Wal-Mart’s website.

Location should sometimes help in determining what they are most likely to be searching for.

A big part of Section 12 talks through the different parts of search result blocks, for quality checks within these elements as well.

Without getting intensely into these components, it is extremely interesting that Google has raters evaluate these elements – including the elements on phones – as if they were components of an outside website. Each type of query and result block gets instructions for determining whether it fully satisfies user intent.

 

Search quality raters are invited to use whatever phone they normally use – and to rate websites and special content result blocks, as much as possible, as they would normally experience them.

 

 


Part 3: Needs Met Rating Guidelines


 

“There are many different kinds of queries and results, but the process of rating is the same: Needs Met rating tasks ask you to focus on mobile user needs and think about how helpful and satisfying the result is for the mobile users.”

Needs Met Quality Rating Guidelines

 

The guidelines say there are 3 main types of results:

  • Special content result blocks – it instructs that if a user ‘wouldn’t click further,’ the result did a very good job of satisfying intent. If it would be appropriate to click further, then the landing page should be considered along with it.
  • Web search result blocks – normal search results; click in and check out the page as ‘normal.’
  • Device action result blocks – base the rating on the device action result itself.

 

Fully Meets:

  • Fully Meets is a special rating category, which can be used in the following situations: The query and user need must be specific, clear, and unambiguous. The result must be fully satisfying for mobile users, requiring minimal effort for users to immediately get or use what they are looking for. All or almost all users would be completely satisfied by the result—users issuing that query would not need additional results to fully satisfy the user intent.
  • Broader topics – like a search query for ‘knitting’ – cannot have a ‘Fully Meets’ result, because different users may want different types of content. Other examples include simple acronyms with multiple organizations attached to them, famous people (because it’s hard to know exactly what the user wants to know about that person), and non-famous people (which person with that name are they looking for?).

Highly Meets:

If someone searches ‘Trader Joes,’ for instance, they might mean the website or the closest store – if they are served a map result, it may not fully meet the intent.

Moderately Meets:

If someone searches “Shutterfly,” for instance, and is served a Crunchbase article about Shutterfly – this would serve some people OK, and others very well.

Fails to Meet:

If someone searches “Dogs,” for instance, and is served a map result with in-person visiting information for dog-related services – the query is very broad, and it’s unlikely the searcher wants to go anywhere in person.
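Pulling the scale together, here’s the Needs Met ladder modeled as a simple enum – my own illustration; raters actually use a slider, and the guidelines abbreviate the endpoints as “FullyM” and “FailsM”:

```python
from enum import IntEnum

class NeedsMet(IntEnum):
    FAILS_TO_MEET = 0      # "FailsM" in the guidelines' shorthand
    SLIGHTLY_MEETS = 1
    MODERATELY_MEETS = 2
    HIGHLY_MEETS = 3
    FULLY_MEETS = 4        # "FullyM" in the guidelines' shorthand

# Broad queries like [knitting] can never earn the top rating:
max_rating_for_broad_query = NeedsMet.HIGHLY_MEETS
```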

 

Section 14 talks about Porn, Foreign Language, Did Not Load, and Upsetting-Offensive results

For these types of results the quality rater is instructed to assign a flag based on the particular category that it might be in:

How does Google rate porn sites?

Porn flags should be put on any site with anything indicative of porn – including ads.

  • If someone is not looking for porn when they get a porn result – the result should be considered useless.
  • “The following queries should be considered non-porn intent queries: [girls], [wives], [mature women], [gay people], [people kissing], [boy speedos], [moms and sons], [pictures of girls], [pictures of women], [mothers and daughters], [cheerleaders], etc.”
  • Possible non-porn-intent examples include [breast] and [sex] – for grey areas, raters should rate as if users are not looking for porn.

For people who actually are looking for porn, the raters are instructed to determine whether the searcher’s needs were still met:

  • Is there a good experience on the site – do all videos and pictures load?
  • Raters are instructed to report any illegal pictures.

 

Raters are instructed to flag ‘foreign language’ results if most of the people in their locale would not understand the result.

Raters are instructed to flag pages ‘did not load’ of course – if they failed to load, or if the browser displays a malware warning:

Google Quality Raters Guidelines, Malware or Did not load

If the site redirects you, or shows an error message, but there is a full page with MC – do not flag the page ‘Did not load.’

For possibly upsetting content, Google instructs: “Users may issue queries on sensitive topics to understand why people believe, say, or do upsetting or offensive things. Search engines exist to allow users to find the information they are looking for. Please assign the Upsetting-Offensive flag to all results that contain upsetting or offensive content from the perspective of users in your locale, even if the result satisfies the user intent.”

“Upsetting-Offensive content typically includes the following:

  • Content that promotes hate or violence against a group of people based on criteria including (but not limited to) race or ethnicity, religion, gender, nationality or citizenship, disability, age, sexual orientation, or veteran status.
  • Content with racial slurs or extremely offensive terminology without context or beneficial purpose.
  • Depiction of graphic violence without context or beneficial purpose.
  • Graphic violence, including animal cruelty or child abuse.
  • Explicit how-to information about harmful activities (e.g., how-to’s on human trafficking or violent assault).
  • Other types of content that users in your locale would find extremely upsetting or offensive.”

Section 15 talks about the relationship between E-A-T and Needs Met

The ‘Needs Met’ slider is based on both the query and the result.

The E-A-T slider does not depend on the query.
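One way to picture that difference (my own sketch, not Google’s API): Needs Met takes two inputs, E-A-T takes one.

```python
def needs_met_rating(query: str, result_page: str) -> str:
    ...  # depends on what was asked AND what came back

def eat_rating(result_page: str) -> str:
    ...  # depends only on the page itself, regardless of the query
```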

Section 16 talks about rating results with both in-person and website intent

A very helpful result for a dominant interpretation should be rated Highly Meets, because it is very helpful for many or most users. Some queries with a dominant interpretation have a FullyM result.

  • A very helpful result for a common interpretation may be Highly Meets or Moderately Meets, depending on how likely the interpretation is.
  • A very helpful result for a very minor interpretation may be Slightly Meets or lower, because few users may be interested in that interpretation.
  • There are some interpretations that are so unlikely that results should be rated FailsM. We call these “no chance” interpretations.
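Those rules map neatly to a small lookup table – my own summary of the bullets above, not a table from the guidelines:

```python
# Ceiling on how high a *very helpful* result can be rated, keyed by how
# likely the query interpretation is (per the bullets above).
INTERPRETATION_CEILING = {
    "dominant":   "Highly Meets (FullyM possible for some queries)",
    "common":     "Highly Meets or Moderately Meets",
    "very minor": "Slightly Meets or lower",
    "no chance":  "Fails to Meet (FailsM)",
}
```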

In Person and Online Results Google Quality Raters Guidelines Cliff Notes

Section 17 talks about specificity of queries and landing pages

For instance with the query: [chicken recipes] / User Location: Austin, Texas – User Intent: Users probably want to make a chicken dish and are looking for some recipes to choose from. Users probably expect and want a list of recipes.

Section 18 talks about needs met and rating freshness

Some queries demand very recent or “fresh” information. Users may be looking for “breaking news,” such as an important event or natural disaster happening right now. Here are different types of queries demanding current/recent results.

  • Breaking news
  • Recurring event queries, like elections, sports, TV shows, and conferences
  • Current information queries
  • Product queries

Section 19 talks about Misspelled and Mis-typed queries

For obviously misspelled or mistyped queries, you should base your rating on user intent, not necessarily on exactly how the query has been spelled or typed by the user. For queries that are not obviously misspelled or mistyped, you should respect the query as written, and assume users are looking for results for the query as it is spelled.

  • For instance, with [Micheal Jordan] – LinkedIn pages of less prominent people with that exact spelling should be rated lower than results about the famous Michael Jordan, since most people Googling it are likely looking for him.

Section 20 talks about ‘Non Fully Meets’ for URL Queries

For URL queries (e.g., https://hookagency.com), there are other helpful results besides the URL’s destination itself – perhaps website-reputation-related items. “However, websites that offer usage statistics about a website are not usually helpful results for URL queries. Most users aren’t interested in this kind of information.”

Section 21 talks about Product Queries: Importance of Browsing and Researching

The ratings guidelines discuss how [buy iPad] might have a DO intent – as in the user wants to make a purchase – while [iPad reviews] might have a KNOW intent, meaning the verbiage surrounding a particular query gives clues to the nature of the search. Quality raters are invited to consider what the individual might mean, and whether searchers would be satisfied by particular results.

Section 22 talks about Visit-in-Person Intent Queries

The ratings guidelines talk about the type of business: if it’s a restaurant, users may want something in the general region – it doesn’t need to be as close. Whereas if it’s a gas station, coffee shop, or supermarket, it says, they will want it much closer. Consider ‘nearby’ or ‘near me’ in the context of the specific thing they are searching for, whether they entered a location explicitly, and the region itself / how far apart things are generally spaced in that area.
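A rough sketch of that proximity logic – the distance thresholds here are my own guesses purely for illustration; the guidelines don’t give numbers:

```python
# Hypothetical acceptable travel distances by business type (miles).
ACCEPTABLE_RADIUS_MILES = {
    "gas station": 2,    # users want these very close
    "coffee shop": 2,
    "supermarket": 3,
    "restaurant": 15,    # a destination restaurant can be farther away
}

def plausibly_useful(business_type: str, distance_miles: float) -> bool:
    """Would a visit-in-person result at this distance plausibly satisfy the user?"""
    return distance_miles <= ACCEPTABLE_RADIUS_MILES.get(business_type, 5)

print(plausibly_useful("gas station", 8.0))  # -> False
```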

Section 23 talks about Rating English Language Results in Non-English Locales

“Your Needs Met ratings should reflect how helpful the result is for users in your locale. When the query is in the language of your locale, assume that users want results in that language. We know that you can read English (you are reading this document!), but you should only give high Needs Met ratings to English results if users in your locale would expect or want them for a particular query. Unless requested by the query, English results should be considered useless if most users in the locale can’t read them.”


The Appendix talks about the evaluation platform

This section doesn’t seem to have a ton of application for the non-rater – so I’ll just mention that its intention is to help raters navigate the platform, start and release tasks, etc.

Wrapping up

I’ve tried to strike a balance in this rundown: ‘cliff notes’ style, while keeping the meat of what this means for ordinary webmasters and marketing folks.

Of course, it would not be a bad idea to go through the document in full if you or your SEO team has time – and of course, I wish you good luck in the SERPs!

Enjoy,

Tim Brown
