De-index Thin & Duplicate Content – Technical SEO Simplified

[Video: De-index Thin and Duplicate Content for SEO, “Google Likes It”]

Quick Concepts

Google likes it when you don’t feed it pages that have very little (or duplicate) content.

It also likes it when you don’t give it a bunch of content it’s seen before. If you absolutely have to post duplicate content (from your own website or elsewhere), it’s really important to let Google’s crawler know which version is the original with a little rel="canonical" tag in the <head> of the duplicate page, like so:

<link rel="canonical" href="https://hookagency.com/de-index-thin-duplicate-content">
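If you want to double-check that a page really carries the tag, here is a minimal Python sketch using only the standard library. It assumes you already have the page's HTML as a string; the `CanonicalFinder` class, the `find_canonicals` function and the sample page are made up for illustration and are not part of WordPress or Yoast.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html_text):
    """Return a list of canonical URLs declared in the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonicals


# Hypothetical sample page (an assumption, not a real page from the site).
sample_page = """
<html><head>
<title>Example duplicate page</title>
<link rel="canonical" href="https://hookagency.com/de-index-thin-duplicate-content">
</head><body><p>Reposted content.</p></body></html>
"""

print(find_canonicals(sample_page))
```

An empty list means the page declares no canonical URL. More than one entry is worth a look too, since a page should point to a single original.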

Now, the most common sources of duplicate content on WordPress sites are tag, category and author archive pages. Using the Yoast SEO plugin, you can make sure these are not indexed.

First, under Features, make sure Advanced Settings Pages is set to Enabled:

[Screenshot: advanced Yoast SEO settings for de-indexing categories or pages]

Now, under Yoast in the sidebar, choose Titles & Metas.

For any post type where you have thin content, you’ll want to de-index those pages. In our case we have a ‘Coolest Designs’ page, which is a listing of cool designs, but the individual posts behind it have very thin content. We don’t want Google to index those posts, perceive our site as low value because of the lack of content, and count that against the rest of the site, so we’ll switch the setting for that post type over to noindex.

[Screenshot: setting categories to noindex in Yoast SEO]
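Once the setting is switched, the affected pages should typically carry a robots meta tag containing noindex, something like <meta name="robots" content="noindex, follow">. Here is a minimal Python sketch for confirming that, assuming you have the page's HTML as a string. The `RobotsMetaFinder` class, the `is_noindexed` function and the sample pages are hypothetical names for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for directive in (attr_map.get("content") or "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def is_noindexed(html_text):
    """Return True if the page asks search engines not to index it."""
    finder = RobotsMetaFinder()
    finder.feed(html_text)
    return "noindex" in finder.directives


# Hypothetical sample pages (assumptions for illustration).
thin_post = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
normal_post = '<html><head><meta name="robots" content="index, follow"></head></html>'

print(is_noindexed(thin_post))
print(is_noindexed(normal_post))
```

This only looks at the meta tag in the HTML. It does not check HTTP headers or robots.txt, which can also affect how a page is treated.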

Now what about Duplicate Content?

By de-indexing categories and removing them from your XML sitemaps, you can reduce the probability that Google will find these types of pages redundant. The main reason you’d want to do this is that the same post content may be getting crawled again on those archive pages, in addition to the posts themselves.
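To see whether archive pages are still listed in a sitemap, a rough Python sketch like the one below can help. It assumes a standard sitemaps.org urlset file already loaded as a string, and it assumes the default WordPress URL prefixes /category/, /tag/ and /author/, which your site may have changed. The function name and the example.com URLs are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Assumed WordPress default archive paths; adjust to match your permalinks.
ARCHIVE_PREFIXES = ("/category/", "/tag/", "/author/")


def archive_urls_in_sitemap(sitemap_xml):
    """Return sitemap URLs whose path starts with an archive prefix."""
    root = ET.fromstring(sitemap_xml.strip())
    found = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        if urlparse(url).path.startswith(ARCHIVE_PREFIXES):
            found.append(url)
    return found


# Hypothetical sitemap content (an assumption for illustration).
sample_sitemap = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/de-index-thin-duplicate-content/</loc></url>
  <url><loc>https://example.com/category/seo/</loc></url>
  <url><loc>https://example.com/author/admin/</loc></url>
</urlset>
"""

for url in archive_urls_in_sitemap(sample_sitemap):
    print("Still in sitemap:", url)
```

If the list comes back empty after you change the sitemap settings, the archive pages are no longer being advertised to crawlers through that file.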

 

How to check for Duplicate Content that’s already indexed

You can do a site search for your website in Google like so: “site:yoururl.com”. For instance, here is what that turns up for my site:

[Screenshot: check what pages Google has crawled on your site]
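If you run this check often, a tiny Python sketch can build the search link for you. It only assembles the URL and does not fetch anything from Google. The `site_search_url` function is a hypothetical helper, and the optional extra term uses Google's inurl: operator to narrow the results to archive-style URLs.

```python
from urllib.parse import urlencode


def site_search_url(domain, extra_terms=""):
    """Build a Google search URL for a site: query on the given domain."""
    query = f"site:{domain} {extra_terms}".strip()
    return "https://www.google.com/search?" + urlencode({"q": query})


# Plain site search, as described above.
print(site_search_url("yoururl.com"))

# Narrowed to URLs containing "category" (an optional refinement).
print(site_search_url("yoururl.com", "inurl:category"))
```

Open the printed link in a browser and scan the results for tag, category or author pages that should no longer be there.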

Now, if you want to temporarily remove some of these URLs, go to Google Search Console, click Google Index > Temporarily Hide, and enter the URL you want to hide. (The work we did above in Yoast, de-indexing these URLs and removing them from the sitemap, will help them stay de-indexed long term.)

[Screenshot: removing URLs from the Google index]

 

Lastly, and most simply of all, avoid thin content by not writing short posts, or by adding content to posts that are too short so that every blog post reaches at least 350 words. Also, if you have products on your site, try to hit at least 350 words in each product description.
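To find posts that fall under that 350-word mark, a quick Python sketch like this can flag them. It assumes you have exported your posts into a dictionary of title to body text. The function names and sample data are hypothetical, and the tag stripping is deliberately crude.

```python
import re

MIN_WORDS = 350  # the minimum suggested in this post


def count_words(html_or_text):
    """Count whitespace-separated words after crudely stripping HTML tags."""
    text = re.sub(r"<[^>]+>", " ", html_or_text)
    return len(text.split())


def find_thin_posts(posts, min_words=MIN_WORDS):
    """Return (title, word_count) pairs for posts below min_words."""
    thin = []
    for title, body in posts.items():
        words = count_words(body)
        if words < min_words:
            thin.append((title, words))
    return thin


# Hypothetical sample data standing in for an export of your posts.
sample_posts = {
    "Coolest design #12": "<p>" + "word " * 40 + "</p>",
    "De-index thin content": "<p>" + "word " * 400 + "</p>",
}

for title, words in find_thin_posts(sample_posts):
    print(f"{title}: {words} words (under {MIN_WORDS})")
```

Anything this flags is a candidate for either more content or a noindex setting, depending on whether the page needs to rank on its own.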

Full Transcription

Hey, how’s it going? This is the second episode of “Google Likes It.” We’re going to talk today about how Google likes it when you de-index thin and duplicate content. This is a very big piece of SEO, which is, you can’t have all this thin content on your website.

1) Stop writing thin content – Don’t write posts that are fewer than 350 words, and don’t have product pages with fewer than 350 words of original content.

2) ‘No index’ in Yoast – If you’re on WordPress you can install Yoast SEO, turn on the advanced settings, go to ‘Titles & Metas’, and ‘noindex’ and ‘nofollow’ the types of posts that aren’t appropriate for Google to index. So let’s say testimonials: if you have a testimonials page with all those testimonials on it, perhaps you want to de-index the posts themselves because they’re very thin. You want to look at those and think to yourself, do these need to be indexed individually? Maybe not.

3) Remove from sitemap – Go to the XML sitemaps function and choose “not in sitemap” for post types and taxonomies that aren’t necessary to have in your sitemap. Perhaps the categories are not important for your site to have in the sitemap, because you have all of that listed on the main blog page.

You don’t want to be posting content in multiple places on your site. You don’t want to be posting any content that’s been posted elsewhere on the web. If you absolutely need to do that, use the ‘rel canonical’ tag. If you don’t know how to use a ‘rel canonical’ tag, you can check out the full post that describes all of this info at hookagency.com/thin. You can also check out that post for how you can use Google to see what pages are currently indexed on your website, and how to temporarily remove those pages while Google starts respecting all of your ‘noindex’ settings.

I appreciate you joining me today for the second episode of “Google Likes It.” Google likes it when you de-index thin and duplicate content. Join us next week for the third episode of “Google Likes It.”
