Duplicate Content Solutions & The Canonical Tag


Now that we’ve gotten the bugs out, it’s time to get technical! Watch as my head spins. Actually, Stephan Spencer is on this panel which means I may actually throw up from the knowledge overload he’s famous for throwing out. Get those motion sickness bags ready, this should be fun!

Alex Bennert is moderating a panel with some of my personal search favorites, including Adam Audette, Nathan Buggia, Jordan Glogau, Maile Ohye and Stephan Spencer. They’re turning down the Wallflowers, so let’s hop in. Also, Adam Audette has a fresh haircut. Which means he looks about eight and a half years old. And I want to be his best friend.

Up first is Jordan Glogau.

What is the Canonical Link Element?  It’s used to flag duplicate content. It’s an HTML tag embedded in the head section of a Web page. It’s treated as an internal 301 redirect. It’s supported by the four major search engines.
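For anyone wondering what the tag actually looks like, here is a minimal sketch in Python that builds the element for a page’s head section. The helper name canonical_link and the example.com URL are made up for illustration; the only fixed parts are the rel="canonical" attribute and an href pointing at the preferred URL.

```python
from html import escape


def canonical_link(preferred_url):
    """Return a <link rel="canonical"> element pointing at the preferred URL.

    canonical_link is a hypothetical helper name, not part of any library.
    """
    # Escape the URL so characters such as & and " are safe inside the attribute.
    return '<link rel="canonical" href="{}" />'.format(escape(preferred_url, quote=True))


# Hypothetical preferred URL for a product page.
print(canonical_link("http://www.example.com/socks"))
```

Every duplicate version of a page (sorted, session-tagged, click-tracked) would carry the same element in its head, all pointing at the one preferred URL.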

What is it good for? Its best purpose is to revitalize internal discussions about site architecture. Uh, great. An HTML tag to make people talk! Once implemented, it’s a confidence builder for truly correcting site design flaws.

Case study: 1800flowers.com

Background: It’s a very old Web site. They have a redirect that takes you to canonical URLs, and they use level one load balancing. They don’t have a single URL. The robots were redirected to the canonical URL and leaked PageRank. No one knows the code anymore, so they’re stuck in their current process.

Six weeks before Mother’s Day they added the canonical tag. It was an easy fix because the canonical URL was the same as the one in the CMS. They just had to swipe the code from the robot farm. It helped them with long tail traffic.

Results: By Mother’s Day they had seen a 20 percent year-over-year increase in organic search. It firmed up the in-house commitment to move from level one load balancing to level three load balancing, and it directed traffic to product and category pages.

Case study: Eyeglasses.com

They rushed a revised Web site and simply cloned their old site, and therefore all of their old duplicate content problems. Obviously, that wasn’t the best way to do it. They went in and added the canonical tag.

Over the past week they’ve begun seeing some very positive results. Their brand URLs are ranking better, they’re getting the Google indent and they saw a 15-20 percent increase in rankings for a number of the brands.

Up next is Adam Audette.

It’s exciting times, he says! There’s a new gun in town – the canonical tag. He starts humming the theme to Bonanza and we all giggle. He comments that this is his first attempt at adding humor into his presentations. Hee! So far so good, Adam!

The link canonical tag can be used in a lot of ways. So many that it’s a tad confusing. It’s great for duplicate content but has the potential to break things, too.

  • Good: Easy to implement, appears to work
  • Bad: It’s the “poor man’s 301”. Powerful + New/Untested can equal Epic SEO Fail. The canonical tag is NOT a replacement for fundamental URL structure. It doesn’t solve duplicate content. It may be better to redirect.

Adam goes over a few ways sites can use the new tag:

Zappos: They have an internal tool called the ZFC (Zappos Flux Capacitor) that does fancy stuff with URLs. Adam suggests deploying the link canonical tag on plurals, faceted, sorted URLs and subdomains.  [He shows us a page of socks as an example, only we all giggled cause it sounded like he said “a page that sucks”. There’s lots of sock/sucks banter for the rest of his time. Search marketers are hilarious. :) ]

Ticketmaster: They have a lot of issues with URLs. They can use the canonical tag on Artist pages, main category pages, etc.  Google has lots of versions of Ticketmaster’s page on Madonna (as they SHOULD). They need to consolidate the PR. It’s better to 301 all those to the master, but while that’s happening — add the canonical tag.  Same applies for their [baseball tickets] page.

Google Directory: Google has four different versions of the Google Directory. Adam suggests they use the canonical tag there to point to the main one.

Next up is Stephan Spencer. I’m crying already. Stephan is known for killing livebloggers.

The canonical tag includes your sitelinks in Google. It’s about recovering leaked PageRank. If the page is allowed but meta robots noindexed, it also passes PR. Thankfully, when obeyed, the canonical tag aggregates PageRank.

Tools for Collapsing Duplicates

  1. The Canonical Tag: A great new addition to the SEO’s arsenal, but not your best weapon. The tag works best when it’s used in concert with other signals. It’s a hint to Google. Don’t rely on it.
  2. 301 Redirect: More absolute. Nofollowed links aren’t even used for discovery by Google.

PageRank Leakage Scenarios

  • Robots.txt disallows the duplicate page = PageRank is leaked to the duplicate and it can show up in the SERPs.
  • Meta robots noindex the duplicate page = PageRank is leaked but won’t show up in the SERPs.
  • Rel=nofollow on the links to the duplicate = PageRank can still accumulate through other links and it can still be indexed.
  • Meta robots nofollow the duplicate = PageRank that accumulates on that page can’t be passed on.
  • XML Sitemaps file only includes the canonical version = only used as a hint, dups can still be indexed.

[I’m going to hope that all made sense to you…because it definitely did not to me. Thanks.]

Stephan spends some time talking about the limitations of the canonical tag. He notes that it doesn’t work across domains, though subdomains are supported. [He also shows a bunch of examples of duplicate content and the canonical tag in action… but I am dumb, and it all goes way over my head. The examples are also pictures, not text. Liveblogging Fail.]

Duplicate Content Issues and Fixes


Pagination

Excessive pagination dilutes “crawl equity”, causing numerous pages of product listings to not get crawled. Reduce the number of pages in the pagination system to improve this. Consider disallowing “view all” links and forcing spiders through subcat pages. Display as many products per page as possible within the 150k file size. Fewer products per subcat = fewer pagination pages to crawl at subcat level for max product indexation.

Faceted Navigation

Faceted navigation provides clickable product inventory breakdowns by brand, color, price, etc. Doing so also creates a huge number of permutations for the spiders to follow. Problem is exacerbated with clickable, resorted column headings. Nofollow all links leading to low value facets.
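To get a feel for how fast those permutations pile up, here is a rough back-of-the-envelope sketch in Python. The facet names and counts are invented for illustration, and it assumes each facet is either unset or set to exactly one value, which is a simplification of how real faceted navigation behaves.

```python
from math import prod


def count_facet_urls(facet_sizes, sort_orders=1):
    """Count distinct listing URLs a spider could reach through facet links.

    facet_sizes maps a facet name to its number of values (hypothetical data).
    Assumes each facet is either not selected or set to exactly one value.
    """
    # Each facet contributes (values + 1) choices: one per value, plus "not selected".
    combinations = prod(size + 1 for size in facet_sizes.values())
    # Every clickable sort order multiplies the total again.
    return combinations * sort_orders


facets = {"brand": 20, "color": 10, "price": 5}
print(count_facet_urls(facets))                 # 1386
print(count_facet_urls(facets, sort_orders=4))  # 5544
```

Three modest facets already produce more than a thousand crawlable variations of one listing, and four sortable column headings quadruple that, which is the case for nofollowing the low value ones.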

Affiliate URLs

Rarely do they help your SEO because they’re 302, not 301. Run affiliate programs in-house, and use 301s and/or canonical tags. Third party affiliate solutions have a vested interest in not playing ball.

Click-Tracked URLs

He offers up how to 301 static URLs with a tracking parameter appended to their canonical equivalents. And if you think I know how to blog code… you must be new here.
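For the curious, the general idea can be sketched in a few lines of Python. This is not the code from Stephan’s slides, just a minimal illustration: the function name redirect_target and the parameter names in TRACKING_PARAMS are assumptions, and a real site would wire the result into whatever issues its 301 responses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical tracking parameter names; a real site would list its own.
TRACKING_PARAMS = {"affilid", "trackid"}


def redirect_target(url, tracking_params=TRACKING_PARAMS):
    """Return the canonical URL to 301 to, or None if no redirect is needed."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Keep every query parameter except the tracking ones.
    kept = [(key, value) for key, value in pairs if key.lower() not in tracking_params]
    if len(kept) == len(pairs):
        return None  # Nothing was stripped, so the URL is already canonical.
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(redirect_target("http://www.example.com/socks.html?affilid=123"))
# http://www.example.com/socks.html
```

The click would be logged first, then the visitor (or spider) is sent to the returned URL with a 301 status so the tracked version never competes with the canonical one.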

Distance yourself from the thin affiliates. Augment with a substantial amount of unique, valuable content. Use customer reviews. Don’t use mashups with Wikipedia, Twitter and the usual suspects.

“Uniquify” content. It’s not sufficient to shuffle the page’s content around. Think about overlapping “shingles”. Do NOT use the same titles and meta descriptions!
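The “shingles” idea can be sketched like this: break each page into overlapping runs of words and compare the resulting sets. What follows is a generic word-shingling and Jaccard similarity sketch in Python, not anything a search engine has confirmed using; the function names, the three-word shingle size and the sample sentences are all assumptions.

```python
def shingles(text, size=3):
    """Return the set of overlapping word runs ("shingles") in the text."""
    words = text.lower().split()
    if len(words) < size:
        # Text shorter than one shingle: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Share of shingles two sets have in common (0.0 = none, 1.0 = identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


original = "the quick brown fox jumps over the lazy dog"
shuffled = "over the lazy dog the quick brown fox jumps"  # same words, halves swapped

print(round(jaccard(shingles(original), shingles(shuffled)), 3))  # 0.556
```

Swapping the two halves of the sentence still leaves more than half of the shingles shared, which is why shuffling blocks of content around does not make a page look unique.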

Legacy URLs

What would the lookup table for the above look like?

Nathan and Maile didn’t present but were on hand for the Q&A.

SEO Drama Alert: A debate broke out mid-session when Matt Cutts got involved about whether or not nofollow is still effective. Of course, as soon as it got hot, all search representatives got very tight-lipped about who said what and what they really meant. As far as I could tell, Matt Cutts did NOT say that they ignore nofollow, but he DID hint that it is less effective today than it used to be. Nathan from Microsoft also mentioned off the cuff that if you use nofollow as a way of PR sculpting and they feel it’s not beneficial to the user, they’ll adjust the algorithm.

Very, very interesting words. If you were in the session, I’d love to hear your thoughts on what went down, how you interpreted it and what you think is or is not true. Let’s get some debate going.

On another note: That session melted my brain.

Your Comments

  • Alysson

    You know, I wondered recently about the effectiveness of nofollow because WMT is now showing even nofollow links (like those from blog comments and some social media sources) within their external links reporting. Hmm…looks as if those who noticed that change may have been on to something. :)

  • Dwight Zahringer

    Interesting enough to say the least. I wonder if there is little and less attention being paid to the No/DoFollow attributes where the emphasis of back link and link popularity comes in to play? Is the age of the back link and their containing keywords dead, or dying?

    What else is being discussed out there in regards to Links and SEO?

  • Melanie

    I noticed the same thing in WMT… getting credit for links that were nofollowed. As a result, I’ve switched up my link building techniques to start testing this :)

    Thanks for the excellent coverage Lisa. It really helps a poor sap like me who can’t actually *be* there [harumph].

  • Kenny Hyder

    Yeah, except that my blog (hyder.me) has a really hard time even popping to the first page for my name, and I have plenty of nofollow comment links… Although I suspect that Google has some sort of bias against the .me domain extension cause I also have a healthy amount of normal links considering the age of the domain…

  • David Leonhardt

    I have never believed in NoFollow. The end of NoFollow would mean people would go back to real quality, rather than SE-adjusted quality factors, when it comes to link-building.

  • g1smd

    I am guessing your missing code snippet would have been something like:

    RewriteCond %{QUERY_STRING} &?affilid=([^&]+)&? [NC]

    RewriteRule (.*) http://www.example.com/$1? [R=301,L]

    or somesuch.

  • Bill Hartzer

    There definitely is a use for the canonical tag, but just like rewriting URLs and using 301 Permanent Redirects, you have to know what you’re doing or you’ll mess up your optimization efforts.

  • Alan Bleiweiss

    okay this one made me vomit just reading it. Apparently there’s both some dispute about these issues and also you have to be a rocket scientist (which Adam Audette, Nathan Buggia, Jordan Glogau, Maile Ohye and Stephan Spencer all seem to be)…

    I’ll just stick to 301 redirects til someone can sort this all out for me.

  • Halfdeck

    Hold your horses.

    WMT has always shown nofollow links. People used to debate whether Google actually followed them or not based on the fact that you would see them in WMT and when you ran a Google link: query.

    See my comment I posted back in 2007 here:


  • Steen Seo Öhman

    Seems to be some confusion.

    Did Matt Cutts talk against pagerank (link) sculpting in general, or just the use of nofollow on internal links?

  • Halfdeck

    “Did Matt Cutts talk against pagerank (link) sculpting in general, or just the use of nofollow on internal links?”

    No, his stance has always been that PageRank sculpting of all forms is perfectly fine.

    Your best bet is wait for video or transcript of what was exactly said. Blogs tend to blow things out of proportion for marketing sake.

  • Steen Öhman

    Thanks will do so … was also surprised when quite a lot of people on Twitter quoted him for talking against sculpting.

    Great to have reports from the conference … was not able to go myself … long way from Denmark.

  • Andy Beard

    Matt recorded a video just a few days back (or posted it) that suggested as always to concentrate on external link building.
    There wasn’t any suggestion that sculpting is somehow less effective, because he has always maintained it is fairly small rocks and not to spend too much time on it.

    Good that blocking with robots.txt is taboo for duplicate content, some people were saying that 18 months ago ;)

  • MIke B.

    Would you agree the canonical tag could be used perfectly for ecomm sites that have sort=DESC or sort=ASC pages?


  • Robert Enriquez

    Here’s a recent video of Matt Cutts speaking about Pagerank sculpting using Nofollow


    I think everyone is blowing things out of proportion

  • Suthnautr

    I’ve noticed that Sugarrae.com has implemented Canonical tags – but I’ve also noticed that Sugarrae.com took a leap from an 89 in Website.Grader.com (I know – that tool has its problems, but I use it once in a while anyway) to a 99.2. That’s a nice leap – and I can’t help but wonder whether it was Rae’s link sculpting efforts, the canonical tags or a combination of both.

    On the note of “nofollow” not being as well honored by Google more recently, I’ve never trusted the nofollow tag alone so on my site a large majority of my external links are placed within an iFrame on the right of each page. Outside of using Javascript or flash links (both of which may soon be readable anyway) there’s not much else I can think of to stop follows for sure.

    As for the page sculpting comment “Nathan from Microsoft also off-the-cuff mention that if you use nofollow as a way of PR sculpting and they feel it’s not beneficial to the user — they’ll adjust the algorithm” …I’m wondering what that means – does it mean manually per site adjustment, overall ignoring nofollow in links linking files within a domain, or all nofollow tags – external site links as well as internal. (I’m hoping that if this were to occur it would only apply to individual sites abusing nofollows (?) and not an algorithm change for all sites.).

  • Michael Martinez

    Halfdeck wrote: “No, his stance has always been that PageRank sculpting of all forms is perfectly fine.”

    That is absolutely WRONG. Matt Cutts has ALWAYS, consistently cautioned people against pursuing this tactic. It was Rand Fishkin who misattributed his own views on PageRank Sculpting to Matt on SEOmoz, and people have consistently ignored Matt’s correction in the comments to that particular article.

    PageRank Sculpting has always been a stupid idea because no one in the SEO industry has had the ability to measure PageRank. This methodology is equivalent to throwing darts while blind-folded after being spun around and moved through several different rooms.

    You’d have more luck managing your PageRank by playing traditional pin-the-tail-on-the-donkey than by using “rel=’nofollow'” on internal links. At least with the game you have a chance of actually hitting something.

  • Halfdeck

    “That is absolutely WRONG.”

    Michael, that is absolutely RIGHT.

    Matt Cutts has ALWAYS said there’s nothing unethical about PageRank sculpting. In fact, linking pages together (the old fashioned way) IS a form of sculpting, according to him. Are you telling me you want to unlink all the pages on your site? Come on man.

    He consistently made the point that sculpting isn’t necessarily a high ROI tactic and that you’d be better off making $200k extra bucks instead of squeezing 10% interest out of $50k invested in stocks.

    #popcorn #vodka

  • Vern | AimforAwesome

    I’ve never seen this mentioned… what about with multiple state focused sites… we have startillinoisbusiness.com, startgeorgiabusiness.com etc. They are all completely different focus, yet – we duplicate some of the info because it’s all about incorporation – which – at a national level doesn’t change much at all. I’ve been noindexing the pages that are duplicate but heck, that’s like 20 pages per site. Should I or no? I’d think no – Georgia is different from Illinois. The user experience is a good one if the info is same because it’s all great info – despite being the same. Quite a tricky question and I don’t know how to do this the right way… anyone?

  • Damien Anderson

    HAHA! I so loved the ‘on another note’ comment at the end of the feature! Melted mine as well!


  • Bill Sebald

    Sculpting always seemed to be boosting my efforts in at least “light” ways (though I never had the massive lifts that SEOs talked about in the early nofollow days), so I’m taking the side that the SMX words are being misinterpreted until something clear comes out. If something changed, and the PR transfer became even lower than before, then I believe it would have JUST changed within the month.

    It’s still at the least valuable in keeping the index free of noise (and wasting spider efficiency on junk pages).

  • Mani

    I’ve always avoided using NOFOLLOW, and now it seems as though it’s no longer beneficial to site owners to use it. I guess the search engines do need to keep a few things to themselves though, otherwise SEO would turn into a science and you’d have over-optimised sites everywhere… Would that be better? ;)

  • Introspective

    Should I stop publishing my articles on article directories? I used to publish my articles there, but now I wonder whether I should stop because of the risk of a duplicate content penalty.

  • Adam

    What is the code for canonical tags?

  • Rob @ Web Design Talk


    What would you recommend doing for paged product data on an eCommerce site? E.g. domain.com/category/dvds.php?page=1, domain.com/category/dvds.php?page=1#2 etc.


  • Kim

    This is a great post! I appreciate that you have put all the case studies in such detail. Thank you for doing this.

  • Johan - Design Lover

    Thanks for this post. I’m totally into design, and I’m studying it here in Arnhem (Holland). Keep up the good work! I’ll keep following posts on this blog.

    Greetings from Holland