Pagination & Canonicalization for the Pros – SMX Advanced 2012
by Michelle Lowery on 06/06/2012
Whew! Back from lunch, and ready to dive in again? I sure hope so because this is a session you don’t want to miss. Adam Audette, Jeff Carpenter, and Maile Ohye are about to throw some advanced SEO tactics and information your way, as Vanessa Fox moderates. So get ready to take some notes, and learn a few things because I get the feeling you’ll be able to immediately put some of their tips to use. Ready? Let’s do it.
Adam kicks things off by saying this is his favorite conference, and his favorite city, and his favorite audience. I think he may be pulling our leg with that last one. He’s excited to geek out with us about pagination and canonicalization.
He shows a slide with logos from some major companies and says they all have issues with pagination and canonicalization.
He’s going to cover three main areas: pagination dos and don’ts;
Distilling things down to the simplest essence is the way to go.
He shows the Zales site. It’s a large e-commerce site. He shows the home page canonical URL, and it’s still messy. Their page URLs are so similar that it’s very confusing, and it looks like they have a duplicate content problem.
Panda no likey duplicate content!
We’ve always approached this using noindex, which pulls dupes out of the index, but they still get crawled.
He shows another example site that sells aftermarket motorcycle parts that is using noindex to good effect.
Rel prev/next is another good pagination tactic.
When I see rel next/prev used with rel canonical, that’s good, but when rel canonical is used to point every page back to page one, it creates a conflict.
With use of rel prev/next, keep in mind:
- Pages with rel next/prev can still be shown in results
- Use of rel next/prev consolidates signals across the entire series
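To make that concrete, here's a minimal sketch (not from Adam's slides) of the tags one page in a series might carry. It assumes a hypothetical `?page=N` URL scheme with page one at the bare URL, and it uses a self-referencing canonical on each page rather than pointing every page back to page one, which is the conflict mentioned above.

```python
from html import escape


def pagination_link_tags(base_url: str, page: int, total_pages: int) -> list[str]:
    """Return the <link> tags for one page of a paginated series."""
    if not 1 <= page <= total_pages:
        raise ValueError("page out of range")

    def page_url(n: int) -> str:
        # Assumption: page 1 lives at the bare URL, later pages use ?page=N.
        return base_url if n == 1 else f"{base_url}?page={n}"

    # Each page canonicalizes to itself, not to page one.
    tags = [f'<link rel="canonical" href="{escape(page_url(page))}">']
    if page > 1:
        tags.append(f'<link rel="prev" href="{escape(page_url(page - 1))}">')
    if page < total_pages:
        tags.append(f'<link rel="next" href="{escape(page_url(page + 1))}">')
    return tags


for tag in pagination_link_tags("https://example.com/rings", 2, 3):
    print(tag)
```

The first page gets no rel prev and the last page gets no rel next, so the series has clear ends.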
Adam covers a few pagination requirements and annotations that get into some detail, so I recommend checking out his slide deck if you can.
View All Challenges
- view all pages must be fast
- big sites (especially e-commerce) sometimes dislike them
- but users love them! (if they’re fast)
Adam is showing more actual site examples with very long URLs that I couldn’t get down in 10 seconds if I set my hands on fire. Again, check out his slides afterward.
Everyone has faceted navigation these days. It can be problematic for SEO. For a major category page, the approach is to take that major category, and every time a facet is selected, consider it either an overhead facet you don’t care about for SEO, or … I missed the second thing. Sorry, apparently taking a lunch break just took me right out of the liveblogging zone! Okay, focusing…
- Identify essential and overhead facets
- Always force the canonicalization path regardless of selection order
- Build URLs according to queries (how users search)
- Solves nothing for decreasing crawl overheads
- Labor intensive and error prone
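Here's a rough sketch of what "force the canonicalization path regardless of selection order" could look like in code. The facet names and the choice of which ones count as essential are hypothetical; the idea is only that essential facets always appear in one fixed order and overhead facets are left out of the canonical URL.

```python
from urllib.parse import urlencode

# Hypothetical facet names; which ones are "essential" is a per-site decision.
ESSENTIAL_FACETS = ["brand", "color", "size"]


def canonical_facet_url(base_url: str, selected: dict[str, str]) -> str:
    """Build one canonical URL no matter what order facets were selected in."""
    # Walk the fixed list, not the user's selections, so order never varies.
    ordered = [(name, selected[name]) for name in ESSENTIAL_FACETS if name in selected]
    return f"{base_url}?{urlencode(ordered)}" if ordered else base_url


# Two different click paths, plus an overhead "sort" facet on the first one.
a = canonical_facet_url("https://example.com/pants",
                        {"color": "khaki", "brand": "acme", "sort": "price"})
b = canonical_facet_url("https://example.com/pants",
                        {"brand": "acme", "color": "khaki"})
print(a == b)
```

As the list above notes, this only fixes canonicalization. It does nothing to reduce crawl overhead, because the other URL variations still exist.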
He’s picking on LL Bean now. Looking at the men’s pants category page. The URL looks good. But if you go into a product page and look at the canonical tag, they added a keyword-rich string to the end of the URL, and it’s nowhere else on the site. That’s not the best use of rel canonical.
- Use rel canonical to signal the preferred URL
- Internal link signals should be consistent
- And a third thing I didn’t catch before he flipped past the slide
Next up is Jeff. PetCo started as a small shop, and now has a foundation that saves animals. [Yay!]
- PetCo has 1000’s of product list pages
- large amount of dupe content
- tracking tagged URLs getting indexed
- recent site changes caused multiple URL variations
- monthly email from Google about “high number of URLs”
- reduce refinement options
- cross departmental education
- implement canonical tags
- use noindex, follow tag
- reduced refinement choices
- overall search rankings increased
- direct SERP traffic to product list pages saw traffic and revenue improvements
- offline team supporting proper link building
- Google messages ceased
Testing everything to see if there’s anything else they can do that’s relevant to their platform.
Now on to Maile. In 2009, Google worked through issues of PageRank sculpting. In 2010, it was Zappos and faceted navigation issues, with exponential URLs to crawl. In 2011, they launched improved URL parameters in Webmaster Tools.
Also in 2011, REI pagination issues, trying to use rel canonical for non-duplicate content. Google came out with rel next/prev support five months later. They look at rel canonical to see how sites are using it or misusing it. Many of you are doing it right, which helps us identify many more sequences than we could detect on our own.
URL Parameters in Webmaster Tools
- Assists understanding parameters to crawl site more efficiently (reduce number of dupes)
- saves bandwidth
- helps more unique, fresh content to be indexed
- for removals, go to URL Removals in Webmaster Tools
It’s an advanced feature. Some sites already have high crawl coverage as determined by Google. Improper actions can result in pages not appearing in search.
Issue: Inefficient Crawling
This is a feature available to you to address this problem: key=value&key2=value2
Step 1: Specify parameters that do not change page content
Do I have parameters that don’t affect page content?
Likely mark as “does not change content;” results in One representative URL setting in Webmaster Tools
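The Webmaster Tools setting itself is configured in the UI, not in code, but the underlying idea is easy to sketch: if a parameter doesn't change page content, every variation collapses to one representative URL. The parameter names below are assumptions for illustration (`sessionid` in particular is hypothetical).

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed examples of parameters that do not change page content.
NO_CONTENT_CHANGE = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def representative_url(url: str) -> str:
    """Drop parameters that don't affect content, keeping everything else."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in NO_CONTENT_CHANGE
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


print(representative_url(
    "https://example.com/dog-food?brand=acme&utm_source=email&sessionid=123"
))
```

A helper like this can be handy for auditing crawl logs to see how many distinct URLs map to the same page.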
Step 2a: Specify parameters that change content
Step 2b: Specify Googlebot’s preferred behavior
This is putting a lot of control in your hands. If you have this correct, your site can be crawled much more efficiently.
Changes the order content is presented in (price, bestselling, etc.)
Sort parameter never displayed by default?
Is parameter options throughout the entire site?
Can Googlebot discover everything useful when the sort parameter isn’t displayed?
Or, same sort values site-wide?
Are the same sort values used consistently for every category?
When a user changes the sort value, is the total number of items unchanged?
Sort setting: if neither setting applies, let Googlebot decide.
Filters the content on the page. If the narrows parameter shows less-useful content, you might be able to specify “crawl no URLs.” Double-check by verifying that the URLs it produces offer only content that is redundant with the parent URL.
Determines the content displayed on the page
Crawl every URL
Crawl every URL, unless you want to exclude certain languages from being crawled/available in the search results.
Displays a component page of a multi-page sequence
- Internal links should only include canonical URLs
- list canonicals in sitemaps
- helps with canonical promotion
- provides more accurate index counts
- On-page indexing markup is still helpful
- rel canonical, rel next and rel prev can be used in tandem
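For the "list canonicals in sitemaps" point, here's a minimal sketch of building a sitemap from a list of canonical URLs. The example URLs are made up, and a real sitemap would likely carry more than just `<loc>` entries; this only shows deduplicating the list so each canonical URL appears once.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_urls: list[str]) -> str:
    """Return sitemap XML listing each canonical URL exactly once."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    # Deduplicate and sort so the output is stable between runs.
    for url in sorted(set(canonical_urls)):
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


print(build_sitemap([
    "https://example.com/b",
    "https://example.com/a",
    "https://example.com/a",  # duplicate, listed only once in the output
]))
```

Feeding this only canonical URLs (and not the parameterized variations) is what keeps the sitemap consistent with the internal links.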
Whew! That was a toughie. Again, if you can get your hands on slides from this session, I highly recommend it. Lots of super technical, but really helpful stuff. Time to put it to work!
About the Author
Michelle Lowery is an ardent word nerd, but is also known to say "y'all" from time to time.