Last session of Day 2, you guys still with me? I hope so because we’re about to talk redirects. Way to leave the super technical stuff for late in the afternoon. Smart guys!
Alex Bennert is moderating a motley crew of speakers including Jordan Kasteler, Carolyn Shelby, Stephan Spencer, and Jonah Stein. Let’s go.
In a perfect world, Web site relaunches would be easy.
In the real world, they’re often anything but. Relaunches have to be done to fix horrendous URL structure and just general wrongness with the existing Web site. These are painful but necessary to the long term success of the site and the sanity of the SEO, content creators and tech people.
Uses for 301s
- Redirects users and bots from the old location of a given Web page to the new location
- Automatically direct users from an old domain to a new domain
- Automatically redirect users from alternate TLDs to the main TLD.
- Corrects canonical issues
- Keeps unauthorized users and bots from accessing live development environments, redirects them to the main site.
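Two of the uses above, roughly sketched in Apache config. Every hostname and the IP address are made up for illustration; none of them come from the talk.

```apache
# Hypothetical names: old-example.com, dev.example.com, www.example.com, 192.0.2.10.
RewriteEngine On

# Old domain (or an alternate TLD) -> main domain, keeping the requested path.
RewriteCond %{HTTP_HOST} ^(www\.)?old-example\.com$ [NC]
RewriteRule ^ http://www.example.com%{REQUEST_URI} [R=301,L]

# Development host: anyone not coming from the office IP is sent to the main site.
RewriteCond %{HTTP_HOST} ^dev\.example\.com$ [NC]
RewriteCond %{REMOTE_ADDR} !^192\.0\.2\.10$
RewriteRule ^ http://www.example.com/ [R=301,L]
```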
A Basic Relaunch
New frosting but the cake is the original recipe.
- Possibly moving to a new domain
- URLs might be losing or changing file extensions
- New pages, but a few new pages are replacing old pages
- Few old pages are “disappearing”
- No significant changes to pre-existing information architecture or nomenclature.
With a basic relaunch, only a small number of redirects are required. The naming scheme from the old site had consistent, easily defined patterns AND the new site is also consistent and definable. Just use normal .htaccess and you’ll be fine.
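When the old and new naming schemes are both that predictable, one pattern can cover a whole class of URLs. A minimal sketch, assuming the only change is that pages lose their .html extension (the domain is a placeholder):

```apache
# Hypothetical pattern: old pages ended in .html, new URLs drop the extension.
# /about.html -> /about, /products/widgets.html -> /products/widgets
RedirectMatch 301 ^/(.+)\.html$ http://www.example.com/$1
```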
Complicated Relaunch
The frosting is all new AND the cake is different.
- The site is large
- There are significant information architecture and naming changes
Before you Begin
You should have a list or spreadsheet of all your current site’s indexed pages/URLs, all your site’s indexed pages with backlinks, and the relationship from the OLD to the NEW. You should also have an understanding of how your redirects will be added to your system, how you will be watching/monitoring your 404s and any weirdness unique to your CMS or Web server.
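One way an OLD-to-NEW spreadsheet can be fed to Apache is a RewriteMap text file. This wasn't shown in the session; it's a sketch with a made-up map name and file path, and it assumes you can edit the server or virtual host config.

```apache
# Server or virtual host config only; RewriteMap is not allowed in .htaccess.
# "relaunch" and the file path are placeholders.
#
# relaunch-map.txt holds one pair per line: old path, whitespace, new path, e.g.
#   /old-about-us.html     /company/about
#   /catalog/item42.php    /products/blue-widget
RewriteEngine On
RewriteMap relaunch "txt:/etc/apache2/relaunch-map.txt"

# If the requested path has an entry in the map, 301 to the new location.
RewriteCond ${relaunch:%{REQUEST_URI}|NOT_FOUND} !=NOT_FOUND
RewriteRule ^ ${relaunch:%{REQUEST_URI}} [R=301,L]
```

Anything not listed in the map falls through untouched, so it still shows up in your 404 monitoring.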
Preparation:
Find a software package to do some of the reports for you. List all indexed pages and pages with backlinks. For tracking 404s, you can read the error logs or install software to help you. Mint is nice and quick. You can also have reports compiled manually. Interns are delicious.
Preparing will limit the number of mistakes you make and the URLs you overlook. You’ll be able to easily identify patterns. You can use mod_rewrite or mod_alias; however, mod_rewrite rules will execute before mod_alias.
Once everything is done, major site overhauls will see anywhere from a 20 percent drop in traffic to being completely dropped from the index. Recovery time is generally 6 to 18 weeks. The long tail will suffer more than anything else. You have to explain this to the client, which isn’t going to be pretty. It’s a waiting game. The upside is that for all the pain, you come out with a much better Web site that’s easier to make changes to in the future.
Are We Recovering?
- Analytics should show 10-20 percent increases in traffic week over week once recovery has begun.
- Reports showing indexed pages will show more and more of the new pages and fewer of the old
- That sick feeling in your stomach goes away. Hee.
Next is Stephan.
[Here’s the deal. Stephan ran through this far too fast and I didn’t understand a word of it. Because he is very smart and I am not. Below is my feeble attempt at keeping up. I encourage you to skip over this session and to go download Stephan’s presentation.]
301 redirects via rewrite rules
Use Apache’s mod_rewrite module and set up rewrite rules that use the 301 flag. Or, if on a Microsoft IIS server, use ISAPI_Rewrite. The rewrite rules go in either .htaccess or your Apache config file.
An Example Rewrite Rule
A simple rewrite rule for httpd.conf. Store stuff in memory with () then access via variable $1. A rough equivalent for .htaccess. Rewrite base/ Rewrite
Ah, but there’s an error with the rule immediately above
He goes over the magic of regular expressions
- * means 0 or more of the immediately preceding character [this can be a problem. It might match on nothing. If you want it to match at least one number, then use the + sign]
- + means 1 or more of the immediately preceding character
- ? means 0 or 1 occurrence of the immediately preceding char
- ^ means the beginning of the string, $ means the end of it
- . means any character (i.e. wildcard)
- \ “escapes” the character that follows, e.g. \. means dot [this says that you really do mean a period, not that you’re using it as a match any character]
- [ ] is for character ranges, e.g. [A-Za-z].
- ^ inside [] brackets means “not”, e.g. [^/]
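To see several of those symbols working together, here is an annotated pattern. The URL scheme and domain are invented for the example, not taken from the slides.

```apache
# Hypothetical old URL: /news/2008-some-headline.html (or .htm)
#   ^         start of the URL-path
#   [0-9]+    one or more digits
#   [^/]+     one or more characters that are not a slash
#   \.        a literal dot, not "any character"
#   l?        the final "l" is optional, so .htm and .html both match
#   $         end of the URL-path
# The two () groups are remembered as $1 and $2.
RedirectMatch 301 ^/news/([0-9]+)-([^/]+)\.html?$ http://www.example.com/articles/$1/$2
```

With that rule, /news/2008-some-headline.html would redirect to /articles/2008/some-headline.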
It’s incredibly easy to make errors in regular expressions. When debugging, RewriteLog and RewriteLogLevel are your friends. What’s the problem? .* is greedy and so it will capture the “/” within memory.
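For reference, turning on rewrite logging looks roughly like this. The log path is a placeholder, and which form applies depends on your Apache version.

```apache
# Apache 2.2 and earlier, server or virtual host config only.
# Levels run from 0 (off) to 9 (very verbose); the path is a placeholder.
RewriteLog "/var/log/apache2/rewrite.log"
RewriteLogLevel 3

# Apache 2.4 dropped both directives. The rough equivalent is a per-module
# LogLevel, with the trace output going to the regular error log:
# LogLevel alert rewrite:trace3
```

Turn it back off when you're done; high log levels slow the server down noticeably.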
[I am completely lost and confused and I think most of the room is with me.]
Regular Expression Gotchas
“Greedy” expressions. Use a negated character class such as [^/]+ instead of .*
.* can match on nothing. Use .+ instead
Unintentional substring matches because ^ or $ wasn’t specified, or . was used for a dot instead of \. [I’m hoping this makes sense to all of you. It absolutely doesn’t make sense to me.]
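A sketch of those gotchas side by side, with the risky version commented out above a safer one. The paths are hypothetical, and the patterns assume server config context, where the URL-path starts with a slash.

```apache
# Server-config context. All paths are made up for illustration.
RewriteEngine On

# Greedy: (.*) also swallows slashes, so for /blog/2008/05/post the whole
# "2008/05/post" lands in $1. It also matches when nothing follows /blog/.
# RewriteRule ^/blog/(.*)$ /archive/$1 [R=301,L]

# A negated character class stops at the next slash, and + insists on at
# least one character.
RewriteRule ^/blog/([^/]+)/?$ /archive/$1 [R=301,L]

# Unanchored and unescaped: this also matches /mypageXhtml-old.
# RewriteRule page.html /new-page [R=301,L]

# Anchored at both ends with the dot escaped: matches /page.html only.
RewriteRule ^/page\.html$ /new-page [R=301,L]
```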
Deeper Down the Rabbit Hole
[QSA] flag is for when you don’t want query string params dropped (like when you want a tracking param preserved)
[L] flag saves on server processing
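A small illustration of both flags, assuming a made-up product URL scheme and an "id" parameter. This one is an internal rewrite rather than a redirect, since that is where QSA matters most.

```apache
# .htaccess context. The paths and the "id" parameter are hypothetical.
RewriteEngine On

# The substitution carries its own query string, which would normally replace
# the one on the request. QSA appends the original instead, so
# /product/42?utm_source=news is served by /catalog.php?id=42&utm_source=news.
# L stops this pass from running any rules further down the file.
RewriteRule ^product/([0-9]+)$ /catalog.php?id=$1 [QSA,L]
```

When the substitution has no query string of its own, the original one is passed along anyway and QSA isn't needed.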
Speaking of Tracking Parameters
Here’s how to 301 static URLs with a tracking parameter appended to its canonical equivalent.
Um, there were examples for all of this but there’s no way in the world I can write them down. There are too many weird characters. Seriously. I’m going to cry on Rae’s laptop in a minute. Stephan is trying to kill us all.
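For what it's worth, the general shape of a tracking-parameter redirect looks something like the following. This is not Stephan's slide; "source" is an invented parameter name.

```apache
# .htaccess context. "source" is a hypothetical tracking parameter.
RewriteEngine On

# Only act when the query string is exactly the tracking parameter.
RewriteCond %{QUERY_STRING} ^source=[A-Za-z0-9_-]+$
# The trailing ? on the substitution drops the query string from the target,
# so /page.html?source=newsletter is 301'd to /page.html.
RewriteRule ^(.*)$ /$1? [R=301,L]
```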
Conditional Redirects are bad, but here’s how to do them. Or something.
Selectively redirect bots that request URLs with session IDs to the URL sans session ID:
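# Server-config (httpd.conf) context. PHPSESSID is an assumed session parameter name.
RewriteCond %{QUERY_STRING} PHPSESSID= [NC]
RewriteCond %{HTTP_USER_AGENT} Googlebot.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^msnbot.* [OR]
RewriteCond %{HTTP_USER_AGENT} Slurp [OR]
RewriteCond %{HTTP_USER_AGENT} Ask\ Jeeves
# The trailing ? drops the whole query string, session ID included.
# Without it, the rule redirects each bot straight back to the same URL.
RewriteRule ^/(.*)$ /$1? [R=301,L]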
Okay, I apologize, but you need to go read Stephan’s powerpoint for yourself. The livebloggers have given up. There is no way to record this for you all. We suck.
Oh wait, I can write this part:
Capture PageRank on dead pages
Traditional approach is to serve up a 404, which drops that obsolete URL out of the index, squandering that URL’s link juice. But what if you 301 redirect to something valuable and dynamically include a small error notice? OR return a 200 status code instead so that the spiders follow the links on the error page? Then, include a meta robots noindex so the error page itself doesn’t get indexed.
Whatever you do, don’t respond to garbage URLs with anything but a 404 status code.
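In config terms, the two halves of that advice might look like this. The page names and domain are placeholders.

```apache
# Hypothetical paths. A retired page with good backlinks gets a 301 to its
# closest living relative.
Redirect 301 /discontinued-widget.html http://www.example.com/widgets/

# Everything else that doesn't exist still gets a real 404.
# A local path here keeps the 404 status; a full http:// URL would make
# Apache send a redirect to the error page instead.
ErrorDocument 404 /not-found.html
```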
Next is Jordan. Maybe he’ll go easier on us.
Alex says she had to pull Jordan away from the bar and encourages the livebloggers to capture his slurs. Wow.
Types of 301 redirects
- mod_alias
- RedirectMatch
- mod_rewrite
- Header Code
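The first three types, each sending the same made-up page to the same made-up destination. You would pick one, not stack all three. The header-code variety lives in your application code rather than in Apache config, so it isn't shown here.

```apache
# Placeholder URLs throughout.

# mod_alias: plain path match
Redirect 301 /old.html http://www.example.com/new.html

# mod_alias: regular expression match
RedirectMatch 301 ^/old\.html$ http://www.example.com/new.html

# mod_rewrite (.htaccess context, so the pattern has no leading slash)
RewriteEngine On
RewriteRule ^old\.html$ http://www.example.com/new.html [R=301,L]
```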
Digg.com hates SEO. They hate search marketing firms. They hate any site with commercial intent. You want to submit your content on a news site or one with a noncommercial intent. As soon as something goes hot, redirect it to your site so that you get the links. [Is this ethical?]
Internal redirects: Informational sources get a lot of links. Once they start building up, you can take that content and move it to a URL. Then, redirect the informational page to somewhere else on your site. It’s kind of a bait and switch, so be careful. It’s considered a questionable tactic aka please feel free to ignore this advice from Jordan.
Whitehat 301 Social Media Strategy
Use a multi-part story strategy. Consolidate all the parts into one larger story. 301 redirect to transfer and consolidate all the inbound links into that single location.
Choose your Hook Wisely
Your hook is what’s getting people to link to your content. If it’s a news hook, you’re going to want to 301 redirect it elsewhere. If it’s a resource, it usually keeps gaining links over time so you don’t want to 301 it and do a bait and switch because you’re still building links.
Humor Hook: Wait for the links to stop coming before 301 redirecting.
Canonical Domain Issues: When people are linking to your content they’re going to link in different ways. They’re going to put a trailing slash on it. They’re going to forget the WWW. You want to consolidate that into one.
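A rough sketch of that consolidation, assuming you've decided on the www hostname and on URLs without a trailing slash. Both choices and the domain are assumptions for the example.

```apache
# .htaccess context. example.com is a placeholder.
RewriteEngine On

# Missing www -> www, keeping the path.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^ http://www.example.com%{REQUEST_URI} [R=301,L]

# Trailing slash -> no trailing slash, except for real directories,
# which Apache wants to end in a slash.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ /$1 [R=301,L]
```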
Next up is Jonah. Hopefully he’ll be slow. My head may explode. For reals.
In 2005, duplicate content became a bad thing. In 2006 we started talking about canonicalization.
Redirects come in many flavors: 301, 302, 303, 307.
He first heard of 302 as a way to steal Web traffic. It does have some other uses.
Display URL
- Display a user-friendly URL while the content lives deep within the information structure.
Geo Redirects
- Geo-local redirection without changing the ranking for the underlying page.
Rapidly Changing Offers
- Sales, seasonal products, event-related listings. The base page accumulates PageRank; users and spiders receive the updated section.
Bringing a Microsite back
- When some of your related content lives in a subdomain and you want to maintain your information hierarchy.
Getting around Roadblocks
- When you are calling legacy applications or have an application with parameters contained in the URL and you want to create search friendly URLs.
HTTP vs. HTTPS
- Avoid canonical confusion when the page or application is moved to the https layer. Users unlikely to link to https.
Pretty Affiliate Links
- When you are calling an affiliate link within the target page. Google quality raters have very hidden instructions about hidden redirects, so use this with caution.
307 Might Be The Right Answer
Google supports 307 and treats it differently than a 302.
In practice a 302 defaults to 303 behavior, where the client follows up with a GET, while a 307 keeps the original method, so a POST stays a POST.
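In Apache config the three temporary flavors look nearly identical; the difference is in how the client behaves afterward. The paths and domain below are placeholders.

```apache
# Hypothetical paths. mod_alias accepts a numeric status code.

# 302: temporary. Many clients re-request the target with GET even if the
# original request was a POST.
Redirect 302 /promo  http://www.example.com/current-promo

# 303: explicitly tells the client to fetch the target with GET.
Redirect 303 /thanks http://www.example.com/order-received

# 307: temporary, and the client must repeat the request with the same method.
Redirect 307 /submit http://www.example.com/forms/submit
```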
Okay, seriously. This session almost killed me. I need to find alcohol. I mean, um, dinner. Right. Dinner. See you kids tomorrow.