Time to get technical, technical. Time to get technical, technical… [dances]
Speaking we have Patrick Bennett, Arnab Bhattacharjee, Michael Gray and Maile Ohye. Vanessa Fox is moderating. She’s a superhero, if you didn’t know. Why is Michael Gray speaking on a technical panel? I’m so confused. [And am so getting kicked later. Just kidding, Michael.]
Maybe I should focus on the session? Okay, let’s go.
Heh. I just showed the entire room my green knee socks. Vanessa asked! It’s how I roll. And embarrass Rae publicly. Yey for knee socks!
Maile is up first.
Answers to Frames and iFrames questions
- Crawl: Googlebot can extract frame/iframe URLs discovered at crawl time. URLs do not require a separate link on the Web.
- Index: Frames often indexed with the parent page. iFrames are indexed as independent URLs.
The Web keeps getting bigger and bigger. There are different filtering processes for what gets crawled and indexed. On the crawl side, they discover unique content URLs, prioritize crawling new content and refresh updated content. Then they keep the good stuff in the index and return relevant search results.
Review Dynamic URLs
- Recognize by name/value pairs
- Often implemented with cookies to hide user details
Create patterns for crawlers to process efficiently.
Crawl most important content first and put the rest in a bucket to tackle later. They have algorithms to run over name/value pairs.
- Disallow the actions that Googlebot can’t perform (e.g., shopping carts). This will reduce the number of URLs they have to look at so they get to the good stuff.
- Disallow login pages
- Disallow “contact us” forms: Especially if each “contact us” form contains a unique URL.
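For the code-minded, here's a rough sketch of what those disallow rules might look like, checked with Python's standard urllib.robotparser. The paths (/cart/, /login, /contact-us) and the example.com host are made up for illustration; Maile didn't show specific rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; the paths are placeholders, not from the talk.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /login
Disallow: /contact-us
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
for path in ("/cart/add?item=42", "/login", "/products/blue-widget"):
    allowed = parser.can_fetch("Googlebot", "https://www.example.com" + path)
    print(path, "->", "crawl" if allowed else "skip")
```

Running a handful of real URLs through a check like this before publishing the file is a cheap way to make sure you haven't blocked the good stuff by accident.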
Avoid maverick coding practices
- Discourage alternative encodings.
- Eliminate positional encodings. Expand/collapse navigation can be infinite, limit to only one category expansion.
- Remove session IDs from path or position. This creates infinite crawl possibilities.
Create an intelligible URL structure
- Disallow bots from useless URLs
- Standardize mavericky behavior
- Consider using all lowercase URLs for safest implementation. Robots.txt is case sensitive, thus mixed case URLs can be a hassle.
- Implement dashes for word separation
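A minimal sketch of the lowercase-plus-dashes advice, assuming you build URL slugs from page titles. The slugify name and the sample title are mine, not Maile's.

```python
import re


def slugify(title):
    """Lowercase a title and join its alphanumeric words with dashes."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


print(slugify("Blue Widgets & Gadgets"))
```

Anything that isn't a letter or digit is treated as a word break here, which is a simplification; a real site may want to handle apostrophes and accented characters more carefully.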
Uncover Issues in CMS
- Join your local Web beautification program. Verify your site in Webmaster Tools for message center notifications for crawl space issues
- Help correct bugs in content management systems.
Patrick is up next.
SEO starts with the developer. Developers used to be able to get what they needed to design, code it, return it and everything was fine. Then the marketing team would apply SEO to it. That’s not the case anymore.
Developers and SUMIA (because the industry didn’t have enough acronyms…)
SUMIA: Sitemaps, URLs, Meta tags, Infrastructure, Analytics
He talks about a case study for a site called ShowRoom Logic. It’s his playground for trying out new things. They offer a product to used car dealerships to help them list their inventory online.
S: Sitemap - A sitemap is an easy way for webmasters to inform the engines about the pages on their sites. It doesn’t directly improve rankings but it does show Google all the pages you want indexed. You want both a human sitemap and an XML sitemap. Be proactive with your sitemap.
U: URLs
- Canonicalization: You need to decide whether you’re going to be www or non-www. Keep your structure the same throughout your entire Web site. Make sure all your internal links are set up the same way. Use redirects to protect yourself if people link to the wrong format.
- Structure: Use keywords and have consistency. Use hyphens, not spaces. Be conservative, use short URLs.
- Return Codes: Part of doing URLs properly is outputting the right HTTP status code in the header. The important return codes for SEO are:
- 200 OK
- 301 Moved Permanently
- 302 Moved Temporarily
- 404 Not Found
- Mod Rewrite: Redirect anything that’s not a real file, then you can handle it any way you want.
- 404 Pages: It shouldn’t be a soft 404. Send the 404 Not Found status in the header.
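To make the canonicalization and return-code points concrete, here's a small sketch. Patrick was talking about Apache's mod_rewrite; this is a stand-in written as a plain Python WSGI app, so the host name, the PAGES table and the call helper are all my assumptions. It 301s the non-canonical host to the www version and sends a real 404 status for unknown paths instead of a soft 404.

```python
CANONICAL_HOST = "www.example.com"  # hypothetical: the site chose the www form
PAGES = {"/": "Home", "/inventory": "Inventory"}  # hypothetical routes


def app(environ, start_response):
    """Tiny WSGI app: canonical-host 301, honest 404, otherwise 200."""
    host = environ.get("HTTP_HOST", CANONICAL_HOST)
    path = environ.get("PATH_INFO", "/")
    if host != CANONICAL_HOST:
        # Wrong host format: send a permanent redirect to the canonical URL.
        location = "http://" + CANONICAL_HOST + path
        start_response("301 Moved Permanently", [("Location", location)])
        return [b""]
    if path not in PAGES:
        # Friendly body, but the header still says 404 (no soft 404).
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Sorry, that page does not exist."]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [PAGES[path].encode("utf-8")]


def call(host, path):
    """Invoke the app directly and return (status, headers, body)."""
    seen = {}

    def start_response(status, headers):
        seen["status"] = status
        seen["headers"] = dict(headers)

    body = b"".join(app({"HTTP_HOST": host, "PATH_INFO": path}, start_response))
    return seen["status"], seen["headers"], body


print(call("example.com", "/inventory")[0])
print(call("www.example.com", "/no-such-page")[0])
```

The point is only that the status line matches what actually happened; how you route requests (mod_rewrite, a framework, whatever) is up to you.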
M: Meta and Title tags – Make sure you have different title tags on each page.
I: Infrastructure
Code – Keep it clean: XHTML/CSS. Use necessary tags for important content.
Links – simple/global navigation. Use nofollows when appropriate
Robots.txt – Be smart in choosing.
A: Analytics — You can’t live without analytics in SEO. It’s like taking a survey and not looking at the answers. If you’re not tracking your site, you won’t know if your campaign is working. SEO can’t live without analytics. Choose 1 or 2 and learn them.
- Be forward thinking
- Improve your value
- Contribute to the SEO campaign
- Periodic audits
Next is Arnab.
Thoreau’s statement of “Simplicity, Simplicity, Simplicity” also applies to site architecture.
Follow standards. Establish the bedrock of your Web site by following these simple constructs to become more crawlable and search engine friendly
- Follow W3 standards
- Be browser agnostic
- Consider semantic standards, such as MicroFormats and RDF
- Static HTML for key content, structure and navigation
- Use meaningful page titles
- Use simple clear and descriptive anchor text
- Good internal linking that is balanced
- Don’t link to spam
URLs: The Good and the Bad
Brevity is more. Use clean URLs without session IDs and with few query parameters. Your URLs should be easily readable and describe your content. Should scream COPY ME!
Avoid excessive redirection chains. Use stable URLs that people can refer to in their blogs, etc, that continue to work over time. Don’t lead to the same content pages with different URLs.
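One way to keep redirect chains short, sketched under the assumption that your redirects live in a simple old-URL-to-new-URL mapping. The function name and the sample paths are mine, not Arnab's.

```python
def collapse_redirects(redirects):
    """Point every old URL straight at its final destination.

    redirects maps an old path to the path it currently redirects to.
    Raises ValueError if the mapping contains a loop.
    """
    collapsed = {}
    for start in redirects:
        seen = {start}
        target = redirects[start]
        # Follow the chain until we reach a URL that no longer redirects.
        while target in redirects:
            if target in seen:
                raise ValueError("redirect loop involving " + target)
            seen.add(target)
            target = redirects[target]
        collapsed[start] = target
    return collapsed


chain = {"/old-page": "/newer-page", "/newer-page": "/current-page"}
print(collapse_redirects(chain))
```

After collapsing, every old URL needs only one hop to reach the page that actually serves content.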
Enhanced Presentations Through MicroFormats and RDF
Open Standards: Yahoo is committed to the Semantic Web. The more publishers commit to adding semantic data to their sites, the easier it is for Yahoo to create metadata-aware improvements to search.
Easier Extraction: MicroFormats represent standardized vocabularies providing semantic structure.
Crawl-able Architecture: Discovery
Improve crawler discovery by leveraging sitemaps. Inform the search engines about pages on your sites that are available for crawling. Provides page importance, update date.
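Here's a minimal sketch of generating an XML sitemap that carries the update date and page importance, using Python's standard library. The URLs, dates and priorities are invented; the element names (urlset, url, loc, lastmod, priority) come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (loc, lastmod, priority) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, priority in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod  # date as YYYY-MM-DD
        ET.SubElement(url, "priority").text = "%.1f" % priority  # 0.0 to 1.0
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages for illustration only.
print(build_sitemap([
    ("http://www.example.com/", "2008-08-20", 1.0),
    ("http://www.example.com/about", "2008-07-01", 0.5),
]))
```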
Robots.txt/meta tag exclusion: Use robots exclusion only if required and fully understood. Allow the most frequented/important areas of the site. Protect private/restricted-access areas and your duplicates. Disallow nonessential areas of your Web site.
Last up is Michael. I think he’s going to give a case study. That’s what his slides say, anyway.
The first thing Michael did was put analytics on the site because they weren’t using any. They found they were getting 200–500 unique visitors a day. They had 250,000 URLs and Google knew of 10,000. Ouch.
They had a wacky URL structure. Lots of session ID numbers and advertising parameters. They were confusing the search engine spiders.
What they did: Stripped all their URLs down. 301 redirected all old URLs. Anything that had a parameter was 301′d and got a cookie. It’s not a perfect solution, but it’s better.
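A rough sketch of that "strip it, 301 it, cookie it" approach, assuming the parameters arrive in the query string. Michael didn't show code; the function name, parameter names and URL here are hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit, urlunsplit


def plan_response(url):
    """Decide how to answer a request: returns (status, location, cookies).

    A URL with any query parameters gets a 301 to the bare URL, and the
    parameter values are handed back so they can be stored in a cookie.
    """
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    if not params:
        return 200, url, {}
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return 301, clean, params


print(plan_response("http://www.example.com/listing/joes-diner?sid=abc123&ad=banner7"))
```

The spiders end up seeing one clean URL per page, while the session and advertising details ride along in the cookie for real visitors.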
The site’s page titles were originally using the format:
Examples.com – For All your Widges Info – Joe’s Diner
It wasn’t recommended. You want to put the important information first. Make it easy for the engines to find out what’s important.
Existing sitemaps were numbered sequentially. Bad. One way through the site and new pages were at the end. The engines couldn’t make it all the way through, which is why only 10,000 pages had been indexed.
What they did: They set up mini sitemaps organized by category, location.
Interconnect the Web site
Now that they had mini sitemaps, they could use breadcrumbs and related page links to help spiders find new pages. Score!
Focus Internal Anchor Text
The majority of on-site links were “click here” and “more information”. They, um, changed that to not suck.
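If you want to hunt down those links on your own site, here's a small sketch using Python's built-in html.parser. The two phrases in GENERIC are the ones Michael called out; the class name, function name and sample HTML are my own assumptions.

```python
from html.parser import HTMLParser

GENERIC = {"click here", "more information"}


class AnchorAudit(HTMLParser):
    """Collect (href, text) for links whose anchor text is generic."""

    def __init__(self):
        super().__init__()
        self._href = None
        self._text = []
        self.generic_links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse whitespace before comparing against the generic list.
            text = " ".join("".join(self._text).split())
            if text.lower() in GENERIC:
                self.generic_links.append((self._href, text))
            self._href = None


def find_generic_links(html):
    audit = AnchorAudit()
    audit.feed(html)
    audit.close()
    return audit.generic_links


sample = '<p><a href="/joes-diner">Click here</a> or see <a href="/menus">Diner menus</a>.</p>'
print(find_generic_links(sample))
```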
Consolidate Pages and Focus Link Equity
Many listings had more than one page of data. A lot of nonessential pages with content that was duplicated. Bad because it created extra pages and lots of duplicate content.
Put all the info on one page behind CSS tabs. Made it easy for the search engines to figure out and reduced the number of pages dramatically. Helped them to consolidate their link equity.
Reduce Page Bloat
What they did: They moved the CSS into remote files and split the JS up so that the pages were as light as they could be. It brought the content up to the top of the page to help the engines see what it was about.
As soon as they rolled out the changes, traffic dropped to near zero! It takes a while for the search engines to figure it out. This is important to know to help you manage expectations.
After some time, they were up to 5,000 uniques a day. Then 60,000 uniques. Then 80,000 uniques. Right now they’re up to 90,000 uniques a day, just by doing the changes above.
- Make your Web site as easy as possible to crawl
- Make your URLs as friendly as you can
- Titles and Content closer to the top and easy to understand
- Follow basic white hat SEO guidelines. You don’t need black hat tricks. [Aw, Matt Cutts would be so proud!]
Holy cow, that was a lot of information. If I messed up any of that, I invite the speakers to come correct me. I’m pretty; not technical. That’s where Rae and Rhea come in. When they’re not beating me, of course.