Technical SEO Issues for Developers

Time to get technical, technical. Time to get technical, technical… [dances]

Sorry.

Speaking, we have Patrick Bennett, Arnab Bhattacharjee, Michael Gray and Maile Ohye. Vanessa Fox is moderating. She’s a superhero, if you didn’t know. Why is Michael Gray speaking on a technical panel? I’m so confused. [And am so getting kicked later. Just kidding, Michael.]

Maybe I should focus on the session? Okay, let’s go.

Heh. I just showed the entire room my green knee socks. Vanessa asked! It’s how I roll. And embarrass Rae publicly.  Yey for knee socks!

Maile is up first.

Answering Frames and iFrames questions

  • Crawl: Googlebot can extract frame/iframe URLs discovered at crawl time. URLs do not require a separate link on the Web.
  • Index: Frames often indexed with the parent page. iFrames are indexed as independent URLs.

The Web keeps getting bigger and bigger. There are different filtering processes for what gets crawled and indexed. On the crawl side, they discover unique content URLs, prioritize crawling new content, and refresh updated content. Keep the good stuff in the index. Return relevant search results.

Review Dynamic URLs

  • Recognize by name/value pairs
  • Often implemented with cookies to hide user details

Create patterns for crawlers to process efficiently.

Crawl most important content first and put the rest in a bucket to tackle later. They have algorithms to run over name/value pairs.

  • Disallow the actions that Googlebot can’t perform (i.e., shopping carts). This reduces the number of URLs they have to look at so they get to the good stuff.
  • Disallow login pages
  • Disallow “contact us” forms: Especially if each “contact us” form contains a unique URL.
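Maile didn’t show code for this, but a robots.txt along these lines would cover those three bullets (the paths here are hypothetical — swap in whatever your cart, login, and contact URLs actually are):

```
User-agent: *
Disallow: /cart/
Disallow: /checkout/
Disallow: /login/
Disallow: /contact/
```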

Avoid maverick coding practices

  • Discourage alternative encodings.
  • Eliminate positional encodings. Expand/collapse navigation can be infinite; limit it to only one category expansion.
  • Remove session IDs from the path. They create infinite crawl possibilities.

Create an intelligible URL structure

  • Disallow bots from useless URLs
  • Standardize mavericky behavior
  • Consider using all lowercase URLs for safest implementation. Robots.txt is case sensitive, thus mixed case URLs can be a hassle.
  • Implement dashes for word separation
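None of this appeared as code in the session, but here’s a quick sketch of the lowercase-plus-dashes advice as a Python helper (the function name and examples are mine, not Maile’s):

```python
import re

def slugify(title):
    """Lowercase a title and separate words with dashes, per the
    all-lowercase, dash-separated URL advice above."""
    slug = title.lower()
    # Collapse anything that isn't a lowercase letter or digit into a dash
    slug = re.sub(r"[^a-z0-9]+", "-", slug)
    return slug.strip("-")

print(slugify("Joe's Diner Widget Info"))  # joe-s-diner-widget-info
```

Since robots.txt matching is case sensitive, generating every URL through one helper like this keeps you from ever having to disallow MixedCase duplicates.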

Uncover Issues in CMS

  • Join your local Web beautification program. Verify your site in Webmaster Tools to receive Message Center notifications about crawl space issues.
  • Help correct bugs in content management systems.

Patrick is up next.

SEO starts with the developer. Developers used to be able to take a design, code it, hand it back and everything was fine. Then the marketing team would apply SEO to it. That’s not the case anymore.

Developers and SUMIA (because the industry didn’t have enough acronyms…)

SUMIA: Sitemaps, URLs, Meta tags, Infrastructure, Analytics

He talks about a case study for a site called ShowRoom Logic. It’s his playground for trying out new things. They offer a product to used car dealerships to help them list their inventory online.

S: Sitemap - A sitemap is an easy way for webmasters to inform the engines about the pages on their sites. It doesn’t directly improve rankings but it does show Google all the pages you want indexed. You want both a human sitemap and an XML sitemap.  Be proactive with your sitemap.
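For reference, a minimal XML sitemap looks something like this (the URL and values are made up, not from Patrick’s deck):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/inventory/used-honda-civic</loc>
    <lastmod>2009-03-10</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
```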

U: URLs

  • Canonicalization: You need to decide if you’re going to be www or non www. You want to keep your structure the same through your entire Web site. Make sure all your internal links are set up the same.  Work with redirects to protect you if people link from the wrong format.
  • Structure: Use keywords and have consistency. Use hyphens, not spaces. Be conservative, use short URLs.
  • Return Codes: Part of doing URLs properly is outputting the right HTTP status code in the header. The important return codes for SEO are:
    • 200 OK
    • 301 Moved Permanently
    • 302 Moved Temporarily
    • 404 Not Found
  • Mod Rewrite: Redirect anything that’s not a real file, then you can handle it any way you want.
  • 404 Pages: They shouldn’t be soft 404s. Send the actual 404 Not Found header.
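Patrick didn’t share his actual rules, but a sketch of the canonicalization and mod_rewrite points might look like this .htaccess (Apache with mod_rewrite assumed; the domain is a placeholder — test before deploying):

```apache
RewriteEngine On

# Canonicalize: 301 the non-www host to www so link equity isn't split
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# Route anything that isn't a real file or directory to a front controller,
# which can then emit 200, 301, or a true "404 Not Found" header as appropriate
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ index.php [L]
```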

M: Meta and Title tags – Make sure you have different title tags on each page.

I: Infrastructure:

Code – Keep it clean: XHTML/CSS. Use necessary tags for important content.

Links – simple/global navigation. Use nofollows when appropriate

Robots.txt – Be smart in choosing.

A: Analytics — You can’t live without analytics in SEO. It’s like taking a survey and not looking at the answers. If you’re not tracking your site, you won’t know if your campaign is working. SEO can’t live without analytics. Choose 1 or 2 and learn them.

SUMIA Recap

  • Be forward thinking
  • Improve your value
  • Contribute to the SEO campaign
  • Periodic audits

Next is Arnab.

Thoreau’s statement of “Simplicity, Simplicity, Simplicity” also applies to site architecture.

Crawlable architecture

Follow standards. Establish the bedrock of your Web site by following these simple constructs to become more crawlable and search engine friendly.

HTML Standards

  • Follow W3 standards
  • Be browser agnostic
  • Consider semantic standards, such as MicroFormats and RDF

Content:

  • Static HTML for key content, structure and navigation
  • Use meaningful page titles
  • Use simple clear and descriptive anchor text
  • Good internal linking that is balanced
  • Don’t link to spam

URLs: The Good and the Bad

Brevity is key. Use clean URLs without session IDs and with few query parameters. Your URLs should be easily readable, describe your content, and scream COPY ME!

Avoid excessive redirect chains. Use stable URLs that people can refer to in their blogs, etc., and that continue to work over time. Don’t serve the same content at multiple URLs.

Crawlable Architecture

Evolve into more elaborate schemes leveraging Flash, video, AJAX and JavaScript only if needed. When using Flash or JS, create text alternatives for crawlers. If you’re using AJAX, ensure that each page is bookmarkable with a unique URL. If you’re using JS for navigation or menu items, have complementary static links available on the same page.
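A tiny illustration of that last point (hypothetical markup, not from Arnab’s slides): the script can enhance the menu however it likes, but the plain links stay in the HTML for crawlers.

```html
<!-- Static, crawlable links ship in the page itself -->
<ul id="nav">
  <li><a href="/widgets/">Widgets</a></li>
  <li><a href="/about/">About</a></li>
</ul>
<script>
  // Progressive enhancement: add dropdown/menu behavior here;
  // crawlers that skip JS still see the hrefs above.
</script>
```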

Enhanced Presentations Through MicroFormats and RDF

Open Standards: Yahoo is committed to the Semantic Web. The more publishers commit to adding semantic data to their sites, the easier it is for Yahoo to create metadata-aware improvements to search.

Easier Extraction: MicroFormats represent standardized vocabularies providing semantic structure.
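As a made-up example, an hCard microformat is just ordinary HTML carrying the standardized class names (the business and address here are invented):

```html
<!-- hCard: semantic class names layered on plain markup -->
<div class="vcard">
  <span class="fn">Joe's Diner</span>
  <span class="adr">
    <span class="locality">Albany</span>,
    <span class="region">NY</span>
  </span>
</div>
```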

Crawlable Architecture: Discovery

Improve crawler discovery by leveraging sitemaps. Inform the search engines about pages on your sites that are available for crawling. Sitemaps also convey page importance and last-update dates.

Robots.txt/meta tag exclusion: use these only if required and fully understood. Allow the most frequented/important areas of the site, protect private/restricted areas and your duplicates, and disallow nonessential areas of your Web site.

Last up is Michael. I think he’s going to give a case study. That’s what his slides say, anyway.

The first thing Michael did was put analytics on the site because they weren’t using any. They found they were getting 200–500 unique visitors a day. They had 250,000 URLs and Google knew of 10,000. Ouch.

URLs

They had a wacky URL structure. Lots of session ID numbers and advertising parameters. They were confusing the search engine spiders.

What they did: Stripped all their URLs down and 301 redirected all the old URLs. Anything that had a parameter was 301’d and got a cookie. It’s not a perfect solution, but it’s better.

Page Titles

The site was originally using the format:

Examples.com – For All Your Widgets Info – Joe’s Diner

Not recommended. You want to put the important information first (e.g., “Joe’s Diner – Widget Info – Examples.com”) to make it easy for the engines to find out what’s important.

Sitemaps

Existing sitemaps were numbered sequentially. Bad. There was only one path through the site, and new pages sat at the end. The engines couldn’t make it all the way through, which is why only 10,000 pages had been indexed.

What they did: They set up mini sitemaps organized by category and location.

Interconnect the Web site

Now that they had mini sitemaps, they could use breadcrumbs and related page links to help spiders find new pages. Score!

Focus Internal Anchor Text

The majority of on-site links were “click here” and “more information”. They, um, changed that to not suck.

Consolidate Pages and Focus Link Equity

Many listings had more than one page of data. A lot of nonessential pages with content that was duplicated. Bad because it created extra pages and lots of duplicate content.

Put all the info on one page behind CSS tabs. Made it easy for the search engines to figure out and reduced the number of pages dramatically. Helped them to consolidate their link equity.
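Roughly what that looks like (a sketch, not Michael’s actual markup): every panel ships in the HTML, and CSS/JS only toggles which one is visible.

```html
<!-- All panels are in the document (crawlable); styling hides the inactive ones -->
<ul class="tabs">
  <li><a href="#overview">Overview</a></li>
  <li><a href="#specs">Specs</a></li>
</ul>
<div id="overview" class="panel">Overview content here…</div>
<div id="specs" class="panel" style="display:none">Spec content here…</div>
```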

Reduce Page Bloat

Site had all their CSS and JavaScript up at the top.

What they did: They moved the CSS into external files and split the JS up so that the pages were as light as they could be. It brought the content up to the top of the page to help the engines see what it was about.

As soon as they rolled out the changes, traffic dropped to near zero! It takes a while for the search engines to figure it all out. This is important to know to help you manage expectations.

After some time, they were up to 5,000 uniques a day. Then 60,000 uniques. Then 80,000 uniques. Right now they’re up to 90,000 uniques a day, just by doing the changes above.

Actionable Items

  • Make your Web site as easy as possible to crawl
  • Make your URLs as friendly as you can
  • Titles and Content closer to the top and easy to understand
  • Follow basic white hat SEO guidelines. You don’t need black hat tricks. [Aw, Matt Cutts would be so proud!]

Holy cow, that was a lot of information. If I messed up any of that, I invite the speakers to come correct me. I’m pretty; not technical. That’s where Rae and Rhea come in. When they’re not beating me, of course.

Lunch!

About the Author

Lisa Barone

Lisa Barone co-founded Outspoken Media in 2009 and served as Chief Branding Officer until April 2012.

Get social with Lisa at Twitter

9 thoughts on “Technical SEO Issues for Developers”

  1. Thanks Lisa! This is a great way to keep up with the conference without being there. Seems to me like you understood quite well what they were talking about. Much appreciated!
    @jennita

  2. Jen: I can guarantee you that I understood very little. Technical SEO is not my strong point. Being a smart ass is. Thanks for the comment, though. :)

  3. Yes, many thx to Lisa for the live-blogging, and to @jennita for letting me know about it! Really appreciate the efforts of both you ladies :)

    That said, I have a few questions:
    1) What are CSS tabs? (I am on the marketing, not dev, side of SEO and am just starting to learn CSS.)
    2) What are MicroFormats and RDF?
    3) Why/how the knee sock obsession?

    @denverish

  4. Christy,

    See:
    Re: css tabs http://htmldog.com/articles/tabs/

    Microformat is an approach to semantic markup that seeks to re-use existing xhtml tags to denote certain attributes that can be later scanned/used by software such as location, calendar data etc. It’s a way of making more with existing data in a way. Same with RDF, it makes more clear code by using existing code snippets.

    Re: knee sock obsession I wouldn’t know other than that they look cute!

    Thanks for the cover Lisa, even if we’re in the same room

  5. Christy, CSS Tabs are UI/interface “devices” that allow a developer/designer to provide multiple “panels” of content on a page, without having to refresh the whole page (example: http://stilbuero.de/jquery/tabs_3/).

    Using Tabs, we can output content into an HTML document (good for crawlers), but using the CSS and JavaScript capabilities that browsers give us, we can control the experience the user has with that content and show them a little at a time.

    A tab is obviously a visual device, you can use the same technique using other visual devices, but the key is that the content is within the HTML document itself when the server delivers it back to the browser.

    Microformats are HTML conventions/patterns developers can use behind the scenes of a page to give the content more meaning. (http://microformats.org/wiki/what-are-microformats)

  6. Christy, I am Pat and here are some answers to those questions…

    What are CSS tabs?
    These are tabs made via css rather than images, some examples here:
    http://unraveled.com/projects/css_tabs/

    What are MicroFormats and RDF?
    Microformats – These are code that basically can be read by humans and machines, a great description can be found here..
    http://www.smashingmagazine.com/2007/05/04/microformats-what-they-are-and-how-to-use-them/

    RDF – stands for Resource Description Framework:
    http://www.xml.com/pub/a/2001/01/24/rdf.html

    Knee sock obsession:
    Lisa will have to answer that, but me and Lisa have a website kneesockz.com where she has photos of her in all of her knee socks, as in dozens and dozens of knee socks, to learn more about her obsession I suggest:
    http://alloveralbany.com/archive/2009/01/23/lisa-barone-is-a-raging-knee-sock-fanatic
    or you can search Google for “sock fanatic” and it will take you to the same article about her :) (really)

  7. Lisa: Funny! I wonder what that makes me? Technical SEO is my strong point AND I’m a smart ass. But you have mad blogging skills. I tried at Pubcon and utterly failed! Thanks again, looking forward to a week of great info from you.
