The Great Correlation vs Causation Debate
by Lisa Barone on 09/13/2011
Welcome back, friends. Do you have Internet where you are? Yes? Fantastic. Could you send some of it my way because I gotz none. Which, you know, would be fine if my job didn’t depend on it. Stupid MiFis crapping out.
Oh, there’s a conference going on. That’s what you’re here to get updated on. Right. About that.
Up first is Kristine. Danny says when he first mentioned this panel, Kristine tweeted that she wanted to be on it. He told her to pitch, she did and she got it. Danny says, see, Twitter led to a conversion. Heh.
Kristine says she’s frazzled. She broke her phone and then found out it wasn’t a 10-minute ride from her hotel to the conference but a 30-minute one. Ouch. Been there. [Later in the session she’ll knock her phone off the stage and it will again shatter on the floor. Someone’s not having the best day.]
Correlation is: A mutual relationship between two or more things where one has a measurable effect on the other.
Causation is: A relationship in which one action or event is the direct consequence of another.
In every causal relationship there is a correlative one, but there is not a causal effect in every correlative one. Sure.
So if I hit a ball with a tennis racket, the force exerted by the tennis racket has a definitive, measurable CAUSAL effect on the ball. However, if I don’t eat breakfast before the game there is a CORRELATIVE, indirect, associative effect on my game. Only the causal one is directly measurable.
Often correlation is mistaken for causation. Why? Perception. A is perceived to cause B by the observer when in actuality it may or may NOT have a measurable effect. Perception is like an optical illusion.
[You’re not getting this at home, but there have already been four hungover references in the first five minutes of this presentation. Not that SHE’S hungover (she isn’t), she’s just from Vegas so it’s her go-to analogy.]
In SEO terms, we could make assumptions based on old ideas or incorrect perception. For example, the effect of site code on rankings, before Google’s site speed announcements.
A Spurious Correlation is the perception that A affects B, but in actuality a third hidden variable does. For example, you may think it’s because SEOs are here that alcohol sales are going up in NYC. However, that’s because you don’t know that Fashion Week is also going on right now. You didn’t know that variable existed so you ignored it. [Also, yeah, I didn’t know Fashion Week was in town until I tried to GET A CAB at Penn Station. That took forever. Lovely.]
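To make the hidden-variable idea concrete, here is a minimal Python sketch. Everything in it is made up for illustration and none of it comes from the panel: a hypothetical `city_busy` variable drives both a hypothetical `seo_attendance` series and a hypothetical `alcohol_sales` series. The two observed series correlate strongly, but the correlation mostly disappears once the hidden variable is controlled for with a partial correlation.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(a, b, z):
    """Correlation of a and b after removing the linear effect of z."""
    r_ab, r_az, r_bz = pearson(a, b), pearson(a, z), pearson(b, z)
    return (r_ab - r_az * r_bz) / math.sqrt((1 - r_az ** 2) * (1 - r_bz ** 2))

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical hidden variable: how busy the city is on a given day.
city_busy = [rng.gauss(0, 1) for _ in range(2000)]

# Both observed series depend on the hidden variable, not on each other.
seo_attendance = [z + rng.gauss(0, 0.5) for z in city_busy]
alcohol_sales = [z + rng.gauss(0, 0.5) for z in city_busy]

r_naive = pearson(seo_attendance, alcohol_sales)
r_controlled = partial_corr(seo_attendance, alcohol_sales, city_busy)

print(f"naive correlation: {r_naive:.2f}")
print(f"controlling for the hidden variable: {r_controlled:.2f}")
```

The catch, of course, is that you can only control for a variable you know exists.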
In a causal relationship, you can directly measure that A does CAUSE B. In correlative relationships, you can only measure the strength of the effect of A on B.
What is a confidence interval?
There are many ways to show correlative strength, but the most common is the confidence interval, more often seen as the plus-or-minus margin. It comes with a confidence level. This tells you that there is a relationship. The relationship is correlative: X is affecting Y.
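One common way to put a plus-or-minus range around a correlation coefficient is the Fisher z-transformation. The sketch below is an illustration, not something shown in the session; the function name and the example values (r = 0.5 from 103 observations) are assumptions.

```python
import math

def correlation_confidence_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for a Pearson r (Fisher z-transform).

    z_crit=1.96 corresponds to a 95% confidence level.
    """
    if n <= 3:
        raise ValueError("need more than 3 observations")
    z = math.atanh(r)            # transform r to an approximately normal scale
    se = 1 / math.sqrt(n - 3)    # standard error on that scale
    low = math.tanh(z - z_crit * se)
    high = math.tanh(z + z_crit * se)
    return low, high

low, high = correlation_confidence_interval(r=0.5, n=103)
print(f"r = 0.50, 95% interval: {low:.2f} to {high:.2f}")
```

The interval narrows as the number of observations grows, which is why the same r means much less when it comes from a handful of data points.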
This is just the tip of the iceberg. Because whenever you measure something you have to do testing. You have to know:
- your variable types – random, snowball, etc.
- What is your sample size – is it large enough to normalize your data? Do you need it to?
- Are you controlling for outliers?
- Are you choosing the proper analysis method?
- If you are not controlling these, are you able to understand the limitations of your conclusions?
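On the outlier question, a small sketch shows how much a single extreme point can matter. The numbers are invented for illustration: eight points with no linear relationship at all, then the same eight points plus one outlier.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data with no linear relationship between x and y.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [5, 3, 6, 4, 5, 3, 6, 4]
r_clean = pearson(x, y)

# The same data plus a single extreme observation.
r_with_outlier = pearson(x + [50], y + [50])

print(f"without the outlier: r = {r_clean:.2f}")
print(f"with one outlier:    r = {r_with_outlier:.2f}")
```

One point is enough to turn "no relationship" into what looks like a very strong one, so it is worth checking what a correlation looks like with and without the extremes.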
SEO is mostly correlative. While we can measure the effect of variable A on variable B, the environment dictates that there are a multitude of variables that are either possibly unknown or their effect is unknown.
Next up is Micah.
How to tell good versus bad presentations? Correlation should be shown with scatter plots. They visualize what you’re trying to present. Just saying that there’s a correlation out there doesn’t tell you that it matters. Scatter plots help you present the data.
He says to avoid integer numbers. Generally integers make it harder to visualize correlations.
Correlations are a place to start, not to end. You want to take your correlation and sanity check it with your SEO. Ask yourself, does this make sense? What if we factor for X?
When to go Linear
- Amount of effort: Do you have enough data to prove it statistically significant? How long will collection take?
- Quality data: Do you trust where the data comes from? Do you have enough metrics that factor into the algorithm?
- Facebook Shares: Enough data and factors to run a linear regression
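For reference, a linear regression in its simplest form fits a straight line through paired observations. Here is a minimal sketch in plain Python. The share counts and the visibility score are made up and are not Micah’s data; `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    # R squared: share of the variation in y explained by the line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical data: Facebook shares per page and some visibility score.
shares = [10, 25, 40, 60, 80, 120]
visibility = [12, 18, 25, 30, 41, 55]

slope, intercept, r_squared = fit_line(shares, visibility)
print(f"visibility ~ {intercept:.1f} + {slope:.2f} * shares (R^2 = {r_squared:.2f})")
```

A good fit still only describes the relationship in the data you collected. It does not say which way the effect runs, or whether a third factor drives both.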
Panda and Average Time on Site
Does the correlation hold up by site type? Are sites being punished the same way?
Testing and Understanding Your Data
You’re going to look at things day over day. But are things improving week over week? You have to account for where your data is, what time of year it is, and what’s going on. You need to be able to break the site apart. Find ways to split your site in half along standard site architecture to run tests. Don’t have these? Use numbers in URL or even/odd.
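The even/odd trick can be sketched in a few lines of Python. This is one possible reading of the advice, not code from the session: `assign_group` is a hypothetical helper that uses the last number in a URL, and the hash fallback for URLs without any number is my own assumption.

```python
import hashlib
import re

def assign_group(url):
    """Put a URL in 'test' or 'control' based on the last number it contains.

    Even numbers go to 'test', odd numbers to 'control'. URLs without a
    number fall back to a stable hash of the URL (an assumption, so that
    every page still lands in exactly one group).
    """
    numbers = re.findall(r"\d+", url)
    if numbers:
        key = int(numbers[-1])
    else:
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        key = int(digest, 16)
    return "test" if key % 2 == 0 else "control"

for url in [
    "https://example.com/products/1234",
    "https://example.com/products/1235",
    "https://example.com/about",
]:
    print(url, "->", assign_group(url))
```

Because the assignment depends only on the URL, a page stays in the same group for the whole test.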
Test and control. Randomize what your data sets are to avoid bias. Two controls are better than one.
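A minimal sketch of random assignment with one test group and two controls, assuming you have a list of page paths to work with. The function name, group names, and page paths are hypothetical.

```python
import random

def split_test_and_controls(pages, seed=0):
    """Randomly split pages into one test group and two control groups."""
    shuffled = list(pages)
    # A seeded generator keeps the assignment reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    return {
        "test": shuffled[0::3],
        "control_a": shuffled[1::3],
        "control_b": shuffled[2::3],
    }

pages = [f"/page/{i}" for i in range(10)]
groups = split_test_and_controls(pages)
for name, members in groups.items():
    print(name, members)
```

With two controls, you can compare the controls against each other first. If two groups that received no change already differ noticeably, any difference seen in the test group deserves more suspicion.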
Length of time: The smaller the data set, the longer the test has to run.
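As a rough illustration of why less data means a longer test, the sketch below uses the standard sample-size approximation for comparing two conversion rates. The conversion rates, traffic figures, and function names are assumptions, as are the defaults of roughly 95% confidence (1.96) and 80% power (0.84).

```python
import math

def visits_needed_per_group(p_control, p_test, z_alpha=1.96, z_beta=0.84):
    """Approximate visits per group needed to detect p_control vs p_test."""
    variance = p_control * (1 - p_control) + p_test * (1 - p_test)
    effect = (p_control - p_test) ** 2
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect)

def days_to_run(daily_visits, p_control, p_test):
    """Days needed if daily visits are split evenly across two groups."""
    per_group = visits_needed_per_group(p_control, p_test)
    return math.ceil(2 * per_group / daily_visits)

# Hypothetical scenario: detecting a lift from a 2.0% to a 2.5% conversion rate.
for daily_visits in (200, 2000, 20000):
    days = days_to_run(daily_visits, 0.02, 0.025)
    print(f"{daily_visits} visits/day -> about {days} days")
```

The required sample stays the same, so a site with a tenth of the traffic needs about ten times as long to collect it.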
Common Correlating Pitfalls
- Other marketing channels – did your brand team launch something during the same time period? Did your UX group modify the layout?
- Extraneous online events – Did Google update? Was there a change in how your analytics tracks events? Did something break? Did you launch a change that affected your own tests?
- Various Online Events: Was there a holiday that skewed week over week data?
- Did a world event happen?
Question everything. Don’t accept what’s out there. Drill deep to help amp up the SEO.
Next up is Mitul to share some thoughts.
He runs a platform for enterprise SEOs. They look at what SEOs can actually do with this information. There are three things to worry about when it comes to correlation vs causation.
Measurement: How do you measure and report on correlation vs causation? When you are measuring something, make sure you are looking at it with ALL the variables in mind. You don’t want to take your traffic increase in September and attribute it to something else. Know EVERYTHING that could be impacting your data.
When you’re measuring something, don’t look at it as just a snapshot in time. He did catalogs for years. You send a catalog out, you hope and pray for results to come back, and then you sense the temperature in NY must have been nice because your response rate there was good. That’s how SEO is managed right now. You take snapshots in time and then try to tie it back to things you’ve been doing for months and months. SEO is on a timeline.
When you’re looking at any of your data, make sure that there really is a cause and effect. When looking at measurement, TEST. A single event does not a trend make. It has to be repeatable. You have to be able to take it, repeat it, and get similar results back.
Tony Wright is up next to chat. He says he’s from Dallas, TX and today he’ll be playing The Curmudgeon.
He says the clients most affected by Panda were the ones looking for magic bullets. He’s going to give us a secret about how to do SEO. Ready? This is all you need and then you can stop coming to conferences.
- Connections [Connections are links]
- Conversations [social media]
Do these things BETTER than your competitors and you will rank higher than them. Don’t fixate on the causation, just do it.
Next up is Eric Enge.
He’s in the camp that it just doesn’t matter. He calls himself The Curmudgeon, 2.
You don’t have time to change one variable and wait six months before changing anything else. You have a business to run.
- Does correlation cause causation?
- Do correlations cause traffic?
- Aren’t the signals just going to change anyway? [Yes. They are.]
Correlation and Causation DON’T MATTER. You want to build great stuff that causes both of those things to happen. Don’t follow every search trend you see. The basics have not changed in the past 6-7 years.
About the Author
Lisa Barone co-founded Outspoken Media in 2009 and served as Chief Branding Officer until April 2012.