Good morning, good morning, good morning! We’re coming to you live from SES New York 2011. Are you excited? Just a little? You should be. This is gonna be good. If you’re attending the conference from home I’ll encourage you to head over and check out our 2011 SES NY schedule to see where we’ll be and what type of coverage you can expect from Outspoken Media while at the show.
Kicking things off we have Mike Grehan on stage giving us the morning announcements. He first apologizes for the odd microphone which he says is making him sound like he has a British accent. Oh, Mike….
Mike has been involved in search marketing since 1997. He’s written a few books on the subject. People say, if you know so much about search, can you get my site ranking well in Google? The question he always wants to ask is, “why”? Why should you be at number one? What’s so great about your Web site? Why is your site better than the guy next to you? Trying to get an answer to this question is so difficult. Before this session started he put a Post It note under someone’s chair that says “why”. That person with the Post It note has to get up and say why their site deserves to be number one. He asks everyone to look under their chairs.
He’s kidding. He didn’t really do that. I’ll spare Mike and won’t tell you how many times I’ve heard that joke. ;)
Mike talks about social media and how it’s like teenage sex. Everyone’s really excited about it but no one knows what they’re doing yet. Seriously, who wound up Grehan this morning? :)
On stage we have a keynote with Duncan Watts, Principal Research Scientist at Yahoo. Mike says he’s one of the smartest guys you’ll ever meet.
Social Media Science
In the 1940s Harold Lasswell laid out the essential problems of what became communication science: Who talks to whom about what, through which channel and with what effect?
Although easy to ask, Lasswell’s Maxim has proven difficult to answer.
- Measuring “who talks to whom” is hard at scale
- Measuring “who influences whom” is even harder
- Lots of channels to keep track of.
Our communication scientists have been unable to come to grips with this. Web 2.0 may finally bring the answer within reach because we can track data and communications.
Exp 1 (2001-2002)
1960s: Stanley Milgram and Jeffrey Travers designed the first “small world” experiment.
- A single target in Boston
- 300 initial senders in Boston and Omaha
- Each sender asked to forward a packet to a friend who was closer to the target.
- The friends got the same instructions
Protocol generated 300 “letter chains” of which 64 reached the target. They found that the typical chain was 6 people. This is what led to the famous “six degrees” phrase. [Hmm, who knew that wasn’t actually created by Kevin Bacon? Oh, mostly everyone? Never mind then.]
The Small World On The Web
2001-2002 – They decided to recreate Travers and Milgram’s experiment but using email/web server instead of physical packets.
They used 18 targets around the world. 24,163 chains passed through 61,168 hands in many, many different countries. The results mostly confirmed Milgram’s findings. It really is six degrees…or close anyway.
But they also learned something else. They managed to run an experiment with more than 60,000 participants, on a global scale, at virtually no cost.
What to do next?
Small-world experiment not really a “lab” experiment. Could we create a virtual lab on a Web scale? This was the beginning of what they call the “bored at work” discovery. People are sitting at work bored. If you can give them something to do that is vaguely entertaining and quiet (they’re in cubicles, after all), you can get them to do social science for you on a large scale. Heh! They thought this was fantastic and they needed to use this. I love that we’re now using people’s ADD and laziness against them. This really IS the future. Huzzah!
Exp 2 (2004-2005): Success in cultural markets
In market for books, music, etc. Could inequality and unpredictability be explained by social influence? Why is one movie a blockbuster when another is not? Is it based on quality or could it be manufactured?
- People rely on each other to determine quality
- People want to read/see/listen to same things as their friends.
Previously, support for hypothesis was purely theoretical. Could we measure the impact of social influence in an experiment?
Problem was that the experiment they wanted to run required tens of thousands of participants. Each market required hundreds of participants. Need to compare many markets, run multiple conditions. It’s impossible to run these sorts of experiments in a lab.
They created the lab online through Music Lab. They got people to come from a Web site called Bolt.com – an early social networking site for teenagers. Participants came to the site, answered some questions about themselves, and were then shown 48 bands they’d never heard of, one song from each, and a social signal telling them how many times that song had been downloaded before them. You can play the song and then you’re asked if you want to download the song. From there, you can download as many songs as you want and then you exit the experiment.
What they didn’t know is that when they arrived they were shunted into one of two conditions. In one condition you saw only the names of the bands and the songs. In the other you also saw the number of previous downloads. In addition, the social influence condition is broken into eight worlds. Subjects in each world can see downloads of previous participants in that world only.
- Individuals are influenced by their observations of the choices of others. The stronger the social signal, the more they are influenced.
- Collective decisions are also influenced. The popular songs are more popular and the less popular songs are less popular. However, which songs become the popular ones becomes harder to predict.
- The paradox of social influence is that individuals have more information on which to base choices. But the collective choice reveals less and less about individual preference. Manipulating social influence is not so easy. We can create self-fulfilling prophecies at the level of individual songs, but not for the entire market.
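To get a feel for why the popular songs get more popular while the winner gets harder to call, here is a minimal toy simulation. It is not the Music Lab model: the `appeal` values, the `influence` exponent, and the rule that a listener picks a song with probability proportional to appeal times prior downloads are all assumptions made up for illustration.

```python
import random

def simulate_world(appeal, n_listeners, influence, seed):
    """Return download counts for one isolated world.

    appeal: hypothetical intrinsic appeal per song (assumed, not study data).
    influence: 0.0 means listeners ignore download counts; larger values
    weight each song more heavily by its prior downloads in this world.
    """
    rng = random.Random(seed)
    downloads = [0] * len(appeal)
    for _ in range(n_listeners):
        weights = [a * (1 + d) ** influence for a, d in zip(appeal, downloads)]
        song = rng.choices(range(len(appeal)), weights=weights, k=1)[0]
        downloads[song] += 1
    return downloads

def gini(counts):
    """Gini coefficient: 0 means equal downloads, near 1 means one song takes all."""
    xs = sorted(counts)
    n, total = len(xs), sum(xs)
    if total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n

if __name__ == "__main__":
    appeal = [1.0] * 48  # 48 songs of identical (made-up) appeal
    for influence in (0.0, 1.0):
        worlds = [simulate_world(appeal, 700, influence, seed=w) for w in range(8)]
        winners = [w.index(max(w)) for w in worlds]
        avg_gini = sum(gini(w) for w in worlds) / len(worlds)
        print(f"influence={influence}: avg gini={avg_gini:.2f}, winners={winners}")
```

Running eight worlds per setting mirrors the eight-world design described above: compare the inequality with and without the social signal, and see how often the same song wins across worlds.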
Attention and Influence on Twitter
- Music Lab showed importance of influence. But influence in real life diffuses through networks. Who listens to whom and to what effect?
- Twitter is ideally suited to study these questions since people are connected to each other because they’re interested in what other people have to say. People who are tweeting want to propagate information.
- Fully-observable networks of “who listens to whom”
- It includes many types of actors: CNN, NYTimes, Governments and Fortune 500, Celebrities, bloggers, journalists, experts, ordinary people.
- URL shorteners enable us to see information flows
Classifying Users with Lists
To understand what types of users matter on Twitter, we first need to categorize them. How do you do that?
In November 2009, Twitter introduced Lists. The purpose was to allow users to filter their feeds according to particular topics. They treat lists as crowd-sourced labels for users who appear on them. They focus on four main categories for elite users: Celebrities, Media, Organizations, Bloggers.
Twitter Data: They had lots and lots of data. He shows all the data they had but…I’d rather not jot all that down. Just know they have like 42 million users, millions of lists and 260 million bit.ly URLs. A lot of data.
They took all those users and created a seed of users who they know fit into the right categories. They then crawl all the lists the people are listed on. They prune the lists for keywords they’re not interested in. They keep doing that until they’ve organized enough people to seed the sample.
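The seed-and-crawl process above can be sketched roughly like this. It is only an illustration under assumptions: the dictionaries `lists_containing`, `members_of` and `list_name`, the `banned_words` filter, and the `max_rounds` cutoff are hypothetical stand-ins, not the researchers’ actual data structures or stopping rule.

```python
def snowball_label(seeds, lists_containing, members_of, list_name,
                   banned_words, max_rounds=3):
    """Grow a labeled user set from seed users by crawling lists.

    seeds: set of users known to fit the category.
    lists_containing: dict of user -> set of list ids the user appears on.
    members_of: dict of list id -> set of users on that list.
    list_name: dict of list id -> list title, used for keyword pruning.
    banned_words: lowercase keywords; lists whose title contains one are dropped.
    """
    labeled = set(seeds)
    frontier = set(seeds)
    seen_lists = set()
    for _ in range(max_rounds):
        # Collect lists that mention anyone in the current frontier.
        new_lists = set()
        for user in frontier:
            new_lists |= lists_containing.get(user, set()) - seen_lists
        seen_lists |= new_lists
        # Prune lists whose titles contain unwanted keywords.
        kept = {l for l in new_lists
                if not any(w in list_name[l].lower() for w in banned_words)}
        # Everyone on a surviving list inherits the label.
        found = set()
        for l in kept:
            found |= members_of.get(l, set())
        frontier = found - labeled
        if not frontier:
            break
        labeled |= frontier
    return labeled
```

For example, starting from a single made-up media seed, a list titled “my-friends” would be pruned while “news-orgs” and “world-news” would be followed, so only accounts on the news lists pick up the Media label.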
What did they find? The top 20,000 users account for 45 percent of following relationships and 50 percent of all tweets received. Celebrities outrank all other categories. Which probably isn’t all that surprising to you. I, personally, blame Justin Bieber.
Production of information – elite users are also more active than ordinary people. However normal folk produce many more URLs. Which, again, makes sense. The elite people make the news, the normal people share their information with the URLs they create.
Attention between elites: Celebrities follow other celebrities and don’t pay much attention to anyone else. The media focuses on media. The bloggers focus on bloggers. The only group that cares about anyone outside of themselves are the organizations.
Retweets – Celebrities do not retweet. The media does a little. Bloggers do a lot of it. Bloggers are deciding what’s important and sending it off to the rest of the world.
In the 1950s Katz and Lazarsfeld proposed that a subset of individuals, called opinion leaders, act as intermediaries between mass media and the masses. These people are more influential and more exposed to the media. The same thing is happening on Twitter. Ashton Kutcher retweets more than Lady Gaga. In case you were curious.
Who are the Intermediaries? Users who rely on intermediaries are generally light consumers of media. Opinion leaders tweet more and have more followers than normal users.
This is all very consistent with the two-step flow.
From Attention to Influence
Striking concentration of attention on Twitter. Interesting that information passes through intermediaries. But these results are derived simply by observing what shows up on users’ feeds. Would also like to know if they are being influenced by the information. Better yet, we’d like to be able to predict it.
Twitter Influence Project
- Crawled 56m Twitter users, 1.7B edges
- Recorded 1B public Twitter posts
- 74m posts contain bit.ly URLs from 1.6 million users.
Tweets, on average, generate a fraction of a retweet. Most tweets don’t spread – more than 90 percent.
Characterizing Content – They’ve found they can predict, on average, what content will do well. Only two features matter – past local influence and your number of followers. Surprisingly, it doesn’t matter how good the content actually is. Yeah, so I’ve noticed.
Large cascades are more likely to be triggered by individuals who have many followers and who have triggered large cascades in the past. Most of the time, however, these individuals don’t trigger large cascades either. “Necessary” features aren’t “sufficient” to account for outcomes. This is a common problem for rare events: school shootings, successful companies, rich people.
Give up on predicting individual events. Focus on the typical event size and try to optimize that many, many times.
How do you do that?
On average, some types of influencers are more influential than others. Many of them are highly visible celebrities, but some like Kim Kardashian are very expensive.
Should you target:
- A small number of highly influential seeds
- A large number of ordinary seeds with few followers
- Somewhere in between
Everyone’s an Influencer
Large cascades are rare, hence:
Probably impossible to predict them or how they will start. It’s better to trigger many small cascades.
Ordinary influencers are promising. Many influence less than one other person on average but they’re relatively cheap.
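The trade-off here is really just influence per dollar. Below is a rough sketch of that comparison. Every number in it (expected reach, cost, availability) is invented for illustration and none comes from the talk, and the greedy buy-the-best-ratio-first rule is a simple heuristic, not the method the researchers used.

```python
from typing import NamedTuple

class SeedType(NamedTuple):
    label: str
    expected_reach: float  # average users influenced per seed (made-up)
    cost: float            # price per seed (made-up)
    available: int         # how many seeds of this type can be bought

def plan_seeding(seed_types, budget):
    """Greedily buy seeds with the best expected reach per unit cost.

    Returns (purchases, expected_total), where purchases maps each
    label to the number of seeds bought.
    """
    purchases = {}
    expected_total = 0.0
    remaining = budget
    ranked = sorted(seed_types, key=lambda s: s.expected_reach / s.cost,
                    reverse=True)
    for s in ranked:
        count = min(s.available, int(remaining // s.cost))
        if count > 0:
            purchases[s.label] = count
            expected_total += count * s.expected_reach
            remaining -= count * s.cost
    return purchases, expected_total

if __name__ == "__main__":
    options = [
        SeedType("celebrity", 500.0, 10000, 1),
        SeedType("mid", 20.0, 100, 10),
        SeedType("ordinary", 0.5, 1, 600),
    ]
    print(plan_seeding(options, budget=1000))
```

With these invented figures the ordinary seeds win on reach per dollar even though each one influences less than one person on average, which is the “many small cascades” idea in miniature.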
Putting the pieces together:
The small-world experiment shows how even large networks are connected. Music Lab showed how social influence drives popularity and also unpredictability. Twitter studies showed that attention is highly concentrated but hard to predict at an individual level. Still haven’t been able to put ALL the pieces together.