Dataclysm
I got really excited about this book from the second I heard about
it. Big Data is everywhere these days, and it's something I've been
working on, adjacently or directly, for years. The data I'm working with
now isn't included in the scope of Christian Rudder's work, which
focuses on applications in the fields of sociology and psychology. But
there was a time when I almost broke into this world. There are even
some projects he talks about in this book that I was involved in or
heard really cool talks about. Alas, I never quite broke into it in
the way I wanted to.
I've remained interested in this field, though. We're experiencing what will be the beginning of an absolute golden age for sociology and psychology. Data is available in unprecedented amounts and ways. Not only can you finally look at the behaviors of literally millions of people, but those behaviors, free from a lab, are far more natural than in any experiment or survey performed so far. Facebook, Twitter, Google Search, Reddit, OKCupid and the like are veritable treasure troves, just waiting to be examined for what they can tell us about the human condition. And computer science has reached a point where we can process all this data and start to come to grips with it.
Christian Rudder started as a founder and analyst at the online dating site OKCupid. His blog over there has always interested me, as he looks at the various ways people interact and how sex, race, and orientation still have a huge impact on our interactions with the world. In his book he expands to looking at data from other social sites.
I should emphasize that all of this, this whole field really, is still in the preliminary stages. There aren't any formal experiments being done. There aren't really conclusions being drawn, and when there are I don't always agree with them. There's just a whole lot of interesting analysis. At this point it's about finding patterns in human existence and providing hard numbers instead of anecdotes. I remember a former colleague telling me that 95% of humans behave completely predictably 95% of the time. Up until now psychology has necessarily been focused on the outliers; that's where the interesting stuff was happening. But now we can start to form a picture of the broader, typical human experience.
As I said, I came very close to breaking into this field at my previous job (I didn't for a whole bunch of reasons). But the projects I did work on, and the projects I heard people talking about, are almost all represented in this book. Facebook networks can be used to determine the stability of a marriage or the likelihood that someone is gay. I attempted to use them to predict other things as well. Twitter can be used to track unrest and take the temperature of social movements, and I remember attending a talk that likened information spread on the site to viral spread. The same equations apply. Twitter is also challenging previously held beliefs about linguistics. And more long-form communications (blogs and dating profiles) can be used to find trends among like people or pick out anomalies.
I got a bit bored by some of Rudder's math, mostly because his explanations were simplistic. I've done most of this math, so I found it unnecessary to go through the motions like a high schooler, but those explanations are probably more interesting for people who haven't done this analysis before (some of it would have been useful for me to read 5 or 6 years ago). The conclusions are also frustrating, both because they are largely absent and because it almost seems premature to be drawing any conclusions at all. Getting to the heart of what some of this data really means is going to take more people with more expertise than Rudder (trained in mathematics) can bring to the problem.
We're at the very beginning of a new era right now, standing at the brink of a much deeper and broader understanding of ourselves than ever before. I'm really excited to see where this all goes in the next few years.