Information overload is NOT filter failure
Wednesday, January 14, 2009 at 9:02 pm UTC by David Crotty
This has been bothering me for a while now, dating back to last year, when I first heard Clay Shirky’s very pithy statement that information overload isn’t a real problem, the real problem is a failure to build effective filters. It’s a catchy little phrase, and like most theories from Web 2.0 gurus, it seems reasonable on the surface, but when applied to the world of scientists, it’s less than useful.
The O’Reilly Radar blog had a link this week to an interview with Shirky where he discusses the concept in detail, which was helpful to finally get a handle on what he means and why it’s irrelevant in my world:
“…the information overload people are the most narcissistic because information overload started in Alexandria, in the library of Alexandria, right? That was the first example where we have concrete archaeological evidence that there was more information in one place than one human being could deal with in one lifetime, which is almost the definition of information overload. And the first deep attempt to categorize knowledge so that you could subset; the first take on the information filtering problem appears in the library of Alexandria.
By the time that the publishing industries spun up in Venice in the early- to mid-1500s, the ability to have access to more reading material than you could finish in a lifetime is now starting to become a general problem of the educated classes. And by the 1800s, it’s a general problem of the middle class. So there is no such thing as information overload, there’s only filter failure, right? Which is to say the normal case of modern life is information overload for all educated members of society.
If you took the contents of an average Barnes and Noble, and you dumped it into the streets and said to someone, “You know what’s in there? There’s some works of Auden in there, there’s some Plato in there. Wade on in and you’ll find what you like.” And if you wade on in, you know what you’d get? You’d get Chicken Soup for the Soul. Or, you’d get Love’s Tender Fear. You’d get all this junk. The reason we think that there’s not an information overload problem in a Barnes and Noble or a library is that we’re actually used to the cataloging system. On the Web, we’re just not used to the filters yet, and so it seems like “Oh, there’s so much more information.” But, in fact, from the 1500s on, that’s been the normal case.”
Okay, so if by “information overload” you mean that there’s more interesting stuff out there than I could ever handle if I tried to read all of it, fine, Shirky’s comments make sense. But that’s not what the scientists I talk to on a daily basis mean by “information overload”. What they mean is that we’re seeing huge increases in both the number of people doing scientific research and the number of scientific papers being published. While I hate to quote Wikipedia, the numbers listed there (take these with a grain of salt, as one should with all Wikipedia content) show an estimate of 11,500 total scientific journals in 1981, and over 40,000 listed in 2008 in PubMed in fields related to medical science alone.
Now, most scientists are familiar with the “cataloging system” of scientific journals; they’ve been reading them their entire careers. Everyone has their own filters, their own rankings of which journals are more interesting or publish better work than others. And all kinds of tools are available for filtering things down to just the relevant essentials for keeping up with your own field. But even so, most people I talk to are left with more useful, relevant articles that they need to read than they have time to get to. These are not articles that should be filtered out. These are important, quality findings of direct relevance to their own work. And there are too many of them, even without factoring in the need to keep up with science in general and see what developments in other fields can be applied to one’s own.
So no, it’s not a filter failure. It’s a genuine overload. A “filter failure” implies that scientists are just not tossing out the less relevant material, but that’s not what’s happening (as an example, almost no scientists I know read science blogs; those are filtered out as being of less value than the primary literature). Is it so hard to believe that as science and technology move forward, more and more research is being done, and there’s more knowledge generated that one should take in? Is it wrong to want to be as informed as possible about one’s own field, and to seek ways of assimilating more research, rather than ways of discarding valuable information?
Shirky’s suggested solution is of no help here:
“So, the real question is, how do we design filters that let us find our way through this particular abundance of information? And, you know, my answer to that question has been: the only group that can catalog everything is everybody. One of the reasons you see this enormous move towards social filters, as with Digg, as with del.icio.us, as with Google Reader, in a way, is simply that the scale of the problem has exceeded what professional catalogers can do.”
I don’t know about you, but I’m not sure how much I’m willing to trust a random group of strangers to tell me how relevant a particular paper is to my own research. Sure, you can get some sense of the quality of the work, perhaps even a decent summary. But no one knows your work as well as you do, and no one is going to be able to tell you what tiny details in a paper will or won’t act as a springboard for new avenues of research. I’d also argue that the top researchers are probably better at discerning those details, and if they leave the paper-reading to others, they’re going to miss out on much of what makes them better than their peers, and science is going to suffer.
So while social filtering like that described does have its uses, it’s not the solution here. Social filtering is nice for discovery, for finding papers you might not have read on your own, but that’s not the problem I hear from most scientists. Most aren’t looking for more to read.
Shirky’s point may be relevant in some situations (certainly anyone looking to read every book in the Library of Alexandria will learn a valuable lesson from him), but like most Web 2.0 wisdom, it fails when applied to the particular needs of scientists. As the old phrase goes, “To a hammer the world looks like nails,” and Shirky often strikes me as yet another Web 2.0 evangelist trying to convince us that our individual needs are all the same easily hammered nails.
Update: in response to some of the comments, I’ve tried to clarify things with a further posting on this subject, part 2, available here.
Posted in Developmental Biology, General, Online Tools, Science Publishing, Social Software, Web 2.0 | 13 Comments »

Wednesday, January 14, 2009 at 11:13 pm UTC
David, two things strike me here. I think social filtering can (and does) work but not in the way that Shirky suggests, at least for a naive reading. But we have always used social filtering (“hey read this paper and see if we need to pay attention to it”, journal clubs etc) as a way of deciding what to read. Some web2 tools make it possible to simply run slightly more sophisticated journal clubs – nothing wildly new but perhaps helping a bit with the firehose.
The second is that in many cases I don’t need to read “all those papers”. Sometimes I need a specific piece of information out of those papers, but for some proportion (differing from field to field and subject to subject, I would guess) I just need the big picture, or just need a single number, or just need the pointer to the structure. So there is a filtering that could in principle be done by directly picking out the info that _I_ need. And this is the promise of layering useful services (filters) over the existing literature. So it is more sophisticated filtering than Shirky (or his soundbites at any rate) often refers to, but it is a useful soundbite in raising the issue of whether there are useful filters that could be built and what they might do.
Thursday, January 15, 2009 at 9:03 am UTC
intuition, confidence, faith, trust …. all tools that set up a situation wherein one realizes that there is no possibility of missing precisely what you need exactly at the moment you need it ..
yogis are the source of this understanding … they understand the relationship between the self, the mind, and reality …
of course, no scientist would ever accept such a source … except in private
Thursday, January 15, 2009 at 9:42 am UTC
But I don’t think you can know this until after you’ve read the paper. I was trained (and it’s always served me well) to read papers in an intense manner. I want to know what was done, why it was done, how it was done, is the data valid, does the data support the author’s conclusions, what have the authors missed? There’s no way you can set up a semantic filter to answer those questions, to read between the lines and really understand a given paper. I may walk away with nothing important, or a small piece of information, but I can’t know that going in. If you set up a filtering system to look only for specific things that you already know you want to find, you’re severely limiting yourself, shutting yourself off from any new directions or inspirations you may stumble across. If a painter told his visual cortex to filter out any images that differ from the kind of stuff he already paints, then he’s only going to see the same things, and he’s never going to grow as an artist. The same goes for scientists. That creative leap, that moment of inspiration, often comes from an unexpected direction.
And I do agree that social filtering helps (and always has helped) narrow things a bit. But you still have to read the papers, and what you get out of them is going to be different from what anyone else gets out of them because your work is (or at least should be) unique. Someone working in Xenopus development is going to get something very different out of a paper from someone working on yeast metabolism.
Thursday, January 15, 2009 at 9:45 am UTC
I think most scientists would use different terms than you’ve used, but most would certainly accept that science is a creative endeavor, one that relies on individual leaps of both induction and deduction. And attempts to outsource those leaps to the crowd are misguided at best.
Thursday, January 15, 2009 at 12:21 pm UTC
But the proliferation of scientific journals/information IS a filter failure in one sense: the natural, built-in filters of the publishing industry (production costs, editorial-labor costs, time) have been taken away as it became cheaper and more efficient to publish online. More information is available not just because there is more information (though this is true), but because more information CAN be available.
One obvious downside to this is that there is more bad information out there. But there is also more good information. Either way, individuals have to filter out the bad information (what was previously filtered by publishers due to costs) and filter in the good (finding information specifically relevant to you).
Thursday, January 15, 2009 at 12:54 pm UTC
Hi John,
My response to your comment would be “yes and no”. You are right, in that there’s more being published, so there’s more low quality material out there. But most scientists are pretty good at filtering that out, not investing a lot of time in it. That’s not the sort of information overload I hear people complaining about. Because there’s more science being done, and technology has advanced to a point where it’s easier for more people to use the more advanced techniques, there’s more good work out there too. If it’s good quality, and relevant to advancing your research and career, then you don’t want to filter it out.
Friday, January 16, 2009 at 12:52 am UTC
The first thing I check every time I get Science or Nature (assuming I am not looking for a specific article) is the “News and Views” section. Not just to read a summary but to see if there is any interesting work that can lead me to important papers I might have missed due to the information glut.
None of us have the time to personally check every paper in depth to find the useful information we need. We need ways to ‘filter’ the huge amount of published information, not to simplify, not to have someone write a summary so that we do not have to read the paper ourselves but just to find the paper to begin with.
So we rely on methods to lead us to the critical papers, to important meetings, the ones that directly affect our work. Lab or journal reputation is one. Social connections are another. PubMed also works. Gossip has led me to a lot of important ideas. News and Views can work similarly.
I don’t know how many times I have gotten an idea or been led in the right direction because of talking with another scientist who said “Hey, I read a paper that could be useful.” I do think that happened more often than my simply finding it myself.
The filter is not in producing summaries. I wouldn’t trust a random group of strangers to be a good filter either. But the social aspects of the web permit a community of like-minded scientists to more rapidly say to each other “Hey, I read a paper that could be useful.”
Friday, January 16, 2009 at 3:03 am UTC
Sorry David, but I’ve got to disagree with you on this one: http://scienceoftheinvisible.blogspot.com/2009/01/information-overload-is-filter-failure.html
Friday, January 16, 2009 at 7:21 am UTC
Richard–
What you’re talking about is not “filtering” as much as it is relying on various methods for discovery. You’re talking about adding to the stack one has to read, not about eliminating items from that stack. That’s not a problem I hear people complaining about. No one has too little to read. For discovery, the various activities you mention are great, but they’re not really the issue here.
Saturday, May 16, 2009 at 12:22 am UTC
I google data. There is an abundance, yes, of data on the web. Therefore, I don’t blog or input more. I just let someone else add it for me.
Friday, July 10, 2009 at 12:38 pm UTC
In my job as a CIO, I’ve been working on tackling information overload with mixed results. My company, a professional services firm, suffers more than most because of a couple of infrastructure problems that arose from a couple of mergers.
I’ve been trying to get my colleagues to acknowledge that attacking our information overload problem will improve our overall knowledge-sharing and collaboration efforts and also contribute to our bottom line. But some people here just don’t understand the extent of the problem.
I just read about Information Overload Awareness Day and I’ve signed up our company as a participant and designated site. I hope this will get my point across to my colleagues and help them understand what we can do to improve our overall position relative to information overload. For others in my position (and I’m sure there are many of you), I encourage you to do the same. Information is available at http://www.informationoverloadday.com