More on online tools for scientists

Monday, February 25, 2008 at 11:10 am CST by David Crotty

Time to add a few links, as some recent articles on online tools for scientists are worth visiting. These include an article on career networking, a warning about the corporate forces behind social networks and an interesting piece by Peter Murray-Rust.

Nature Jobs has published an article on networking and the tools available. It strikes me as something of a “puff piece”, a bit of advertising for the Nature Network, but it does have a couple of real-world caveats buried within:

Even computer-savvy investigators can find it difficult to kindle an active community. Peter Brantley, executive director of the Digital Library Federation in Washington DC, created a Datanet group on Nature Network to publicize a National Science Foundation (NSF) call for proposals to establish cyberinfrastructure centres of excellence internationally. “I developed the forum as a way for people to communicate outside the bounds of institutional affiliations,” he says, but adds that it has been hard to achieve a critical mass of participation.

Unfortunately, some 90% of social networking sites won’t succeed, says Contractor. “For every MySpace and Facebook, hundreds of others have failed,” he says.

This echoes the analysis I made in my previous blog posting on the subject. Despite the promise and the continued evangelism, participation is low, and very few of the companies jumping on the bandwagon are likely to succeed. Again, don’t believe the hype.

Inside Higher Ed has an article on the huge corporate forces behind your favorite social networks. It is perhaps a bit politically radical for my tastes, but it does highlight an odd phenomenon in the scientific world regarding online tools. Usually those most enthusiastic about Web 2.0 and such are also those heavily involved in the Open Access, Open Data and Open Science movements. While they decry the policies of the big corporations and publishing conglomerates, they seem to be very willing to jump into bed with them for the creation of new online tools. Why the willingness to trust your data and your time-consuming efforts to for-profit corporations in this arena? Is it just that the big corporations are the only ones with enough cash to fund big experimental sites?

Peter Murray-Rust has an editorial out in Nature looking at online tools in the world of chemistry. He makes some great points in the article. The first is that making primary data sets available would be very valuable. Unlike him, I’m not sure I see this as an either/or choice against the current system of science publishing. Rather than tearing down the current system, perhaps a better approach is to work towards the full release of data sets as results are published. He does seem to gloss over the issue of storage space, though. Perhaps the research to which he refers does indeed generate only a few megabytes of data, but I know imaging labs where 20 students/postdocs are each generating terabytes of data on a weekly basis. He notes that Google is willing to provide storage space, but again, that gets back to the questions asked above about private corporations controlling supposedly open data.

The article details some interesting tools being developed to overcome one of the biggest barriers to participation in such systems: the vast amount of time and effort required. By automating as much of the process as possible, you’re much more likely to get buy-in from users. I like that he at least tries to address the question of giving incentives for participation, although his solution, some huge government-funded program, is vague and unlikely to occur any time soon in this era of reduced funding.

I think he overrates the usefulness of Second Life (which is pretty much a dead end these days), but his comments on science bloggers in chemistry seem in line with what we’re seeing in biology (blogging is still practiced by a very small minority).

I’ll also point out that in my last posting, I stated that science blogs are mostly read by other science bloggers, and most comments on science blogs are made by science bloggers. Please note who left comments on the post.

Posted in General, Online Tools, Science Publishing, Social Software, Web 2.0

2 Responses

  1. Comment by ChemSpiderMan:

    David,
    Regarding your comments stating “Usually those most enthusiastic about Web 2.0 and such are also those heavily involved in the Open Access, Open Data and Open Science movements. While they decry the policies of the big corporations and publishing conglomerates, they seem to be very willing to jump into bed with them for the creation of new online tools. Why the willingness to trust your data and your time-consuming efforts to for-profit corporations in this arena? Is it just that the big corporations are the only ones with enough cash to fund big experimental sites?” This is a very interesting statement and I do sense that some of this is going on, yes. The opposite also happens…the small players trying to make a difference get slammed pretty hard
    http://www.chemspider.com/blog/another-response-to-constructive-feedback-from-peter-murray-rust.html

    In my opinion, the article you refer to misses a lot of the issues around the QUALITY of data. Robots fail... and they can often fail in a big way. Manual curation of data is still key and crowdsourcing should be encouraged. There are quality issues everywhere. Some example posts include: http://www.chemspider.com/blog/struggling-to-scrape-crystaleye.html
    http://www.chemspider.com/blog/how-big-is-the-challenge-of-curation-and-what-is-the-structure-of-ginkgolide-b.html
    http://www.chemspider.com/blog/will-the-correct-structure-of-taxol-please-stand-up-part-3.html

    and the efforts going on here: http://www.chemconnector.com/chemunicating/dedicating-christmas-time-to-the-cause-of-curating-wikipedia.html

    Moving large amounts of data around without care for quality is going to be problematic. I admit to being part of the problem myself. We are aggregating millions of chemical structures and it is not easy to clean them all. However, we are trying. We have offered online curation of data to the users of ChemSpider as detailed here: http://www.chemspider.com/docs/The_Process_of_Curating_Identifiers_on_ChemSpider.pdf . I believe this type of data annotation is critical to generating data of sufficient quality to be used by others, but this is limited to chemical structure datasets. Labs do generate terabytes of data, and while disk space is cheap, I wonder whether the saturation of available data will cause other issues.

    Regarding “I’ll also point out that in my last posting, I stated that science blogs are mostly read by other science bloggers, and most comments on science blogs are made by science bloggers. Please note who left comments on the post.” I run three blogs. It’s a tight community. Many people don’t comment even though they read. Public comments exposes peoples opinions and most are shy to speak in public. The community of vocal participants is VERY SMALL.

  2. Comment by David Crotty:

    Antony–thanks for the comments, and particularly for the links to your site. Although chemistry is not my particular field of interest, I’m sure the lessons being learned there will be valuable and applicable elsewhere, and I look forward to digging deeper through your site. A few responses to your comments:

    —This is a very interesting statement and I do sense that some of this is going on, yes. The opposite also happens…the small players trying to make a difference get slammed pretty hard—

    This is something I don’t really understand. I think perhaps the motivation here, the reason why smaller efforts from within the community are less appreciated, is that there’s a strong desire to see sweeping changes that encompass an entire field. The big corporations seem to be the only ones with the financial backing and the reach to accomplish something like this, hence they end up favored. I’m not sure they’re likely to succeed, as it makes more sense to think of these sorts of sites as appealing to niche communities rather than taking a “one-size-fits-all” approach. Participation in social networks for the general public continues to decline:
    http://news.bbc.co.uk/2/hi/business/7257073.stm
    It’s likely that this behavior will become less mainstream, or at least that it will fragment into smaller interest niches, rather than everyone being on Facebook/Myspace. The question from a publishing point of view is whether smaller niches like this can financially support such efforts, which is why they may end up better served from within the community (although this will require grant funding). There is also the issue of corporate control of data, with terms of service that can be altered at any moment, as I mentioned in my posting.

    —I run three blogs. It’s a tight community. Many people don’t comment even though they read. Public comments exposes peoples opinions and most are shy to speak in public. The community of vocal participants is VERY SMALL.—

    Yep, agreed. That’s a point I made in a talk given to a publishing association:
    http://www.cshblogs.org/cshprotocols/2008/02/14/why-web-20-is-failing-in-biology/

    Some of the commenters on that particular post felt that my depiction of the science blogosphere as a largely circular, self-referential community was inaccurate.

Copyright © 2008 by Cold Spring Harbor Laboratory Press.