Christina's LIS Rant
Tuesday, July 03, 2007
  Accessing the scientific literature through images, a rant
I went to a presentation on CSA's Illustrata, and I've read pieces by Sandusky and Tenopir of UTK on its development and evaluation... it seems like a very useful endeavor and a useful tool when it comes to the subjects I'm primarily concerned with.

While browsing my feeds just now, I saw a mention of Marti Hearst's project, Biotext. Biotext indexes the 150 PubMed Central journals. You can search the abstracts or captions and show the results in a grid format. I'm actually somewhat disappointed, because I know of Hearst's work with Flamenco and faceted presentation of search results -- yet in this obvious place to do that, they do not. In Illustrata it shows the descriptors and all that and you can do searches on them...

What's even more disheartening is that Biotext seemingly has no visibility of Illustrata... I read one of the articles linked from the "about" page and scanned the other. There is this footnote:
Recently a commercial offering by a company called CSA Illustrata was brought to our attention; it claims to use figures and tables in search in some manner, but detailed information is not freely available.
A company called CSA, wtf? Ok for environmental science they're like the best A&I service (not to mention materials and aerospace but, ok, that's not pertinent to this post). It's not like they're new or unheard of.... it's also not like the folks at Berkeley couldn't learn more about it... maybe by oh, I don't know, reading the white paper or going to a presentation or maybe signing up for a free trial, or talking to their librarian? So they mostly talk about TREC stuff... great.

Oh, and I do love this bit:
"Recently, online full text of bioscience journal articles has become ubiquitous, eliminating one barrier. The intellectual property restriction is under attack, and we are optimistic that it will be nearly entirely diffused in a few years"
(I have the Beach Boys' "Wouldn't It Be Nice" going through my head)... Gee, I hope the actual life scientists do appreciate that there are still a ton of things not available OA... like the Nature stuff, for example?

My thing is that I think this is really important, and we could really get some good work done if we build on each other's work. You know, cumulating as if we're doing science instead of ignoring as if we're computer scientists (sorry, I don't really mean to offend my one CS major reader, hi John!, but really, they did just invent taxonomies, knowledge representation, and information retrieval, you know).

I'd also like to see this in materials science or mechanical or aerospace engineering -- wouldn't it be great to see the computational fluid dynamics or finite element images? Maybe the micrograph pictures or failure pictures...
Hey, Christina, it's all good. Most CS-types feel the need to reinvent INSPEC at least once in their career...
Hello Christina,
Thank you for mentioning our project at http://biosearch.berkeley.edu

I agree that some form of faceted navigation, which our group has played an important role in studying and promoting, will enhance this interface. However, rather than just repeating how we've done it in the past for other kinds of collections and other kinds of users, we are carefully testing what works and does not work for bioscientists searching the literature. This is how I do my search interface research these days. I try to figure out what works before exposing it to people broadly. Our search website mentions that the interface is a work in progress and we are adding features continually.

I welcome your readers to get involved with our usability studies; all you biologists out there, please drop me a line if you'd like to get involved. (We prefer people who are in biology directly, rather than bioinformaticians, since the latter don't usually do the kinds of search we are most interested in supporting.)

As for your point about Illustrata, we did read the 3.5 page whitepaper that is freely available on their website, but it is very short on information and has no illustrations. There is a longer white paper available but it requires registration. I do not want to have to give this information in order to read about a product; I hope the reasons why are obvious. This is why we said this information is not freely available.

There is also a webpage that shows some information about the product; it looks similar to the work by Hong Yu that we cite but also appears incomplete.

Finally, your lack of optimism about bioscience lit becoming more freely available I think will be proven wrong. Nature is actually open to experimentation in this area.

Marti Hearst, hearst at ischool dot berkeley dot edu
Hi Marti- Thanks for the comment! I appreciate that you are varying one thing at a time to determine what works and what doesn't, but I do think there are some things that we know about how life scientists interact with information. In other words, I think you could have started farther along, if that makes sense.
CSA, now ProQuest, formerly Cambridge, is very well known in scholarly life sciences -- not the name, it keeps changing -- but the resource.
I'll look for the project manager's business card and send her information offline-- it would serve us all best if they could learn from you and you could build from what they've learned.
I certainly hope I am proven wrong about access to the scientific literature -- but I spend my whole life immersed in these issues and I don't think we'll get there any time soon.
BTW- I have a tremendous amount of respect for your work and am really happy that you commented.
Hello Christina,
Here is the latest information on Illustrata...
There is a link to a recent recorded webinar sponsored by the Library Journal which anyone can access as well.
Mark Hyer
VP Secondary Publishing, ProQuest
This is my blog on library and information science. I'm into Sci/Tech libraries, special libraries, personal information management, sci/tech scholarly comms.... My name is Christina Pikas and I'm a librarian in a physics, astronomy, math, computer science, and engineering library. I'm also a doctoral student at Maryland. Any opinions expressed here are strictly my own and do not necessarily reflect those of my employer or CLIS. You may reach me via e-mail at cpikas {at} gmail {dot} com.

Creative Commons License
Christina's LIS Rant by Christina K. Pikas is licensed under a Creative Commons Attribution 3.0 United States License.

Christina Kirk Pikas

Laurel, Maryland, 20707 USA