Christina's LIS Rant
Saturday, August 30, 2008
  My takeaway from watching from the sidelines of Science Blogging London 2008
I've been browsing the FriendFeed Room.... evidence. Systematic study. Peer-reviewed. Published. (good news for me, if I get in gear)

A lot of what's coming across is people claiming value and other people not getting it. Or maybe, people claiming different benefits and arguing against each other. Also unsupported statements (like those I sometimes make (blush)) about why blogs are good/bad/indifferent for scientists/the public/policy makers/science in general.

Most of the action in journals is still in the letters and editorial bits - not based on systematic evidence, but journalistic inquiry or personal experiences. Well, and also in the CS literature about information retrieval, sentiment analysis, geotagging, community detection, etc. Maybe in the education literature a bit a few years ago about the pedagogical value.

I have lots of ideas for qualitative studies, but not so many for quantitative... but that's probably what's needed to get attention.
Friday, August 29, 2008
  The meaning of citations
What a grand post title, but actually, what I mean is slightly more like: the meaning of citations: what Garfield said he means in a bunch of articles vs. what people say he means and, even worse, what people do with his work, plus some commentary on a review chapter.

Today I read the whole Nicolaisen[*] article which I just browsed earlier (ok, so it's been A LOT longer than I intended). This is not a review of how to *do* citation analysis; that's covered in the several ARIST chapters on bibliometrics and informetrics. Rather, this is a review of two streams of literature about citations: why do scientists cite (and theories about that) and, more weakly, one aspect of/model for/theory of how citation patterns "reflect the characteristics of science and scholarship" -- how citing patterns can be used to model science/knowledge... **

First, because I always run out of steam at the end, and because it's most important, what Garfield says vs. how his work is used.
L.C. Smith (1981, cited in *) provides these assumptions that underlie citation analysis:
1. Citation of a document implies use of that document by the citing author.
2. Citation of a document (author, journal, etc.) reflects the merit (quality, significance, impact) of that document (author, journal, etc.).
3. Citations are made to the best possible works.
4. A cited document is related in content to the citing document.
5. All citations are equal.
So there's this idea that there's a linear relationship between quality and number of citations (as evidenced by linear regressions used everywhere - also in a note in *). More citations mean better paper, mean better institution, mean more money. BUT, that's not what Garfield said:
A highly cited work is one that has been found useful by a relatively large number of people, or in a relatively large number of experiments. … The citation count of a particular piece of scientific work does not necessarily say anything about its elegance or its relative importance to the advancement of science or society.…The only responsible claim made for citation counts as an aid in evaluating individuals is that they provide a measure of the utility or impact of scientific work. They say nothing about the nature of the work, nothing about the reason for its utility or impact. (Garfield, 1979, p. 246, cited in *)
In fact, Nicolaisen elsewhere provided evidence for Bornstein's suggested J-shaped relationship between quality and citations. Utility could be to illustrate a point, and impact can be negative...

So back to the content of the review article. Why study citation analysis? Because it's used for (as Zunde said and Nicolaisen added to)
1. Qualitative and quantitative evaluation of scientists, publications, and scientific institutions
2. Modeling of the historical development of science and technology
3. Information search and retrieval
and Nicolaisen's addition (here I paraphrase, above I quote) 4. knowledge organization/mapping through bibliographic coupling and co-citation analysis
So it can be pretty important in the life of an individual scientist as well as in the success of institutions. (particularly in certain European countries that allocate research funding this way)

But there isn't a cut and dried accepted theory of why people cite. Seems pretty obvious, right? Here are the ones that the author reviews
As for the symbolic nature of citations - this goes to the heart of using citations to map knowledge. What can we say about paper A because it cites B, or about A and C if they both cite B? Citations as indicators that provide a formal representation of science - Wouters' Reflexive Citation Theory. But look, we don't know why the citation was useful to the author - maybe the context is, "what an idiot Pikas is, see for example Pikas (2008)." So according to the author, Wouters' theory can't handle that.
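The "A and C both cite B" question is what bibliographic coupling counts, and its mirror image (two papers showing up in the same reference list) is co-citation. A minimal sketch in Python, assuming a made-up citation table; the paper labels and the function names are hypothetical and not from Nicolaisen's chapter:

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each citing paper maps to the set of papers it cites.
cites = {
    "A": {"B", "D"},
    "C": {"B"},
    "E": {"B", "D"},
}

def bibliographic_coupling(cites):
    """For every pair of citing papers, count the references they share."""
    strength = {}
    for p, q in combinations(sorted(cites), 2):
        shared = cites[p] & cites[q]
        if shared:
            strength[(p, q)] = len(shared)
    return strength

def co_citation(cites):
    """For every pair of cited papers, count the reference lists containing both."""
    counts = Counter()
    for refs in cites.values():
        for pair in combinations(sorted(refs), 2):
            counts[pair] += 1
    return dict(counts)

coupling = bibliographic_coupling(cites)
cocited = co_citation(cites)
print(coupling)
print(cocited)
```

Note what the counts leave out: nothing here records why B was cited, so a citation made in praise and one made to call someone an idiot add up exactly the same.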

An interesting (and now on my research questions list) application of all of this is to look at explicit link-love mentions in SCTs used by scientists or well, really anyone. This idea is mentioned in Efimova, L., Hendrick, S., & Anjewierden, A. (2005) but not explicitly researched.

[*]Nicolaisen, J. (2007). Citation Analysis. Annual Review of Information Science and Technology, 41, 609-641.
[**] I do appreciate that research blogging is supposed to make articles more clear not less clear but hopefully I'll get better with practice ;)
Tuesday, August 26, 2008
(commence rant - my opinion only obviously)
I am a librarian - I am a professional - I help connect people to information. I work reference, public services, whatever you want to call it. I do research for scientists and engineers at a research lab (and for the 1,000,000,000th time, no, we do NOT have students) and also an occasional Sunday at the local public library.

I still believe in research. I believe in libraries - libraries as place, libraries as collections, libraries for and as community. I believe in librarians - professional librarians with MLS degrees - who can organize said collection and who can help connect people to the information they need. I think the degree gives you valuable training and insight that you can't get from OJT.

I believe in library schools - professional training places. I actually kind of "get" iSchools sometimes when I squint really hard (for other information professionals - information managers, records managers, HCI professionals).

I want my PhD (and will continue working on my comps proposal - I promise - as soon as I finish this rant!). I can't write for s$%+, I'm scattered, my husband is dead set against it, my in laws think I'm crazy, my dog is worried, my parents are supportive but don't know how to help... but I want it and I will do it! (even if it kills someone, hopefully not me)

  1. I want to be proficient at gathering that research needed for evidence-based practice
  2. I honestly believe that I can make a difference - I can study science/scientists, social software, scholarly communication, and help scientists do their work better and yes, eventually make the world a better place. I can help science librarians and science libraries *be* better. Maybe one day, I can help train new science librarians (but it will never be my first job - I am still a librarian)
  3. It's not just a piece of paper or a set of random hoops. It's a journey. It's an apprenticeship. It's a transformation. It's a beginning. There are hassles -- and more for me than for full time students -- but that's trivial. (I was in the Navy, so I know hassles and believe me, you know nothing of hassles)
(rant off)
Friday, August 22, 2008
  How did I end up as a librarian meme
I was tagged by Jill - actually, this is probably the first meme I've been tagged for (so that I've noticed) in more than a year (or years?).

I've always used libraries. Growing up, my mom took me to the Kirkwood branch of the New Castle County (Delaware) library for story time and to check out books. I still remember my card - it was card stock and kind of an orangey yellow. And then when we moved to Maryland, we went to the Rising Sun branch of the Cecil County library - this is back when it was the size of a walk-in closet - it was right there in town and sharing a building with the police department. After I left it moved out to a new building. I also used the main branch in Elkton when I had a bigger report that I needed to do.

In college, I really didn't use the library - at all - except when I had to write a paper or wanted the quiet study room. But when I was off on my own alone in Newport at Surface Warfare Officer School Division Officer Course, I was a heavy user of the Newport (town? county? township? borough?) library - which I could walk to from my apartment and where I discovered a reference book that was a guide to historical fiction...

In Jacksonville, I was a heavy user of the libraries - I remember calling from Curacao on liberty to renew books :) (ok, I shouldn't have taken library books underway, but it was just a 6-week deal). That was the first library I ever saw that had a security guard - very off-putting right by the front door. Once again, back in Maryland, I headed out to the Long Branch branch of Montgomery County...

So there I was, getting out of the military with a physics degree - by the skin of my teeth and 4 years old so anything I had learned was rusty - experience driving a ship, shooting some guns, .... I talked to some defense contractors, and it was more of what I was trying to get away from. So I took the Strong type indicator and the MBTI... and as I recall, I came up either librarian or priest. Since I'd just converted to Catholicism and gotten married (neither of which is up for discussion on this blog!) ...

My husband was very supportive - he told me this would be perfect and that I would love it. I interviewed my mom's college roommate, who was a chemistry librarian, and, through Jean Hort, who headed the Navy Department Library, Brenda Corbin, who was head of the Naval Observatory Library. These women are great role models.

I applied, got in, and the rest is history.... long story.

I'd like Mark (even though I think we discussed this already), Randy, Joe, and Ruth to answer this - but only if they feel like it, no pressure.
  Like prices and hemlines, why do impact factors always go up?
Ever notice that certain time of year when every journal publisher announces how the impact factors of their journals are up? When the Journal Citation Reports (JCR) come out... the press releases follow. The impact factor is a measure of how important the journal is, judged by how often it is cited. It's a rolling measure so journals can't rest on their laurels (so much), but there is time for the articles to actually be cited after they're published.

Impact factors of journals are a perennial discussion topic - used by libraries (with other measures) for collection development and by researchers to decide where to publish. They're also mis-used, abused, and misunderstood. But this article isn't about all that. This article looks at whether the impact factors are going up, what aspect of the impact factor provides the greatest contribution or explains the increase, and whether the increase is different in different disciplinary categories.

I'll try to use the same notation used in the article*. Let's define the impact factor (IF) first. IF looks at how many times (n) articles in a current year (t) cite articles that a journal (i) published in the previous 2 years (t-1 and t-2), divided by the total number of articles (A) in journal (i) in the two years t-1 and t-2.
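That definition can be written compactly. This is a reconstruction from the wording in this post, so the placement of the indices is an assumption and may not match the article exactly:

```latex
\mathrm{IF}^{i}_{t} = \frac{n^{i}_{t}}{A^{i}_{t}}
```

Here n is the count of citations made in year t to what journal i published in years t-1 and t-2, and A is the number of articles journal i published in those same two years.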

It's pretty clear from the numbers that impact factors have gone up in absolute terms, but the authors are interested in the average rate of change so they need to create a weighted impact factor to account for the fact that some journals have a lot more articles than others. The weighting factor is the number of articles from the particular journal from the previous two years divided by the total number of articles from all of the indexed journals over those two years.

The weighted impact factor (where S is the set of all JCR journals in year t) is the sum, over every journal i in S, of that journal's weighting factor times its impact factor.

So with that, yeah, a pretty consistent change: 2.6% per year from 1994-2005.
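To make the weighting concrete, here is a minimal sketch in Python. The two journals and their counts are made up, the function names are hypothetical, and treating the 2.6% as compounding once a year over the 11 steps from 1994 to 2005 is my assumption rather than something taken from the article:

```python
# Hypothetical journals: (citations n in year t to the prior two years,
#                         articles A published in the prior two years)
journals = {
    "Journal X": (400, 200),
    "Journal Y": (50, 50),
}

def impact_factor(n, a):
    """Citations to the prior two years divided by articles in those years."""
    return n / a

def weighted_impact_factor(journals):
    """Average the journal impact factors, weighting each by its share of articles."""
    total_articles = sum(a for _, a in journals.values())
    return sum(
        (a / total_articles) * impact_factor(n, a)
        for n, a in journals.values()
    )

weighted_if = weighted_impact_factor(journals)

# Assumption: 2.6% growth compounding annually across 2005 - 1994 = 11 steps.
cumulative_growth = 1.026 ** (2005 - 1994)

print(round(weighted_if, 3), round(cumulative_growth, 3))
```

With these weights the article counts cancel, so the weighted impact factor works out to total citations divided by total articles (450 / 250 in the toy data). That is why a big journal pulls the average toward its own value.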

To find out what caused the increase, they decomposed it to these 4 factors:
  1. if there really are just more articles in t than in t-1 and t-2 (alpha sub t)
  2. to what extent are the new articles citing the past 2 years (maybe citing older stuff) (p sub t)
  3. to what extent are the new articles citing non-JCR journals (newer or regional or less well-respected or too specialized) (v sub t)
  4. how many articles are the new articles citing? how many references or how long are the reference lists? (c sub t)
and through some math they get here:
I don't want to give away the whole article, so I encourage you to check out the math and the tables, but it turns out that c is the only thing that makes a difference. It's really almost completely due to the increase in the number of references cited per article!
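One way to see how a single factor can carry the whole increase: if the weighted impact factor is a product of the four factors (an assumption on my part, since the article's equation isn't reproduced here), then the log growth rates of the factors simply add up to the log growth rate of the whole. A small Python sketch with invented numbers, none of which come from the article:

```python
import math

# Hypothetical factor values for a start year and an end year.
year_start = {"alpha": 0.50, "p": 0.20, "v": 0.80, "c": 25.0}
year_end = {"alpha": 0.50, "p": 0.19, "v": 0.80, "c": 30.0}

def product(factors):
    """Multiply the four factors together (assumed form of the decomposition)."""
    return math.prod(factors.values())

def log_growth_contributions(start, end):
    """Log growth of each factor; for a product these sum to the total log growth."""
    return {name: math.log(end[name] / start[name]) for name in start}

contributions = log_growth_contributions(year_start, year_end)
total_log_growth = math.log(product(year_end) / product(year_start))

for name, value in contributions.items():
    print(name, round(value, 4))
print("total", round(total_log_growth, 4))
```

In this toy case alpha and v contribute nothing, p pulls down slightly, and c (longer reference lists) supplies all of the growth, which is the shape of the result described above.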

As for disciplinary differences. We know that cited half life in math, for example, is >10 years -- they cite older stuff. Immunology is under 6 years. We know that biomedical researchers cite a lot more articles than engineers... there are just different ways of doing science, applied science, and math so we expect some difference in impact factor - and really it's not at all a good idea to compare journals that serve different fields by impact factor. The authors make some interesting choices here - made no doubt because they run the Eigenfactor site (stated explicitly as a competing interest). Instead of going with the easy JCR subject categories, which are somewhat disputed - they use the 50 largest of the 88 non-overlapping categories they found using a random walk method (see their PNAS article).

They calculate the weighted impact factor for each of these categories and sure enough, math has a 0.56 weighted impact factor and molecular and cell biology has a 4.76 weighted impact factor. The growth rate is highest for pharmacology (0.098) and negative for history (hm?). After various assorted linear regressions and a hierarchical partitioning, it turns out that v accounts for the largest part of the difference between the disciplines. Scroll up, that's right, citing non-JCR journals. CS and math suffer while biomed wins out.

An interesting article and I can really recommend reading it - it's very understandable, and you almost feel like you have your professor carefully walking you through the steps. I always see the press releases, so it's nice to connect those to some sort of reason.

Althouse, B.M., West, J.D., Bergstrom, C.T., Bergstrom, T. (in press). Differences in impact factor across fields and over time. Journal of the American Society for Information Science and Technology DOI: 10.1002/asi.20936
*grrrr having problems with the equation formatting... I did try MathML but grr doesn't work right on the preview
Thursday, August 14, 2008
  What is e-science?
(this post was mostly written a while ago but is just being finished on 8/29/08)
There's an explosion of meetings and conferences and conference sessions on e-science, and in particular, how computer and information scientists and information professionals can/should/(do?) support e-science.

Ok, in that case, what is e-science? Carol H pointed out to me in e-mail that many of the current wave of librarian sessions seem to only be covering big science. Massive efforts using cloud computing to handle exabytes of data coming from telescopes and other big science instruments.

Also h/t Carol H, Carole Palmer quotes:
Data from Big Science is … easier to handle, understand and archive.
Small Science is horribly heterogeneous and far more vast. In time Small Science will generate 2-3 times more data than Big Science. (‘Lost in a Sea of Science Data’ S.Carlson, The Chronicle of Higher Education, 23/06/2006.)
This is small-er science, but it's still data curation.

At the SLA session, librarians from the Biodiversity Heritage Library talked about their work. Other roles of librarians in e-science include as taxonomists, catalogers, and as digitization experts (maybe this is curation, too?)

I don't think that's all there is. I think that e-science is also leveraging the power of the web for collaboration and information sharing using both social software and more traditional databases.

So I'm asking and proposing that e-science is
What do you think? Is it just one of these or all or some subset?

Update: John weighed in. And there's a friendfeed string (with a couple of real scientists!).


  The Great Planet Debate: an information science question and public understanding of science issue
I attended the Great Planet Debate between Mark Sykes (Planetary Science Institute) and Neil deGrasse Tyson (American Museum of Natural History) today -- in person. It was moderated by Ira Flatow and will be made available at that web site.

This was really fun. If you're not into planetary science (how could you not be? but I hear that there are people out there who aren't), the debate revolves (pun intended) around:
Used to be that just about anything in the sky at night was called a planet. But over time, asteroids, stars,.... other things were identified and named. So we sort of know what a planet is - and thanks to powerful telescopes and robotic missions - we know a lot about a lot of planets. If we have a whole loosely defined pile of things, is the term useful? When scientists talk to scientists, can they use the term planet and be understood?

IAU is an international body with the job of naming things. They name things for use in science, so there is some common ground. They work towards consensus on the names. Naming things sometimes depends on first agreeing what they are. In a meeting in Prague, they had a vote: how do we define planet? Their definition is:

(1) A "planet" is a celestial body that (a) is in orbit around the Sun, (b) has sufficient mass for its self-gravity to overcome rigid body forces so that it assumes a hydrostatic equilibrium (nearly round) shape, and (c) has cleared the neighbourhood around its orbit.

(2) A "dwarf planet" is a celestial body that (a) is in orbit around the Sun, (b) has sufficient mass for its self-gravity to overcome rigid body forces so that it assumes a hydrostatic equilibrium (nearly round) shape, (c) has not cleared the neighbourhood around its orbit, and (d) is not a satellite.

(3) All other objects, except satellites, orbiting the Sun shall be referred to collectively as "Small Solar-System Bodies" [from]
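Read as a decision rule, the three clauses look something like the Python sketch below. The `Body` fields and the function name are hypothetical, and the yes/no flags hide the hard part: someone still has to decide what counts as "nearly round" or "cleared the neighbourhood". Checking for satellites first is also a simplification of mine, since clause (1) doesn't mention them.

```python
from dataclasses import dataclass

@dataclass
class Body:
    name: str
    orbits_sun: bool             # clause (a)
    nearly_round: bool           # clause (b), hydrostatic equilibrium
    cleared_neighbourhood: bool  # clause (c)
    is_satellite: bool           # clause (2)(d)

def iau_category(body):
    """Apply the three clauses quoted above, in order."""
    if not body.orbits_sun:
        return "outside the definition"
    if body.is_satellite:
        return "satellite"
    if body.nearly_round and body.cleared_neighbourhood:
        return "planet"
    if body.nearly_round:
        return "dwarf planet"
    return "small solar-system body"

earth = Body("Earth", True, True, True, False)
pluto = Body("Pluto", True, True, False, False)
exoplanet = Body("a hypothetical exoplanet", False, True, True, False)

for body in (earth, pluto, exoplanet):
    print(body.name, "->", iau_category(body))
```

Anything that doesn't orbit the Sun falls straight out of the rule, whatever else is true of it.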

This is all pretty unsatisfying for many people, because it clearly doesn't apply to exoplanets - things not going around our Sun. Apparently the clearing the neighborhood thing isn't clear either.

What I got from the debate was that Sykes wants to call a lot more things planets, and Tyson wants to call nothing planets, because the term isn't useful. He said we need a new lexicon to group like things like the rocky ones, the gassy ones, the things in the Kuiper Belt....

It gets even more interesting when you realize how much this has become not a science issue, but a classification problem, sociology of science problem, and a public understanding of science problem (which I am using to include public communication and also science education).

There's a lot of philosophy, cognitive science, and sociology on how people put things into categories. Sykes was arguing for a hierarchical model with "planet" being the highest level with the only definition really being that they are 1) round 2) not a star 3) not a satellite (although this last point seemed a bit flexible). He argues that there is a common understanding of what you mean when you say planet and that we need a way to refer to all of these things and none of this precludes sub-categories.

Tyson says "planet" is useless, and furthermore, that it will restrict our vision for further exploration. He points to some of the advertising for New Horizons that seemed to indicate that this mission would complete planetary expeditions. Historically, apparently science was held back because scientists didn't believe their observations because they thought that they already knew all of the planets.

The public understanding part of this is interesting, too. Tyson's argument is in part that the whole idea that learning science is being able to enumerate a set number of planets is faulty. In other words, science and science literacy should be about the joy of discovery or even about understanding the scientific method, not about memorizing and regurgitating a list. Certainly either of these other two versions will set a person up better for lifelong learning, which is a necessity to remain scientifically literate.

But some teachers, many journalists, and lots of the public want a simple answer. They want to know for sure and be told a fact by a scientific authority. They don't want to know that there are controversies in science that are not caused by bad behavior. And the universe is complex, and things aren't always neat and tidy, and humans want to put things in boxes.

As always, I lose steam towards the end, but to sum up:
I recommend the Planetary Society's podcast and blog for anyone who wants to learn more or who is interested in planetary science.

Oh, and I stayed a bit to try to chat, but Tyson was mobbed - we did get him to autograph one of our library books that he wrote :)

This is my blog on library and information science. I'm into Sci/Tech libraries, special libraries, personal information management, sci/tech scholarly comms.... My name is Christina Pikas and I'm a librarian in a physics, astronomy, math, computer science, and engineering library. I'm also a doctoral student at Maryland. Any opinions expressed here are strictly my own and do not necessarily reflect those of my employer or CLIS. You may reach me via e-mail at cpikas {at} gmail {dot} com.

