Friday, December 30, 2005
  Presenting literature search results
My colleague SMF and I have an ongoing discussion about how we present literature search results to our customers.  We know that:
1) our view of relevance is broader than our customers’ and we might have some very good articles that don’t explicitly use their terms in the title and abstract
2) we are very thoughtful and careful in our search strategies and cognizant of the strengths and weaknesses of the resources and tools we use (see Dave Hook’s discussion of quantifying a negative search)
3) our work provides an extremely important basis for ongoing work and can cost/save the customer or enterprise money, prestige, time – but, users of our work need to be aware of any disclaimers
4) we work with people who are used to looking inside the black box
5) we may have misunderstood or incompletely covered the subject, so gaps in the results might be mistaken for a lack of information when in fact that part of the subject wasn’t searched.

So, it sounds like I’m already decided.  I like to start my literature search reports with an executive summary, and then provide a methodology section before presenting the results.  This way I can bring attention to any results as I see fit.  A couple of times, I have created a pie chart and this was well received.  Sometimes, though, it is like getting someone to eat your Brussels sprouts, not drink the Kool-Aid.  All this writing takes time and we charge for time and there is never enough time.  Our descriptions and disclaimers can be cut off of the results.  Our customers do trust us and sometimes just want the answer.

Anyway, I try to be disciplined about keeping a log of my search strategies and the decisions I’ve made in literature searches.  I do a write-up when I have time and can mostly do one later from my notes if I need to.  

This is something that should be emphasized in library school, I think.
  Why we blog: Encourage re-use of our intellectual assets

One of my pet ideas is to use blogs for personal knowledge management (PKM). An argument is that knowledge workers do not like to submit their work to a centralized system because they lose control and accessibility – the codification approach (Desouza, 2003). My argument was that blogs are inherently personal while still allowing for search, sharing, and re-use (supporting the personalization approach, also from Desouza, 2003).

Another reason that these centralized databases frequently aren’t used is that context is stripped from the artifact, so it becomes a disconnected document (Desouza & Awazu, 2004). The metadata should capture some of this context and there should be a methodology section in every report (although goodness knows few librarians explicitly write out their search strategies on their research reports… more to come on this) – nevertheless, there’s no history there. Besides, all of this front-end work is expensive and discouraging (mention of this in discussions of finding vs. refinding and re-finding as use). Blogs tend to provide this context in terms of history, linking, and narrative.

If we want to become an expert or if we have a really good idea that we want to see used (not always the case as mentioned in (Desouza, 2003)), how do we publicize it? If we just enter it into a centralized system, it will get lost. Basically, we have to advocate for the idea. We have to make ourselves available for questions so that we can be known as experts. Also, we have to provide a record of work or a history so that we will be found and trusted. A directory isn’t enough, a resume is better, a knowledge map better still, but wouldn’t a blog be best?

Desouza studies software engineers. It could be that Open Source Software communities like SourceForge capture enough of the process that blogs would be redundant. SourceForge also explicitly enables reuse of code pieces and provides a wiki-like history of the software, documentation, a narrative of the development process, etc. So, a community like SourceForge, when seen as a whole, can probably solve a lot of these questions. In fact, it might be better to offer this set of tools, but it’s also a lot more time-consuming on both ends. For the non-programmer knowledge worker, a blog might be the best first step.

Incidentally, Jon Udell is looking for comments on his book idea to explore using professional blogs as combination CV and autobiography. Mine is kind of going that way with the exception that I do try to keep a lot of more personal, motivational posts out of the blog. I think these things would be necessary in an autobiography. I’d really like to see everyone who publishes self-archive their work using their blogs as pointers/indexes. Talk about context. That would allow for re-use of our intellectual assets (Davenport, Thomas, & Desouza, 2003).

Davenport, T. H., Thomas, R. J., & Desouza, K. C. (2003). Reusing intellectual assets. Industrial Management, 45(3), 12.
Desouza, K. C. (2003). Barriers to effective use of knowledge management systems in software engineering. Communications of the ACM, 46(1), 99-101.
Desouza, K. C., & Awazu, Y. (2004). How to put context in the knowledge base. KM Review, 7(2), 8-9.

Updated for formatting and to add tags
Thursday, December 29, 2005
  Search all public library sites?
Stephen asked for a cross-search of library web pages. You can set one of these bad boys up on Gigablast, but you can only add up to 500 sites. Here's a demo search of public library pages from the Mid-Atlantic (MD, VA, DC, PA, DE) (VA, PA, DE are from DMOZ, not comprehensive). BTW - you can cross-search all Maryland libraries using this form.

  ProQuest Press Release: Dissertations via Amazon?
ResourceShelf pointed to the list of bestselling dissertations of 2004 (note, a University of Phoenix business PhD is top... hm? wonder why? really? maybe he's marketing it more? most of the top sellers are from night or distance or online-only schools? are the schools marketing the dissertations for recruitment of new students? are the libraries playing a role?) but didn't mention this quote farther down:
"ProQuest is continuing its tradition of preserving and providing access to this vital material by making portions of its PQDT database available for purchase through Amazon.com."
Wednesday, December 28, 2005
  CiteSeer: Acknowledgement Search
Well that's cool. One of the uses is to find funding agencies... in this they're not alone. In fact w/biomed topics funding is all important as NIH now has the public archiving (encouragement) policy. See, for example, the MeSH Research Support. DTIC, NASA and DOE tech reports have contract number, corporate author, etc. Anyway, it's nice to have it for journal articles and conference proceedings.
Tuesday, December 27, 2005
  Notes on Searching the Live Web by Hodder
Mary Hodder, lecture to UC Berkeley's SIMS 141 class, 11/22/2005, available in rm format. (accessed 12/27/2005)

Live web - blogs, wikis, etc., subset of web
She lists - BlogPulse, Sphere, Technorati, Bloglines, IceRocket, PubSub

Difference between static web and live web searching
- return of results (pagerank/relevance vs. reverse chronological), emphasis on live
- link searching vs. kw searching (not immediately obvious where search terms appear)
- engines find blogs by underlying structure produced by common blogging s/w (therefore not all retrieved are blogs, not all blogs retrieved)
- things drop off the front page ("aged") in liveweb search vs. google, which keeps archive (slower to crawl, relevance most important, deeper search)

Metrics of blog search
- links (technorati (last 6 mo), pubsub (not explicitly reported in search results), bloglines (forever)), different from site to site, confusing
- number of blogs searched ... bloglines gives #articles, others #feeds
- what are you actually searching? (see her venn diagram at 18:24)
- number of RSS subscribers (in bloglines, feed via feedburner) -- using one or the other to look at influence or reach is inadequate... people try to extrapolate from both figures, using knowledge of subject area and how techie people are in that area

- her proposed metrics (see her blog, 22 different metrics, search only across smaller communities, not the whole blogosphere)
- re-order search results by "authority" -- number of links received. Sphere will allow by relevance
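
The two orderings in these notes (reverse chronological by default, or re-ordered by "authority" as a count of links received) can be sketched in a few lines of Python. This is only an illustration: the post records, field names, and link counts are made up, and it doesn't reflect how any particular blog search engine actually scores or stores results.

```python
from datetime import date

# Hypothetical search hits; titles, dates, and link counts are invented.
hits = [
    {"title": "Post A", "posted": date(2005, 11, 20), "inbound_links": 3},
    {"title": "Post B", "posted": date(2005, 11, 22), "inbound_links": 1},
    {"title": "Post C", "posted": date(2005, 11, 15), "inbound_links": 40},
]

def newest_first(results):
    """Reverse-chronological order, the usual live web default."""
    return sorted(results, key=lambda r: r["posted"], reverse=True)

def by_authority(results):
    """Order by inbound link count; ties go to the newer post."""
    return sorted(
        results,
        key=lambda r: (r["inbound_links"], r["posted"]),
        reverse=True,
    )

print([r["title"] for r in newest_first(hits)])
print([r["title"] for r in by_authority(hits)])
```

The same hits come back in a different order depending on the key, which is part of why link counts that differ from site to site are confusing.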

- 13k blogs in an hour
- Google doesn't work as hard as they might to get rid of because of advertising dollars

What's needed
- (stop comparing everything to google and static web search from 1997)
- sophisticated interfaces
- topic browsing
- sophisticated weighting tools (more than just inbound link counts)
- adjustments to static web search to fine tune it

Her project to tag w/identity

In response to questions...
Another way to help liveblog search:
- microformats (technorati approach, rel=)
- structured blogging (pubsub approach)

Problem with co-mingling blog links with static web: her example of looking for bank location -- it's not helpful to find blog posts about the bank.
Friday, December 23, 2005
  Search plug-in addiction
I created like 4 Firefox search plug-ins yesterday and a Greasemonkey script. Well, I actually customized ones I found but 2 of them are now listed on http://mycroft.mozdev.org/download.html?submitform=Find&category=14. You can search on my name. The other 2 are an intranet-type deal and a MetaLib search. It was actually much easier than I thought it would be.
  Friday thing: E-S quiz....whoa.
Hm. Maybe this says too much about me, but I scored 23 on Empathy and 57 on Systemizing -- I think this means that I don't give a #$%^ but like to put things in order, lol. Really, although I am heartless, it's not completely my fault. After all, I was in the military and a science major. Are librarians more likely to be systemizers? Maybe that's the science thing since I'm pretty handy and like to know how things work.... I'm an extreme type S... oh joy:
"The extreme male brain (bottom right of the graph) may be a manifestation of autism."
I was reminded of the article and quiz by Richard Akerman, who is even more heartless than me...
Wednesday, December 21, 2005
  Everything Bad, Gorman, Levy, Liu
Everything Bad, Gorman, Levy, Liu:  How we read electronic media

The new Current Cites (December 2005) (via e-mail) has a review by Leo Robert Klein of an article on reading in digital libraries (Liu, 2005).  This connected to several other things I’m looking at right now.

The essence of the argument is that while the digitization of information has enabled powerful tools like hypertext, it has dramatically altered reading for the worse by fragmenting attention, discouraging deep reading, comprehension and retention.

This is what Gorman was complaining about (Gorman, 2005) (in part).

On the other hand, there is Everything Bad Is Good for You (Johnson, 2005).  I only just started reading it so I can’t give a thorough review of his point of view or argument (in fact, there’s nothing in the cover bio about the author’s qualifications to write such a book and this *may* be important).  What I’ve read so far says that narrative in new media is much more complex, greater cognitive dexterity (my words, if they make sense) is required to interact with video games (trivializing the difference if you’re just saying fine motor control), and to judge games/internet/media on book standards is wrong/unfair/inaccurate.

Levy’s article is a bit older, but carefully reviews the history of reading and attention.  He states that “in reading, the partiality of attention means both that the document itself is selectively attended to and that to which the document points is also only partially grasped. As a process that is linear in time, it is only capable of fixating on one small fragment at any one instant.” (Levy, 1997, 206).  He goes on to talk about a change in design of the New Yorker because readers moved from consuming each issue as a whole, from cover to cover, to only “dipping in.”  He talks about disaggregation of books into smaller and smaller chunks in the same way moving images on television have become “sound bites” (see my discussion of disaggregation of journals by APS and IOP).

He argues that “current work in digital library design and development is participating in a general societal trend toward shallower, more fragmented, and less concentrated reading” (Levy, 1997, 202).  Also that “while one might argue that hypertext is integrative because it permits information units to be gathered up and linked together, it is exactly the integration of fragments that it encourages. And at the same time, as larger units, such as journal articles and even books, are put into hypertextual form, the creation of links among their parts contributes to their further fragmentation or atomization.” (Levy, 1997, 208)

Liu’s new article is essentially a literature review of studies of reading in digital environments# (2005).  People do browse and scan more documents and look for keywords.  Also, more emphasis is placed on putting the main content above the fold or in the lead paragraph.  This has been true for a long time in newspapers, but it is being done in scientific literature.

These articles all report what the authors believe is happening without clear scientific evidence of why or if it’s good or not.  Anyone who took lots of standardized reading comprehension tests was told to read the questions first, then scan the document for answers and then move on – that’s the best way to get the most correct answers on the SAT and other tests.  So, we also do this when the time constraints are self- or world-imposed instead of just test-imposed and we continue to do it even after we’ve finished our last standardized test (the GRE, hopefully).  As I keep saying, and Levy also says, how we read is really based on context.  I think bloggers, like everyone else, sometimes read intensively and sometimes scan for facts.  In reality, librarians in public service should be *much* better at scanning for facts than the majority of the population – are we really expected to read that whole article from Physical Review Letters to see if it’s relevant to our customer?  Of course not!  We scan for keywords many, many times a day.

Final thoughts:  I want real scientific work on this area with valid, reproducible results.  Librarians are frequent and skilled keyword scanners, but should also be good intensive readers for scholarly, peer-reviewed articles in their own field once the article is identified to be of interest.

#nb:  I feel that there are some large problems with the methodology and presentation of results in this paper so will not refer to his results, but use his article as a literature review. 1) he doesn’t provide his survey 2) “sample of convenience” – not clear how participants were selected 3) shows *perceived* answers, not actual (IOW, he doesn’t measure the differential in time spent or any of the other questions, and he doesn’t do a critical incident method – he asks the participants to say what they think has changed over the past ten years… this doesn’t tell you what has changed, rather what they perceive is different – this could be impacted by media, participation in the study…) 4) mixes in his ideas and what he’s read in the literature with results – his survey results do not support his statements.  For example, he states reasons the participants are spending more time reading online, but the survey asks no questions on this.  He is either stating the obvious (so not necessary), or saying more than the survey shows.

Gorman, M. (February 15, 2005). Revenge of the blog people. Library Journal, Retrieved 12/21/2005,
Johnson, S. (2005). Everything bad is good for you: How today's popular culture is actually making us smarter. New York: Riverhead Books.
Levy, D. M. (1997). I read the news today, oh boy: Reading and attention in digital libraries. DL '97: Proceedings of the second ACM international conference on digital libraries, Philadelphia, Pennsylvania, United States, 202-211.
Liu, Z. (2005). Reading behavior in the digital environment: Changes in reading behavior over the past ten years. Journal of Documentation, 61(6), 700-712.
  Blogging for PKM with multiple blogs?
Not trivial. I'm trying to write a thoughtful post about a couple of things I'm reading right now and I wanted to cite something I've blogged about. I searched all kinds of ways on my blog... then remembered (ah-ha!) ... that was on a conference blog, not my personal blog. Hm. So I might just copy all of my notes from that conference here for PKM purposes. I think only having the posts on the conference site was good for publicity and marketing but a month out, maybe not a big deal?

So anyway, how do you m your pk if your pk is blogged in several places?
Sunday, December 18, 2005
  Ugh, teachers assigning wikipedia edits? Ugh.
From a post by A.V. on Web4Lib.
However even non-controversial articles can sometimes fall prey to
disheartening (for those who care about what they wrote or corrected or
reorganized)and constant vandalism. To give but one example: It would
seem (or so the regulars surmise) that the very non-controversial
"Industrial revolution article" has been assigned as a class (or a
multi class) exercise in a school with good Web access, somewhere. As
a result we have been recently getting swarms of infantile vandalism (
Jason is a -expletive deleted-)(our teacher Miss -name X- is a
-expletive deleted-) on the article, at more or less regular intervals.
This is exactly what I feared would happen when I saw W.R. demonstrate vandalizing a page on another wiki, but ten times worse. These teachers have the same understanding as the guy who wrote the Seigenthaler article -- "... well no one really uses this as a reference or believes anything it says."

It's one thing to throw trash around in your own home, but to throw your garbage out in a park or community space is much different.
Wednesday, December 14, 2005
  Confused about the long tail?
Dr. Garfield posted the citation to this article on ASIS&T's SIGMETRICS DL.

MEJ Newman. "Power laws, Pareto distributions and Zipf's law" Contemporary Physics v46 n5 (September 2005): 323-51. doi:10.1080/00107510500052444

The link in the title is only for subscribers via metapress (it's a T&F title)

Anyway, don't let its presence in a physics journal throw you, it's actually a nice little overview of power law distributions. Since blog popularity also seems to follow this distribution, you've probably heard about this if you've been around in the blogosphere.
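
For anyone who wants to poke at a power law directly, here is a minimal Python sketch. It draws synthetic values from a continuous power law and then recovers the exponent with a standard maximum-likelihood estimate, alpha = 1 + n / sum(ln(x_i / x_min)). The sample size, the exponent 2.5, and x_min = 1 are arbitrary choices for illustration; none of this is real blog data.

```python
import math
import random

def sample_power_law(n, alpha, x_min, rng):
    """Draw n values from a continuous power law p(x) ~ x**(-alpha), x >= x_min."""
    # Inverse-transform sampling: x = x_min * (1 - u)**(-1 / (alpha - 1))
    return [
        x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
        for _ in range(n)
    ]

def estimate_alpha(values, x_min):
    """Maximum-likelihood estimate: alpha = 1 + n / sum(ln(x_i / x_min))."""
    tail = [x for x in values if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)

rng = random.Random(0)  # fixed seed so the run is repeatable
data = sample_power_law(20000, alpha=2.5, x_min=1.0, rng=rng)
alpha_hat = estimate_alpha(data, x_min=1.0)

print("estimated exponent:", round(alpha_hat, 2))
print("largest value vs. median:", round(max(data), 1), round(sorted(data)[len(data) // 2], 2))
```

The second line of output is the "long tail" in miniature: the largest value sits far above the median, much as a handful of blogs get most of the links.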

Update: 12/15 Courtesy of QLB on the discussion list, it's also available (free!) on cond-mat
Tuesday, December 13, 2005
  What's New in MathSciNet: RSS feeds for Journals
I haven't seen anyone else mention this...

What's New in MathSciNet: "Current awareness of journals using RSS

For those using RSS feeds for current awareness, RSS 1.0 and RSS 2.0 links are provided on the bibliographic information page for each journal covered by MathSciNet. These links will provide updated information about a journal issue as soon as that issue is loaded in MathSciNet."
Monday, December 12, 2005
  Woods Hole Oceanographic Institution (WHOI) : 10,000 Earth & Ocean Scientists. Five days. : Dec. 2, 2005: Previewing a Week at AGU
I just saw on Real Climate (there by accident, but maybe I'll sub if they really talk about science, not politics, but anyway)... Woods Hole commissioned a reporter to blog the AGU annual meeting. Now *that's* cool... except for the fact that it doesn't appear that there's real blog navigation or commenting or linking, you kind of just use the boxes on the left to navigate, I guess.
  A new Carnival of the Infosciences (#17)
Aack. I really wanted to submit, but I left it until the last minute and internet access went out at my local public library where I was subbing. So it's still really good and worth a read. Anyway, if you're reading this, you probably have seen my would-be submission :)
Saturday, December 10, 2005
  CIL2006: So Exciting
As Steven Cohen mentioned, the preliminary program (pdf) is out... and I'm on it (twice!). It is very exciting.

Wednesday, March 22
Searching the New Digital Formats
3:15 p.m. – 4:00 p.m.
Christina Pikas, [mpow]
Greg Schwartz, Louisville Free Public Library, & Publisher, Open Stacks Weblog
Pikas discusses tools and techniques for searching the blogosphere and mining information in Weblogs while Schwartz discusses finding information-rich podcasts in your topic area.
Thursday, December 08, 2005
  The Chronicle: 12/9/2005: Information Anarchy or Information Utopia?
Article by James G. Neal (not sure if it's only to subscribers?)
"We will be legacy, responsible for centuries of societal needs and records in all formats. We will be infrastructure, the essential combination of space, technology, systems, and expertise that define our excellence. We will be repository, guaranteeing the long-term availability and usability of our intellectual and cultural output. We will be portal, serving as a sophisticated and intelligent gateway to expanding interactive multimedia content and tools. And we will be enterprise, more concerned with innovation, business planning, competition, and risk."
  Announcing: Quoth the Raven, the Blog of the Maryland Chapter of SLA
I wasn't sure if we were ready to go live, but what the heck....

If (like Catherine in a comment to my last post) you'll be attending the SLA Annual Meeting next year in Baltimore... you may want to subscribe to the new SLA Maryland Chapter blog.

I already have a request to find places to hear Industrial music... what else are you all looking for? We'll have a guide with restaurant reviews, too, as well as cultural events.

If you're a member of the Maryland chapter and want to blog, contact me or another member of the board.

BTW- if you're puzzled over the name, check this out or this. (Nevermore!)

Wednesday, December 07, 2005
  Connoteeea, Conno-tay-ah, Conno-tee
I corrected someone at DASER -- then realized that I didn't actually know how to say Connotea. Ah-ha! Now there's a little screencast showing how to use it... con oh tee a (and she's American if that matters).

Update 12/8: The videos are available from the Connotea "about this site page" in Flash, Windows Media, QuickTime formats.
  New blog: the Blog Section of the IT Division of SLA
Very cool.

Update: I should really, really have said where I found out about it... I am *so* sorry. I read about it on Catherine Lavallée-Welch's blog, EngLib.
Sunday, December 04, 2005
  Daser posts, Sunday 12/4/05
Please see original posts on http://asistdaser.tripod.com/daserblog/index.blog?from=20051204
Sunday, 4 December 2005
Feedback, evaluation, wrap-up
Mood: chatty
Notes by Christina Pikas
Bob Kelly: APS focus groups

presentations up on the web page
list of participants out to the participants

evaluation of process and structure:
David - usually the best part of a conference is in the hallway, I feel like I've been in the hallway the whole time. This is a really good thing and it made it more interactive and a think tank... good people and the right people for the discussion.

Eating all of our meals together is a good thing. Have roundtable discussion topics at lunch

Good having a single track so we could see everything.

Presentations were easy to follow.

Suggestion -- Charleston model w/action items

Link to presentations. Also fewer speakers with more time to each speaker is better and more useful.

Wasn't well enough advertised.

Liked the tight focus.

Introductions are good.

Get a speaker who's not for open access (ML tried and was turned down)

She was expecting more technical presentations

Where are the blog notes available?

Get questions at meals, then have a program in the evening that addresses those questions.

Different seating arrangements.

Backchannel comms. Better wireless. Better power.

Affordable hotel.

David Stern, Yale: STM Libraries in the Future: Quo Vadis
Mood: quizzical
Purposeful abandonment

Notes by Christina Pikas
We live in a world of conflicts

we answer to faculty and administration
we love books, but don't buy a lot

mutually exclusive expectations
convenience vs. enhanced navigation
more options creates confusion
customization vs. personalization


standards vs. branding
ease of use across platforms, consistency of icons/metaphors

environmental challenges
federation vs harvesting
package plan vs unbundling items
seamless pre-paid vs. transaction vs. tiered
IR - oa or archival
individual archives or consortial?
competing info resources
google/scholar easy vs. comprehensive

-appropriate tools
-which inspec, holdings for pubmed

Incorporating multi-media
-teaching tools
-large datasets

-personal/lab databases (lab results, local storage of group knowledge, links to published literature)
-data manipulation (not just pdf, repurposing of raw data, permissions)

-create quiet rooms
-keyboard noises
-new group study spaces (just higher noise, technology, food/social)

IRs need to deal with unpublished, non-peer reviewed materials as well as peer-reviewed journal articles. (conf proceedings, white papers, technical reports)

searching these distributed archives is far from perfect
hybrid journals aren't well handled by link resolvers
(permissions are handled at the journal level, not at the article level)

strictly preservation archives
-lockss, dmca restrictions on sharing
-unnecessary redundancy/cost
-only saving pdfs (not data, not repurposable)

Portico (quality controls xml downloads from publishers, stores, metasearching, migration possibilities, runs on the JSTOR software)

Updated: 12/5 to add tag and picture

David L Osterbur: Drop Your Tools and Run Faster
Mood: caffeinated
Notes by Christina Pikas
Weick 1996 Drop Your Tools: An Allegory for Organizational Studies. Administrative Science Quarterly 41(2):

Brown and Marek 2005 applied this in Library Admin and Mgmt 19(2): 68-74

Explanations for failure (to survive for firefighters in forest fire)
-listening (hearing, taking in, understanding that there's a different perspective)
-skill at dropping
-admit failure (if you drop your tools you're admitting failure)
-social dynamics (following the crowd)
-consequences (proof that dropping tools will be a benefit)
-identity (how much is your identity tied up with those tools)
-replacement skill

Replacement skills
-bioinformatics support
big open access area, all of the data is available for free online... yet libraries aren't teaching these tools
who has control (luddites in control)
-adding value

Bioinformatics support
-we don't have to pay for, it's out there and used extensively
-no good service model for providing that support
-bioinformatics support groups don't have service mindsets -- they're researchers themselves and are interested in research, not in helping
-like driving school vs. building the car
-librarians like to search (librarians have a rich source of things to show users that users don't know about)
-libraries need to regain role

Why don't libraries do it?
- "we have a support group" (help the support group, provide a service they don't offer, enough information for everyone)
-no one trained in it
NLM offers regional introductory course
advanced course yearly in Bethesda (5 day 9-5 course)
13% of participants in the advanced course have humanities degrees.
-tricks of the trade... free full text textbooks other fab things that you can show in bioinformatics research
example: article re H5N1 increase virulence in mammals... sequence... BLAST... OR GenBank, click on BLink look up protein (never use keyword in GenBank), gets to the point where he can show the differences between the 1918 pandemic flu virus and the avian flu currently of concern

can draw both sequences in 3d structure and see the differences (rotate, align, etc)

July in J MLA, article telling you how to do this and who is doing it... then just do it

ACS biggest provider (*cough* luddites)
Peter Murray-Rust
mass spec, machine readable vs. picture for pub or human understanding... also only give maxima

grad student takes 100 hours to take information away from what he has to
Chemical Markup Language (CML), XML markup... want to make searches for chemicals google-like... semantic grid for chemistry

Value Added
Notre Dame DSpace implementation services offered

Word about digital archives--
we won't be able to migrate fast enough (can't migrate all the data before it has to be migrated again)
Stuart Shieber Microtome Publishing... ASCII text

stay adaptable

from audience:
don't limit your constituency

Updated: 12/5 to add tag and picture

Michael Leach, Harvard: Whither the collections? Whither the librarian?
Mood: caffeinated
Notes by Christina Pikas
Series of questions to the audience:
How many here work with S&T collections?
What % of collection development $ are serials?
What about the staff who had been employed to manage print collections?

To publishers, how do you want librarians to work with you?
Jan- free material needs to be cataloged, too. Role for librarians in financial setting - advocate open access to administrators who can pay for publication
Vivian- librarians discuss impact factor to administration, understand how the publishing industry works and work with publishers

Jan - q to librarians:
ARL used to derive status from the size of the budget -- librarians may not want the financial things taken away because it may impact status, is this an impact to you
a: yes... # of volumes (old school), budget collection vs. salary (ratio so that to lower collection budget but to hire librarians to manage free or less expensive electronic resources would penalize you)

also to move money to other budgets to support authors publishing in oa from collection development budgets does take power away from the libraries -- we won't be selecting materials and providing access -- there will be universal access and all materials will be selected (maybe instruction)

BK: librarians have a key role in organizing information and new finding tools like folksonomies and other social software... especially with the proliferation of freely available information. Open access is a given. Tools to cross disciplines so physicists can see what biologists are saying... Librarians out of the stacks and into the world.

ML: he still sees a lot of his colleagues tied to their physical collections; but the e resources are so rich, so librarians need to be moving toward:
1) think beyond the traditional collections (even if e)
2) work closely with producers of materials (faculty, post-docs, etc), become a support mechanism for these researchers. Help them submit to journals. Help them submit to archives (oa, ir, etc).
3) teaching and advocating. more than information literacy... scholarly communication at all levels
4) google is a good thing. spend less money on OPACs and spend the money elsewhere
5) libraries as a whole do not put enough money into R&D: user needs analysis, developing user interfaces, marketing... this is left to third parties like vendors. (exception Rochester, has a staff anthropologist?!?) we are too passive, meek

BC: instruction... historically first course in cheminformatics included using the chemical literature but methods of presentation, ethics, intellectual property, knowledge of the prizes/awards... developing information fluency ...
"pardon me, is my eye hurting the end of your umbrella"
PW: library is not based on the physical space, we need to do value-added service, contextual support could be added to the repository... how about forming the citation for me?
T (from SPIRES): a lot of this is already going on, libraries have a tradition, being physically co-located with the library as a programmer is very important
P L-S: our library has already lost complete control of the budget. they are already there because that's all they do, they don't manage collections

Updated: 12/5 to add tag and picture

Tim Hays, NIH: NIH Public Access Policy
Mood: caffeinated
Notes by Christina Pikas
Announced in Feb, implemented in May. In PubMed Central

- archive of NIH research
- advance science
- access to the public

Driving Forces included Congress, new IT, increasing public use of the internet

56% of internet users bring documents with them when they visit the doctor's office (?!?)

Internal drivers: need archive to study the outcomes of funding efforts, make information available that they paid for on the public's behalf

The policy:
-0-12 month embargo
-peer-reviewed, original research publications, supported in whole or in part with direct costs from NIH. Not book chapters, editorials, reviews, or conference proceedings.
-currently funded (or if accepted for publication after May..)
-does not affect copyright
-authors are encouraged to add a line to their publication agreement that says that they will submit to NIH
-should not affect peer review
-should not affect scientific publishing (1% journals in pubmed have more than 50% of their articles funded by NIH, 10% of the articles in pubmed were funded by NIH)
-has had some positive effect with journals now having a self-archiving policy
(audience comment that Nature is backsliding from 0 month self-archiving to 6 month self-archiving)

Final policy 2/3/2005
System released 5/05
New website 10/05
by Feb/06 hope to have a batch upload function (they're working with Elsevier, Wiley, Nature)

About 2% to 3% participation (not including the 8% from PubMed Central journals that are in there by default). As SH said, this 10% mirrors the participation worldwide in self-archiving with no policy (or TLC from a librarian :) )

- about 60% immediate pub, 20% after 9-12 months
- Have removed 40 articles because of too early publication (from about 10 journals)

Issues with researchers who want/need to get published, but worry that the copyright negotiations may delay or prevent paper acceptance, plus figuring out funder policy, their institution policy...
NIH is doing outreach

Public Policy Working Group (11/15/2005)
Limited survey of 19 health sciences libraries... 87% of faculty were aware of the policy, 4% had submitted
Largest factors
-confusion over copyright and version

Q from audience:
-make mandatory (this is being considered, but somewhat difficult because part of the regulatory process, also Dr. Zerhouni saw this as a way of changing the landscape -- did not want to cause bad will with groups with whom NIH cooperates)
-group doesn't include open access publishers

Working with publishers
-3rd party submissions
-Elsevier, Nature, Wiley submit directly (they control version, embargo)
-software tool for offline verification of grant numbers
-will post the publisher version over the author version, place links to the publisher site, correct author errors, place links to article correction notifications on the publisher's web site
-will have xml and pdfs of all documents

Questions to the working group
1) should participation be mandatory - 12 out of 14 yes (two who said no, Elsevier (no, really?) and FASEB)
2) what should be the embargo?
A variety, many said 6 months
3) what is the best version
publisher version, but not clear on whether xml or pdf

Next steps
-continue outreach
-batch uploading
-report to congress

From the audience:
sh: "flawed policy that missed historic opportunity... but can be improved" flaws: voluntary, let the word embargo be said, demand central deposit
should have done: request that the deposit be made either in own IR or in PMC, then PMC can harvest automatically from IRs. make it into an instant deposit upon acceptance. build into metadata that shows up immediately - with e-mail to author to request e-print. NIH should also offer to pay reasonable publishing costs in open access

From Jan V: springer is not on committee, but should be mandatory, no embargo, both xml and pdf, explicitly say on the NIH page that it is OK for open access publishing to be covered by grants (Wellcome trust does this).

From Brad: mandatory is the important thing

From Mary: mandatory needs to happen; that will really be transformative

From Paige: impact to publishers... what would it be if all of the large science funders in government (DOD, NASA, ... ) did this

From Peiling: theoretically mandatory is necessary, but getting the regulations in place is a long term thing... what if all new grants from this point forward have that requirement?

Vivian: meeting of publishers hosted by Blackwell where everyone got up and said how horrible this is... unnecessary duplication of what highwire pubs etc are doing...
Is there an analysis that shows if people can get to things that they would not have been able to get to otherwise... IOW, does it really make things available that aren't elsewhere available?
Answer: not enough content, plan to evaluate....

Updated: 12/5 to add tag and picture

Stevan Harnad, Southampton, U Q at Montreal: (no title)
Mood: on fire
Notes by Christina Pikas

Data and slides available online and may be reused. There were slight technical difficulties.

Why do researchers publish?
Not for money but to communicate results

Open access is:
free, immediate, permanent, and full text online access
primarily peer reviewed journal articles, theses

Lawrence 2001, more citations to online articles than offline articles in the same venue (not open access effect)
To what extent were Lawrence's results only a CS effect? They compared OA vs. non-OA in Astro
(he states that there are basically 12 astro journals, all astronomers are at institutions that cover these 12 at minimum, so to them there is open access) and other physics, sociology and biology.

Lots of fast moving graphs here, based on a robot that gathered 1.4 million self-archived articles across about 10 fields (not phys, but does include some social sciences). Grouped articles by # of citations (in bins), then graphed for each bin, #articles for each pub year. Did the same for non-open access articles. Then took the ratio of one to the other, found (I believe and I think he'll correct me) that in general, open access articles are more highly cited across disciplines than non-open access articles.
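As a rough illustration of the binning-and-ratio step described above, here is a minimal Python sketch. It is only my reading of the notes, not the speaker's actual method or code: the bin boundaries, the function names, and the sample records are all made up for illustration, and a real analysis would presumably normalize for the very different sizes of the OA and non-OA sets.

```python
from collections import Counter

# Hypothetical citation-count bins as (inclusive upper bound, label);
# the talk did not give the actual bin boundaries.
BINS = [(0, "0"), (1, "1"), (3, "2-3"), (7, "4-7"), (15, "8-15"), (None, "16+")]


def bin_label(citations):
    """Return the label of the bin that a citation count falls into."""
    if citations < 0:
        raise ValueError("citation count cannot be negative")
    for upper, label in BINS:
        if upper is None or citations <= upper:
            return label


def count_by_bin_and_year(articles):
    """Count articles per (citation bin, publication year).

    `articles` is an iterable of (publication_year, citation_count) pairs.
    """
    counts = Counter()
    for year, citations in articles:
        counts[(bin_label(citations), year)] += 1
    return counts


def oa_to_non_oa_ratio(oa_articles, non_oa_articles):
    """Ratio of OA to non-OA article counts for each (bin, year) cell.

    Cells with no non-OA articles are skipped to avoid dividing by zero.
    """
    oa = count_by_bin_and_year(oa_articles)
    non_oa = count_by_bin_and_year(non_oa_articles)
    return {key: oa.get(key, 0) / n for key, n in non_oa.items() if n}


# Made-up sample data: (publication year, citations received).
sample_oa = [(2003, 0), (2003, 5), (2003, 6), (2004, 20)]
sample_non_oa = [(2003, 0), (2003, 0), (2003, 4), (2004, 17), (2004, 30)]

ratios = oa_to_non_oa_ratio(sample_oa, sample_non_oa)
for (label, year), ratio in sorted(ratios.items(), key=lambda kv: (kv[0][1], kv[0][0])):
    print(year, label, ratio)
```

A ratio above 1 in the higher-citation bins would point in the direction the talk reported, but with raw counts like these it mostly reflects how many articles of each kind were collected.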

Dollar value of a citation (a la Diamond 1986 and adjusted to current year money), $85/citation... The UK is losing 300k potential citations and 1.5B GBP based on the above calculations.

Research assessment, research funding, citation impact
All of the factors that are used to evaluate researchers and research groups.... correlate highly with citation counts. Citation counts would better highlight a really good article in a lesser journal.

Changing Citation Behavior
Peak of the curve is moving earlier and earlier. Citations may occur within 3 weeks of self-archiving (!) These charts come from citebase. Self-archiving has sped up citation behavior, immediacy, and the movement of physics.

Open access - how?
Archives without an institutional self-archiving policy remain nearly empty.
What prevents us from open access in the form of self-archiving is keystrokes, not copyright

Awareness of author compliance (study?)
81% would willingly comply with self-archiving policy. 5 archives that have mandates are some of the largest (so this works) ... examples CERN, Southampton. University of Tasmania vs. Queensland productivity (?) +archives +librarian assistance +mandate ... (See upcoming D-lib article by Arthur Sale?)

388 institutional archives worldwide (they've found), vast majority are empty. In Germany every institution has an IR, but no policy (and sometimes no tender loving care)

Audience questions
a: Have the successful archives in the above mentioned Australian universities experienced citation effects?
a: early yet, but some

q: Infoglut or version control-
Will there be problems with too many versions... or access to the correct version
a: no, researchers just want the materials. researchers know what they're doing, what's a post-print, what's a pre-print, and what's good literature -- this is not done by librarians but is part of being a professional researcher

Robot -- didn't specify that articles were true OA, just that they were available online fulltext for free at the time of the crawl. Future work may try to address this

Q: about the wording of self-archiving... SH says that it's a supplement to subscription access.

Q: citation life cycle -- doesn't that bias this because articles might be self-archived only after they have proven to be good articles (I may have gotten this wrong sorry D.S.)
SH: they are studying latency, life cycle, immediacy

Updated: 12/5 to add tag and picture, correctly spell the speaker's first name.

Saturday, December 03, 2005
  Daser posts, Saturday 12/3/05
Please see original version on http://asistdaser.tripod.com/daserblog/index.blog?from=20051203
Saturday, 3 December 2005
Peiling Wang, UTK: Research-related Use of Internet-enabled Information Resources
Mood: on fire
Notes by Christina Pikas
Preliminary study, but she believes that it will scale up. Standard deviation is very small.

- identify interdisciplinary differences in the use of internet-enabled information resources for research (not just technology rich or poor, rather, the type and nature of the research)
- identify factors affecting use or nonuse of these resources
- influence design

Research Questions
- which internet information technologies are used in research
- how are these technologies used in information seeking (model of 6 types of information seeking (Ellis?))
- how important is each

Research Design
- indepth f2f interviews
- semi-structured questions (her guide is available; ask her)
(for how does each tech type support each of the 6 of Ellis' types, plus one more type: organizing)
What percentage of your needs are met by electronic resources?
Chaining - forward or backward citation searching

Productive and active researchers (faculty and doctoral students):
Computer Science
Information Science
Humanities/Social Sciences (not yet complete)

In progress - 42 interviews right now

Preliminary results
- average 5-7
(sorry for the poor table)

importance  CS     Eng    IS     JEM
1           web    db     db     web
2           email  web    web    opac
3           e-j    ftp    e-j    database
4           dlib   opac   opac   email
5           opac   email  email  e-j

What % e-resources? Eng highest, CS next, InfoSci next....

2 outliers
1-CS prof, 100% electronic
2-Journalism prof, 98% print

Factors affecting use:
- nature and type of research
- availability of digital archives (humanities, historians)
- accessibility of digital archives
- awareness of the resources
- usability of the internet technology
- perception of source quality and reliability
- individual preferences & constraints

do not save (search again)
do not delete (periodically discard all)
create folders and subfolders
save multiple copies on multiple machines
keep a print copy of the digital documents
work group maintained collection

- information seeking in the digital age is easier for some but harder for others
- user tools for diverse users
- revamp the metaphor of folders
- provide easy access to digital objects at an atomic level (disaggregation)

Updated: 12/5 to add tag and picture

Marie Martens, BioMed Central: Open Access, Moving into the Mainstream
Mood: a-ok
Notes by Christina Pikas

Subject areas embracing open access
- bioinformatics
- cancer
- arthritis
- public health
- infectious diseases

senior authors believe article downloads more credible than citations (?)
(independent study by CIBER: http://www.ucl.ac.uk/ciber/ciber_2005_survey_final.pdf)

"All truth passes through three stages.
First, it is ridiculed.
Second, it is violently opposed.
Third, it is accepted as being self-evident"
Arthur Schopenhauer

Updated: 12/5 to add tag and picture, correct glaring spelling problem

Karla Hahn, Association of Research Libraries: Institutional Repositories, Emerging Frontiers for Policy Making
Mood: not sure
Notes by Christina Pikas
She's citing Wikipedia :)

Diffusion of Innovations.
Pattern by which people adopt successful innovations .. Everett Rogers.

We're down at the beginning. Westrienen and Lynch D-Lib June 2005 (limitations on data), table, number of IRs per country, number of docs per IR. In September D-Lib, article by Lynch and Lippincott on US IRs.
Academic Institutional Repositories: Deployment Status in 13 Nations as of Mid 2005
Gerard van Westrienen, SURF Foundation; and Clifford A. Lynch, Coalition for Networked Information

Institutional Repository Deployment in the United States as of Early 2005
Clifford A. Lynch and Joan K. Lippincott, Coalition for Networked Information

Other work by Foster and Gibbons, Jan 2005 D-Lib

Three main barriers from Foster and Gibbons articles:
- our language, jargon... users don't know IR, metadata, etc
- time ... to find out about IR, understand why and how to use it...
- copyright

A la Clifford Lynch, IRs are sets of services, not softwares

"Never forget posterity when devising a policy. Never think of posterity when making a speech." Robert Menzies, former Prime Minister of Australia


- authors do not understand their rights, options
- publishers encourage authors to regard as pro forma that they transfer all rights to the publisher
- practices are not consistent among authors, publishers

Peer review
- chicken - egg, get content to look at quality, look at quality to get content
- this is more than just being peacocks, it's their bread and butter, life and death of their careers

New models for scientific works
- MIT CogNet
- Real Climate
- Columbia Earthscape

Digital data
- more on long-lived data (mentioned at ASIS&T, read document here)
- data management plans

Commercialization and content control
- previously, limiting access to make money
- we are not home free

Investment - who pays?

- underinvestment (investment in science scholarly communication systems has not kept pace with funding in science... can't keep cancelling journals to build repositories, that is not sustainable)
- copyright over-management, under-management
- commercialization

- good that we've jumped on this in new and potentially risky roles, and taken this as a job for librarians

Comment from the audience
- tension also exists between roles as editors, authors, researchers (within the same person)

Updated: 12/5 to add tag and picture

Leslie Johnston, UVa: Repository Development at the University of Virginia Library
Mood: sharp
Notes by Christina Pikas

She is discussing a curated digital library. It's been around since 2003.

Fedora: Flexible Extensible Digital Object Repository Architecture

Not an out-of-the-box repository, it's the underlying toolkit that is a Digital Asset Management architecture (Mellon funded, UVa and Cornell, for the software development, but not for their implementation)

- part of a global network of repositories
- all media types
- searching and browsing equally important
- curated
- primary users UVa community, they do have restricted content
- they'd like to have all digital collections in this repository

Phase I, 2003, prototype
- electronic texts from the library's special collections
- art architecture ...
- got a lot of feedback (130 comments), they categorized, ranked, prioritized them
- number one comment: have more stuff

Phase II, Fall 2004 (final for Fall 2006)
See her article in D-lib for information on testing.

What did it take?
- ad hoc group documented production standards for media files
- metadata steering group documented local encoding practice, minimum standards, mapped various standards to the local standard
- community digitization standards

Content production
- subject librarians select, with technical assessment (ease of production, need for metadata enrichment, time constraints such as instructional deadlines, funding)
- centralized digital library production service (w/7.5 FTE plus student "scanning monkeys")
- new software tools and scripts

- working groups for functional requirements
- functional requirements and analysis of media files and metadata to document content models (classes of objects and behaviors and mechanisms)
- processes for ingest
- interface
- search


- they had no budget
- they borrowed people from other parts of the library

Library Content
- huge queue of stuff to be done
- science stuff (herbarium images, glass astro slides from the parallax project)

Faculty Content
- born digital
- digital humanities projects

Support- librarians, programmers

D. Stern - seems intimidating, but it only took 3 programmers 2 weeks to be able to add info

Marie Martens, BioMed Central: Open Repository
Mood: spacey
(spacey as in dspace :) )

Notes by Christina Pikas

They are basically a hosting facility for Dspace for institutions that can't host in-house (openrepository.com).

- complete set up within 3 months
- technical support
- hosting

In-house solution has a lot of hidden costs. They have predictable costs -- set-up and maintenance.

The hierarchy is communities and collections. She then went through what it looks like to upload papers and showed an example implementation at UConn.

According to Stevan Harnad, the trouble isn't setting up the software, it's getting the content. (he talked about http://www.eprints.org/, his product, which also offers hosting)

Question from Vivian Siegel - would I have to submit three times - to the journal, to the NIH repository, to my institution? -- no they have automatic feeds.

Mary Steiner, Penn Library: The Toddler Years for ScholarlyCommons@Penn
Mood: chillin'
I've been forgetting to sign my posts so I will do it at the start and re-edit the others.
Notes from Christina Pikas

Why get into this business anyway?
(partial list, changes over discipline and over time)
- access to information, highlight student work, stable archives
- specific desire of a campus unit (at Penn, Eng.)
- campus climate (centralized, funding structure, top down vs. grassroots, relationship between IT/library/archives)
- improve visibility/stature of academic research and scholarly activity

Scope and Nature
- short term, where to start, pilot, seeding the repository
- long term, across campus, cooperation
- pilot: the best work from the School of Engineering & Applied Science, recent work, with the endorsement of the leadership

- oversight (IT, cataloger, administration, license/copyright person from serials)
- they chose the PQ product to get a turn-key solution with less burden on library and university IT
- f/t only
- they took on the burden of copyright compliance (to encourage submissions without reading through and complying with extensive publisher-specific policies)
(according to Stevan Harnad, you can just put everything up, and make f/t only available to the institution where required, then provide author e-mail and suggest that outsiders send e-mail asking for e-print, then author can send an e-print from the repository)

Operations -- getting content
- submit via e-mail
- harvesting faculty web pages
- alerts on relevant databases, e-mail authors

Going public
- timed the launch (not over break)
- marketed via demos, write-ups, mailings, links, share statistics
- registered with search engines


Updated: 12/5 to add tag and picture

Jeff Riedel, ProQuest: Alternative Models of Scholarly Communication
Mood: caffeinated
Notes by Christina Pikas
Back after the break. Change in order, we're now on to sessions again...


Digital Commons is their product... current implementations, UConn, TexasTech....

Other products eprints at SOTON and Dspace. About 500 institutions now have institutional repositories. See http://www.oaister.org for a sample.

Right now
- majority of objects still text-based
- some disciplines are more likely than others
- discovery paths (75% general web search engines -- going right to the paper, 7% front door, 19% referral/e-mail/direct access). Google access starts at about 90% and then drops to about 40% years later when the IR is established. OAI has very few referrals -- they've done a good job marketing to producers but not to users.

More on marketing OAI
- more content
- better tools can be built on oai, but haven't yet
- needs to be a part of federated search implementations

Challenges and answers
- content recruitment
- no pain for the researchers (yet!), they have to know that it will benefit them first
- regular e-mail reports (your paper has been downloaded x times)
- branded personal researcher pages (contribute to their egosystem)
- citation harvesting

Their product includes a journal publisher module with peer review management, etc.
- see Boston College, Studies in Christian-Jewish Relations (open access)

The overlap of IR and OA
- Both eliminate costs of accessing scholarly info (really move the cost)
- Possible because of the internet

Bottom line-
This will not help the stranglehold publishers have on institutions (yep!) -- institutions still pay, maybe not the library.
"There's money in the system. You can move it around, but it can't disappear without a quality loss"

Updated: 12/5 to add tag and picture and to sign the top

Vivian Siegel: Building an Open Access World
Notes by Christina Pikas
What it's like to launch an open access journal (she was at PLOS).

PLOS uses Berlin definition of Open Access -- not only to read, but to reuse content.

Why open access?
- matches the needs of the researchers as readers and authors
- matches the goals of the funders of the research
- best meets the publishing mandate to widely and rapidly disseminate information

Starting vs. Transitioning to OA
developing reputation vs. established reputation
building submissions, readership, usage (same as with any new journal) vs. established
no legacy data concerns vs. legacy data
can set expectations vs. legacy economics

Additional challenges at PLOS
Building an organization

Possible because of...
- philanthropy
- credibility within scientific community
- support from scientific and library communities

To build submissions...
- Kept an updated list of authors who had papers accepted
- Impact factor!

How do you fund...
- philanthropy
- author charges
- memberships
- advertising
- commercial reprints
- a la BMJ - value-added material subscribers only, research open access
- print supports online

Costs to publishing
- copy editing
- figure manipulation
- professional editors
- front section
- fee waivers

How do you build an open access world?
- have open access options for traditional journals (like the PNAS model)
- front section subscription only (BMJ)
- put open access in researchers' evaluation model
- put aside the money in the funding so that it can't be spent elsewhere.
- reduce the costs of publication (print on demand, more control and responsibility to authors for copy editing and figure manipulation, better open source publication management software)

Updated: 12/5 to add tag and picture, signature

Bob Kelly, APS: Expanding Readership in Theoretical and Experimental Physics
Mood: bright
Notes by Christina Pikas
Anton Chekhov, "there is no national science just as there is no national multiplication table, what is national is no longer science..."

1994 APS workshop: publishing preprints on the web is not pre-publishing for submission purposes
Dropped page charges (for all but PRL)
1995 NRL first institutional repository (TORPEDO)
1998 Physical Review Special Topics Accelerators and Beams -- did not charge, libraries reluctant to catalog since they weren't paying for it.

They (like IOP) are moving away from journals towards articles and collections and integrated collections. (note: this has a lot of implications in a lot of spheres, I believe we discussed this at a PAM session at SLA WRT IOP)

He compared institutional subscriptions (now) with the way it was in the late '90s: individual subs, group/department subs, and library subs. Individual subscriptions have doubled in cost, but in general institutions are paying less because access is available across the institution with the online and with the single print copy in the library. hm.

New Services
- ENTS: essential non-text stuff (ex. holograms)
- wikis
- folksonomies
- blogs
- full text xml, single column pdf

Copyright statement - has been evolving, but according to Stevan Harnad (audience) has been a step ahead and was the first green publisher (meaning allows self-archiving, etc)

They're looking at non-traditional revenue sources such as archives back to 1893.

Updated: 12/5 to add tag and picture, signature

Steve Moss, IOP: Open Access Perspective from an STM publisher
Mood: bright
Notes by Christina Pikas
He introduced IOP.

IOP's Open Access Initiatives
1) This month's papers (current 30 days' worth of papers available at no charge on the web page)
2) IOP Select (editorial boards select the best and highest impact papers and make them available at no cost)
3) New Titles (are freely available for up to 2 years)
4) Developing Countries
5) Open Access Journals (New Journal of Physics, Conference Series, Journal of High Energy Physics)

These efforts
- have not increased cancellations
- downloads and submissions are up
- upgrades and subscriptions are up
- has helped impact factors (10 titles more than 20%)
- 18% of the downloads are free, not to subscribers

Expect to break even for a year in 2007 (started publishing in 1998) ... sufficient impact, viable only in long term, requires deep pockets to get there.

See presentation online to see extracts from recent studies on why authors choose to publish where they do. For one study, see Mary Waltham's site. For another see ALPSP.

Updated: 12/5 to add tag and picture, signature

Mood: caffeinated
Notes by Christina Pikas
We're now doing introductions. There are people here from IOP, APS, AIP (so physics is well represented), Springer, Biomed Central, and other academic, governmental, and corporate organizations.

High energy physics, medicine, astronomy, chemistry, and engineering environments are all represented.

Updated: 12/5 to add tag and picture, signature
I'm blogging DASER (Digital Archives for Science and Engineering Resources conference) real-time over on the DASER blog.

Caveat Lector is also real-time blogging (from about 5 feet away).
Friday, December 02, 2005
  Personal Role Management
Pointed to on ResourceShelf (for my benefit?!?)

So I've long complained that tools and resources don't understand multiple identities/personalities, etc. online. See here, here, and I know there is at least one other place where I talked about multiple personalities. Oh, and I stood up and asked the question at the PIM panel at ASIS&T.

The article is:
Catherine Plaisant and Ben Shneiderman with H. Ross Baker, Nicolas B. Duarte, Aydin Haririnia, Dawn E. Klinesmith, Hannah Lee, Leonid A. Velikovich, Alfred O. Wanga, Matthew J. Westhoff. "Personal Role Management: Overview and a Design Study of Email for University Students" December 7, 2004. Available online at: http://hcil.cs.umd.edu/trs/2005-30/2005-30.htm
(not sure why we're finding it now? wait, they say they've been doing this since 1994 -- why hasn't anyone implemented their suggestions yet?)

Note: I put different roles in different windows of Firefox, with tabs for each search or related item for that role... I wonder if I could use session saver to archive the various roles so that I don't have to decide to keep or ditch (cognitive process) every time I switch roles???

Another thought, in Gmail, why can't you have it automatically change your signature when you change the "send from" e-mail? Likewise, is there a way in Outlook to automatically respond to internal e-mails with an internal signature and to external e-mails with an external one? I can actually do that with my phone... why not e-mail?

Will the new Vista (formerly Longhorn), with the stacking ability, allow some version of this?
  Whew. Worth reading. Really. Now
FRL and others have pointed to this article in multiple places, so I finally read it. (full text is at the link, scroll down.)

Siva Vaidhyanathan. "A Risky Gamble With Google." Chronicle of Higher Education
Volume 52, Issue 15, Page B7
The presumption that Google's powers of indexing and access come close to working as a library ignores all that libraries mean to the lives of their users. All the proprietary algorithms in the world are not going to replace them. There was a reason why Franklin, Jefferson, Madison, and others of their generation believed the republic could not survive without libraries. They are embodiments of republican ideals. They pump the blood of a democratic culture, information.
I must admit that I've pretty much blown off any and all complaints about the whole library project thing, but, wow. Anyway, worth thought.

