Christina's LIS Rant
Thursday, June 05, 2008
  Some advice for the rank beginner in citation analysis part 1
I keep realizing how much I have to learn in citation analysis and social network analysis. So I think that I have nothing to offer; yet, I do know a lot, and I've learned some lessons the hard way. I'll try to give back a little here because it will probably be good for me and it might help someone else.

This post is for librarians and other information professionals who might want to dabble in or dip a toe into citation analysis and who are a bit lost with all of the massive amounts of advice and help out there.

First, I'm talking about studying the structure that is created when people or works are linked by co-authorship or citation (in some fashion). It's fairly straightforward to calculate a person's or an organization's h index (using two competing tools and some free things), and it's fairly straightforward to tally up citations, although it is nigh impossible to be comprehensive in just about any field. What is more complicated is building a network of relationships between authors and using this network to understand collaborations and potential information flows. So that's what I'm talking about. You might use this within an organization to see the patterns of how people in one department write papers with people in another. You might use it to look at some sort of similarity based on who cites the same paper. This sort of analysis is a value-added service that librarians and information professionals can provide for their organizations.
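
To make the "network of relationships between authors" idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration and not what any particular tool does: the function name `coauthor_edges` and the author lists are made up, and it assumes you already have one clean list of author names per paper (which, with real data, is the hard part).

```python
from collections import Counter
from itertools import combinations


def coauthor_edges(papers):
    """Count how often each pair of authors appears on the same paper.

    `papers` is a list of author-name lists, one list per paper.
    Returns a Counter mapping (author_a, author_b) -> number of shared papers.
    """
    edges = Counter()
    for authors in papers:
        # set() drops repeated names on one paper; sorted() makes
        # (A, B) and (B, A) end up as the same key.
        for pair in combinations(sorted(set(authors)), 2):
            edges[pair] += 1
    return edges


# Hypothetical author lists, invented for illustration only.
papers = [
    ["Smith, J", "Jones, A", "Lee, K"],
    ["Smith, J", "Lee, K"],
    ["Jones, A"],
]
edges = coauthor_edges(papers)
```

Each key in `edges` is one link in the co-authorship network and the count is the weight of that link, so a pair that shows up on many papers together is a strong tie. A single-author paper adds no links at all.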

The data
Where do you get the data? Well, Web of Science (henceforth WoS) from ISI, Thomson, Thomson Scientific, Thomson Reuters is still the best choice. You will need a site license to this or the CD-ROMs, because getting it through DIALOG is way expensive. Yes, there are competitors, but many of the tools are built to work with WoS data. If you think you're going to find cleaner data, hah! Let me know how that works for you. Ok, but, it is still a bit dirty and it has those pesky known weaknesses: western bias, journal articles only (so CS and some areas of engineering are underrepresented), not intended to be comprehensive. Scopus, in my experience, has crap for data: they messed up and at one point marked a bunch of stuff from the 1980s as from 2008, and they think MPOW is in New Jersey (I mean really, have you not heard of our parent institution???). As for Google Scholar, there are some tools that use it, but we don't know how comprehensive it is (or what it covers), how far back it goes, how frequently it's updated...

Now, for co-authorship, you can really use just about anything or some combination, but we'll talk about that later.

What Software Do I Need?
Preparing the Data
Do your search in WoS. Mark the records and save the full record, including citations, as plain text. I believe it will only do 500 at a time, so you'll have to paste these files together (take off the preliminaries like FN and the EF (end of file) wherever they would land in the middle of the combined file). You can do it in Sitkis, or you can concatenate at the command line, whatever works. Here's an example of a record (note they do NOT have my zip code and city right, and there's an extra letter in my e-mail, sigh):

FN ISI Export Format
VR 1.0
PT J
AU Pikas, CK
TI Blog searching for competitive intelligence, brand image, and
reputation management
SO ONLINE
LA English
DT Article
C1 Johns Hopkins Univ, Appl Phys Lab, Baltimore, MD 21218 USA.
RP Pikas, CK, Johns Hopkins Univ, Appl Phys Lab, Baltimore, MD 21218 USA.
EM cchristina.pikas@jhuapl.edu
NR 0
TC 5
PU ONLINE INC
PI WILTON
PA 213 DANBURY RD, WILTON, CT 06897-4007 USA
SN 0146-5422
J9 ONLINE
JI Online
PD JUL-AUG
PY 2005
VL 29
IS 4
BP 16
EP 21
PG 6
SC Computer Science, Information Systems; Information Science & Library
Science
GA 936SE
UT ISI:000229874900006
ER

EF
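
If you'd rather script the pasting-together step than do it by hand, something along these lines might work. It's a rough sketch in Python, not part of Sitkis or WoS: the function name `merge_wos_exports` is made up, and it assumes every export looks like the record above, with `FN` and `VR` lines at the top and a lone `EF` line at the bottom.

```python
def merge_wos_exports(chunks):
    """Join several WoS plain-text exports into one export.

    `chunks` is a list of strings, each the full text of one saved file.
    Keeps the FN/VR header lines from the first chunk only, drops every
    EF (end of file) line, and writes a single EF at the very end.
    """
    merged = []
    for i, chunk in enumerate(chunks):
        for line in chunk.splitlines():
            bare = line.lstrip("\ufeff")  # ignore a byte-order mark, if any
            if bare.strip() == "EF":
                continue  # drop every EF; one is added back at the end
            if i > 0 and (bare.startswith("FN ") or bare.startswith("VR ")):
                continue  # header lines only belong at the top
            merged.append(line)
    # Trim trailing blank lines, then close the file the way the sample does.
    while merged and merged[-1].strip() == "":
        merged.pop()
    return "\n".join(merged + ["", "EF"]) + "\n"
```

To use it, read each saved file into a string (in the order you exported them), pass the list to `merge_wos_exports`, and write the result out as one new text file. One caveat: a wrapped title line that happens to begin with "FN " or "VR " would be dropped by mistake, so check the record count afterwards.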


When you install Sitkis you'll get a manual and a user guide. The manual helps you with setting up and importing data. I followed the manual closely and didn't have any problems.

Coming up:
 

This is my blog on library and information science. I'm into Sci/Tech libraries, special libraries, personal information management, sci/tech scholarly comms.... My name is Christina Pikas and I'm a librarian in a physics, astronomy, math, computer science, and engineering library. I'm also a doctoral student at Maryland. Any opinions expressed here are strictly my own and do not necessarily reflect those of my employer or CLIS. You may reach me via e-mail at cpikas {at} gmail {dot} com.

Creative Commons License
Christina's LIS Rant by Christina K. Pikas is licensed under a Creative Commons Attribution 3.0 United States License.
