Community detection in co-authorship networks
Wow - this paper uses a lot of the same techniques (and the same algorithm) I use in my IEEE eScience conference paper (details and pre-print to follow, probably not 'til the beginning of November though).
Rodriguez, M. A., & Pepe, A. (2008). On the relationship between the structural and socioacademic communities of a coauthorship network. Journal of Informetrics, 2(3), 195-201. DOI:10.1016/j.joi.2008.04.002
They are looking at a large multi-institution, multi-disciplinary NSF research center. They want to know whether the communities (groups of authors that are more connected to each other than to the rest of the network) detected in the co-authorship network using several standard algorithms correspond to any of these characteristics of the authors:
- department (more or less a proxy for discipline - biology, civil engineering...)
- affiliation
- position (like PhD student, professor, assistant professor...)
- country of origin
I would like to have seen some other characteristics - but this is what was available (actually, come to think of it, how did they know country of origin? not stated - ew... seems problematic).
So then they built a contingency table of detected community against each characteristic and ran a chi-squared test on it. Turns out that department and affiliation are the only characteristics with a statistically significant association -- that seems pretty obvious. I'm sort of glad the country of origin isn't. I think the characteristics seem a bit weak, but I like the general idea of the article. I'd like to see more things like gender, and a more granular representation of their discipline (so "biology" isn't enough, but what type of biology, or maybe what lab or research group, too).
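For anyone who wants to see what the contingency table and chi-squared step looks like, here's a minimal sketch in plain Python. Everything in it is my own illustration, not the paper's code or data: the function names (contingency_table, chi_squared) are made up, and so are the eight toy authors with their community, department, and country labels.

```python
from collections import Counter


def contingency_table(communities, attributes):
    """Cross-tabulate community labels against one author attribute."""
    rows = sorted(set(communities))
    cols = sorted(set(attributes))
    counts = Counter(zip(communities, attributes))
    # One row per community, one column per attribute value.
    return [[counts[(r, c)] for c in cols] for r in rows]


def chi_squared(table):
    """Return the Pearson chi-squared statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if community and attribute were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical toy data: one entry per author (NOT from the paper).
community = ["A", "A", "A", "A", "B", "B", "B", "B"]
department = ["bio", "bio", "bio", "civil", "civil", "civil", "civil", "bio"]
country = ["US", "IT", "US", "IT", "US", "IT", "US", "IT"]

dept_table = contingency_table(community, department)
country_table = contingency_table(community, country)

dept_stat, dept_dof = chi_squared(dept_table)
country_stat, country_dof = chi_squared(country_table)

print("department:", dept_table, dept_stat, dept_dof)
print("country:   ", country_table, country_stat, country_dof)
```

In this toy data, department lines up with community more than country does, so it gets the larger statistic. With only eight authors nothing here would come out as significant; the point is just the mechanics. To get an actual p-value you'd compare the statistic against the chi-squared distribution with the given degrees of freedom, for example with scipy.stats.chi2_contingency (which, I believe, applies a continuity correction to 2x2 tables by default).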