Swanson's Postulates of Impotence
(oh - this is going to get me *such* search engine traffic I don't want!)
I do so love the rantings of the cranky old men and women of information science. I hope to feature some of these on my blog as I continue to compile my comprehensive exam proposal and actually re-read for my comprehensive exams.
I had forgotten about this article assigned in the Information Structure class taught by Rebecca Green. But it's a good one.
Swanson, D. R. (1988). Historical note: information retrieval and the future of an illusion. Journal of the American Society for Information Science, 39(2), 94-98.
Swanson is one of those big names in IR. He basically goes over a little of the history of IR and then puts forth, as suggested by Fairthorne, nine postulates of impotence - things that cannot be done in IR, or at least in subject-oriented IR (as opposed to known item, for example). He suggests that these might be useful in developing new research directions and he hopes to start some arguments.
- "an information need cannot be fully expressed as a search request that is independent of innumerable presuppositions of context -- context that itself is impossible to describe fully, for it includes among other things the requester's own background of knowledge"
- can't write rules to precisely translate a request into a set of search terms
- "a document cannot be considered relevant to an information need independently of all other documents that the requester may take into account"
- can never get 100% recall (or be completely sure of the % recall you did get)
- "machines cannot recognize meaning and so cannot duplicate what human judgment in principle can bring to the process of indexing and classifying documents. Corollary: Some indexers all of the time, and all indexers some of the time, also cannot duplicate what human judgment in principle can bring to the process of indexing."
- "word-occurrence statistics can neither represent meaning nor substitute for it"
- the process is iterative, so can't evaluate an IR system based only on a single iteration [more important now than ever, perhaps]
- "you can have subtle relevance judgments or highly effective mechanized procedures, but not both"
- "consistently effective fully automatic indexing and retrieval is not possible"
His point: humans are subtle and complex, and relevance judgments "entail... artful leaps of the imagination unconstrained by logic, reasoning, or the clammy hand of consistency..." But he does not deny that machines are incredibly important to IR - just that they cannot take us the whole way.
Wow, he studied the work of intelligence analysts in 1955... and how they produced polished analyses from large quantities of fragmented information.
He's not all negative - he talks about some of the things that can be done, too. But the entertaining bits are the couple of times when he mentions that ideas had been thought up in the 50s or earlier and then reinvented in the 70s and 80s. Of course, we're still reinventing these ideas now - some people think that just because there's a computer involved, information and how people deal with information are completely new. There are definite changes, but some things proposed in the 50s are now really possible.
Labels: information retrieval