Monday, July 11, 2005


GOOGLE PRINT is an interesting search engine. I had heard about the Google plan to digitize libraries (Jared writes about it here; Rory Litwin critiques its implications here) but I was unaware of what searching Google Print could do for me until recently. I have used it to track down some pretty obscure research items in the current scholarship by doing full-text keyword searches that scan thousands of recent books. Now, Google Print only allows you to see a few pages' worth of hits, but even so, that and the "search within the book" feature can be an invaluable way of identifying books that may be of use. In this regard, Google Print also promises to help authors of scholarly monographs reach their specialized audience, as Irish Studies publisher Mike Collins persuasively argues here (for this reason, I wish that my book were in Google Print, but apparently --of course, because it is me, after all-- my publisher is not contracted with them).

SEARCHABLE DATABASES LIKE JSTOR, ACADEMIC SEARCH PREMIER, HISTORICAL ABSTRACTS have enabled me to compile full-text PDFs of, and bibliographical references for, hundreds of articles that would have been much slower in coming if I had to rely on the ILL service of my campus library. Not only has this made class preparation much easier (for example, with regard to course packets), but it has also sped up my research.

SUBSCRIPTION AND FREE FULL TEXT DATABASES have enabled me to do archival research without spending thousands of dollars to travel out of the country. I don't get a lot of research support at the U and I don't have the liquidity necessary to fund these trips on my own yet (gotta get outta debt and into a house by 2007!). Being able to download full-text PDFs of old newspaper articles and such has helped me uncover some incredible information that has helped me publish recently. (The Evans Digital project is particularly exciting for Americanists, but unfortunately it won't take individual subscribers--if you're not at one of a select few institutions you can't get easy access.)

JUST FOR FOOLIN' AROUND Amazon's search within the book feature is a hoot. For example, take Homi Bhabha's The Location of Culture and do the concordance search. It will give you a list of the 100 most used words in the book. In this case:

act agency always ambivalence authority becomes between black book colonial come community cultural culture demand desire difference discourse discursive does double effect emerges english enunciation fanon form foucault historical history human ibid identification identity image india itself knowledge language location london man may meaning modern modernity moment must narrative nation national native new nor object once people place point political politics position postcolonial power pp presence present press problem process produces question relation repetition representation see sense sign simply social society space strategy structure subject suggest temporality terms theory time translation truth turns university western white without words world writing

Kinda like the Homi Bhabha poetry refrigerator magnet set! Other stats include (as explained on the Amazon webpage):

* The Fog Index was developed by Robert Gunning. It indicates the number of years of formal education required to read and understand a passage of text. A score between 7 and 8 is considered ideal, while a score above 12 is considered difficult to read. Bhabha scored 18! (He's not called Holy Babble by some for no reason at all!)

* The Flesch Index, developed in 1940 by Dr. Rudolph Flesch, is another indicator of reading ease. The score returned is based on a 100 point scale, with 100 being easiest to read. Scores between 90 and 100 are appropriate for 5th and 6th graders, while a college degree is considered necessary to understand text with a score between 0 and 30. Bhabha=25.

* The Flesch-Kincaid Index is a refinement to the Flesch Index that tries to relate the score to a U.S. grade level. For example, text with a Flesch-Kincaid score of 10.1 would be considered suitable for someone with a 10th grade or higher reading level. Bhabha=15.5 (wha? Grade 15? Second Semester of Junior Year in College?)
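
For the curious, here is a rough sketch of how the three scores above are usually computed. These are the standard published formulas; whether Amazon computes its numbers exactly this way is an assumption on my part, and the function names are made up for illustration. The functions take raw counts (words, sentences, syllables, and "complex" words of three or more syllables) so that no particular syllable-counting method is baked in.

```python
def fog_index(words, sentences, complex_words):
    """Gunning Fog: 0.4 * (avg sentence length + percent complex words)."""
    return 0.4 * (words / sentences + 100.0 * complex_words / words)


def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher means easier, roughly a 0-100 scale."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid: the same two ratios rescaled to a U.S. grade level."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    # Hypothetical passage: 100 words, 5 sentences, 150 syllables,
    # 10 words of three or more syllables.
    print(fog_index(100, 5, 10))
    print(flesch_reading_ease(100, 5, 150))
    print(flesch_kincaid_grade(100, 5, 150))
```

All three boil down to two ingredients, sentence length and word length, so long sentences packed with polysyllabic words push the Fog and Flesch-Kincaid numbers up and the Flesch number down.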

UPDATE: Check out Scott Eric Kaufman's discussion of Statistically Improbable Phrases in Faulkner here.

Ridiculous stuff? I don't know. I think the Concordance feature is cool (the 100 most used words feature). It does get to the essence of something real but difficult to quantify in a book.
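
If you want to try a homemade version of the 100-most-used-words list on a text you have in hand, a minimal sketch might look like the following. The stopword list is a hypothetical stand-in (Amazon's actual filtering is not known to me), and the function name is my own.

```python
import re
from collections import Counter

# Hypothetical stopword list; Amazon's real list is unknown.
STOPWORDS = frozenset(
    {"the", "of", "and", "a", "an", "to", "in", "is", "that", "it", "as"}
)


def concordance(text, top_n=100, stopwords=STOPWORDS):
    """Return the top_n most frequent non-stopwords, sorted alphabetically."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in stopwords)
    top = [word for word, _ in counts.most_common(top_n)]
    return sorted(top)


if __name__ == "__main__":
    sample = "culture culture culture colonial colonial subject the the the the"
    print(concordance(sample, top_n=2))
```

The alphabetical sort at the end mimics the way the list above is presented; drop it if you would rather see the words in order of frequency.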

PARTING THOUGHTS The problem with these tools, particularly Google Print and the full-text databases, is that full-text searching removes the frame of reference that actually handling documents and ephemera allows, whether in the original or on microfilm. Those frames can be really important (for example, what appears on a different page of a newspaper, a page that did not give you a hit).

Now that I think about it, I'm amazed at how much of my scholarship is dependent on a new and growing bag of tech tricks.


At 8:24 PM, Blogger Scott Eric Kaufman said...

First, a book in the hand is worth ten books online. And I mean that. I work an actual book far harder than I work a .pdf document--more underlines, more, well, more thought. Second, I wrote about SIPS--Amazon's statistically improbable phrases--a while back. You can read it here, but be forewarned: I looked up some Faulkner titles, and, um, his SIPS are a little on the offensive side. (You can also check out the Crooked Timber conversation on SIPS. Many, many great finds among the comments.)

On another note entirely: I love what you've been writing here, and I hope you continue blogging well into the foreseeable future. (Even though your account of the fate of your first book depressed me beyond words. But I did, imaginatively, sincerely sympathize.) Too few academic bloggers focus on academic issues, and I understand their frustration with them--breathing them day-in and day-out--but it's refreshing to see an academic blog that's openly and vigorously about academia.

