Mucho Mashup

The best part of winning a competition, by far, is that I get to sport my very own winner’s tile!

Given all the other wonderful entries, I was very surprised and honored to find out that I won Talis’s Mashing up the Library competition. The fact that Talis offered the competition in the first place is a testament to their dedication toward building a new kind of library system–one that places its users at the center. Whether that user be a patron or a systems guru, mashups imply openness: a virtue that the beleaguered Old Way lacks. By promoting development in such a way, by extending this competition to anyone–not just their customers–Talis has valued the message over the messenger. Back in June, the competition was announced. It was followed by a series of blog posts and a podcast introducing the library community to mashups and their significance. This gave librarians time to digest the concept and experiment with it.

I was impressed with all the entries. Several, in particular, caught my eye as either particularly interesting or meaningful. Of course, the second-place entry, submitted by the Alliance Library System for their Second Life library, is unique indeed. Paul Miller lauds their spirit of cooperation. I think it’s just dang cool and I would participate more in it if I could figure out how to take the grand piano off my head!

Another interesting entry was submitted by Art Rhyno and Ross Singer. Their idea was to use Google Desktop as a catalog repository. As far as mashups go, it’s a little more technically involved, but its possible applications are very compelling indeed.

The folks at The State and University Library (In Denmark?) submitted what looks to be a search engine that aggregates a number of data sources. It appears to be modular so that new sources can be easily plugged in. Definitely a project worth watching. They call it Summa.

Mike Cunningham threw together a slick little book cover browser using Yahoo’s carousel component. It’s definitely a mashup in the true sense of the word and it gives me some ideas…

Someone whose username is dburden put together a little text-to-speech reference robot. I couldn’t get it to work on my laptop, but the idea seemed solid and ingenious!

Talis has decided to renew the competition and keep it open on an ongoing basis. My hope (and theirs too) is that more library folk will join the fray and showcase their creations for the betterment of libraries everywhere.

Thanks to Talis and all who participated.

Incorporating Google Books into the Hit-list

So the folks over at Google Books think they can go ahead and incorporate our catalogs into their search, do they?

Actually, that's fine, I have no problem with that, which means... They should have no problem with me incorporating Google Books into our hit-list. Right?

Now when users search the AADL catalog, they will be given the option to peek inside the books on the hit-list--that is, if there is a record over at Google Books. Basically, the first time that record is displayed in the list, the middleware queries Google Books to see if it has that item in its database. If it does, the middleware makes note of that in a MySQL table so that the remote query doesn't need to be run again. That way, future queries save time and bandwidth.
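The check-once-then-remember idea described above can be sketched roughly like this. This is only an illustration, not the actual middleware: it uses Python with SQLite standing in for the MySQL table, the table and function names are made up, and the remote check is passed in as a function so the caching logic stands on its own. It also assumes that misses get remembered as well as hits, which the description above doesn't spell out.

```python
import sqlite3
from typing import Callable


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open the lookup cache (SQLite here, as a stand-in for the MySQL table)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS gbooks_cache ("
        "isbn TEXT PRIMARY KEY, has_record INTEGER NOT NULL)"
    )
    return conn


def has_google_books_record(
    conn: sqlite3.Connection,
    isbn: str,
    remote_lookup: Callable[[str], bool],
) -> bool:
    """Return True if Google Books has the ISBN, asking remotely only once."""
    row = conn.execute(
        "SELECT has_record FROM gbooks_cache WHERE isbn = ?", (isbn,)
    ).fetchone()
    if row is not None:
        # Seen before: answer from the cache, no remote query needed.
        return bool(row[0])

    # First time this record is displayed: ask the remote service, then
    # remember the answer (assumption: both hits and misses are stored).
    found = bool(remote_lookup(isbn))
    conn.execute(
        "INSERT INTO gbooks_cache (isbn, has_record) VALUES (?, ?)",
        (isbn, int(found)),
    )
    conn.commit()
    return found
```

Each ISBN costs one remote query the first time it shows up in a hit-list, and every later display is answered locally.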

Looking at the Syndetics offerings next to it, this seems like a much richer and more useful resource. Enjoy!

** Update 1: 8/24/06 9:45 PM **

Ha! It looks like that was short-lived! Google apparently doesn't return the favor (thanks to Ryan for giving me the heads-up):

We're sorry...

... but your query looks similar
to automated requests from a computer virus or spyware
application. To protect our users, we can't process your request
right now.

We'll restore your access as quickly as possible, so try again soon. In the meantime, if you suspect that your computer or network has been infected,
you might want to run a virus checker or spyware remover to make sure that your systems are free of viruses and other spurious software.

We apologize for the inconvenience, and hope we'll see you again on Google.

And here I was, trying to be nice by caching the results... Guess we'll have to wait for the API.

** Update 2: 8/25/06 8:50 AM **

So, I think I found a way to fix this. Previously, I determined whether Google Books had a record for an ISBN by using this URL template:

http://books.google.com/books?vid=ISBN$isbn&printsec=frontcover&dq=isbn:$isbn

Now I'm using a different URL that does not return 404:

http://books.google.com/books?as_isbn=$isbn

With the old URL, if there was no record for that ISBN, Google would throw a 404. I think the fact that one IP was requesting so many 404s is what spooked Google, not the retrieval rate. Also, I noticed that I could no longer use wget on the command-line to grab the data; Google would return a 403 (Forbidden). So, my thought was to ditch PHP's file_get_contents for cURL, which allows you to spoof a user agent. I took a peek at our Apache logs and chose:

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.6) Gecko/20060728 Firefox/1.5.0.6

So, instead of looking like a "virus or spyware", the script now appears, to Google, as an extremely zealous Google Books user. We'll see how long it lasts, but it seems to be holding...
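For anyone curious what that looks like in code, here is a rough sketch of the request. It is written in Python with urllib rather than the PHP and cURL I actually used, and the function name is made up for illustration. The URL template and user-agent string are the ones quoted above.

```python
import urllib.request
from urllib.parse import urlencode

# User agent taken from the post (picked out of the Apache logs).
USER_AGENT = (
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.6) "
    "Gecko/20060728 Firefox/1.5.0.6"
)


def build_lookup_request(isbn: str) -> urllib.request.Request:
    """Build (but do not send) a request using the as_isbn URL template."""
    url = "http://books.google.com/books?" + urlencode({"as_isbn": isbn})
    # Send a browser-like User-Agent instead of the library's default one.
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
```

Passing the result to urllib.request.urlopen would actually send it. How Google responds to that today is anyone's guess, so treat this as a picture of the approach rather than something to rely on.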

** Update 3: 8/25/06 11:40 AM **

No go, they've blocked us again. I'm sending an email to the kind folks at Google Books, and we'll see if they reply. Until then, I've got a few more tricks up my sleeve... In the meantime, I'll leave the cached information active...

** Update 4: 8/25/06 4:07 PM **

Google scores major points in my book! One of the managers over at Google Books just emailed me to say that he likes the idea of the hit-list links and that he is going to see if they can accommodate these types of queries.

[tags] Google, GoogleBooks, Sneaky, AADL, Library, OPAC, Catalog [/tags]

Go-go Google Gadget!

I was having serious doubts that I'd find the time to put something together for Talis's Mashing Up The Library competition. But that's what late-nights are for, right?

Anyway, my idea was to submit a suite of Google gadgets that, even if it doesn't win, will serve another important purpose: providing a proof-of-concept for my PatREST specification. So I've written four gadgets using the Google Gadgets API. These gadgets then consume the PatREST service for their data. If you're unfamiliar with Google gadgets, they are the little customizable panels on Google's personalized home page. This is what the page looks like with all four enabled:

Like all the other gadgets, they can be dragged around the page and individually configured. Also, you can choose which gadget to install, if you don't want all four. They are:

  • tops.xml - Displays the hottest items at the library. You can configure it to display books, CDs, books on CDs, or everything. You can also select the number of results you want returned.
  • new.xml - Displays the newest material at the library. You can configure it to display books, CDs, books on CDs, or everything. You can also select the number of results you want returned.
  • curitems.xml - Displays all your currently checked-out items.
  • holds.xml - Displays all your requested material.
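To give a feel for what sits inside one of these XML files, here is a small sketch that generates a gadget spec with two configurable settings, roughly matching the options described for tops.xml. It is written in Python purely for illustration and is not the content of the real files: the preference names ("kind", "count"), the option values, and the placeholder HTML are all made up. Only the overall Module / ModulePrefs / UserPref / Content layout follows the gadget XML format.

```python
import xml.etree.ElementTree as ET


def build_gadget_spec(title: str, content_html: str) -> str:
    """Return a minimal gadget spec with an item-type and a result-count pref."""
    module = ET.Element("Module")
    ET.SubElement(module, "ModulePrefs", title=title)

    # Hypothetical enum preference: which kind of material to show.
    kind = ET.SubElement(
        module, "UserPref",
        name="kind", display_name="Show", datatype="enum", default_value="all",
    )
    for value, label in [
        ("book", "Books"),
        ("cd", "CDs"),
        ("bookcd", "Books on CD"),
        ("all", "Everything"),
    ]:
        ET.SubElement(kind, "EnumValue", value=value, display_value=label)

    # Hypothetical preference: how many results to return.
    ET.SubElement(
        module, "UserPref",
        name="count", display_name="Number of results", default_value="10",
    )

    # The gadget body; ElementTree escapes the markup when serializing.
    content = ET.SubElement(module, "Content", type="html")
    content.text = content_html
    return ET.tostring(module, encoding="unicode")
```

Calling build_gadget_spec("Top Items", "<div>Loading...</div>") returns the spec as a string. The real gadgets would fill the Content section with the script that fetches data from PatREST.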

The beauty of these gadgets is that they require no modification of the code at all in order to be used with other libraries. The catch is that the library needs to provide the PatREST API (and right now, that's only AADL). My hope is that other libraries will see and recognize the usefulness of a patron-friendly API.

Installing the gadget is dead easy. First, you need to create an account with Google so that you can personalize the Google home page. Then, click the "Add content" button (top-left). You'll notice a small option next to the search box, "Add by URL". Click it, and paste in the URL of one of the XML gadget files above. If successful, you'll see something like:

You can repeat this process for all of the gadgets, if you like. When you're finished, go back to your personalized homepage, and configure your gadgets:

Configuring the Top Items gadget:

Configuring the Check-outs gadget. Note that you will need a special token. This is the token used to access your personal RSS feeds:

That's it! I doubt it could be much easier. AADL patrons can benefit from these gadgets right away. Hopefully other libraries will consider the PatREST specification (vendors too!). Once I had done the prototyping, creating additional gadgets was very easy because of PatREST's simplicity and accessibility.

Naturally, these XML files are released under GPL v.2. Feel free to modify and extend them as you like. I have a number of ideas that I just don't have time for right now, but hope to add in the future.

Also, because of the quality of XML data I get out of the III system and what I think is a bug on Google's end, the results occasionally do not display. Once Google clears its internal cache, however, things usually fix themselves. I think this little bug is out of my hands--at least until I can learn more about it.

Please use these, and enjoy them responsibly!

*** Update 8/19/2006 ***

Added subject search to the New Items gadget.

*** Update 9/1/2006 ***

Not sure why I didn't do this before, but I added "Add to Google" buttons to the links in this post, as well as on my files page. All you need to do to use these gadgets is click on the button and confirm by clicking on "Add it now". How's that for simple?

John Wilkin to speak at AADL tonight

Last year, I blogged a talk that University of Michigan's John Wilkin gave to our staff during our annual staff training day. I found the talk to be very interesting as he covered the Google digitization process from the University's perspective. His thoughts are particularly useful because he's not a Google employee so he's not spouting the company line, yet he is, in every way, an insider to the entire digitization program.

At any rate, he's speaking again tonight at AADL. If you're in the area and have the evening free, I'd highly recommend his talk. From the AADL website:

What does the UM/Google partnership to digitize the UM Library mean; what significance will this have for libraries, researchers and the public; and why is this so controversial? Through this project, UM hopes to guide more users: to their local libraries; to digital archives of some of the world's greatest research institutions; and to out-of-print books they might not be able to find anywhere else--all while carefully respecting authors' and publishers' copyrights. This event is the Library Director's program for 2006.

Each year, AADL Director Josie Parker chooses a current topic of community, state or national concern to highlight at a special program during this national week of celebrating libraries.

His talk will be at the Downtown branch in the Multi-Purpose Room. Regrettably, I'll be missing it due to prior commitments.

[tags] Google, Wilkin, Books, OCR, Ann Arbor, AADL [/tags]