A Brief Setback – Bookshelf Design, Part 2

LibraryThing is awesome. It’s pretty much the only way my library can possibly be in any sort of order and the database export is the starting point for the cluster analysis shelving project. On top of that it’s a great community of tech people with a love for books, real actual physical printed books, which is a rare and valuable quality in the world today.

Unfortunately, in the process of cataloging and double-checking books as I boxed them up to take over to the new house, I discovered that some of the auto-generated physical dimensions are out-of-order (swapping height and length, length and thickness, etc.) or wrong.

Which means that I am now ~1% done with verifying and/or entering the dimensions of the books piled throughout the first floor.

So…

download (1)

GitHub – Bookshelf Design, Part 1.5

And it has begun…

https://github.com/dnorum/cluster_analysis

Technically, as can be seen from the commit history, it began at the end of November last year. But with books piled knee-deep around pretty much the entire periphery of my first floor, it needs to begin to get serious.

It still needs a bit more refactoring and updating to go from being a personal project, i.e. a hideously messy set of ad-hoc scripts racing to stumble and flop across the finish line before they achieve a critical mass at which point any bug ‘fix’ introduces k>1 new bugs resulting in an event horizon from beyond which useful results can no longer escape, to a ‘real’ project with an actual useful README and notes and well-structured directories and all of that that you can be reasonably sure you can run without accidentally rm-rfing (remreffing? rm -fr and rimfiring, maybe?) yourself in the foot.

And then I need to actually finish it.

But it’s getting there.

Overkill is Underrated – Bookshelf Design, Part 1

As part of moving to a new house I have the opportunity to (pending consultation with a structural engineer) actually shelve – rather than box – my entire library. Twenty-plus years of literary (and, well, some not-so-literary) acquisitions organized for the first time – the idea was thrilling. But it would require surmounting some serious practical difficulties.

(See the structural engineer mentioned above.)

According to LibraryThing’s wonderful stats page, my library at time of moving was roughly 4,377 volumes [‘roughly’ because it sometimes doesn’t fully capture multi-volume sets, pamphlets and magazines, etc.] weighing about 7,858 pounds and occupying (if they were to be somehow shelved) 725 linear feet. [Adding epsilon to account for my prolific use of post-it scrids.]

Obviously, my previous shelf-building strategies (to wit, building a case to fit whatever wall-space was available and then filling it up with whichever books were of an appropriate size; or piling up books of like size until I had a case’s worth and then building such a case) would be insufficient for this task.

A complication is that the library contains books of widely varying sizes. Lots of paperbacks, yes, and those are certainly straightforward to shelve, but also textbooks, square-format newspaper comic collections, textbooks, long-format comics collections (both large and small), graphic novels, trade paperbacks, hardcover books, etc. Given the number of bookcases I would be making, optimal usage of construction materials was also a consideration – textbook shelves would have to be constructed much more sturdily than paperback shelves; I wanted to avoid excess shelf depth for cost, weight, space, and dust-collection reasons; etc.

So I decided that this would be a good opportunity/excuse to try a ‘practical’ programming project – pull my LibraryThing collection info into a PostgreSQL database, scrub and format it, spit out the well-formed records to pass into an R script to run a K-means cluster analysis, plot the results as well as summary views before and after with GNUplot, pull it back into PostgreSQL to generate statistics for each cluster and add appropriate shelving tags to my LibraryThing database, and wrap as much as possible up in a bash script for simplicity’s sake.

(Before I began, I was well aware that after all of this I could very well end up with the classic “case for textbooks and tall books, two cases for trade paperbacks, the rest split between hardcovers and paperbacks” distribution, but in the first place it might very well not, and in the second, well, it’d be fun anyways.)

And…

Overkill is underrated.