E-book publishers know more than ever about our reading habits. They know, for example, that “the ideal hero for ‘Great Escapes’ readers is tall with black hair and green eyes, a rugged, burly build and a moderately but not overly hairy chest.” And yet we, whose reading powers the industry, have access to only a tiny fraction of this data — even at the coarsest level. At some point, I expected that shifting my reading habits from pulp to pixels would make tracking these habits simpler, easier, epiphanic. So far, little luck. One example that constantly taunts me: My Kindle’s main screen lists how much I’ve read of each book it holds, and yet I still can’t export even this most basic information.
So imagine my glee when I stumbled upon Instapaper’s articles.ipdb file in a recent iPad backup.1 This file, an SQLite database, appears to contain almost everything Instapaper knows about me: the dates I queued up each article, how much of each I’ve read, which I’ve “liked”, and which I’ve archived, among other details. It’s no “black hair and green eyes,” but it’s a start.2
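Since the file is an ordinary SQLite database, a first step is simply to see which tables and columns it holds. The following is a minimal Python sketch, not a description of Instapaper's actual schema: the function name `describe_sqlite` is my own, and it relies only on SQLite's built-in `sqlite_master` table and `PRAGMA table_info`.

```python
import sqlite3
from pathlib import Path


def describe_sqlite(path):
    """Return {table_name: [column names]} for the SQLite file at `path`."""
    # Open read-only so the extracted backup file cannot be modified by accident.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            # PRAGMA table_info yields one row per column; index 1 is the name.
            quoted = table.replace('"', '""')
            columns = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
            schema[table] = [column[1] for column in columns]
        return schema
    finally:
        conn.close()
```

Calling `describe_sqlite("articles.ipdb")` on an extracted copy should list whatever tables and columns the file really contains, which is safer than guessing at their names.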
First, some caveats. Instapaper is a skewed reflection of my reading habits; I use it mostly for medium-length news and magazine articles. Anything shorter I’ll read at my computer; anything longer I’ll read either in print or on my Kindle. For a very long stretch between October 2011 and mid-March 2012, I didn’t use Instapaper at all. I’m analyzing only the items I added to Instapaper since the end of that dry spell, and only up until September 1, 2012.3
During that time, I added 209 articles to Instapaper. I read at least a little bit of 59% of those 209 articles. I got roughly 78% of the way through those I began reading. (Though I’m not sure how much to trust Instapaper’s read-position data.) I tapped “Like” (in the Instapaper sense, not the Facebook sense) on one-third (32.5%) of the articles I began reading, or one-fifth (19.6%) overall.
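For anyone who wants to compute the same headline figures from their own data, here is a rough Python sketch. The `Article` class and its `progress` and `liked` fields are hypothetical stand-ins, not Instapaper's real column names, and "begun" is assumed to mean a read position greater than zero.

```python
from dataclasses import dataclass


@dataclass
class Article:
    progress: float  # hypothetical field: fraction read, from 0.0 to 1.0
    liked: bool      # hypothetical field: whether "Like" was tapped


def summarize(articles):
    """Compute share begun, average progress of begun, and like rates."""
    added = len(articles)
    begun = [a for a in articles if a.progress > 0]  # assumption: begun = any progress
    liked = [a for a in articles if a.liked]
    liked_and_begun = [a for a in begun if a.liked]
    return {
        "added": added,
        "begun_share": len(begun) / added if added else 0.0,
        "avg_progress_of_begun": (
            sum(a.progress for a in begun) / len(begun) if begun else 0.0
        ),
        "liked_of_begun": len(liked_and_begun) / len(begun) if begun else 0.0,
        "liked_of_added": len(liked) / added if added else 0.0,
    }
```

The two like rates differ only in their denominator, which is why the one-third figure shrinks to one-fifth when measured against everything added.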
Those 209 articles came from 112 different web domains. Eighty-four domains appear just once; another 15 just twice. Only five sites show up at least five times: nytimes.com (24), slate.com (15), newyorker.com (10), niemanlab.org (8), and reuters.com (5). Notably missing among the most-Instapapered domains is my employer’s site, wsj.com — I pick up the print edition every morning in the office.
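Tallies like these can be reproduced from a list of saved URLs. The sketch below is one possible approach in Python, not necessarily how the numbers above were produced: the function names are my own, and it assumes a leading "www." should be dropped while other subdomains are left alone.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_counts(urls):
    """Count saved articles per domain, ignoring a leading 'www.'."""
    counts = Counter()
    for url in urls:
        host = urlparse(url).hostname or ""
        if host.startswith("www."):
            host = host[4:]
        counts[host] += 1
    return counts


def frequency_of_counts(counts):
    """Map 'times a domain appears' to 'number of domains with that count'."""
    return Counter(counts.values())
```

`domain_counts(urls).most_common(5)` would give the top five sites, and `frequency_of_counts` answers questions such as how many domains appear exactly once. Note that a subdomain such as a blogs host would be counted separately under this rule unless it is collapsed by hand.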
Of my most-Instapapered domains, it seems I’ve been happiest with articles from reuters.com. Of the five pieces I added from that site (all blog posts by Jack Shafer and Felix Salmon), I liked two, or 40%. (Admittedly, five articles is a small sample.) Next-most-liked was newyorker.com (20%), followed by nytimes.com (17%), slate.com (13%), and niemanlab.org (0%).4
It turns out, though, that I began reading only 10 of the 24 nytimes.com articles. If we look only at percentage-liked-of-begun, the Times’s rate improves to 40%, as does newyorker.com’s. Both tie reuters.com, whose rate doesn’t change under the new metric, because I began reading every article I added from that site. Slate’s rate improves marginally, from 13% to 17%.
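The change of denominator is easy to check by hand. In this small Python sketch, `like_rates` is a hypothetical helper; the nytimes.com inputs use 24 added and 10 begun as stated, while the count of four liked articles is my inference from the rounded 17% figure rather than a number given directly.

```python
def like_rates(added, begun, liked):
    """Return (percent liked of added, percent liked of begun), rounded."""
    if added <= 0 or begun <= 0:
        raise ValueError("added and begun must both be positive")
    return round(100 * liked / added), round(100 * liked / begun)


# nytimes.com: 24 added, 10 begun, 4 liked (the 4 is inferred, see above)
nytimes = like_rates(added=24, begun=10, liked=4)

# reuters.com: 5 added, all 5 begun, 2 liked
reuters = like_rates(added=5, begun=5, liked=2)
```

Under those assumptions, `nytimes` comes out as 17% of added and 40% of begun, while `reuters` is 40% under both measures because every added article was begun.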
Slate’s low “like” rate surprises me. I used to work there, and I enjoy the magazine’s general style. There’s at least some upside for Slate in my stats: I tend to read further into Slate articles (at least on a percentage basis) than those from any other most-Instapapered site. Of the 11 slate.com articles I began reading, I stopped 88% of the way through, on average. By comparison, my average stopped-reading position was 81% for nytimes.com, 77% for reuters.com, and 65% for both newyorker.com and niemanlab.org. To be fair to the New Yorker, their articles are typically much longer than the others under this digital microscope.
Enough of the site-by-site comparisons. Let’s look at temporal trends. I divvied up my Instapaper habits into 21 week-wide chunks, to see if my habits changed over time. They did, to some extent. I never added more than eight articles to Instapaper during any of the first 10 weeks. Then, in the following five weeks, I added 17, 20, 30, 23, and 12 articles. That this spike directly followed a couple of personal crises may be pure coincidence, but I suspect not.
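Splitting saved articles into week-wide chunks comes down to integer division on the number of days since a chosen start date. Here is a minimal Python sketch; `weekly_counts` is a hypothetical helper, and the start date is whatever day you decide the first week begins on.

```python
from datetime import date


def weekly_counts(added_dates, start):
    """Count articles per 7-day chunk, where chunk 0 begins on `start`."""
    if not added_dates:
        return []
    weeks = [(d - start).days // 7 for d in added_dates]
    if min(weeks) < 0:
        raise ValueError("every date must fall on or after the start date")
    counts = [0] * (max(weeks) + 1)
    for week in weeks:
        counts[week] += 1
    return counts


# Toy dates for illustration only, not taken from the real database.
toy_dates = [date(2012, 6, 25), date(2012, 6, 27), date(2012, 7, 2), date(2012, 7, 15)]
toy_counts = weekly_counts(toy_dates, start=date(2012, 6, 25))
```

Weeks with no additions show up as zeros in the list, which keeps quiet stretches visible instead of silently dropping them.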
The “like” rates and stopped-reading positions seem to carom sharply from week to week, without a readily discernible trend. Among the weeks in which I added at least 10 articles, the one that began June 25 was especially strong; I added 17 articles, began reading 16 of them, and liked nine (or a lofty 56%) of those. Less than a month later, for the week of July 12, I was apparently a less savvy (or at least more critical) Instapaperer. Of the 23 articles I added that week, I ultimately began reading just 39% of them, and liked just 33% of even that culled group.
Have you done any similar analyses of your reading habits? Notice any gruesome flaws in my numbers? Any philosophical objections? It’d be great to hear from you.
Further reading: In December 2011, Read It Later ranked writers by the number of times the service’s users saved their articles, and how often readers “returned” to those articles. In April 2012, Longform.org analyzed the 2,805 articles it had recommended since launching two years earlier. Readability keeps a running leaderboard of the hour’s, day’s, and week’s most popular articles.
To find this database, use iTunes to back up the iOS device you use for Instapaper. Then, extract the
articles.ipdb file should be in the
I’m not as familiar with other article-saving services, such as Readability and Pocket (née Read It Later), as I am with Instapaper. If you’ve found a way to export or analyze your reading histories on these services, let me know and I’ll add a link or instructions. ↩
I downloaded the copy of the database I’m using for this analysis on September 12, 2012. If I haven’t read something in my queue within 10 days of adding it, I figure I probably never will. ↩
For the record, I do like the Nieman Lab; apparently, I just wasn’t so enthusiastic about the eight posts I Instapapered. ↩