Viewing scientific articles on the iPad: browsing articles

touchevents.pngIn previous articles I've looked at how various apps display scientific articles. The apps I looked at were:

So, where next? As Ian Mulvany noted in a comment on an earlier post, I haven't attempted to summarise the best user interface metaphors for navigation. Rather than try and do that in the abstract, I'd like to create some prototypes to play with various ideas. The Sencha Touch framework looks a good place to start. It's web-based, so things can be prototyped rapidly (I'm not going to learn Objective C anytime soon). There's a moderately steep learning curve, unless you've written a lot of Javascript (I've done some, but not a lot), but it seems to offer a lot of functionality. Another advantage of developing a web app is that it keeps the focus on making the content accessible across devices, and using the web as the means to display and interact with content.

Then there is also the issue (in addition to displaying an individual article) of how to browse and find articles to view. Here are some possibilities.

Publisher's stream
Apps such as the Nature app and the PLos Reader provide you with a stream of articles from a single publisher. This is obviously a bit limiting for the reader, but might have some advantages if the publisher has specifically enhanced their content for devices such as the iPad.

Personal library
Apps such as Mendeley and Papers provide articles from your personal library. These are papers you care about, and one you may make active use of.

Social
Social readers such as Flipboard show the power of bringing together in one place content derived from social streams, such as Twitter and Facebook, as well as curated sources and publisher streams. Mendeley and other social bookmarking services (e.g., CiteULike, Connotea) could be used to provide social similar streams of papers for an article viewer. Here the goal is probably to find out what papers people you know find interesting.

Spatialipadmap.png
In an earlier post I used a map to explore papers in my BioStor archive. This would be an obvious thing to add to an iPad app, especially as the iPad knows where you are. Hence, you could imagine browsing papers about areas that are near you, or perhaps by authors near you. This would be useful if, say, you wanted to know about ecological or health studies of the area you live in. If the geographic search was for people rather than papers, you could easily discovering what kind of research is published by universities or other research bodies that are near your current location.

Of course, Earth is not the only thing we can explore spatially. Google maps can display other bodies in the solar system, (e.g., Mars), as well as the night sky. Imagine being interested in astronomy and being able to browse papers about specific planetary or stellar objects. Likewise, genomes can be browsed using Google maps-inspired browsers (e.g., jBrowse), so we could have an app where you could easily retrieve articles about a particular gene or other region of a genome.

Categories
Another way to browse content is by topic. Classifying knowledge into categories is somewhat fraught, but there are some obvious wasy this could be useful. A biologist might want to navigate content by taxonomic group, particularly if they want to browse through the 1000's of articles published in a journal such as Zootaxa (hence my experiments on browsing EOL). Of course, a tree is not the only way to navigate hierarchical content. Treemaps are another example, and I've played with various versions in the past (see here and here).

qt.png

I have a love-hate relationship with treemaps, but some of the most interesting work I've seen on treemaps has been motivated by displaying information on small screens, e.g. "Using treemaps to visualize threaded discussion forums on PDAs" (doi:10.1145/1056808.1056915).

Summary
These notes list some of the more obvious ways to browse a collection of articles. It would be fun to explore these (and other approaches) in parallel with thinking about how to display the actual articles. These two issues are related, in the sense that the more metadata we can extract from the articles (such as keywords, taxonomic names and other named entities, geographic localities, etc.) the richer the possibilities for finding our way through those articles.

Viewing scientific articles on the iPad: Papers

ipad_iphone.gifContinuing the series of posts about reading scientific articles on the iPad, here are some quick notes on perhaps the most polished app I've seen, Papers for iPad. As with earlier posts on the Nature and PLoS apps, I'm not writing an in-depth review - rather I'm interested in the basic interface design.

Papers is available for the Mac, as well as the iPhone and iPad. Unlike social bibliographic apps such as Zotero and Mendeley, Papers lacks a web client. Instead, all your PDFs are held on your Mac, which can be wirelessly synced with Papers on the iPad or iPhone.

Navigation popover

Papers makes extensive of use of the split view, in which the screen is split into two panes, the left-hand split becoming a popover when you hold the iPad in porrtait orientation. Almost all of the functionality of the iPhone version is crammed into the left-hand split. The popover displays the main interface categories (library, help, collections that you've put PDFs into), collections of documents, metadata for individual papers (which you can edit), as well as search results from a wide range of databases:

papers.jpg

Some of these features you encounter as you drill down, say from library to list of papers, to details about a document, others you can access by clicking on the tab bar at the bottom.

PDF display
Like the PLoS app, Papers displays PDFs. It doesn't use a page-turning effect, rather you swipe through the pages from left to right, with the current page indicated below in a page control (what Sencha Touch describe as a carousel control).

touch5.jpg

Given that the document being displayed is a PDF there is no interaction with the images or citations, but you can add highlights and annotations.

Summary
Papers is the first of the iPad apps I've discussed that isn't limited to a single publisher. If and article is online, or in your copy of Papers for the Mac, then you can view it in Papers for iPad. It is the app that I use on a day to day basis, although the PDF viewer can feel a little clunky. I think anyone designing an application reader should play with Papers for a while, if only to see the level of functionality that can be embedded in the basic iPad split view.

BioStor gets PDFs with XMP metadata - bibliographies made easy with Mendeley and Papers

The ability to create PDFs for the articles BioStor extracts from the Biodiversity Heritage Library has been the single most requested feature for BioStor. I've taken a while to get around to this -- for a bunch of reasons -- but I've finally added it today. You can get a PDF of an article by either clicking on the PDF link on the page for an article, or by appending ".pdf" to the article URL (e.g., http://biostor.org/reference/570.pdf). In some ways the BioStor PDFs are pretty basic - they contain page images, not the OCR text, so they tend to be quite large and you can't search for text within the article. But what they do have is XMP metadata.

XMP metadata
7950C5D8-3299-48CF-96A0-9D02BA6F0B03.jpgOne of the great bugbears about organising bibliographies is the lack of embedded metadata in PDFs, in other words Why can't I manage academic papers like MP3s? (see my earlier post for some background). Music files and digital photos contain embedded metadata that store information such as song title and artist in the case of music, or date, exposure, camera model, and GPS co-ordinates in the case of digital images. This means software (and webs sites such as Flickr) can automatically organise your collection of media based on this embedded metadata.

mendeley.pngpapers.pngWouldn't it be great if there was an equivalent for PDFs of papers, whereby the PDF contains all the relevant the bibliographic details (article title, authorship, journal, volume, pages, etc.), and reference managing software could read this and automatically put the PDF into whatever categories you chose (e.g., by author, journal, or date)? Well, at least two software programs can do this, namely the cross-platform Mendeley, and Papers, which supports Apple's Macintosh, iPhone, and iPad platforms. Both programs can read bibliographic metadata in Adobe's Extensible Metadata Platform (XMP), which has been adopted by journals such as Nature, and CrossRef has recently been experimenting in providing services to add XMP to PDFs.

One reason I put off adding PDFs to BioStor was the issue of simply generating dumb PDFs for which users would then have to retype the corresponding bibliographic metadata if they wanted to store the PDF in a reference manager. However, given that both Papers and Mendeley support XMP, you can simply drag the PDF on to either program and they will extract the details for you (including a list of up to 10 taxonomic names found in the article). Both Papers and Mendeley support the notion of a "watched folder" where you can dump PDFs and they will "automagically" appear in your reference manager's library. Hence, if you use either program you should be able to simply download PDFs from BioStor and add them to your library without having to retype anything at all.

Technical details
This post is being written as I'm waiting to catch a plane, so I haven't time to go into all the gory details. The basic tools I used to construct the PDfs were FPDF and ExifTool, which supports injecting XMP into PDFs (I couldn't find another free tool that could insert XMP into a PDF that didn't already have any XMP metadata). I store basic Dublin Core and PRISM metadata in the PDF. The ten most common taxonomic names found in the pages of the article are stored as subject tags.

FA644E5D-F2DE-4B00-BADD-140806C4EE00.jpgInitially it appeared that only Papers could extract XMP, Mendeley failed completely (somewhat confirming my prejudices about Mendeley). However, I sent an example PDF to Mendeley support, and they helpfully diagnosed the problem. Because XMP metadata can't always be trusted, Mendeley compares title and author values in the XMP metadata with text on the first couple of pages of the PDF. If they match, then the program accepts the XMP metadata. Because my initial efforts as creating PDfs just contained the BHL page images and no text, they wouldn't pass Mendeley's tests. Hence, I added a cover page containing the basic bibliographic metadata for the article, and now Mendeley is happy (the program itself is growing on me, but if you're an Apple fanboy like me, Papers has native look and feel, and syncing your library with your iPhone is a killer feature). There are a few minor differences in how Papers and Mendeley handle tags. Papers will take text in the Dublin Core "Subject" tag and use those as keywords, whereas to get Mendeley to extract tags I had to store them using the "Keywords" tag (implemented using FPDF's SetKeywords function). But after a bit of fussing I think the BioStor PDFs should play nice in both programs.
Powered by Blogger.