September 9, 2012

Visualization of All My Publications

I rebuilt my home page, including a dynamic visualization that gives an overview of all my publications going back 20 years and lets visitors select among them. I used D3, a fantastic JavaScript visualization library built on SVG, which is supported in all modern browsers.


The key idea of the visualization is to provide a rich, interactive overview that gives clear access to my life's publications without requiring any training. To reach this goal, I simplified - offering only the essential features that let people filter based on three fundamental characteristics:

  • People - my co-authors
  • Time - when I published
  • Publications - what I published, including hand-picked keywords

I chose a bubble plot for my co-authors that displays people larger depending on the number of papers I co-authored with them. People are colored based on their basic category (Faculty, Student, Industry or Researcher). People's names are truncated (and sized) depending on available space - but show the full name (and number of co-authored publications) as a tooltip.

All visual elements are selectable and filter based on the thing that was clicked (person, person type, category - and even individual papers, represented by orange dots in the timeline). It is also possible to perform a text search over the metadata. To support all of this interaction without making it overly complex, I chose a simple design that supports neither history nor queries more complex than simple conjunctions. To start over, press the "Reset" button.
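As a rough illustration of what "simple conjunctions" means here, the sketch below keeps a publication only if it passes every filter that is currently set. It is not the site's actual code: the `Publication` fields, the `matches` and `filter_publications` helpers, and the idea of searching titles and keywords for the text query are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Publication:
    # Hypothetical record; the real database schema is not shown in this post.
    title: str
    year: int
    authors: frozenset
    keywords: frozenset


def matches(pub, person=None, keyword=None, years=None, text=None):
    """Return True only if pub passes every filter that is set (a conjunction)."""
    if person is not None and person not in pub.authors:
        return False
    if keyword is not None and keyword not in pub.keywords:
        return False
    if years is not None and not (years[0] <= pub.year <= years[1]):
        return False
    if text is not None:
        needle = text.lower()
        in_title = needle in pub.title.lower()
        in_keywords = any(needle in k.lower() for k in pub.keywords)
        if not (in_title or in_keywords):
            return False
    return True


def filter_publications(pubs, **filters):
    """Apply all active filters at once; with no filters, everything is kept."""
    return [p for p in pubs if matches(p, **filters)]
```

Calling `filter_publications(pubs)` with no filters returns every publication, which corresponds to pressing "Reset". Because there is no OR and no history, the whole query state is just the handful of optional arguments.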

The implementation is entirely custom. I built a Python/Django back end with a custom database that gives me an entity for each author (avoiding typos, variant names, etc.), venue and even keyword. Plus, there is an administrative interface for updating content, so I don't have to muck with HTML and FTP on a day-to-day basis anymore. The database also supports the main page, with news and project updates.
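To show why one entity per author helps, here is a minimal sketch using plain Python dataclasses rather than Django models. The class and field names are assumptions for illustration, and venues and keywords are simplified to strings even though the real database treats them as entities too.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity semantics: one object per real person
class Author:
    name: str
    category: str  # e.g. Faculty, Student, Industry or Researcher


@dataclass(eq=False)
class Publication:
    title: str
    year: int
    venue: str
    authors: list
    keywords: list = field(default_factory=list)


def coauthor_counts(pubs, me):
    """Count how many publications each co-author shares with `me`."""
    counts = Counter()
    for pub in pubs:
        for author in pub.authors:
            if author is not me:
                counts[author] += 1
    return counts
```

Because every publication points at the same `Author` object, a misspelled name is fixed in one place, and the per-person counts (the kind of number a bubble plot could size by) never split across spelling variants.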

The visualization uses the standard D3 bubble plot and a custom timeline. The timeline includes range sliders on top to enable filtering by date as well as by keyword. Each publication is represented by an orange dot. Clicking on a dot shows the first page of the paper, an image extracted from the .pdf file using ImageMagick. I've also made the PDFs of all papers directly accessible.
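For anyone wanting to reproduce the first-page thumbnails, the sketch below shows one way to call ImageMagick from Python. The exact command used for this site is not given, so the helper names, the PNG output format and the density value are assumptions. It relies on ImageMagick's `file.pdf[0]` syntax for selecting the first page, and on Ghostscript being installed so that ImageMagick can read PDFs.

```python
import subprocess


def first_page_command(pdf_path, png_path, density=100):
    """Build the ImageMagick command that renders page 0 of a PDF to an image."""
    # "[0]" after the file name selects the first page only.
    # On ImageMagick 7 the executable is "magick" rather than "convert".
    return ["convert", "-density", str(density), f"{pdf_path}[0]", png_path]


def render_first_page(pdf_path, png_path, density=100):
    """Run the command; raises CalledProcessError if ImageMagick fails."""
    subprocess.run(first_page_command(pdf_path, png_path, density), check=True)
```

For example, `render_first_page("paper.pdf", "paper.png")` would write the first page of the hypothetical file `paper.pdf` to `paper.png`. Placing `-density` before the input file sets the resolution at which the PDF is rasterized.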

All the code is open source under a BSD license, although I didn't bother to publish it in a public repository because it is somewhat complicated to deploy a Django app. Instead, simply look at the HTML source and follow the JavaScript links. For example, the primary D3 visualization code is available here: http://www.cs.umd.edu/~bederson/js/vis.js

There is also a simple RESTful API for accessing the database, should you want my publication data for some reason.

4 comments:

  1. So very, very cool

  2. Very nicely done animation, it has been a great inspiration for a similar project that I am working on.
