Nexus 3D RDF Visualization as an OpenSimulator Region Module Displaying Researcher Interest VIVO Data

The trouble with triples (not tribbles ;-), for me, is that there are a lot of them. Last October, I ported my Nexus 3D RDF Visualizer into OpenSimulator and was quite happy with the large number of graphics primitives available to display larger numbers of triples.  I started working with a set of 31,934 triples that represented the molecular structure of a strand of DNA but realized the programming model I was using wasn't going to scale as far as I wanted.

Nexus, to this point, existed as a massive number of scripted primitives that communicated with a back-end server that kept them all coordinated.  In my prior post, I was able to display 658 triples of FOAF data without any problems.  These 658 triples amounted to about 935 scripted graphics primitives (prims), and it worked and it was fast.  The front-end operated as a massively parallel processor, but when the number of scripts (or, in general terms, threads) increases without a corresponding increase in physical computation cores, the overhead of the separate threads becomes a liability.  In my case, I was looking at trying to run in excess of 40,000 scripted prims just for the DNA RDF data set. *sighs heavily*  To solve this problem, I wrote a new Nexus OpenSimulator front-end, not as a series of scripted prims, but as an actual OpenSimulator region module.
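To get a feel for how prim counts grow with triple counts, here is a rough sketch in Python. It assumes one prim per distinct subject or object plus one "stick" prim per triple; that is my reading of the numbers above, not a description of the actual Nexus code, and `estimate_prims` is a made-up helper name. Under that assumption, 935 prims for 658 triples would mean roughly 277 distinct nodes.

```python
def estimate_prims(triples):
    """Estimate prim count: one prim per distinct node plus one stick per triple.

    Assumption (not from Nexus itself): subjects and objects share a single
    prim when they are the same term, and every triple gets its own stick.
    """
    unique_triples = set(triples)
    nodes = set()
    for s, _p, o in unique_triples:
        nodes.add(s)
        nodes.add(o)
    return len(nodes) + len(unique_triples)


# Tiny made-up FOAF-style example: 3 distinct nodes, 3 triples.
example_triples = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:knows", "ex:carol"),
    ("ex:bob", "foaf:knows", "ex:carol"),
]
prim_estimate = estimate_prims(example_triples)
print(prim_estimate)
```

With one script per prim, this estimate is also a rough count of the scripts the old front-end would have had to run.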

Region modules are extensions of the core OpenSimulator server software.  OpenSimulator is written in C#, and because it is open-source, I was able to dig right in and exercise far greater control over OpenSimulator at this level than when I was working with the easier, but more limited, scripted prims method.  The following images show a display of 20,002 triples (~25,000 prims; front and back, with close-ups) which represents about 534 individual SUNY researchers and their research interests extracted from a VIVO installation that we are developing at Stony Brook (thank you Tammy DiPrima and Dr. Jizu Zhi, who are part of that team).  The RDF data was then normalized by using the MeSH terms extracted from the researchers' PubMed publications and then linked through an RDF representation of the UMLS (Unified Medical Language System) that was created by my colleague, Dr. Janos Hajagos.

What is this normalization?  When I first tried visualizing our researcher interests from VIVO in Nexus, I found that the research interests did not really link up and that I had 500+ little separate RDF graphs.  Why?  Because everyone had their own slightly different way of saying the same thing.  We took the publication information we had for these researchers and linked it to an RDF version of PubMed that we developed at Stony Brook.  In this linkage, we extracted the MeSH terms (MeSH is part of the UMLS) and then linked and normalized them through the RDF UMLS.  Once this was done, things began to link up.  Multiple linked datasets are more interesting than a single data set.
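The effect of normalization on the graph can be shown with a toy example. The sketch below counts connected components before and after mapping free-text interests onto a shared concept. The researcher names, the interest strings, and the `concept:myocardial_infarction` identifier are all invented for illustration; the real pipeline went through PubMed, MeSH, and the RDF UMLS rather than a hand-written dictionary.

```python
def count_components(edges):
    """Count connected components of an undirected graph using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
    return len({find(node) for node in list(parent)})


# Each researcher states the same interest in a slightly different way.
raw_edges = [
    ("researcher1", "heart attack"),
    ("researcher2", "myocardial infarction"),
    ("researcher3", "MI"),
]

# Hypothetical mapping onto one shared concept (stand-in for MeSH/UMLS linking).
to_concept = {
    "heart attack": "concept:myocardial_infarction",
    "myocardial infarction": "concept:myocardial_infarction",
    "MI": "concept:myocardial_infarction",
}
normalized_edges = [(r, to_concept.get(term, term)) for r, term in raw_edges]

components_before = count_components(raw_edges)
components_after = count_components(normalized_edges)
print(components_before, components_after)
```

Before the mapping, each researcher sits in a separate little graph; afterwards all three hang off the same concept node, which is the "linking up" described above.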

At the moment, the Nexus visualization of the data is more interesting to look at than useful.  The next step is to remove the over-linked trivial data that obscures the more useful information.  Hover-text labeling of triples and data-sensitive clustering algorithms are also on the to-do list (these features were present in the old front-end).  Fortunately, I did not have to change the back-end for the new front-end because it was my intention from the beginning to have multiple front-ends connecting to the back-end(s).  Concurrently, I started last fall to develop an HTML5/WebGL-based front-end version of Nexus that would be able to see and share the same sessions as the OpenSimulator-based front-end, with the extra twist of using WebSockets rather than HTTP to pass RDF between the front-end and back-end.  Data persistence between sessions is handled by OpenLink Virtuoso, which is tied to the back-end.  On a personal note, it's a lot of fun playing around with collaborative 3D Semantic Web visualization. :-)
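One possible way to approach the over-linked data, sketched in Python: drop every triple that touches a node whose degree exceeds a threshold. This is only an idea under my own assumptions, not something Nexus does today; `drop_overlinked` and the `max_degree` cut-off are hypothetical, and the sample triples are invented.

```python
from collections import Counter


def drop_overlinked(triples, max_degree):
    """Remove triples that touch a node linked more than max_degree times."""
    degree = Counter()
    for s, _p, o in triples:
        degree[s] += 1
        degree[o] += 1
    hubs = {node for node, d in degree.items() if d > max_degree}
    return [t for t in triples if t[0] not in hubs and t[2] not in hubs]


sample = [
    ("ex:r1", "rdf:type", "foaf:Person"),
    ("ex:r2", "rdf:type", "foaf:Person"),
    ("ex:r3", "rdf:type", "foaf:Person"),
    ("ex:r1", "ex:interest", "ex:genomics"),
    ("ex:r2", "ex:interest", "ex:genomics"),
]
# foaf:Person is linked to by everyone, so it is treated as trivial here.
filtered = drop_overlinked(sample, max_degree=2)
print(filtered)
```

A fixed threshold is crude; picking it relative to the size of the data set, or filtering by predicate instead, might work better in practice.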

Nexus Commands used:

color <0,1,0> p where {?s ?p ?o}   (color all predicates green; predicates are drawn as sticks)
color <1,0,0> s where {?s ?p ?o}   (color all subjects red)
color <0,0,1> o where {?s ?p ?o}   (color all objects blue)
color <1,1,1> o where {?s ?p ?o . filter(isliteral(?o))}   (color all literals white; this overwrites the blue that the previous command applied to literals)
shiny 3 spo where {?s ?p ?o}   (add a metallic sheen to all triples)
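The order of the color commands matters, because a later command overwrites the color set by an earlier one. The Python sketch below is a toy model of that behavior only. It does not parse the Nexus command syntax or run SPARQL-style matching; the `Triple` type, the `apply_rules` helper, and the sample data are all made up for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    s: str
    p: str
    o: str
    o_is_literal: bool = False


def apply_rules(triples, rules):
    """Apply (rgb, role, condition) rules in order; later rules win."""
    colors = {}
    for rgb, role, condition in rules:
        for t in triples:
            if not condition(t):
                continue
            if role == "p":
                key = ("stick", t.s, t.p, t.o)  # one stick per triple
            else:
                key = ("node", getattr(t, role))  # nodes keyed by term
            colors[key] = rgb
    return colors


def always(_triple):
    return True


# Same order as the commands listed above.
rules = [
    ((0, 1, 0), "p", always),                    # predicates green
    ((1, 0, 0), "s", always),                    # subjects red
    ((0, 0, 1), "o", always),                    # objects blue
    ((1, 1, 1), "o", lambda t: t.o_is_literal),  # literals white
]

data = [
    Triple("ex:alice", "foaf:knows", "ex:bob"),
    Triple("ex:alice", "foaf:name", "Alice", o_is_literal=True),
]
colors = apply_rules(data, rules)
print(colors[("node", "Alice")])
```

Swapping the last two rules would leave the literal blue, since the blue rule would then run after the white one.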