Chart Bum

Because every place worth going can be found on a topo map.

Reading Response the last

So here we are at last.  What happens next with this blog, I wonder?  I’ve been giving it some thought, but I’ll leave that for another post. 

For the moment, we’re discussing text visualization.  There is a meta-visualization element inherent to text visualization, since text is, after all, already a symbolic representation.  If you go back far enough in human history, to pictograms, hieroglyphics and their ilk, the difference between writing and visualization disappears entirely.  There is something philosophical to be said here about semantics and symbolic meaning, but I’ll save that for a late night conversation over a bottle of wine (or two). 

As it is, text visualizations are faced with the double-edged sword of visualizing data which already contains inherent meaning.  When your glyph is a sphere, you have a lot of flexibility in how you use it, given how abstract it already is.  When your glyph is the word, ‘methamphetamine,’ however, you’re faced with the challenge of using a symbol that is not only unwieldy, but contains more information and associations than any normal glyph.  I think this is the primary reason that both the best and worst visualizations I’ve ever seen were, at least partially, text visualizations.  Furthermore, the decision of what is important in the data becomes quite difficult when dealing with subjective data, such as a large amount of text.  Though word count can be revealing, it is only an indirect way of getting at the variable of concern, which is semantic meaning.  Both of the visualizations we looked at for this week had admirable ways of dealing with these issues, and I was impressed overall.  That these very intelligent and complex visualizations were still so flawed speaks more to the difficulty of the task than to specific issues with the visualizations themselves.
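To make the word-count point concrete, here is a minimal Python sketch.  The two sample sentences and the word_counts helper are my own inventions for illustration, not anything from the readings.  It shows two sentences that say opposite things yet produce identical word counts, which is why counts are only an indirect route to meaning.

```python
from collections import Counter
import re

def word_counts(text):
    """Count occurrences of each lowercase word (hypothetical helper)."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

# Two made-up sentences: same words, opposite meanings.
first = "The state may tax the bank."
second = "The bank may tax the state."

# The counts cannot tell the two sentences apart.
same_counts = word_counts(first) == word_counts(second)
```

Anything built purely on those counts, a word cloud included, would draw the two sentences identically.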

DocBlocks, a tool for visualizing the content of legislation, primarily focused on this problem of semantic meaning.  Using a learning program, it attempted to classify sections of legislation by subject area, regardless of the overall subject of the bill (for instance, locating a gun rights law in a credit card bill).  Though the authors assert that their categorization program works ‘well,’ they also point out that it often assigns sections to multiple categories or fails to give a specific categorization at all.  Since the authors intend to make this publicly available, this makes me wonder why they can’t implement a wiki-model, using the program to identify uncharacteristic passages within a bill (which it can already do) and allowing users to categorize those sections themselves.

In terms of the actual visualization of the data, the stacked lines were serviceable, though nothing spectacular.  Multiple categorizations could become cluttered quickly, but the combination of a search bar and an intuitive zoom tool would be incredibly useful.  If this were going to be a public tool, it would be useful to have a way for other users’ comments on specific sections to be accessible through the visualization, perhaps as a tool-tip menu.

Parallel text clouds is a really impressive project.  The number of different (and original) visualizations that have been linked in a highly interactive manner is prodigious.  It takes nearly eight minutes in this video just to gloss over all of the capabilities of this viz.  Though all this was very shiny, I couldn’t help feeling by the end that I had just used a supercomputer to solve a sudoku.  Though all of the tools were useful, I felt that the amount of information in the wordcloud itself, while interesting, was relatively sparse.  It strikes me that cases could be categorized by subject based on relevant case law cited and the same data could be presented more legibly in a series of simple bar graphs. 

Words already carry so much meaning that visualizing them without reducing them to abstractions or numbers is more likely to create a mess than a masterpiece.  Textclouds are a perfect example.  Though they are the primary text visualization available, they are notoriously useless for actually extracting relevant information.  The text visualizations that I’ve seen work are those like Ben Fry’s On the Origin of Species, which charts the changes and additions to the text over different editions.  The major difference here is that the semantic meaning of the words is irrelevant to the visualization itself.  The text is, of course, available directly from the viz, such that insights gained from the latter can be applied easily to the former.  However, visualizing the structure of a text is much more accessible than visualizing the meaning.  We’ve spent the last ten thousand years creating systems of writing that can convey information too complex to put into pictures.  In text visualization, to a certain extent, we’re trying to reinvent the wheel.

h/t Ming

mingminhui:

Josh will appreciate

thedailywhat:

Kristian Bjornard: Venn Diagrams … A Diagram

Meta!

[flickr.]

Beware:  An infographic can only be as good as its data.

Reading Response 8

The common question in the readings for this week was how to deal with too much data.  This is an increasingly common problem in our information society, especially in the field of security, where critical information is often hidden beneath piles of random noise.  Our natural ability to visually identify patterns is ideally suited to this problem, but we still need to find a way to present the data in a usable format.  Doing so requires culling large data sets down to the critical elements and, as in most cases with visual analytics, results in rather complex figures with steep learning curves.

Computers and computing networks are, as usual, prime candidates for this sort of visualization, given the immense amount of data available to the visualizer.  The paper we read only begins to touch upon the many, many ways visual analytics are currently being used in network security.  In fact, there is an entire manyeyes-esque site, secviz.org, which represents a substantial community devoted to the subject.  Having spent a fair bit of time working through some of the visualizations, I’ve realized I should have been a comp sci major.  As it is, though, it’s all over my head:  Like I said, steep learning curve.

A little more accessible, but arguably more complex, is the case of visualizing videos.  The paper on the subject was one of my favorites that we’ve come across so far.  The problem of creating (relatively) simple two-dimensional representations of complex four-dimensional data is staggeringly difficult, and I found their approach surprisingly usable.  My intuition was supported by their user study, the existence of which, of course, warmed my heart.  However, I was underwhelmed by the success rates for identification in the various conditions.  However clever their representation of the data, if it doesn’t reliably communicate what’s going on in the video, then what good is it?

The one part that got me really excited, however, was the set of examples using real video.  The clarity with which the visualization shows the bag and box being left and taken, where they may have been obscured in the video itself, makes it a powerful tool.

Reading Response 7

The papers this week reminded me of why I hated freshman bio.  The vast amount of classification data is the sort of thing that is usually learned through extensive rote memorization, which must be constantly updated as genetic family trees are arranged and rearranged in the presence of new data.  In short, this is a topic that my natural instinct tells me to run away from as fast as I possibly can.  In terms of biology, much of what didn’t go over my head in this week’s readings fell on willfully ignorant ears, but I will do my best to touch upon the graphical issues that were raised.

A theme we’ve touched upon again and again is how to deal with a huge amount of data without losing breadth, in order to spot trends, or depth, in order to examine specifics.  We’ve seen a number of ways to deal with this, including multiple dimensions, complex representations with colors and shapes, and multiple representations.  Dynamic zoom is a particularly useful tool, and is used to great effect in TreeJuxtaposer.  It brings up the interesting problem, however, of maintaining the context of the data while using the zoom.  The combination of dynamic scaling and the use of color is particularly effective in both highlighting the needed information and keeping the sense of overall place in the hierarchy.

Mizbee was one of the more beautiful visualizations we’ve looked at so far.  I found the way it dealt with multiple matching to be surprisingly elegant and intuitive for something that is really incredibly complex.  That said, I am too much of a biophobe to figure out exactly what the goal or takeaway of all these colorful wheel lines actually is.  I look forward to someone giving me the idiot’s guide to genetic markers one of these days.  Until then I’ll appreciate these for the pretty mandalas they are.

Mizbee, however, is a perfect example of a great visualization that I have absolutely no interest in.  Not because of its subject matter, since it could be used for various data, but because, despite its elegance, it is fundamentally non-intuitive.  I’m interested in visualization as argument and visualization as communication.  A different, and often contradictory, goal is to divine new relationships in existing data, in other words, visual analytics.  Though I can appreciate the latter on an aesthetic level, these sorts of tools usually have high barriers to entry in the form of a steep learning curve.  I am much more concerned, personally, with visualization as argumentation, and argumentation should be clear and accessible if it is to be successful.

Another good use of sound in visualization.  I’m really getting interested in the use of other senses to add dimensions.

vizualize:

http://www.whitevinyldesign.com/solarbeat/

Reading Response 6

Slate.com recently had an article on how hand-drawn maps are in many ways superior to more ‘accurate’ graphics.  Stephen Few’s introduction to geographic mapping makes essentially the same point:  When there is an abundance of information, including it all merely serves to obscure the important bits.  If I need to get to the dry cleaners four blocks away (and I do, at the moment, hooray for job interviews) it doesn’t really help me to know the side streets paralleling my route, or the name of each cross street between my apartment and Jet Cleaners.  The best a detailed map can do is confuse me.  At worst, I might misread it and end up too far up Dixwell at the wrong time of night.

One of the best maps I’ve ever received was a map of Chicago which consisted of four arrows labeled with street names and a couple of major landmarks that would tell me I’d gone too far.  Nothing was to scale, the landmarks were doodles and none of the lines were straight.  Still, our car did much better than the other two, which were using Google Maps, getting there about ten minutes ahead of schedule.  Just because you have a lot of data doesn’t mean you should use it all.  Down that road confusion and data mining lie.

Few’s point about color ordering is one that has come up on this blog and in many of the other readings before, so I’ll only mention that color is often less obtrusive and more useful than the sorts of graphics Few recommends.  The non-orderability (nominal nature) of colors is actually a benefit to a good designer, who can represent a nominal variable (region) with different colors and an interval variable with intensity at the same time.  As Few glosses, in terms of good graphical practice, there is a big difference between mapping regions and mapping locations, even if the map you are using for both is exactly the same.
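For the curious, here is a rough Python sketch of that two-variables-in-one-color idea.  Everything specific in it is my own assumption rather than anything from Few: the region names, the hue values, the encode_color helper, and the choice to treat ‘intensity’ as lightness in the HLS color model (it also assumes vmax is greater than vmin).  The hue carries the nominal variable and the lightness carries the interval one.

```python
import colorsys

# Hypothetical regions and hues (degrees on the color wheel); not from Few.
REGION_HUES = {"north": 210, "south": 30, "east": 120, "west": 300}

def encode_color(region, value, vmin, vmax):
    """Return (r, g, b) ints in 0-255: hue from region, lightness from value."""
    # Normalize the interval variable to 0..1 and clamp out-of-range values.
    t = (value - vmin) / (vmax - vmin)
    t = min(max(t, 0.0), 1.0)
    # Low values come out pale, high values come out dark.
    lightness = 0.85 - 0.55 * t
    hue = REGION_HUES[region] / 360.0
    r, g, b = colorsys.hls_to_rgb(hue, lightness, 0.8)
    return tuple(round(c * 255) for c in (r, g, b))

pale_north = encode_color("north", 0, 0, 100)
dark_north = encode_color("north", 100, 0, 100)
mid_south = encode_color("south", 50, 0, 100)
```

Two areas in the same region share a hue and differ only in how dark they are, while two regions with the same value differ only in hue, so a reader can pick out either variable without a second glyph.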

As for hurricanes, reading the Steed article without MDX to fool around with reminded me of reading those sex ed. books my parents bought for me in sixth grade.  Tantalizing, but without some hands-on experience you’re going to miss the point.  I can see in principle how a parallel coordinate system with an accessible user interface could be incredibly helpful, but without the chance to play with it I mostly have to take Steed et al. at their word.  The static bundles of lines in the accompanying screenshots, however, were rather unimpressive.  But that’s why so much emphasis was placed on dynamic interactivity!

Chart design consists of as many decisions about what to display as about how to display it.

vizualize:

LOVE LINES by Simon Becker
