Discussion


bpierce

Moderator
Registered:
Posts: 99
#1

The January/February/March 2016 Visual Business Intelligence Newsletter article, titled Visualizing Wide-Variation Data, was written by Nick Desbarats, Perceptual Edge's Senior Educator and Consultant. In it, Nick addresses the challenge of displaying values that vary significantly in magnitude, especially when a few huge values force the majority of the other, much smaller values to appear so low that they're difficult to see and compare. He describes several ways that people often attempt to address this issue, explaining their strengths and weaknesses, before recommending the solution that we think usually works best.

What are your thoughts about the article? Do you agree or disagree with Nick's assessment of the various methods for solving this problem? Are there useful approaches that he didn't address? We invite you to post your comments here.

 

wd

Registered:
Posts: 167
#2
The "magnified inset solution" is clearly the best. Context is retained and it's all on one "page".

I've done something similar for aggregation of subsets - example attached. I know the vertical time axis will ruffle some feathers, but I'm choosing to ignore that :)

Capture.JPG


__________________
Bill Droogendyk
Nick_Desbarats

Registered:
Posts: 6
#3
Thanks for posting your sample graph, Bill. It features a common situation wherein the wide variation in the data arises from a need to show both the individual values in a data series and the total of those values, which is, of course, generally much larger than any individual value.

In such cases, I agree that a two-graph solution is a good choice, although, if you opt for a magnified inset graph, you might try to find a way to make the relationship between the "totals" graph and the "individual values" graph clearer than it is in your example, perhaps with light lines that show the "individual values" graph "exploding" from the "totals" graph.
 
And yes, the vertical time axis is a feather-ruffler <grin>. That and a few other issues in this graph could probably be avoided by switching to a line graph with one line per customer, and then stacking a separate "total" graph on top of the "customers" graph. If the "customers" line graph were too crowded with eight lines, it could be split further into graphs for the top four and bottom four customers, or some other meaningful segmentation. Time would then go from left to right, it would be easier to see and compare patterns of change among customers over time (assuming that that's the story you're telling with this graph), you wouldn't need to repeat the date labels nine times, and you wouldn't have value-encoding objects that are very far away from the quantitative scale and thus hard to "eyeball" accurately. Because this would be a multi-graph solution with the large values (i.e., the totals) in their own graph, all of the lines for the smaller values would be clearly visible.
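If it helps to picture that rearrangement, here's a rough sketch (with invented customer data, and again assuming matplotlib) of a "total" line graph stacked above a per-customer line graph, the two panels sharing the same horizontal time axis:

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; assumes matplotlib is installed
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
customers = {  # invented values for illustration only
    "Customer A": [40, 45, 43, 50, 55, 60],
    "Customer B": [22, 20, 25, 24, 28, 30],
    "Customer C": [10, 12, 9, 11, 13, 12],
}
totals = [sum(vals) for vals in zip(*customers.values())]

# Two stacked panels sharing the time axis: the large values
# (totals) isolated on top, individual customers below on a
# scale that suits their much smaller range.
fig, (ax_total, ax_cust) = plt.subplots(2, 1, sharex=True)
ax_total.plot(months, totals)
ax_total.set_ylabel("Total")
for name, vals in customers.items():
    ax_cust.plot(months, vals, label=name)
ax_cust.set_ylabel("Per customer")
ax_cust.legend(fontsize="small")

fig.savefig("stacked_total_and_customers.png")
```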

__________________
Nick Desbarats
Sr. Educator and Consultant, Perceptual Edge
rory_gaddin

Registered:
Posts: 1
#4
Thanks for the informative and thorough article, Nick.

One thing that struck me while reading was how we need to always be mindful of the ways in which the data will be interacted with and shared with others.  Many modern mobile and web-based UIs allow the end-user a great degree of flexibility in filtering out or "zooming in" to the data that is of interest to them, and thus to decide for themselves which aspects of the story are relevant for them at that point in time.

This is all fine and well if someone is analyzing the data for their own purposes to make a decision and move on from there to take some sort of action, but can be problematic if that same "end"-user exports a screenshot or PDF version of that filtered report (perhaps only showing the values on the low end of the scale, for example) and sends it off to someone else.  The new recipient of the information has now lost some of the original context, and may draw very different conclusions as a result.

What I really like about the inset chart idea is that it allows both the detail and the fuller context to be transmitted in a compact way that does not require any form of "end"-user intervention.  In this way, loss of information is prevented in the process of transmission from one person to another.
danz

Registered:
Posts: 190
#5
Nick,

Clear and thorough document. Like many readers of this forum, I was already aware of several options for addressing this problem, yet it is always nice to find a good description of all of them in one place. It can serve as a reference document for any audience.

Still, I think a few things need to be clarified.

1. I like the inset graph very much; it uses the available space very efficiently. This solution, however, does not work for dynamically analysed data. That is the main issue with hand-made designs: they work only for particular cases. A solution with two adjacent charts is more suitable for dynamic use.

2. "The others" group is a practice I encourage as well. Still, the analysed variable has to be of the "additive" kind (one for which it makes sense to add values together). If the "add" operation does not make sense for "the others" group, a different mathematical aggregation has to be performed on the values of that group. When this is not possible, or too complex to be understood, "the others" group should be skipped; again, two charts with different scales are a more appropriate choice.
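To make the additivity point concrete, here is a small Python sketch (the function name, data, and threshold are mine, not from the article) that folds the tail of a ranked series into an "Others" entry, with the aggregation function made explicit so a non-additive measure can pass `mean` instead of `sum`:

```python
from statistics import mean

def top_n_with_others(values, n=3, agg=sum):
    """Keep the n largest values and fold the rest into "Others".

    `agg` must match the measure: `sum` for additive variables
    (revenue, counts), something like `mean` for non-additive ones
    (unit price, percentages) -- or skip "Others" entirely when no
    aggregation would be meaningful to the audience.
    """
    ranked = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    top, rest = ranked[:n], ranked[n:]
    result = dict(top)
    if rest:
        result["Others"] = agg([v for _, v in rest])
    return result

revenue = {"Acme": 900, "Beta": 400, "Gamma": 250, "Delta": 40, "Eps": 10}
print(top_n_with_others(revenue))            # additive: sum the tail
print(top_n_with_others(revenue, agg=mean))  # non-additive: average it
```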

The cases you presented in your article are small data sets with clearly identifiable ranges of data sizes. A more general approach can be designed for situations where an 80-20-like rule applies, or where compact visual information is required for more elements. Below is a quick sketch of a design that works for an additive variable. The treemap-like tiling algorithm used for the last bar does not provide the most accurate comparisons, as we all know, but it does convey the diversity of the data and supports interactive selection and tooltips. The design has the advantage that it scales well to large numbers of elements and is suitable for dynamic data analysis. The "depth" of the analysed levels depends on the available space and the designer's setup. This design was intended to focus on groups of 20 elements at a time, with two levels visible at once. The "Others" group contains the same elements as the right chart, and a recursive approach is possible with a third chart, and so on.
 
two segments data.png 

Most of the analyses we perform are on dynamic data, usually with an unknown number of elements or unknown ranges of values. The design above uses "The Others" group approach to keep the context across successive ranges of values, using different colors for different levels and a treemap-like algorithm for the grouping bars. Unlike the inset solution, this scales well and adapts to dynamic data. Inset charts based on static (known) data can, however, be nicely designed to use the available space more effectively, making them more suitable for presentations or data-storytelling scenarios.

Dan


wd

Registered:
Posts: 167
#6
Nick, thanks for your comments and suggestions.
__________________
Bill Droogendyk
heinzel

Registered:
Posts: 16
#7
I read this article with utmost interest as I often come across this problem in practice. For some reason, organizational units in the company I work for vary quite widely in size. So I appreciate this topic being discussed in detail.

The one solution I would have loved to see did not come up in the paper or the discussion so far: the use of a table rather than a graph. The underlying assumption seems to be that everything has to be visualized, so let's find the best visualization. What if the best solution is not to visualize at all?

Interested to hear your thoughts.
Matthias
sfew

Moderator
Registered:
Posts: 823
#8
Matthias,

While it is true that the article assumes a graphical display, we do not assume that this is always necessary. When the data is used merely for lookup purposes, a table would work better and the problem of displaying wide-variation data would not exist.

__________________
Stephen Few
Nick_Desbarats

Registered:
Posts: 6
#9

Rory,

You make a great point. It's now easy indeed for consumers of data visualizations to crop out potentially important context before sharing a visualization. While the magnified inset chart solution doesn't prevent them from doing so, it does, at least, make them work a little harder for it.

As Steve has said and written many times, the only way to truly address problems such as the one you described is to educate people regarding concepts such as the importance of context, and that's what articles such as this are intended to do.

Nick


__________________
Nick Desbarats
Sr. Educator and Consultant, Perceptual Edge
Nick_Desbarats

Registered:
Posts: 6
#10

Dan,

Thanks for the great comments, and for posting your interesting sample graph.

I certainly agree that the magnified inset chart would indeed be difficult, if not practically impossible, to automate, and that it's therefore only appropriate for "hand-crafted" visualizations. A stacked or side-by-side solution strikes me as more plausibly (though not easily) automated.

Regarding using the appropriate aggregation method when determining an "All others" value, I certainly agree that that's necessary, but I think I might have been getting a bit outside the scope of this article by going into depth on it, since "All others" isn't the main topic.

Your multi-chart bar/treemap hybrid solution is interesting. I think it works well, although I might opt for a conventional stacked bar for the "All others" data series. Yes, there's a risk that you'll end up with one-pixel-wide bar segments, but this is a risk (albeit a somewhat lower one) when using rectangles, as well. Bars would enable users to eyeball and compare values more accurately than rectangles, however. I like the idea, though.
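One way to operationalize that trade-off (a heuristic of my own devising, not something from the article) is to estimate the rendered size of the smallest segment and fall back from a stacked bar to a divided rectangle only when that segment would drop below a legibility threshold:

```python
def choose_part_to_whole_encoding(values, bar_length_px=300, min_segment_px=3):
    """Pick an encoding for an "All others" breakdown.

    A stacked bar supports more accurate length-based comparisons,
    so prefer it unless its smallest segment would render thinner
    than `min_segment_px` pixels, in which case a divided rectangle
    (a one-level treemap) makes better use of 2-D area.
    """
    total = sum(values)
    smallest_px = min(values) / total * bar_length_px
    return "stacked bar" if smallest_px >= min_segment_px else "divided rectangle"

print(choose_part_to_whole_encoding([50, 30, 20]))           # few items
print(choose_part_to_whole_encoding([500, 300] + [1] * 60))  # long thin tail
```

The pixel threshold is arbitrary here; in a real implementation it would depend on output resolution and whether the display is interactive.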

Your post made me realize that "All others" might make for a good blog post. Stay tuned...


__________________
Nick Desbarats
Sr. Educator and Consultant, Perceptual Edge
danz

Registered:
Posts: 190
#11
Thank you, Nick.

To keep it simple for any audience, "The Others" group makes a lot of sense when the bar chart can be interpreted (on a different scale) as a part-to-whole display. When the Others group is not the result of a simple additive operation (a sum or count), it is usually quite difficult to interpret, so it may be skipped.

Your article's message and examples are clear. The inset chart and the two side-by-side charts are solutions that work well for a small number of items, for a known set of data, and (something I forgot to mention in my previous comment) for a reasonably balanced number of items in the two size categories. For instance, if we have one large value and the rest small, this is a case of dealing with outliers rather than two distinct ranges of values. And if we have only one, two, or three small values, the "detail" chart can easily be skipped; there is not much reason to have a separate chart for them.

The only reason I used the term "treemap" is that it is usually easier to refer to it that way. The treemap was invented to encode hierarchical part-to-whole structures, for which different tiling algorithms have been studied and implemented. When a rectangle is divided into just one level of smaller rectangles, there is no point in calling it a treemap. Unfortunately, I am not aware of a more appropriate term; a "divided rectangle" is the closest I can think of.

From my perspective, regular stacking is just another tiling option for rectangular areas. Of course, the treemap was invented more recently than the classic stacked bar, yet both are used for part-to-whole visual encoding, just in different scenarios. A stacked bar should work better for a smaller number of items (more accurate comparison), while a "squarified" tiling algorithm is more suitable for larger numbers. Both techniques can be used to enrich the information in "The Others" group.

Dan
Nick_Desbarats

Registered:
Posts: 6
#12

Thanks, Dan.

Regarding cases where there are only one, two, or three outliers (whether they're low outliers or high ones), I think that most, though perhaps not all, of the recommendations in the article still apply. Even if there are only a few extreme values, that doesn't necessarily mean that you can exclude those values from the visualization, so you still need to find a way to show them. Unless that's not what you meant?

A perhaps relatively rare scenario that isn't addressed in the article is one in which there are more than two very different ranges of values: for example, a situation where some values fall between 1 and 10, others between 100 and 200, and yet others between 1,000 and 2,000. Such cases could be addressed by a multi-graph solution, but you'd need more than two graphs.
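A simple way to automate the grouping for such cases (a sketch of mine, not something from the article) is to bucket positive values by order of magnitude and give each bucket its own graph with an appropriate scale:

```python
import math
from collections import defaultdict

def split_by_magnitude(labeled_values):
    """Bucket positive values by order of magnitude (floor of log10),
    so each bucket can be drawn in its own graph at a suitable scale."""
    buckets = defaultdict(dict)
    for label, v in labeled_values.items():
        buckets[int(math.floor(math.log10(v)))][label] = v
    return dict(buckets)

data = {"a": 3, "b": 7, "c": 150, "d": 180, "e": 1200, "f": 1900}
for magnitude, group in sorted(split_by_magnitude(data).items()):
    print(f"10^{magnitude} range: {group}")
```

With three buckets, you would draw three stacked or side-by-side graphs instead of two; adjacent magnitudes could also be merged if a bucket ends up nearly empty.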

Regarding your treemap/divided rectangle comments, I agree. I think that the ideal approach would be to use stacked bars, since visual comparisons can be made more accurately based on length, but then degrade to rectangles (less accurate visual comparisons based on 2-D area) if the stacked bar segments become too small due to the number of values being encoded.


__________________
Nick Desbarats
Sr. Educator and Consultant, Perceptual Edge
Jerome_Ravetz

Registered:
Posts: 1
#13
A very small addition to all the excellent suggestions. For some users, it could help if the enlarged portion of the graph were displayed as if through the lens of a magnifying glass. I have a memory of seeing this in Frank Land's seminal work The Language of Mathematics, but I may be wrong!
Nick_Desbarats

Registered:
Posts: 6
#14
Thanks, Jerome.

I haven't read Land's book, although I've now added it to my list.

I could see how a magnifying glass image might make the relationship between the "micro" and "macro" graphs even more obvious. Ultimately, though, I think that this benefit may not be worth the significant clutter that a magnifying glass illustration would add, since I think that the relationship between the "macro" and "micro" graphs is pretty clear with the more subtle shapes and lines shown in the examples in the article. One would also have to be careful to not distort the "micro" graph as if it were being viewed through a fish-eye lens since this would affect how people perceive the actual values.

__________________
Nick Desbarats
Sr. Educator and Consultant, Perceptual Edge