Discussion


bpierce

Moderator
Posts: 99
#1

In the July/August/September 2017 Visual Business Intelligence Newsletter article, titled "Journey to Zvinca: The Making of a New Chart," Stephen introduces a new type of chart called a Zvinca Plot. Invented by Daniel Zvinca, the Zvinca Plot extends the capabilities of a standard dot plot, enabling us to view thousands of values at once in a way that is perceptually more effective than the displays that we've relied on until now to do this.

What are your thoughts on Zvinca Plots? We invite you to post your comments here.

-Bryan

bella_gotie

Posts: 22
#2
It's a great idea! I like it very much.
Nuligoy

Posts: 2
#3
Thank you for this development, I can see many use cases for this new chart.

In cases where your dataset has a large number of identical values, how would this best be handled?

For example, taking your dataset of 1000 values, let's say 200 values in the middle had the same value.  If the data is laid out in the same way as in your newsletter (50 values per column), am I right in thinking this would result in 4 overlapping columns of 50 values each?

Would this require some kind of highlighting or shift/jitter to indicate there are multiple values in this column?
sfew

Moderator
Posts: 823
#4
Nuligoy,

A Zvinca Plot would not work well for a data set that includes many instances of the same value. If a column of values included only 50 rows, as many of our examples did, then a data set with more than 50 instances of the same value would result in over-plotting. Fortunately, this isn't a common problem. Keep in mind that many more than 50 rows of values can be displayed in a Zvinca Plot, so this problem need not arise unless well over 100 instances of the same value occur. If you felt inclined to solve this problem with a Zvinca Plot rather than switching to a different form of display, you could make the data points transparent, resulting in darker colors where overlapping occurs, assuming that you haven't varied data point colors for other purposes, or you could enlarge the data points (e.g., when two data points that are rendered as a short vertical line overlap, the line could be made twice as wide), assuming that you haven't varied data point sizes for other purposes.
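
To make the arithmetic in this exchange concrete, here is a minimal Python sketch. It assumes the layout described in the thread: values are sorted, wrapped into columns of a fixed number of rows (50 here), and all columns share one quantitative scale, so two points over-plot when they land in the same row with the same value. The function names and the sample data are hypothetical and are not taken from the newsletter.

```python
from collections import Counter


def overplot_counts(values, rows_per_column=50):
    """Count how many data points share the same row and the same value.

    Assumed layout: values are sorted, then wrapped into columns of
    `rows_per_column` rows, so the item at sorted position i sits in
    row i % rows_per_column.
    """
    counts = Counter()
    for index, value in enumerate(sorted(values)):
        row = index % rows_per_column
        counts[(row, value)] += 1
    return counts


def max_overlap(values, rows_per_column=50):
    """Return the largest number of points drawn at one (row, value) spot."""
    counts = overplot_counts(values, rows_per_column)
    return max(counts.values()) if counts else 0


# Hypothetical data: 1,000 values, 200 of which are identical.
values = list(range(400)) + [400] * 200 + list(range(401, 801))
print(max_overlap(values))  # 4
```

With 200 identical values and 50 rows, every row holds 4 coinciding points, which matches the estimate in the question. A count like this could also drive the transparency or line-width adjustment suggested above.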

__________________
Stephen Few
Nuligoy

Posts: 2
#5
Thank you for the response, Stephen.  That does make sense.
badger

Posts: 4
#6
It looks very much like a DNA profile visualisation to me.

A few things make it unclear to read:

What do the columns represent? For instance, what does the value of nearly 300 for Item 1 in the 4th column (sorry, it's not clear even in the image) mean? So, in your example, Sally Smith (Item 1) has a first value of 4.42 (4.42 what?) and a 14th value of nearly 300 (nearly 300 what?).

How would you display a negative value?

Thanks

__________________
Many thanks
B
sfew

Moderator
Posts: 823
#7
Badger,

Only one column of values at a time is labeled: the column of values that is highlighted. Regarding the meaning of the measure, the measure could be anything.

Negative values are easy to include. If the lowest value in the data set is less than zero, then the quantitative scale would begin at a value that is slightly less than that negative value rather than at zero.
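
As a rough illustration of the scale rule just described, the following Python sketch picks the starting value of the quantitative scale. The zero baseline for non-negative data is inferred from the phrase "rather than at zero," and the 5% padding and the function name are purely illustrative assumptions; "slightly less" is not quantified in this thread.

```python
def scale_start(values, padding_fraction=0.05):
    """Pick where the quantitative scale begins.

    Assumption: the scale starts at zero unless the lowest value is
    negative, in which case it starts slightly below that value.
    `padding_fraction` is a hypothetical choice for "slightly".
    """
    lowest = min(values)
    highest = max(values)
    if lowest >= 0:
        return 0.0
    span = highest - lowest
    if span == 0:
        # All values are the same negative number; pad relative to its size.
        span = abs(lowest)
    return lowest - padding_fraction * span


print(scale_start([-20, 0, 80]))
```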

__________________
Stephen Few
badger

Posts: 4
#8
Thanks for the reply, Stephen.

So for this chart to be useful, it needs to be interactive so that a column of values can be selected and its title/heading displayed, right?

Doesn't seem too useful when printed though.

Thanks
Charles


__________________
Many thanks
B
sfew

Moderator
Posts: 823
#9
Charles,

I don't advocate printing Zvinca Plots. We print graphs for the purpose of communicating information to others, which is not the purpose of a Zvinca Plot. This graph is primarily useful for exploratory data analysis, which will always be done interactively, on a computer.

__________________
Stephen Few
Or_Shoham

Posts: 1
#10
Having read this article, and having revisited the original wrapped graph article, I was not able to find an explanation of when displaying this many values on a bar chart or dot plot would actually be required and what sort of analysis it would facilitate. When and why would you need to do this, and why are existing solutions for display of values and distribution (some combination of box plots, bar charts or line charts with buckets, and regular bar charts/dot plots displaying a partial data set such as top 50 and possibly including a scroll bar) insufficient?
sfew

Moderator
Posts: 823
#11
Or,

I've been puzzling over your question, because I explained the purpose for which and the occasions on which Zvinca Plots are used quite clearly in the article. I provided the same explanation in my article about Wrapped Graphs. To repeat what I said in the article, a Zvinca Plot is used for the same purpose as a regular bar graph or dot plot and it can be used when you need to examine and compare more values than a regular bar graph or dot plot can handle. Every example that appears in the article illustrates this. Scrolling a regular bar graph or dot plot is fine if you only want to see and compare a few values at a time without the context of the whole, but this isn't a good way to explore an entire data set.

Given the fact that I clearly explained the purpose of a Zvinca Plot in the article, I'm left wondering why you were "not able to find an explanation." All that I can imagine is that you are actually asking a different question: "What is the purpose of a regular bar graph or dot plot?" I assumed that anyone who reads articles about data visualization knows the answer to this question, but your question suggests that I might be wrong.

Multiple graphs use bars to encode values and several use individual data points to encode values, but what we call a "bar graph" and what we call a "dot plot" serve a specific purpose. We use them both to examine and compare individual quantitative values. We can do more with them at times, but this is their fundamental purpose. For example, we don't use bar graphs or dot plots to examine distributions, but they reveal information about a distribution when an entire data set is displayed. There is a specific graph, however, that encodes values using bars whose purpose is for examining distributions--a histogram--but this is not what we call a bar graph. It doesn't encode individual values. Values can also be encoded as individual data points for the purpose of examining a distribution in the form of a strip plot, but this is not what we call a dot plot.

Both bar graphs and dot plots have a quantitative scale along one of the axes and a categorical scale (i.e., a list of categorical items) along the other axis. The fundamental purpose of these graphs is to examine a set of quantitative values that are associated with a categorical variable (product, country, etc.) and to easily and rapidly compare those values. When the data set is too large for a bar graph or dot plot to handle, a Zvinca Plot may be used.

__________________
Stephen Few
sfew

Moderator
Posts: 823
#12
Someone emailed me directly to say that, with a data set consisting of 1,000 values or fewer, a bar graph works better than a Zvinca Plot. The example below was provided to support this position. It is absolutely true that, if you don't need to see categorical labels to know what the values represent, a bar graph or a dot plot with up to 1,000 values works better than a Zvinca Plot. This, however, does not undermine the merits of a Zvinca Plot in the least. The purpose of a Zvinca Plot is to serve as a substitute form of display when the number of values exceeds what can be shown in a regular bar graph or dot plot. If categorical labels are not needed, the limits of bar graphs and dot plots are dramatically extended, as the example below demonstrates. Looking at this example, however, we can easily see that more than 1,000 values would require an alternative form of display. That's when a Zvinca Plot would come in handy. In cases when you want categorical labels to be visible, the Zvinca Plot comes in handy for data sets that are much smaller than 1,000 values. For example, if categorical labels were included in the bar graph below, it would probably be limited to approximately 100 values.

[Attached image: image.png]

__________________
Stephen Few
neilism

Posts: 8
#13
This graph is a clever way of allowing the user to see all the data in a relatively small space, rather than having to present a summary like a histogram or a box plot. I'm curious to know from other readers when they would use this in practice.

In my organisation, we have lots of different products (hundreds...), all of which have various metrics associated with them (e.g. customer satisfaction questions, sales, etc.). However, I can't think of a circumstance in which I would want people to explore a single metric like this, either as a lookup for a specific product or as a way of individually identifying the products at the extremes.

Thinking aloud, is this something generally better suited to operational management information (like statistical process control type situations)  -- e.g. picking up on the call center operator whose handling times appear most extreme over the course of a shift?
sfew

Moderator
Posts: 823
#14
Neilism,

A Zvinca Plot is not a graph that you would use primarily for lookup purposes, nor is it one that you would use primarily for viewing patterns of distribution, even though those patterns can be seen in one fairly well. Think of it primarily as a form of display that you would use to examine and compare many individual values in a manner that allows you to see all of the values at once. In other words, you can examine and compare individual values in the context of the entire data set. Can you not imagine the usefulness of examining and comparing the annual sales revenues associated with all of your hundreds of products at once?

One of the mistakes that is often made in data analysis is never examining sets of quantitative values in their entirety. By only examining subsets of values, we miss an important part of the story.

__________________
Stephen Few