Discussion


StatsFella

Registered:
Posts: 9
Reply with quote  #1 
I am trying to come up with a better way to display, in a single chart (or table), the range of values and the performance of each Area against various skills.

In the example below, each cell represents the average for the skill/area. What I lose is the range of results; for a large area in particular, the results end up in the yellow.

Other thoughts?

Skill Matrix.png 

wd

Registered:
Posts: 167
Reply with quote  #2 
Your options depend somewhat on the number of data points that you have for each cell - something a StatsFella can appreciate!
If you have the data, consider a horizontal box-whisker plot in each cell, horizontally positioned according to an overall average (or overall box-whisker) that you would also show.

__________________
Bill Droogendyk
StatsFella

Registered:
Posts: 9
Reply with quote  #3 
Updated.png 

Thanks. My original thought was to do the box with whiskers for each Area, but I will give the individual cell option (aka sparklines) a go.

I did update one option, as attached: the average as the icon, then the number who have mastered the skill listed.


StatsFella

Registered:
Posts: 9
Reply with quote  #4 
As an exercise I thought I would try sparklines. As I suspected, some areas simply have too many data points for this to be valuable.

Spark.png 

wd

Registered:
Posts: 167
Reply with quote  #5 
let's see the box & whisker option ...
__________________
Bill Droogendyk
danz

Registered:
Posts: 181
Reply with quote  #6 
StatsFella,

I understand from your first post that each cell shows the average of multiple values for each pair of area/skill. As we can all see, a mini bar chart showing all values for a pair does not improve the general understanding of the data; it makes it more cluttered and so less understandable. The fact that you have a variable count of values for each cell just makes it look worse. On top of that, you added traffic light colors to your mini bars, so the result is really bad.

I would drop the extra level of detail you are trying to display with your sparklines and display in each cell just a horizontal bar for each "average", for more accurate comparison. If your visualization is interactive, an extra detail chart can be displayed on certain user interactions (click or hover).

However, if your "average" were the median of the values, a gray boxplot displayed in each cell, using a mandatory common scale (for correct comparison), might not look that cluttered. The median value, displayed as a line on top of your boxplot, can be enriched with a color if there are values that require special attention; just avoid assigning traffic-light-like colors to all of them. Also take into account that median-like calculations are meant for many values: a median calculated for 3 values, despite being mathematically valid, does not carry much meaning.
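As an illustration of the point above, here is a minimal Python sketch of the five numbers a boxplot encodes for one cell, plus a guard for cells with too few values. The function names and the cut-off `MIN_N_FOR_BOX` are hypothetical, and the ratings are made up; nothing here comes from the actual data set.

```python
from statistics import quantiles

# Hypothetical cut-off below which a box is not worth drawing (an assumption).
MIN_N_FOR_BOX = 10


def five_number_summary(scores):
    """Return (min, Q1, median, Q3, max) for one area/skill cell."""
    if len(scores) < 2:
        raise ValueError("need at least two scores")
    # 'inclusive' treats the scores as the whole population of the cell.
    q1, med, q3 = quantiles(scores, n=4, method="inclusive")
    return (min(scores), q1, med, q3, max(scores))


def box_is_meaningful(scores, min_n=MIN_N_FOR_BOX):
    """True when the cell has enough values for quartiles to say something."""
    return len(scores) >= min_n


small_cell = [2, 4, 5]  # made-up ratings on the 1 to 6 scale
print(five_number_summary(small_cell), box_is_meaningful(small_cell))
```

With only three values the quartiles are computable but are mostly interpolation between the same three points, which is why the sketch flags such a cell rather than drawing a box for it.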

If your detail values were time series (as Stephen mentioned), sparklines designed as line charts might be possible. Yet displaying a two-dimensional array of sparklines in small areas is not a great idea; a separate detail chart triggered by user interaction would be my choice.

StatsFella

Registered:
Posts: 9
Reply with quote  #7 

Quote:
Originally Posted by wd
let's see the box & whisker option ...

Bill,

For your pleasure only, I have attached the box and whisker. This will be far too messy when taken across all 10 categories, given the audience's expectation of a single piece of paper.

Some nodes, such as #15, simply don't have many data points. This is simply too complex given the audience.

BW.png 
Darren


StatsFella

Registered:
Posts: 9
Reply with quote  #8 

Quote:
Originally Posted by danz
StatsFella, I understand from your first post that each cell shows the average of multiple values for each pair of area/skill. As we can all see...


For clarification:

Danz,

The data is:
- 20 areas;
- each with between 5 and 80 data points;
- each data point a result from 1 to 6;
- across 10 categories.

No time series
1 printed page expected - no dynamic ability

Keys
How many get a 5/6 rating
Range of results
Average / Median
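
The three keys above can be computed per cell in a few lines. This is a minimal Python sketch, assuming each cell is simply a list of 1 to 6 ratings; `summarize_cell`, the threshold parameter, and the example ratings are all hypothetical.

```python
from statistics import mean, median


def summarize_cell(scores, top_threshold=5):
    """Summarise one area/skill cell of 1 to 6 ratings."""
    if not scores:
        raise ValueError("cell has no scores")
    top = sum(1 for s in scores if s >= top_threshold)
    return {
        "n": len(scores),               # sample size of the cell
        "top_count": top,               # how many got a 5 or 6
        "top_share": top / len(scores), # the same as a proportion
        "min": min(scores),             # range of results, low end
        "max": max(scores),             # range of results, high end
        "mean": mean(scores),
        "median": median(scores),
    }


# Made-up ratings for one cell, for illustration only.
example = [2, 3, 3, 4, 5, 6, 4, 3]
summary = summarize_cell(example)
print(summary)
```

Keeping `n` next to the count of 5/6 ratings means either the count or the proportion can be shown later without going back to the raw data.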

 

Working on thoughts now

StatsFella

Registered:
Posts: 9
Reply with quote  #9 
This is looking like the best option for now. Clean-up and completing it for each area are still to do, but it is making sense.

Going with.png 

wd

Registered:
Posts: 167
Reply with quote  #10 
Darren, your latest confuses me. It looks like there's a different scale for every area such that the length of the blue bar and the value are disconnected (although relative across skills). I think we would be better served with a constant scale in both directions.
There's also no way to know what the total sample size is for each area/skill combination. For A15/S1, there is one 5-6 score, but how many 3-4s and 1-2s?

Having seen the box & whisker, I'm still liking it. I would add a total sample value beside each, an overall b/w (at the top) for each skill and a schematic to explain what a b/w is. Don't assume the audience isn't able to comprehend - teach them!

__________________
Bill Droogendyk
jlbriggs

Registered:
Posts: 190
Reply with quote  #11 
Can you provide a data set to work with?

First thoughts on what's been posted so far:

1) Agree that the original is not an effective way to communicate the required information.

2) If the important piece of information is the number of 5-6 scores, then that's what needs to be put front and center; the displays presented so far do not make that information the most prominent.

3) Toward that end, the colors really need to be used in that manner. The last display is not as bad: it uses a good bold color for the primary information, but pushing the 5-6 bars to the right diminishes their focus. Reversing the stack order would make it easier to compare the proportion of scores that fall into the 5-6 range in each node, though this setup does not give a visual comparison of the count of scores involved.

Whether the count or proportion is the important comparison is up to you and what you're trying to get across, but it's important that the difference is understood.

I also fully agree with Bill that a box plot is not so complex that it can't be understood by a broad audience, and it's easy to teach an audience how to read and understand one.

I question the usefulness of a box plot here only because of your statement that the important piece of information is the number of scores falling in the 5-6 range.

You could always have two columns for each category, one with a bar chart showing the proportion of 5-6 scores, and one with a box plot showing the range of values, if the range of values is also important. A vertical reference line marking the 5 value could be added to show how much the distribution overlaps the desired range.
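
To make the count-versus-proportion difference concrete, here is a minimal Python sketch that draws both as text bars on a constant scale. The helper names, the bar width, and the numbers are assumptions for illustration only, not taken from the posted charts.

```python
def share_bar(top_count, n, width=20):
    """Bar whose length is the proportion of 5-6 scores (same scale for every cell)."""
    if n <= 0 or not 0 <= top_count <= n:
        raise ValueError("need 0 <= top_count <= n and n > 0")
    filled = round(width * top_count / n)
    return "#" * filled + "." * (width - filled)


def count_bar(top_count, max_count, width=20):
    """Bar whose length is the raw count of 5-6 scores, scaled to the largest count."""
    if max_count <= 0 or not 0 <= top_count <= max_count:
        raise ValueError("need 0 <= top_count <= max_count and max_count > 0")
    filled = round(width * top_count / max_count)
    return "#" * filled + "." * (width - filled)


# Two made-up cells: the same share of 5-6 scores, very different counts.
cells = {"Area A": (3, 6), "Area B": (30, 60)}
largest = max(n for _, n in cells.values())
for name, (top, n) in cells.items():
    print(f"{name}  share {share_bar(top, n)}  count {count_bar(top, largest)}  {top}/{n}")
```

The two share bars come out identical while the count bars differ by a factor of ten, which is the distinction that has to be settled before choosing the encoding.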

The other consideration here is the existence of a target value.

If we're measuring how many scores fall into a specific range, presumably we have a goal in mind for how many scores should fall in that range, and presumably we want this visualization to somehow influence behavior in order to drive more scores into that range.

So, what's the goal, and what's the desired action that this report should bring about?



danz

Registered:
Posts: 181
Reply with quote  #12 
Agree with Bill. Stacked % bars are rarely useful, especially when you decide to label the count and not the percentage.

Be aware that the count of values (sample size) always matters. Do you consider two cells, one having 2 "good" values out of 5 and the other having 10 "good" values out of 25, to be similar? It may very well be that 5 values are not enough to draw a conclusion at all, or at least not one as robust as in the case of 25 values.
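
One way to put numbers on this is a confidence interval for each proportion. The sketch below uses the Wilson score interval, which is a choice made here for illustration and not something proposed in the thread; the function name and the default `z=1.96` (roughly 95% confidence) are likewise assumptions.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion; z=1.96 is roughly 95% confidence."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# The two cells from the paragraph above: both are 40% "good".
for good, total in [(2, 5), (10, 25)]:
    low, high = wilson_interval(good, total)
    print(f"{good}/{total}: {low:.2f} to {high:.2f}")
```

Both cells show 40%, but the interval for 2 of 5 runs from roughly 12% to 77%, while the one for 10 of 25 runs from roughly 23% to 59%, so the smaller cell supports a much weaker conclusion.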

When the sample sizes are so different, make sure you encode both the sample size and the median/mean value for each cell. In some cases a stripplot with a line showing the median value would suffice, but you have at most 6 distinct values, which means too many overlapping shapes for large samples for it to be useful.

Encoding a distribution of values in a small area is very challenging. Boxplots, histograms, and stripplots are only a few of the possibilities. The question is: what do you gain from overcomplicating a view that is supposed to be easily understood by your audience?
StatsFella

Registered:
Posts: 9
Reply with quote  #13 

More details on the audience to give clarity. There is no time allocated for education, and the ‘messenger’ (not an analyst) will not want to have to explain. I’m fully behind your enthusiasm and work along the same premise, educating at every chance; this just isn’t the right opportunity. Politics means that, in theory, each Area only cares about its own area and should NOT know all the details pertaining to other areas, while still having some reference for comparison. There are no targets in place.

The action desired is to know how many are in the top category and whether there are gaps. For instance, Skill 10 is well provided for; however, efforts need to be made in Skill 3. That #15 is proportionally lacking and #2 is excelling is secondary to this.

Posting the below just to illustrate the points above, not because I have reworked it yet.

Jlbriggs and danz – some good points re the proportion v count options for visual optimisation.  


Fin.png 
Thank you.


danz

Registered:
Posts: 181
Reply with quote  #14 
The fact that you provide a matrix-like display already makes the information in the cells related to each other. I see that you have already decided how to display the data, but I don't see how a wrong solution can suit an untrained audience if it is not acceptable to an audience with some experience. Your solution, with stacked % bars and count labels, is as wrong as it can be.
bella_gotie

Registered:
Posts: 21
Reply with quote  #15 
12 greater than 17? טיוטות.png 