Discussion


pzajkowski

Registered:
Posts: 46
#1
In a recent discussion thread within the Perceptual Edge forums, Stephen Few suggested taking a look at Donald Wheeler's books regarding analyzing variation. I took up Stephen's advice and purchased "Understanding Variation: the Key to Managing Chaos." Indeed, it is an excellent book: a useful introduction to creating and understanding process control charts. (Thank you Stephen!)

The reason for my specific forum post now is to seek help on how to interpret the following graph (see below) which reflects a rate (%) of compliance over time. I've applied Wheeler's techniques for generating the XmR chart. All of the data points fall within the upper and lower control limits.
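For readers who want to reproduce the XmR limits mentioned above, here is a minimal Python sketch. It assumes the standard XmR scaling constants (2.66 for the X chart limits, 3.268 for the upper moving-range limit). The function name `xmr_limits` and the monthly rates are made up for illustration; they are not the values behind the attached chart.

```python
def xmr_limits(values):
    """Return the central lines and limits for an XmR (individuals and moving range) chart."""
    if len(values) < 2:
        raise ValueError("need at least two values to form a moving range")
    mean_x = sum(values) / len(values)
    # Moving ranges: absolute differences between successive values.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_mr = sum(moving_ranges) / len(moving_ranges)
    return {
        "center": mean_x,
        "upper_limit": mean_x + 2.66 * mean_mr,
        "lower_limit": mean_x - 2.66 * mean_mr,
        "mr_center": mean_mr,
        "mr_upper_limit": 3.268 * mean_mr,
    }


# Hypothetical monthly compliance rates (%), NOT the data from the attached chart.
rates = [82.0, 85.0, 84.0, 83.0, 81.0, 80.0, 79.0, 78.0]
limits = xmr_limits(rates)

# Points outside the limits would be the most basic signal.
outside = [x for x in rates if not limits["lower_limit"] <= x <= limits["upper_limit"]]
print(limits)
print("points outside limits:", outside)
```

With these made-up rates every point falls inside the limits even though seven of them descend in a row, which is the same situation the question describes.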

At first glance the data presents a downward trend. Based on Wheeler's book, however, he would suggest that the data points represent nothing significant (for now) since all data points fall within the upper and lower limits. One will also notice that four of the data points reside above the average, and four data points are below -- an even split.

So, is the current downward trend noteworthy or merely expected variation (noise) reflected by the "voice of the process"? Or... are there merely not enough data points to make a conclusive decision?

One additional thought... The time series is presented in months -- Would the interpretation of the graph be different if the time series were in days or quarters or years?  Does the interpretation also rely on what process is actually being measured (% of deaths in hospital discharges vs. % of iPod Apps purchased)?


Thanks --Pete 

Attached Images
Name: RateOfCompliance.jpg, Views: 771, Size: 84.45 KB


sfew

Moderator
Registered:
Posts: 820
#2
Hi Pete,

I'm still a novice in the use of process control charts, but I'll do my best to answer your questions based on Wheeler's books. As I understand it, neither the interval of time (days, months, quarters, etc.) nor the nature of the measure (% of deaths vs. iPod app purchases) affects the chart's effectiveness. Also, you have enough values for the chart to function reliably. Assuming that your calculations are correct, there does not appear to be a clear indication of variation that you can influence in this data set. If the next value exceeds the lower control limit, however, you'll have a clear sign of variation that you should seek to understand and remedy.

As you may remember from Understanding Variation (or perhaps from his other books), there are other signs of meaningful variation besides exceeding the 3-sigma control limits. One such sign is the presence of two out of three successive values on the X Chart that are both on the same side of the central line and are both more than two sigma units (note, two, not three sigma units) away from the central line, which may indicate a "moderate shift in the underlying process." This doesn't appear to be the case in your data, but there is another rule of signal detection that you might check. "Whenever four out of five successive values on the X Chart are on the same side of the central line and are also more than one sigma unit away from the central line, you should look for the cause of a small to moderate shift in the underlying process."
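The two detection rules quoted above can be sketched in Python as one generic check. This is only an illustration under stated assumptions: the helper name `beyond_count_rule`, the central line of 50, the sigma unit of 2, and the data are all hypothetical. For an XmR chart, one sigma unit is commonly taken as one third of the distance from the central line to a 3-sigma limit.

```python
def beyond_count_rule(values, center, sigma, k, window, needed):
    """Return the start index of every window of `window` successive values in which
    at least `needed` values lie more than k*sigma from `center` on the same side."""
    hits = []
    for start in range(len(values) - window + 1):
        chunk = values[start:start + window]
        above = sum(1 for x in chunk if x > center + k * sigma)
        below = sum(1 for x in chunk if x < center - k * sigma)
        if above >= needed or below >= needed:
            hits.append(start)
    return hits


# Hypothetical series with an assumed central line and sigma unit.
values = [50, 51, 55, 49, 54.5, 50, 53, 52.5, 52.1, 49, 53]
center, sigma = 50.0, 2.0

# Two out of three successive values more than two sigma units out, same side.
two_of_three = beyond_count_rule(values, center, sigma, k=2, window=3, needed=2)
# Four out of five successive values more than one sigma unit out, same side.
four_of_five = beyond_count_rule(values, center, sigma, k=1, window=5, needed=4)
print(two_of_three, four_of_five)
```

Each returned index marks the first point of a window that trips the rule, so overlapping windows can report the same shift more than once.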

Apart from process control charts, it might also be interesting to see this year's data compared to the same months of the last few years. A simple line graph with one line per year would make this comparison easy. I'm wondering if there is a seasonal pattern that matches this downward trend following February.

__________________
Stephen Few
pzajkowski

Registered:
Posts: 46
#3

Stephen,

Thank you for the reply. Your answers ultimately affirm my general conclusions, but nonetheless I felt compelled to pose my questions/concerns to the forum community for validation. Indeed, I am aware of the various indicators for identifying a "shift" in a given process; as you stated, there appears to be nothing in the charts I posted to signal something out of the ordinary.

A couple of thoughts still keep bugging me, however. 

First, a fundamental purpose of a control chart is to reveal whether a process is predictable within a given scope of likely performance. So, plot the data, calculate the limits, and voila -- the control chart reveals "nothing to see here" or "Houston, we have a problem". Thus, the effectiveness of the control chart is evident. BUT... the meaningfulness of a control chart (and perhaps any visualization) is still somewhat dependent on the granularity of the data. For example, in Understanding Variation, 2nd ed. (chapter 5, "But You Have to Use the Right Data"), Wheeler makes the case that all too often data are summed/aggregated to the point that the data context gets lost. His example of analyzing "on-time closings" for all departments of company "x" ends up being less meaningful than charting each department's on-time closings (pg. 84). So, if rolling up 35 departments into a single measure may not be optimal for insight due to the individual variations in each department's closing process, couldn't the same be said of the time period: can the meaningfulness of a control chart be affected by the granularity of time (i.e., days vs. months vs. quarters vs. years)? For example, Davis Balestracci's article "Use the Charts for New Conversations" at QualityDigest.com compares three hospitals with seemingly similar mortality rates, yet takes a deeper look into what is actually behind their summarized data points: three hospitals with very different run charts.

Second, I find myself frustrated with the "feeling" that control charts somehow "one-up" all other data visualizations. I don't like this feeling. Do bar charts or line charts lack key information that promotes action from insight in the way control charts appear to be designed? Remember, a control chart is designed to filter out noise from actual signals of deviance -- i.e., the control chart can "interpret" the data in a way that isn't inherently present in many other visualizations. As I toss these thoughts around in my head, I choose to conclude that each tool (visualization) has its own advantages that should be applied thoughtfully for obtaining insight.

I must say that I have enjoyed the mental rigors of (re)learning about control charts and pondering their role among your work and prescribed techniques, Stephen. While reading Wheeler's book over the past week or so, I've also revisited your texts with enthusiasm. All of this comes at a rare time when my daily work has eased up enough for me to return to thinking about my work rather than just doing my work. 

sfew

Moderator
Registered:
Posts: 820
#4

Pete,

This is the kind of discussion that I fully appreciate: inquisitive, thoughtful, and open to the perspective of others.

Regarding your concern that a process control chart’s effectiveness might vary with different intervals of time (year, quarter, month, day, etc.), I’d like to invite others to join in the discussion. Based on my limited knowledge so far of process control charts (I’m planning to increase my expertise in their use soon), I don’t think that the interval of time matters as long as the values are meaningful and comparable. An adequate number of values presented at any interval of time in a process control chart should separate signals from the noise, but only do so at that particular level. We should not ignore the fact that annual variation might be routine, but monthly, weekly, or daily variation might be unpredictable and therefore in need of control. As I point out in my book Now You See It, you should not examine time-series data at only one interval of time. Each interval reveals a different aspect of what’s going on.

Regarding one of your other points, we shouldn’t compare the merits of a control chart to those of a line graph. A process control chart is a line graph. It just happens to be a line graph that is used for a particular purpose and tends to be designed in a particular way. A process control chart is designed to tell a particular story: whether variation through time exhibits routine behavior (noise) or behavior that is a sign of variation that can be controlled (signal). As such, it is a valuable tool, and one that most people who examine time-series data don’t use, although they should. The list of potential patterns that we should try to find and understand in our data includes many more stories than this, which is why we rely on many forms of data visualization and techniques of interaction to make sense of data. Every type of graph that’s used for data exploration and analysis should to some extent enable signals to be extracted from the noise. This is accomplished in various ways, in part because noise comes in various types. When we examine data of any type for any purpose, we should be asking the questions, “Of the information that I’m examining, what matters? What, if anything, does this information indicate that I should do?” As I contemplate the topic of my next book, I’m leaning toward one that focuses on the separation of signals from the noise in all types of data analysis. This seems like a logical next step for people who have read my existing work and are asking, “What next?”


__________________
Stephen Few
jlbriggs

Registered:
Posts: 194
#5
Pete - in regard to a control chart 'one-upping' other visualizations: you asked "Do bar charts or line charts  lack key information that promotes action from insight in the way control charts appear to be designed?" 
Well - yes.  A control chart is a line chart, but it is a line chart with additional key information explicitly built into its design.  Those pieces of key information are the entire purpose of using a control chart.  Knowing where a data point lies in relation to the control limits, and knowing what the SPC rules have to say about a pattern of data points, is information you'll never glean from a standard line or bar chart.

It is extremely important to know a few things about a process before a control chart can really be meaningful.  You need to know if the process is capable of remaining in statistical control (voice of the process...fitness of the measure...).  You need to determine if the data being collected is the right data...not always an easy question.  And really, what it comes down to is that you need to know what it really is that you want to measure, and what you want to learn from it.

I don't think it makes sense to talk about using a control chart vs a regular line chart or a bar chart.  Line and bar charts are obviously useful tools for learning about and presenting data, but if process control is what you need, the statistical measures that a control chart provides are in a completely different class.

As to the question: "So, is the current downward trend noteworthy or merely expected variation (noise) reflected by the "voice of the process"? Or... are there merely not enough data points to make a conclusive decision?"

Whether or not it is noteworthy certainly depends on what is being measured, and how big a variation the chart really reflects.  According to the major SPC rules used, there is nothing statistically significant in the trend as recorded on that chart.
It's also important to note that the mean and control limits used in control charts are often pre-determined or based on a historical mean, rather than being set strictly according to the data present on the chart.

The matter of the time frame is certainly a potential issue. As Stephen suggested, the data should be looked at in several ways.  There may very well be assignable cause variation that can be determined from a more granular view, which may explain the trend better (of course, there may not be as well).

And then, of course, there is the matter of specification.  A process can be in perfect statistical control, but not meet the specifications.  It can meet the specifications but not be in statistical control.

:)





Anders

Registered:
Posts: 18
#6

An excellent post by jlbriggs. I just have a few more comments from my point of view. Control charts are an important part of quality control in chemical analysis, where I'm more familiar with them.

- First of all, for a control chart to make sense, the data must be normally distributed. I'm not sure that is the case with your data; compliance may be affected by seasonal changes and whatnot. It all depends on what you’re actually measuring.

 

- The central line and limits have to be established from a representative selection of data using robust calculations. You have 8 points in your chart; in my world of chemical analysis, 8 samples are a bit on the short side. 20 measurements or more are preferable.

 

- The central line and limits must be established before you start using the control chart. So if the 8 points in the chart are the ones that are used to calculate the limits, it does not make much sense to say that “all are within the control limits” or that it’s an “even split” between results above and below average.

 

- All that aside, in chemical analysis it’s common to have two limits: the alarm limit, which is average ± 2SD, and the action limit, which is average ± 3SD. Any result outside the action limit is a statistical anomaly that would require the analysis to be stopped and the problem investigated. One result outside the alarm limit (but within the action limit) is considered normal, since there is about a 1 in 20 chance of this happening. Normally, 2 of 3 consecutive results outside the alarm limit is considered a statistical anomaly that would require the analysis to be stopped and the problem investigated.

 

The alarm and action limits are the ones used for the daily evaluation of the control chart. However, it’s also common to look for trends or patterns in a chart. These trends may not be considered serious enough to stop the analysis, but they serve as an early warning sign of a future problem. One is to count the number of consecutive results on one side of the central line. 10 of 11 consecutive results on one side are normally considered a sign that something is not right. A second pattern to look out for is an increasing or decreasing trend. 7 consecutive results either increasing or decreasing are considered a sign that a problem is developing. Your chart shows 7 consecutive points decreasing, which would be something to keep an eye on.
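The two trend checks described above can be sketched in Python roughly as follows. This is a simplified illustration, not a laboratory procedure: the function names and data are hypothetical, runs are counted in points (so 7 points decreasing means 6 successive drops, which is an assumption about how the rule is counted), and only strictly consecutive runs are handled, not the "10 of 11" variant.

```python
def longest_monotonic_run(values):
    """Length, in points, of the longest strictly increasing or strictly decreasing stretch."""
    if not values:
        return 0
    best = up = down = 1
    for prev, cur in zip(values, values[1:]):
        up = up + 1 if cur > prev else 1
        down = down + 1 if cur < prev else 1
        best = max(best, up, down)
    return best


def longest_one_sided_run(values, center):
    """Longest run of consecutive points strictly on one side of the central line."""
    best = run = 0
    last_side = 0
    for x in values:
        side = (x > center) - (x < center)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == last_side:
            run += 1
        elif side != 0:
            run = 1
        else:
            run = 0
        last_side = side
        best = max(best, run)
    return best


# Hypothetical monthly rates (%), not the data from the chart in the first post.
rates = [82.0, 85.0, 84.0, 83.0, 81.0, 80.0, 79.0, 78.0]
print(longest_monotonic_run(rates))        # longest rising or falling stretch
print(longest_one_sided_run(rates, 81.5))  # longest stretch on one side of the mean
```

A result of 7 or more from the first function would trip the trend rule as stated here; the second function's result would be compared against whatever run length the lab has adopted.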

 

Control charts are also inspected for long-term trends. Once a year (or when you have 60-100 results), the central line and limits are recalculated and checked. Also, the number of samples on each side of the central line is counted (should be an “even split”) and the number of samples within avg ± 1SD, avg ± 2SD and avg ± 3SD are counted. The last calculations would indicate whether the data is normally distributed or not (the 68-95-99.7 rule).
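The band counts described above might be tallied with a short Python sketch like this one. It assumes the sample standard deviation is used and that the function name `sd_band_fractions` and the data are purely illustrative; for roughly normal data the three fractions should come out near 0.68, 0.95 and 0.997.

```python
import statistics


def sd_band_fractions(values):
    """Fraction of values within 1, 2 and 3 sample standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    fractions = {}
    for k in (1, 2, 3):
        inside = sum(1 for x in values if abs(x - mean) <= k * sd)
        fractions[k] = inside / len(values)
    return fractions


# Hypothetical results; a real yearly review would use the 60-100 results mentioned above.
results = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
fractions = sd_band_fractions(results)
print(fractions)
```

With only a handful of points the fractions are coarse, so this is a rough screen for gross departures from normality rather than a formal test.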

 

As for your data, I suspect that the information you want to extract from the data is better served by other graphs and calculations than a control chart. A control chart is supposed to check the process, and needs more frequent readings than once a month. Plotting the results daily would make more sense. Also, for monthly plotting, you need data for several years back to calculate robust limits and check the data for normality, which is required for a control chart. It’s also required that the process has remained unchanged for all those years you are basing your calculations on; otherwise your calculations may be flawed.

Derek_C

Registered:
Posts: 69
#7

It's true that for a strict traditional control chart, your data must be normally distributed, since the control lines are "mean" "+1 SD" and "-1 SD". But I would not disdain a "control" chart that had "median", "upper quartile" and "lower quartile" as lines (or some less frequently occurring quantile), as long as the lines were clearly labeled as such. The meaning of the curve crossing the line would be "this happens (for example) only 5% of the time".

Anders

Registered:
Posts: 18
#8

That would be possible, Derek, but it would demand more statistical expertise to interpret the chart. In fact, there are several other limits that could be used. The limits and "rules" are set based on the probability to discover a systematic error of a given size.

However, my main concern is the few data points collected to set the limits on this one. 8 points is way too few to determine whether the data is normalized or not. Also, "rate of compliance" might have natural seasonal variation, which would influence the statistics. For example, are any inexperienced temps hired during summer that would lead to a lower compliance rate? Is there a massive increase in production before Christmas that might influence the compliance rate?

All these questions would be better answered by examining the data with other methods than a control chart.
 

pzajkowski

Registered:
Posts: 46
#9

Wow - lots of responses! Quite an education from everyone, if not somewhat difficult to digest all at once. Nonetheless I have a couple of comments/questions that I'll share (in no particular order or priority).

1) Are normalized data really required for control charts? I've come across a couple of articles indicating that normalized data are not required for process control charts. Although I can't find an exact quote from Wheeler, I'm fairly certain he's stated that it isn't a requirement. The closest evidence I found from Wheeler is in his article "Do You Have Leptokurtophobia". Also, Davis Balestracci's article "Four Control Chart Myths from Foolish Experts" begins with "Myth No. 1: Data must be normally distributed before they can be placed on a control chart." Consequently, I'm a little confused as to whether normalized data are essential. 

2) I agree that having only 8 points is a tad weak.  The "rate of compliance" I graphed is just one of many clinical measures that are tracked for a particular client. Interestingly enough, though, we only report to the client Current Rate vs. Baseline -- no trending.  Looking at the client's latest report, I noticed only one measure out of dozens that showed an undesirable current rate below the baseline rate. So, I was curious to see if I could apply Wheeler's techniques to this one set of data to see if something would stand out as an exceptional cause of variation. Wheeler is quite clear that a comparison of two values just doesn't provide insight that can lead to improvement. (By the way, when I created the graph, I changed the months on the x-axis before uploading to the discussion forum. I figured it would be easier for someone to read a chart that started with January, rather than beginning in August and then crossing over into March of the following year.)

3) If a control chart is not ideal for the data I presented, what suggestions might be offered? How would one determine if the downward trend is meaningful or noise? 

4) As for concerns of seasonal variation, it's possible given that I work with health care data. Seasonality can be viewed in lots of the data sets depending on what is being analyzed. Immunizations come to mind. Increased hospital admissions in the spring for baby births is another obvious example.




Anders

Registered:
Posts: 18
#10

My thoughts regarding two of your comments:

 

1) No, you’re not required to normalize the data. By not normalizing the data, you can read off the deviation from the central line in absolute units, which may be an advantage when looking for trends. You do need to normalize the data if the expected average differs between points. Another common calculation is to plot the Z-score (the deviation from the mean in units of standard deviation). The Z-score is nice when comparing multiple control charts.
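As a small illustration of the Z-score mentioned above, here is a Python sketch. It assumes the mean and sample standard deviation are computed from the same values being scored; in practice they might instead come from an earlier baseline period. The function name and data are hypothetical.

```python
import statistics


def z_scores(values):
    """Express each value as its deviation from the mean in units of standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    if sd == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(x - mean) / sd for x in values]


# Hypothetical readings.
scores = z_scores([2, 4, 6])
print(scores)
```

Because the scores are unitless, series measured on different scales can share one chart with limits at fixed multiples of the standard deviation.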

 

3) Control charts are a tool to see whether a business process, a chemical analysis or the manufacturing of high precision machine parts is under statistical control. They are often used in a go/no-go decision (approve a batch of machine parts for the market, release analytical results, etc.). First I would ask: are you interested in monitoring the process, examining the data for trends and patterns, or seeing where to optimize the process? If you just want to monitor the process, is a reading every month enough? It may take a month or two before the charts show that the process is out of control; is this time lag acceptable for taking action? With that in mind, your chart as it is shows that the process is under control. The downward trend might be an early warning that a problem is building up. Unless there are other reasons to believe that the process is out of control, make a note of it, and see whether the trend is still going downward by the next reading.

 

On the other hand, if you want to explore the data or optimize the process, I would suggest drilling down in the data. It depends a bit on what the process you’re looking into is. But for a start, I would make a plot with more frequent readings, daily or weekly. A cycle plot (http://www.perceptualedge.com/articles/guests/intro_to_cycle_plots.pdf) might be good to see if there are any trends on particular weekdays. I would also plot non-normalized data. Make a biplot (X/Y-plot) to see if there is any relationship between volume and rate of compliance. Are there any other data to compare the results with? If a chemical analysis is out of control and I need to do problem solving, I usually check the logbook of the instrument to see if there is any instrument repair or change in the method that correlates with a change in the data. Have any solvents or calibration standards been changed? Are there any new personnel? Etc, etc.  Also, I notice your Range chart. What data is plotted there? You do not see the same trend there as you do in your X-chart.

jlbriggs

Registered:
Posts: 194
#11
I would just add that a control chart is really designed to measure a process that is expected to be sustainable, repeatable, and controllable.

If you are measuring how many people come into a health care office with a cold...a control chart isn't going to do a lot for you. Trends and distributions will be more meaningful.

If you are measuring how many people come into a health care office and contract an infection (or something along those lines)...then that's a process that may be measurable in a meaningful way using a control chart.

Also: Having normally distributed data is important for some things, but for measuring on a control chart it is not necessary.
Anders

Registered:
Posts: 18
#12

As I said earlier, normally distributed data isn't an absolute requirement, but common tools like some of the Westgard rules (http://www.westgard.com/westgard-rules-and-multirules.htm) do not apply unless the data is approaching normality. Also, calculating the limits and the probability of detecting a bias in your data is more complex if the data isn't normally distributed (e.g. the 68-95-99.7 rule does not apply). In my opinion, unless normality is assumed, some sort of quick check should be performed.  The x-chart in question: If the data isn’t close to normality, how does he know what systematic error the calculated limits will detect?




wd

Registered:
Posts: 167
#13

Control charts exist for a number of distribution types.  P & NP charts for binomial data, c & U for Poisson distributions and a host of "X" charts for continuous data.  An X-chart should have normally distributed data underlying the chart since it uses individual points.  An X-bar chart uses the mean value of some number of points (> 2 to ?) where the effect of averaging will naturally normalize the data.  Know the data and know the purpose of the chart.  Then choose the chart appropriate for the task at hand.
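As one example of a chart matched to the data type, a rate such as percent compliance could be treated as binomial data on a p chart. The Python sketch below uses the usual 3-sigma p chart formula, p-bar ± 3·sqrt(p-bar·(1 − p-bar)/n), with limits clipped to the 0-1 range. The function name, counts and subgroup sizes are hypothetical, and whether a p chart suits any particular measure is a judgment this sketch does not make.

```python
import math


def p_chart_limits(counts, sample_sizes):
    """Return the overall proportion and per-subgroup 3-sigma limits for a p chart.

    counts[i] is the number of items with the attribute of interest in subgroup i,
    and sample_sizes[i] is the number of items inspected in that subgroup.
    """
    p_bar = sum(counts) / sum(sample_sizes)
    limits = []
    for n in sample_sizes:
        half_width = 3 * math.sqrt(p_bar * (1 - p_bar) / n)
        # A proportion cannot fall below 0 or rise above 1.
        limits.append((max(0.0, p_bar - half_width), min(1.0, p_bar + half_width)))
    return p_bar, limits


# Hypothetical subgroups: 10 of 100 items flagged in each of two periods.
p_bar, limits = p_chart_limits([10, 10], [100, 100])
print(p_bar, limits)
```

Unlike the XmR chart, the limits here widen or narrow with each period's sample size, so periods with few cases get more generous limits.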


__________________
Bill Droogendyk
pzajkowski

Registered:
Posts: 46
#14
Bill (wd) -- the various control charts and the calculations appropriate for each type of chart are still somewhat confusing to me. Not impossible to understand, but they leave me feeling like someone could easily employ a given control chart incorrectly if not well versed in their use. Jon Peltier created a control-chart decision diagram to help discern what type of chart to use, especially if someone is creating a control chart on one's own rather than using an SPC software application.

I've also taken an initial read through "Plotting basic control charts: tutorial notes for healthcare practitioners" (M.A. Mohammed, P. Worthington, W.H. Woodall) which provides a bit more detail regarding appropriate control-chart selection.

Based on Wheeler's book "Understanding Variation" as well as some of his articles, he seems to recommend XmR charts as a great place to start for most analyses over other types of control charts. And since I'm getting re-acquainted with control charts ("process behavior charts" per Wheeler), starting off with XmR charts as detailed in Wheeler's book is fine by me.

Anders -- I think I might be misusing terminology. In an earlier reply of mine, I referred to "normalized data" in response to an earlier statement you made: "First of all, for a control chart to make sense, the data must be normally distributed." My intent was to question whether data truly need to have a normal distribution, since Wheeler contends, generally speaking, that process behavior charts (or at least XmR charts) do not require data to be normally distributed. I have a feeling that my use of "normalized data" may mean something different than "normally distributed."

jlbriggs -- Measuring a process is ultimately what I'm after. Actually, since I work primarily as a SQL database developer/reporting analyst, it should be straightforward to measure the various IT processes we have for data loading, data rejections, data gaps... i.e., do we load and manage the data consistently? But on the applied healthcare side of the company, I've been wondering if some of the clinical measures that our case managers support can be measured as a "process". For example, one of our clinical measures tracks the percent of diabetics whose A1c is less than 8%. When I plotted the data on an XmR chart, the initial data point, which reflected the baseline rate, was below the lower control limit; all other data points show an upward trend yet hover around a mean that is well above the initial data point. When I examine a dozen other clinical measures, very similar outcomes are plotted: i.e., the baseline rate is on or below the lower control limit. Only one clinical measure showed a downward trend, which is reflected in the chart I posted at the beginning of the discussion thread. Overall, I'm still investigating what possible application control charts might have for the company, specifically related to our care management services and efforts to coordinate physician services.

staceybarr

Registered:
Posts: 3
#15
Hi everyone, and sorry for butting in!

I have only scanned the comments so far, and just noticed a few potential misunderstandings in the use of the XmR chart, as per the instructions in Wheeler's book.

XmR charts can be used for any type of measure (whether normally distributed or not) so long as the measure is calculated at even time intervals, like weekly or monthly or whatever. I got this advice directly from Donald Wheeler (as I had the same question). So averages, ratios, percentages, totals - they all qualify for XmR charts.

Your XmR chart here is not really showing a signal, but because you have 6 out of 7 points in a row descending consecutively, I'd say you don't have a stable process yet - or have started to measure the process at a point where it's changing. Therefore I would wait for more data before finalising where those limits of natural variation and the mean line should go.

Like Anders, I like to use about 20 points to do this, even though Wheeler says you can use as few as 6 or 7 (you can, if those 6 or 7 are a good representation of normal variation). So perhaps you could try changing your measure to weekly, but I still think you might see that you haven't got stability yet. (Stability is when you have random up and down movements in the points, with no apparent trends or shifts or too many outliers).

I agree that the frequency with which you measure really does depend on what you're measuring. It makes no sense to measure employee engagement daily since it's one of those performance results that shifts very slowly and all you'd see is a lot of data not changing much. But annually would be too infrequent. So it's about common sense and as Steve suggests, looking at a few different time scales to familiarise yourself with the patterns of variation in that particular measure.

A point falling outside the limits of natural variation is only one signal type. Others include 7 or more points on one side of the meanline (or 12 out of 14), or 7 or more points consecutively increasing or decreasing (or 12 out of 14). Or kind of like Steve pointed out, 3 points in a row closer to a limit of natural variation than to the meanline.

We really need to take care with using these charts, as some have pointed out there are many forms of control charts. The XmR chart is best for management information (ie KPIs or performance measures), and it's probably the simplest of all control charts to construct and understand. More advanced control charts are probably better for manufacturing processes, where the data is based on samples and batches.

pzajkowski - my only tip for you would be to keep adding data to your XmR chart, and recalculating the lines until you see some stability in them (ie adding more data values doesn't change them much). But of course, you want to also be sure you're measuring at the appropriate frequency (so try weekly also) and that the thing you're measuring isn't chaotic or going through a change right now.
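The "keep adding data and recalculating" tip could be sketched in Python along these lines. It recomputes the XmR central line and limits from the first k points for each k, so you can see whether the lines settle down. The 2.66 constant is the standard XmR factor; the function name, the starting size of 6 and the data are assumptions for illustration only.

```python
def limits_by_sample_size(values, start=6):
    """Recompute XmR center and limits from the first k points, for k = start..len(values).

    Returns a list of (k, center, lower_limit, upper_limit) tuples.
    """
    history = []
    for k in range(start, len(values) + 1):
        subset = values[:k]
        center = sum(subset) / k
        moving_ranges = [abs(b - a) for a, b in zip(subset, subset[1:])]
        mean_mr = sum(moving_ranges) / len(moving_ranges)
        history.append((k, center, center - 2.66 * mean_mr, center + 2.66 * mean_mr))
    return history


# Hypothetical monthly rates (%), not the data from the chart in the first post.
rates = [82.0, 85.0, 84.0, 83.0, 81.0, 80.0, 79.0, 78.0]
history = limits_by_sample_size(rates)
for k, center, lower, upper in history:
    print(f"first {k} points: center={center:.2f}, limits=({lower:.2f}, {upper:.2f})")
```

If the center and limits keep drifting as points are added, as they do with a steadily falling series like this one, that suggests the process has not yet shown the stability needed to fix the lines.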

Again, sorry for butting in. And I hope I haven't wasted anyone's time.

__________________
Smiles, Stacey.

http://www.staceybarr.com