What is the minimum number of subgroups needed for control limits calculation?

How many points are required for robust control limits?

The I-Chart / XMR chart is often suggested as the best chart to use if you have fewer than 12 data points.

There is no agreement as to how many points you need

Even for the I chart, the experts do not reach a consensus:

Three well-known sources of information on Statistical Process Control, and on the individuals (X) chart in particular:

The Health Care Data Guide (p155):

“To develop an I chart, the most recent 20 to 30 measurements should be used. More than any other chart, this minimum number of subgroups is important since each subgroup has only one data value”.

Understanding Variation - The Key to Managing Chaos (p60):

“Useful limits may be constructed with as few as five or six values”

Data Sanity (First Edition p150):

Presents a table for runs analysis, starting with 10 data points (and the corresponding lower and upper limit for expected number of runs)

Let’s throw in one more reference:

Practical Performance Measurement (p258)

“We still need a minimum of five consecutive values of a measure just to get started establishing a baseline to represent the current performance level…we might even need more than five to establish that baseline….And once we have our baseline, we need at least another three measure values, and often more than eight, before we can be sure whether performance has changed or not”

I have fewer than 12 points. What should I do?

If Wheeler (the author of Understanding Variation) says 5 or 6 values are good enough, then as long as you have 5 or 6 points you can begin to use these charts. However, the limits should be considered useful rather than robust.

We therefore refer to these as ‘trial’ limits.

What is meant by ‘trial limits’?

Just that: because of the lack of data, these limits are as good as they can be for the time being. With each new data point, the limits will be revised. Once we have enough data (i.e. 12 or more points), the calculations will be locked until a signal of special cause variation occurs, for example a long run above or below the mean. At that point, if we understand the cause of the variation, we might decide to revise the limits from the start of the long run. Each time the limits are revised, the new limits should be considered ‘trial limits’ until a further 12 (or more) points are collected, at which point they can be ‘locked’ once more.
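As a rough illustration of how trial limits might be computed and flagged, here is a minimal Python sketch. It assumes the conventional XmR calculation (mean plus or minus 2.66 times the average moving range) and reuses the 12-point threshold mentioned above. The function name `xmr_limits` and the returned keys are hypothetical, and this is not the package's own code.

```python
from statistics import mean

# Threshold taken from the text above: limits stay "trial" below 12 points.
MIN_POINTS_TO_LOCK = 12


def xmr_limits(values):
    """Return centre line, lower/upper limits, and a 'trial' flag for an I chart."""
    if len(values) < 2:
        raise ValueError("need at least two points to form a moving range")
    centre = mean(values)
    # Moving range: absolute difference between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    # 2.66 is the conventional XmR constant (3 / d2, with d2 = 1.128 for n = 2).
    half_width = 2.66 * mr_bar
    return {
        "centre": centre,
        "lower": centre - half_width,
        "upper": centre + half_width,
        "trial": len(values) < MIN_POINTS_TO_LOCK,
    }


limits = xmr_limits([12, 15, 11, 14, 13, 16])
print(limits)
```

With only six points the `trial` flag is set, so the limits would be recalculated as each new point arrives rather than locked.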

Are there any other charts we can use if we have fewer than 12 points?

There are alternatives:

  • run charts
  • cumulative sum (cusum) charts

However, the individuals chart is also advised as one of the three “go to” charts when fewer than 12 data points are available. This package provides a warning for any groups with fewer than 12 data points, advising that trial limits are in place.

What is the minimum number of subgroups needed for control limits calculation?

The purpose of control charts is to detect significant process changes when they occur. In general, charts that display averages of data/measurements (X-bar charts) are more useful than charts of individual data points or measurements. Charts of individuals are not nearly as sensitive as charts of averages at detecting process changes quickly. X-bar charts are far superior at detecting process shifts in a timely manner, and the subgroup size is a crucial element in ensuring that appropriate chart signals are produced.

Often, the subgroup size is selected without much thought. A subgroup size of 5 seems to be a common choice. If the subgroup size is not large enough, then meaningful process shifts may go undetected. On the other hand, if the subgroup size is too large, then chart signals may be produced from insignificant process shifts. The key is to specify a subgroup size so that significant shifts (from a practical perspective) are detected with high probability and that insignificant shifts are unlikely to produce a signal.

To understand the concept, it is useful to review the impact that averaging data has on variation. The graphic below compares the distribution of individual values with the distributions of averages of various subgroup sizes.

[Figure: distribution of individual values compared with the distributions of averages for various subgroup sizes]

We see that as the subgroup size increases, the standard deviation of the distribution of averages decreases. Specifically, the relationship below relates the standard deviation of averages to the standard deviation of individuals and the subgroup size:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

On an x-bar control chart, this idea is reflected by control limits getting tighter as the subgroup size increases.
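To make the tightening concrete, the short Python sketch below computes the distance from the centre line to a 3-sigma control limit on an x-bar chart for several subgroup sizes. It assumes the standard relationship that the standard deviation of averages equals σ divided by the square root of n, and it sets σ = 1 purely for illustration; the function name is hypothetical.

```python
from math import sqrt


def xbar_limit_half_width(sigma, n, k=3.0):
    """Distance from the centre line to a k-sigma limit for averages of n values."""
    return k * sigma / sqrt(n)


# sigma = 1 is an illustrative assumption, not a value from the article.
half_widths = {n: round(xbar_limit_half_width(1.0, n), 3) for n in (1, 2, 5, 12)}
for n, width in half_widths.items():
    print(f"n = {n:2d}: limits at centre line +/- {width}")
```

Going from individuals (n = 1) to subgroups of 12 shrinks the limits to less than a third of their original width.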

Now, consider a process that is stable and under statistical control, represented by the blue curve in the graphic below. Suppose that if the process average shifts by a specified amount (to the red curve below), we want to obtain a chart signal with very high probability.

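[Figure: four pairs of distributions (individuals, n = 2, n = 5, n = 12) showing the original process in blue and the shifted process in red, with the control limits for the blue process as dashed vertical lines]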

The question is, how likely is it that we will detect the process shift, if in fact the process shifts from the blue curve to the red curve? We consider this question for the 4 cases shown in the above graphic.

In the top set of distributions, we are working with individuals. The control limits for the blue process are represented by the dashed vertical lines. Following the process shift, we will sample from the red curve. Note that most of the red curve still falls inside the control limits for the blue curve. This means that we are very unlikely to see a signal on our chart for the shift indicated.

Now consider the second set of distributions. They represent the case where we are using an x-bar chart with subgroup size = 2. A shift of the same size is shown. Here, the process curves are tighter since they represent averages (with n = 2). While things are better than the first case, there is still a significant overlap between these distributions and it is still not very likely that we will detect the shift quickly.

The case with n = 5 (the third case) is better yet, but still not good enough. We want to be able to detect the shift with high probability.

Finally, with n = 12 (the last case), we see that for the same size shift, the two distributions are practically separate. In other words, if the shift occurs, our next subgroup average will come from the red curve and it will almost certainly be outside of the control limits (based on the blue curve).
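The four cases can also be put into numbers. The Python sketch below estimates the probability that a single subgroup average falls outside 3-sigma limits after the mean has shifted, assuming normally distributed data. The article does not state the size of the shift in its graphic, so the 1.5σ shift used here is an assumption chosen only for illustration, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def detection_probability(shift_in_sigmas, n, k=3.0):
    """Chance that one subgroup average lands outside k-sigma limits after a mean shift."""
    z = NormalDist()  # standard normal distribution
    # The shift measured in standard deviations of the subgroup average.
    delta = shift_in_sigmas * sqrt(n)
    above_upper = 1 - z.cdf(k - delta)
    below_lower = z.cdf(-k - delta)
    return above_upper + below_lower


# Assumed shift of 1.5 process standard deviations (not taken from the article).
probabilities = {n: detection_probability(1.5, n) for n in (1, 2, 5, 12)}
for n, p in probabilities.items():
    print(f"n = {n:2d}: P(signal on next subgroup) = {p:.3f}")
```

Under these assumptions the chance of an immediate signal is small for individuals and rises steadily with subgroup size, in line with the pattern described for the four cases.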

Although the above graphics allow us to understand how subgroup size affects chart sensitivity (the ability to detect desired process changes), a formula is typically used to compute the necessary subgroup size for a given application. The subgroup size is a function of the desired sensitivity, the process standard deviation, and the willingness to tolerate type II errors (where the process shifts but the chart fails to detect the shift). The formula is:

$$n = \left(\frac{(Z_{\alpha/2} + Z_{\beta})\,\sigma}{D}\right)^{2}$$

where

n = subgroup size required

$Z_{\alpha/2}$ = the number of standard deviations above zero on the standard normal distribution such that the area in the tail of the distribution is $\alpha/2$ ($\alpha$ is the type I error probability and is typically 0.0027 for control chart applications; in this case, $Z_{0.00135} = 3$).

$Z_{\beta}$ = the number of standard deviations above zero on the standard normal distribution such that the area in the tail of the distribution is $\beta$ ($\beta$ is the type II error probability).

σ = the standard deviation of the characteristic being charted.

D = the difference we are trying to detect.
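As a worked illustration, the Python sketch below evaluates the subgroup size from the quantities defined above. It assumes the formula takes the standard form n = ((Z α/2 + Z β) σ / D)², rounded up to the next whole number. The function name, the default β of 0.10, and the example values for σ and D are assumptions made for illustration only.

```python
from math import ceil
from statistics import NormalDist


def required_subgroup_size(sigma, difference, alpha=0.0027, beta=0.10):
    """Smallest whole subgroup size satisfying n >= ((Z_alpha/2 + Z_beta) * sigma / D)^2."""
    z = NormalDist()  # standard normal distribution
    z_alpha_2 = z.inv_cdf(1 - alpha / 2)  # about 3 when alpha = 0.0027
    z_beta = z.inv_cdf(1 - beta)          # about 1.28 when beta = 0.10
    return ceil(((z_alpha_2 + z_beta) * sigma / difference) ** 2)


# Illustrative values: detect a shift of 1.5 sigma with a 10% type II error rate.
n_needed = required_subgroup_size(sigma=1.0, difference=1.5)
print(n_needed)
```

Halving the difference to be detected roughly quadruples the required subgroup size, because D appears squared in the denominator.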

Steven Wachs, Principal Statistician
Integral Concepts, Inc.

Integral Concepts provides consulting services and training in the application of quantitative methods to understand, predict, and optimize product designs, manufacturing operations, and product reliability. www.integral-concepts.com

How many data points are needed to calculate limits?

Most Shewhart charts need 12 data points to establish trial limits and 20 to set a baseline. I charts (also known as X charts, XmR charts, and individuals charts) and T charts require 20 data points.

How many data subgroups are required before a control chart can be interpreted in a meaningful way?

We need 25 or more subgroups, arranged in date order, to generate meaningful control charts. So we will need data from a minimum of 25 time intervals – i.e. 25 days, or 25 weeks, or 25 months. Students often fail to recognize this point and will create control charts with as few as four or five data points.

What should my subgroup size be?

Subgroup size is normally 5 and sample size normally 25-30. You will take samples from a group to understand the group. [This respondent's profile trumpeted that he's an “expert in Six Sigma.”]

If possible, collect data from 20-25 subgroups, with at least 100 individual values. While the data is being collected, minimize disturbances to the process. If a process change is unavoidable, develop a system for recording changes so that their effect can be determined.