PPQ batches are performed to demonstrate that the process under test can be manufactured consistently. You are expected to do this on a statistical basis. I’ve often wondered how you can demonstrate this with only (probably) three batches.

It’s all very well saying that all PPQ batches have passed their in-process control and release tests and so the PPQ has passed, right? But you are required to demonstrate statistical confidence that the process was in control. If you only performed three validation batches (based on a risk assessment of course, see my previous tech transfer note #11), how can you demonstrate statistically significant consistency?

Some of the observed variation in test results will come from variation within a batch (intra-batch) and some will come from variation between batches (inter-batch).
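To make the split concrete, here is a minimal sketch of how the two sources of variation can be separated. The batch names and assay figures are invented for illustration, and the calculation assumes the same number of samples was taken from every batch.

```python
from statistics import mean, variance

# Hypothetical assay results (% of label claim): three PPQ batches, four samples each.
batches = {
    "PPQ-1": [99.8, 100.4, 100.1, 99.9],
    "PPQ-2": [100.6, 100.9, 100.3, 100.8],
    "PPQ-3": [99.5, 100.0, 99.7, 100.2],
}

def variance_components(groups):
    """Split variation into within-batch and between-batch parts.

    Assumes a balanced design (same number of samples in every batch).
    """
    n = len(groups[0])
    # Intra-batch: average of the sample variances inside each batch.
    within = mean([variance(g) for g in groups])
    # Inter-batch: variance of the batch means, less the part that is
    # simply within-batch noise carried into each mean (floored at zero).
    between = max(variance([mean(g) for g in groups]) - within / n, 0.0)
    return within, between

within_var, between_var = variance_components(list(batches.values()))
print(f"intra-batch variance: {within_var:.4f}")
print(f"inter-batch variance: {between_var:.4f}")
```

With only three batches the inter-batch figure rests on just three batch means, so it should be read as a rough indication rather than a precise estimate.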

In the main, the tests described here all require some statistical knowledge. Let me just start by saying that I’m (definitely) not a statistician, so I won’t be delving into the mathematics of the statistical methods, just exploring what methods are available, but I will give a suitable link for each for the mathematically minded among you. However, if you are like me you will probably just feed the test results into a statistical software package (such as JMP, SPSS etc.) and let the software do the heavy lifting.

There are two main types of statistical test –

- Those that demonstrate inter-batch variability
- Those that demonstrate intra-batch variability

It is important to recognise, though, that statistical tests cannot give you an absolute answer. They can only give a probability, such as “I am 95% sure that 95% of the sample results will fall into this group”.

There are several statistical tests that can be used such as:

- Chi square
- USP <905>
- t-test
- Analysis of variance (ANOVA)
- Kruskal-Wallis test
- Levene’s test.

__Intra-batch variability__

**Chi square (goodness of fit)**

Link: https://www.jmp.com/en_gb/statistics-knowledge-portal/chi-square-test.html

The Chi-square goodness of fit test checks whether your sample data is likely to be from a specific theoretical distribution. In other words, are all the results from repeat samples taken from the same point at the same time representative of the actual value? The results are compared against the ideal (required) result. The test looks at the differences between the observed results and the expected ones and gives us a way to decide if the results have a “good enough” fit to our specification, or if they do not give sufficient confidence in the repeatability of the sampling / testing methodologies.
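As a rough illustration of the arithmetic behind the test, the sketch below compares observed counts with expected counts. The three bins and all of the counts are hypothetical, and the critical value is the standard tabulated figure for 2 degrees of freedom at the 5% level.

```python
# Hypothetical counts of 60 repeat results falling into three bins
# of the expected distribution (low, centre, high).
observed = [18, 25, 17]
expected = [15.0, 30.0, 15.0]

def chi_square_statistic(obs, exp):
    """Sum of (observed - expected)^2 / expected over all bins."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))

chi2 = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1

# Tabulated chi-square critical value for 2 degrees of freedom, 5% level.
CRITICAL_VALUE = 5.991

print(f"chi-square = {chi2:.3f} on {degrees_of_freedom} degrees of freedom")
if chi2 < CRITICAL_VALUE:
    print("No evidence that the results depart from the expected distribution")
else:
    print("Results do not fit the expected distribution")
```

A statistic below the critical value means the data give no evidence of a poor fit; it does not prove the fit is good.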

If you are looking to see if there is any variability in the results from a sample point over time, then a t-test is probably the better test to use (comparing samples taken at different time points).

**USP <905>**

This is a compendial test (Uniformity of Dosage Units) which calls for dosage units such as tablets or capsules to be weighed and the standard deviation and Relative Standard Deviation to be calculated. The standard defines specific acceptance criteria and requires a minimum of 30 samples (tablets). So, while this is a statistical test of intra-batch variation, it is only applicable in certain circumstances, although the principle used can be applied to other sample types and sample numbers.
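The underlying principle (mean, standard deviation and Relative Standard Deviation) is simple enough to sketch. The tablet weights below are invented, only ten are shown for brevity, and the snippet does not attempt to reproduce the compendial acceptance criteria, which should be taken from the current text of the chapter.

```python
from statistics import mean, stdev

def relative_standard_deviation(values):
    """RSD (%) = 100 * sample standard deviation / mean."""
    return 100 * stdev(values) / mean(values)

# Hypothetical tablet weights in mg (illustrative only).
tablet_weights_mg = [
    250.4, 249.8, 251.1, 250.0, 248.9,
    250.7, 249.5, 250.2, 251.3, 249.6,
]

print(f"mean weight: {mean(tablet_weights_mg):.2f} mg")
print(f"std dev:     {stdev(tablet_weights_mg):.2f} mg")
print(f"RSD:         {relative_standard_deviation(tablet_weights_mg):.2f} %")
```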

__Inter-batch variability__

The t-test is a statistical test procedure that tests whether there is a significant difference between the means of two groups (e.g. two PPQ batches or two sets of samples taken at different times).

Link: https://www.jmp.com/en_gb/statistics-knowledge-portal/t-test.html
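For those who want to see what the software is doing, here is a minimal sketch of the calculation for two batches. It uses Welch's form of the t-test, which does not assume equal variances (my choice for the illustration, not something the test description above requires), and the batch results are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, dof

# Hypothetical assay results (% of label claim) from two PPQ batches.
ppq_1 = [99.8, 100.4, 100.1, 99.9]
ppq_2 = [100.6, 100.9, 100.3, 100.8]

t_stat, dof = welch_t(ppq_1, ppq_2)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof:.1f}")
```

The t value is then compared with the tabulated critical value for those degrees of freedom (2.447 for 6 degrees of freedom, two-sided, at the 5% level). A larger absolute value indicates a statistically significant difference between the two batch means.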

The **one-way ANOVA** (Analysis of Variance) is the extension of the t-test used to compare more than two PPQ batches.

Link: https://www.jmp.com/en_gb/statistics-knowledge-portal/one-way-anova.html
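A minimal sketch of the one-way ANOVA arithmetic follows. The three sets of batch results are hypothetical, and the function simply returns the F ratio and its two degrees of freedom so that they can be compared with a tabulated critical value.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n_total = len(groups), len(all_values)
    # Variation of the batch means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual results around their own batch mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, k - 1, n_total - k

# Hypothetical assay results (% of label claim) from three PPQ batches.
ppq_batches = [
    [99.8, 100.4, 100.1, 99.9],
    [100.6, 100.9, 100.3, 100.8],
    [99.5, 100.0, 99.7, 100.2],
]

f_ratio, df_between, df_within = one_way_anova_f(ppq_batches)
print(f"F = {f_ratio:.2f} on ({df_between}, {df_within}) degrees of freedom")
```

For (2, 9) degrees of freedom the tabulated 5% critical value is about 4.26. An F ratio above that suggests that at least one batch mean differs from the others, which is a prompt to investigate rather than an automatic PPQ failure.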

The t-test and the one-way ANOVA do share one main assumption: that the results are parametric. By that, I mean that the sample points in each “group” are assumed to be normally distributed (conform to a Normal distribution curve).

If it is felt that this assumption does not hold, then tests that depend less on normality, such as the Kruskal-Wallis test and Levene’s variance test, should be used.

The Kruskal-Wallis test is used to compare the results from three or more PPQ batches to determine if there are statistically significant differences between them. If you are fortunate enough to have been able to justify the use of only two PPQ batches, then you could use the Mann-Whitney U test instead.

**Kruskal-Wallis test**

For each sample point, the Kruskal-Wallis test pools the duplicated or repeated results from all the PPQ batches, ranks them from lowest to highest, and then statistically determines if all the points are “likely” to have come from the same distribution by comparing the ranks obtained by each batch rather than the means.

Link: https://datatab.net/tutorial/kruskal-wallis-test
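The ranking idea can be sketched in a few lines. The batch results are hypothetical, tied values are given their average rank, and no tie correction is applied to the final statistic, so treat this as an illustration of the method rather than a replacement for a statistics package.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (average ranks for ties, no tie correction)."""
    pooled = sorted(x for g in groups for x in g)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Positions i+1 .. j (1-based) share the same value: use their mean rank.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    n = len(pooled)
    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)

# Hypothetical assay results (% of label claim) from three PPQ batches.
ppq_batches = [
    [99.8, 100.4, 100.1, 99.9],
    [100.6, 100.9, 100.3, 100.8],
    [99.5, 100.0, 99.7, 100.2],
]

h_stat = kruskal_wallis_h(ppq_batches)
print(f"H = {h_stat:.3f} on {len(ppq_batches) - 1} degrees of freedom")
```

H is usually compared with a chi-square critical value (5.991 for 2 degrees of freedom at the 5% level). With as few as four results per batch that comparison is only approximate, so exact tables or software are the safer route.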

**Levene’s Test**

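Levene’s test is used to determine if two or more samples have the same variance, and it is less sensitive to departures from normality than the classical alternatives. It replaces the original data points with their absolute deviations from the centre of their own group (the mean, or the median in a common variant) and then compares those deviations between the groups.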

Link: https://datatab.net/tutorial/levene-test
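Here is a minimal sketch of that calculation using deviations from the batch mean. The batch results are hypothetical. Be aware that some packages centre on the median by default (SciPy's `scipy.stats.levene` does, for example), so their output will differ slightly from this mean-centred version.

```python
from statistics import mean

def levene_w(groups):
    """Levene's W statistic using absolute deviations from each group mean."""
    # Replace every result with its absolute deviation from its own batch mean.
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    k = len(z)
    n_total = sum(len(g) for g in z)
    grand_mean = mean([v for g in z for v in g])
    # From here on it is a one-way ANOVA carried out on the deviations.
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in z)
    within = sum((v - mean(g)) ** 2 for g in z for v in g)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical assay results (% of label claim) from three PPQ batches.
ppq_batches = [
    [99.8, 100.4, 100.1, 99.9],
    [100.6, 100.9, 100.3, 100.8],
    [99.5, 100.0, 99.7, 100.2],
]

w_stat = levene_w(ppq_batches)
print(f"W = {w_stat:.3f}")
```

W is compared with an F critical value on (k - 1, N - k) degrees of freedom, which is about 4.26 for three batches of four results at the 5% level. A value below that gives no evidence that the batch variances differ.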

As with all statistical tests, the more samples that are taken and the more PPQ batches that are run, the more data points there are to process, and manual calculation can become very complex and cumbersome. As such, it is recommended that statistical software is used for these calculations. While Excel can always be used, the validation of any spreadsheet constructed could be time-consuming. Statistical software can generally be regarded as “off-the-shelf” software requiring minimal validation.

**Control Chart**

Control Charts: https://deming.org/a-beginners-guide-to-control-charts/

Western Electric Rules: https://en.wikipedia.org/wiki/Western_Electric_rules

There is, though, one “non-statistical” test commonly used, and that is the Control Chart (sometimes called a Shewhart chart). This chart is marked with an “average” line, an “upper control limit” and a “lower control limit”, and plots the value of each reading in time order. By using a simple set of rules such as the Western Electric rules (e.g. whether four out of five consecutive points fall beyond the one-sigma line on the same side of the average line), you can assess whether the process is in control or not.

Example of a control chart (The Deming Institute)

The control chart is useful for monitoring how a single parameter varies with time, but it cannot be used for multiple samples taken from the same point at the same time.
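The control limits and the four-out-of-five rule described above can be sketched as follows. This assumes an individuals chart, where sigma is estimated from the average moving range between consecutive readings (one common convention, not the only one), and the pH readings are invented for illustration.

```python
from statistics import mean

def individuals_chart_limits(values):
    """Return (centre, lower limit, upper limit, sigma) for an individuals chart."""
    centre = mean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    # 1.128 is the standard d2 constant for moving ranges of two readings.
    sigma = mean(moving_ranges) / 1.128
    return centre, centre - 3 * sigma, centre + 3 * sigma, sigma

def four_of_five_rule(values, centre, sigma):
    """True if 4 of any 5 consecutive points lie beyond 1 sigma on the same side."""
    for i in range(len(values) - 4):
        window = values[i:i + 5]
        if sum(v > centre + sigma for v in window) >= 4:
            return True
        if sum(v < centre - sigma for v in window) >= 4:
            return True
    return False

# Hypothetical in-process pH readings in time order.
readings = [7.02, 6.98, 7.05, 7.01, 6.97, 7.03, 7.00, 6.99, 7.04, 7.01]

centre, lcl, ucl, sigma = individuals_chart_limits(readings)
print(f"average = {centre:.3f}, LCL = {lcl:.3f}, UCL = {ucl:.3f}")
print("outside control limits:", [v for v in readings if v < lcl or v > ucl])
print("four-of-five rule triggered:", four_of_five_rule(readings, centre, sigma))
```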

**About The Author:**

Trefor Jones is a technology transfer specialist with Bluehatch Consultancy Ltd. After spending over 30 years in the pharmaceutical / biopharmaceutical industry in engineering design, biopharmaceutical processes, and scale-up of new manufacturing processes, he now specializes in technology transfer especially of biotechnology and sterile products.

He can be reached at trefor ”at” bluehatchconsultancy.com.