1010 USP39-NF34 ANALYTICAL DATA INTERPRETATION AND TREATMENT

experiment, or the sample size for the new experiment is sufficiently large so that the normal distribution is a good approximation to the t distribution; and 3) the laboratory is confident that there is no actual difference in the means, the most optimistic case. It is not common for all three of these assumptions to hold. The formula above should be treated most often as an initial approximation. Deviations from the three assumptions will lead to a larger required sample size. In general, we recommend seeking assistance from someone familiar with the necessary methods.
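As a worked illustration, the following is a minimal sketch of the normal-approximation sample-size calculation that the paragraph above refers to. It assumes the standard two-sample form with known, equal variances, n per group = 2[(z1−α/2 + z1−β)σ/δ]²; the function name, the default levels, and the example numbers are illustrative choices, not part of the chapter.

```python
# Minimal sketch of the normal-approximation sample-size calculation discussed
# above. It assumes the standard two-sample, known-and-equal-variance form
#   n (per group) = 2 * ((z_{1-alpha/2} + z_{1-beta}) * sigma / delta)**2
# and will underestimate n whenever the assumptions described in the text fail.
import math
from scipy.stats import norm

def sample_size_per_group(sigma, delta, alpha=0.05, power=0.80):
    """Approximate n per group to detect a mean difference of delta.

    sigma : assumed (known, equal) standard deviation of a single result
    delta : largest acceptable difference between the two procedure means
    """
    z_alpha = norm.ppf(1 - alpha / 2)   # two-sided test at level alpha
    z_beta = norm.ppf(power)            # z-value for the desired power
    n = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(n)

# Example (hypothetical numbers): sigma = 1.5, delta = 2.0, 5% level, 80% power
print(sample_size_per_group(sigma=1.5, delta=2.0))
```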


Table 8. Common Values for a Standard Normal Distribution

Confidence Level            99%      95%      90%      80%
z-Value, One-sided (α)      2.326    1.645    1.282    0.842
z-Value, Two-sided (α/2)    2.576    1.960    1.645    1.282
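The z-values in Table 8 can be reproduced directly from the standard normal quantile function; the short check below uses scipy.stats.norm.ppf and makes no chapter-specific assumptions.

```python
# Quick check of the z-values in Table 8 using the standard normal quantile function.
from scipy.stats import norm

for conf in (0.99, 0.95, 0.90, 0.80):
    alpha = 1 - conf
    one_sided = norm.ppf(1 - alpha)       # z with probability alpha in one tail
    two_sided = norm.ppf(1 - alpha / 2)   # z with probability alpha/2 in each tail
    print(f"{conf:.0%}: one-sided {one_sided:.3f}, two-sided {two_sided:.3f}")
```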

When a log transformation is required to achieve normality, the sample size formula needs to be slightly adjusted as shown below. Instead of formulating the problem in terms of the population variance and the largest acceptable difference, δ, between the two procedures, it is now formulated in terms of the population %RSD and the largest acceptable proportional difference between the two procedures.


where

and r represents the largest acceptable proportional difference between the two procedures ((alternative-current)/current), and the population %RSDs are assumed known and equal.
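Because the chapter's equations for this case are not reproduced in this excerpt, the sketch below only illustrates one standard way to carry out the log-scale calculation: the %RSD is converted to a log-scale standard deviation via σlog² = ln(1 + (%RSD/100)²), the proportional difference r is converted via δlog = ln(1 + r), and the same two-sample normal-approximation formula is applied on the log scale (in the spirit of van Belle and Martin, reference 10 under Detectable Differences and Sample Size Determination below). The function name and the example numbers are illustrative assumptions.

```python
# Minimal sketch of a log-scale sample-size calculation. The chapter's own
# equations are not reproduced in this excerpt; this assumes the standard
# log-normal relationships
#   sigma_log^2 = ln(1 + (%RSD/100)^2)   and   delta_log = ln(1 + r),
# then applies the two-sample normal-approximation formula on the log scale.
import math
from scipy.stats import norm

def sample_size_log_scale(rsd_percent, r, alpha=0.05, power=0.80):
    """Approximate n per group for a proportional difference r (e.g., 0.10 = 10%)."""
    sigma_log = math.sqrt(math.log(1 + (rsd_percent / 100) ** 2))
    delta_log = math.log(1 + r)
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sigma_log / delta_log) ** 2)

# Example (hypothetical numbers): 5% RSD, 10% largest acceptable proportional difference
print(sample_size_log_scale(rsd_percent=5, r=0.10))
```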


APPENDIX F: EQUIVALENCE TESTING AND TOST

In classical statistical hypothesis testing, there are two hypotheses, the null and the alternative. For example, the null may be that two means are equal and the alternative that they differ. With this classical approach, one rejects the null hypothesis in favor of the alternative if the evidence is sufficient against the null. A common error is to interpret failure to reject the null as evidence that the null is true. Actually, failure to reject the null just means the evidence against the null was not sufficient. For example, the procedure used could have been too variable or the number of determinations too small.


The consequence of this understanding is that, when one seeks to demonstrate similarity, such as between results from two laboratories, one needs similarity to be the alternative hypothesis. A statistical test with an alternative hypothesis of similarity is referred to as an equivalence test. It is important to understand that "equivalence" does not mean "equality." Equivalence should be understood as "sufficiently similar" for the purposes of the laboratory(ies). As noted earlier in this chapter, how close is close enough is something to be decided a priori.


As a specific example, suppose we are interested in comparing average results, such as when transferring a procedure from one laboratory to another. (Such an application would also likely include a comparison of precision; see Appendix D.) A priori, we determine that the means need to differ by no more than some positive value, δ, to be considered equivalent or sufficiently similar. (Appendix E provides some guidance on choosing δ.) Our hypotheses are then:

Null (H0): |μ1 − μ2| ≥ δ
Alternative (H1): |μ1 − μ2| < δ

where μ1 and μ2 are the two means being compared.


The two one-sided tests (TOST) approach converts the above equivalence hypotheses into two one-sided hypotheses. The rationale is that one can conclude |μ1 − μ2| < δ if one can demonstrate both


μ1 − μ2 < +δ and μ1 − μ2 > −δ

As one-sided hypotheses, they can be addressed with standard one-sided t-tests. In order for the test of the equivalence hypotheses to be of level α, both one-sided tests are conducted at level α (typically, but not necessarily, 0.05). Often, the two one-sided tests are performed using a confidence interval: reject the null in favor of the equivalence hypothesis if the 100(1 − 2α)% two-sided confidence interval is entirely contained in (−δ, +δ). This is the approach described earlier in the Accuracy section.
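As an illustration of the confidence-interval form of TOST described above, the sketch below declares equivalence when the 100(1 − 2α)% two-sided confidence interval for the difference in means lies entirely inside (−δ, +δ). The pooled-variance interval, the function name, and the numerical data are assumptions made for illustration only.

```python
# Minimal sketch of the TOST-by-confidence-interval approach described above:
# declare equivalence if the 100(1 - 2*alpha)% two-sided confidence interval
# for the difference in means lies entirely within (-delta, +delta).
# The data are hypothetical; a pooled-variance two-sample interval is assumed.
import numpy as np
from scipy import stats

def tost_by_ci(x, y, delta, alpha=0.05):
    x, y = np.asarray(x, float), np.asarray(y, float)
    nx, ny = len(x), len(y)
    diff = x.mean() - y.mean()
    # pooled variance (assumes roughly equal variances in the two groups)
    sp2 = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    se = np.sqrt(sp2 * (1 / nx + 1 / ny))
    t_crit = stats.t.ppf(1 - alpha, df=nx + ny - 2)   # 100(1 - 2*alpha)% two-sided CI
    lower, upper = diff - t_crit * se, diff + t_crit * se
    equivalent = (-delta < lower) and (upper < delta)
    return diff, (lower, upper), equivalent

# Hypothetical transfer data: sending vs. receiving laboratory results
sending   = [99.2, 100.1, 99.8, 100.4, 99.6, 100.0]
receiving = [99.9, 100.5, 100.2, 99.7, 100.3, 100.1]
print(tost_by_ci(sending, receiving, delta=1.5))
```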


APPENDIX G: ADDITIONAL SOURCES OF INFORMATION


There may be a variety of statistical tests that can be used to evaluate any given set of data. This chapter presents several tests for interpreting and managing analytical data, but many other similar tests could also be employed. The chapter simply illustrates the analysis of data using statistically acceptable methods. As mentioned in the Introduction, specific tests are presented for illustrative purposes, and USP does not endorse any of these tests as the sole approach for handling analytical data.

Additional information and alternative tests can be found in the references listed below or in many statistical textbooks.

Control Charts:

1. Manual on Presentation of Data and Control Chart Analysis, 6th ed., American Society for Testing and Materials (ASTM), Philadelphia, 1996.

2. Grant, E.L., Leavenworth, R.S., Statistical Quality Control, 7th ed., McGraw-Hill, New York, 1996.

3. Montgomery, D.C., Introduction to Statistical Quality Control, 3rd ed., John Wiley and Sons, New York, 1997.

4. Ott, E., Schilling, E., Neubauer, D., Process Quality Control: Troubleshooting and Interpretation of Data, 3rd ed., McGraw-Hill, New York, 2000.

Detectable Differences and Sample Size Determination:

1. CRC Handbook of Tables for Probability and Statistics, 2nd ed., Beyer, W.H., ed., CRC Press, Inc., Boca Raton, FL, 1985.

2. Cohen, J., Statistical Power Analysis for the Behavioral Sciences, 2nd ed., Lawrence Erlbaum Associates, Hillsdale, NJ, 1988.

3. Diletti, E., Hauschke, D., Steinijans, V.W., "Sample size determination for bioequivalence assessment by means of confidence intervals," International Journal of Clinical Pharmacology, Therapy and Toxicology, 1991; 29:1–8.

4. Fleiss, J.L., The Design and Analysis of Clinical Experiments, John Wiley and Sons, New York, 1986, pp. 369–375.

5. Juran, J.A., Godfrey, B., Juran's Quality Handbook, 5th ed., McGraw-Hill, 1999, Section 44, Basic Statistical Methods.

6. Lipsey, M.W., Design Sensitivity: Statistical Power for Experimental Research, Sage Publications, Newbury Park, CA, 1990.

7. Montgomery, D.C., Design and Analysis of Experiments, John Wiley and Sons, New York, 1984.

8. Natrella, M.G., Experimental Statistics Handbook 91, National Institute of Standards and Technology, Gaithersburg, MD, 1991 (reprint of the original August 1963 text).

9. Kraemer, H.C., Thiemann, S., How Many Subjects?: Statistical Power Analysis in Research, Sage Publications, Newbury Park, CA, 1987.

10. van Belle, G., Martin, D.C., "Sample size as a function of coefficient of variation and ratio of means," American Statistician 1993; 47(3):165–167.

11. Westlake, W.J., response to Kirkwood, T.B.L.: "Bioequivalence testing – a need to rethink," Biometrics 1981; 37:589–594.