1010 USP39-NF34 ANALYTICAL DATA INTERPRETATION AND TREATMENT

Step 1: The first step in applying Hampel's Rule is to normalize the data. However, instead of subtracting the mean from each data point and dividing the difference by the standard deviation, the median is subtracted from each data value and the resulting differences are divided by MAD (see below). The calculation of MAD is done in three stages. First, the median is subtracted from each data point. Next, the absolute values of the differences are obtained. These are called the absolute deviations. Finally, the median of the absolute deviations is calculated and multiplied by the constant 1.483 to obtain MAD (see footnote 6).

Step 2: The second step is to take the absolute value of the normalized data. Any such result that is greater than 3.5 is declared to be an outlier. Table 4 summarizes the calculations.

The value of 95.7 is again identified as an outlier. This value can then be removed from the data set and Hampel's Rule reapplied to the remaining data. The resulting table is displayed as Table 5. Similar to the previous examples, 99.5 is not considered an outlier.

Table 4. Test Results Using Hampel's Rule

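| n = 10 | Data | Deviations from the Median | Absolute Deviations | Absolute Normalized |
|---|---|---|---|---|
| | 100.3 | 0.3 | 0.3 | 1.35 |
| | 100.2 | 0.2 | 0.2 | 0.90 |
| | 100.1 | 0.1 | 0.1 | 0.45 |
| | 100 | 0 | 0 | 0 |
| | 100 | 0 | 0 | 0 |
| | 100 | 0 | 0 | 0 |
| | 99.9 | -0.1 | 0.1 | 0.45 |
| | 99.7 | -0.3 | 0.3 | 1.35 |
| | 99.5 | -0.5 | 0.5 | 2.25 |
| | 95.7 | -4.3 | 4.3 | 19.33 |
| Median = | 100 | | 0.15 | |
| MAD = | | | 0.22 | |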

Table 5. Test Results of Re-Applied Hampel's Rule

| n = 9 | Data | Deviations from the Median | Absolute Deviations | Absolute Normalized |
|---|---|---|---|---|
| | 100.3 | 0.3 | 0.3 | 2.02 |
| | 100.2 | 0.2 | 0.2 | 1.35 |
| | 100.1 | 0.1 | 0.1 | 0.67 |
| | 100 | 0 | 0 | 0 |
| | 100 | 0 | 0 | 0 |
| | 100 | 0 | 0 | 0 |
| | 99.9 | -0.1 | 0.1 | 0.67 |
| | 99.7 | -0.3 | 0.3 | 2.02 |
| | 99.5 | -0.5 | 0.5 | 3.37 |
| Median = | 100 | | 0.1 | |
| MAD = | | | 0.14 | |

Footnote 6: Assuming an underlying normal distribution, 1.483 is a constant used so that the resulting MAD is a consistent estimator of the population standard deviation. This means that as the sample size gets larger, MAD gets closer to the population standard deviation.
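The two passes of Hampel's Rule described above can be sketched in a few lines of Python. This is an illustrative sketch rather than part of the chapter: the function names `hampel_scores` and `hampel_outliers` are hypothetical, and only the constant 1.483, the cutoff 3.5, and the ten data values come from the text. The values it computes are unrounded, so they may differ in the last digit from the rounded table entries.

```python
from statistics import NormalDist, median

# Scale constant quoted in the text. Under a normal distribution it is
# 1 / (0.75 quantile of the standard normal), rounded to three decimals.
MAD_CONSTANT = 1.483
CUTOFF = 3.5  # threshold on the absolute normalized value (Step 2)

# Cross-check of the constant (about 1.4826 before rounding).
normal_consistency = 1 / NormalDist().inv_cdf(0.75)


def hampel_scores(data):
    """Return (median, MAD, absolute normalized values) for a data set."""
    center = median(data)
    abs_dev = [abs(x - center) for x in data]   # absolute deviations
    mad = MAD_CONSTANT * median(abs_dev)        # scaled median absolute deviation
    if mad == 0:
        raise ValueError("MAD is zero; normalized values are undefined")
    return center, mad, [d / mad for d in abs_dev]


def hampel_outliers(data):
    """Return the data points whose absolute normalized value exceeds CUTOFF."""
    _, _, scores = hampel_scores(data)
    return [x for x, z in zip(data, scores) if z > CUTOFF]


# Data from Table 4.
data = [100.3, 100.2, 100.1, 100, 100, 100, 99.9, 99.7, 99.5, 95.7]

first_pass = hampel_outliers(data)                      # rule applied to all 10 values
remaining = [x for x in data if x not in first_pass]    # drop flagged values
second_pass = hampel_outliers(remaining)                # rule re-applied (Table 5)

print("first pass outliers:", first_pass)
print("second pass outliers:", second_pass)
```

On the Table 4 data the first pass flags only 95.7, and the second pass flags nothing, which agrees with the conclusion in the text that 99.5 is not an outlier.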

APPENDIX D: COMPARISON OF PROCEDURES—PRECISION


The following example illustrates the calculation of a 90% confidence interval for the ratio of (true) variances for the purpose of comparing the precision of two procedures. It is assumed that the underlying distributions of the sample measurements are well characterized by normal distributions. For this example, assume the laboratory will accept the alternative procedure if its precision (as measured by the variance) is no more than fourfold greater than that of the current procedure.
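As a sketch of the interval this example is building toward, the standard F-based confidence interval for the ratio of variances of two independent normal samples can be written as below. The symbols $s_A^2$, $s_C^2$, $n_A$, and $n_C$ (sample variances and numbers of runs for the alternative and current procedures) are notation introduced here for illustration, not taken from the chapter. $F_{\alpha,\,\nu_1,\,\nu_2}$ denotes the upper α percentile of an F-distribution with $\nu_1$ numerator and $\nu_2$ denominator degrees of freedom.

```latex
\[
\left[\;
\frac{s_A^2 / s_C^2}{F_{0.05,\,n_A-1,\,n_C-1}}
\;,\;\;
\frac{s_A^2}{s_C^2}\; F_{0.05,\,n_C-1,\,n_A-1}
\;\right]
\quad \text{is a 90\% confidence interval for } \sigma_A^2 / \sigma_C^2 .
\]
```

Under the acceptance criterion stated above, the alternative procedure would be accepted when the upper limit of this interval does not exceed 4.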

To determine the appropriate sample size for precision, one possible method involves a trial-and-error approach using the following formula:

$$\text{Power} = \Pr\left[F > \tfrac{1}{4}\,F_{\alpha,\,n-1,\,n-1}\right]$$

where n is the smallest sample size required to give the desired power, which is the likelihood of correctly claiming the alternative procedure has acceptable precision when in fact the two procedures have equal precision; α is the risk of wrongly claiming the alternative procedure has acceptable precision; and the 4 is the allowed upper limit for an increase in variance. F-values are found in commonly available tables of critical values of the F-distribution. $F_{\alpha,\,n-1,\,n-1}$ is the upper α percentile of an F-distribution with n - 1 numerator and n - 1 denominator degrees of freedom; that is, the value exceeded with probability α. Suppose initially the laboratory guessed a sample size of 11 per procedure was necessary (10 numerator and denominator degrees of freedom); the power calculation would be as follows (see footnote 7):

$$\Pr\left[F > \tfrac{1}{4}\,F_{\alpha,\,n-1,\,n-1}\right] = \Pr\left[F > \tfrac{1}{4}\,F_{0.05,\,10,\,10}\right] = \Pr\left[F > \frac{2.978}{4}\right] = 0.6751$$

In this case the power was only 68%; that is, even if the two procedures had exactly equal variances, with only 11 samples per procedure, there is only a 68% chance that the experiment will lead to data that permit a conclusion of no more than a fourfold increase in variance. Most commonly, sample size is chosen to have at least 80% power, with choices of 90% power or higher also used. To determine the appropriate sample size, various numbers can be tested until a probability is found that exceeds the acceptable limit (e.g., power >0.90). For example, the power determinations for sample sizes of 12–20 are displayed in Table 6. In this case, the initial guess at a sample size of 11 was not adequate for comparing precision, but 15 samples per procedure would provide a large enough sample size if 80% power were desired, or 20 per procedure for 90% power.

Footnote 7: This could be calculated using a computer spreadsheet. For example, in Microsoft® Excel the formula would be: FDIST((R/A)*FINV(alpha, n - 1, n - 1), n - 1, n - 1), where R is the ratio of variances at which to determine power (e.g., R = 1, which was the value chosen in the power calculations provided in Table 6) and A is the maximum ratio for acceptance (e.g., A = 4). Alpha is the significance level, typically 0.05.
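The trial-and-error search described above can also be scripted. The sketch below mirrors the spreadsheet formula from footnote 7 using only the Python standard library. All function names are hypothetical, and the F-distribution tail is computed from the regularized incomplete beta function with a textbook continued-fraction evaluation, which is an implementation choice made here and not something the chapter prescribes. In practice a statistics library would normally supply these functions.

```python
from math import exp, lgamma, log


def _beta_cf(a, b, x, max_iter=1000, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even and odd coefficients of the continued fraction for step m.
        for num in (
            m * (b - m) * x / ((a + m2 - 1.0) * (a + m2)),
            -(a + m) * (a + b + m) * x / ((a + m2) * (a + m2 + 1.0)),
        ):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            return h
    raise RuntimeError("continued fraction did not converge")


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_upper_tail(x, df1, df2):
    """Pr[F > x] for an F-distribution (the quantity Excel's FDIST returns)."""
    if x <= 0.0:
        return 1.0
    return reg_inc_beta(df2 / 2.0, df1 / 2.0, df2 / (df2 + df1 * x))


def f_upper_critical(alpha, df1, df2):
    """Value exceeded with probability alpha (the quantity Excel's FINV returns)."""
    lo, hi = 0.0, 1.0
    while f_upper_tail(hi, df1, df2) > alpha:   # bracket the critical value
        hi *= 2.0
    for _ in range(200):                        # plain bisection
        mid = (lo + hi) / 2.0
        if f_upper_tail(mid, df1, df2) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def power(n, alpha=0.05, max_ratio=4.0, true_ratio=1.0):
    """Power for n runs per procedure: FDIST((R/A)*FINV(alpha, n-1, n-1), n-1, n-1)."""
    df = n - 1
    crit = f_upper_critical(alpha, df, df)
    return f_upper_tail(true_ratio / max_ratio * crit, df, df)


def smallest_n(target_power, start=2, **kwargs):
    """Smallest n per procedure whose power reaches target_power."""
    n = start
    while power(n, **kwargs) < target_power:
        n += 1
    return n


for n in range(11, 21):
    print(n, round(power(n), 4))
```

For n = 11 this should reproduce the 0.6751 quoted in the text, and `smallest_n(0.80)` and `smallest_n(0.90)` should return the 15 and 20 runs per procedure stated above. Changing `max_ratio` shows how a larger allowable increase in variance lowers the required sample size.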

Table 6. Power Determinations for Various Sample Sizes (Specific to the Example in Appendix D)

Typically, the sample size for precision comparisons will be larger than for accuracy comparisons. If the sample size required for precision is too large for the laboratory to conduct the study in practice, there are some options. The first is to reconsider the allowable increase in variance: for larger allowable increases in variance, the required sample size for a fixed power will be smaller. Another alternative is to plan an interim analysis at a smaller sample size, with the possibility of proceeding to a larger sample size if needed. In this case, it is strongly advisable to seek professional help from a statistician.

Now, suppose the laboratory opts for 90% power and obtains the results presented in Table 7 based on the data generated from 20 independent runs per procedure.
