Rodger's Method
Background
Anova
Statistical analysis, especially in the sciences, social sciences and related disciplines, usually involves the examination of multiple categories of information (say, J such sets). These are often random samples of observations; or random assignments of patients, people, pigeons, plants, field plots, etc., to 'treatment' conditions - the 'treatments' being drugs, optical illusions, learning conditions, fertilizers, time of planting, etc. The object of the analysis of such sets of data is to determine whether (and where) they differ from one another by more than 'random error' can explain away. Typically, each of the J sets is characterized by a single 'statistic.' That is often the mean or average of the observations in the set, but sometimes it is some other 'statistic' such as the proportion of successes in the set. Whether such 'sample' means (m_{j}) or proportions (p_{j}) differ from one another by more than 'random error' - in the light of the amount of variation among the observations within sets (or samples) - is assessed by the procedure known as Analysis of Variance (or Anova), by various forms of that procedure, or by analogues of the method such as analysis of contingency tables using the Chi-square approximation to the multinomial distribution.
Although the ideas incorporated in Anova have a quite long history, a paper by Sir Ronald Aylmer Fisher (1918)^{[1]} started the modern ball rolling. If the true population mean of set j was μ_{j}, then the hypothesis of interest seemed to be:
H_{0}: μ_{1} = μ_{2} = . . . = μ_{J} {1}
This null hypothesis says that the sets (at least as characterized by their means) do not differ. It seemed natural to accept that hypothesis if the variation among the sample m_{j} was no more than was reasonably likely in the light of the observed variation within samples.
Type 1 Error Rate
At this point, a controversial issue rears its head. The Fisherian doctrine, as it developed over the next 40 years, refused to accept null hypotheses when they could not be rejected. After all, they might really be false, but by an amount so small that only huge samples would be large enough to reject them! Some fifteen years after Fisher's paper, Jerzy Neyman and Egon Sharpe Pearson (1933)^{[2]} formalized the criteria for accepting and rejecting null hypotheses. An excellent discussion of some of these matters can be found at the mathnstats website (via the external links section below).
It had been suggested that if the probability of the observed data variation (or more) was 0.05 or less (given H_{0} were true), that might be grounds for suspecting that the null hypothesis at {1} was not true. Neyman and Pearson formalized and extended that thinking significantly. One should set up the criterion for 'deciding' whether H_{0} was true, before the data were collected (say at a probability α, which might be 0.05 or some similarly small probability). If the observed probability of the data variation turned out to be α or less, then one rejects H_{0}. That might, of course, be a mistake (called a type 1 error). The probability of such an error is α when H_{0} is really true. 'Deciding' that H_{0} is true is not 'chipped in stone' - for example, such decisions can be revised by later evidence.
Power
Neyman and Pearson also pointed out that there is another, important side to this matter. H_{0} may indeed be false; so we should try to arrange the size of our statistical investigation (by choice of sample size N) to yield a decent probability (say, β) of detecting that falsity. The procedure is to state how far from equal the μ_{j} might be (or how small the true variation needs to be to lead us to discount it at this stage of investigation). In Anova, that involves the calculation of a noncentrality parameter, then its use in Fisher's Noncentral Variance Ratio Distribution. It is rather poetic that although Fisher did not accept the Neyman-Pearson methodology, his (1928)^{[3]} distribution plays a crucial part in the method for setting 'power' β.
Rodger's Approach
Although the classical H_{0} plays an important part in the theory of Anova, that part is more theoretical than practical. This is reflected in the fact that there are at least three formulae for the variance-ratio (F_{m}), from Anova, that have been used to decide whether to accept or reject H_{0}. The first is the obvious variance-ratio form, but the second shows that it is equivalent to an evaluation of whether a null contrast should be accepted or rejected, and the third shows that F_{m} is equivalent to the simultaneous evaluation of any (J-1) linearly independent contrasts. A constant sample size N is used throughout here to keep the formulae simple. Unequal N_{j} can easily be handled, though in real applications unequal N_{j} raise the risk of misleading results when the true variances (σ^{2}_{j}) are unequal.
F_{m} = NΣ(m_{j}-m.)^{2}/(ν_{1} s^{2}) {2}
F_{m} = N(Σc_{mj}m_{j})^{2}/(ν_{1} s^{2} Σc^{2}_{mj}) {3}
F_{m} = N _{1}v_{H} (_{H}C_{J} _{J}C^{T}_{H})^{-1} _{H}v^{T}_{1}/(ν_{1} s^{2}) {4}
in which ν_{1} = J-1 is the numerator degrees of freedom for F_{m}, and (mathematically) there can be no more than H = J-1 linearly independent contrasts across J means.
Any contrast across the μ_{j} takes the form:
K_{h} = c_{1}μ_{1} + c_{2}μ_{2} + . . . + c_{J}μ_{J} = 0 {5}
in which the c_{j} are not all zero, and Σc_{j} = 0. The contrast in {3} is the maximized contrast, defined as:
c_{mj} = m_{j}-m. {6}
where m. is the average of the J values of m_{j}. The matrix _{H}C_{J} in {4} holds the contrast coefficients (the c_{j}) for any H = J-1 linearly independent contrasts, and the vector _{1}v_{H} holds the sample values of the H contrasts.
Alternatives
When {5} is not true, then what is true is the 'alternative':
K'_{h} = c_{1}μ_{1} + c_{2}μ_{2} + . . . + c_{J}μ_{J} = δ_{h} = g_{h} σ √(Σc^{2}_{j}) {7}
in which δ_{h} is the linear noncentrality parameter for this h^{th} contrast, expressed in terms of g_{h}, which is a scale-free parameter. The measurement scale (such as inches, centimetres, pounds, kilogrammes) is absorbed by the true (but unknown) standard deviation (σ), and the scale of the contrast is absorbed by √(Σc^{2}_{j}); thus the value of g is exactly the same for the two (equivalent) contrasts:
(μ_{1} + μ_{2})/2 - μ_{3} = g σ √(1.5) {8}
μ_{1} + μ_{2} - 2μ_{3} = g σ √(6) {9}
The Noncentral Variance Ratio Distribution uses a quadratic noncentrality parameter, such as:
Δ = Nδ^{2}/(σ^{2} Σc^{2}_{j}) = Ng^{2} {10}
which makes g an even more interesting quantity.
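The scale-free property of g in {8} and {9}, and the relation Δ = Ng^{2} in {10}, can be checked numerically. The following is a minimal sketch, not part of Rodger's published material: the function name `scale_free_g` and the population values (μ_{j}, σ, N) are hypothetical and chosen only for illustration.

```python
import math

def scale_free_g(coeffs, means, sigma):
    """Return g = (sum of c_j * mu_j) / (sigma * sqrt(sum of c_j^2)), as in {7}."""
    if abs(sum(coeffs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    value = sum(c * mu for c, mu in zip(coeffs, means))
    return value / (sigma * math.sqrt(sum(c * c for c in coeffs)))

# Hypothetical population values, assumed purely for illustration.
mu = [10.0, 14.0, 6.0]
sigma = 4.0

g_8 = scale_free_g([0.5, 0.5, -1.0], mu, sigma)  # contrast written as in {8}
g_9 = scale_free_g([1.0, 1.0, -2.0], mu, sigma)  # same contrast, scaled as in {9}

N = 11                        # hypothetical common sample size
delta_quadratic = N * g_8 ** 2  # quadratic noncentrality parameter, as in {10}

print(g_8, g_9, delta_quadratic)
```

Both forms of the contrast give the same g because rescaling the coefficients rescales the contrast value and √(Σc^{2}_{j}) by the same factor.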
The overall noncentrality parameter (Δ_{m}) in the Noncentral Variance Ratio Distribution can be written in at least three ways - analogous to F_{m} at {2} through {4} - as:
Δ_{m} = N Σ(μ_{j}-μ.)^{2}/σ^{2} {11}
Δ_{m} = N (Σc_{μj} μ_{j})^{2}/(σ^{2} Σc^{2}_{μj}) {12}
Δ_{m} = N _{1}δ_{H} (_{H}C_{J} _{J}C^{T}_{H})^{-1} _{H}δ^{T}_{1}/σ^{2} {13}
Note that there is no division by ν_{1} in any of these formulae. The c_{μj} in {12} are the very theoretical, maximizing coefficients c_{μj} = μ_{j}-μ., and the vector _{1}δ_{H} holds the linear noncentrality parameters δ_{h} for the h^{th} contrast. Finally, if we use {3} to compute F_{h} for H = J-1 mutually orthogonal contrasts, then:
F_{m} = ΣF_{h} {14}
It follows that if we used the critical value F_{crit} to decide whether to accept or reject null contrasts, then the maximum number of mutually orthogonal null contrasts we could reject in a research study, by that criterion, is:
r = [F_{m}/F_{crit}] ≤ ν_{1} {15}
in which [] indicates fraction truncation, and r cannot be allowed to exceed the maximum (mathematically) permissible number ν_{1}.
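Equation {15} amounts to a truncated division capped at ν_{1}. A minimal sketch follows; the function name `max_rejections` is an illustrative choice, and the example values (F_{m} = 2.75, critical value 1.974, ν_{1} = 3) are taken from the illustration later in this article, while F_{m} = 12.0 is a made-up value used only to show the cap.

```python
import math

def max_rejections(f_m, f_crit, nu1):
    """Return r = [F_m / F_crit], truncated and capped at nu1, as in {15}."""
    if f_crit <= 0 or nu1 < 1:
        raise ValueError("f_crit must be positive and nu1 must be at least 1")
    return min(math.floor(f_m / f_crit), nu1)

r_illustration = max_rejections(2.75, 1.974, 3)  # values from the illustration
r_capped = max_rejections(12.0, 1.974, 3)        # hypothetical large F_m

print(r_illustration, r_capped)
```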
Decision-based Rejection Rates
Usually the researcher is mainly interested in which μ_{j} differ from which, and H_{0} at {1} is of secondary interest at most; so, if evaluating contrasts post hoc, the researcher will try various contrasts in equation {3}, rejecting the nulls when F_{h} is large and accepting them otherwise. To remain logically consistent (contradiction nullifies the whole operation), the researcher ends with decisions for H = J-1 linearly independent contrasts (for simplicity, preferably J-1 mutually orthogonal contrasts), giving the rejected nulls the planned value of g, subject to change to better fit the data.
But the big question is, what should be the criterion against which F_{h} should be compared? If the traditional Fα;ν_{1},ν_{2} is used, either the probability of detecting false null contrasts goes down, down and down as J is increased; or N must go up, up and up as J is increased. Rodger (1967)^{[4]} argued that it is not the probability (α) of rejecting H_{0} in error that should be controlled, rather it is the average rate of rejecting true null contrasts that should be controlled; i.e., we should control the expected rate (Eα) of true null contrast rejection. In the same way it is the average rate of rejecting null contrasts when they are not all true that should be controlled, not the probability (power β) of rejecting H_{0} at {1} when it is false. That is to say, we should control the average or expected rate (Eβ).
Tables of F[Eα];ν_{1},ν_{2} and Δ[Eβ];ν_{1},ν_{2}
To implement the above decision-based error procedure, Rodger (1975a)^{[5]} published tables of F[0.05];ν_{1},ν_{2} and F[0.01];ν_{1},ν_{2}. He also published (Rodger, 1975b)^{[6]} tables of Δ[Eβ];ν_{1},ν_{2} for Eα = 0.05 and for Eα = 0.01. The values reported are for Eβ = 0.50, 0.70, 0.80, 0.90, 0.95, and 0.99. As an example of what the expectations (or averages) Eα and Eβ represent, consider an investigation with J = 4 samples, each with N = 11 observations. That makes the F_{m} degrees of freedom ν_{1} = J-1 = 3, and ν_{2} = J(N-1) = 4×10 = 40. Rodger's (1975a) table reports F[0.05];3,40 = 1.974 and his (1975b) table gives Δ[0.95];3,40 = 9.246. That is not a Δ_{m} parameter; it is a Δ per contrast. Hence Δ_{m} = ν_{1}×Δ[Eβ];ν_{1},ν_{2} = 3×9.246 = 27.738. In the analysis of the data in our illustrative experiment we will reject (see {15} above):
r = [F_{m}/F[0.05];ν_{1},ν_{2}] = [F_{m}/1.974] ≤ 3 {16}
null contrasts. We can integrate the Central Variance Ratio Distribution to find the probabilities π_{r} of r = 0, 1, 2, or 3 when all null contrasts are true, and we can integrate the Noncentral Variance Ratio Distribution (with Δ_{m} = ν_{1}×Δ[0.95];ν_{1},ν_{2} = 3×9.246 = 27.738, when Eα = 0.05) to find the probabilities π'_{r} of r = 0, 1, 2, or 3 if there are ν_{1} = 3 mutually orthogonal contrasts possible across the μ_{j}, each of which has a Δ of 9.246. The procedure is to find the areas under the distribution from F = 0 to F = 1.974 (for π_{0} and π'_{0}), from F = 1.974 to F = 2×1.974 = 3.948 (for π_{1} and π'_{1}), from F = 3.948 to F = 3×1.974 = 5.922 (for π_{2} and π'_{2}) and, finally the area under the distribution from F = 5.922 to F = ∞ (for π_{3} and π'_{3}). The results are given in Table 1 below.
Table 1: Probabilities π_{r} for Δ_{m}=0 and π'_{r} for Δ_{m} = 27.738
r | π_{r} | r×π_{r} | π'_{r} | r×π'_{r} |
---|---|---|---|---|
0 | 0.8667 | 0.0000 | 0.0013 | 0.0000 |
1 | 0.1186 | 0.1186 | 0.0252 | 0.0252 |
2 | 0.0128 | 0.0256 | 0.0956 | 0.1912 |
3 | 0.0019 | 0.0057 | 0.8779 | 2.6337 |
Sum | 1.0000 | 0.1499 | 1.0000 | 2.8501 |
Sum/3 | | 0.0500 | | 0.9500 |
The π_{r} and π'_{r} are multiplied by r to find the expectation of r because there will be r null rejections made. The formulae are:
Eα = Σr×π_{r}/ν_{1}; Eβ = Σr×π'_{r}/ν_{1} {17}
and those are reported at the bottom of Table 1. When all possible null contrasts are true, the expected (i.e., average) proportion of ν_{1} nulls rejected by the procedure will be exactly Eα = 0.05. When there are ν_{1} mutually orthogonal nulls that are false, each with Δ_{h} = 9.246, then the expected (i.e., average) proportion of ν_{1} nulls rejected by the procedure will be exactly Eβ = 0.95.
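The central-distribution half of Table 1 can be approximated with a short numerical integration. The sketch below is an illustration only, not Rodger's own computation: it integrates the central variance-ratio density with a composite Simpson rule (a choice made here for self-containment), the function names are hypothetical, and the noncentral case (the π'_{r} column) is not attempted.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the central variance-ratio (F) distribution."""
    if x <= 0.0:
        return 0.0  # correct limit only when d1 > 2, which holds in this example
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    log_pdf = (0.5 * (d1 * math.log(d1 * x) + d2 * math.log(d2)
                      - (d1 + d2) * math.log(d1 * x + d2))
               - math.log(x) - log_beta)
    return math.exp(log_pdf)

def f_cdf(x, d1, d2, steps=4000):
    """Approximate P(F <= x) by composite Simpson integration (steps must be even)."""
    h = x / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return total * h / 3

nu1, nu2, f_crit = 3, 40, 1.974  # values quoted in the text above

# Cumulative probabilities at 1, 2 and 3 times the critical value.
cuts = [f_cdf(k * f_crit, nu1, nu2) for k in range(1, nu1 + 1)]
edges = [0.0] + cuts + [1.0]

# pi[r] is the probability of exactly r rejections when all nulls are true.
pi = [edges[r + 1] - edges[r] for r in range(nu1 + 1)]

# Expected rejection rate per contrast, as in {17}.
e_alpha = sum(r * p for r, p in enumerate(pi)) / nu1

print([round(p, 4) for p in pi], round(e_alpha, 4))
```

If the integration is accurate, the printed probabilities should be close to the π_{r} column of Table 1 and the expected rate close to 0.05.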
An Illustration
Suppose we intend to use J = 4 samples, and we would like to detect null contrasts that are false by g^{2} = 0.81 (g = ±0.9) or more at a rate of Eβ = 0.95. We do not know ν_{2} yet, but Rodger's (1975b) table shows Δ[0.95];3,∞ = 8.370; so, as {10} indicates, we should use sample size:
N ≥ Δ[Eβ];ν_{1},∞/g^{2} = 8.370/0.81 = 10.33 {18}
If we use N = 11, that will make ν_{2} = J(N-1) = 4×10 = 40 and Δ[0.95];3,40 = 9.246. Our Δ = Ng^{2} = 11×0.81 = 8.91 is a little less than 9.246, but we will continue with N = 11, knowing that Eβ is a little less than 0.95 (for the curious, the exact Eβ = 0.942).
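The sample-size step in {18} can be written as a small helper. This is a sketch under the article's stated values (Δ[0.95];3,∞ = 8.370, g = 0.9, J = 4); the function name `required_n` is hypothetical, and the tabled Δ must still be looked up in Rodger's (1975b) tables rather than computed here.

```python
import math

def required_n(delta_table, g):
    """Smallest whole N with N * g^2 >= delta_table, following {10} and {18}."""
    if g == 0:
        raise ValueError("g must be non-zero")
    return math.ceil(delta_table / g ** 2)

J = 4
g = 0.9
n = required_n(8.370, g)   # tabled value for nu2 = infinity, quoted above
nu2 = J * (n - 1)          # error degrees of freedom once N is fixed
achieved_delta = n * g ** 2  # per-contrast Delta actually obtained with this N

print(n, nu2, round(achieved_delta, 2))
```

Because ν_{2} depends on N, the achieved Δ should then be compared with the tabled value for the resulting ν_{2}, as the text does with Δ[0.95];3,40 = 9.246.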
Suppose now that the sample data turn out to be those in Table 2:
Table 2: Illustration Data for J = 4, N = 11, s^{2} = 72
j = | 1 | 2 | 3 | 4 | Sum |
---|---|---|---|---|---|
m_{j} = | 15 | 16 | 21 | 24 | 4×19 |
m_{j}-m. = | -4 | -3 | 2 | 5 | 0 |
(m_{j}-m.)^{2} = | 16 | 9 | 4 | 25 | 54 |
These data yield the Anova source table shown in Table 3.
Table 3: Anova Source Table for the Illustration Data
Source | d.f. | Sum Squares | MS | F_{m} | r |
---|---|---|---|---|---|
Between Samples | 3 | 11×54 = 594 | 198 | 2.75 | 1 |
Within Samples | 40 | 2880 | 72 | | |
Total | 43 | 3474 | | | |
Equation {15} makes:
r = [F_{m}/F[Eα];ν_{1},ν_{2}] = [2.75/1.974] = [1.4] = 1 {19}
We may therefore reject one (out of ν_{1} = 3) null contrast. Note that the traditional criterion F0.05;3,40 = 2.839 (which is used by Scheffé's procedure) would find nothing 'significant' in these data and, as J increases, the discovery rate by F[Eα];ν_{1},ν_{2} grows ever better than that of Fα;ν_{1},ν_{2}. (Similarly, the post hoc procedures of Tukey and Newman-Keuls, which use studentized range values, are also unable to declare any differences within this illustration data to be 'significant.')
For different forms of contrasts, i.e., having different values of Σc^{2}_{j}, the size of sample effects needed to reject the null are:
Critical = √(ν_{1}×F[Eα];ν_{1},ν_{2}×s^{2}Σc^{2}_{j}/N) {20}
= √(3×1.974×72Σc^{2}_{j}/11) = √(38.762Σc^{2}_{j})
Three examples are shown in Table 4 below.
Table 4: Critical Contrast Values for Null Rejection
Σc^{2}_{j} | Critical | Example |
---|---|---|
2 | 8.9 | m_{4} - m_{1} = 9 |
4 | 12.5 | m_{4} + m_{3} - m_{2} - m_{1} = 14 |
6 | 15.3 | 2m_{4} - m_{2} - m_{1} = 17 |
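The critical values in Table 4 follow directly from {20}. The sketch below recomputes them from the illustration data; the function name `critical_contrast` and the coefficient vectors written out for each example are illustrative choices. The unrounded results are slightly below the tabled figures, which appear to have been rounded upward to one decimal place.

```python
import math

def critical_contrast(sum_c2, f_crit=1.974, nu1=3, s2=72.0, n=11):
    """Smallest absolute sample contrast that rejects the null, as in {20}."""
    return math.sqrt(nu1 * f_crit * s2 * sum_c2 / n)

means = [15, 16, 21, 24]  # m_1 .. m_4 from Table 2

examples = [
    ("m4 - m1", [-1, 0, 0, 1]),
    ("m4 + m3 - m2 - m1", [-1, -1, 1, 1]),
    ("2m4 - m2 - m1", [-1, -1, 0, 2]),
]

results = []
for label, coeffs in examples:
    sum_c2 = sum(c * c for c in coeffs)
    value = sum(c * m for c, m in zip(coeffs, means))
    critical = critical_contrast(sum_c2)
    # Record the contrast value, its critical value, and the decision.
    results.append((label, sum_c2, value, critical, abs(value) >= critical))

for row in results:
    print(row)
```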
If it makes reasonable scientific sense (in terms of the subject matter studied), the data seem to suggest that μ_{1} = μ_{2} < μ_{3} = μ_{4}, and three mutually orthogonal contrasts that express this are shown in Table 5.
Table 5: Decision Set of Three Mutually Orthogonal Contrasts
h | Contrast | Sample Value | F_{h} | δ_{h} |
---|---|---|---|---|
1 | μ_{2}-μ_{1} | 16-15 = 1 | 11(1)^{2}/(3×72×2) = 0.025 | 0 |
2 | μ_{4}-μ_{3} | 24-21 = 3 | 11(3)^{2}/(3×72×2) = 0.229 | 0 |
3 | μ_{4}+μ_{3}-μ_{2}-μ_{1} | 24+21-16-15 = 14 | 11(14)^{2}/(3×72×4) = 2.495 | 0.9σ×2 |
For the curious, the maximizing contrast has the coefficients shown from Table 2 as: c_{mj} = m_{j}-m. = -4, -3, 2, 5; so formula {3} gives:
F_{m} = N(Σc_{mj}m_{j})^{2}/(ν_{1} s^{2} Σc^{2}_{mj}) = 11(54)^{2}/(3×72×54) = 32076/11664 = 2.75 {21}
and, using our decision set in Table 5, formula {4} gives:
F_{m} = N _{1}v_{H} (_{H}C_{J} _{J}C^{T}_{H})^{-1} _{H}v^{T}_{1}/(ν_{1} s^{2}) = 11(1^{2}/2 + 3^{2}/2 + 14^{2}/4)/(3×72) = 11×54/216 = 2.75 {22}
One can see the simplicity here of inverting the product of H = 3 mutually orthogonal contrasts, whose product is a diagonal matrix. Furthermore, non-orthogonal but linearly independent contrasts not only yield a product matrix that is more difficult to invert, they also make interpretation more complicated since differentially weighted modifications may be desirable for the δ_{h}.
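The agreement among formulae {2}, {3} and {4}, and the additivity in {14}, can be verified on the illustration data. This is a sketch that assumes the contrasts are mutually orthogonal, so that the product matrix in {4} is diagonal and its inverse is taken element by element; a general set of linearly independent contrasts would need a full matrix inverse, which is not shown. Variable names are illustrative.

```python
N, s2 = 11, 72.0
m = [15.0, 16.0, 21.0, 24.0]  # sample means from Table 2
J = len(m)
nu1 = J - 1
m_bar = sum(m) / J

# Formula {2}: variance-ratio form.
f_2 = N * sum((x - m_bar) ** 2 for x in m) / (nu1 * s2)

# Formula {3}: maximized contrast with c_mj = m_j - m.
c_max = [x - m_bar for x in m]
value_max = sum(c * x for c, x in zip(c_max, m))
f_3 = N * value_max ** 2 / (nu1 * s2 * sum(c * c for c in c_max))

# Formula {4}: the three contrasts of Table 5 (rows of the matrix C).
C = [[-1, 1, 0, 0], [0, 0, -1, 1], [-1, -1, 1, 1]]
for a in range(len(C)):
    for b in range(a + 1, len(C)):
        if sum(x * y for x, y in zip(C[a], C[b])) != 0:
            raise ValueError("this sketch assumes mutually orthogonal contrasts")

v = [sum(c * x for c, x in zip(row, m)) for row in C]  # sample contrast values
diag = [sum(c * c for c in row) for row in C]          # diagonal of C C^T

# Per-contrast F_h; by {14} their sum is F_m.
f_h = [N * vh ** 2 / (nu1 * s2 * d) for vh, d in zip(v, diag)]
f_4 = sum(f_h)

print(f_2, f_3, f_4, [round(f, 3) for f in f_h])
```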
Decision Implications
It is obviously true that the following three statements contradict one another:
X: μ_{6} - μ_{5} = 0; Y: μ_{7} - μ_{6} = 0; Z: μ_{7} - μ_{5} > 0 {23}
The old adage is that two things (e.g., μ_{5} and μ_{7}) that are equal to the same thing (μ_{6}) are equal to one another; so the three statements cannot possibly all be true - no matter what statistical tests on their sample estimates say! Nevertheless, this type of contradiction (explicitly or implicitly) occurs often enough in reports of statistical analysis to be quite an embarrassment! The three comparisons in {23} are not linearly independent of one another because, algebraically:
Z = X + Y {24}
Rodger's method precludes drawing logically contradictory conclusions from any set of data by requiring that each statistical decision be linearly independent of every other one. But there is a very important, positive result to be extracted from the implication of decisions. In algebraic terms, a set of H decisions for contrasts (such as those in Table 5) can be expressed in matrix form as:
_{1}μ_{J} _{J}C^{T}_{H} = _{1}δ_{H} {25}
If we could only invert _{J}C^{T}_{H} (i.e., divide it away somehow) we could find out what our H decisions say about all the μ_{j} - quite an achievement in terms of saying what the investigator believes the investigation has shown, without the 'noise' of sample error - and a valuable guide to future research on the subject matter. But _{J}C^{T}_{H} is not even a square matrix; so its regular inverse does not exist! Happily there exist 'generalized inverses' that can be used in some occasions of this sort - and this 'implication need' is just such an occasion! The result is Rodger's 'implication equation':
_{1}μ_{J} = _{1}δ_{H} (_{H}C_{J} _{J}C^{T}_{H})^{-1} _{H}C_{J} {26}
From the decision set in Table 5:
_{1}μ_{4} = _{1}δ_{3} (_{3}C_{4} _{4}C^{T}_{3})^{-1} _{3}C_{4} = (-0.45σ, -0.45σ, 0.45σ, 0.45σ) {27}
These are the values of μ_{j}-μ. (not the μ_{j} alone) implied by the decisions in Table 5. This procedure produces results that would not be clearly seen otherwise when there are more null rejections.
The results in {27} yield an overall, quadratic noncentrality parameter:
Δ_{μ} = NΣ(μ_{j}-μ.)^{2}/σ^{2} = 11×0.81σ^{2}/σ^{2} = 8.91 {28}
This equals Ng^{2} because only one null contrast was rejected and given δ_{3} = 0.9 σ√4.
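The implication equation {26} and the resulting Δ_{μ} in {28} can be reproduced for the Table 5 decision set. As a simplifying assumption, this sketch handles only mutually orthogonal contrasts, so the inverse in {26} reduces to dividing each δ_{h} by its Σc^{2}_{j}; all quantities are expressed in units of σ, and the variable names are illustrative.

```python
import math

N = 11
C = [[-1, 1, 0, 0], [0, 0, -1, 1], [-1, -1, 1, 1]]  # contrasts of Table 5
g = [0.0, 0.0, 0.9]  # decisions: two nulls accepted, one rejected with g = 0.9
J = len(C[0])

for a in range(len(C)):
    for b in range(a + 1, len(C)):
        if sum(x * y for x, y in zip(C[a], C[b])) != 0:
            raise ValueError("this sketch assumes mutually orthogonal contrasts")

sum_c2 = [sum(c * c for c in row) for row in C]

# delta_h = g_h * sqrt(sum c^2), in units of sigma, as in {7}.
delta = [g_h * math.sqrt(s) for g_h, s in zip(g, sum_c2)]

# delta (C C^T)^-1 is element-wise division because C C^T is diagonal here.
weights = [d / s for d, s in zip(delta, sum_c2)]

# Implied values of mu_j - mu. (in units of sigma), as in {26}.
implied = [sum(w * row[j] for w, row in zip(weights, C)) for j in range(J)]

# Overall quadratic noncentrality parameter, as in {28}.
delta_mu = N * sum(x * x for x in implied)

print([round(x, 2) for x in implied], round(delta_mu, 2))
```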
There are ways of modifying the g_{h} to reflect more closely the sample observations, and the SPS computer program (discussed below) uses two of these procedures by default. SPS also automatically provides two separate statistics which assess the fit of the implied μ_{j}-μ. One of these statistics is the correlation coefficient between these implied true population means and the sample means, and the other is the F fit residual (i.e., the amount of the omnibus F_{m} value that is not accounted for by the as many as r rejected null contrasts among the J-1 statistical decisions). A high degree of 'fit' between the sample and implied means is a necessary condition for concluding that a particular set of J-1 decisions is the scientifically optimal one, but it can never be a sufficient condition for drawing that conclusion. When partitioning the overall Anova between-groups variance into independent components (and J > 2), it is theoretically possible to do this in an infinite number of ways (i.e., "literally," that many, infinitesimally different from one another, sets of J-1 mutually orthogonal contrasts could be constructed). The statistical fit between the implied true population means that are mathematically entailed by any specific set of J-1 decisions, and the sample means one began with, matters. But it is the scientific sense that those J-1 statistical decisions make that, statistically speaking, needs to be of ultimate concern.
Further Applications
Of course, the methodology shown here can be applied to other forms of Anova, e.g., to Randomized Blocks data, but Rodger (1974)^{[7]} has argued that data collected for a Factorial Design analysis (e.g., an I×J×K design) are better analyzed by his method in a one-way Anova for L = I×J×K samples. F[Eα];L-1,ν_{2} does not suffer the loss of effect detectability as L is increased that occurs when the traditional Fα;L-1,ν_{2} is used; so 'common sense' interactions (by simple contrasts across the m_{ijk}) have a reasonable chance of being detected. The 2013 article by Rodger and Roberts^{[8]} shows clearly how effect detectability by the traditional standard (Fα;L-1,ν_{2}) becomes poorer and poorer as L is increased, which does not happen if Rodger's F[Eα];L-1,ν_{2} is used.
Rodger (1969)^{[9]} showed his method can be applied to evaluating contrasts, or linear hypotheses, across frequencies in 2×J contingency tables, and could be extended to correlated frequencies (both of these are options in the SPS computer program). Analyses of ranked data (correlated or otherwise) are further non-parametric options available in SPS.
On the multivariate front, SPS offers analysis of the one-sample version of Hotelling's (1931)^{[10]} T^{2}. Tables for Hotelling's (1951)^{[11]} Generalized T^{2} have been computed (by R.S. Rodger) for Multivariate Analysis of Variance, but these have not yet been published.
Finally, it is possible to set alternatives to null contrasts in all-numeric terms (no unknown σ required - see {7} above) by doing two-stage sampling. Rodger (1978)^{[12]} has published the relevant tables of the noncentrality parameter D[Eβ];ν_{1},ν_{2}.
Using Rodger's Method with the SPS Computer Program
The Simple, Powerful Statistics (SPS) computer program mentioned in the two previous subsections is a free, Windows-based program that implements a comprehensive set of the important features of Rodger's method. As already noted, SPS makes it relatively easy to use Rodger's method with either independent or correlated means, proportions, or ranks; and two-stage sampling data can be analyzed almost as easily as data from the usual single stage of sampling. SPS can be downloaded from the Simple, Powerful Statistics website (see the external links section immediately below the references). An article about both the computer program and Rodger's method was published in the Journal of Methods and Measurement in the Social Sciences (see reference #13).^{[13]}
The Bottom Line on Rodger's Method
As demonstrated in the 'An Illustration' section above, using the traditional Fα;ν_{1},ν_{2} criterion produces an inevitable loss of statistical power as ν_{1} (the numerator degrees of freedom) increases. In direct contrast, “Rodger’s approach ensures that statistical power does not decline (and even increases) with increasing numerator degrees of freedom” (Delamater, Campese, & Westbrook, 2009; p.228).^{[14]} As another set of researchers put it: “We chose Rodger’s method because it is the most powerful post hoc method available for detecting true differences among groups. This was an especially important consideration in the present experiments in which interesting conclusions could rest on null results” (Williams, Frame, & LoLordo, 1992, p.43).^{[15]} The most definitive evidence for the statistical power advantage that Rodger's method possesses (as compared with eight other multiple comparison procedures) is provided in the 2013 article by Rodger and Roberts (see reference #8).
A corollary that necessarily follows from the power advantage described in the previous paragraph, of course, is that Rodger's method has more power than all other post hoc procedures to detect every conceivable sort of interaction effect. Importantly, though, Rodger's method also permits completely ignoring every factorially-defined interaction (such interactions are widely acknowledged to be difficult to interpret), and encourages focusing instead on "the interactions defined by common sense" including "simple cross-cell contrasts such as μ_{11} - μ_{22} which are easy to interpret ... though they are not the interactions defined in the factorial model" (Rodger, 1974, p.195).
An absolutely unlimited amount of post hoc data snooping is permitted by Rodger's method, and this is accompanied by a guarantee that the long-run expectation of type 1 errors can never exceed Eα (i.e., .05 or .01). This statement is not in need of any testimonial support or empirical verification (nor is it susceptible to disconfirmation), because it is essentially a logical tautology. Both the increased power that Rodger's method possesses, and the impossibility of type 1 error rate inflation, are obtained by using a decision-based (i.e., per contrast) error rate - analogous to the rate used in planned t-tests. As noted at the end of the 'Decision Implications' section, whenever J>2, an infinite number of orthogonal contrasts can be constructed and combined into that many mutually orthogonal sets of J-1 statistical decisions. Rodger's method precludes every one of this potentially infinite number of sets from ever containing more than r rejected null contrasts (see {15} above). Consequently, each and every one of the rejected nulls that are included in whichever set you decide to adopt and interpret will (necessarily, by virtue of the manner in which Rodger's method was conceived and built) have an expected type 1 error rate of either: 1) Eα if r rejected nulls were included in that set, or 2) less than Eα if the decision set you decided upon contains fewer than r null contrast rejections. Rodger's method does its job - which in this context is not to prevent statistical errors from occurring, but to control their rate of occurrence.
A unique feature of Rodger's method is its specification of the 'implied means' (or implied proportions or implied mean ranks) that are logically implied, and mathematically entailed, by the J-1 (number of means minus one) statistical decisions that the user of his method will make. These implied true population means constitute a very precise statement about the outcome of one's research, and assist other researchers in determining the size of effect that their related research ought to seek.
The single best source for finding out more about Rodger's method is his 1974 article (reference #7).
References
- [1] Fisher, R. A. (1918). The correlation between relatives on the supposition of Mendelian inheritance. Transactions of the Royal Society of Edinburgh, 52, 399-433.
- [2] Neyman, J. and Pearson, E. S. (1933). On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society of London, Series A, 231, 289-337.
- [3] Fisher, R. A. (1928). The general sampling distribution of the multiple correlation coefficient. Proceedings of the Royal Society of London, 121, 654-673.
- [4] Rodger, R. S. (1967). Type I errors and their decision basis. British Journal of Mathematical and Statistical Psychology, 20, 51-62.
- [5] Rodger, R. S. (1975a). The number of non-zero, post hoc contrasts from ANOVA and error-rate I. British Journal of Mathematical and Statistical Psychology, 28, 71-78.
- [6] Rodger, R. S. (1975b). Setting rejection rate for contrasts selected post hoc when some nulls are false. British Journal of Mathematical and Statistical Psychology, 28, 214-232.
- [7] Rodger, R. S. (1974). Multiple contrasts, factors, error rate and power. British Journal of Mathematical and Statistical Psychology, 27, 179-198.
- [8] Rodger, R. S. and Roberts, M. (2013). Comparison of power for multiple comparison procedures. Journal of Methods and Measurement in the Social Sciences, 4(1), 20-47.
- [9] Rodger, R. S. (1969). Linear hypotheses in 2xa frequency tables. British Journal of Mathematical and Statistical Psychology, 22, 29-48.
- [10] Hotelling, H. (1931). The generalization of Student's ratio. Annals of Mathematical Statistics, 2, 360-378.
- [11] Hotelling, H. (1951). A generalised T-test and measure of multivariate dispersion. Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, 23-42.
- [12] Rodger, R. S. (1978). Two-stage sampling to set sample size for post hoc tests in ANOVA with decision-based error rates. British Journal of Mathematical and Statistical Psychology, 31, 153-178.
- [13] Roberts, M. (2011). Simple, Powerful Statistics: An instantiation of a better 'mousetrap.' Journal of Methods and Measurement in the Social Sciences, 2(2), 63-79.
- [14] Delamater, A. R., Campese, V., & Westbrook, R. F. (2009). Renewal and spontaneous recovery, but not latent inhibition, are mediated by gamma-aminobutyric acid in appetitive conditioning. Journal of Experimental Psychology: Animal Behavior Processes, 35(2), 224-237. doi:10.1037/a0013293
- [15] Williams, D. A., Frame, K. A., & LoLordo, V. M. (1992). Discrete signals for the unconditioned stimulus fail to overshadow contextual or temporal conditioning. Journal of Experimental Psychology: Animal Behavior Processes, 18(1), 41-55.
External Links
- Math'n'Stats historical material
- Simple, Powerful Statistics (SPS software) (download website for a free, Windows-based computer program that makes using Rodger’s method accessible to all researchers)