Rodger's Method

Background

Anova

Statistical analysis, especially in the sciences, social sciences and related disciplines, usually involves the examination of multiple sets of data (say, J such sets). These are often random samples of observations, or random assignments of patients, people, pigeons, plants, field plots, etc., to 'treatment' conditions, the 'treatments' being drugs, optical illusions, learning conditions, fertilizers, times of planting, etc. The object of analyzing such sets of data is to determine whether (and where) they differ from one another by more than 'random error' can explain away. Typically, each of the J sets is characterized by a single 'statistic.' That is often the mean of the observations in the set, but sometimes it is some other 'statistic' such as the proportion of successes in the set. Whether such 'sample' means (m_j) or proportions (p_j) differ from one another by more than 'random error', in the light of the amount of variation among the observations within sets (or samples), is assessed by the procedure known as Analysis of Variance (or Anova), by various forms of that procedure, or by analogues of the method such as the analysis of contingency tables using the Chi-square approximation to the multinomial distribution.

Although the ideas incorporated in Anova have a quite long history, a paper by Sir Ronald Aylmer Fisher (1918)^[1] started the modern ball rolling. If the true population mean of set j was μ_j, then the hypothesis of interest seemed to be:

H₀: μ₁ = μ₂ = . . . = μ_J {1}

This null hypothesis says that the sets (at least as characterized by their means) do not differ. It seemed natural to accept that hypothesis if the variation among the sample m_j was no more than was reasonably likely in the light of the observed variation within samples.

Type 1 Error Rate

At this point, a controversial issue has reared its head. The Fisherian doctrine, as it developed over the next 40 years, refused to accept null hypotheses when they could not be rejected. After all, they might really be false, but by an amount so small that only huge samples would be large enough to reject them! Some fifteen years later, Jerzy Neyman and Egon Sharpe Pearson (1933)^[2] formalized the criteria for accepting and rejecting null hypotheses. An excellent discussion of some of these matters can be found at the mathnstats website (via the external links section below).

It had been suggested that if the probability of the observed data variation (or more) was 0.05 or less (given H₀ were true), that might be grounds for suspecting that the null hypothesis at {1} was not true. Neyman and Pearson formalized and extended that thinking significantly. One should set up the criterion for 'deciding' whether H₀ was true, before the data were collected (say at a probability α, which might be 0.05 or some similarly small probability). If the observed probability of the data variation turned out to be α or less, then one rejects H₀. That might, of course, be a mistake (called a type 1 error). The probability of such an error is α when H₀ is really true. 'Deciding' that H₀ is true is not 'chipped in stone' - for example, such decisions can be revised by later evidence.

Power

Neyman and Pearson also pointed out that there is another, important side to this matter. H₀ may indeed be false; so we should try to arrange the size of our statistical investigation (by choice of sample size N) to yield a decent probability (say, β) of detecting that falsity. The procedure is to state how far from equal the μ_j might be (or how small the true variation needs to be to lead us to discount it at this stage of investigation). In Anova, that involves the calculation of a noncentrality parameter, then its use in Fisher's Noncentral Variance Ratio Distribution. It is rather poetic that, although Fisher did not accept the Neyman-Pearson methodology, his (1928)^[3] distribution plays a crucial part in the method for setting 'power' β.

Rodger's Approach

Although the classical H₀ plays an important part in the theory of Anova, that part is more theoretical than practical. This is reflected in the fact that there are at least three formulae for the variance-ratio (F_m), from Anova, that have been used to decide whether to accept or reject H₀. The first is the obvious variance-ratio form, but the second shows that it is equivalent to an evaluation of whether a null contrast should be accepted or rejected, and the third shows that F_m is equivalent to the simultaneous evaluation of any (J-1) linearly independent contrasts. A constant sample size N is used throughout here to keep the formulae simple. Unequal N_j can easily be handled, though in real applications unequal N_j raise the risk of misleading results when the true variances (σ²_j) are unequal.

F_m = NΣ(m_j-m.)² /(ν₁s²){2}

F_m = N(Σc_mjm_j)²/(ν₁ s² Σc²_mj){3}

F_m = N ₁v_H (_HC_J _JC^T_H)^-1 _Hv^T₁/(ν₁ s²){4}

in which ν₁ = J-1 is the numerator degrees of freedom for F_m, and (mathematically) there can be no more than H = J-1 linearly independent contrasts across J means.

Any contrast across the μ_j takes the form:

K_h = c₁μ₁ + c₂μ₂ + . . . + c_Jμ_J = 0 {5}

in which the c_j are not all zero, and Σc_j = 0. The contrast in {3} is the maximized contrast, defined as:

c_mj = m_j-m.{6}

where m. is the average of the J values of m_j. The matrix _HC_J in {4} holds the contrast coefficients (the c_j) for any H = J-1 linearly independent contrasts, and the vector ₁v_H holds the sample values of the H contrasts.

Alternatives

When {5} is not true, then what is true is the 'alternative':

K'_h = c₁μ₁ + c₂μ₂ + . . . + c_Jμ_J = δ_h = g_h σ √(Σc²_j) {7}

in which δ_h is the linear noncentrality parameter for this h^th contrast, expressed in terms of g_h, which is a scale-free parameter. The measurement scale (such as inches, centimetres, pounds, kilogrammes) is absorbed by the true (but unknown) standard deviation (σ), and the scale of the contrast is absorbed by √(Σc²_j); thus the value of g is exactly the same for the two (equivalent) contrasts:

(μ₁ + μ₂)/2 - μ₃ = g σ √(1.5){8}

μ₁ + μ₂ - 2μ₃ = g σ √(6){9}

The Noncentral Variance Ratio Distribution uses a quadratic noncentrality parameter, such as:

Δ = Nδ²/(σ² Σc²_j) = Ng²{10}

which makes g an even more interesting quantity.
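The scale-free property of g in {8} and {9}, and the quadratic parameter in {10}, can be checked numerically. The following is a minimal sketch; the population means, σ and the function name `scale_free_g` are hypothetical values and names chosen only for illustration, not taken from Rodger's papers.

```python
import math

def scale_free_g(coeffs, means, sigma):
    """Return g = (sum of c_j * mu_j) / (sigma * sqrt(sum of c_j^2)), as in {7}."""
    delta = sum(c * mu for c, mu in zip(coeffs, means))
    return delta / (sigma * math.sqrt(sum(c * c for c in coeffs)))

# Hypothetical population values, chosen only for illustration.
mu = [10.0, 12.0, 8.0]
sigma = 4.0
N = 11

g_a = scale_free_g([0.5, 0.5, -1.0], mu, sigma)  # contrast written as in {8}
g_b = scale_free_g([1.0, 1.0, -2.0], mu, sigma)  # equivalent contrast as in {9}
quadratic_delta = N * g_a ** 2                    # equation {10}

print(round(g_a, 4), round(g_b, 4), round(quadratic_delta, 3))
```

Both forms of the contrast give the same g, so the quadratic noncentrality parameter Ng² does not depend on how the contrast coefficients are scaled.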

The overall noncentrality parameter (Δ_m) in the Noncentral Variance Ratio Distribution can be written in at least three ways - analogous to F_m at {2} through {4} - as:

Δ_m = N Σ(μ_j-μ.)²/σ²{11}

Δ_m = N (Σc_μj μ_j)²/(σ² Σc²_μj){12}

Δ_m = N ₁δ_H (_HC_J _JC^T_H)^-1 _Hδ^T₁/σ²{13}

Note that there is no division by ν₁ in any of these formulae. The c_μj in {12} are the very theoretical, maximizing coefficients c_μj = μ_j-μ., and the vector ₁δ_H holds the linear noncentrality parameters δ_h for the h^th contrast. Finally, if we use {3} to compute F_h for H = J-1 mutually orthogonal contrasts, then:

F_m = ΣF_h{14}

It follows that if we used the critical value F_crit to decide whether to accept or reject null contrasts, then the maximum number of mutually orthogonal null contrasts we could reject in a research study, by that criterion, is:

r = [F_m/F_crit] ≤ ν₁{15}

in which [] indicates fraction truncation, and r cannot be allowed to exceed the maximum (mathematically) permissible number ν₁.
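Equation {15} amounts to a truncated division with a cap. A minimal sketch follows; the function name `max_rejections` and the input values are hypothetical and serve only to show the truncation and the cap at ν₁.

```python
import math

def max_rejections(f_m, f_crit, nu1):
    """Equation {15}: truncate F_m / F_crit, then cap the result at nu1."""
    return min(math.floor(f_m / f_crit), nu1)

# Hypothetical values for illustration only.
r_truncated = max_rejections(5.1, 2.0, 3)  # 5.1 / 2.0 = 2.55, truncated to 2
r_capped = max_rejections(9.0, 2.0, 3)     # 9.0 / 2.0 = 4.5, capped at nu1 = 3

print(r_truncated, r_capped)
```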

Decision-based Rejection Rates

Usually the researcher is mainly interested in which μ_j differ from which, and H₀ at {1} is of no more than secondary interest (at most); so, if evaluating contrasts post hoc, the researcher will try various contrasts in equation {3}, rejecting the nulls when F_h is large, accepting the null otherwise. To remain logically consistent (contradiction nullifies the whole operation), the researcher ends with decisions for H = J-1 linearly independent contrasts (for simplicity, preferably J-1 mutually orthogonal contrasts), giving the rejected nulls the planned value of g, but subject to change to better fit the data.

But the big question is, what should be the criterion against which F_h should be compared? If the traditional Fα;ν₁,ν₂ is used, either the probability of detecting false null contrasts goes down, down and down as J is increased; or N must go up, up and up as J is increased. Rodger (1967)^[4] argued that it is not the probability (α) of rejecting H₀ in error that should be controlled, rather it is the average rate of rejecting true null contrasts that should be controlled; i.e., we should control the expected rate (Eα) of true null contrast rejection. In the same way it is the average rate of rejecting null contrasts when they are not all true that should be controlled, not the probability (power β) of rejecting H₀ at {1} when it is false. That is to say, we should control the average or expected rate (Eβ).

Tables of F[Eα];ν₁,ν₂ and Δ[Eβ];ν₁,ν₂

To implement the above decision-based error procedure, Rodger (1975a)^[5] published tables of F[0.05];ν₁,ν₂ and F[0.01];ν₁,ν₂. He also (Rodger (1975b)^[6]) published tables of Δ[Eβ];ν₁,ν₂ for Eα = 0.05 and for Eα = 0.01. The values reported are for Eβ = 0.50, 0.70, 0.80, 0.90, 0.95, and 0.99. As an example of what the expectations (or averages) Eα and Eβ represent, consider an investigation with J = 4 samples, each with N = 11 observations. That makes the F_m degrees of freedom ν₁ = J-1 = 3, and ν₂ = J(N-1) = 4×10 = 40. Rodger's (1975a) table reports F[0.05];3,40 = 1.974 and his (1975b) table gives Δ[0.95];3,40 = 9.246. That is not a Δ_m parameter, it is a Δ per contrast: hence Δ_m = ν₁×Δ[Eβ];ν₁,ν₂ = 3×9.246 = 27.738. In the analysis of the data in our illustrative experiment we will reject (see {15} above):

r = [F_m/F[0.05];ν₁,ν₂] = [F_m/1.974] ≤ 3{16}

null contrasts. We can integrate the Central Variance Ratio Distribution to find the probabilities π_r of r = 0, 1, 2, or 3 when all null contrasts are true, and we can integrate the Noncentral Variance Ratio Distribution (with Δ_m = ν₁×Δ[0.95];ν₁,ν₂ = 3×9.246 = 27.738, when Eα = 0.05) to find the probabilities π'_r of r = 0, 1, 2, or 3 if there are ν₁ = 3 mutually orthogonal contrasts possible across the μ_j, each of which has a Δ of 9.246. The procedure is to find the areas under the distribution from F = 0 to F = 1.974 (for π₀ and π'₀), from F = 1.974 to F = 2×1.974 = 3.948 (for π₁ and π'₁), from F = 3.948 to F = 3×1.974 = 5.922 (for π₂ and π'₂) and, finally the area under the distribution from F = 5.922 to F = ∞ (for π₃ and π'₃). The results are given in Table 1 below.

Table 1: Probabilities π_r for Δ_m=0 and π'_r for Δ_m = 27.738

r	π_r	r×π_r	π'_r	r×π'_r
0	0.8667	0.0000	0.0013	0.0000
1	0.1186	0.1186	0.0252	0.0252
2	0.0128	0.0256	0.0956	0.1912
3	0.0019	0.0057	0.8779	2.6337
Sum	1.0000	0.1499	1.0000	2.8501
Sum/3		0.0500		0.9500

The π_r and π'_r are multiplied by r to find the expectation of r because there will be r null rejections made. The formulae are:

Eα = Σr×π_r/ν₁; Eβ = Σr×π'_r/ν₁{17}

and those are reported at the bottom of Table 1. When all possible null contrasts are true, the expected (i.e., average) proportion of ν₁ nulls rejected by the procedure will be exactly Eα = 0.05. When there are ν₁ mutually orthogonal nulls that are false, each with Δ_h = 9.246, then the expected (i.e., average) proportion of ν₁ nulls rejected by the procedure will be exactly Eβ = 0.95.
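The arithmetic behind {17} can be reproduced directly from the tabled probabilities. This sketch simply re-uses the π_r and π'_r values printed in Table 1 (it does not integrate the variance ratio distributions); the function name `expected_rate` is an illustrative choice.

```python
nu1 = 3
pi_null = [0.8667, 0.1186, 0.0128, 0.0019]  # pi_r from Table 1 (Delta_m = 0)
pi_alt = [0.0013, 0.0252, 0.0956, 0.8779]   # pi'_r from Table 1 (Delta_m = 27.738)

def expected_rate(probs, nu1):
    """Equation {17}: expected number of rejections divided by nu1."""
    return sum(r * p for r, p in enumerate(probs)) / nu1

e_alpha = expected_rate(pi_null, nu1)
e_beta = expected_rate(pi_alt, nu1)

print(f"{e_alpha:.4f} {e_beta:.4f}")
```

Because the tabled probabilities are rounded to four decimals, the computed rates agree with 0.05 and 0.95 only to about that precision.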

An Illustration

Suppose we intend to use J = 4 samples, and we would like to detect null contrasts that are false by g² = 0.81 (g = ±0.9) or more at a rate of Eβ = 0.95. We do not know ν₂ yet, but Rodger's (1975b) table shows Δ[0.95];3,∞ = 8.370; so as {10} indicates, we should use sample size:

N ≥ Δ[Eβ];ν₁,∞/g² = 8.370/0.81 = 10.33{18}

If we use N = 11, that will make ν₂ = J(N-1) = 4×10 = 40 and Δ[0.95];3,40 = 9.246. Our Δ = Ng² = 11×0.81 = 8.91 is a little less than 9.246, but we will continue with N = 11, knowing that Eβ is a little less than 0.95 (for the curious, the exact Eβ = 0.942).
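The sample-size step in {18} can be sketched as follows. The tabled value 8.370 is the one quoted above from Rodger (1975b); the function name `min_sample_size` is hypothetical.

```python
import math

def min_sample_size(delta_table, g):
    """Equation {18}: smallest integer N with N * g^2 >= the tabled Delta."""
    return math.ceil(delta_table / g ** 2)

J = 4
g = 0.9
n = min_sample_size(8.370, g)   # Delta[0.95];3,infinity as quoted in the text
nu2 = J * (n - 1)               # denominator degrees of freedom
achieved_delta = n * g ** 2     # Delta = N * g^2, equation {10}

print(n, nu2, round(achieved_delta, 2))
```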

Suppose now that the sample data turn out to be those in Table 2:

Table 2: Illustration Data for J = 4, N = 11, s² = 72

j =	1	2	3	4	Sum
m_j =	15	16	21	24	4×19 = 76
m_j-m. =	-4	-3	2	5	0
(m_j-m.)² =	16	9	4	25	54

These data yield the Anova Source Table 3.

Table 3: Anova Source
Source	d.f.	Sum Squares	MS	F_m	r
Between Samples	3	11×54 = 594	198	2.75	1
Within Samples	40	2880	72
Total	43	3474

Equation {15} makes:

r = [F_m/F[Eα];ν₁,ν₂] = [2.75/1.974] = [1.4] = 1{19}

We may therefore reject one (out of ν₁ = 3) null contrast. Note that the traditional criterion F0.05;3,40 = 2.839 (which is used by Scheffé's procedure) would find nothing 'significant' in these data and, as J increases, the discovery rate by F[Eα];ν₁,ν₂ grows ever better than that of Fα;ν₁,ν₂. (Similarly, the post hoc procedures of Tukey and Newman-Keuls, which use studentized range values, are also unable to declare any differences within these illustration data to be 'significant.')
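The Anova source table and the value of r can be reproduced from the summary statistics in Table 2. This is a minimal sketch using formula {2}; the critical value 1.974 is the figure quoted above from Rodger's (1975a) table, and the variable names are illustrative.

```python
import math

N = 11                    # observations per sample
means = [15, 16, 21, 24]  # sample means m_j from Table 2
s2 = 72                   # within-samples mean square
f_crit = 1.974            # F[0.05];3,40 as quoted from Rodger (1975a)

J = len(means)
nu1, nu2 = J - 1, J * (N - 1)
grand_mean = sum(means) / J
ss_between = N * sum((m - grand_mean) ** 2 for m in means)
ms_between = ss_between / nu1
ss_within = nu2 * s2
f_m = ms_between / s2                    # formula {2}
r = min(math.floor(f_m / f_crit), nu1)   # equation {15}

print(ss_between, ms_between, ss_within, f_m, r)
```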

For different forms of contrasts, i.e., having different values of Σc²_j, the size of sample effects needed to reject the null are:

Critical = √(ν₁×F[Eα];ν₁,ν₂×s²Σc²_j/N){20}

= √(3×1.974×72Σc²_j/11) = √(38.762Σc²_j)

Three examples are shown in Table 4 below.

Table 4: Critical Contrast Values for Null Rejection

Σc²_j	Critical	Example
2	8.9	m₄ - m₁ = 9
4	12.5	m₄ + m₃ - m₂ - m₁ = 14
6	15.3	2m₄ - m₂ - m₁ = 17
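The critical values from {20} can be computed for the three example contrasts as a check. In this sketch the function name `critical_contrast` is hypothetical. The unrounded results come out slightly below the figures in Table 4, which appear to have been rounded upward to one decimal place.

```python
import math

N, s2, nu1, f_crit = 11, 72, 3, 1.974
means = [15, 16, 21, 24]

def critical_contrast(coeffs):
    """Equation {20}: smallest sample contrast value that rejects the null."""
    sum_c2 = sum(c * c for c in coeffs)
    return math.sqrt(nu1 * f_crit * s2 * sum_c2 / N)

examples = [
    [-1, 0, 0, 1],    # m4 - m1
    [-1, -1, 1, 1],   # m4 + m3 - m2 - m1
    [-1, -1, 0, 2],   # 2*m4 - m2 - m1
]
results = []
for coeffs in examples:
    value = sum(c * m for c, m in zip(coeffs, means))
    results.append((value, critical_contrast(coeffs)))
    print(value, round(results[-1][1], 3), value > results[-1][1])
```

Each example contrast exceeds its own critical value, although by {15} no more than r = 1 of any mutually orthogonal set can be rejected.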

If it made reasonable scientific sense (in terms of the subject matter studied) the data seem to suggest that μ₁ = μ₂ < μ₃ = μ₄ and three orthogonal contrasts saying that are those in Table 5.

Table 5: Illustration Decision Set
h	Contrast	Sample Value	F_h	δ_h
1	μ₂-μ₁	16-15 = 1	11(1)²/(3×72×2)=0.025	0
2	μ₄-μ₃	24-21 = 3	11(3)²/(3×72×2)=0.229	0
3	μ₄+μ₃-μ₂-μ₁	24+21-16-15 = 14	11(14)²/(3×72×4)=2.495	0.9σ×2
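The F_h column can be recomputed from formula {3}, and the additivity in {14} checked at the same time. This is a sketch with illustrative function names; it uses only the Table 2 data and the Table 5 contrast coefficients.

```python
N, s2, nu1 = 11, 72, 3
means = [15, 16, 21, 24]
contrasts = [
    [-1, 1, 0, 0],    # mu2 - mu1
    [0, 0, -1, 1],    # mu4 - mu3
    [-1, -1, 1, 1],   # mu4 + mu3 - mu2 - mu1
]

def f_contrast(coeffs):
    """Formula {3} applied to one contrast."""
    value = sum(c * m for c, m in zip(coeffs, means))
    return N * value ** 2 / (nu1 * s2 * sum(c * c for c in coeffs))

def are_orthogonal(a, b):
    """Two contrasts are orthogonal (equal N) when their coefficient products sum to 0."""
    return sum(x * y for x, y in zip(a, b)) == 0

f_h = [f_contrast(c) for c in contrasts]
all_orthogonal = all(
    are_orthogonal(contrasts[i], contrasts[k])
    for i in range(len(contrasts)) for k in range(i + 1, len(contrasts))
)

print([round(f, 3) for f in f_h], round(sum(f_h), 2), all_orthogonal)
```

The three F_h values sum to F_m = 2.75, as {14} requires for a mutually orthogonal set.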

For the curious, the maximizing contrast has the coefficients shown from Table 2 as: c_mj = m_j-m. = -4, -3, 2, 5; so formula {3} gives:

F_m = N(Σc_mjm_j)²/(ν₁ s² Σc²_mj) = 11(54)²/(3×72×54) = 32076/11664 = 2.75{21}

and, using our decision set in Table 5, formula {4} gives:

F_m = N ₁v_H (_HC_J _JC^T_H)^-1 _Hv^T₁/(ν₁ s²){22}

=N\ _{1}v_{3}\left(_{3}C_{4}\ _{4}C_{3}^{T}\right)^{-1}\ _{3}v_{1}^{T}\ /(3\times 72)

=11\ _{1}v_{3}\left({\begin{bmatrix}-1&1&0&0\\0&0&-1&1\\-1&-1&1&1\end{bmatrix}}{\begin{bmatrix}-1&0&-1\\1&0&-1\\0&-1&1\\0&1&1\end{bmatrix}}\right)^{-1}\ _{3}v_{1}^{T}/216

=11\ _{1}v_{3}{\begin{bmatrix}2&0&0\\0&2&0\\0&0&4\end{bmatrix}}^{-1}\ _{3}v_{1}^{T}/216

=11{\begin{bmatrix}1&3&14\end{bmatrix}}{\begin{bmatrix}1/2&0&0\\0&1/2&0\\0&0&1/4\end{bmatrix}}{\begin{bmatrix}1\\3\\14\end{bmatrix}}/216

=11{\begin{bmatrix}1/2&3/2&14/4\end{bmatrix}}{\begin{bmatrix}1\\3\\14\\\end{bmatrix}}/216

=11\times 54\ /\ 216=2.75

One can see the simplicity here of inverting the product of H = 3 mutually orthogonal contrasts, whose product is a diagonal matrix. Furthermore, non-orthogonal but linearly independent contrasts not only yield a product matrix that is more difficult to invert, they also make interpretation more complicated since differentially weighted modifications may be desirable for the δ_h.
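Formula {4} can also be evaluated for a set of contrasts that is linearly independent but not orthogonal, which illustrates that F_m does not depend on the particular set chosen. The sketch below uses exact fractions and a small hand-written Gauss-Jordan inverse to stay within the Python standard library; the helper names and the non-orthogonal set (each mean compared with the first) are illustrative assumptions.

```python
from fractions import Fraction

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def inverse(m):
    """Gauss-Jordan inverse of a small non-singular square matrix, in exact fractions."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def f_m_from_contrasts(C, means, n, s2):
    """Formula {4}: F_m = N v (C C^T)^-1 v^T / (nu1 s^2), with v = m C^T."""
    nu1 = len(C)
    v = matmul([means], transpose(C))
    quad = matmul(matmul(v, inverse(matmul(C, transpose(C)))), transpose(v))[0][0]
    return Fraction(n) * quad / (nu1 * s2)

means = [15, 16, 21, 24]
orthogonal_set = [[-1, 1, 0, 0], [0, 0, -1, 1], [-1, -1, 1, 1]]   # Table 5
non_orthogonal_set = [[-1, 1, 0, 0], [-1, 0, 1, 0], [-1, 0, 0, 1]]  # illustrative

f_orth = f_m_from_contrasts(orthogonal_set, means, 11, 72)
f_non_orth = f_m_from_contrasts(non_orthogonal_set, means, 11, 72)
print(float(f_orth), float(f_non_orth))
```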

Decision Implications

It is obviously true that the following three statements contradict one another:

X:μ₆ - μ₅ = 0; Y:μ₇ - μ₆ = 0; Z:μ₇ - μ₅ > 0{23}

The old adage is that two things (e.g., μ₅ and μ₇) that are equal to the same thing (μ₆) are equal to one another; so the three statements cannot possibly all be true - no matter what statistical tests on their sample estimates say! Nevertheless, this type of contradiction (explicitly or implicitly) occurs often enough in reports of statistical analysis to be quite an embarrassment! The three comparisons in {23} are not linearly independent of one another because, algebraically:

Z = X + Y{24}
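The linear dependence in {24} can be confirmed by computing the rank of the coefficient matrix of the three comparisons. The following is a minimal sketch using exact fractions; the function name `rank` and the ordering of the coefficients as (μ₅, μ₆, μ₇) are illustrative choices.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a small matrix by Gaussian elimination in exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][col] / m[rk][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rk])]
        rk += 1
        if rk == len(m):
            break
    return rk

# Coefficients on (mu5, mu6, mu7) for the comparisons in {23}.
X = [-1, 1, 0]   # mu6 - mu5
Y = [0, -1, 1]   # mu7 - mu6
Z = [-1, 0, 1]   # mu7 - mu5

rank_xy = rank([X, Y])
rank_xyz = rank([X, Y, Z])
print(rank_xy, rank_xyz)
```

Adding Z to X and Y does not raise the rank above 2, so the three comparisons cannot be treated as three independent decisions.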

Rodger's method precludes drawing logically contradictory conclusions from any set of data by requiring that each statistical decision be linearly independent of every other one. But there is a very important, positive result to be extracted from the implication of decisions. In algebraic terms, a set of H decisions for contrasts (such as those in Table 5) can be expressed in matrix form as:

₁μ_J _JC^T_H = ₁δ_H{25}

If we could only invert _JC^T_H (i.e., divide it away somehow) we could find out what our H decisions say about all the μ_j - quite an achievement in terms of saying what the investigator believes the investigation has shown, without the 'noise' of sample error - and a valuable guide to future research on the subject matter. But _JC^T_H is not even a square matrix; so its regular inverse does not exist! Happily there exist 'generalized inverses' that can be used in some occasions of this sort - and this 'implication need' is just such an occasion! The result is Rodger's 'implication equation':

₁μ_J = ₁δ_H (_HC_J _JC^T_H)^-1 _HC_J{26}

From the decision set in Table 5:

₁μ₄ = ₁δ₃ (₃C₄ ₄C^T₃)^-1 ₃C₄{27}

=\ _{1}\delta _{3}{\begin{bmatrix}1/2&0&0\\0&1/2&0\\0&0&1/4\end{bmatrix}}{\begin{bmatrix}-1&1&0&0\\0&0&-1&1\\-1&-1&1&1\end{bmatrix}}

={\begin{bmatrix}0&0&1.8\sigma \end{bmatrix}}{\begin{bmatrix}-1/2&1/2&0&0\\0&0&-1/2&1/2\\-1/4&-1/4&1/4&1/4\end{bmatrix}}

={\begin{bmatrix}-.45\sigma &-.45\sigma &.45\sigma &.45\sigma \end{bmatrix}}

These are the values of μ_j-μ. (not the μ_j alone) implied by the decisions in Table 5. This procedure produces results that would not be clearly seen otherwise when there are more null rejections.

The results in {27} yield an overall, quadratic noncentrality parameter:

Δ_μ = NΣ(μ_j-μ.)²/σ² = 11×0.81σ²/σ² = 8.91{28}

This equals Ng² because only one null contrast was rejected and given δ₃ = 0.9 σ√4.
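The implication equation {26} and the resulting Δ_μ in {28} can be reproduced for the Table 5 decision set. This sketch works in units of σ (so δ₃ = 1.8 is entered as the fraction 9/5) and uses exact fractions with a small hand-written inverse; the helper names are illustrative, not part of SPS or of Rodger's papers.

```python
from fractions import Fraction

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def inverse(m):
    """Gauss-Jordan inverse of a small non-singular square matrix, in exact fractions."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

C = [[-1, 1, 0, 0], [0, 0, -1, 1], [-1, -1, 1, 1]]    # Table 5 contrasts
delta = [[Fraction(0), Fraction(0), Fraction(9, 5)]]  # decisions, in units of sigma

# Equation {26}: implied (mu_j - mu.) = delta (C C^T)^-1 C
implied = matmul(matmul(delta, inverse(matmul(C, transpose(C)))), C)[0]

N = 11
delta_mu = N * sum(x * x for x in implied)  # equation {28}; sigma^2 cancels

print([float(x) for x in implied], float(delta_mu))
```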

There are ways of modifying the g_h to reflect the sample observations more closely, and the SPS computer program (discussed below) uses two of these procedures by default. SPS also automatically provides two separate statistics that assess the fit of the implied μ_j-μ. to the sample means. One is the correlation coefficient between the implied true population means and the sample means; the other is the F fit residual (i.e., the part of the omnibus F_m value that is not accounted for by the rejected null contrasts, of which there can be as many as r, among the J-1 statistical decisions). A high degree of 'fit' between the sample and implied means is a necessary condition for concluding that a particular set of J-1 decisions is the scientifically optimal one, but it can never be a sufficient condition for drawing that conclusion. When J > 2, the overall Anova between-groups variance can theoretically be partitioned into independent components in an infinite number of ways (i.e., literally that many sets of J-1 mutually orthogonal contrasts, infinitesimally different from one another, could be constructed). The statistical fit between the sample means one began with and the implied true population means that are mathematically entailed by any specific set of J-1 decisions matters. But it is the scientific sense those J-1 statistical decisions make that needs to be of ultimate concern.

Further Applications

Of course, the methodology shown here can be applied to other forms of Anova, e.g., to Randomized Blocks data, but Rodger (1974)^[7] has argued that data collected for a Factorial Design analysis (e.g., for I×J×K factors) are better analyzed by his method in a one-way Anova for L = I×J×K samples. F[Eα];L-1,ν₂ does not suffer the loss of effect detectability as L is increased that occurs when the traditional Fα;L-1,ν₂ is used; so 'common sense' interactions (by simple contrasts across the m_ijk) have a reasonable chance of being detected. The 2013 article by Rodger and Roberts^[8] shows clearly how effect detectability by the traditional standard (Fα;L-1,ν₂) becomes poorer and poorer as L is increased, which does not happen if Rodger's F[Eα];L-1,ν₂ is used.

Rodger (1969)^[9] showed that his method can be applied to evaluating contrasts, or linear hypotheses, across frequencies in 2×J contingency tables, and could be extended to correlated frequencies (both of these are options in the SPS computer program). Analyses of ranked data (correlated or otherwise) are further non-parametric options available in SPS.

On the multivariate front, SPS offers analysis of the one-sample version of Hotelling's (1931)^[10] T². Tables for Hotelling's (1951)^[11] Generalized T² have been computed (by R.S. Rodger) for Multivariate Analysis of Variance, but these have not yet been published.

Finally, it is possible to set alternatives to null contrasts in all-numeric terms (no unknown σ required - see {7} above) by doing two-stage sampling. Rodger (1978)^[12] has published the relevant tables of the noncentrality parameter D[Eβ];ν₁,ν₂.

Using Rodger's Method with the SPS Computer Program

The Simple, Powerful Statistics (SPS) computer program mentioned in the two previous subsections is a free, Windows-based one which implements a comprehensive set of the important features of Rodger's method. As already noted, SPS makes it relatively easy to use Rodger's method with either independent or correlated means, proportions, or ranks; and analyzing two-stage sampling data can be done almost as easily as doing that with the usual single stage of sampling. SPS can be downloaded from the Simple, Powerful Statistics website (see the external links section immediately below the references). An article about both the computer program and Rodger's method was published in the Journal of Methods and Measurement in the Social Sciences, and that can be downloaded by clicking the link contained in reference #13.^[13]

The Bottom Line on Rodger's Method

As demonstrated in the 'An Illustration' section above, using the traditional Fα;ν₁,ν₂ criterion produces an inevitable loss of statistical power as ν₁ (the numerator degrees of freedom) increases. In direct contrast, “Rodger’s approach ensures that statistical power does not decline (and even increases) with increasing numerator degrees of freedom” (Delamater, Campese, & Westbrook, 2009; p.228).^[14] As another set of researchers put it: “We chose Rodger’s method because it is the most powerful post hoc method available for detecting true differences among groups. This was an especially important consideration in the present experiments in which interesting conclusions could rest on null results” (Williams, Frame, & LoLordo, 1992, p.43).^[15] The most definitive evidence for the statistical power advantage that Rodger's method possesses (as compared with eight other multiple comparison procedures) is provided in the 2013 article by Rodger and Roberts (downloadable with the link in reference #8).

A corollary that necessarily follows from the power claims quoted in the previous paragraph is that Rodger's method has more power than all other post hoc procedures to detect every conceivable sort of interaction effect. Importantly, though, Rodger's method also permits completely ignoring factorially defined interactions (which are widely acknowledged to be difficult to interpret), and encourages focusing instead on "the interactions defined by common sense" including "simple cross-cell contrasts such as μ₁₁ - μ₂₂ which are easy to interpret ... though they are not the interactions defined in the factorial model" (Rodger, 1974, p.195).

An absolutely unlimited amount of post hoc data snooping is permitted by Rodger's method, and this is accompanied by a guarantee that the long-run expectation of type 1 errors can never exceed Eα (i.e., .05 or .01). This statement is not in need of any testimonial support or empirical verification (nor is it susceptible to disconfirmation), because it is essentially a logical tautology. Both the increased power that Rodger's method possesses, and the impossibility of type 1 error rate inflation, are obtained by using a decision-based (i.e., per contrast) error rate, analogous to the rate used in planned t-tests. As noted at the end of the 'Decision Implications' section, whenever J > 2, an infinite number of orthogonal contrasts can be constructed and combined into that many mutually orthogonal sets of J-1 statistical decisions. Rodger's method precludes every one of this potentially infinite number of sets from ever containing more than r rejected null contrasts (see {15} above). Consequently, each and every one of the rejected nulls included in whichever set you decide to adopt and interpret will (necessarily, by virtue of the manner in which Rodger's method was conceived and built) have an expected type 1 error rate of either: 1) Eα if r rejected nulls were included in that set, or 2) less than Eα if the decision set you decided upon contains fewer than r null contrast rejections. Rodger's method does its job, which in this context is not to prevent statistical errors from occurring, but to control their rate of occurrence.

A unique feature of Rodger's method is its specification of the 'implied means' (or implied proportions or implied mean ranks) that are logically implied, and mathematically entailed, by the J-1 (number of means minus one) statistical decisions that the user of his method will make. These implied true population means constitute a very precise statement about the outcome of one's research, and assist other researchers in determining the size of effect that their related research ought to seek.

The single best source for finding out more about Rodger's method is his 1974 article (reference #7).

References

[1] Fisher, R. A. (1918). The Correlation Between Relatives on the Supposition of Mendelian Inheritance. Transactions of the Royal Society of Edinburgh, 52, 399-433.
[2] Neyman, J. and Pearson, E. S. (1933). On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society of London, Series A, 231, 289-337.
[3] Fisher, R. A. (1928). The general sampling distribution of the multiple correlation coefficient. Proceedings of the Royal Society of London, 121, 654-673.
[4] Rodger, R. S. (1967). Type I errors and their decision basis. British Journal of Mathematical and Statistical Psychology, 20, 51-62.
[5] Rodger, R. S. (1975a). The number of non-zero, post hoc contrasts from ANOVA and error-rate I. British Journal of Mathematical and Statistical Psychology, 28, 71-78.
[6] Rodger, R. S. (1975b). Setting rejection rate for contrasts selected post hoc when some nulls are false. British Journal of Mathematical and Statistical Psychology, 28, 214-232.
[7] Rodger, R. S. (1974). Multiple contrasts, factors, error rate and power. British Journal of Mathematical and Statistical Psychology, 27, 179-198.
[8] Rodger, R. S. and Roberts, M. (2013). Comparison of power for multiple comparison procedures. Journal of Methods and Measurement in the Social Sciences, 4(1), 20-47.
[9] Rodger, R. S. (1969). Linear hypotheses in 2xa frequency tables. British Journal of Mathematical and Statistical Psychology, 22, 29-48.
[10] Hotelling, H. (1931). The generalization of Student's ratio. Annals of Mathematical Statistics, 2, 360-378.
[11] Hotelling, H. (1951). A generalised T-test and measure of multivariate dispersion. Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, 23-42.
[12] Rodger, R. S. (1978). Two-stage sampling to set sample size for post hoc tests in ANOVA with decision-based error rates. British Journal of Mathematical and Statistical Psychology, 31, 153-178.
[13] Roberts, M. (2011). Simple, Powerful Statistics: An instantiation of a better 'mousetrap.' Journal of Methods and Measurement in the Social Sciences, 2(2), 63-79.
[14] Delamater, A. R., Campese, V., & Westbrook, R. F. (2009). Renewal and spontaneous recovery, but not latent inhibition, are mediated by gamma-aminobutyric acid in appetitive conditioning. Journal of Experimental Psychology: Animal Behavior Processes, 35(2), 224-237. doi:10.1037/a0013293
[15] Williams, D. A., Frame, K. A., & LoLordo, V. M. (1992). Discrete signals for the unconditioned stimulus fail to overshadow contextual or temporal conditioning. Journal of Experimental Psychology: Animal Behavior Processes, 18(1), 41-55.

External Links

Math'n'Stats historical material
Simple, Powerful Statistics (SPS software) (download website for a free, Windows-based computer program that makes using Rodger’s method accessible to all researchers)
