Instructional designs

Considering P-values, Confidence Intervals & Analysis of Variance

The process of data collection for this dissertation begins with administering pretest and posttest surveys in such a way that responses generate statistics in the form of interval and ratio data. “A statistic is defined as a numerical quantity, such as the mean, calculated in a sample” (Lane, 2008, para. 1). The researcher then examines the responses to these surveys and calculates a mean score, which describes the statistical average within the data set. Using a T-test, this researcher will use means, standard deviations, normal distributions and confidence intervals in an attempt to generate a T-curve, the results of which will lead to either rejecting or failing to reject the null hypothesis.

The P-value

To decide whether the null hypothesis can be rejected in favor of the alternative hypothesis, the researcher will calculate the p-value. A sufficiently small p-value indicates that the change in the studied phenomenon is unlikely to be due to sampling variation alone, which in turn supports generalizing the finding to a larger population.

“In hypothesis testing, the p-value is the probability of obtaining a statistic as different from or more different from the parameter specified in the null hypothesis as the statistic obtained in the experiment. The probability value is computed assuming the null hypothesis is true. If the probability value is below the significance level then the null hypothesis is rejected” (Lane, 2008, para. 1).
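The definition quoted above can be made concrete with a small sketch. It does not use the T-test itself; instead it substitutes an exact sign-flip (permutation) test, which follows the definition directly: assuming the null hypothesis of no change, each posttest-minus-pretest difference is equally likely to be positive or negative, and the p-value is the share of all sign patterns that produce a total change at least as extreme as the one observed. The eight difference scores and the function name are hypothetical and serve only as illustration.

```python
from itertools import product


def sign_flip_p_value(differences):
    """Two-sided exact sign-flip p-value for paired difference scores."""
    observed = abs(sum(differences))
    as_extreme = 0
    total = 0
    # Enumerate every way of assigning + or - to each difference.
    for signs in product((1, -1), repeat=len(differences)):
        flipped = abs(sum(s * d for s, d in zip(signs, differences)))
        if flipped >= observed - 1e-12:  # small tolerance for float inputs
            as_extreme += 1
        total += 1
    return as_extreme / total


# Hypothetical posttest-minus-pretest differences for eight participants.
differences = [1, 1, 2, 1, 0, 1, 2, 2]
p_value = sign_flip_p_value(differences)
print(f"p = {p_value:.4f}")  # p = 0.0156
```

With a conventional significance level of 0.05, a p-value this small would lead to rejecting the null hypothesis for these made-up scores.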

Interpreting the mean against the normal curve requires that the data be approximately normally distributed, so that “predictable percentages of the population lie within any given portion of the curve” (Leedy & Ormrod, 2005, p. 255). According to Janeba (1999), a normal distribution is a bell curve in which about 68 percent of the data fall within one standard deviation of the mean, about 95 percent fall within two standard deviations of the mean and about 99.7 percent fall within three standard deviations of the mean. Thus, the normal curve assists this researcher in determining how closely the sample’s responses on the pretest and posttest reflect the overall population.
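The percentages attributed to Janeba (1999) can be checked numerically. The following is a minimal sketch using the NormalDist class from Python's standard statistics module; the helper name share_within is an assumption introduced for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
```

The commonly quoted figures of 68, 95 and 99.7 percent are rounded versions of these values.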

Analysis of Variance

Despite the term, analysis of variance (ANOVA) is not concerned with differences between variances. Rather, according to Lethen (1996), it deals with differences between the means of groups. “Analysis of variance assumes normal distributions and homogeneity of variance. Normal distributions are a family of distributions that have the shape shown below.

Normal distributions are symmetric with scores more concentrated in the middle than in the tails. They are defined by two parameters: the mean (μ) and the standard deviation (σ). A parameter is a numerical quantity measuring some aspect of a population of scores. For example, the mean is a measure of central tendency” (Lane, 2008, para. 3).

Analysis of variance (ANOVA) allows us to compare more than two populations or treatments by testing whether the means of several populations are all equal, or whether the means under several treatments applied to one population are all equal.

“The key statistic in ANOVA is the F-test of difference of group means, testing if the means of the groups formed by values of the independent variable (or combinations of values for multiple independent variables) are different enough not to have occurred by chance. If the group means do not differ significantly then it is inferred that the independent variable(s) did not have an effect on the dependent variable” (Garson, 1998/2008, para. 6).
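To illustrate the F-test described in the quotation, the sketch below computes the one-way ANOVA F statistic as the ratio of between-group to within-group mean squares. The three groups of scores and the function name one_way_f are hypothetical; judging significance would additionally require an F table or statistical software, which the sketch leaves out.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores for three treatment groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")  # F(2, 6) = 13.00
```

A large F value indicates that the group means differ by more than the spread within the groups would suggest.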

For the purposes of this researcher’s applied dissertation, only one intervention will be implemented, on a single population; therefore, an ANOVA will not be employed. While this researcher will not be comparing two different approaches to teaching with one or more sample populations, this applied dissertation does intend to generalize the data collected from the sample population to the general population. In order to do so, examining the confidence interval is important.

Confidence Interval

The confidence interval describes how precisely one’s sample results reflect the general population.

“Confidence intervals give us an estimate of the amount of error involved in our data. They tell us about the precision of the statistical estimates (e.g., means, standard deviations, correlations) we have computed. Confidence intervals are related to the concept of the power. The larger the confidence interval the less power a study has to detect differences between treatment conditions in experiments or between groups of respondents in survey research” (Becker, 1999, pt. 1).

For example, one might survey the students in a given classroom, asking, "Do you prefer lecture or experiential learning?" Sixty percent answer Experiential and 40 percent answer Lecture. Does this mean 60% of all students prefer experiential learning? Is there any way to be confident that the actual proportion of people choosing Experiential Learning will be within some interval around the 60% found in the sample? Using the standard normal-approximation formula for a proportion, if the survey is based on a sample of 100 persons, one can be 90% confident that the actual proportion of those who prefer experiential learning will be between 52% and 68%. The larger the sample, the more closely it represents the general population and so the narrower the confidence interval.
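The interval in this example can be reproduced with a short calculation. The sketch below assumes the normal-approximation (Wald) interval for a proportion, which is one common choice and not necessarily the formula behind the figures above; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(p_hat, n, confidence=0.90):
    """Normal-approximation (Wald) confidence interval for a sample proportion."""
    # Two-sided critical value, about 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = proportion_confidence_interval(0.60, 100, 0.90)
print(f"n = 100: {low:.1%} to {high:.1%}")  # n = 100: 51.9% to 68.1%

low_400, high_400 = proportion_confidence_interval(0.60, 400, 0.90)
print(f"n = 400: {low_400:.1%} to {high_400:.1%}")  # n = 400: 56.0% to 64.0%
```

Quadrupling the sample from 100 to 400 halves the width of the interval, which illustrates why larger samples yield narrower confidence intervals.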

Employing a T-test, this researcher will study a single intervention administered to a single population. Using means, normal distributions and confidence intervals, a T-curve will be generated, leading to either rejecting or failing to reject the null hypothesis. Accordingly, the p-value will be calculated to support generalizing the data collected in the study to a broader population.

Analyzing Numerical Outcomes Using Means, Standard Deviations, Normal Distributions and Confidence Intervals

Using a pre-post course survey, a researcher may collect data that will assist in determining whether the course itself was effective in influencing course outcomes. Using a T-test, this researcher will use means, standard deviations, normal distributions and confidence intervals in an attempt to generate a T-curve, the results of which will lead to either rejecting or failing to reject the null hypothesis. If the null hypothesis is rejected in favor of the alternative hypothesis, the researcher will have evidence that the change in the studied phenomenon was due to more than sampling variation. The process of data collection for this dissertation begins with administering pretest and posttest surveys in such a way that responses generate interval and ratio data. The researcher then examines the responses to these surveys and calculates a mean score, which describes the statistical average within the data set.
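As a rough sketch of how such a pretest and posttest comparison might proceed, the following computes a paired T statistic from the mean and standard deviation of the difference scores. The scores are hypothetical, the function name is an assumption, and the critical value of 2.365 is the standard two-sided table value for 7 degrees of freedom at the 0.05 level; a different sample size would need a different table entry.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(pre, post):
    """Return (t, degrees of freedom) for paired pretest/posttest scores."""
    differences = [after - before for before, after in zip(pre, post)]
    n = len(differences)
    # Standard error of the mean difference (undefined if all differences are equal).
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1


# Hypothetical survey scores for the same eight participants.
pretest = [3, 4, 2, 5, 3, 4, 3, 2]
posttest = [4, 5, 4, 6, 3, 5, 5, 4]
t_stat, df = paired_t_statistic(pretest, posttest)

CRITICAL_T = 2.365  # two-sided, alpha = 0.05, df = 7 (from a standard t table)
reject_null = abs(t_stat) > CRITICAL_T
print(f"t({df}) = {t_stat:.2f}, reject null: {reject_null}")
# t(7) = 5.00, reject null: True
```

For these made-up scores the statistic exceeds the critical value, so the null hypothesis of no change would be rejected.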

Leedy & Ormrod (2005) describe the mean, also known as the arithmetic mean, as a computation of the central tendency of a set of scores, here the scores derived from participants’ pretest and posttest responses. For the purposes of this study, determining the mean responses to the pretest and posttest questions will assist the researcher in interpreting the data collected. For instance, if most students/counselors respond “yes” to the question, “Do you have a better understanding of the term psychobiology than you had prior to taking the course?” the researcher will be able to draw conclusions based on that specific response for the sample population and perhaps draw conclusions about counselors in general.

In order for the researcher to interpret the mean of the data set, the concepts of normal distribution, standard deviation and confidence intervals must be considered. Interpreting the mean against the normal curve requires that the data be approximately normally distributed, so that “predictable percentages of the population lie within any given portion of the curve” (Leedy & Ormrod, 2005, p. 255). According to Janeba (1999), a normal distribution is a bell curve in which about 68 percent of the data fall within one standard deviation of the mean, about 95 percent fall within two standard deviations of the mean and about 99.7 percent fall within three standard deviations of the mean. Thus, the normal curve assists this researcher in determining how closely the sample’s responses on the pretest and posttest reflect the overall population. To further illustrate the relationship between the mean and the standard deviation on a curve of normal distribution, one must consider the context of the mean. If one were to say the mean score is 50, the number would have little meaning unless one identifies whether the score is, for instance, 50 out of 50 or 50 out of 100. “If the mean and standard deviation of a normal distribution are known, it is relatively easy to figure out the percentile rank of a person obtaining a specific score” (Lane, 2008, pt. 1). The standard deviation is a measure of variability within the data set. The confidence interval describes how precisely one’s sample results reflect the general population.
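The percentile-rank idea in the Lane quotation above can be sketched in a few lines. The mean of 50 echoes the example in the text, while the standard deviation of 10 and the individual score of 60 are assumptions chosen purely for illustration.

```python
from statistics import NormalDist


def percentile_rank(score, mean, standard_deviation):
    """Percentage of a normal population scoring at or below the given score."""
    return 100 * NormalDist(mean, standard_deviation).cdf(score)


# A score one standard deviation above an assumed mean of 50 (SD = 10).
print(round(percentile_rank(60, 50, 10), 1))  # 84.1
```

A score of 60 therefore sits at roughly the 84th percentile under these assumptions, which is consistent with about 68 percent of scores lying within one standard deviation of the mean.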


Employing a T-test, this researcher will use means, standard deviations, normal distributions and confidence intervals in an attempt to generate a T-curve, the results of which will lead to either rejecting or failing to reject the null hypothesis. By ascertaining the mean on a normal distribution, with an estimated standard deviation, this researcher will be able to calculate percentages of similarities and dissimilarities within the sample population with a confidence interval of moderate width. This will allow the researcher to make some inferences as to how the data from the sample population may be generalized to a broader population.

The specific formulas to calculate the mean, standard deviation and confidence intervals are beyond the scope of this paper. Suffice it to say, they are not overly complex, and the calculations can be done using a standard home calculator or a basic statistics software package. Certainly, these statistical calculations will be appropriate for analyzing the quantitative data collected from the pretest, posttest and peer-assessment, which will be administered as part of the evaluation of the training central to this applied dissertation.

Using Qualitative and Quantitative Research Designs for Curriculum Evaluation

Blending Quantitative and Qualitative Methodologies

“Conducting mixed methods research involves collecting, analyzing, and interpreting quantitative and qualitative data in a single study or in a series of studies that investigate the same underlying phenomenon” (Onwuegbuzie & Leech, 2006, p. 1). Quantitative and qualitative methodologies each have their own limits and benefits. “Critics of quantitative studies concluded that these studies restrict our views of human beings because they concentrate on repetitive and predictable aspects of human behavior” while, “on the other hand, qualitative research may appear to be fraught with subjectivism” (Schulze, 2003, pp. 2-20).

Using a mixed methodology allows the researcher to counterbalance the shortcomings of each approach without compromising the benefits each can offer. “Because of its logical and intuitive appeal, providing a bridge between the qualitative and quantitative paradigms, an increasing number of researchers are utilizing mixed methods research to undertake their studies” (Onwuegbuzie & Leech, 2006, p. 474). There are five reasons, described in the literature, for blending methodologies:

“(a) triangulation (i.e., seeking convergence and corroboration of findings from different methods that study the same phenomenon); (b) complementarity (i.e., seeking elaboration, illustration, enhancement, and clarification of the results from one method with results from the other method); (c) initiation (i.e., discovering paradoxes and contradictions that lead to a re-framing of the research question/questions); (d) development (i.e., using the results from one method to help inform the other method); and (e) expansion (i.e., seeking to expand the breadth and range of the investigation by using different methods for different inquiry components)” (Onwuegbuzie & Leech, 2006, p. 480).

Reliability and Validity

A researcher must endeavor to have well-established protocols for collecting and organizing data. Whether the data examined are derived from a qualitative or quantitative approach, the reliable and valid replication of the study is essential to bolster the study’s credibility. When using a qualitative, quantitative or mixed approach, “researchers need to test and demonstrate that their studies are credible” (Golafshani, 2003, pp. 597-607). To this end, the researcher must incorporate the rules of reliability and validity in ways that can be useful when employing a mixed methodology. Quantitative and qualitative methodologies approach this endeavor in different ways. Quantitative researchers posit that “if the results of a study can be reproduced under a similar methodology, then the research instrument is considered to be reliable. Validity determines whether the research truly measures that which it was intended to measure or how truthful the research results are” (Golafshani, 2003, pp. 597-607). Although reliability and validity are treated separately in quantitative studies, these terms are not viewed separately in qualitative research. Instead, terminology that encompasses both, such as credibility, transferability, and trustworthiness, is used (Golafshani, 2003, pp. 597-607).

When a researcher is also the instructor of the course, there is an inherent subjectivity that must be addressed in the development of the research project and in the collection and analysis of the data. Leedy & Ormrod (2000) admit it is impossible to be totally objective when analyzing data. Jahn and Dunne (1997) go on to suggest excessive objectivity could limit scientific and cultural relevance. “While the credibility in quantitative research depends on instrument construction, in qualitative research the researcher is the instrument” (Golafshani, 2003, pp. 597-607). Nonetheless, Leedy and Ormrod do emphasize a diligent endeavor towards meeting a standard of "rigorous subjectivity" (Leedy & Ormrod, 2000, p. 138).

One’s research questions are central to linking data analysis to study design. “The development of research questions and data analysis procedures in mixed method studies should occur logically and sequentially. Quantitative research questions usually represent one of the following three types: descriptive, correlational, or comparative” (Onwuegbuzie & Leech, 2006, p. 488). The type of research question drives the choice of statistical analysis. For instance, if the research question is descriptive in nature, measures of central tendency would be appropriate. If the research question is correlational, then using a correlation coefficient would be a congruent choice for the purposes of analysis.
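The pairing of question type with statistic can be illustrated with a brief sketch. For a descriptive question it reports the mean and median; for a correlational question it computes the Pearson correlation coefficient, which is one common choice among several. The variables (hours of practice and a skills rating) and their values are hypothetical.

```python
from math import sqrt
from statistics import mean, median


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


# Hypothetical data: hours of role-play practice and a skills rating.
hours = [1, 2, 3, 4, 5]
rating = [2, 4, 5, 4, 5]

# Descriptive question: what is the typical rating?
print(f"mean = {mean(rating)}, median = {median(rating)}")  # mean = 4, median = 4

# Correlational question: do ratings rise with practice?
r = pearson_r(hours, rating)
print(f"r = {r:.2f}")  # r = 0.77
```

A comparative question, the third type named in the quotation, would instead call for a test of group differences such as the T-test discussed earlier.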

Constant comparative analysis is a commonly employed qualitative data analysis tool. However, there are many tools available for analyzing qualitative data, including “methods of constant comparison, keywords-in-context, word count, classical content analysis, domain analysis, taxonomic analysis, and componential analysis. In addition to these qualitative data analyses there is a class of data analytical tools known as cross-case analyses. Indeed, it is recommended in the literature that researchers analyze their data using at least two procedures in order to triangulate their findings and interpretations” (Onwuegbuzie & Leech, 2006, p. 490). Onwuegbuzie & Leech (2006, p. 493) clearly delineate the relationships among research questions, research design, and mixed methods data analyses, and their framework will serve as a helpful guide in employing mixed methodology for this researcher’s applied dissertation.
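Two of the simpler tools named in the quotation, word count and keywords-in-context, lend themselves to a short sketch. The open-ended responses below are invented, the function names are assumptions, and the tokenizer is deliberately crude (it lowercases text and splits hyphenated words), so this should be read as an illustration of the idea and not as a substitute for qualitative analysis software.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a response and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(responses):
    """Count how often each word appears across all responses."""
    return Counter(word for r in responses for word in tokenize(r))


def keywords_in_context(responses, keyword, window=3):
    """Return each occurrence of keyword with `window` words on either side."""
    hits = []
    for response in responses:
        words = tokenize(response)
        for i, word in enumerate(words):
            if word == keyword:
                start = max(0, i - window)
                hits.append(" ".join(words[start:i + window + 1]))
    return hits


# Hypothetical open-ended survey responses.
responses = [
    "Role-play helped me practice the skills safely.",
    "I felt anxious before the role-play but confident after.",
]
print(word_counts(responses).most_common(3))
print(keywords_in_context(responses, "anxious", window=1))  # ['felt anxious before']
```

Using both procedures on the same responses is a small-scale instance of the recommendation to triangulate findings with at least two analytic tools.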

Conclusion

Attempting to prove cause and effect is difficult. The researcher must understand the limitations of a study that assesses and evaluates interventions in a specific space and time, versus longitudinal studies that offer a more comprehensive view of phenomena. While concurrent control and mediating factors will be assessed and addressed in order to “strengthen the plausibility of any statement attributing an impact to the intervention” (Kirkwood & Sterne, 1988/2003, p. 405), this researcher’s primary goal is to identify themes and patterns amongst the sample population in order to make general assumptions about ways to develop and implement a course that teaches the use of role-play for treating PTSD.

Example:

A researcher employs simple random sampling to conduct a cross-sectional study in order to examine “the association of an exposure with an outcome” (Kirkwood & Sterne, 1988/2003, p. 407). More specifically, the researcher will evaluate the prevalence of student attitudinal changes and specific skills acquisition.

Leedy & Ormrod (2005) note the significant challenge of measuring insubstantial phenomena such as concepts, ideas, opinions and feelings. Still, this researcher intends to identify patterns and themes that arise through the class by collecting data from peer and instructor observations, as well as inventories and surveys. As with any self-inventory, discussion regarding the results is critical to learning. Structured reflection, in an assessment-centered learning environment, will provide opportunities for peers and instructors to "receive feedback, clarify ideas and correct misconceptions" (Bransford et al., 2000, p. 196). “Questions of method are secondary to questions of paradigm, which we define as the basic belief system or world view that guides the investigator” (Casebeer & Verhoef, 1997, para. 16). This researcher has chosen a mixed methodology, rather than be confined to the constructs of only qualitative or only quantitative approaches.

References

Bransford, J. et al. (2000). How People Learn (Expanded ed.). Washington, DC: National Academy Press.

Becker, L. (1999). Confidence Intervals. (Original work published 1997) Retrieved May 2, 2008, from The University of Colorado at Colorado Springs Web site: http://web.uccs.edu/lbecker/SPSS/confintervals.htm

Casebeer, & Verhoef. (1997). Combining Qualitative and Quantitative Research Methods: Considering the Possibilities for Enhancing the Study of Chronic Diseases. Chronic Diseases in Canada, 18(3), para. 16.

Garson, D. (2008). Univariate GLM, ANOVA, and ANCOVA. (Original work published 1998) Retrieved May 9, 2008, from Rice University: HyperStat Online Web site: http://davidmlane.com/hyperstat/intro_ANOVA.html

Janeba, M. (1999). Quick facts about the normal curve. Retrieved May 1, 2008, from Department of Mathematics, Willamette University Web site: http://www.willamette.edu/~mjaneba/help/normalcurve.html

Kirkwood, & Sterne. (2003). Medical Statistics (2nd ed.). Malden, MA: Blackwell Publishing. (Original work published 1988)

Lane, D. (2008). Standard Deviation and Variance. Retrieved May 1, 2008, from Departments of Psychology and Statistics, Rice University Web site: http://davidmlane.com/hyperstat/A79567.html

Lane, D. (2008). Probability Value. Retrieved May 9, 2008, from Rice University: HyperStat Online Web site: http://davidmlane.com/hyperstat/search_hyperstat.html

Lane, D. (2008). Introduction to Between-Subjects Analysis of Variance: Preliminaries (3 of 4). Retrieved May 9, 2008, from Rice University: HyperStat Online Web site: http://davidmlane.com/hyperstat/B83967.html

Leedy, P. D., & Ormrod, E. J. (2005). Practical Research: Planning and Design (8th ed.) (Julie. Peters & Crisp. Benson, Eds.). Upper Saddle River, N.J./USA: Author.

Lethen, J. (1996). Introduction to ANOVA. Retrieved May 9, 2008, from Rice University: HyperStat Online Web site: http://davidmlane.com/hyperstat/intro_ANOVA.html

Golafshani, N. (2003). Understanding Reliability and Validity in Qualitative Research. The Qualitative Report, 8(4), 597-607.

Onwuegbuzie, & Leech. (2006). Linking Research Questions to Mixed Methods Data Analysis Procedures. The Qualitative Report, 11(3), 474-498.

Schulze, S. (2003). Views on the combination of quantitative and qualitative research approaches. Progressio, 25(2), 8-20.

Spread, Dispersion, Variability. (n. d.). Retrieved May 1, 2008, from Rice University, Virtual Lab in Statistics Web site: http://www.ruf.rice.edu/~lane/stat_sim/conf_interval/index.html