Evaluation Theories/Pre-Class Notes Wk. 2
Type classification: this is a notes resource.
Completion status: Been started, but most of the work is still to be done.
Pre-Class Notes Wk 2; Alkin Intro & Tree; Shadish on Scriven (Valuing) & Campbell (Experimenting Society)
p. 3 - 13:29:40 (15 min)
p. 4
Stakeholder participation is “essential to derive the maximum values from a program evaluation” (Alkin, 2013, p. 4)
“two general types of models: (a) a prescriptive model, the most common type, is a set of rules, prescriptions, prohibitions, and guiding frameworks that specify what a good or proper evaluation is and how evaluation should be done— such models serve as exemplars—and (b) a descriptive model is a set of statements and generalizations that describes, predicts, or explains evaluation activities— such a model is designed to offer an empirical theory.” – linked with “research on evaluation” (See Henry & Mark, 2003) (Alkin, 2012, p.
“When (and if) we do [develop descriptive models], then the descriptive models would define what is to be appropriately “prescribed.” -- “Until then, however, we must rely on the prescriptive models generated by knowledgeable members of the evaluation community to guide practice.”
p. 4 - 13:45:00 (15 min)
J: Three Stages in Developing Theory for a New Field (And how Alkin’s Categories Fit) Stage 1: categorize practices, Stage 2: put forward theories for why they work and what ends they strive towards, Stage 3: test those theories.
p. 5
Methodology for Choosing Theorists:
4 Categories: Methodologists; Evaluation Issue Analysts; Interpreters & Teachers; Theorists
Methodologists (J: Methodology needs a "for what" - so what is Campbell's "For What"?): contributed very substantially to the basic research methodology that forms the essential foundation for much of the work in evaluation. Noted: Donald Campbell, Julian Stanley, Thomas Cook, Matthew Miles and Michael Huberman, Robert Yin, and Anthony Bryk. Chosen: Donald Campbell
Evaluation Issue Analysts: substantially assisted in the understanding of various aspects of evaluation. Lois-Ellin Datta, Stewart Donaldson, Karen Kirkhart, Jonathan Morell, Michael Morris, Thomas Schwandt, William Shadish, Nick Smith
Evaluation Interpreters and Teachers: teaching about evaluation and helping to interpret its nuances. Michael Bamberger, Linda Mabry, and Jim Rugh (2011), Jody Fitzpatrick, James Sanders, and Blaine Worthen (2011), Rita O’Sullivan (2004), Emil Posavac and Raymond Carey (2007), and Liliana Rodriguez-Campos (2005). (J: Some of these might be “theorists”)
Theorists: definitively associated with a particular theoretical position on evaluation; restricted my consideration of evaluators to those who speak about the field generically
p. 6: “those who write only about evaluating health programs or education or social welfare have not been included.
p. 6: those who do not consider themselves as evaluators but exclusively assign another disciplinary designation to their name are also not generally included.
p. 5 - 14:00:16 (15 minutes)
(J: Analysis: I do not like the bit about being “associated with a particular theoretical position on evaluation” – That may be one measure, but any other measure should include Shadish’s definition of theory: Something that can be developed separately from what others know: “Theory connotes a body of knowledge that organizes, categorizes, describes, predicts, explains, and otherwise aids in understanding and controlling a topic." –Shadish (1991) 'Foundations of Program Evaluation', p. 30. These should be among the criteria as well, so that Alkin as an expert himself can use logic as well as surveys of current literature to develop “who is a theorist” - There may be people who aren’t on anyone’s radar within the field who nonetheless have something crucial to say to tie together the field. )
p. 6
Category Systems:
attempts to look at the ways in which various theoretical perspectives relate to each other.
identify a limited set of characteristics for grouping theories.
Entries within a category are deemed to belong together in the sense that they can be judged to be similar with respect to the ... configuration of characteristics that define that category.
Early Category Systems: evaluation category systems were those provided by Worthen and Sanders (1973) and Popham (1975). Subsequently, category systems have been developed by House (1978), Glass and Ellett (1980), Alkin and Ellett (1985), Williams (1988), Shadish, Cook, and Leviton (1991), Alkin and House (1992),
J: If I study this now, I will develop biases that I didn’t even know I had. I believe that a certain subset of scholars should develop a qualitative methodology for exploring a field, and then use that methodology on that field – and then look at whether their methodology makes sense.
p. 6 - 15:44:06
p. 7 - 15:57:45 “These category systems failed to portray the historically derived relationships between theories” “failed to show which theoretical formulations provided the intellectual stimulation for new theories” J: But historical derivation is often at odds with logical simplification of the complexity: historical derivation builds up cruft that must be removed in order to arrive at the optimal system.
p.20:
Suchman’s identification of five categories of evaluation:
(1) effort (the quantity and quality of activity that takes place)
(2) performance (effect criteria that measure the results of effort)
(3) adequacy of performance (the degree to which performance is adequate for the total amount of need)
(4) efficiency (examination of alternative paths or methods in terms of human and monetary costs)
(5) process (how and why a program works or does not work)
p.24 Cronbach’s UTOS: Units (populations), Treatments, Observations (outcomes) Settings
Cronbach (1982) coins a set of symbols to define the domains of evaluation: units (populations), treatments, observations (outcomes), and settings.
p. 41
CIPP
an acronym for four types of evaluation: context, input, process, and product.
Context evaluation involves identifying needs to decide on program objectives.
Input evaluation leads to decisions about strategies and designs.
Process evaluation consists of identifying shortcomings in a current program to refine implementation.
Product evaluation measures outcomes for decisions regarding the continuation or refocus of the program.
V.2
Anne Vo has actively participated in and contributed to our theoretical discussions. Tarek Azzam provided substantial assistance on the tree as well.
Marilyn, for putting up with my obsession with “growing” a tree. Marvin Alkin
“New Ideas.” I wondered about it and after many months had the courage to comment on it. Erick Lindman gently smiled and turned the clothespin over; on the other side was the inscription “Old Ideas Still Good.”
Highlighting Guide:
read the works of Ralph Tyler in connection with the evaluation conducted on the famous “Eight-Year Study” of progressive education. In Tyler (1942), I found many concepts that I recognized to be the basis of contemporary approaches to evaluation.
"works of Ralph Tyler in connection with the evaluation conducted on the famous “Eight-Year Study” of progressive education."
Uncategorized
Yellow highlights: general information
How-To
Orange Highlights: Important "HOW-TO" information (Also coded "Purple" In Apple Preview)
Reference
Purple Highlights = Citations
is to validate the hypothesis upon which the education institution operates. (p. 492) (Theory-Based Evaluation? See Fitz-Gibbon & Morris, 1975; Chen, 1990)
validate the hypothesis upon which the education institution operates.
important purpose of evaluation that is frequently not recognized
Topic of Evaluation
Green - Topic
One purpose of evaluation is to make a periodic check on the effectiveness of the educational institution,
check on the effectiveness of the educational institution
thus to indicate the points at which improvements in the program are necessary.
indicate the points at which improvements in the program are necessary.
(Formative Evaluation? See Scriven, 1967)
formation and classification of objectives
formation
formation and classification
obtain evidence about the progress
“Old Ideas Revisited and Enhanced—But Still Good.”
While it is conventionally used in evaluation literature, in some ways, it would be more appropriate to use the term approaches or models.
two general types of models: (a) a prescriptive model, the most common type, is a set of rules, prescriptions, prohibitions, and guiding frameworks that specify what a good or proper evaluation is and how evaluation should be done—such models serve as exemplars—and (b) a descriptive model is a set of statements and generalizations that describes, predicts, or explains evaluation activities—such a model is designed to offer an empirical theory.
When (and if) we do, then the descriptive models would define what is to be appropriately “prescribed.”
Until then, however, we must rely on the prescriptive models generated by knowledgeable members of the evaluation community to guide practice.
Some have spent more time systemizing their standards, criteria, and principles.
systemizing their standards, criteria, and principles.
None of the approaches is predictive or offers an empirical theory. That is, these “theories” have not been validated by empirical research.
A few have tried to defend or justify their prescriptions.
defend or justify
prescriptions
these “theories” have not been validated by empirical research.
we refer to those who have developed evaluation approaches and models as “theorists.”
we identify theories by the name of the theorist prominently associated with it.
contributed very substantially to the basic research methodology that forms the essential foundation for much of the work in evaluation.
basic research methodology
forms the essential foundation for much of the work in evaluation.
Donald Campbell, Julian Stanley, Thomas Cook, Matthew Miles and Michael Huberman, Robert Yin, and Anthony Bryk.
Of these individuals, we have written only about Donald Campbell in the discussion of theorists because of the unique impact of his methodological contributions.
methodologists
evaluation issue analysts
substantially assisted in the understanding of various aspects of evaluation.
Lois-Ellin Datta, Stewart Donaldson, Karen Kirkhart, Jonathan Morell, Michael Morris, Thomas Schwandt, William Shadish, Nick Smith,
evaluation interpreters and teachers.
teaching about evaluation and helping to interpret its nuances.
Michael Bamberger, Linda Mabry, and Jim Rugh (2011), Jody Fitzpatrick, James Sanders, and Blaine Worthen (2011), Rita O’Sullivan (2004), Emil Posavac and Raymond Carey (2007), and Liliana Rodriguez-Campos (2005).
definitively associated with a particular theoretical position on evaluation
evaluation theorists
many in this category may not have presented a full theoretical exposition
proposed a particular evaluation orientation
restricted my consideration of evaluators to those who speak about the field generically
those who write only about evaluating health programs or education or social welfare have not been included.
those who do not consider themselves as evaluators but exclusively assign another disciplinary designation to their name are also not generally included.
attempts to look at the ways in which various theoretic perspectives relate to each other.
Earlier efforts have taken the form of category (or classification) systems. These simplified structures provided a way to identify a limited set of characteristics for grouping theories.
identify a limited set of characteristics for grouping theories.
Entries within a category are deemed to belong together in the sense that
they can be judged to be similar with respect to the
configuration of characteristics that define that category.
in making this judgment, the categorizer is selecting from the many aspects of the approach only those that are considered most essential.
this is similar to an artist’s creation of a caricature, portraying someone or something by focusing on (even overemphasizing) its most prominent features
evaluation category systems were those provided by Worthen and Sanders (1973) and Popham (1975). Subsequently, category systems have been developed by House (1978), Glass and Ellett (1980), Alkin and Ellett (1985), Williams (1988), Shadish, Cook, and Leviton (1991), Alkin and House (1992),
Category systems are of great value.
Category systems also were an aid to theorists in understanding perceived relationships with other theorists
theorists’ views are not fixed in time,
published work often lags behind these changes
one’s views as perceived by others (whether or not they are still held) have influenced theorists.
Whatever the explanations for a perceived portrayal of a theorist’s views, the perceptions provided by category systems may force theorists to reconsider their views and perhaps modify them.
While earlier category systems prior to the first edition of Evaluation Roots served evaluation well, they suffered from several deficiencies
These category systems failed to portray the historically derived relationships between theories
Historically derived relationships
failed to show which theoretical formulations provided the intellectual stimulation for new theories
prescriptive theories must consider (a) the issues related to the methodology being used, (b) the manner in which data are to be judged or valued, and (c) the user focus of the evaluation effort.
evaluation theory “tree”
each of the theorists is presented on the tree on a branch that we believe represents his or her main emphasis among these three.
use, methods, judgment/valuing.
not one based on exclusivity
It might then be possible to ask this question: When evaluators must make concessions, what do they most easily give up and what do they most tenaciously defend (Alkin & Ellet, 1985)?
the category system is based on the relative emphasis within the various models
Carol Weiss, for example, indicated that she was satisfied with her placement but felt that, to some extent, she belonged on the “use” branch as well. David Fetterman, likewise, agreed with his placement but felt that it did not adequately represent his interest in “social justice.” Jennifer Greene commented that Lee Cronbach is not fundamentally concerned about methods but that his placement on the “methods” branch was probably as good a representation as possible.
Testimonies from evaluation theorists about their placement
we capture only one dimension of the influences on their theoretical perspectives.
analyzed the evaluator influences from other branches of the tree based on the names identified by the theorists in their respective chapters.
Ernest House’s The Logic of Evaluative Argument (1977), which clearly shows the influence of early evaluation theorists such as Michael Scriven and Robert Stake but which also relies heavily on the work of two Belgian philosophers, Perelman and Olbrechts-Tyteca (1969) and on the work of Weizenbaum (1976).
more authors
Eleanor Chelimsky, Jennifer Greene, Henry Levin, Melvin Mark and Gary Henry, and Donna Mertens
Eleanor Chelimsky had created an impressive model for evaluating national programs in her work at the GAO.
Henry Levin provided an evaluative dimension not otherwise represented—cost-effectiveness evaluation.
social accountability, systematic social inquiry, and epistemology
systematic social inquiry
three “roots”
social accountability
presents an important motivation for evaluation,
systematic social inquiry—emanates from a concern for employing a methodical and justifiable set of procedures for determining accountability
"methodical and justifiable set of procedures for determining accountability" - Is this the best methodological base to work from? Why?
epistemology, is the area of philosophy that deals with the nature and validity (or limitations) of knowledge
Key evaluation concerns that are based in epistemological arguments include the legitimacy of value claims, the nature of universal claims, and the view that truth (or fact) is what we make it to be.
is the branch of the tree in which evaluation is primarily guided by research methodology.
social inquiry root.
concerned with obtaining the most rigorous knowledge possible given the contextual constraints,
“knowledge construction”
Shadish, Cook, and Leviton (1991).
we recognize that it is more accurate to describe these approaches as emphasizing research methodology, specifically the techniques used in the conduct of evaluation studies, rather than just the methods used to conduct such studies. However, because we called this the methods branch in the first version of the tree, we have chosen to continue with this label.
valuing
branch is split in two—objectivist and subjectivist
objectivists argue that value judgments should be based on “publicly observable” facts.
objectivist
is compatible with the postpositivist philosophical ideas that by and large inform methods theorists’ work.
use
originally focused on an orientation toward evaluation and decision making.
not meant to be viewed as independent from one another
reflect a relational quality between them
the work of those theorists who are placed on the right side of the methods branch, leaning toward the valuing branch, reflects a secondary importance placed on valuing
the left side of the valuing branch, closest to the methods branch, are primarily objectivist valuing theorists, with a secondary concern for methods
far right side of the tree—the subjectivist arm of the valuing branch—we find theorists who reflect the relationship between valuing and use
concerned with individual stakeholders’ actionable use of findings
the far left of the use branch reflects theorists whose primary concern is use but who have a secondary concern for valuing—in other words, a secondary concern for social justice and empowerment.
we populated the tree with the theorists who are most commonly associated with or who have made the initial and most notable and substantial contributions to each of the particular approaches represented.
Chen
synthesized and advanced these ideas
FOUNDATIONAL ROOTS
Social Accountability
important role that accountability plays in evaluation
several dimensions to accountability
reporting
in which only description is provided
justifying analysis
explanation
where a justifying analysis recognizes deficiencies
true accountability requires “answerability
that is, those responsible must be held accountable. This phase of accountability is not reflected in evaluation; evaluation simply provides the information for “being answerable.”
Alkin (1972a) defines three types of accountability:
(1) goal accountability
(2) process accountability
(3) outcome accountability
Goal accountability: examines whether reasonable and appropriate goals have been established
Process accountability: whether reasonable and appropriate procedures for accomplishing those goals have been established and implemented
Outcome accountability: the extent to which established goals have been achieved
program accountability is prominent in the “process” section of Daniel Stufflebeam’s CIPP (an acronym for four types of evaluation: context, input, process, and product)
CIPP
context
input
process
product
four types of evaluation
Today, most evaluations have a strong focus on goal accountability, with an eye toward the improvement of institutional performance
U.S. Government Accountability Office (GAO) typifies how accountability is often viewed in contemporary North American evaluation practice
situates and legitimizes evaluation as a fundamental process for generating systematic information for decision making
Social accountability
Social Inquiry
can be characterized as the systemic study of the behavior of groups of individuals in various social settings by a variety of methods
Social inquiry
recognition that human action has a unique social dimension rather than simply a natural or psychological dimension
central overriding question is “Why do people in social groups act as they do?”
17th- and 18th-century
Hobbes, Montesquieu, and Rousseau
mid- and late 19th century, as demonstrated in the works of Karl Marx, Emile Durkheim, and Max Weber, for instance, that society and social groups began to be studied empirically through the collection and analysis of empirical data on social groups.
Which methods are appropriate for the study of society, social groups, and social life is a perennial question in social inquiry
discipline of anthropology
qualitative studies of the social world.
distinction
is sometimes couched in terms of the distinction between explanation and prediction, on the one hand, and interpretation and understanding, on the other.
explanation and prediction
interpretation and understanding
Clifford Geertz’s classical essay “Thick Description: Toward an Interpretive Theory of Culture” in The Interpretation of Cultures (1973)
Cutting across social science disciplines are broad philosophical and methodological questions
Important questions include the following:
Should social scientists have a moral stance toward the individuals and groups that they study?
Is this stance appropriate, and would it compromise the researchers’ objectivity?
What is the relationship between theory and observation?
Epistemology
paradigm
a worldview or perspective that, in the case of research and evaluation, includes conceptions of methodology, purposes, assumptions, and values.
typically consists of an ontology (the nature of reality), an epistemology (what is knowable and who can know it), and a methodology (how one can obtain knowledge).
three broad areas of thinking: (1) postpositivism, (2) constructivism (and related thinking), and (3) pragmatism
basic axioms of these paradigms offer a broader framework for understanding the theoretical influences on evaluation theorists’ work (Note: portions of this section of the chapter are reprinted from What Counts as Credible Evidence in Applied Research and Evaluation Practice? by Stewart I. Donaldson, Christina A. Christie, and Melvin M. Mark, 2009, SAGE Publications, Inc.)
What Axioms?!
some evaluation perspectives are shaped more exactly by a philosophical theory, while in others, only a theory’s undercurrent can be detected
I think I'm a post-positivist on this scale:)
Views of science shifted during the 20th century away from positivism toward postpositivism.
postpositivists believe its goal is to attempt to measure truth, even though that goal cannot be attained because all observation is fallible and has error.
positivists believe that the goal of science is to uncover the truth, postpositivists believe its goal is to attempt to measure truth, even though that goal cannot be attained because all observation is fallible and has error.
positivists believe that the goal of science is to uncover the truth
This type of realism is referred to as “critical realism.”
critical realism
critical realism
causation is observable and that over time predictors can be established; however, some degree of doubt associated with the conclusion will always exist.
we have to measure how much they can be controlled.
Values and biases are noted and accounted for, yet the belief is that they can be controlled within the context of scientific inquiry.
there is no one reality, rather that several realities emerge from one’s subjective belief system.
Constructivism is one element of interpretivism
Constructivism
Constructivists
“new knowledge and discovery can only be understood through a person’s unique and particular experiences, beliefs, and understandings of the world.”
These multiple realities are believed to be subjective and will vary based on the “knower.”
the “knower” and the “known” are interrelated, and to determine what is “known,” the knower constructs a reality that is based on and grounded in context and experience.
inquiry is considered value-bound, not value-free, and therefore, bias should be acknowledged rather than attempting to position the inquiry process so as to control it.
No - impact is the greatest priority: generalizability has a little impact for a lot of people; local relevance has a big impact for a few people. We want both.
All things influence all things; but cause and effect can come from constant conjunction (juxtaposition) and a theory for why that is so (however, theories may change)
Inductive logic drives the inquiry process, which means that particular instances are used to infer broader, more general principles.
Local relevance, then, outweighs and is of much greater priority than generalizability.
Cause and effect are thought to be impossible to distinguish because relationships are interde- pendent, and so simultaneously, all things have influence on all things.
there are no absolute truths, only relative truths
constructivist and relativist philosophy
Stake, Guba, and Lincoln.
No: there may well be absolute truths, but we can only KNOW relative truths, because of our imperfect information.
argue that deductive and inductive logic should be used in concert.
Pragmatists embrace objectivity and subjectivity as two positions on a continuum
move away from embracing equally the axioms of the postpositivist and constructivist paradigms.
pragmatists are more similar to postpositivists with regard to notions about external reality, with the understanding that there is no absolute “truth” concerning reality.
in line with constructivist thought, however, pragmatists argue that there are multiple explanations of reality and that at any given time there is one explanation that makes the most sense.
a single explanation of reality may be considered “truer” than another.
believe that causes may be linked to effects.
Which is, of course, all I'm talking about when I say I'm a constructivist
However, they temper this thinking with the caveat that absolute certainty of causation is impossible.
they do not believe inquiry is value-free; rather, they consider their values important to the inquiry process
pragmatist paradigm
seems to influence the thinking of those on the use branch, particularly those who have an interest in promoting instrumental use of evaluation findings, such as Patton.
positivist and postpositivist research methodologies dominated the conduct of studies.
In the beginning, there was research.
Donald Campbell
best known for his pathbreaking work on the elimination of bias in the conduct of research in field settings.
papers on experimental and quasi-experimental designs for research (Campbell, 1957; Campbell & Stanley, 1966).
“rule out many threats precluding causal inference” (Shadish et al., 1991, p. 122).
impact on social science research
Campbell and Stanley’s Experimental and Quasi-Experimental Designs for Research (1966).
First: conditions necessary to conduct a true experimental study, where randomization is the hallmark.
Second: the degree to which an experiment is properly controlled, as internal validity; the degree of applicability of the results of an experiment, as external validity.
experiments are not perfect and that they should not, and cannot, be used in a great many situations
Third: quasi-experimental designs were developed to deal with the messy world of field research.
The ideas put forth in Campbell and Stanley’s manuscript are now the foundation of almost all social science research methods courses.
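A tiny simulation may help fix the idea (my sketch, not from Campbell & Stanley; the scenario, variable names, and numbers are hypothetical): random assignment breaks the link between treatment and an unobserved confounder, so a simple difference in means recovers the true effect, while self-selection biases it.

```python
# Sketch: why randomization is the hallmark of a true experiment.
# Hypothetical scenario; illustrates the logic only.
import numpy as np

rng = np.random.default_rng(42)
n = 10_000
motivation = rng.normal(0, 1, n)  # unobserved confounder

# Quasi-experimental selection: motivated people opt into treatment.
self_selected = (motivation + rng.normal(0, 1, n)) > 0
# True experiment: assignment is random, independent of motivation.
randomized = rng.integers(0, 2, n).astype(bool)

def difference_in_means(assign):
    # Outcome depends on treatment (true effect = 1.0) and on motivation.
    y = 1.0 * assign + 2.0 * motivation + rng.normal(0, 1, n)
    return y[assign].mean() - y[~assign].mean()

print("self-selected estimate:", round(difference_in_means(self_selected), 2))  # biased upward
print("randomized estimate:  ", round(difference_in_means(randomized), 2))      # close to 1.0
```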
not until Suchman (1967) saw the relevance of it to evaluation that his name became prominently identified with that field. It is because Campbell’s work on quasi-experimental design precedes Suchman’s application of it to evaluation that we choose to discuss Campbell prior to Suchman. It should be noted that Campbell (1975a, 1975b) has also written papers indicating the potential appropriateness of qualitative methods as a complement to quantitative experimental methods.
Edward Suchman
promoted Campbell’s work as the most effective approach for conducting evaluation studies to measure program impact.
book, Evaluative Research (1967),
perhaps the first full-scale description of the application of research methods to evaluation.
Suchman (1967) distinguishes between evaluation as a commonsense usage, referring to the “social process of making judgments of worth” (p. 7), and evaluative research that uses scientific research methods and techniques.
affirms the appropriate use of the word evaluative as an adjective specifying the type of research being done.
comments that the evaluative researcher, in addition to recognizing scientific criteria, must also acknowledge administrative criteria for determining the worthwhileness of doing the study
This influenced many others on the methods branch (e.g., Rossi, Weiss, Chen, and Cronbach).
Suchman’s identification of five categories of evaluation:
(1) effort (the quantity and quality of activity that takes place)
(2) performance (effect criteria that measure the results of effort)
(3) adequacy of performance (the degree to which performance is adequate for the total amount of need)
(4) efficiency (examination of alternative paths or methods in terms of human and monetary costs)
(5) process (how and why a program works or does not work)
Robert Boruch
Randomized field tests are also different from quasi-experiments.
The latter research designs have the object of estimating the relative effectiveness of different treatments that have a common aim, just as randomized experiments do, but they depend on methods other than randomization to rule out the competing explanations for the treatment differences that may be uncovered. Quasi-experiments and related observational studies then attempt to approximate the results of a randomized field test. (p. 4)
Thomas Cook
Campbell and Stanley’s classic Experimental and Quasi-Experimental Designs for Research (1966)
During the 1970s, Cook and Campbell called attention to some less recognized threats to internal validity (e.g., resentful demoralization)
during the mid-1970s,
Campbell
denounced the use of quasi-experimental designs and went so far as to state that perhaps he had committed an injustice by suggesting that quasi-experimental designs were a viable alternative to classic randomized design.
remained a proponent of quasi-experimental designs and continued to focus on their use.
Cook
random selection, methods, the evaluation context, and stakeholder involvement in evaluation studies—
He asserts that it is imperative for evaluators to choose methods that are appropriate to the particular evaluation being conducted and that they take into consideration the context of each evaluation rather than using the same set of methods and designs for all evaluations—a direct attack on experimental design.
Lee J. Cronbach
contributions include Cronbach’s coefficient alpha, generalizability theory, and notions about construct validity.
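For reference, the standard formula for coefficient alpha (not quoted from the chapter), for a scale of $k$ items with item variances $\sigma^2_{Y_i}$ and total-score variance $\sigma^2_X$:

$$\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma^2_{Y_i}}{\sigma^2_X}\right)$$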
one of the methodological giants of our field.
our field
strong evaluation roots in methodology and social science research led us to place him on the methods branch of the theory tree.
association with more policy research–oriented Stanford University colleagues, notably in his book Toward Reform of Program Evaluation (Cronbach & Associates, 1980), helped establish his concern for evaluation’s use in decision making.
rejects the simplistic model that assumes a single decision maker and “go/no-go” decisions
Cronbach (1982) coins a set of symbols to define the domains of evaluation: units (populations), treatments, observations (outcomes), and settings.
Cronbach’s concern about generalizing to *UTOS leads him to reject Campbell and Stanley’s emphasis on experimental design and Scriven’s focus on comparison programs.
Campbell and Stanley’s emphasis on experimental design
Scriven’s focus on comparison programs
proposes that generalization to *UTOS can be attained by extrapolating through causal explanation, using either causal modeling or the “thick description” of qualitative methods.
sometimes beneficial to examine subpopulations (sub-UTOS)
focusing on the subset of data for a particular group might enable generalization to other domains.
seeks to capitalize on naturally occurring variability within the sample as well as the consequences of different degrees of exposure to treatments.
displays sensitivity to the values of the policy-shaping community
done systematically with an eye to what will contribute most to generalization:
issues receiving attention from the policy-shaping community
issues relevant in swaying important (or uncommitted) groups
issues that would best clarify why a program works
issues having the greatest uncertainty
Shadish et al. (1991) make the following keen distinction between Cronbach and several other major theorists: “[Cronbach views] evaluators [as] educators rather than [as] the philosopher-kings of Scriven, the guardians of truth of Campbell or the servants of management of Wholey” (p. 340).
Peter Rossi
well-known for his highly popular textbook on evaluation
first edition was Evaluation: A Systematic Approach (Rossi, Freeman, & Wright, 1979).
now includes discussions of qualitative data collection, evaluation utilization, the role of stakeholders,
changing nature of the field
earlier writings stressed
experimental design.
his ideas eventually evolved to a place that some say was so comprehensive that the approach he suggested was virtually impossible to implement.
response to this criticism led him to develop “tailored evaluations”
tailored to the stage of the program,
theory-driven evaluation involves the construction of a detailed program theory, which is then used to guide the evaluation.
Rossi maintains that this approach helps reconcile the two main types of validity—internal and external.
Carol Weiss
influenced our thinking about what evaluation can do
informed by research on evaluation in the context of political decision making
expands on or defines many of our key concepts and terms related to evaluation use,
conceptual use, or use for understanding (Weiss, 1979, 1980)
enlightenment use, or more subtle and indirect use that occurs in the longer term (Weiss, 1980)
imposed use,
(Weiss, Murphy-Graham, & Birkeland, 2005; Weiss, Murphy-Graham, Petrosino, & Gandhi, 2008).
argues for more evidence-based policy making
most effective kinds of evaluations are those that withstand the test of time
that is, are generalizable and therefore use the most rigorous methods possible.
early evaluation work influenced by research methodologists, political theorists, and expositors of democratic thought (e.g., Rousseau, the Federalist Papers).
“Evaluation is a kind of policy study, and the boundaries are very blurred . . . I think we have a responsibility to do very sound, thorough systematic inquiries” (Weiss, cited in Alkin, 1990, p. 90).
Politics intrudes on program evaluation in three ways:
(1) programs are created and maintained by political forces;
(2) higher echelons of government, which make decisions about programs, are embedded in politics;
(3) the very act of evaluation has political connotations. (p. 213)
decision accretion
decisions are the result of “the build-up of small choices, the closing of small options and the gradual narrowing of available alternatives” (Weiss, 1976, p. 226).
Huey T. Chen
concept and practice of theory-driven evaluation (Chen, 1990, 2005).
there is no indication as to whether failure is due to, for example, poorly constructed causal linkages, insufficient levels of treatment, or poor implementation. (Chen, 2005)
Chen proposes a solution to this dilemma: We have argued for a paradigm that accepts experiments and quasi-experiments as dominant research designs, but that emphasizes that these devices should be used in conjunction with a priori knowledge and theory to build models of the treatment process and implementation system to produce evaluations that are more efficient and that yield more information about how to achieve desired effects. (Chen & Rossi, 1983, p. 300)
concerned with identifying secondary effects and unintended consequences. This is similar to Scriven.3
theories that he seeks are “plausible and defensible models of how programs can be expected to work” (Chen & Rossi, 1983, p. 285).
Gary Henry and Melvin Mark (With George Julnes)
view social betterment as the ultimate objective of evaluation and present a point of view grounded in what they refer to as a “common sense realist philosophy.”
we were struck by the views presented in the “realist evaluation” monograph in New Directions for Evaluation (Henry, Julnes, & Mark, 1998)
Emergent realist evaluation (ERE)
a comprehensive new evaluation model that offers reconceptualized notions of use, methods, and valuing.
“a new theory that captures the sense-making contributions from post-positivism and the sensitivity to values from constructivist traditions” (Henry et al., 1998, p. 1)
“social betterment, rather than the more popular and pervasive goal of utilization, should motivate evaluation” (Mark, Henry, & Julnes, 1998, p. 19).
ERE is an evaluation methodology that
(a) gives priority to the study of generative mechanisms,
(b) is attentive to multiple levels of analysis,
(c) is mixed methods appropriate.
focuses on understanding the underlying mechanisms of programs,
identify which mechanisms are operating and which are not (Mark et al., 1998).
to identify causal linkages and to enhance the generalizable knowledge base of a particular set of programs or program theories.
Mark and Henry argue that an evaluation should examine program effects that are of most interest to the public and other relevant stakeholders
so evaluators must determine stakeholders’ values when investigating possible mechanisms.
three methods for investigating stakeholder values:
surveying and sampling possible stakeholders
qualitative interviews and/or focus groups to determine their needs and concerns
analyzing the context of the evaluation from a broad philosophical perspective
issues such as equity, equality, and freedom.
then communicated (Mark et al., 1998).
competitive elaboration or principled discovery
ruling out alternative explanations for study findings
Competitive elaboration
threats to validity (Mark et al., 1998).
alternative program theories
requires a preexisting body of knowledge of possible program mechanisms
approach lends itself to quantitative methods of inquiry
Principled discovery is used when programs are evaluated before practitioners are able to develop experientially tested theories (Mark et al., 1998).
Approaches to discovering program mechanisms include exploratory data analysis, graphical methods (Henry, 1995), and regression analysis.
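A minimal sketch of what a "principled discovery" style regression could look like (my illustration, not Mark and Henry's own procedure; the data and variable names are hypothetical): check whether a hypothesized mechanism explains the outcome beyond treatment assignment alone.

```python
# Sketch: exploratory regression probing a hypothesized program mechanism.
# Hypothetical data; illustrates the comparison-of-models logic only.
import numpy as np

rng = np.random.default_rng(0)
n = 200
treatment = rng.integers(0, 2, n)               # 0/1 program participation
attendance = treatment * rng.uniform(0, 1, n)   # hypothesized mechanism
outcome = 0.5 * treatment + 1.2 * attendance + rng.normal(0, 1, n)

def r_squared(X, y):
    # Ordinary least squares with an intercept; return R^2.
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

print("treatment only: R^2 =", round(r_squared(treatment[:, None], outcome), 3))
print("with mechanism: R^2 =", round(r_squared(np.column_stack([treatment, attendance]), outcome), 3))
```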
recently, Mark and Henry have extended theoretical work into evaluation influence (Henry & Mark, 2003; Mark & Henry, 2004), defined as “the capacity or power of persons or things to produce effects on others by intangible or direct means” (Kirkhart, 2000, p. 7).
theory of evaluation influence
Henry and Mark (2003) depict three levels at which influence can occur:
individual,
interpersonal
collective
Each level is further explained by identifying specific mechanisms, measurable outcomes, and forms of influence.
Ralph Tyler
one of the major starting points for modern program evaluation
The Eight-Year Study
far-reaching
Madaus and Stufflebeam (1989)
taxonomic classification of learning outcomes
need to validate indirect measures against direct indicators of the trait of interest
concept of formative evaluation
content mastery,
decision-oriented evaluation
“criterion-referenced and objectives-referenced tests” (p. xiii).
curricula to be evaluated are based on hypotheses that are the best judgments of program staff regarding the most effective set of procedures for attaining program outcomes.
focus is on the specification of objectives and measurement of outcomes.
Tyler’s point of view has come to be known as objectives-oriented (or objectives-referenced) evaluation.
(a) formulating a statement of educational objectives, (b) classifying these objectives into major types, (c) defining and refining each of these types of objectives in terms of behavior, (d) identifying situations in which students can be expected to display these types of behavior, (e) selecting and trying promising methods for obtaining evidence regarding each type of objective, (f) selecting on the basis of preliminary trials the more promising appraisal methods for further development and improvement, and (g) devising means for interpreting and using the results (Tyler, 1942, pp. 498–500).
Madaus and Stufflebeam (1989) claim that Tyler coined the term educational evaluation in the 1930s to describe his procedures—the comparison of (well-stated) intended outcomes (called objectives) with (well-measured) actual outcomes.
Metfessel and Michael’s (1967) work follows Tyler’s evaluation step progression but pays greater heed to expanding the range of alternative instruments.
Hammond (1973) includes Tyler’s views as a behavioral objectives dimension that is part of a model that also includes a more precise definition of instruction and the institution.
Popham (1973, 1975) follows the Tyler model and focuses primarily on the championing of “behavioral objective specification.”
Popham (1973) called for narrow scope for individual educational objectives
massive number of objectives required to conduct an evaluation and subsequent system overload.
Popham (1988) recognized this problem and called for a focus on a manageable number of broad-scope objectives and the use of the taxonomies of educational objectives only as “gross heuristics.”
Bloom, Englehart, Furst, Hill, and Krathwohl (1956) developed a taxonomy of educational objectives for the cognitive domain,
Krathwohl, Bloom, and Masia (1964)
placed him on the methods branch because we believe that his attention to educational measurement as the essence of evaluation is the most prominent feature of his work.
our revised view
he is not a theoretical predecessor of those further up on the branch.
on further reflection, we concluded that his overall influence on the methods branch specifically was less than his original position suggested.
Valuing
Out of the root of epistemology has grown a branch of evaluators who focus on concerns related to valuing in the evaluation process.
Of particular importance is the fact/value distinction delineated by the 18th-century Scottish philosopher David Hume.
the legitimacy of value claims (as ably described by House & Howe, 1999)
Important issues raised when considering valuing in evaluation include
the nature of universal (justifiable) claims
the constructivist view that truth (or fact) is guided by “the meanings that people construct in particular times and places” (Greene, 2009, p. 159).
Scriven
proclaims that evaluation is not evaluation without valuing.
argues that it is the work of the evaluator to make a value judgment about the object that is being evaluated and that this value judgment should be based on observable data about the quality and effectiveness of the evaluand under study.
Scriven’s philosophical training in logic, which helps inform his argument for a systematic, objective approach to valuing and evaluation, has importantly influenced his thinking.
those who reject the notion that we should strive for an objectivist judgment about the merit or worth of the evaluand.
espouse the philosophy of relativism or subjectivism—that
human activity is not like that in the physical world but an ongoing, dynamic process, and a truth is always relative to some particular frame of reference.
Stake’s (1967) article “The Countenance of Educational Evaluation” offers hints of subjectivist thinking,
his paper on responsive evaluation (Stake, 1974)
explicitly reject “preordinate evaluation” (evaluation conducted in the spirit of obtaining objective information).
argues for a responsive approach using case studies as a means of capturing the issues, personal relationships, and complexity of the evaluand and for judging the value of the evaluand under study.
has served as an important influence in the thinking of others who argue for attention to complexity, dialogue, and meaning in evaluations and use this as a basis for informing value claims in evaluation studies.
Michael Scriven
major contribution is the way in which he adamantly defines the role of the evaluator in making value judgments.
Shadish et al. (1991) note that Scriven was “the first and only major evaluation theorist to have an explicit and general theory of valuing” (p. 94).
(1986)
“Bad is bad and good is good and it is the job of evaluators to decide which is which” (p. 19).
The evaluator, in valuing, must fulfill his or her role in serving the “public interest” (Scriven, 1976, p. 220)
he views the evaluator’s role in valuing as similar to producing a study for Consumer Reports, in which the evaluator determines the appropriate criteria by which judgments are to be made and then presents these judgments for all to see.
here is the necessity for identifying “critical competitors,” or competing alternatives.
evaluator has the responsibility for identifying the appropriate alternatives.
Comparisons are key in making value judgments,
adamantly states that it is not necessary to explain why a program or product works in order to determine its value.
alternative to experimental and quasi-experimental design called the “modus operandi” (MO) method
(Scriven, 1991, p. 234),
analogous to procedures used to profile criminal behavior:
The MO of a particular cause is an associated configuration of events, processes, or properties, usually in time sequence, which can often be described as the characteristic causal chain (or certain distinctive features of this chain) connecting the cause with the effect. (Scriven, 1974, p. 71)
two steps.
first: develop a thorough list of potential causes, then narrow it down by determining which potential causes were present prior to the effect.
second: determine which complete MO fits the chain of events and thus determine the true cause.
To ensure accuracy and bias control,
calls in a “goal-free or social process expert consultant to seek undesirable effects” (Scriven, 1974, p. 76).
looks for instances of “co-causation and over determination” and
believes that, ultimately, the evaluator is able to deliver a picture of the causal connections and effects that eliminate causal competitors without introducing evaluator bias.
(1972b) advocates for “goal-free evaluation,”
maintains that by doing so, the evaluator is better able to identify the real accomplishments (and nonaccomplishments) of the program.
essential element of Scriven’s valuing is the determination of a single value judgment of the program’s worth (“good” or “bad”).
How good or how bad is important for comparisons; opportunity cost; etc.
J: Given: a. Logic: Search for Truth; born out of Platonic ideal of forms.
b. Statistics: Search for knowledge, born out of Aristotelian / Humean concept of empiricism and means / frequency.
c. Evaluation: The application of these tools (and any others) along with the identification and weighing of the ends (values) at hand:
and
d. Logic and statistics are developed by humans
e. humans are guided (consciously or not) by the "automatic" (or "natural") identification and weighing of ends (values)
Therefore:
1. Evaluation is the discipline that undergirds Logic and Statistics.
- I could make another argument for why Evaluation is more Alpha (basic) than Epistemology, along similar lines
In requiring the synthesis of multiple-outcome judgments into a single value statement, Scriven is alone among evaluation theorists.
Needs are the presumed cost to society and to individuals and are determined through a needs assessment.
his “conception of needs implies a prescriptive theory of valuing and that he disparages descriptive statements about what people think about the program” (p. 95).
failing to directly reflect the views of stakeholders inhibits the potential use of evaluation findings
his needs assessment is not independent of the views of the evaluator and
Scriven is apparently unconcerned by this, maintaining that determining the “truth” is sufficient.
unique training in philosophy, mathematics, and mathematical logic provides him with the assurance that he can make sound, unbiased judgments.
The more you think you know. . . the less you do . . . because if you expand your knowledge in three dimensions what you don't know will grow, and figuring out what you do know will grow.
(Look at modeling this. . . get data; test people longitudinally (10-15 years) . . . look at bits of knowledge that are not used. . .but then re-discovered?)
Extending his supposition for evaluation as the science of valuing, Scriven (1991, 2001, 2003) reasons that evaluation is a transdiscipline, that is, a discipline that possesses its own unique knowledge base while serving other disciplines as a tool. He maintains that, like logic and statistics, evaluation is a major transdiscipline because all disciplines rely on the evaluation process to judge the value of the entities within their own purview
as evidenced by the peer review publication process.
Scriven’s thinking pushed the field to consider valuing as a central feature of evaluation more than anyone else.
Henry Levin
Cost analyses are a critical domain of evaluation work because they offer information to address what some consider to be the ultimate evaluation question: What is the overall value of the program?
economics-based strategies for determining the value of a program or policy.
cost analyses as a method for informing object value judgments about a program
using a specific methodological approach.
an array of economics-based strategies for determining program costs prior to and during implementation.
Levin (2005; Levin & McEwan, 2001)
project what a program might cost
keep track of the costs of an ongoing program.
determine which program out of many achieves a target outcome most frugally (cost-effectiveness) or which program of many with equal costs produces the greatest outcome (also cost-effectiveness).
cost–benefit: examined relative to its monetary impact (or benefits) on clients or society
An essential part of preparing to undertake a cost study involves thinking about how “costs” are defined, what costs are important, and how best to assess them.
Because costs can be calculated and evaluated differently across strategies, it is important for evaluators and stakeholders to clarify what they need from a cost strategy and to understand how each works.
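A minimal sketch of the cost-effectiveness comparison Levin describes (my illustration; program names and figures are hypothetical): divide each program's cost by its effect so the most frugal route to the target outcome ranks first.

```python
# Sketch: ranking hypothetical programs by cost-effectiveness ratio
# (dollars per unit of effect); lower is more frugal.
programs = {
    # name: (total cost in dollars, effect in outcome units)
    "tutoring":        (120_000, 300),
    "smaller_classes": (450_000, 700),
    "summer_school":   (200_000, 420),
}

for name, (cost, effect) in sorted(programs.items(),
                                   key=lambda kv: kv[1][0] / kv[1][1]):
    print(f"{name:16s} ${cost / effect:,.0f} per unit of effect")
```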
Robert Stake
three manuscripts—“The Countenance of Educational Evaluation” (Stake, 1967), Program Evaluation, Particularly Responsive Evaluation (Stake, 1974), and Case Studies in Science Education (Stake & Easley, 1979)
House (2001b)
the essential components of Stake’s responsive evaluation are
(a) there is no true value to anything (i.e., knowledge is context bound)
(b) stakeholder perspectives are integral elements in evaluations, and
(c) case studies are the best method for representing the beliefs and values of stakeholders and of reporting evaluation results.
he maintains that seeing and judging the evaluand regularly are part of the same act and that the task of evaluation is as much a matter of refining early perceptions of quality as of building a body of evidence to determine the level of quality.
is opposed to stakeholder participation in many evaluation activities and processes and instead asserts that evaluation is the job of the evaluator (Alkin, Hofstetter, & Ai, 1998, p. 98).
it is the evaluator’s job “to hear the [participants’] pleas, to deliberate, sometimes to negotiate, but regularly, non- democratically, to decide what [the participants’] interests are” (p. 104).
Stake (2000)
(1975) cautions that “whatever consensus in values there is [among participants] . . . should be discovered. The evaluator should not create a consensus that does not exist” (pp. 25–26).
“The reader, the client, the people outside need to be in a position to make their own judgments using grounds they have already, plus the new data” (Abma & Stake, 2001, p. 10).
Elliott Eisner
“educational connoisseurship”
Journal of Aesthetic Education (1976)
subsequently expanded
1985, 1991a, 1991b, 1998)
focus is on educational outcomes
negative perceptions of outcomes measured by standardized tests using the principles of psychological testing or by criterion-referenced testing procedures.
(1976) rejection of “technological scientism” includes a rejection of the extensive use of research models employing experimental and quasi-experimental designs, which depend heavily (if not exclusively) on quantitative methods.
Eisner notes that “things that matter” cannot be measured quantitatively.
J: I would say that everything that matters may be able to be measured quantitatively. . . frequency of oxytocin spikes in relationships, for example; lack of other things: it's not everything, but it gives information on what matters, and it is a form of measurement.
“evaluation requires a sophisticated, interpretive map not only to separate what is trivial from what is significant, but also to understand the meaning of what is known” (Eisner, 1994, p. 193).
Eisner uses the role of critics in the arts as an analogy for an alternative conception of evaluation.
twin notions of connoisseurship and criticism.
connoisseur
have the ability to differentiate subtleties,
have knowledge about what one sees
“a connoisseur is someone who has worked at the business of learning how to see, to hear, to read the image or text and who, as a result, can experience more of the work’s qualities than most of us” (p. 174).
be aware of and understand the experience
(1991b)
is making the experience public through some form of representation.
Criticism
three aspects of criticism
First
critical description, in which the evaluator draws on his or her senses to describe events, reactions, interactions, and everything else that is seen.
portrays a picture of the program situation, frequently imagining himself or herself as a participant and drawing on the senses to describe the feeling in the participant’s terms.
second
expectation
understand or make sense of what was seen.
Eisner (1991b): “The essence of perception is its selectivity; the connoisseur is as unlikely to describe everything in sight as a gourmet chef is to use everything in his pantry. The selective process is influenced by the value one brings to the classroom. What the observer cares about, she is likely to look for . . . Making value judgments about the educational import of what has been seen and rendered is one of the critical features of educational criticism” (p. 176).
Ernest House denounces the utilitarian framework.
House (1991): “Utilitarianism is a moral theory which holds that policies are morally right when they promote the greatest sum total of good or happiness from among the alternatives” (p. 235).
He deplores the lack of value neutrality in stakeholder approaches, which he says results from the general lack of full inclusion of the represented interests of the poor and powerless in stakeholder groups (pp. 239–240).
House (1991, 1993) argues that evaluation is never value-neutral and should tilt in the direction of social justice by specifically addressing the needs and interests of the powerless.
He comes to these views by drawing on Rawls’s (1971) justice theory.
“Ethical fallacies” in evaluation: clientism (taking the client’s interest as the ultimate consideration), contractualism (adhering inflexibly to the contract), managerialism (placing the interest of the managers above all else), methodologicalism (believing that proper methodology solves all ethical problems), pluralism/elitism (including only the powerful stakeholders’ interests in the evaluation), and relativism (taking all viewpoints as having equal merit).
House & Howe (1999): Inclusion, Dialogue, Deliberation.
Fact and value exist on a continuum where a middle ground exists between “brute fact” and “bare values” (House & Howe, 1999, p. 6).
The deliberative democratic evaluation process is described as follows: “We can imagine moving along the value–fact continuum from statements of preferences and values collected through initial dialogue, through deliberations based on democratic principles, to evaluative statements of fact” (House & Howe, 1999, p. 100).
House is often cited as the theorist who first acknowledged the ways in which evaluation affects power and social structures and described how it can be used to either shift or maintain existing repressive structures.
J: Why is eval so hard to “pin down,” and why are people so adamant about their positions? I believe it is because it IS the alpha discipline: People realize that how we define this has tremendous influence on what happens in the world; people’s VALUES come into play. Evaluation is, logically, at the core of religion and politics (i.e., evaluating whether something is in accordance with the religion or not, even in those religions where evaluating the religion in comparison to other things is frowned upon).
Jennifer C. Greene believes evaluation should be used to determine value by developing a consensus around a set of criteria used to determine the value of a program.
Deliberative democratic evaluation: inclusion, dialogue, and deliberation.
Greene stresses stakeholder involvement and emphasizes the use of mixed methods designs and fieldwork.
Her (2005) approach emphasizes responsiveness to the particularities of the context.
Three “justifications” for including stakeholder views: pragmatic, emancipatory, and deliberative.
The pragmatic justification argues for stakeholder inclusion because it increases the chance of evaluation utilization and organizational learning.
The emancipatory justification focuses on the importance of acknowledging the skills and contributions of stakeholders and empowering them to be their own social change agents.
The deliberative justification argues that evaluation should serve to ensure that program or policy conversations include all relevant interests and are “based on the democratic principles of fairness and equity and on democratic discourse that is dialogic and deliberative” (Greene, 2000, p. 14).
Egon Guba and Yvonna Lincoln view stakeholders as the primary individuals involved in placing value.
Their approach rests on the belief that there are multiple realities, based on the perceptions and interpretations of the individuals involved in the program to be evaluated.
Thus the role of the evaluator is to facilitate negotiations between individuals reflecting these multiple realities.
In Fourth Generation Evaluation (1989), grounded in the constructivist paradigm, the role of the constructivist investigator is to tease out these constructions and “to bring them into conjunction . . . with one another and with whatever information . . . can be brought to bear on the issues involved” (p. 142).
KW - inter-cultural evaluation
Donna Mertens’s (1999, 2009) inclusive approach is unique in its emphasis on diversity and the inclusion of diverse groups.
She is best known for her inclusive/transformative model of evaluation, which rests on four philosophical assumptions.
The evaluator’s primary role is to include marginalized groups, not to act as decision maker.
In her (2009) model, the evaluator advocates for the inclusion of marginalized groups but does not advocate for the marginalized groups.
Mertens suggests that evaluators ask the following questions at the planning stages:
J: What makes this list is already a value judgment.
• Are we including people from both genders and with diverse abilities, ages, classes, cultures, ethnicities, families, incomes, languages, locations, races, and sexualities?
• What barriers exclude a diversity of people?
• Have we chosen the appropriate data collection strategies for diverse groups, including providing for preferred modes of communication?
Use
Use theories are often referred to as “decision-oriented theories.”
The use branch focuses primarily on the program at hand—this program at this time.
Evaluation influence refers to the capacity of evaluation processes, products, or findings to indirectly produce a change in understanding or knowledge either at the evaluation site at a future time or at other sites (Alkin & Taut, 2003; Christie, 2007; Kirkhart, 2000; Mark & Henry, 2004).
Rather than drawing from this broad definition of use, the use branch as we envision it depicts the work of theorists concerned with direct program site use (in action or understanding) that results from a particular evaluation study.
theorists presented on the use branch aim to promote the kind of actionable use that is within the purview of the evaluator.
Daniel Stufflebeam’s CIPP is an acronym for four types of evaluation: context, input, process, and product.
Context evaluation involves identifying needs to decide on program objectives.
Input evaluation leads to decisions about strategies and designs.
Process evaluation consists of identifying shortcomings in a current program to refine implementation.
Product evaluation measures outcomes for decisions regarding the continuation or refocus of the program.
His key strategy is a cyclical process: work with a carefully designed evaluation while maintaining flexibility.
Stufflebeam (1983): view design as a process, not a product; continually improve; a continual information stream to aid decision makers in allocating resources to programs that best serve clients.
The Program Evaluation Standards (Joint Committee on Standards for Educational Evaluation, 1994).
four domains of practice: utility, feasibility, propriety, and accuracy.
Utility standards ensure that an evaluation will serve the information needs of intended users; feasibility standards ensure that an evaluation will be realistic, prudent, diplomatic, and frugal; propriety standards ensure that an evaluation will be conducted legally, ethically, and with due respect for the welfare of those involved in the evaluation as well as of those affected by its results; and accuracy standards ensure that an evaluation will reveal and convey technically adequate information about the features that determine the worth or merit of the program being evaluated.
a “representative stakeholder panel to help define the evaluation questions, shape evaluation plans, review draft reports and disseminate the findings” (p. 57).
Stufflebeam (2003) engages stakeholders (usually in decision-making positions) in focusing the evaluation and in making sure that it addresses their most important questions; provides timely, relevant information to assist decision making; and produces an accountability record.
Stufflebeam (2001): formative and summative information become available to a panel of stakeholders.
Joseph Wholey: academic training plus long-standing participation in federal government programs; focus on managers and policymakers.
The three stages in the “sequential purchase of information” strategy are (1) rapid-feedback evaluation, which focuses primarily on extant and easily collected information; (2) performance (or outcome) monitoring, which measures program performance, usually in comparison with prior or expected performance; and (3) intensive evaluation, which uses comparison or control groups to better estimate the effectiveness of program activities in causing observed results.
Eleanor Chelimsky: “Telling the truth to people who may not want to hear it is . . . the chief purpose of evaluation” (Chelimsky, 1995).
Known for establishing and directing the evaluation unit of the Government Accountability Office (GAO), the largest independent internal evaluation unit in existence (www.gao.gov).
Chelimsky (2006): evaluation of public policies, programs, and practices is fundamental to a democratic government for four reasons: (1) to support congressional oversight; (2) to build a stronger knowledge base for policy making; (3) to help agencies develop improved capabilities for policy and program planning, implementation, and analysis of results, as well as learning-oriented direction in their practice; (4) to strengthen public information about government activities through dissemination of evaluation findings. (p. 33)
In essence, evaluation should generate information for conceptual and enlightenment use, for organizational change and development, and for formative program improvements (i.e., actionable, instrumental use).
Under her direction, GAO devised a wide variety of methods to suit different question types (Chelimsky, 1997).
Marvin Alkin: similarities to Stufflebeam’s CIPP model, though the primary distinction was Alkin’s recognition that process and product have both summative and formative dimensions.
One could look at process summatively (through program documentation) or at product formatively (through outcomes).
Alkin rejects the dominant role of evaluators as valuing agents.
He prefers to work with primary users at the outset of the evaluation process to establish value systems for judging potential outcome data.
In interactive sessions, he presents a variety of simulated potential outcomes and seeks judgments (values) on the implications of each.
Like McKinsey & KW
Michael Patton’s utilization-focused evaluation (UFE) is the most prominent theoretical explication of the utilization (or use) extension (1978, 1986, 1997, 2008).
Four major phases of UFE: (1) the development of users’ commitment to the intended focus of the evaluation and to evaluation utilization; (2) involvement in methods, design, and measurement; (3) user engagement—actively and directly interpreting findings and making judgments; and (4) making decisions about further dissemination.
Patton (2002) urges that the evaluator be “active—reactive—interactive—adaptive”: active in identifying intended users and focusing questions, reactive in continuing to learn about the evaluative situation, and adaptive “in altering the evaluation questions and designs in light of their increased understanding of the situation and changing conditions” (p. 432).
Introduction of the term developmental evaluation (Patton, 2010): the evaluator becomes part of a program’s design team or management team.
Like KW
David Fetterman - Empowerment Evaluation; Teach
Books on empowerment evaluation (1996, 2001): a process that encourages self-determination among recipients of the program evaluation, often including “training, facilitation, advocacy, illumination and liberation.”
The goal of empowerment evaluation is to foster self-determination rather than dependency.
The outside evaluator often serves as a coach or additional facilitator, providing clients with the knowledge and tools for continuous self-assessment and accountability.
Fetterman argues that training participants to evaluate their own programs and coaching them through the design of their evaluations is an effective form of empowerment.
Fetterman (1994) describes two subtly different forms of empowerment evaluation.
In the first, evaluators teach program participants to conduct their own program evaluations; the goal is to build evaluation capacity.
In the second, the evaluator’s primary work is to serve as a coach who facilitates others in conducting their own evaluations, with the goal of change.
Fetterman sees all empowerment evaluators as having the potential to serve as “illuminators.”
For Fetterman (1998), the end point of evaluation is not the assessment of the program’s worth; evaluation is an ongoing process, and value and worth are not static.
“Through the internalization and institutionalization of self-evaluation processes and practices, a dynamic and responsive approach to evaluation can be developed to accommodate shifts in populations, goals, value assessments and external forces” (p. 382).
Participatory and empowerment evaluation employ similar practices, but their goals are different.
J. Bradley Cousins
Cousins’s participatory evaluation (Cousins & Earl, 1992; Cousins & Whitmore, 1998)
If we care about utilization, then the way to achieve it is through buy-in; the way to achieve buy-in is to have program personnel participating in the evaluation.
His evaluations are designed for structured, continued, and active participation of these users, as opposed to Patton’s user participation, which could take on a variety of different forms.
Utilization takes place within the context of an organization and is best accomplished as a part of organizational development. Cousins calls this “practical participatory evaluation” (Cousins & Earl, 1995).
He defines practical participatory evaluation as “applied social research that involves trained evaluation personnel and practice-based decision makers working in partnership” (Cousins & Earl, 1995, p. 8).
It is best suited for evaluation projects that “seek to understand programs with the expressed intention of informing and improving their implementation” (Cousins & Earl, 1995).
Evaluation as an organizational learning system (Cousins, Goh, & Clark, 2005).
Hallie Preskill - Transformational Learning & Appreciative Inquiry
Preskill is concerned with creating transformational learning within an organization through the evaluation process.
Transformational learning (2000): a process where individuals, teams, and even organizations identify, examine, and understand the information needed to meet their goals.
Evaluators should (a) use a clinical approach, (b) span traditional boundaries between evaluator and program staff, and (c) diagnose the organizational capacity for learning.
The approach “is inherently responsive to the needs of an organization and its members” (Preskill & Torres, 2000, p. 31).
Finally, an evaluator needs the ability to diagnose an organization’s capacity for learning (Preskill & Torres, 1998).
Appreciative inquiry (AI): a process that builds on past successes (and peak experiences) in an effort to design and implement future actions.
The philosophy underlying AI is that when evaluators look for problems, more problems are found, and when deficit-based language is used, stakeholders often feel hopeless, powerless, and generally more exhausted.
By remembering topics of study that created excitement and energy, by reflecting on what has worked well, and by using affirmative and strengths-based language, participants’ creativity, passion, and excitement about the future are increased.
Preskill (2004) takes the philosophy and principles put forth by organizational change AI theorists and applies them to the evaluation context, maintaining that they increase the use of the evaluation processes and findings.
Jean King: development of participatory evaluation models.
She prefers working long term with organizations to develop joint understandings and, over time, creating structures that will continue to build evaluation capacity (Volkov & King, 2007).
Interactive evaluation practice (IEP): for fostering participation and obtaining use.
King defines IEP as “the intentional act of engaging people in making decisions, taking action, and reflecting while conducting an evaluation study” (King & Stevahn, 2007).
She defines evaluation as “a process of systematic inquiry designed to provide sound information about the characteristics, activities, or outcomes of a program or policy for a valued purpose” (King & Stevahn, 2007).
King is concerned about identifying and fostering leaders during the evaluation process, needed to “attract or recruit people to the evaluation process, who are eager to learn and facilitate the process . . . and who are willing to stay the course when things go wrong” (King, 1998, p. 64).
Trust building is a fundamental requirement for successful participatory evaluation; pay close attention to the interpersonal dynamics that occur (King & Stevahn, 2007).
The roles suggested by King acknowledge the importance of the interpersonal factor.
A FINAL NOTE
Two main challenges in this chapter: First, we needed to determine which theorists to include on the tree. Second, we needed to make specific placements on particular branches of the tree.
J: If you made your tree mutually exclusive and collectively exhaustive (Ethan Rasiel, 1999, 'The McKinsey Way', p. 6), you wouldn't have had this problem.
“The theory tree is posited on the view that ultimately one or another of the three dimensions, depicted as branches, is of the highest priority for each theorist.” – J: The next bunch of notes are all highlights of the criterion words that Alkin uses.
Criterion words: first; issues; emphasis on; purpose of an evaluation; main concern; utilization; principal focus of an evaluation; primary motivations; primary methodology; valuing outcomes; primary focus? Or, process use?
We believe that, as Fetterman describes it, the act of empowering focuses on the process of engaging in evaluation; in the language of evaluation utilization, empowerment evaluation involves instrumental process use.
Thus, while noting a deep concern for social justice and a strong preference for (and early evaluation roots in) anthropological/ethnographic methods, we were led to place Fetterman on the utilization branch.
The determination of a theorist’s placement on a branch of the evaluation theory tree was not always this difficult, but it always required similarly careful consideration and analysis of trade-offs.
J: Which means your tree is not a very good way to categorize things, unless you couldn't find anything better!
With our more restrictive focus on North American theorists, we also deleted Barry MacDonald and John Owen from this version of the tree because both reside outside North America (Great Britain and Australia, respectively) and their writings relate to work in these countries.
We also removed Eisner from the tree, because a primary argument of Eisner’s is centered on the importance of evaluators having domain-specific knowledge and expertise—and an argument around this issue still exists today.
Finally, while the ideas of Thomas Owen, Robert Wolf, and Malcolm Provus were innovative at the time, there is little evidence to suggest that their theoretical work has persisted in influencing the field today, and so we also removed these theorists from the current version of the tree.
J: It's all about influence! It's sickening!
Our field contrasts the so-called prescriptive theory of evaluation practitioners with more traditional forms of social science theory, labeled descriptive theory (Alkin & House, 1992).