Evaluation Theories/Pre-Class Notes Wk. 2
Type classification: this is a notes resource.
Completion status: Been started, but most of the work is still to be done.
Pre-Class Notes Wk 2; Alkin Intro & Tree; Shadish on Scriven (Valuing) & Campbell (Experimenting Society)
p. 3 - 13:29:40 (15 min)
p. 4
Stakeholder participation is “essential to derive the maximum values from a program evaluation” (Alkin, 2013, p. 4)
“two general types of models: (a) a prescriptive model, the most common type, is a set of rules, prescriptions, prohibitions, and guiding frameworks that specify what a good or proper evaluation is and how evaluation should be done— such models serve as exemplars—and (b) a descriptive model is a set of statements and generalizations that describes, predicts, or explains evaluation activities— such a model is designed to offer an empirical theory.” – linked with “research on evaluation” (See Henry & Mark, 2003) (Alkin, 2012, p.
“When (and if) we do [develop descriptive models], then the descriptive models would define what is to be appropriately “prescribed.” -- “Until then, however, we must rely on the prescriptive models generated by knowledgeable members of the evaluation community to guide practice.”
p. 4 - 13:45:00 (15 min)
J: Three Stages in Developing Theory for a New Field (And how Alkin’s Categories Fit) Stage 1: categorize practices, Stage 2: put forward theories for why they work and what ends they strive towards, Stage 3: test those theories.
p. 5
Methodology for Choosing Theorists:
4 Categories: Methodologists; Evaluation Issue Analysts; Interpreters & Teachers; Theorists
Methodologists (J: Methodology needs a "for what" - so what is Campbell's "For What"?): contributed very substantially to the basic research methodology that forms the essential foundation for much of the work in evaluation. Noted: Donald Campbell, Julian Stanley, Thomas Cook, Matthew Miles and Michael Huberman, Robert Yin, and Anthony Bryk. Chosen: Donald Campbell
Evaluation Issue Analysts: substantially assisted in the understanding of various aspects of evaluation. Lois-Ellin Datta, Stewart Donaldson, Karen Kirkhart, Jonathan Morell, Michael Morris, Thomas Schwandt, William Shadish, Nick Smith
Evaluation Interpreters and Teachers: teaching about evaluation and helping to interpret its nuances. Michael Bamberger, Linda Mabry, and Jim Rugh (2011), Jody Fitzpatrick, James Sanders, and Blaine Worthen (2011), Rita O’Sullivan (2004), Emil Posavac and Raymond Carey (2007), and Liliana Rodriguez-Campos (2005). (J: Some of these might be “theorists”)
Theorists: definitively associated with a particular theoretical position on evaluation; restricted my consideration of evaluators to those who speak about the field generically
p. 6: “those who write only about evaluating health programs or education or social welfare have not been included.
p. 6: those who do not consider themselves as evaluators but exclusively assign another disciplinary designation to their name are also not generally included.
p. 5 - 14:00:16 (15 minutes)
(J: Analysis: I do not like the bit about being “associated with a particular theoretical position on evaluation” – That may be one measure, but any other measure should include Shadish’s definition of theory: Something that can be developed separately from what others know: “Theory connotes a body of knowledge that organizes, categorizes, describes, predicts, explains, and otherwise aids in understanding and controlling a topic." –Shadish (1991) 'Foundations of Program Evaluation', p. 30. These should be among the criteria as well, so that Alkin as an expert himself can use logic as well as surveys of current literature to develop “who is a theorist” - There may be people who aren’t on anyone’s radar within the field who nonetheless have something crucial to say to tie together the field. )
p. 6
Category Systems:
attempts to look at the ways in which various theoretical perspectives relate to each other.
identify a limited set of characteristics for grouping theories.
Entries within a category are deemed to belong together in the sense that they can be judged to be similar with respect to the ... configuration of characteristics that define that category.
Early Category Systems: evaluation category systems were those provided by Worthen and Sanders (1973) and Popham (1975). Subsequently, category systems have been developed by House (1978), Glass and Ellett (1980), Alkin and Ellett (1985), Williams (1988), Shadish, Cook, and Leviton (1991), Alkin and House (1992),
J: If I study this now, I will develop biases that I didn’t even know I had. I believe that a certain subset of scholars should develop a qualitative methodology for exploring a field, and then use that methodology on that field – and then look at whether their methodology makes sense.
p. 6 - 15:44:06
p. 7 - 15:57:45 “These category systems failed to portray the historically derived relationships between theories” “failed to show which theoretical formulations provided the intellectual stimulation for new theories” J: But historical derivation is often at odds with logical simplification of the complexity: historical derivation builds up cruft that must be removed in order to arrive at the optimal system.
p.20:
Suchman’s identification of five categories of evaluation:
(1) effort (the quantity and quality of activity that takes place)
(2) performance (effect criteria that measure the results of effort)
(3) adequacy of performance (the degree to which performance is adequate for the total amount of need)
(4) efficiency (examination of alternative paths or methods in terms of human and monetary costs)
(5) process (how and why a program works or does not work)
p.24 Cronbach’s UTOS: Units (populations), Treatments, Observations (outcomes) Settings
Cronbach (1982) coins a set of symbols to define the domains of evaluation: units (populations), treatments, observations (outcomes), and settings.
p. 41
CIPP
an acronym for four types of evaluation: context, input, process, and product.
Context evaluation involves identifying needs to decide on program objectives.
Input evaluation leads to decisions about strategies and designs.
Process evaluation consists of identifying shortcomings in a current program to refine implementation.
Product evaluation measures outcomes for decisions regarding the continuation or refocus of the program.
V.2
Anne Vo has actively participated in and contributed to our theoretical discussions. Tarek Azzam provided substantial assistance on the tree as well.
Marilyn, for putting up with my obsession with “growing” a tree. Marvin Alkin
“New Ideas.” I wondered about it and after many months had the courage to comment on it. Erick Lindman gently smiled and turned the clothespin over; on the other side was the inscription “Old Ideas Still Good.”
Highlighting Guide:
read the works of Ralph Tyler in connection with the evaluation conducted on the famous “Eight-Year Study” of progressive education. In Tyler (1942), I found many concepts that I recognized to be the basis of contemporary approaches to evaluation.
"works of Ralph Tyler in connection with the evaluation conducted on the famous “Eight-Year Study” of progressive education."
Uncategorized
Yellow highlights: general information
How-To
Orange Highlights: Important "HOW-TO" information (Also coded "Purple" In Apple Preview)
Reference
Purple Highlights = Citations
is to validate the hypothesis upon which the education institution operates. (p. 492) (Theory-Based Evaluation? See Fitz-Gibbon & Morris, 1975; Chen, 1990)
validate the hypothesis upon which the education institution operates.
important purpose of evaluation that is frequently not recognized
Topic of Evaluation
Green - Topic
One purpose of evaluation is to make a periodic check on the effectiveness of the educational institution,
check on the effectiveness of the educational institution
thus to indicate the points at which improvements in the program are necessary.
indicate the points at which improvements in the program are necessary.
(Formative Evaluation? See Scriven, 1967)
formation and classification of objectives
formation
formation and classification
obtain evidence about the progress
“Old Ideas Revisited and Enhanced—But Still Good.”
While it is conventionally used in evaluation literature, in some ways, it would be more appropriate to use the term approaches or models.
two general types of models: (a) a prescriptive model, the most common type, is a set of rules, prescriptions, prohibitions, and guiding frameworks that specify what a good or proper evaluation is and how evaluation should be done—such models serve as exemplars—and (b) a descriptive model is a set of statements and generalizations that describes, predicts, or explains evaluation activities—such a model is designed to offer an empirical theory.
When (and if) we do, then the descriptive models would define what is to be appropriately “prescribed.”
Until then, however, we must rely on the prescriptive models generated by knowledgeable members of the evaluation community to guide practice.
Some have spent more time systemizing their standards, criteria, and principles.
systemizing their standards, criteria, and principles.
None of the approaches is predictive or offers an empirical theory. That is, these “theories” have not been validated by empirical research.
A few have tried to defend or justify their prescriptions.
defend or justify
prescriptions
these “theories” have not been validated by empirical research.
we refer to those who have developed evaluation approaches and models as “theorists.”
we identify theories by the name of the theorist prominently associated with it.
contributed very substantially to the basic research methodology that forms the essential foundation for much of the work in evaluation.
basic research methodology
forms the essential foundation for much of the work in evaluation.
Donald Campbell, Julian Stanley, Thomas Cook, Matthew Miles and Michael Huberman, Robert Yin, and Anthony Bryk.
Of these individuals, we have written only about Donald Campbell in the discussion of theorists because of the unique impact of his methodological contributions.
methodologists
evaluation issue analysts
substantially assisted in the understanding of various aspects of evaluation.
Lois-Ellin Datta, Stewart Donaldson, Karen Kirkhart, Jonathan Morell, Michael Morris, Thomas Schwandt, William Shadish, Nick Smith,
evaluation interpreters and teachers.
teaching about evaluation and helping to interpret its nuances.
Michael Bamberger, Linda Mabry, and Jim Rugh (2011), Jody Fitzpatrick, James Sanders, and Blaine Worthen (2011), Rita O’Sullivan (2004), Emil Posavac and Raymond Carey (2007), and Liliana Rodriguez-Campos (2005).
definitively associated with a particular theoretical position on evaluation
evaluation theorists
many in this category may not have presented a full theoretical exposition
proposed a particular evaluation orientation
restricted my consideration of evaluators to those who speak about the field generically
those who write only about evaluating health programs or education or social welfare have not been included.
those who do not consider themselves as evaluators but exclusively assign another disciplinary designation to their name are also not generally included.
attempts to look at the ways in which various theoretic perspectives relate to each other.
Earlier efforts have taken the form of category (or classification) systems. These simplified structures provided a way to identify a limited set of characteristics for grouping theories.
identify a limited set of characteristics for grouping theories.
Entries within a category are deemed to belong together in the sense that
they can be judged to be similar with respect to the
configuration of characteristics that define that category.
in making this judgment, the categorizer is selecting from the many aspects of the approach only those that are considered most essential.
this is similar to an artist’s creation of a caricature, portraying someone or something by focusing on (even overemphasizing) its most prominent features
evaluation category systems were those provided by Worthen and Sanders (1973) and Popham (1975). Subsequently, category systems have been developed by House (1978), Glass and Ellett (1980), Alkin and Ellett (1985), Williams (1988), Shadish, Cook, and Leviton (1991), Alkin and House (1992),
Category systems are of great value.
Category systems also were an aid to theorists in understanding perceived relationships with other theorists
theorists’ views are not fixed in time,
published work often lags behind these changes
one’s views as perceived by others (whether or not they are still held) have influenced theorists.
Whatever the explanations for a perceived portrayal of a theorist’s views, the perceptions provided by category systems may force theorists to reconsider their views and perhaps modify them.
While earlier category systems prior to the first edition of Evaluation Roots served evaluation well, they suffered from several deficiencies
These category systems failed to portray the historically derived relationships between theories
Historically derived relationships
failed to show which theoretical formulations provided the intellectual stimulation for new theories
prescriptive theories must consider (a) the issues related to the methodology being used, (b) the manner in which data are to be judged or valued, and (c) the user focus of the evaluation effort.
evaluation theory “tree”
each of the theorists is presented on the tree on a branch that we believe represents his or her main emphasis among these three.
use, methods, judgment/valuing.
not one based on exclusivity
It might then be possible to ask this question: When evaluators must make concessions, what do they most easily give up and what do they most tenaciously defend (Alkin & Ellet, 1985)?
the category system is based on the relative emphasis within the various models
Carol Weiss, for example, indicated that she was satisfied with her placement but felt that, to some extent, she belonged on the “use” branch as well. David Fetterman, likewise, agreed with his placement but felt that it did not adequately represent his interest in “social justice.” Jennifer Greene commented that Lee Cronbach is not fundamentally concerned about methods but that his placement on the “methods” branch was probably as good a representation as possible.
Testimonies from evaluation theorists about their placement
we capture only one dimension of the influences on their theoretical perspectives.
analyzed the evaluator influences from other branches of the tree based on the names identified by the theorists in their respective chapters.
Ernest House’s The Logic of Evaluative Argument (1977), which clearly shows the influence of early evaluation theorists such as Michael Scriven and Robert Stake but which also relies heavily on the work of two Belgian philosophers, Perelman and Olbrechts-Tyteca (1969) and on the work of Weizenbaum (1976).
more authors
Eleanor Chelimsky, Jennifer Greene, Henry Levin, Melvin Mark and Gary Henry, and Donna Mertens
Eleanor Chelimsky had created an impressive model for evaluating national programs in her work at the GAO.
Henry Levin provided an evaluative dimension not otherwise represented—cost-effectiveness evaluation.
social accountability, systematic social inquiry, and epistemology
systematic social inquiry
three “roots”
social accountability
presents an important motivation for evaluation,
systematic social inquiry—emanates from a concern for employing a methodical and justifiable set of procedures for determining accountability
"methodical and justifiable set of procedures for determining accountability" - Is this the best methodological base to work from? Why?
epistemology, is the area of philosophy that deals with the nature and validity (or limitations) of knowledge
Key evaluation concerns that are based in epistemological arguments include the legitimacy of value claims, the nature of universal claims, and the view that truth (or fact) is what we make it to be.
is the branch of the tree in which evaluation is primarily guided by research methodology.
social inquiry root.
concerned with obtaining the most rigorous knowledge possible given the contextual constraints,
“knowledge construction”
Shadish, Cook, and Leviton (1991).
we recognize that it is more accurate to describe these approaches as emphasizing research methodology, specifically the techniques used in the conduct of evaluation studies, rather than just the methods used to conduct such studies. However, because we called this the methods branch in the first version of the tree, we have chosen to continue with this label.
valuing
branch is split in two—objectivist and subjectivist
objectivists argue that value judgments should be based on “publicly observable” facts.
objectivist
is compatible with the postpositivist philosophical ideas that by and large inform methods theorists’ work.
use
originally focused on an orientation toward evaluation and decision making.
not meant to be viewed as independent from one another
reflect a relational quality between them
the work of those theorists who are placed on the right side of the methods branch, leaning toward the valuing branch, reflects a secondary importance placed on valuing
the left side of the valuing branch, closest to the methods branch, are primarily objectivist valuing theorists, with a secondary concern for methods
far right side of the tree—the subjectivist arm of the valuing branch—we find theorists who reflect the relationship between valuing and use
concerned with individual stakeholders’ actionable use of findings
the far left of the use branch reflects theorists whose primary concern is use but who have a secondary concern for valuing—in other words, a secondary concern for social justice and empowerment.
we populated the tree with the theorists who are most commonly associated with or who have made the initial and most notable and substantial contributions to each of the particular approaches represented.
Chen
synthesized and advanced these ideas
FOUNDATIONAL ROOTS
Social Accountability
important role that accountability plays in evaluation
several dimensions to accountability
reporting
in which only description is provided
justifying analysis
explanation
where a justifying analysis recognizes deficiencies
true accountability requires “answerability
that is, those responsible must be held accountable. This phase of accountability is not reflected in evaluation; evaluation simply provides the information for “being answerable.”
Alkin (1972a) defines three types of accountability:
(1) goal accountability
(2) process accountability
(3) outcome accountability
Goal accountability: examines whether reasonable and appropriate goals have been established
Process accountability: whether reasonable and appropriate procedures for accomplishing those goals have been established and implemented
Outcome accountability: the extent to which established goals have been achieved
program accountability is prominent in the “process” section of Daniel Stufflebeam’s CIPP (an acronym for four types of evaluation: context, input, process, and product)
CIPP
context
input
process
product
four types of evaluation
Today, most evaluations have a strong focus on goal accountability, with an eye toward the improvement of institutional performance
U.S. Government Accountability Office (GAO) typifies how accountability is often viewed in contemporary North American evaluation practice
situates and legitimizes evaluation as a fundamental process for generating systematic information for decision making
Social accountability
Social Inquiry
can be characterized as the systemic study of the behavior of groups of individuals in various social settings by a variety of methods
Social inquiry
recognition that human action has a unique social dimension rather than simply a natural or psychological dimension
central overriding question is “Why do people in social groups act as they do?”
17th- and 18th-century
Hobbes, Montesquieu, and Rousseau
mid- and late 19th century, as demonstrated in the works of Karl Marx, Emile Durkheim, and Max Weber, for instance, that society and social groups began to be studied empirically through the collection and analysis of empirical data on social groups.
Which methods are appropriate for the study of society, social groups, and social life is a perennial question in social inquiry
discipline of anthropology
qualitative studies of the social world.
distinction
is sometimes couched in terms of the distinction between explanation and prediction, on the one hand, and interpretation and understanding, on the other.
explanation and prediction
interpretation and understanding
Clifford Geertz’s classical essay “Thick Description: Toward an Interpretive Theory of Culture” in The Interpretation of Cultures (1973)
Cutting across social science disciplines are broad philosophical and methodological questions
Important questions include the following:
Should social scientists have a moral stance toward the individuals and groups that they study?
Is this stance appropriate, and would it compromise the researchers’ objectivity?
What is the relationship between theory and observation?
Epistemology
paradigm
a worldview or perspective that, in the case of research and evaluation, includes conceptions of methodology, purposes, assumptions, and values.
typically consists of an ontology (the nature of reality), an epistemology (what is knowable and who can know it), and a methodology (how one can obtain knowledge).
three broad areas of thinking: (1) postpositivism, (2) constructivism (and related thinking), and (3) pragmatism
basic axioms of these paradigms offer a broader framework for understanding the theoretical influences on evaluation theorists’ work (Note: portions of this section of the chapter are reprinted from What Counts as Credible Evidence in Applied Research and Evaluation Practice? by Stewart I. Donaldson, Christina A. Christie, and Melvin M. Mark, 2009, SAGE Publications, Inc.)
What Axioms?!
some evaluation perspectives are shaped more exactly by a philosophical theory, while in others, only a theory’s undercurrent can be detected
I think I'm a post-positivist on this scale:)
Views of science shifted during the 20th century away from positivism toward postpositivism.
postpositivists believe its goal is to attempt to measure truth, even though that goal cannot be attained because all observation is fallible and has error.
positivists believe that the goal of science is to uncover the truth, postpositivists believe its goal is to attempt to measure truth, even though that goal cannot be attained because all observation is fallible and has error.
positivists believe that the goal of science is to uncover the truth
This type of realism is referred to as “critical realism.”
critical realism
critical realism
causation is observable and that over time predictors can be established; however, some degree of doubt associated with the conclusion will always exist.
we have to measure how much they can be controlled.
Values and biases are noted and accounted for, yet the belief is that they can be controlled within the context of scientific inquiry.
there is no one reality, rather that several realities emerge from one’s subjective belief system.
Constructivism is one element of interpretivism
Constructivism
Constructivists
“new knowledge and discovery can only be understood through a person’s unique and particular experiences, beliefs, and understandings of the world.”
These multiple realities are believed to be subjective and will vary based on the “knower.”
the “knower” and the “known” are interrelated, and to determine what is “known,” the knower constructs a reality that is based on and grounded in context and experience.
inquiry is considered value-bound, not value-free, and therefore, bias should be acknowledged rather than attempting to position the inquiry process so as to control it.
No - impact is the greatest priority: generalizability has a little impact for a lot of people; local relevance has a big impact for a few people. We want both.
All things influence all things; but cause and effect can come from constant conjunction (juxtaposition) and a theory for why that is so (however, theories may change)
Inductive logic drives the inquiry process, which means that particular instances are used to infer broader, more general principles.
Local relevance, then, outweighs and is of much greater priority than generalizability.
Cause and effect are thought to be impossible to distinguish because relationships are interde- pendent, and so simultaneously, all things have influence on all things.
there are no absolute truths, only relative truths
constructivist and relativist philosophy
Stake, Guba, and Lincoln.
No: there may well be absolute truths, but we can only KNOW relative truths, because of our imperfect information.
argue that deductive and inductive logic should be used in concert.
Pragmatists embrace objectivity and subjectivity as two positions on a continuum
move away from embracing equally the axioms of the postpositivist and constructivist paradigms.
pragmatists are more similar to postpositivists with regard to notions about external reality, with the understanding that there is no absolute “truth” concerning reality.
in line with constructivist thought, however, pragmatists argue that there are multiple explanations of reality and that at any given time there is one explanation that makes the most sense.
a single explanation of reality may be considered “truer” than another.
believe that causes may be linked to effects.
Which is, of course, all I'm talking about when I say I'm a constructivist
However, they temper this thinking with the caveat that absolute certainty of causation is impossible.
they do not believe inquiry is value-free; rather, they consider their values important to the inquiry process
pragmatist paradigm
seems to influence the thinking of those on the use branch, particularly those who have an interest in promoting instrumental use of evaluation findings, such as Patton.
positivist and postpositivist research methodologies dominated the conduct of studies.
In the beginning, there was research.
Donald Campbell
best known for his pathbreaking work on the elimination of bias in the conduct of research in field settings.
papers on experimental and quasi-experimental designs for research (Campbell, 1957; Campbell & Stanley, 1966).
“rule out many threats precluding causal inference” (Shadish et al., 1991, p. 122).
impact on social science research
Campbell and Stanley’s Experimental and Quasi-Experimental Designs for Research (1966).
First: conditions necessary to conduct a true experimental study, where randomization is the hallmark.
Second: the degree to which an experiment is properly controlled, as internal validity; the degree of applicability of the results of an experiment, as external validity.
experiments are not perfect and that they should not, and cannot, be used in a great many situations
Third: quasi-experimental designs were developed to deal with the messy world of field research.
The ideas put forth in Campbell and Stanley’s manuscript are now the foundation of almost all social science research methods courses.
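A tiny simulation may help fix the idea (my sketch, not from Campbell & Stanley; the scenario, variable names, and numbers are hypothetical): random assignment breaks the link between treatment and an unobserved confounder, so a simple difference in means recovers the true effect, while self-selection biases it.

```python
# Sketch: why randomization is the hallmark of a true experiment.
# Hypothetical scenario; illustrates the logic only.
import numpy as np

rng = np.random.default_rng(42)
n = 10_000
motivation = rng.normal(0, 1, n)  # unobserved confounder

# Quasi-experimental selection: motivated people opt into treatment.
self_selected = (motivation + rng.normal(0, 1, n)) > 0
# True experiment: assignment is random, independent of motivation.
randomized = rng.integers(0, 2, n).astype(bool)

def difference_in_means(assign):
    # Outcome depends on treatment (true effect = 1.0) and on motivation.
    y = 1.0 * assign + 2.0 * motivation + rng.normal(0, 1, n)
    return y[assign].mean() - y[~assign].mean()

print("self-selected estimate:", round(difference_in_means(self_selected), 2))  # biased upward
print("randomized estimate:  ", round(difference_in_means(randomized), 2))      # close to 1.0
```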
not until Suchman (1967) saw the relevance of it to evaluation that his name became prominently identified with that field. It is because Campbell’s work on quasi-experimental design precedes Suchman’s application of it to evaluation that we choose to discuss Campbell prior to Suchman. It should be noted that Campbell (1975a, 1975b) has also written papers indicating the potential appropriateness of qualitative methods as a complement to quantitative experimental methods.
Edward Suchman
promoted Campbell’s work as the most effective approach for conducting evaluation studies to measure program impact.
book, Evaluative Research (1967),
perhaps the first full-scale description of the application of research methods to evaluation.
Suchman (1967) distinguishes between evaluation as a commonsense usage, referring to the “social process of making judgments of worth” (p. 7), and evaluative research that uses scientific research methods and techniques.
affirms the appropriate use of the word evaluative as an adjective specifying the type of research being done.
comments that the evaluative researcher, in addition to recognizing scientific criteria, must also acknowledge administrative criteria for determining the worthwhileness of doing the study
This influenced many others on the methods branch (e.g., Rossi, Weiss, Chen, and Cronbach).
Suchman’s identification of five categories of evaluation:
(1) effort (the quantity and quality of activity that takes place)
(2) performance (effect criteria that measure the results of effort)
(3) adequacy of performance (the degree to which performance is adequate for the total amount of need)
(4) efficiency (examination of alternative paths or methods in terms of human and monetary costs)
(5) process (how and why a program works or does not work)
Robert Boruch
Randomized field tests are also different from quasi-experiments.
The latter research designs have the object of estimating the relative effectiveness of different treatments that have a common aim, just as randomized experiments do, but they depend on methods other than randomization to rule out the competing explanations for the treatment differences that may be uncovered. Quasi-experiments and related observational studies then attempt to approximate the results of a randomized field test. (p. 4)
Thomas Cook
Campbell and Stanley’s classic Experimental and Quasi-Experimental Designs for Research (1966)
During the 1970s, Cook and Campbell called attention to some less recognized threats to internal validity (e.g., resentful demoralization)
during the mid-1970s,
Campbell
denounced the use of quasi-experimental designs and went so far as to state that perhaps he had committed an injustice by suggesting that quasi-experimental designs were a viable alternative to classic randomized design.
remained a proponent of quasi-experimental designs and continued to focus on their use.
Cook
random selection, methods, the evaluation context, and stakeholder involvement in evaluation studies—
He asserts that it is imperative for evaluators to choose methods that are appropriate to the particular evaluation being conducted and that they take into consideration the context of each evaluation rather than using the same set of methods and designs for all evaluations—a direct attack on experimental design.
Lee J. Cronbach
contributions include Cronbach’s coefficient alpha, generalizability theory, and notions about construct validity.
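For reference, the standard formula for coefficient alpha (not quoted from the chapter), for a scale of $k$ items with item variances $\sigma^2_{Y_i}$ and total-score variance $\sigma^2_X$:

$$\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma^2_{Y_i}}{\sigma^2_X}\right)$$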
one of the methodological giants of our field.
our field
strong evaluation roots in methodology and social science research led us to place him on the methods branch of the theory tree.
association with more policy research–oriented Stanford University colleagues, notably in his book Toward Reform of Program Evaluation (Cronbach & Associates, 1980), helped establish his concern for evaluation’s use in decision making.
rejects the simplistic model that assumes a single decision maker and “go/no-go” decisions
Cronbach (1982) coins a set of symbols to define the domains of evaluation: units (populations), treatments, observations (outcomes), and settings.
Cronbach’s concern about generalizing to *UTOS leads him to reject Campbell and Stanley’s emphasis on experimental design and Scriven’s focus on comparison programs.
Campbell and Stanley’s emphasis on experimental design
Scriven’s focus on comparison programs
proposes that generalization to *UTOS can be attained by extrapolating through causal explanation, using either causal modeling or the “thick description” of qualitative methods.
sometimes beneficial to examine subpopulations (sub-UTOS)
focusing on the subset of data for a particular group might enable generalization to other domains.
seeks to capitalize on naturally occurring variability within the sample as well as the consequences of different degrees of exposure to treatments.
displays sensitivity to the values of the policy-shaping community
done systematically with an eye to what will contribute most to generalization:
issues receiving attention from the policy-shaping community
issues relevant in swaying important (or uncommitted) groups
issues that would best clarify why a program works
issues having the greatest uncertainty
Shadish et al. (1991) make the following keen distinction between Cronbach and several other major theorists: “[Cronbach views] evaluators [as] educators rather than [as] the philosopher-kings of Scriven, the guardians of truth of Campbell or the servants of management of Wholey” (p. 340).
Peter Rossi
well-known for his highly popular textbook on evaluation
first edition was Evaluation: A Systematic Approach (Rossi, Freeman, & Wright, 1979).
now includes discussions of qualitative data collection, evaluation utilization, the role of stakeholders,
changing nature of the field
earlier writings stressed
experimental design.
his ideas eventually evolved to a place that some say was so comprehensive that the approach he suggested was virtually impossible to implement.
response to this criticism led him to develop “tailored evaluations”
tailored to the stage of the program,
theory-driven evaluation involves the construction of a detailed program theory, which is then used to guide the evaluation.
Rossi maintains that this approach helps reconcile the two main types of validity—internal and external.
Carol Weiss
influenced our thinking about what evaluation can do
informed by research on evaluation in the context of political decision making
expands on or defines many of our key concepts and terms related to evaluation use,
conceptual use, or use for understanding (Weiss, 1979, 1980)
enlightenment use, or more subtle and indirect use that occurs in the longer term (Weiss, 1980)
imposed use,
(Weiss, Murphy-Graham, & Birkeland, 2005; Weiss, Murphy-Graham, Petrosino, & Gandhi, 2008).
argues for more evidence-based policy making
most effective kinds of evaluations are those that withstand the test of time
that is, are generalizable and therefore use the most rigorous methods possible.
early evaluation work influenced by research methodologists, political theorists, and expositors of democratic thought (e.g., Rousseau, the Federalist Papers).
“Evaluation is a kind of policy study, and the boundaries are very blurred . . . I think we have a responsibility to do very sound, thorough systematic inquiries” (Weiss, cited in Alkin, 1990, p. 90).
Politics intrudes on program evaluation in three ways:
(1) programs are created and maintained by political forces;
(2) higher echelons of government, which make decisions about programs, are embedded in politics;
(3) the very act of evaluation has political connotations. (p. 213)
decision accretion
decisions are the result of “the build-up of small choices, the closing of small options and the gradual narrowing of available alternatives” (Weiss, 1976, p. 226).
Huey T. Chen
concept and practice of theory-driven evaluation (Chen, 1990, 2005).
there is no indication as to whether failure is due to, for example, poorly constructed causal linkages, insufficient levels of treatment, or poor implementation. (Chen, 2005)
Chen proposes a solution to this dilemma: We have argued for a paradigm that accepts experiments and quasi-experiments as dominant research designs, but that emphasizes that these devices should be used in conjunction with a priori knowledge and theory to build models of the treatment process and implementation system to produce evaluations that are more efficient and that yield more information about how to achieve desired effects. (Chen & Rossi, 1983, p. 300)
concerned with identifying secondary effects and unintended consequences. This is similar to Scriven.3
theories that he seeks are “plausible and defensible models of how programs can be expected to work” (Chen & Rossi, 1983, p. 285).
Gary Henry and Melvin Mark (With George Julnes)
view social betterment as the ultimate objective of evaluation and present a point of view grounded in what they refer to as a “common sense realist philosophy.”
we were struck by the views presented in the “realist evaluation” monograph in New Directions for Evaluation (Henry, Julnes, & Mark, 1998)
Emergent realist evaluation (ERE)
a comprehensive new evaluation model that offers reconceptualized notions of use, methods, and valuing.
“a new theory that captures the sense-making contributions from post-positivism and the sensitivity to values from constructivist traditions” (Henry et al., 1998, p. 1)
“social betterment, rather than the more popular and pervasive goal of utilization, should motivate evaluation” (Mark, Henry, & Julnes, 1998, p. 19).
ERE is an evaluation methodology that
(a) gives priority to the study of generative mechanisms,
(b) is attentive to multiple levels of analysis,
(c) is mixed methods appropriate.
focuses on understanding the underlying mechanisms of programs,
identify which mechanisms are operating and which are not (Mark et al., 1998).
to identify causal linkages and to enhance the generalizable knowledge base of a particular set of programs or program theories.
Mark and Henry argue that an evaluation should examine program effects that are of most interest to the public and other relevant stakeholders
so evaluators must determine stakeholders’ values when investigating possible mechanisms.
three methods for investigating stakeholder values:
surveying and sampling possible stakeholders
qualitative interviews and/or focus groups to determine their needs and concerns
analyzing the context of the evaluation from a broad philosophical perspective
issues such as equity, equality, and freedom.
then communicated (Mark et al., 1998).
competitive elaboration or principled discovery
ruling out alternative explanations for study findings
Competitive elaboration
threats to validity (Mark et al., 1998).
alternative program theories
requires a preexisting body of knowledge of possible program mechanisms
approach lends itself to quantitative methods of inquiry
Principled discovery is used when programs are evaluated before practitioners are able to develop experientially tested theories (Mark et al., 1998).
Approaches to discovering program mechanisms include exploratory data analysis, graphical methods (Henry, 1995), and regression analysis.
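A minimal sketch of what a "principled discovery" style regression could look like (my illustration, not Mark and Henry's own procedure; the data and variable names are hypothetical): check whether a hypothesized mechanism explains the outcome beyond treatment assignment alone.

```python
# Sketch: exploratory regression probing a hypothesized program mechanism.
# Hypothetical data; illustrates the comparison-of-models logic only.
import numpy as np

rng = np.random.default_rng(0)
n = 200
treatment = rng.integers(0, 2, n)               # 0/1 program participation
attendance = treatment * rng.uniform(0, 1, n)   # hypothesized mechanism
outcome = 0.5 * treatment + 1.2 * attendance + rng.normal(0, 1, n)

def r_squared(X, y):
    # Ordinary least squares with an intercept; return R^2.
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

print("treatment only: R^2 =", round(r_squared(treatment[:, None], outcome), 3))
print("with mechanism: R^2 =", round(r_squared(np.column_stack([treatment, attendance]), outcome), 3))
```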
recently, Mark and Henry have extended theoretical work into evaluation influence (Henry & Mark, 2003; Mark & Henry, 2004), defined as “the capacity or power of persons or things to produce effects on others by intangible or direct means” (Kirkhart, 2000, p. 7).
theory of evaluation influence
Henry and Mark (2003) depict three levels at which influence can occur:
individual,
interpersonal
collective
Each level is further explained by identifying specific mechanisms, measurable outcomes, and forms of influence.
Ralph Tyler
one of the major starting points for modern program evaluation
The Eight-Year Study
far-reaching
Madaus and Stufflebeam (1989)
taxonomic classification of learning outcomes
need to validate indirect measures against direct indicators of the trait of interest
concept of formative evaluation
content mastery,
decision-oriented evaluation
“criterion-referenced and objectives-referenced tests” (p. xiii).
curricula to be evaluated are based on hypotheses that are the best judgments of program staff regarding the most effective set of procedures for attaining program outcomes.
focus is on the specification of objectives and measurement of outcomes.
Tyler’s point of view has come to be known as objectives-oriented (or objectives-referenced) evaluation.
(a) formulating a statement of educational objectives, (b) classifying these objectives into major types, (c) defining and refining each of these types of objectives in terms of behavior, (d) identifying situations in which students can be expected to display these types of behavior, (e) selecting and trying promising methods for obtaining evidence regarding each type of objective, (f) selecting on the basis of preliminary trials the more promising appraisal methods for further development and improvement, and (g) devising means for interpreting and using the results (Tyler, 1942, pp. 498–500).
Madaus and Stufflebeam (1989) claim that Tyler coined the term educational evaluation in the 1930s to describe his procedures—the comparison of (well-stated) intended outcomes (called objectives) with (well-measured) actual outcomes.
Metfessel and Michael’s (1967) work follows Tyler’s evaluation step progression but pays greater heed to expanding the range of alternative instruments.
Hammond (1973) includes Tyler’s views as a behavioral objectives dimension that is part of a model that also includes a more precise definition of instruction and the institution.
Popham (1973, 1975) follows the Tyler model and focuses primarily on the championing of “behavioral objective specification.”
Popham (1973) called for narrow scope for individual educational objectives
massive number of objectives required to conduct an evaluation and subsequent system overload.
Popham (1988) recognized this problem and called for a focus on a manageable number of broad-scope objectives and the use of the taxonomies of educational objectives only as “gross heuristics.”
Bloom, Englehart, Furst, Hill, and Krathwohl (1956) developed a taxonomy of educational objectives for the cognitive domain,
Krathwohl, Bloom, and Masia (1964)
placed him on the methods branch because we believe that his attention to educational measurement as the essence of evaluation is the most prominent feature of his work.
our revised view
he is not a theoretical predecessor of those further up on the branch.
on further reflection, we concluded that his overall influence on the methods branch specifically was less than his original position suggested.
Valuing
Out of the root of epistemology has grown a branch of evaluators who focus on concerns related to valuing in the evaluation process.
Of particular importance is the fact/value distinction delineated by the 18th-century Scottish philosopher David Hume.
the legitimacy of value claims (as ably described by House & Howe, 1999)
Important issues raised when considering valuing in evaluation include
the nature of universal (justifiable) claims
the constructivist view that truth (or fact) is guided by “the meanings that people construct in particular times and places” (Greene, 2009, p. 159).
Scriven
proclaims that evaluation is not evaluation without valuing.
argues that it is the work of the evaluator to make a value judgment about the object that is being evaluated and that this value judgment should be based on observable data about the quality and effectiveness of the evaluand under study.
Scriven’s philosophical training in logic, which helps inform his argument for a systematic, objective approach to valuing and evaluation, has importantly influenced his thinking.
those who reject the notion that we should strive for an objectivist judgment about the merit or worth of the evaluand.
espouse the philosophy of relativism or subjectivism—that
human activity is not like that in the physical world but an ongoing, dynamic process, and a truth is always relative to some particular frame of reference.
Stake’s (1967) article “The Countenance of Educational Evaluation” offers hints of subjectivist thinking,
his paper on responsive evaluation (Stake, 1974)
explicitly reject “preordinate evaluation” (evaluation conducted in the spirit of obtaining objective information).
argues for a responsive approach using case studies as a means of capturing the issues, personal relationships, and complexity of the evaluand and for judging the value of the evaluand under study.
has served as an important influence in the thinking of others who argue for attention to complexity, dialogue, and meaning in evaluations and use this as a basis for informing value claims in evaluation studies.
Michael Scriven
major contribution is the way in which he adamantly defines the role of the evaluator in making value judgments.
Shadish et al. (1991) note that Scriven was “the first and only major evaluation theorist to have an explicit and general theory of valuing” (p. 94).
(1986)
“Bad is bad and good is good and it is the job of evaluators to decide which is which” (p. 19).
The evaluator, in valuing, must fulfill his or her role in serving the “public interest” (Scriven, 1976, p. 220)
he views the evaluator’s role in valuing as similar to producing a study for Consumer Reports, in which the evaluator determines the appropriate criteria by which judgments are to be made and then presents these judgments for all to see.
here is the necessity for identifying “critical competitors,” or competing alternatives.
evaluator has the responsibility for identifying the appropriate alternatives.
Comparisons are key in making value judgments,
adamantly states that it is not necessary to explain why a program or product works in order to determine its value.
alternative to experimental and quasi-experimental design called the “modus operandi” (MO) method
(Scriven, 1991, p. 234),
analogous to procedures used to profile criminal behavior:
The MO of a particular cause is an associated configuration of events, processes, or properties, usually in time sequence, which can often be described as the characteristic causal chain (or certain distinctive features of this chain) connecting the cause with the effect. (Scriven, 1974, p. 71)
two steps.
first: develop a thorough list of potential causes, then narrow it down by determining which potential causes were present prior to the effect.
second: determine which complete MO fits the chain of events and thus determine the true cause.
To ensure accuracy and bias control,
calls in a “goal-free or social process expert consultant to seek undesirable effects” (Scriven, 1974, p. 76).
looks for instances of “co-causation and over determination” and
believes that, ultimately, the evaluator is able to deliver a picture of the causal connections and effects that eliminate causal competitors without introducing evaluator bias.
(1972b) advocates for “goal-free evaluation,”
maintains that by doing so, the evaluator is better able to identify the real accomplishments (and nonaccomplishments) of the program.
essential element of Scriven’s valuing is the determination of a single value judgment of the program’s worth (“good” or “bad”).
How good or how bad is important for comparisons; opportunity cost; etc.
J: Given: a. Logic: Search for Truth; born out of Platonic ideal of forms.
b. Statistics: Search for knowledge, born out of Aristotelian / Humean concept of empiricism and means / frequency.
c. Evaluation: The application of these tools (and any others) along with the identification and weighing of the ends (values) at hand:
and
d. Logic and statistics are developed by humans
e. humans are guided (consciously or not) by the "automatic" (or "natural") identification and weighing of ends (values)
Therefore:
1. Evaluation is the discipline that undergirds Logic and Statistics.
- I could make another argument for why Evaluation is more Alpha (basic) than Epistemology, along similar lines
In requiring the synthesis of multiple-outcome judgments into a single value statement, Scriven is alone among evaluation theorists.
Needs are the presumed cost to society and to individuals and are determined through a needs assessment.
his “conception of needs implies a prescriptive theory of valuing and that he disparages descriptive statements about what people think about the program” (p. 95).
failing to directly reflect the views of stakeholders inhibits the potential use of evaluation findings
his needs assessment is not independent of the views of the evaluator and
Scriven is apparently unconcerned by this, maintaining that determining the “truth” is sufficient.
unique training in philosophy, mathematics, and mathematical logic provides him with the assurance that he can make sound, unbiased judgments.
The more you think you know. . . the less you do . . . because if you expand your knowledge in three dimensions what you don't know will grow, and figuring out what you do know will grow.
(Look at modeling this. . . get data; test people longitudinally (10-15 years) . . . look at bits of knowledge that are not used. . .but then re-discovered?)
Extending his supposition for evaluation as the science of valuing, Scriven (1991, 2001, 2003) reasons that evaluation is a transdiscipline, that is, a discipline that possesses its own unique knowledge base while serving other disciplines as a tool. He maintains that, like logic and statistics, evaluation is a major transdiscipline because all disciplines rely on the evaluation process to judge the value of the entities within their own purview
as evidenced by the peer review publication process.
Scriven’s thinking pushed the field to consider valuing as a central feature of evaluation more than anyone else.
Henry Levin
Cost analyses are a critical domain of evaluation work because they offer information to address what some consider to be the ultimate evaluation question: What is the overall value of the program?
economics-based strategies for determining the value of a program or policy.
cost analyses as a method for informing object value judgments about a program
using a specific methodological approach.
an array of economics-based strategies for determining program costs prior to and during implementation.
Levin (2005; Levin & McEwan, 2001)
project what a program might cost
keep track of the costs of an ongoing program.
determine which program out of many achieves a target outcome most frugally (cost-effectiveness) or which program of many with equal costs produces the greatest outcome (also cost-effectiveness).
cost–benefit: examined relative to its monetary impact (or benefits) on clients or society
An essential part of preparing to undertake a cost study involves thinking about how “costs” are defined, what costs are important, and how best to assess them.
Because costs can be calculated and evaluated differently across strategies, it is important for evaluators and stakeholders to clarify what they need from a cost strategy and to understand how each works.
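A minimal sketch of the cost-effectiveness comparison Levin describes (my illustration; program names and figures are hypothetical): divide each program's cost by its effect so the most frugal route to the target outcome ranks first.

```python
# Sketch: ranking hypothetical programs by cost-effectiveness ratio
# (dollars per unit of effect); lower is more frugal.
programs = {
    # name: (total cost in dollars, effect in outcome units)
    "tutoring":        (120_000, 300),
    "smaller_classes": (450_000, 700),
    "summer_school":   (200_000, 420),
}

for name, (cost, effect) in sorted(programs.items(),
                                   key=lambda kv: kv[1][0] / kv[1][1]):
    print(f"{name:16s} ${cost / effect:,.0f} per unit of effect")
```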
Robert Stake
three manuscripts—“The Countenance of Educational Evaluation” (Stake, 1967), Program Evaluation, Particularly Responsive Evaluation (Stake, 1974), and Case Studies in Science Education (Stake & Easley, 1979)
House (2001b)
the essential components of Stake’s responsive evaluation are
(a) there is no true value to anything (i.e., knowledge is context bound)
(b) stakeholder perspectives are integral elements in evaluations, and
(c) case studies are the best method for representing the beliefs and values of stakeholders and of reporting evaluation results.
he maintains that seeing and judging the evaluand regularly are part of the same act and that the task of evaluation is as much a matter of refining early perceptions of quality as of building a body of evidence to determine the level of quality.
is opposed to stakeholder participation in many evaluation activities and processes and instead asserts that evaluation is the job of the evaluator (Alkin, Hofstetter, & Ai, 1998, p. 98).
it is the evaluator’s job “to hear the [participants’] pleas, to deliberate, sometimes to negotiate, but regularly, non- democratically, to decide what [the participants’] interests are” (p. 104).
Stake (2000)
(1975) cautions that “whatever consensus in values there is [among participants] . . . should be discovered. The evaluator should not create a consensus that does not exist” (pp. 25–26).
“The reader, the client, the people outside need to be in a position to make their own judgments using grounds they have already, plus the new data” (Abma & Stake, 2001, p. 10).
Elliott Eisner
“educational connoisseurship”
Journal of Aesthetic Education (1976)
subsequently expanded
1985, 1991a, 1991b, 1998)
focus is on educational outcomes
negative perceptions of outcomes measured by standardized tests using the principles of psychological testing or by criterion-referenced testing procedures.
(1976) rejection of “technological scientism” includes a rejection of the extensive use of research models employing experimental and quasi-experimental designs, which depend heavily (if not exclusively) on quantitative methods.
Eisner notes that “things that matter” cannot be measured quantitatively.
J: I would say that everything that matters may be able to be measured quantitatively. . . frequency of oxytocin spikes in relationships, for example; lack of other things: it's not everything, but it gives information on what matters, and it is a form of measurement.
“evaluation requires a sophisticated, interpretive map not only to separate what is trivial from what is significant, but also to understand the meaning of what is known” (Eisner, 1994, p. 193).
Eisner uses the role of critics in the arts as an analogy for an alternative conception of evaluation.
twin notions of connoisseurship and criticism.
connoisseur
have the ability to differentiate subtleties,
have knowledge about what one sees
“a connoisseur is someone who has worked at the business of learning how to see, to hear, to read the image or text and who, as a result, can experience more of the work’s qualities than most of us” (p. 174).
be aware of and understand the experience
(1991b)
is making the experience public through some form of representation.
Criticism
three aspects of criticism
First
critical description, in which the evaluator draws on his or her senses to describe events, reactions, interactions, and everything else that is seen.
portrays a picture of the program situation, frequently imagining himself or herself as a participant and drawing on the senses to describe the feeling in the participant’s terms.
second
expectation
understand or make sense of what was seen.
Eisner (1991b): “The essence of perception is its selectivity; the connoisseur is as unlikely to describe everything in sight as a gourmet chef is to use everything in his pantry. The selective process is influenced by the value one brings to the classroom. What the observer cares about, she is likely to look for . . . Making value judgments about the educational import of what has been seen and rendered is one of the critical features of educational criticism” (p. 176).
Ernest House denounces the utilitarian framework.
House (1991): “Utilitarianism is a moral theory which holds that policies are morally right when they promote the greatest sum total of good or happiness from among the alternatives” (p. 235).
He deplores the lack of value neutrality in stakeholder approaches, which he says results from the general lack of full inclusion of the represented interests of the poor and powerless in stakeholder groups (pp. 239–240).
House (1991, 1993) argues that evaluation is never value-neutral and should tilt in the direction of social justice by specifically addressing the needs and interests of the powerless.
He comes to these views by drawing on Rawls’s (1971) justice theory.
“Ethical fallacies” in evaluation: clientism (taking the client’s interest as the ultimate consideration), contractualism (adhering inflexibly to the contract), managerialism (placing the interest of the managers above all else), methodologicalism (believing that proper methodology solves all ethical problems), pluralism/elitism (including only the powerful stakeholders’ interests in the evaluation), and relativism (taking all viewpoints as having equal merit).
House & Howe (1999): Inclusion, Dialogue, Deliberation.
Fact and value exist on a continuum where a middle ground exists between “brute fact” and “bare values” (House & Howe, 1999, p. 6).
The deliberative democratic evaluation process is described as follows: “We can imagine moving along the value–fact continuum from statements of preferences and values collected through initial dialogue, through deliberations based on democratic principles, to evaluative statements of fact” (House & Howe, 1999, p. 100).
House is often cited as the theorist who first acknowledged the ways in which evaluation affects power and social structures and described how it can be used to either shift or maintain existing repressive structures.
J: Why is eval so hard to “pin down,” and why are people so adamant about their positions? I believe it is because it IS the alpha discipline: People realize that how we define this has tremendous influence on what happens in the world; people’s VALUES come into play. Evaluation is, logically, at the core of religion and politics (i.e., evaluating whether something is in accordance with the religion or not, even in those religions where evaluating the religion in comparison to other things is frowned upon).
Jennifer C. Greene believes evaluation should be used to determine value by developing a consensus around a set of criteria used to determine the value of a program.
Deliberative democratic evaluation: inclusion, dialogue, and deliberation.
Greene stresses stakeholder involvement and emphasizes the use of mixed methods designs and fieldwork.
Her (2005) approach emphasizes responsiveness to the particularities of the context.
Three “justifications” for including stakeholder views: pragmatic, emancipatory, and deliberative.
The pragmatic justification argues for stakeholder inclusion because it increases the chance of evaluation utilization and organizational learning.
The emancipatory justification focuses on the importance of acknowledging the skills and contributions of stakeholders and empowering them to be their own social change agents.
The deliberative justification argues that evaluation should serve to ensure that program or policy conversations include all relevant interests and are “based on the democratic principles of fairness and equity and on democratic discourse that is dialogic and deliberative” (Greene, 2000, p. 14).
Egon Guba and Yvonna Lincoln view stakeholders as the primary individuals involved in placing value.
Their approach rests on the belief that there are multiple realities, based on the perceptions and interpretations of the individuals involved in the program to be evaluated.
Thus the role of the evaluator is to facilitate negotiations between individuals reflecting these multiple realities.
In Fourth Generation Evaluation (1989), grounded in the constructivist paradigm, the role of the constructivist investigator is to tease out these constructions and “to bring them into conjunction . . . with one another and with whatever information . . . can be brought to bear on the issues involved” (p. 142).
KW - inter-cultural evaluation
Donna Mertens’s (1999, 2009) inclusive approach is unique in its emphasis on diversity and the inclusion of diverse groups.
She is best known for her inclusive/transformative model of evaluation, which rests on four philosophical assumptions.
The evaluator’s primary role is to include marginalized groups, not to act as decision maker.
In her (2009) model, the evaluator advocates for the inclusion of marginalized groups but does not advocate for the marginalized groups.
Mertens suggests that evaluators ask the following questions at the planning stages:
J: What makes this list is already a value judgment.
• Are we including people from both genders and with diverse abilities, ages, classes, cultures, ethnicities, families, incomes, languages, locations, races, and sexualities?
• What barriers exclude a diversity of people?
• Have we chosen the appropriate data collection strategies for diverse groups, including providing for preferred modes of communication?
Use
Use theories are often referred to as “decision-oriented theories.”
The use branch focuses primarily on the program at hand—this program at this time.
Evaluation influence refers to the capacity of evaluation processes, products, or findings to indirectly produce a change in understanding or knowledge either at the evaluation site at a future time or at other sites (Alkin & Taut, 2003; Christie, 2007; Kirkhart, 2000; Mark & Henry, 2004).
Rather than drawing from this broad definition of use, the use branch as we envision it depicts the work of theorists concerned with direct program site use (in action or understanding) that results from a particular evaluation study.
theorists presented on the use branch aim to promote the kind of actionable use that is within the purview of the evaluator.
Daniel Stufflebeam’s CIPP is an acronym for four types of evaluation: context, input, process, and product.
Context evaluation involves identifying needs to decide on program objectives.
Input evaluation leads to decisions about strategies and designs.
Process evaluation consists of identifying shortcomings in a current program to refine implementation.
Product evaluation measures outcomes for decisions regarding the continuation or refocus of the program.
His key strategy is a cyclical process: work with a carefully designed evaluation while maintaining flexibility.
Stufflebeam (1983): view design as a process, not a product; continually improve; a continual information stream to aid decision makers in allocating resources to programs that best serve clients.
The Program Evaluation Standards (Joint Committee on Standards for Educational Evaluation, 1994).
four domains of practice: utility, feasibility, propriety, and accuracy.
Utility standards ensure that an evaluation will serve the information needs of intended users; feasibility standards ensure that an evaluation will be realistic, prudent, diplomatic, and frugal; propriety standards ensure that an evaluation will be conducted legally, ethically, and with due respect for the welfare of those involved in the evaluation as well as of those affected by its results; and accuracy standards ensure that an evaluation will reveal and convey technically adequate information about the features that determine the worth or merit of the program being evaluated.
a “representative stakeholder panel to help define the evaluation questions, shape evaluation plans, review draft reports and disseminate the findings” (p. 57).
Stufflebeam (2003) engages stakeholders (usually in decision-making positions) in focusing the evaluation and in making sure that it addresses their most important questions; provides timely, relevant information to assist decision making; and produces an accountability record.
Stufflebeam (2001): formative and summative information become available to a panel of stakeholders.
Joseph Wholey: academic training plus long-standing participation in federal government programs; focus on managers and policymakers.
The three stages in the “sequential purchase of information” strategy are (1) rapid-feedback evaluation, which focuses primarily on extant and easily collected information; (2) performance (or outcome) monitoring, which measures program performance, usually in comparison with prior or expected performance; and (3) intensive evaluation, which uses comparison or control groups to better estimate the effectiveness of program activities in causing observed results.
Eleanor Chelimsky: “Telling the truth to people who may not want to hear it is . . . the chief purpose of evaluation” (Chelimsky, 1995).
Known for establishing and directing the evaluation unit of the Government Accountability Office (GAO), the largest independent internal evaluation unit in existence (www.gao.gov).
Chelimsky (2006): evaluation of public policies, programs, and practices is fundamental to a democratic government for four reasons: (1) to support congressional oversight; (2) to build a stronger knowledge base for policy making; (3) to help agencies develop improved capabilities for policy and program planning, implementation, and analysis of results, as well as learning-oriented direction in their practice; (4) to strengthen public information about government activities through dissemination of evaluation findings. (p. 33)
In essence, evaluation should generate information for conceptual and enlightenment use, for organizational change and development, and for formative program improvements (i.e., actionable, instrumental use).
Under her direction, GAO devised a wide variety of methods to suit different question types (Chelimsky, 1997).
Marvin Alkin: similarities to Stufflebeam’s CIPP model, though the primary distinction was Alkin’s recognition that process and product have both summative and formative dimensions.
One could look at process summatively (through program documentation) or at product formatively (through outcomes).
Alkin rejects the dominant role of evaluators as valuing agents.
He prefers to work with primary users at the outset of the evaluation process to establish value systems for judging potential outcome data.
In interactive sessions, he presents a variety of simulated potential outcomes and seeks judgments (values) on the implications of each.
Like McKinsey & KW
Michael Patton’s utilization-focused evaluation (UFE) is the most prominent theoretical explication of the utilization (or use) extension (1978, 1986, 1997, 2008).
Four major phases of UFE: (1) the development of users’ commitment to the intended focus of the evaluation and to evaluation utilization; (2) involvement in methods, design, and measurement; (3) user engagement—actively and directly interpreting findings and making judgments; and (4) making decisions about further dissemination.
Patton (2002) urges that the evaluator be “active—reactive—interactive—adaptive”: active in identifying intended users and focusing questions, reactive in continuing to learn about the evaluative situation, and adaptive “in altering the evaluation questions and designs in light of their increased understanding of the situation and changing conditions” (p. 432).
Introduction of the term developmental evaluation (Patton, 2010): the evaluator becomes part of a program’s design team or management team.
Like KW
David Fetterman - Empowerment Evaluation; Teach
Books on empowerment evaluation (1996, 2001): a process that encourages self-determination among recipients of the program evaluation, often including “training, facilitation, advocacy, illumination and liberation.”
The goal of empowerment evaluation is to foster self-determination rather than dependency.
The outside evaluator often serves as a coach or additional facilitator, providing clients with the knowledge and tools for continuous self-assessment and accountability.
Fetterman argues that training participants to evaluate their own programs and coaching them through the design of their evaluations is an effective form of empowerment.
Fetterman (1994) describes two subtly different forms of empowerment evaluation.
In the first, evaluators teach program participants to conduct their own program evaluations; the goal is to build evaluation capacity.
In the second, the evaluator’s primary work is to serve as a coach who facilitates others in conducting their own evaluations, with the goal of change.
Fetterman sees all empowerment evaluators as having the potential to serve as “illuminators.”
For Fetterman (1998), the end point of evaluation is not the assessment of the program’s worth; evaluation is an ongoing process, and value and worth are not static.
“Through the internalization and institutionalization of self-evaluation processes and practices, a dynamic and responsive approach to evaluation can be developed to accommodate shifts in populations, goals, value assessments and external forces” (p. 382).
Participatory and empowerment evaluation employ similar practices, but their goals are different.
J. Bradley Cousins
Cousins’s participatory evaluation (Cousins & Earl, 1992; Cousins & Whitmore, 1998)
If we care about utilization, then the way to achieve it is through buy-in; the way to achieve buy-in is to have program personnel participating in the evaluation.
His evaluations are designed for structured, continued, and active participation of these users, as opposed to Patton’s user participation, which could take on a variety of different forms.
Utilization takes place within the context of an organization and is best accomplished as a part of organizational development. Cousins calls this “practical participatory evaluation” (Cousins & Earl, 1995).
He defines practical participatory evaluation as “applied social research that involves trained evaluation personnel and practice-based decision makers working in partnership” (Cousins & Earl, 1995, p. 8).
It is best suited for evaluation projects that “seek to understand programs with the expressed intention of informing and improving their implementation” (Cousins & Earl, 1995).
Evaluation as an organizational learning system (Cousins, Goh, & Clark, 2005).
Hallie Preskill - Transformational Learning & Appreciative Inquiry
Preskill is concerned with creating transformational learning within an organization through the evaluation process.
Transformational learning (2000): a process where individuals, teams, and even organizations identify, examine, and understand the information needed to meet their goals.
Evaluators should (a) use a clinical approach, (b) span traditional boundaries between evaluator and program staff, and (c) diagnose the organizational capacity for learning.
The approach “is inherently responsive to the needs of an organization and its members” (Preskill & Torres, 2000, p. 31).
Finally, an evaluator needs the ability to diagnose an organization’s capacity for learning (Preskill & Torres, 1998).
Appreciative inquiry (AI): a process that builds on past successes (and peak experiences) in an effort to design and implement future actions.
The philosophy underlying AI is that when evaluators look for problems, more problems are found, and when deficit-based language is used, stakeholders often feel hopeless, powerless, and generally more exhausted.
By remembering topics of study that created excitement and energy, by reflecting on what has worked well, and by using affirmative and strengths-based language, participants’ creativity, passion, and excitement about the future are increased.
Preskill (2004) takes the philosophy and principles put forth by organizational change AI theorists and applies them to the evaluation context, maintaining that they increase the use of the evaluation processes and findings.
Jean King: development of participatory evaluation models.
She prefers working long term with organizations to develop joint understandings and, over time, creating structures that will continue to build evaluation capacity (Volkov & King, 2007).
Interactive evaluation practice (IEP): for fostering participation and obtaining use.
King defines IEP as “the intentional act of engaging people in making decisions, taking action, and reflecting while conducting an evaluation study” (King & Stevahn, 2007).
She defines evaluation as “a process of systematic inquiry designed to provide sound information about the characteristics, activities, or outcomes of a program or policy for a valued purpose” (King & Stevahn, 2007).
King is concerned about identifying and fostering leaders during the evaluation process, needed to “attract or recruit people to the evaluation process, who are eager to learn and facilitate the process . . . and who are willing to stay the course when things go wrong” (King, 1998, p. 64).
Trust building is a fundamental requirement for successful participatory evaluation; pay close attention to the interpersonal dynamics that occur (King & Stevahn, 2007).
The roles suggested by King acknowledge the importance of the interpersonal factor.
A FINAL NOTE
Two main challenges in this chapter: First, we needed to determine which theorists to include on the tree. Second, we needed to make specific placements on particular branches of the tree.
J: If you made your tree mutually exclusive and collectively exhaustive (Ethan Rasiel, 1999, 'The McKinsey Way', p. 6), you wouldn't have had this problem.
“The theory tree is posited on the view that ultimately one or another of the three dimensions, depicted as branches, is of the highest priority for each theorist.” – J: The next bunch of notes are all highlights of the criterion words that Alkin uses.
Criterion words: first; issues; emphasis on; purpose of an evaluation; main concern; utilization; principal focus of an evaluation; primary motivations; primary methodology; valuing outcomes; primary focus? Or, process use?
We believe that, as Fetterman describes it, the act of empowering focuses on the process of engaging in evaluation; in the language of evaluation utilization, empowerment evaluation involves instrumental process use.
Thus, while noting a deep concern for social justice and a strong preference for (and early evaluation roots in) anthropological/ethnographic methods, we were led to place Fetterman on the utilization branch.
The determination of a theorist’s placement on a branch of the evaluation theory tree was not always this difficult, but it always required similarly careful consideration and analysis of trade-offs.
J: Which means your tree is not a very good way to categorize things, unless you couldn't find anything better!
With our more restrictive focus on North American theorists, we also deleted Barry MacDonald and John Owen from this version of the tree because both reside outside North America (Great Britain and Australia, respectively) and their writings relate to work in these countries.
We also removed Eisner from the tree, because a primary argument of Eisner’s is centered on the importance of evaluators having domain-specific knowledge and expertise—and an argument around this issue still exists today.
Finally, while the ideas of Thomas Owen, Robert Wolf, and Malcolm Provus were innovative at the time, there is little evidence to suggest that their theoretical work has persisted in influencing the field today, and so we also removed these theorists from the current version of the tree.
J: It's all about influence! It's sickening!
Our field contrasts the so-called prescriptive theory of evaluation practitioners with more traditional forms of social science theory, labeled descriptive theory (Alkin & House, 1992).