Description
Discussion Questions: What are some of the best operational practices for managing diversity? How does your employer (past employer) employ such practices?
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Research Article
Journal of Organizational Behavior, J. Organiz. Behav. 34, 1076–1104 (2013)
Published online 5 November 2012 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/job.1839
A meta-analytic evaluation of diversity training outcomes
ZACHARY T. KALINOSKI, DEBRA STEELE-JOHNSON*, ELIZABETH J. PEYTON,
KEITH A. LEAS, JULIE STEINKE AND NATHAN A. BOWLING
Department of Psychology, Wright State University, Dayton, Ohio, U.S.A.
Summary
The purpose of this meta-analysis was to use theory and research on diversity, attitudes, and training to
examine potential differential effects on affective-based, cognitive-based, and skill-based outcomes, to examine potential moderators of those effects with a focus on affective-based outcomes, and finally, to provide
quantitative estimates of these posited relationships. Results from 65 studies (N = 8465) revealed sizable
effects on affective-based, cognitive-based, and skill-based outcomes as well as interesting boundary conditions for these effects on affective-based outcomes. This study provides practical value to human resources
managers and trainers wishing to implement diversity training within organizations as well as interesting
theoretical advances for researchers. Practitioners have quantitative evidence that diversity training changes
affective-based, cognitive-based, and skill-based trainee outcomes. This study also supports and addresses
future research needs. Copyright © 2012 John Wiley & Sons, Ltd.
Keywords: diversity training; training evaluation; meta-analysis
Organizations have been citing the “business case” for diversity since the 1980s (Bendick, Egan, & Lofhjelm, 2001;
Kulik & Roberson, 2008b; Page, 2007), but many researchers have noted the difficulty of obtaining the potential
benefits of diversity (Edmondson & Roloff, 2009; Kulik & Roberson, 2008b; Milliken & Martins, 1996). Organizations have diversity; the challenge is to use that diversity to enhance organizational outcomes. Training can enhance
organizational outcomes (e.g., Goldstein & Ford, 2002). Thus, organizations have used training as a key component
of diversity initiatives, with as many as 67 percent of US organizations reporting the use of diversity training (Kulik
& Roberson, 2008b). Researchers have provided evidence in qualitative reviews that diversity training has beneficial
effects on learning outcomes (Kulik & Roberson, 2008a). However, there has been little empirical evidence to
suggest when and under what conditions diversity training is most beneficial even though substantial information
has been published advising practitioners on how to best implement diversity training (Kulik & Roberson,
2008b). Clearly, the implementation of diversity training has exceeded the pace at which researchers have been able
to inform practitioners about best practices (Pendry, Driscoll, & Field, 2007).
Recently, researchers have shifted away from a main-effects model (which research results typically failed to
support) to more sophisticated conceptualizations of why and when diversity training works (Kulik & Roberson,
2008b; van Knippenberg & Schippers, 2007). For example, researchers have conducted numerous narrative literature reviews (Curtis & Dreachslin, 2008; Engberg, 2004; Kulik & Roberson, 2008a; Paluck & Green, 2009; Perry,
Kulik, & Field, 2009). These have focused to a greater extent on attitude change than on skill-based or knowledge
change. This focus might reflect that trainers and researchers place greater importance on attitude change or a
concern with understanding why diversity training has produced generally weaker effects on attitudes relative to
other outcomes. Certainly, the focus in diversity training research on multiple outcomes is consistent with training
*Correspondence to: Debra Steele-Johnson, Department of Psychology, Wright State University, 3640 Colonel Glenn Highway, Dayton, Ohio
45435, U.S.A. E-mail: [email protected]
Copyright © 2012 John Wiley & Sons, Ltd.
Received 9 May 2011
Revised 2 September 2012, Accepted 2 October 2012
research that has shown the benefits of distinguishing between training effects on affective-based, cognitive-based,
and skill-based outcomes (Kraiger, Ford, & Salas, 1993). Moreover, recent research (see Bohner & Dickel, 2011, for
a review) on the nature of attitudes and attitude change has introduced a discussion of the role of implicit and explicit
processes. This discussion has the potential to increase our understanding of potential differential effects of diversity
training on affective-based versus cognitive-based or skill-based outcomes and how one might design and deliver
diversity training to strengthen effects on affective-based outcomes. Thus, we extend prior research by using theory
and research on diversity, attitudes, and training to derive predicted relationships between diversity training and
training outcomes. More specifically, we use a meta-analytic approach to examine potential differential diversity
training effects on affective-based, cognitive-based, and skill-based outcomes and identify features incorporated
in diversity training that might result in stronger effects on affective-based outcomes.
Theory and Research on Diversity Training
As noted, substantial research has focused on the effects of diversity training (e.g., Combs & Luthans, 2007;
Rudman, Ashmore, & Gary, 2001; Sanchez & Medkik, 2004; Stewart, Laduke, Bracht, Sweet, & Gamarel,
2003). Briefly, some researchers have suggested that expectations surrounding diversity training should be realistic
(Kulik & Roberson, 2008a). For example, diversity training might not increase minority representation within
management ranks (Kalev, Dobbin, & Kelly, 2006). That is, hiring rates and staffing goals are distal outcomes of
diversity training, which are achieved through a conglomeration of diversity-related practices (e.g., affirmative
action plans, diversity-tolerant cultures, diversity management executives). In contrast, diversity training is likely
to enhance more proximal outcomes, influencing attitudes, knowledge, and skills that might have a downstream
effect on distal outcomes such as staffing (Kulik & Roberson, 2008a). Yet other research has examined outcomes
of diversity training programs that vary in terms of characteristics of the training design, trainer, trainee, and training
environment (Hanover & Cellar, 1998; Holladay, Anderson, Gilbert, & Turner, 2008; Holladay, Knight, Paige,
& Quiñones, 2003; Holladay & Quiñones, 2005, 2008; Kath, 2004; Kulik, Pepper, Roberson, & Parker, 2007;
Liberman, Block, & Uyebuko, 2010; Perry, Kulik, & Schmidtke, 1998; Robb & Doverspike, 2001; Roberson,
Kulik, & Pepper, 2001). We suggest that a focus on proximal outcomes and on how we can design training to enhance those outcomes will benefit our understanding of diversity training.
Van Knippenberg and Schippers (2007) noted a distinct shift in diversity research since Williams and O’Reilly’s
(1998) review. That is, Williams and O’Reilly’s (1998) comprehensive review of 40 years of diversity research
described inconsistent research results. Van Knippenberg and Schippers (2007) suggested that these inconsistent results
reflected an oversimplified, main-effects approach to diversity and further suggested that the field has now shifted to a
more theory-driven and better-conceptualized approach to diversity. Subsequently, there have been many attempts at
furthering the theoretical basis underlying diversity training (Chavez & Weisinger, 2008; Nemetz & Christiansen,
1996; Paluck, 2006; Wiethoff, 2004).
Central to much of the theorizing about diversity and diversity training is a discussion of two perspectives: a
social categorization perspective and an information-processing/decision-making perspective. Recent research
has examined the implications of these perspectives, in general, associating social categorization with dysfunctional effects and the information-processing/decision-making perspective with beneficial effects of diversity
(for discussion and reviews of these models, see Jackson & Joshi, 2011; Martins, Milliken, Wiesenfeld, &
Salgado, 2003; van Knippenberg & Schippers, 2007). The basic premise of the social categorization perspective
is that diversity could result in disruptive sub-group (i.e., in-group and out-group) formation, and the premise of
the information-processing/decision-making perspective is that diversity could result in benefits from a greater
array of expertise or knowledge in workgroups. Van Knippenberg, De Dreu, and Homan (2004) have integrated
these perspectives, suggesting that potential beneficial effects of diversity might be disrupted by social categorization if that categorization is accompanied by identity threat to produce intergroup biases. Van Knippenberg
et al. (2004) have implications for diversity training in that they suggested that researchers should focus less on
the effects of a particular attribute of diversity and more on the multiple and interactive routes through which
diversity attributes affect outcomes.
Integrating Training and Attitude Theory and Research with Diversity Training
Building on the theoretical foundations developed in diversity research, we seek to increase our understanding of
diversity training effects by considering also theory and research on training, on attitudes, and on attitude change
and prior attempts to integrate this research with diversity training. Organizations rarely conduct in-depth evaluations of their diversity training interventions (Hite & McDonald, 2006; Perry et al., 2009; Rynes & Rosen, 1995).
Often, social and political pressures dictate the extent to which training is evaluated (Gutiérrez, Kruzich, Jones, &
Coronado, 2000; Lynch, 1997). When organizations do evaluate, they tend to use reaction measures rather than
more rigorous tools such as knowledge assessments or behavioral measurements (Bennett, 2006; Rynes & Rosen,
1995) because reaction measures are easy to develop, quick to administer, and thus cost-effective. Nonetheless,
we opted to analyze diversity training from a more theoretical perspective, using the Kraiger et al. (1993) model
of training evaluation. The Kraiger et al. (1993) framework identified three learning outcomes: affective-based,
cognitive-based, and skill-based outcomes. Kraiger et al. (1993) defined affective-based outcomes as measures of
internal states that drive perception and behavior. Affective-based outcomes include attitudes, self-efficacy, and
motivation in general (Kraiger et al., 1993). Cognitive-based outcomes include verbal knowledge, knowledge
organization, and cognitive strategies. Skill-based outcomes include changes in behavior.
Consistent with this multiple-outcome approach to evaluation, researchers conducting narrative literature reviews
of diversity training have reported evidence of beneficial results for attitude, knowledge, and skill-based change although these results are not consistent (Curtis & Dreachslin, 2008; Engberg, 2004; Kulik & Roberson, 2008a;
Paluck & Green, 2009; Perry et al., 2009). The most common outcome evaluated within diversity training is affective based (Curtis & Dreachslin, 2008), but research results are mixed. Some research has suggested that training is
not a viable option for changing attitudes toward others (Bendick et al., 2001). Other researchers have found positive
effects for diversity training on attitudes (Bailey, Barr, & Bunting, 2001; Hollister, Day, & Jesaitis, 1993; Pruegger
& Rogers, 1994; Robb & Doverspike, 2001). However, most studies have focused on explicit attitude change, and
very few studies have investigated implicit attitude change resulting from diversity training (Castillo, Brossart,
Reyes, Conoley, & Phoummarath, 2007; Rudman et al., 2001).
Less research has examined effects of diversity training on cognitive-based and skill-based outcomes. However,
past diversity training research has observed relatively consistent effects for change in cognitive-based outcomes,
focusing primarily on the verbal knowledge attained by trainees (Gulick, Jose, Peddie, King, Kravitz, & Ferro,
2009; Holladay & Quiñones, 2008; Perry et al., 1998; Robb & Doverspike, 2001; Roberson et al., 2001; Sanchez
& Medkik, 2004; Williams, 2005). Also, some research has demonstrated beneficial effects for diversity training
on skill-based outcomes (Barker, 2004; Hanover & Cellar, 1998; Moyer & Nath, 1998; Perry et al., 1998; Sanchez
& Medkik, 2004). This research has placed a greater focus on assessments of procedural knowledge and skill
compilation than on automaticity because diversity-related behavior tends to be more complex and require more
working memory capacity, relative to traditional psychomotor skills. Most of these studies have evaluated skills
using self-assessments (Kulik & Roberson, 2008a).
Kulik and Roberson’s (2008b) review of diversity training research provided a useful summary of this research,
describing diversity training within two broad categories: training designed to disseminate information and training
designed to create behavioral change. The purpose of training to disseminate information is to inform employees of
the organization’s diversity strategy and expectations, that is, a cognitive-based change. Training to create behavioral change can be divided further into awareness and skill training. The latter teaches trainees specific skills and
behaviors, that is, a skill-based change. The former, awareness training, attempts to increase trainees’ awareness
of their biases, including stereotypes. The notion is that once trainees are aware of their biases, they can change those
biases, that is, an affective-based change. Kulik and Roberson (2008b) suggested that there is more substantial
evidence of the success of dissemination and skill training than of awareness training in achieving training goals.
Understanding Differential Effects of Diversity Training: Implicit and
Explicit Processes
We contribute to prior theory and research by suggesting that the generally stronger effects of diversity training on
cognitive-based and skill-based outcomes than on affective-based outcomes might be explained in part by considering
explicit and implicit processes and in part by considering access to relevant content prior to training. Research on
attitude and attitude change (e.g., Gawronski & Bodenhausen, 2006) has begun to distinguish between explicit and
implicit aspects of attitudes. For example, one can report one’s attitude toward a target object, that is, a conscious,
explicit attitude. However, one might be unable to directly report other attitudes or aspects of attitudes, that is, an
implicit attitude or implicit component of an attitude. Further, research has suggested the following: (i) that measures
differ in the extent to which they capture implicit versus explicit aspects of attitudes; and (ii) that explicit and implicit
processes can be used to change both implicit and explicit aspects of attitudes (e.g., Briñol, Petty, & McCaslin, 2009).
We draw from this research to suggest that training reflects relatively explicit processes and thus that one might expect to
have more direct effects on change in explicit training outcomes. Further, we suggest that cognitive-based and skill-based outcomes reflect relatively explicit outcomes (with a few exceptions such as tacit knowledge or automaticity).
However, affective-based outcomes (e.g., attitudes) to a greater extent reflect both explicit and implicit components.
Also, trainees might have greater exposure or access to content relevant to diversity attitudes than to cognitive-based or skill-based content prior to training. That is, if trainees already have favorable attitudes (explicit and
implicit), diversity training would result in little attitude change. Alternatively, trainees might know prior to training
which diversity attitudes are socially desirable and be able to report those attitudes explicitly. Trainees might be
less likely to know prior to training specific facts or information (cognitive-based outcomes) or specific behaviors
(skill-based outcomes) related to diversity. Taken together, the more explicit nature of diversity training and the
potentially greater pre-training access to attitude content suggest weaker effects of diversity training on attitudes,
and affective-based outcomes in general, compared with cognitive-based or skill-based training outcomes.
Hypothesis 1: Diversity training will have stronger effects on cognitive-based and skill-based outcomes, relative
to affective-based outcomes.
Enhancing Diversity Training Effects on Affective-Based Outcomes: Features
Related to Social Interaction and Trainee Motivation
Additionally, we contribute to prior research by examining conditions in which one might observe stronger effects
on affective-based outcomes. We focus on affective-based outcomes because attitude change seems particularly
important in relation to diversity training, yet diversity training has failed to reveal consistent beneficial effects on
affective-based outcomes. We draw again on the notions of implicit and explicit processes and their role in attitude
change (e.g., Bohner & Dickel, 2011) as well as notions that diversity attributes affect outcomes through multiple,
interactive routes (i.e., social categorization and information processing/decision making; van Knippenberg et al.,
2004) to increase understanding of diversity training effects on affective-based outcomes. That is, Bohner and
Dickel (2011) suggested that explicit processes affect explicit attitudes directly but also indirectly through effects
on implicit attitudes. Similarly, implicit processes affect implicit attitudes directly and through effects on explicit
attitudes. In support of these contentions, research has shown that carefully processing arguments influences implicit
attitudes (Briñol et al., 2009) and that individuals transform implicit associations into explicit propositions (Strack &
Deutsch, 2004). (For succinctness, we refer in the following to explicit and implicit attitudes, acknowledging that
there is a debate in the literature regarding whether implicit and explicit attitudes are separate attitudes or attitudes
have implicit and explicit components.) We conclude, then, that responses to explicit measures are likely to be
influenced in part by implicit attitudes.
These notions about how implicit and explicit processes affect attitude change are consistent with the van Knippenberg
et al. (2004) model of the interactive influences of social categorization and information processing/decision making on
individuals’ reactions to diversity attributes and suggest features in diversity training that should enhance beneficial
effects on affective-based outcomes. Van Knippenberg et al. (2004) suggested that diversity attributes have an influence
on outcomes through an implicit route that is moderated by identity threat. That is, diversity attributes can cue social
categorizations, that is, implicit, automatic associations, which in turn can lead to intergroup biases when identity threat
is high. Further, diversity attributes have an influence on outcomes through an explicit, conscious reasoning route.
These routes interact in that intergroup biases emerging from the implicit route can reduce potential beneficial effects
that emerge from the explicit route.
The very nature of diversity training should cue social categorizations because the content of training is focused
on race, gender, multicultural, or other aspects of diversity. Social categorizations are not inherently detrimental to
outcomes; rather, individuals’ reactions to the categorizations can be detrimental. Hence, the question becomes what
features can be built into diversity training to improve individuals’ reactions to social categorizations. Research on
prejudice reduction (see Paluck & Green, 2009, for a review) has suggested that social interdependence (Deutsch,
1949) and contact under optimal conditions (Pettigrew & Tropp, 2006) can reduce prejudice. Diversity training in
general likely embodies optimal conditions, including equal status, shared goals, and absence of competition. Thus,
employing features that increase the opportunity to interact with others in training should benefit affective-based
outcomes. These features might include using an interdependent task in training, having individuals actively participate rather than listen to lectures only, having individuals participate face to face rather than complete training on
a computer, or simply providing training of greater duration. Research has demonstrated that some of these features
(i.e., active vs. passive training and spaced vs. massed training) benefit cognitive-based and skill-based training
outcomes (Goldstein & Ford, 2002). These features are likely to benefit affective-based outcomes as well.
Hypothesis 2: Diversity training that provides greater opportunity for social interaction will have stronger beneficial
effects on affective-based outcomes, relative to training providing less opportunity for social interaction.
Whereas the preceding hypothesis addresses diversity training features that might benefit affective-based outcomes
emerging through an implicit route, other features might benefit affective-based outcomes emerging through an explicit
route. Specifically, van Knippenberg et al. (2004) suggested that diversity attributes have an influence on outcomes
through an explicit, conscious route and that this route is moderated by task and trainee factors, including trainee
motivation. Research on attitude change has described explicit processing as reflecting propositional reasoning,
that is, careful consideration of attitudes and their validity (Gawronski & Bodenhausen, 2006). Moreover, Kulik and
Roberson (2008b) reviewed research showing that individuals can regulate their cognitive processes relating to stereotypes and suggested that such self-regulation is likely to occur when trainee motivation is high. For example, they
suggested that diversity training is likely to have stronger effects if the trainee believes the organization values diversity
training. Other research has shown that training importance or relevance and trainee motivation have beneficial effects
on training outcomes (e.g., Baldwin & Magjuka, 1997; Mathieu & Martineau, 1997). Thus, to the extent that trainees
perceive that the stakes are high, that is, that diversity training is relevant or important, trainees should be more
motivated to change their attitudes and affective-based outcomes in general.
Hypothesis 3: Diversity training will have stronger beneficial effects on affective-based outcomes, when trainee
motivation is high than when it is low.
Method
Literature search
We limited our search to post-1964 (i.e., Civil Rights Act passage). We began our investigation for sources
by referencing various known search engines such as PsycINFO, Business Source Complete, Social Science Citation
Index, Google Scholar, SocIndex, and ERIC. In addition, we referenced narrative reviews that documented extensive literature searches (e.g., Kulik & Roberson, 2008a; Paluck & Green, 2009). We also conducted ancestry
searches of identified articles to find other relevant articles for inclusion. To find relevant articles, we used the
following search terms: diversity training, evaluation, effectiveness, cultural diversity training, cultural awareness
training, racial diversity training, gender diversity training, sexual harassment training, racial awareness training,
LGBT diversity training, multicultural education, and multicultural training.
To obtain unpublished studies, we used the same search terms.
We located these studies by doing electronic and manual searches through past conference programs for the Society
for Industrial and Organizational Psychology, Academy of Management, Association for Psychological Science, and the
American Psychological Association. We also conducted electronic searches of Dissertation Abstracts and the Electronic Theses and Dissertations Center with the same search terms. We restricted the unpublished search
to the past 10 years for several reasons. One, the probability of unpublished documents being published since
this date was high, thus resulting in potential redundancy. Two, locating contact information for authors of older
unpublished documents was less feasible. Finally, even if authors were located, the likelihood that the authors
had access to the requested documents would be low. Therefore, we felt justified in limiting our search.
Inclusion criteria
The first criterion for inclusion in this meta-analysis was that the study had to provide some type of numerical data.
Hence, conceptual papers and qualitative or narrative reviews discussing diversity training were excluded from
further analysis. We decided to use Cohen’s d as our effect size metric. If not provided, we used the means, standard
deviations, and sample sizes from the study to calculate Cohen’s d. We also converted inferential statistics such as
univariate F ratios, t ratios, and correlations according to conversion formulas recommended by Hunter and Schmidt
(2004). If the authors did not provide adequate numerical data according to the aforementioned criteria, then we
discarded the study from inclusion.
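The effect size computations described above can be illustrated with a minimal Python sketch. It is not the authors' code: the article cites Hunter and Schmidt (2004) for its conversion formulas but does not reproduce them, so the functions below use common textbook forms and should be read as assumptions. All function names and example numbers are hypothetical. The correlation conversion assumes two groups of equal size, and the F conversion applies only to a univariate F with one numerator degree of freedom, with the sign supplied from the direction of the group means.

```python
import math

def pooled_sd(sd_t, n_t, sd_c, n_c):
    """Pooled standard deviation of a training group and a control group."""
    return math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2))

def d_from_means(m_t, sd_t, n_t, m_c, sd_c, n_c):
    """Cohen's d from group means, standard deviations, and sample sizes."""
    return (m_t - m_c) / pooled_sd(sd_t, n_t, sd_c, n_c)

def d_from_t(t, n_t, n_c):
    """Cohen's d from an independent-groups t ratio (textbook form)."""
    return t * math.sqrt(1.0 / n_t + 1.0 / n_c)

def d_from_f(f, n_t, n_c, sign=1):
    """Cohen's d from a one-df univariate F ratio; F carries no sign, so pass it in."""
    return sign * d_from_t(math.sqrt(f), n_t, n_c)

def d_from_r(r):
    """Cohen's d from a point-biserial correlation, assuming equal group sizes."""
    return 2.0 * r / math.sqrt(1.0 - r**2)

# Hypothetical study: training M = 5.0, control M = 4.0, both SD = 1.0, n = 20 each.
print(d_from_means(5.0, 1.0, 20, 4.0, 1.0, 20))
```

Keeping each conversion in its own function makes it easy to record which statistic a given study reported and which formula produced its d.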
All study designs (i.e., training vs. control groups with pre-tests and post-tests, training vs. control groups with post-tests only, and single training group with repeated measures) were incorporated into the meta-analysis. To combine all
types of study designs to allow meaningful comparisons, we followed the advice of Morris and DeShon (2002).
All three study designs use different standard deviations to calculate effect sizes. For instance, one uses the standard
deviation of the change score for repeated-measures design, whereas one will use the pooled standard deviation of both
the experimental and control groups in an independent-groups, post-test-only design. Cohen’s d will be different
depending on how one calculates the standard deviation. Therefore, we used transformation formulas to convert effect
sizes into the same metric. Because we had more studies in the training-versus-control group designs, we elected to
transform repeated measures into the independent-groups metric. To convert Cohen’s d from repeated measures to
the training-versus-control group design, we used the standard deviation of the pre-test instead of the change score. This
provides an equivalent effect size to the training and control group effect sizes. Both training versus control groups with
pre-tests and post-tests and training versus control groups with post-test-only designs are equivalent if the variances are
homogeneous. If one cannot assume homogeneity of variance, Morris and DeShon (2002) recommended using a value
that best represented the untreated population (i.e., control group SD and pre-test SD). Therefore, in cases where we
could not assume homogeneity of variance, we used the pre-test standard deviation when transforming effect sizes.
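The difference between the two metrics can be sketched in Python under stated assumptions. The function names and numbers are hypothetical, and this is an illustration of the idea rather than the authors' procedure. If pre-test and post-test variances are equal and rho is the pre-test/post-test correlation, the standard deviation of the change score equals SD times sqrt(2(1 - rho)), so a change-score d can be rescaled to the pre-test SD metric by multiplying by the same factor.

```python
import math

def d_change_score(mean_pre, mean_post, sd_change):
    """Repeated-measures d: mean change divided by the SD of the change scores."""
    return (mean_post - mean_pre) / sd_change

def d_raw_score(mean_pre, mean_post, sd_pre):
    """d in the independent-groups (raw-score) metric: mean change over the pre-test SD."""
    return (mean_post - mean_pre) / sd_pre

def change_to_raw_metric(d_change, rho):
    """Rescale a change-score d to the raw-score metric.

    Assumes equal pre-test and post-test variances; rho is the pre/post correlation.
    """
    return d_change * math.sqrt(2.0 * (1.0 - rho))

# Hypothetical single-group study with a highly reliable measure.
mean_pre, mean_post, sd_pre, rho = 3.0, 3.5, 1.0, 0.875
sd_change = sd_pre * math.sqrt(2.0 * (1.0 - rho))  # implied by equal variances

d_rm = d_change_score(mean_pre, mean_post, sd_change)
d_ig = d_raw_score(mean_pre, mean_post, sd_pre)
print(d_rm, d_ig, change_to_raw_metric(d_rm, rho))
```

With a high pre/post correlation the change-score d is noticeably larger than the raw-score d for the same mean change, which is why effect sizes from different designs need a common metric before they are combined.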
From the preceding search and inclusion criteria, we collected a total of 200 articles for possible inclusion.
Seventy-four of the studies were discarded because they did not contain any numerical data to analyze (e.g., conceptual articles). Sixty-one articles were discarded because they did not provide adequate data to meta-analyze. This left
us with 65 usable studies with 97 total effect sizes. The total sample size was 8465. Table 1 provides a description of
each study included in the meta-analysis.
Coding procedures
Two coders, both co-authors of this paper, coded all studies.
The lead author (one of the coders) trained the other coder on coding strategies, definitions, criteria,
decision rules, and other coding-related processes. This initial training lasted approximately two hours. In
addition, both coders then proceeded to code an initial 10 studies. After every two to three studies, coders
discussed results and clarified any conceptual discrepancies. The training of the second coder was approximately
20 hours in length, including the initial training and the coding of the initial 10 studies. Both coders then coded the
remaining studies.
At the completion of the coding process, the coders met on two occasions to discuss discrepancies. The first round
of discussions centered on non-systematic errors. Examples of these types of errors were miscalculations, transcription errors, missing data, and oversight. We categorized the remaining discrepancies as reflecting conceptual differences between coders. The second meeting between the coders centered on the conceptual discrepancies. The coders
reached consensus on all discrepancies after discussion of the individual study in question.
Agreement indices were based on conceptual difference discrepancies (i.e., pre-second meeting data). We elected
to use these data to calculate inter-rater agreement because we believed that the non-systematic error discrepancies
were not based on true between-rater differences. Rather, these would be classified as random and not based on
conceptual differences that each rater had about the domain of interest.
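As an illustration of how agreement between two coders can be quantified, the following Python sketch computes a simple percentage of agreement and an intraclass correlation from a two-way random-effects model with absolute agreement. The single-rater form, often written ICC(2,1), is an assumption here, because the article does not say whether single or average measures were used. The function names and the toy data are hypothetical and are not taken from the study.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of cases on which two coders assigned the same code."""
    matches = sum(1 for a, b in zip(codes_a, codes_b) if a == b)
    return 100.0 * matches / len(codes_a)

def icc_two_way_random_absolute(ratings):
    """Single-rater ICC, two-way random effects, absolute agreement.

    `ratings` is a list of rows, one per coded study, each holding one
    numeric rating per coder.
    """
    n = len(ratings)       # number of coded studies (targets)
    k = len(ratings[0])    # number of coders
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Toy categorical codes from two coders (hypothetical).
coder_a = ["skill", "awareness", "both", "skill"]
coder_b = ["skill", "both", "both", "skill"]
print(percent_agreement(coder_a, coder_b))

# Toy numeric ratings; the second coder is always one point higher.
print(icc_two_way_random_absolute([[1, 2], [2, 3], [3, 4]]))
```

Because the absolute-agreement form counts a constant offset between coders as disagreement, the second example yields a value below 1 even though the two coders rank the studies identically.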
To assess inter-rater agreement, we employed two metrics. One is based on the total percentage of agreement. The
average percentage of total agreement for all study variables was 89.76 percent, with type of diversity training
reflecting the lowest (73.53 percent) and study setting the highest (98.53 percent) agreement. Regarding the lower
level of agreement for diversity training type, about half of the cases resulted from the coders’ reluctance to decide
(i.e., “information not available”), whereas the other half resulted from an inability to distinguish between a single
type (i.e., skill or awareness) and double type (i.e., skill and awareness). The median percentage of total agreement
was 91.91 percent. The other metric that we used to assess inter-rater agreement was the intraclass correlation coefficient (ICC). We employed an ICC
with a two-way random-effects model using absolute agreement. The average ICC was .854, with subject matter
displaying the lowest ICC (.602) and age displaying the highest ICC (.985). Conceptual discrepancies on the subject
matter variable reflected differences in understanding the “multicultural” definition. Nonetheless, both coders
discussed each discrepancy individually, and both coders understood the distinctions further after discussion. The
median ICC was .884. Thus, most of the total variance in variable coding was due to between-rater variance. Therefore, we felt confident that our coding procedures