Description
Read:
Chapter 8: Fostering Cultural Humility in the Institutional Context in Diversity, Cultural Humility, and the Helping Professions: Building Bridges Across Difference.
The Women’s Power Gap at Elite Universities: Scaling the Ivory Tower.
New Evidence Shows Large Gender Gap in Leadership of Major U.S. Universities.
Few campus presidents or chancellors in higher education are women, and women of color hold less than 6% of these top positions, reflecting a lack of diversity, equity, and inclusion in hiring practices. Institutions have not fully integrated cultural humility into their policies, procedures, and service delivery. Additionally, these institutions have embraced cultural competence training, which has many limitations that impede the enforcement of cultural safety while maintaining systemic racism and discriminatory practices.
Address the following directives in your paper:
Describe the backlash that women face with the glass cliff and glass ceiling.
Discuss the differences between public and private universities and their promotions of women and women of color.
Identify the reasons why training programs that prepare underrepresented groups for top positions fail.
Explain the inclusion of cultural humility as an addition to cultural competency training programs and the factors that encourage cultural sensitivity of higher education administrators and governing boards.
Analyze the factors associated with the recruitment and hiring of women of color as campus presidents.
Define the governing boards’ role with gender parity and hiring of women to hold the top positions in higher education.
Discuss three solutions that may combat the systemic issues against diversity, equity, and inclusion in the hiring of women for campus president positions.
Identify strategies to develop and foster cultural humility.
The Lack of Women Campus Presidents in Higher Education final paper
Must be 8 to 10 double-spaced pages in length (not including title and references pages) and formatted according to APA Style as outlined in the Writing Center’s APA Formatting for Microsoft Word resource.
Must include a separate title page with the following:
Title of paper in bold font
A space should appear between the title and the rest of the information on the title page.
Student’s name
Name of institution (The University of Arizona Global Campus)
Course name and number
Instructor’s name
Due date
Must utilize academic voice. See the Academic Voice resource for additional guidance. Additionally, the Making Your Writing Flow resource may be helpful in connecting ideas.
Must include an introduction and conclusion paragraph. Your introduction paragraph needs to end with a clear thesis statement that indicates the purpose of your paper.
For assistance on writing Introductions & Conclusions and Writing a Thesis Statement, refer to the Writing Center resources.
Use the Landrum et al. text for assistance with academic writing and the APA publication manual for help with in-text citations and the reference list.
Must use at least 8 credible sources in addition to the Loue course text.
The Scholarly, Peer-Reviewed, and Other Credible Sources table offers additional guidance on appropriate source types. If you have questions about whether a specific source is appropriate for this assignment, please contact your instructor. Your instructor has the final say about the appropriateness of a specific source.
To assist you in completing the research required for this assignment, view the Quick and Easy Library Research tutorial, which introduces the University of Arizona Global Campus Library and the research process, and provides some library search tips.
Must document any information used from sources in APA Style as outlined in the Writing Center’s APA: Citing Within Your Paper guide.
Must include a separate references page that is formatted according to APA Style as outlined in the Writing Center. See the APA: Formatting Your References List resource in the Writing Center for specifications.
Carefully review the Grading Rubric for the criteria that will be used to evaluate your assignment.
Chapter 8
Foundational Research Ideas: Observation and Measurement
Learning Outcomes
After reading this chapter, you should be able to
• Assess the importance of accurate and precise observation and measurement.
• Explain how accurate measurements yield both valid and reliable scores.
• Differentiate between the various types of reliability and validity.
• Identify what information is needed to appropriately select a statistic to answer a question of interest.
• Recognize the various challenges and threats to collecting data, and identify the steps (such as pilot testing and data storage) that help address them.
• Recognize the complexity of formulating meaningful survey or questionnaire items, and recognize the challenge of survey administration.
For a student designing a doctoral research project (whether it’s a dissertation or an applied
doctoral project), the tasks of research (observation and measurement) are central to understanding and explaining human behavior in all of its various forms, ranging from the individual to societal perspectives. These are the fundamental goals of the social and behavioral
sciences, business, and educational research.
Designing a doctoral research project is a large endeavor, and may seem daunting—it can be
hard to know where to start. Just like a house, a good research project requires a solid foundation of observation and measurement, or whatever appears afterward will be on shaky
ground. Therefore, researchers should acquire basic skills in these areas. Presented in this
chapter are key concepts to consider regarding the design of a research study.
8.1 Operational Definitions and Related Ideas
During their coursework, students may hear about operational definitions. An operational
definition is a translation of the key terms of the hypothesis into measurable (i.e., public,
observable) and numeric quantities. If a researcher was studying depression, they would
need to operationally define depression in such a way as to obtain a numerical score. Or, if a
measure of hunger was desired, the score would need to be defined such that hunger is captured as a meaningful number. The key notion, however, is in making the connection between
the behavior to be studied and the measurement of that behavior. In theory, that is where the
concept of operational definition is crucial to what practitioners of social and behavioral science, business, and educational research do.
So, what does this all mean? Clearly defining the key terms of quantitative research is important, and measuring human behavior, attitudes, and opinions in a meaningful way requires a rigorous approach. That is why striving for both reliability and validity is the best idea (more on these topics in the next section).
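To make this concrete, here is a minimal sketch of an operational definition in Python. The five items and the 0–3 response scale are hypothetical, loosely in the style of common self-report inventories rather than taken from this chapter; the point is simply the translation of a construct into a public, observable, numeric score.

```python
# A minimal sketch of an operational definition: "depression" is defined
# here as the sum of five self-report items, each rated 0-3.
# The items and scale below are hypothetical, for illustration only.

ITEMS = [
    "Little interest or pleasure in doing things",
    "Feeling down or hopeless",
    "Trouble sleeping",
    "Feeling tired or having little energy",
    "Trouble concentrating",
]

def depression_score(responses: list[int]) -> int:
    """Translate item responses (0-3 each) into one numeric score (0-15)."""
    if len(responses) != len(ITEMS):
        raise ValueError("One response per item is required.")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("Each response must be 0, 1, 2, or 3.")
    return sum(responses)

print(depression_score([2, 1, 3, 2, 1]))  # -> 9
```

Summing items is only one possible scoring rule; what matters is that the rule is explicit enough for any researcher to apply it and obtain the same number.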
8.2 The Measurement Process
The measurement process is central to any area of research. From a research perspective,
measurement involves how one captures the responses of individuals (either quantitatively
or qualitatively) in such a manner as to allow for their systematic analysis. In any measurement process, however, there is always the possibility of error. Researchers know this, and
they keep this in mind when drawing conclusions by stating conclusions in the context of
probability. Classical test theory suggests that when a measurement is obtained, that measurement (X) is composed of true score (t) plus error (e), or X = t + e.
Suppose one wanted to know the height of their best friend. That best friend has a true
height—there is one correct answer. However, in measuring this best friend, there is the
potential for error. The “researcher” could make an error in reading the number on a yardstick or tape measure; the friend could be wearing shoes with thick soles or could be slouching; and so on. The resulting height is composed of part true score plus part error, which could
be a result of an overestimation or an underestimation. Researchers can try to minimize error
by taking other measurements and comparing results, although it should be noted that the
potential for error can never be fully eliminated.
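The height example can be simulated directly. In the sketch below, the true height and the error distribution are assumed values chosen purely for illustration; the simulation shows why averaging repeated measurements tends to land closer to the true score, even though error is never fully eliminated.

```python
import random

random.seed(42)  # fixed seed so the illustration is repeatable

TRUE_HEIGHT = 170.0  # the friend's true score (t), in cm; assumed for illustration

def measure() -> float:
    """One observed measurement X = t + e, with random error e."""
    error = random.gauss(0, 1.5)  # e ~ Normal(0 cm, 1.5 cm)
    return TRUE_HEIGHT + error

single = measure()
repeated = [measure() for _ in range(20)]
average = sum(repeated) / len(repeated)

print(f"single measurement:      {single:.2f} cm")
print(f"mean of 20 measurements: {average:.2f} cm")
# The average is typically closer to 170.0 because independent errors
# partially cancel, though they never disappear entirely.
```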
Similarly, measuring any aspect of a person’s behavior yields a result containing true score plus error, which can arise from a number of sources. For example, participants
can contribute to measurement error by being unusually
motivated to do well, or they may contribute to the measurement error by not doing their best. Or the instruments
(surveys or questionnaires) may be too demanding, too
complicated, or too lengthy, leading to frustration, fatigue,
or boredom. And the length of the interview process may
tire the participants and, as a result, they may truncate their
answers in the interest of brevity. Finally, the location and
specifics of the situation may lead to measurement errors;
for example, the temperature, humidity, and number of
people or space in the room may hinder the acquisition of
the true score.
Researchers may also be a source of measurement error
by being too friendly (or too stern) with participants. They
may also provide inadequate instructions about the task,
or simply make errors in recording participant responses.
Researchers can aim to minimize errors in measurement
through the use of methodology and, in the case of quantitative methodologies, statistics.
Figure: There is always a potential for error when obtaining measurements. For example, a researcher could make an error reading the correct number on a scale, but this can be mitigated by conducting additional measurements and comparing the results.
Central to the observation and measurement of behavior
are the concepts of reliability and validity. In other words,
researchers can have confidence that behavior measures
are meaningful only if they know that the data—as well as the instruments used to collect
that data—can be depended on for accuracy. For this reason, research techniques learned
throughout a student’s education will help them make better approximations of the true
score during a doctoral research project (or applied doctoral project), while attempting to
minimize the influence of measurement error.
Reliability
Simply put, reliability refers to consistency in measurement. There are a number of different
types of reliability, and this section provides a brief introduction to the main types as they
apply to quantitative studies. The counterpart in qualitative studies will be discussed next.
Test–Retest Reliability
Test–retest reliability may perhaps be one of the easier types of reliability to understand.
It refers to the consistency in scores when the same test is administered twice to the same
group of individuals. For example, a researcher may be interested in studying the trait of
humility. Many personality traits are assumed to be relatively stable over time, so an individual’s humility levels at the beginning of the semester should be comparable to their humility
levels one month into the semester. Generally speaking, the longer the time between test and
retest, the lower the test–retest reliability.
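As an illustration, test–retest reliability is commonly summarized by the Pearson correlation between the two administrations. The humility scores below are invented for this sketch.

```python
from math import sqrt

# Hypothetical humility scores for five people: at the start of the
# semester (test) and one month later (retest).
test   = [12, 18, 9, 15, 20]
retest = [13, 17, 10, 14, 19]

def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation, the usual index of test-retest reliability."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

print(f"test-retest reliability: r = {pearson_r(test, retest):.2f}")
```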
Parallel Forms and Alternate Forms Reliability
One benefit of the test–retest approach is that a single instrument is created and administered twice to the same group. However, one drawback to this approach is that, depending on
the interval between testing, some individuals might remember some of the items from test
to retest. To avoid this, someone interested in constructing a reliable test could use a parallel
forms or alternate forms approach. Although related, these two approaches are technically
different (Field, 2017).
In a parallel forms test, there are two versions of a test, Test A and Test B. Both Test A and Test
B would be given to the same group of individuals, and the outcomes between the two test
administrations could be correlated (Aiken & Groth-Marnat, 2006). With true parallel forms
tests, researchers want identical means and standard deviations of test scores, but in practice,
one hopes that each parallel form would correlate equivalently with other measures (Field,
2017).
With alternate forms reliability, two different forms of the test are designed to be parallel
but do not meet the same criteria levels for parallel forms (e.g., nonequivalent means and
standard deviations). For example, at some schools, an instructor might distribute two (or
more) different versions of the test (perhaps on different colors of paper). This is usually
done to minimize cheating in a large lecture hall testing situation. One hopes that the different versions of the test (that is, alternate forms) are truly equivalent. This example provides
the spirit of alternate-forms testing, but does not qualify. In true alternate-forms testing, each
student is asked to complete all alternate forms so that reliability estimates can be calculated.
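A quick numerical check, using invented scores, helps separate the two ideas: parallel forms should show (nearly) identical means and standard deviations in addition to a strong correlation, whereas alternate forms relax the equal-moments requirement.

```python
import statistics as st

# Hypothetical scores from one group of six students taking two forms.
form_a = [71, 64, 80, 75, 68, 77]
form_b = [70, 66, 79, 76, 67, 78]

# Parallel forms: means and standard deviations should be (nearly) equal.
print(f"Form A: mean={st.mean(form_a):.1f}, sd={st.stdev(form_a):.1f}")
print(f"Form B: mean={st.mean(form_b):.1f}, sd={st.stdev(form_b):.1f}")

# Both approaches: scores on the two forms should correlate strongly.
# statistics.correlation requires Python 3.10+.
print(f"correlation: r = {st.correlation(form_a, form_b):.2f}")
```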
Internal Consistency Reliability
Test–retest, parallel forms, and alternate forms reliability all require that a participant complete two (or more) versions of a measure. In some cases, this may not be methodologically
possible or prudent. Various methods are used to estimate the reliability of a measure in a
single administration rather than requiring multiple administrations or multiple forms. This
gets complex, so we will not go into great detail here.
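For readers who want a taste anyway, one widely used single-administration estimate, not detailed in this chapter, is Cronbach’s alpha. The three-item scale below is invented for this sketch.

```python
import statistics as st

def cronbach_alpha(items: list[list[float]]) -> float:
    """Internal consistency from a single administration.

    `items` holds one list of scores per item, with respondents
    in the same order in every list.
    """
    k = len(items)
    item_variance_sum = sum(st.variance(item) for item in items)
    totals = [sum(scores) for scores in zip(*items)]  # each person's total
    return (k / (k - 1)) * (1 - item_variance_sum / st.variance(totals))

# Hypothetical 3-item scale answered by five respondents.
item1 = [3, 4, 2, 5, 4]
item2 = [2, 4, 2, 4, 5]
item3 = [3, 5, 1, 4, 4]
print(f"alpha = {cronbach_alpha([item1, item2, item3]):.2f}")  # ~0.91
```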
Interrater/Interobserver Reliability
Each of the preceding reliability estimates focuses on participant responses to a test or questionnaire, attempting to address the reliability of responses from a particular sample. Sometimes, however, an expert panel of judges is asked to observe a particular behavior and then
score that behavior based on a predetermined rating scheme. The reliability between the
rater scores—or how much the raters agree with one another or score similarly using the
same measure—is known as interrater reliability (also known as interobserver reliability,
scorer reliability, or judge reliability; Field, 2017).
SKILL BUILDER 8.1: Determining Interrater Reliability
For researchers, the underlying task (which is often not made explicit) is the ability to let others know that their work is reliable. A researcher can do so in a number of ways by discussing
the types of reliability. Sometimes, it is simply a matter of determining how much two people
agree when observing the same event at the same time. This is called interrater reliability,
and determining it becomes relevant to a researcher’s ability to persuade others that their
work is reliable and accurate. To determine interrater reliability, you need a minimum of two
people, although you can have as many people as you want. The important thing is that all
raters are viewing the same event, behavior, intervention, etc., and are scoring it using the
same instrument.
An example of interrater reliability might include a researcher who is interested in observing the leadership and management style of a childcare center. The researcher would first
gather a team of research assistants to observe and measure the behaviors of the administration using the same tool (for example, the Program Administration Scale, Talan & Bloom,
2011) during the same observational window. That is, for instance, the research team would
observe the administrators interact for the same 15-minute interval and rate them on a
scale of 1–7 on a predefined list of actions or policies that are observed. Then, the researcher
would calculate the percentage of agreement on each item and on the overall instrument.
This protocol allows the researcher to present the interrater reliability. By proceeding in
such a way, researchers can have confidence that what they say they are measuring and what
they are actually reporting is accurate.
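The percentage-of-agreement calculation described above reduces to a few lines of code; the two raters’ item scores below are hypothetical.

```python
# Two hypothetical raters scoring the same 10 items during the same
# observation window, each on a 1-7 scale.
rater_1 = [5, 6, 4, 7, 5, 3, 6, 6, 4, 5]
rater_2 = [5, 6, 4, 6, 5, 3, 6, 7, 4, 5]

agreements = sum(a == b for a, b in zip(rater_1, rater_2))
percent_agreement = 100 * agreements / len(rater_1)
print(f"interrater agreement: {percent_agreement:.0f}%")  # -> 80%
```

In practice, chance-corrected indices such as Cohen’s kappa are often reported alongside raw percent agreement, since raters will sometimes agree by luck alone.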
Credibility, Transferability, Confirmability, and Dependability
The terms credibility, transferability, confirmability, and dependability are the qualitative counterparts to the quantitative concepts of reliability and validity.
Credibility
Credibility is the extent to which participant views and perceptions coincide with the way
in which the researcher presents them. Here, it is important that the researcher is actually
presenting what the participant says, feels, does, and perceives (Bloomberg & Volpe, 2019).
Transferability
Transferability is best described as the process by which a research study is applicable under
similar circumstances at different locations (Bloomberg & Volpe, 2019). For example, if someone does an applied doctoral project at a midwestern university, the findings of their study
should (theoretically) be similar at another midwestern university with similar populations
and circumstances.
Confirmability
Confirmability is the notion that the information captured (i.e., the results) in the qualitative
study has been accurately portrayed based upon the actual data (Bloomberg & Volpe, 2019).
For instance, if a student completing a qualitative study interviews 8–10 people and reports that all participants agree that bananas are the most desirable fruit, the actual
data should support that finding. One would not expect to examine the actual data and find
that six people claimed apples to be the most desirable.
Dependability
Dependability is best conceptualized as the researcher’s ability to clearly delineate what was
done and how it was done during the research process. This is accomplished through careful
documentation that is ultimately repeatable under similar circumstances elsewhere (Bloomberg & Volpe, 2019). For example, a student researcher should document their research process as succinctly and explicitly as possible so others can duplicate their study elsewhere with
a certain measure of fidelity.
Validity and Its Threats
Whereas reliability addresses consistency in measurement, validity addresses the question:
Are researchers measuring what they think they are measuring? To answer, there are at least
two major approaches to how researchers think about validity.
One approach to validity comes from the measurement literature and how researchers construct new measurement instruments (known as psychometrics). The classic approach here
is to discuss content validity, construct validity, criterion-related validity, and face validity.
A second approach to validity comes from the study of experimental designs, particularly quasi-experimental designs, as they are part of quantitative methodologies. In fact, some refer
to this latter approach as a “Cook and Campbell” approach, in part due to an influential book
(Cook & Campbell, 1979) that brought together this conceptualization of validity, as well as a
classic listing of threats to validity in quantitative studies. Both major approaches are briefly
reviewed here.
Psychometric Approach
In the classic psychometric approach, there is a trio of Cs: content validity, criterion-related
validity, and construct validity. All three types of validity mentioned here are important; each
is necessary, but not sufficient alone, to establish validity (Morgan et al., 2002).
Content validity refers to the composition of items that make up the test or measurement.
In other words, do the contents of the test adequately reflect the universe of ideas, behaviors,
attitudes, and so forth, that constitute the behavior of interest? For example, if a researcher
is interested in studying introversion and is developing an introversion inventory to measure
the level of introversion, do the actual items on the inventory capture the totality of the concept of introversion? If a student were taking the GRE subject test in psychology, content validity asks the question: Are the items truly capturing their knowledge of psychology?
Criterion-related validity refers to how the measurement outcome, or score, relates to other types
of scores. A general way to think about criterionrelated validity is: Given that a score is reliable,
what does this score predict? Social scientists, for
example, are often interested in making predictions
about behavior, so criterion-related validity can be
very useful in practice. Essentially, criterion-related
validity addresses the predictability of current
events or future events.
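As a brief sketch with invented data, the criterion question can be asked of data directly: how strongly does the score correlate with, and predict, the criterion? Here a hypothetical admissions-test score is checked against first-year GPA; statistics.correlation and statistics.linear_regression require Python 3.10+.

```python
import statistics as st

# Hypothetical data: an admissions-test score (the measure) and
# first-year GPA (the criterion it is meant to predict).
test_score     = [152, 160, 145, 170, 158, 165]
first_year_gpa = [2.9, 3.3, 2.6, 3.8, 3.1, 3.5]

# Criterion-related validity is often summarized by this correlation.
r = st.correlation(test_score, first_year_gpa)
print(f"validity coefficient: r = {r:.2f}")

# ...and cashed out as prediction of the criterion.
slope, intercept = st.linear_regression(test_score, first_year_gpa)
print(f"predicted GPA for a score of 162: {slope * 162 + intercept:.2f}")
```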
Figure: Human intelligence (as well as measures like empathy and success) is a construct. Though the brain is a physical object, intelligence is intangible.

Construct validity has been called “umbrella validity” (Cohen & Swerdlik, 2005, p. 157) because a construct is a hypothetical idea that is intended to be measured but does not exist as a tangible “thing.” Intelligence, for example, is a construct—
a hypothetical (not tangible or physical) idea we believe humans and animals possess in varying degrees. So, if researchers were to do a postmortem examination of a human brain, they
would not be able to extract the part of the brain known as intelligence. Generally speaking,
construct validity exists when a test measures what it purports to measure. Much of what
is studied in the social and behavioral sciences, business, and educational research are constructs, such as humility, sympathy, depression, happiness, anxiety, altruism, success, dependence, and self-esteem.
Although not part of the three Cs of validity, face validity is often mentioned as a validity type
that refers to whether the person taking the test believes it measures what it purports to measure. Face validity is ascertained from the perspective of the test taker, not from test responses.
If the test taker believes the items are unrelated to the stated purpose of the test, the quality
of responses—or the confidence that the test was taken seriously—might be affected.
The three Cs of validity (plus face validity) constitute the classic test construction approach
to validity, which is based on the accumulation of evidence over time. This is an important
concept to consider when developing a measure of behavior. There are other considerations,
as well, such as the level of confidence we have in the conclusions we draw or the generalizability of the results from the present study to other times, places, or settings. Generalizability refers to how useful or applicable the study results are for a broader group of people in
similar situations. Cook and Campbell (1979) offered a different conceptualization of validity,
and these ideas are particularly relevant to the design of research studies.
Cook and Campbell’s Approach
Cook and Campbell (1979) conceptualize validity a bit differently than psychometricians.
They define validity as “the best available approximation to the truth or falsity of propositions, including propositions about cause” (p. 37). According to Cook and Campbell (1979),
four different categories of validity include internal, external, statistical conclusion, and construct validity; however, internal validity is the most applicable in research study design.
Internal validity refers to the general nature of the relationship between independent and
dependent variables. The chief goal in establishing internal validity is the determination of
causality: Did the manipulation of the independent variables cause changes in the dependent
variables? This is a key consideration in the design of experiments and research studies. For a
brief review of the types of threats that might challenge a researcher, see Table 8.1 and Pivotal
Moments in Research: The Hawthorne Studies.
Table 8.1: Classic threats to internal validity

History
Brief definition: Something happens during the experimental session that might change responses on the dependent variable.
Research example: If you are collecting data in a large classroom and the fire alarm goes off, this may impact participant responses.

Maturation
Brief definition: Change in behavior can occur on its own due to the passage of time (aging), experience, etc., with change occurring separately from independent variable manipulation.
Research example: In a within-subjects design, participants view visual information on a computer screen in 200 trials. By the end of the study, participants may become fatigued and change dependent variable responses due to time and experience.

Testing
Brief definition: When testing participants more than once, earlier testing may influence the outcomes of later testing.
Research example: If you design a course to help students do better on the GRE, students take the GRE at the beginning of the course and again at the end. Mere exposure to the GRE the first time may influence scores the second time, regardless of the intervention.

Instrumentation
Brief definition: Change occurs in the method by which you are collecting data; that is, your instrumentation changes.
Research example: If you are collecting data through a survey program on a website, and the website crashes during your experiment, you have experienced an instrumentation failure.

Statistical regression
Brief definition: When experimental and control group assignments are based on extreme scores in a distribution, the individuals at the extremes tend to have scores that move toward the middle of the distribution (extreme scores, when they change, become less extreme).
Research example: In a grade school setting, children scoring the absolute lowest on a reading ability test are given extra instruction each week on reading, and they are retested after the program is complete. Because these children’s scores were at the absolute lowest part of the distribution, these scores, when they change, have nowhere else to go but up. Is that change due to the effectiveness of the reading program or statistical regression?

Selection
Brief definition: When people are selected to serve in different groups, such as an experimental group and a control group, are there preexisting group differences even before the introduction of the independent variable?
Research example: In some studies, volunteers are recruited because of the type of study or potential implications (such as developing a new drug in a clinical trial). Volunteers, however, are often motivated differently than non-volunteers. This preexisting difference, at the point of group selection, may influence the effectiveness of the independent variable manipulation.

Mortality
Brief definition: Individuals drop out of a study at a differential rate in one group as compared to another group.
Research example: In your study, you start with 50 individuals in the treatment group and 50 individuals in the control group. When the study is complete, you have 48 individuals in the control group but only 32 individuals in the experimental group. There was more mortality (“loss of participants”) in one group compared to the other, which means there is a potential threat to the conclusions you draw from the study.

Interaction with selection
Brief definition: If some of the preceding threats happen in one group but not in the other group (selection), these threats are said to interact with selection.
Research example: If the instrumentation fails in the control group but not in the experimental group, this is known as a selection × instrumentation threat. If something happens during the course of the study to one group but not the other, this is a selection × history threat.

Diffusion/imitation of treatments
Brief definition: If information or opportunities for the experimental group spill over into the control group, the control group can obtain some benefit of an independent variable manipulation.
Research example: In an exercise study at the campus recreation center, students in the experimental group are given specific exercises to perform in a specific sequence to maximize health benefits. Control group members also working out at the recreation center catch on to this sequence and start using it on their own.

Compensatory equalization of treatments
Brief definition: When participants discover their control group status and believe that the experimental group is receiving something valuable, control group members may work harder to overcome their underdog status.
Research example: In a study of academic achievement, participants in the experimental group are given special materials that help them learn the subject matter better and perform better on tests. Control group members, hearing of the advantage they did not receive, vow to work twice as hard to keep up with the experimental group and show them that they can do work at equivalent levels.

Resentful demoralization
Brief definition: When participants discover their control group status and realize that the experimental group is receiving something valuable, they decide to “give up” and stop trying as hard as they normally would.
Research example: In the same academic achievement example as above, rather than vowing to overcome their underdog status, the control group simply gives up on learning the material, possibly believing that the experiment is unfair, and wondering why they should bother trying.
PIVOTAL MOMENTS IN RESEARCH: The Hawthorne Studies
Generally speaking, the Hawthorne effect refers to the situation where study participants
may band together to work harder than normal, perhaps because they have been specially
selected for a study, or they feel loyalty to the researchers or the particular situation. The
Hawthorne studies—so named because they were conducted at Western Electric Company’s
plant in Hawthorne, Illinois—began in the 1920s and
ended in the 1930s. Fritz J. Roethlisberger of Harvard
University and W. J. Dickson of the Western Electric
Company were chiefly involved in these efforts, but
many consultants were brought in over the course of the
multiyear studies (Baritz, 1960).
Figure: Workers at the Western Electric Company factory, circa 1945.

The first set of studies, beginning in November 1924, examined how different levels of lighting affected worker productivity. In one variation, individuals were tested with lighting at 10 foot-candles. (Roughly speaking, 1 foot-candle is the amount of light that one candle generates 1 foot away from the candle.) With each successive work period, lighting decreased by 1 foot-candle. Interestingly, when lighting was decreased from 10 foot-candles to 9 foot-candles, productivity increased.
In fact, productivity continued to increase with decreased lighting until about 3 foot-candles,
at which point productivity decreased (Adair, 1984; Roethlisberger & Dickson, 1939). The
researchers understood, if nothing else from this study, that understanding productivity was
much more complicated than lighting.
Around April 1927, a second series of studies began, which would typically be referred to as
the Relay Assembly Test Room Studies (Adair, 1984; Baritz, 1960). Experimentally speaking,
Roethlisberger and Dickson became more “rigorous” in this series of studies. For example,
they selected five female employees who were relay assemblers out of a large department
and placed these employees in a special test room for better control of the conditions and
variables to be tested. One could measure the daily and weekly output of test relays assembled by each woman as the dependent variable.
Prior to moving the female workers into the test room, researchers were able to establish
a baseline of productivity based on employee records. Over the course of 270 weeks, the
researchers systematically varied the conditions in the relay assembly test room, all the
while recording dependent variable data on the number of test relays. For example, sometimes the amount of voluntary rest time was increased, and sometimes it was decreased.
Sometimes rest breaks were increased in the morning but lengthened in the afternoon. Once
workers were given Saturday mornings off (a 48-hour workweek was customary at the time).
Researchers also occasionally reinstated baseline control conditions. Productivity seemed
to increase regardless of the manipulation introduced (Adair, 1984). In other words, even
when experimental conditions were manipulated to attempt to decrease productivity, oftentimes productivity increased. When the employees returned to baseline control conditions,
“unexpectedly, rather than dropping to pre-experiment levels, productivity was maintained”
(Adair, 1984, p. 336).
There were also additional studies as part of the Hawthorne studies, such as the Mica
Splitting Test Room and the Bank Wiring Room, all with similar results. So, what is the
Hawthorne effect? Simply put, Stagner (1982) defined the Hawthorne effect as “a tendency
of human beings to be influenced by special attention from others” (p. 856). In other words,
when people are given special attention, they may behave differently.
Similarly, Goffman (1959) proposed that individuals live their lives as if they are performing
on a theater stage. He believed that our behavior is a series of performances, each delivered with the intent to give the best performance we can. Goffman developed these beliefs into what he called dramaturgical theory.
Questions for Critical Thinking and Reflection
• How can individuals connect the results from the Hawthorne studies to their own work
history and experience? What are the motivational influences that help individuals thrive
at work? Think about any sort of program or experimentation that you might believe
could increase worker productivity or employee satisfaction. What types of skills do students learn in college, and particularly in this course, that can be applied to the workplace
setting in answering these important questions?
• Imagine you are thinking about designing a research study. What impact do you think the
principles of the Hawthorne effect would have on your experimental design? Does the
motivation of a participant in your proposed study possibly affect the validity of the data
that would be collected?
• What are the conditions (that is, variables) that motivate you to work the way you do?
Would a higher salary lead to an increase in your productivity? What about a larger workspace, greater respect, added responsibility, etc.? Fundamental attribution error is an idea
from social psychology that suggests humans attribute the behavior of others to internal
decisions, while sometimes attributing their own decisions to external forces. Thus, you
might think that if your coworker Finn would just work harder and not be so lazy, he
would get that promotion; but when you think about yourself, you think about how the
work conditions and the last manager prevented you from being promoted. Do thought
patterns about the workplace fall in line with the fundamental attribution error? What
motivates you, and how might others create an environment to help you grow and thrive?
The ideas of reliability and validity are central to both the observation and measurement of
behavior. These are everyday ideas that researchers utilize to improve their work. One other
note to make about the relationship between validity and reliability: An instrument can be reliable without being valid, but an instrument can be valid only if it also measures reliably.
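A small simulation makes the asymmetry vivid: the hypothetical miscalibrated bathroom scale below produces tightly clustered readings (reliable) that are all systematically wrong (not valid). The true weight is an assumed value.

```python
import statistics as st

TRUE_WEIGHT = 150.0  # pounds; assumed for illustration

# Readings from a hypothetical scale that is consistent but sits
# about five pounds too high.
readings = [155.1, 154.9, 155.0, 155.2, 154.8]

print(f"spread (sd): {st.stdev(readings):.2f} lb  <- small, so reliable")
print(f"bias:        {st.mean(readings) - TRUE_WEIGHT:+.2f} lb  <- large, so not valid")
```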
Understanding the measurement of behavior in a reliable and valid manner is important, but
what about the actual collection of data? When one relies on numerical (quantitative) scores,
what do the actual numbers mean, and how might a researcher analyze the data collected?
The next section covers these questions in detail.
8.3 Scales of Measurement and Statistic Selection
This chapter is about observation and measurement. When the dependent variable is measured quantitatively, researchers typically desire reliable and valid numerical scores. But how
are these scores obtained?