Description
Complete an article analysis for each article using the “Article Analysis 1” template. An example is attached. Please look at the example and provide similar work.
Unformatted Attachment Preview
Article Analysis 1 (template)

Layout: one column of criteria, with a “Description” column for Article 1 and a “Description” column for Article 2.

Criteria:
- Article Citation and Permalink (APA format)
- Broad Topic Area/Title
- Identify Independent and Dependent Variables and Type of Data for the Variables
- Population of Interest for the Study
- Sample
- Sampling Method
- Descriptive Statistics (Mean, Median, Mode; Standard Deviation): identify examples of descriptive statistics in the article.
Academy of Management Review
2021, Vol. 46, No. 3, 465–488.
https://doi.org/10.5465/amr.2018.0421
LEARNING FROM TESTIMONY ON QUANTITATIVE
RESEARCH IN MANAGEMENT
ANDREW KING*
Boston University
BRENT GOLDFARB
University of Maryland
TIMOTHY SIMCOE
Boston University
Published testimony in management, as in other sciences, includes cases where authors
have overstated the inferential value of their analysis. Where some scholars have diagnosed a current crisis, we detect an ongoing and universal difficulty: the epistemic problem of learning from testimony. Overcoming this difficulty will require responses suited
to the conditions of management research. To that end, we review the philosophical literature on the epistemology of testimony, which describes the conditions under which common empirical claims provide a basis for knowledge, and we evaluate ways these
conditions can be verified. We conclude that in many areas of management research, popular proposals such as preregistration and replication are unlikely to be effective. We suggest revised modes of testimony that could help researchers and readers to avoid some
barriers to learning from testimony. Finally, we imagine the implications of our analysis
for management scholarship and propose how new standards could come about.

Acknowledgments: We would like to thank Clark Glymour, Mikko Ketokivi, Edward Leamer, Saku Mantere, Myriam Mariani, P. J. Mode, Darrell Rowbottom, Bill Starbuck, members of the University of Oregon (Lundquist) and Rutgers Business School research seminars, participants at the Utah Strategy Conference (2018), and members of the Boston University (Questrom) and University of Maryland (Robert H. Smith School) brownbag research seminars.
How do we build our knowledge of business management? Sometimes we learn by evaluating evidence directly, but more frequently we learn by reading or hearing the testimony of other scholars concerning their own observations and inferences. Yet, most management scholars spend far more time thinking about how they should learn directly from evidence than about how they should learn from the testimony of others. As students, we receive training in the logical requirements for learning from evidence, but almost no instruction in the philosophical issues related to learning from testimony. As researchers, we think about the inferences we make from data, but seldom concern ourselves with the epistemological problems our readers face in trying to learn from our research. In this article, we attempt to rectify this deficiency in our understanding, and by so doing provide guidance on how we should learn from testimony about quantitative empirical research in management.
We believe that our project is both timely and
timeless. Several recent studies have argued that reports of empirical research in management may provide an uncertain basis for knowledge. In some
circles, this has led to a perception of a field in
“crisis,” and yet the problem we face is neither recent nor unique (Everett & Earp, 2015; Honig et al.,
2018). Testimony of all forms has a vulnerability
problem. “It is inherent in the nature” of reports
from one person to another, Elizabeth Fricker (1994:
145–146) wrote “that insincerity and honest error
are both perfectly possible.” As a result, scholars in
all fields, including management, must contemplate
how to learn from empirical reports, and they must
consider how the best approach for doing so changes
with their local empirical context.
To develop a theoretical framework for management scholars and readers, we first review philosophical perspectives on learning from the
testimony of others. We assess the conditions under
which three common types of empirical claims
should be used as a basis for knowledge. We then
outline current proposals for assessing the epistemic
value of such claims, and we evaluate whether they
are likely to be broadly effective in management. Finally, we propose an approach that suits the conditions in many areas of management research. This
approach involves the use of what we call epistemic
maps to connect a “multiverse” of empirical assumptions to inferential outcomes. Such maps, we
argue, would encourage authors to engage in better
research practices and to report in a more forthright
manner. For readers, such maps would allow more
direct inference and thereby enable a means of circumventing difficult aspects of testimony’s vulnerability problem.
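To make the idea concrete, the sketch below shows one way such an epistemic map could be assembled: enumerate a small “multiverse” of defensible analytic choices, estimate the focal relationship under every combination, and report the full mapping from choices to estimates. This is an illustration of the general idea only; the simulated data, variable names, and the particular choice set are assumptions for the example, not the authors’ procedure.

```python
# Illustrative sketch of an "epistemic map": enumerate analytic choices,
# fit a simple OLS model under each combination, and record how the focal
# estimate varies. All data and choices below are simulated assumptions.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 500
incentive = rng.normal(size=n)                  # hypothetical predictor
tenure = rng.normal(size=n)                     # hypothetical control
effort = 0.3 * incentive + 0.2 * tenure + rng.normal(size=n)

choices = {
    "include_tenure_control": [True, False],
    "trim_outliers": [True, False],             # drop outcomes > 2.5 SD from mean
    "arcsinh_outcome": [True, False],           # transform the outcome
}

def estimate(include_control, trim, transform):
    y, x1, x2 = effort, incentive, tenure
    if trim:
        keep = np.abs(y - y.mean()) <= 2.5 * y.std()
        y, x1, x2 = y[keep], x1[keep], x2[keep]
    if transform:
        y = np.arcsinh(y)
    cols = [np.ones_like(x1), x1] + ([x2] if include_control else [])
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]                              # coefficient on the predictor

epistemic_map = {combo: estimate(*combo)
                 for combo in itertools.product(*choices.values())}
for combo, b in sorted(epistemic_map.items(), key=lambda kv: kv[1]):
    print(dict(zip(choices, combo)), f"estimate = {b:+.3f}")
```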
THE EPISTEMOLOGY OF TESTIMONY
As used by philosophers of science, the term
“testimony” has been defined as “remarks from person A that invite us to accept proposition P” (Coady,
1992: 32–33). This definition embraces a broad array
of “tellings,” which range from driving directions to
reports of scientific research (Fricker, 1995; Wilholt,
2013). Many scholars have noted that testimony is
central to our understanding of the world (Adler,
2006; Coady, 1992). For example, few of us have
seen red blood cells or evidence of DNA, but we believe they exist. Few management scholars have directly observed the effect of incentives on effort or of
training on performance, yet many of us think we
know the direction of these relationships, and we
“know” these things largely by hearing or reading
testimony presented in a variety of settings, such as
classrooms, seminars, and publications.
Yet all testimony provides an uncertain basis for
knowledge. To use information provided via testimony, the hearer or reader “ascribes to the speaker
justification or warrant of knowledge for what she
asserts” (Adler, 2006). But what is the basis for granting this warrant? This question represents “the dominant epistemological problem of testimony” (Adler,
2006). One group of scholars, termed reductionists,[1] have argued that testimony must be validated a posteriori before it can be used as a source of knowledge. At the other extreme, antireductionists have argued that testimony should be trusted “presumptively”—that is, without recourse to supporting evidence (Coady, 1992). In between these two positions, philosophers such as Fricker (2004) argued for a contingent approach to testimony based on local conditions.

[1] “The term ‘reduction’ as used in philosophy expresses the idea that if an entity x reduces to an entity y then y is in a sense prior to x, is more basic than x” (Van Riel & Van Gulick, 2016). With respect to theory development, reductionists are those that think science can and should advance by substituting new and more general theories for older, more specific ones. With respect to the use of testimony, reductionists believe that direct experience is prior to and more basic than testimony. The two uses and groups are not identical.
The main premise of the reductionist thesis is
straightforward: testimony, because it relies on another human, is vulnerable to error and insincerity.
As a result, reductionists have argued that the use
of testimony can only be justified via recourse to other sources, such as direct perception or prior experience. The Scottish philosopher David Hume is
typically accepted as a proponent of the reductionist
perspective (Adler, 2006). He argued that testimony
should be trusted only when it conforms with
“reality,” or when its source has been shown to be reliable (Hume, 1748/1993). Since Hume, the reductionist position has been repeatedly attacked. As a
result, an antireductionist position is now ascendant
among philosophers of science (Adler, 2006).
Antireductionists argue that a posteriori evaluation of testimony is seldom feasible, so we should
therefore trust testimony “presumptively.” For example, Coady (1992) argued that (a) our beliefs are
largely based on testimony that we have trusted
without evidence, (b) these beliefs are evidently
helpful, and critically (c) there is rarely a feasible
way to assess the testimony we receive. As a result,
Coady (1992: vii) concluded that granting testimony
a presumptive “warrant of authority” is “the only
honest [position] to adopt.”
Fricker (1994) proposed a middle ground between
the two sides. She decried Coady’s recommendation
of presumptive trust as “an epistemic charter for the
gullible and undiscriminating” (Fricker, 1994: 126),
but also admitted that, for many types of testimony,
it is simply too difficult to check its veracity. Users of
testimony, she argued, should consider local conditions before deciding how to approach testimony, because in some circumstances they will be able to
ascertain whether that testimony meets the necessary conditions for “veridicality.”
Fricker (2004) pointed out that hearing or reading
testimony, like any experience, provides a basis for
knowledge when its operation is veridical—that is,
the experience links to something real. For example,
the operation “seeing” is veridical if the viewer’s perception of an object corresponds with an actual object. Similarly, the operation “read testimony” is
veridical if the author is competent to advance their
proposition and expresses it in a sincere way. For
empirical research, this means that an author’s
claims are justified by appropriate empirical analysis
and reported in a forthright way. Readers or hearers
cannot determine whether claims are true or false,
Fricker (2002) reasoned, but they can evaluate
whether claims satisfy veridicality conditions. She
contended that by reframing testimony in this way
readers can decide how to use a research report;
however, she did not provide instructions for how
they should do so (Fricker, 2002). In this paper, we
take up the challenge of proposing an approach for
evaluating veridicality that is suitable to conditions
common in management research.
The challenge we set is daunting, perhaps even insurmountable. Many authorities on the sociology of
science are less sanguine than Fricker about the opportunity for local evaluation of the veridicality of empirical reports. Helen Longino (2002), for example, argued
that a reader’s ability to evaluate an empirical claim is
severely limited by the incompleteness or imprecision
of published reports. She noted that empirical science
requires countless methodological decisions that are
“underdetermined” by the research questions or data.
Gelman and Loken (2015) illustrated the importance of
this point by comparing the empirical process to a
stroll through a “garden of forking paths.” At each fork,
the empirical researcher makes a choice that influences
where they exit the garden—that is, the estimates obtained and inferences formed. These choices, or the
logic behind them, are seldom fully documented
(Douglas, 2000). Such incomplete reporting, Torsten
Wilholt (2013) argued, prevents readers, even expert
ones, from evaluating whether the researcher had justification for the claims advanced. Assessing the veridicality of claims, Wilholt (2013) argued, would require
peering into the mind of the researcher to understand
their assumptions, values, and goals. Only then could
the reader fill in missing information about what empirical choices led to the reported claims. Wilholt
(2013) admitted, however, that his analysis was limited
to particular types of empirical claims; he did not consider how his assessment might be affected by “local”
norms of analysis and reporting.
In fact, none of the above assessments are situated
in the management literature, and thus they do not
enumerate or evaluate the empirical claims typically
made by authors in our field. To assess the feasibility
of Fricker’s local reduction, we must first consider
the kinds of claims common in management. Doing
so will allow us to understand their veridicality conditions, and inform our analysis of whether or when
reduction is possible.
CONDITIONS FOR VERIDICAL
RESEARCH CLAIMS
In this section, we consider common types of
empirical claims advanced in the management literature, and evaluate their veridicality conditions:
that is, when authors are competent and sincere in
advancing them. For scientific claims, competence
means that the speaker has a basis that is justified
by the epistemology of science (Fricker, 2002). For
quantitative empirical analysis, this means that the
scientist must somehow circumvent David Hume’s
argument that no claim to knowledge is ever warranted; all inferences from evidence, Hume argued,
require supporting assumptions, and since these assumptions cannot be validated independently, all
inferences are vulnerable to error (Hume, 1748/
1993). This “enormously important result” means
that Hume has since “loomed like a colossus” over
the development of the philosophy of science
(Stove, 1982: 72). Many scholars have attempted direct assaults on his argument, but these have largely
fallen from favor and been abandoned (Hacking,
2001). Approaches that avoid, rather than overcome, Hume’s problem of induction have been
more successful, and provide justification for the
types of claims commonly expressed in the management literature.
Although the precise language used to describe
“key findings” in the management literature
varies from one paper to the next, authors have
usually advanced claims that fall into one of three
broad categories: explanations, frequencies, or
beliefs.[2] Each type of claim differs in strength and
nature, and each comes from a different tradition
in epistemology. These traditions both enable and
constrain what scholars can infer from evidence.
They provide a means for circumventing David
Hume’s argument that no claims to knowledge are
ever warranted; yet they constrain authors by requiring them to follow exacting empirical procedures and to express their inferences in a specific
manner.

[2] The latter two have also been called “frequency and belief probabilities” (Hacking, 2001), “probability 1 and probability 2” (Carnap, 1962), or aleatory and epistemic probabilities (Keynes); see Bateman (1987).
Competence to Make Common Claims
An explanation is a conjecture about an observed pattern of evidence. Explanations are often
presented in management reports as part of “post
hoc analyses of alternative patterns in data, and in
discussion sections where non-significant or unanticipated results are speculated upon or where
links to other findings are proposed, [and] where
mysteries are explored” (Behfar & Okhuysen,
2018: 327).[3] An example of an explanation drawn
from the management literature is found in Zaheer and Soda (2009: 28): “A possible explanation
for this result is that we investigate the redundancy of the network at the team level of analysis,
where factors such as efficiency and routines … may be exerting stronger influences on
performance.”

[3] Behfar and Okhuysen (2018) used the term “plausible knowledge claim,” but we prefer the more standard “explanation.”
The philosopher Charles Sanders Peirce was an early student of such explanations, and he termed the process of finding them “abduction,” from the Latin “to take away.” Peirce hoped to develop a basis
by which abduction could lead to truth claims, but
he eventually concluded that abduction provided
justification for nothing more than “guesses.” In the
past few decades, some scholars have argued that the
epistemic virtues (e.g., likeliness, simplicity, elegance, fruitfulness, etc.) may provide a basis for selecting some explanations over others (Ketokivi &
Mantere, 2010; Lipton, 2003). At present, however,
most epistemologists have agreed that abduction
provides justification only for conjectures about possible explanations for observed evidence (Adler,
2006). For example, Schurz (2008: 205) argued that
abduction allows only a basis for making a
“promising explanatory conjecture,” which then
must be “subject to further test.” Framed this way,
explanations avoid Hume’s skepticism with respect
to induction by making no claim that knowledge has
been inferred from evidence. In accepting explanations, readers do not need to face Wilholt’s (2013)
challenge that they must perceive and agree with researcher choices and values to determine whether a
claim is veridical. The reader need only conclude
that the evidence exists and the explanation is plausible. In the management literature, explanations are
frequently debated and discussed but are accepted
as veridical so long as they have an evidentiary basis
and advance no stronger claim (Behfar & Okhuysen,
2018; Mitnick, 2000).
Frequency claims are also common in the management literature. Chatterjee and Hambrick (2007: 216)
provided an example of a classic frequency claim:
“Both measures of objective performance were
significantly positively associated with risky outlays … at p < .05 and … at p < .01.” Frequency
claims can also be expressed in terms of a
“confidence interval”—that is, a prediction for how
often an estimate will fall within a specified range.
The statistical logic of these approaches was developed, respectively, by Ronald Fisher, and by Jerzy Neyman and Egon Pearson (Schneider, 2015). Both
methods provide a means of avoiding Hume’s problem of induction by allowing inference to rational action without the need for interpretation with respect
to the truth (Hacking, 2001). For example, a clinical
trial of a drug may estimate the conditional frequency of positive or negative outcomes, and thereby allow actors to make rational cost–benefit decisions,
without ever needing to know whether the drug itself
caused those outcomes.
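As an illustration of this frequency reading, the short simulation below (using assumed, illustrative values, not data from the studies discussed here) shows what a 95% confidence interval claims: over many repetitions of the same sampling-and-analysis procedure, the interval covers the true parameter about 95% of the time.

```python
# Minimal sketch of the frequentist meaning of a 95% confidence interval.
# True mean, spread, and sample size are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
true_mean, sigma, n, reps = 0.5, 1.0, 50, 10_000
covered = 0
for _ in range(reps):
    sample = rng.normal(true_mean, sigma, size=n)
    se = sample.std(ddof=1) / np.sqrt(n)
    lo, hi = sample.mean() - 1.96 * se, sample.mean() + 1.96 * se
    covered += lo <= true_mean <= hi
print(f"coverage ≈ {covered / reps:.3f}")   # close to 0.95 by construction
```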
Frequentist analysis places strict limitations on
the empirical process researchers must follow, because frequency claims describe what is expected to
happen if an identical test were conducted on an
identically constructed sample from the same population. Practically, this means that authors must
specify in advance (a) the hypotheses to be tested, (b)
the sampling plan and population to be used,[4] (c) the
parameters to be evaluated, and (d) the test statistics
to be employed (Spanos, 2010). Even slight freedom
to deviate from these plans will undermine frequency claims (Sanborn & Hills, 2014). For example, if a
researcher does not set a test plan in advance,
they may not know how to interpret multiple tests
(Fisher, 1960). Should they, for example, consider a
particular test statistic to be (a) independent or (b)
part of a connected family of tests? If they have not
specified their rule in advance, they are unlikely to
know how statistics should be interpreted, and, as a
result, they are not competent to make a precise frequency claim.
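A small numerical illustration (our own, assuming independent tests) shows why the rule must be fixed in advance: the same nominal threshold means very different things depending on whether a statistic is read as a single prespecified test or as the best result from a family of k tests.

```python
# Sketch: the chance of at least one "significant" result at alpha = .05 grows
# quickly with the number of tests actually run, so interpretation depends on
# whether the family of tests was specified in advance. Illustrative only.
alpha = 0.05
for k in (1, 5, 10, 20):
    family_wise_error = 1 - (1 - alpha) ** k    # assumes independent tests
    bonferroni_threshold = alpha / k            # one common family-wise adjustment
    print(f"k={k:2d}: P(>=1 false positive) = {family_wise_error:.2f}, "
          f"Bonferroni per-test threshold = {bonferroni_threshold:.4f}")
```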
The importance of prespecification creates a problem for readers who encounter frequency claims in
management research. A reader can inspect the statistical process and analytical methods that were
used, but cannot ascertain when or how that process
was chosen. The reader can evaluate whether an appropriate statistical method was employed, but this
provides only a necessary (and not sufficient) condition for justifying frequency claims. To know that
the author is competent to advance a frequency
claim, the reader must know that empirical choices
were set in advance. Thus, consistent with Wilholt’s
(2013) argument, to establish veridicality the reader
or hearer of an empirical claim must somehow see
into the mind of the researcher to ascertain when
and why they selected a particular research design.

[4] The ability to stop or continue gathering data (what epistemologists have referred to as “optional stopping”) can undermine frequency claims, even when the decision to continue or stop does not depend on the data being gathered or the estimate obtained (Mayo, 1996).
Several scholars in management have pointed out
that conditions for veridical frequency claims are
commonly violated. For example, Richard Bettis
(2012) has reported that, based on his personal experience, scholars often search through data for plausible hypotheses—indeed, prominent scholars have
admitted to following and advocating this practice
(Bartlett, 2017). The problem is not limited to management. In psychology, John, Loewenstein, and
Prelec (2012) reported that the majority of the researchers they surveyed admitted to adjusting
their sampling plan during their analysis. In economics, Nobel Laureate James Heckman and statistician Burton Singer observed that most scholars
violate the conditions for justified frequentist
claims: “Peeking at the data and formulating and
building new models in the light of those views is
an often-committed frequentist sin” (Heckman &
Singer, 2017: 299).
Belief claims express how hypotheses should
be understood in light of observed evidence—
that is, the probability that a hypothesis is true
conditional on the evidence, P(H|E). Such
claims often represent a researcher’s ultimate
summary of their analysis. A good example of a
belief claim can be found in Sine and Lee (2009:
151): “Our analysis indicates that the presence
of local social movements was responsible for
this regional variation [in wind energy entrepreneurship].” Because belief claims necessitate extrapolation from the physical realm of
experience to the epistemic realm of truth, they
must surmount Hume’s objection that such induction is never justified. How can this be overcome? The most enduring approaches have
sought to avoid the problem by shifting the nature of belief claims. The goal is to allow a rational approach to knowledge without ever
claiming confirmation of knowledge itself
(Hacking, 2001). Scholars working within the
traditions of logical probability and classical
statistics have both proposed justifying
conditions.
One approach, loosely termed Bayesian, concedes that Hume is correct that we are never justified to say we know the truth, but contends that
we can know “whether we are reasonable in modifying [our] opinions in the light of new experience” (Hacking, 2001: 256).[5] Unfortunately, a
fully Bayesian approach to learning faces both
logical and practical difficulties, and thus is rarely used in many areas of social science, including
management (Glymour, 1980; Senn, 2011). To
rescue this approach to justification of belief
claims, Hacking (1965) argued that most scholars
can and do justify directional arguments about
belief using a “law of likelihoods.” This law is
easily derived from Bayes Rule, yet it obviates
the need for scientists to specify ex ante probability distributions. It also provides possible justification for commonly made claims that
evidence “supports” a proposed hypothesis. A
common form of the law states that: if the evidence (E) is more likely conditional upon a preferred hypothesis (H) than it is conditional on all
alternative hypotheses (!H), then the evidence
supports belief in H.[6]

[5] This tactic has the added benefit that it avoids some classic problems of induction (e.g., black swans) and accommodates some commonly practiced methods in science, such as falsification.

[6] This law can be stated in terms of the likelihood ratio P(E|H) / P(E|!H) > 1, where P(E|H) is the conditional probability of observing evidence E given the truth of hypothesis H.
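A minimal sketch of the law as stated in footnote [6], using purely illustrative probabilities rather than values from any cited study:

```python
# Law of likelihoods sketch: evidence E supports hypothesis H over the
# alternatives !H when P(E|H) / P(E|!H) > 1. Probabilities are assumptions.
def likelihood_ratio(p_e_given_h: float, p_e_given_not_h: float) -> float:
    return p_e_given_h / p_e_given_not_h

p_e_given_h = 0.60      # how likely the observed evidence is if H is true
p_e_given_not_h = 0.15  # how likely the same evidence is under the alternatives

lr = likelihood_ratio(p_e_given_h, p_e_given_not_h)
print(f"likelihood ratio = {lr:.1f}")
print("evidence supports H" if lr > 1 else "evidence does not support H")
```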
Given the widespread use of non-Bayesian
methods, some scholars have sought to develop a
justification for belief claims that relies only on
classical statistics (Mayo, 1996; Mayo & Cox,
2006; Mayo & Spanos, 2011). Though their program is ongoing, their initial statements about
justifying conditions closely mirror the law of
likelihoods. Researchers have been justified in
claiming that the evidence provides “support”
for a hypothesis, they reasoned, when the data
agree with the preferred hypothesis, and there is
a low probability that the data agree with any alternative hypothesis. Thus, for both Bayesian
and non-Bayesian epistemologists, to make a belief claim it is necessary but not sufficient to provide evidence that is consistent with a preferred
hypothesis. For a belief claim to be veridical, the
author must also rule out alternative mechanisms and rival hypotheses that could generate
the same result.
Because belief claims are only justified if an author
considers alternative explanations for the evidence,
they are closely linked to what econometricians
have called the identification problem.[7] Randomized controlled experiments provide one powerful
method for ruling out rival hypotheses, but other
modes of analysis are also commonly used. Indeed,
James Heckman (2000) argued that a major contribution of twentieth-century econometrics was the development of methods for demonstrating that a
particular explanation is identified, in the sense that
a statistical association can be connected to a particular hypothesis. The resulting “identification revolution” has had a strong influence on empirical
research in economics and management (Angrist &
Pischke, 2010).
Distinguished scholars have noted that confusion
about the necessary conditions for making a belief
claim often leads to erroneous claims (Schwab,
Abrahamson, Starbuck, & Fidler, 2011). For example, Schneider (2015: 421) argued that many scholars
“regard p values as a statement about the probability
of a null hypothesis being true or conversely, 1 − p
as the probability of the alternative hypothesis being
true.” Other scholars have identified cases where researchers have simply failed to consider important
rival explanations, such as endogeneity or mediation, and moved directly from evidence that is consistent with a hypothesis to a claim that the evidence
supports that hypothesis (Antonakis, Bendahan,
Jacquart, & Lalive, 2010; Hamilton & Nickerson,
2003; Shaver, 2005). In the above cases, necessary
justification for belief claims can be ruled out relatively easily.
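A back-of-the-envelope calculation (with assumed, illustrative base rates and power, not estimates from the cited studies) shows how far a p value can sit from the posterior probability it is often mistaken for:

```python
# Sketch: a p value is not P(null hypothesis | data). Even with p < .05, the
# posterior probability of the null can be large when true effects are rare
# or power is modest. All inputs are illustrative assumptions.
prior_h1 = 0.10          # assumed share of tested hypotheses that are true
power = 0.50             # assumed P(significant result | H1 true)
alpha = 0.05             # P(significant result | H0 true)

p_sig = power * prior_h1 + alpha * (1 - prior_h1)
p_h0_given_sig = alpha * (1 - prior_h1) / p_sig    # Bayes' rule
print(f"P(H0 | p < .05) ≈ {p_h0_given_sig:.2f}")   # ≈ 0.47, not 0.05
```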
Readers face a greater challenge in determining
that researchers had sufficient justification for the
belief claims they advanced. First, they must evaluate whether all reasonable alternative explanations
were considered. If not, the law of likelihoods will
deliver a misleading result: it will imply belief in a
particular hypothesis, when in fact this hypothesis is
simply the “best of a bad lot” (Douven, 2002: 356).
Second, the reader must ascertain the assumptions
that enabled identification of the connection between the evidence and particular hypotheses
(Leamer, 2010). Because belief claims are always
contingent on acceptance of a set of maintained
assumptions (Manski, 1995), readers hoping to use
those claims as a basis for knowledge must be able to
observe, and endorse, those assumptions. Thus,
once again, for belief claims, we encounter Wilholt’s
(2013) barrier to evaluating the veridicality of testimony: given the incompleteness of empirical reports
and our inability to observe researcher assumptions,
we cannot reduce testimony about belief claims.

[7] Manski (1995: 4) defined identification as “seek[ing] to characterize the conclusions that could be drawn if one could use the sampling process to obtain an unlimited number of observations,” and noted that “identification problems [pose] inferential difficulties [that] can be alleviated only by invoking stronger assumptions or initiating new sampling processes that yield different kinds of data.”
Sincerity
Above, we discuss the conditions under which a
speaker or writer is competent to advance an inference claim. For such testimony to be veridical, a
competent speaker must also express the claim in a
sincere way (Fricker, 2004).
As used by philosophers of science, “sincerity”
has a deeper meaning than honesty or conformance
with an obligation to “say what is true.” Sincerity requires the speaker to make claims that accurately reflect their understanding (Elgin, 2002). For example,
it is insincere, although true, to say that “Beth is
somewhere in the United States” if one knows she is
in Minneapolis. Likewise, a sincere author cannot report only those statistical models with stronger (or
weaker) results, even if those results were properly
calculated, unless they have reason to believe that
such results are most informative. Sincerity also includes a requirement to provide information in a
way that will convey the proper understanding of
the speaker’s knowledge. Within management science, this requirement to consider the reader means
that reports must reflect norms of communication.
For example, researchers cannot imply doubt or confidence that they do not actually feel.[8]

[8] In the case of published testimony, the requirement for sincerity is shared by the reviewers and editors of the journal. Sincerity means that the published testimony must not misinform the journal’s readership. Notably, the common practice of selecting publications based on the strength or significance of a result is a clear violation of sincerity—whether this selection occurs at the level of the researcher or that of the journal.
Scholars in all fields have noted misrepresentations of inference claims. Most commonly, estimates
obtained through an abductive search for an explanation are represented as a valid frequency claim
(Bettis, Ethiraj, Gambardella, Helfat, & Mitchell,
2016). The practice of HARKing (Hypothesizing after
Results are Known) is a common example of such
misreporting. In HARKing, the researcher evaluates
many possible combinations of explanatory and
outcome variables. After finding a relationship with
attractive properties, the researcher develops an explanation for this relationship. So far, the researcher
is following a valid abductive process and is competent to claim a possible explanation for the observed
evidence. However, many researchers report as if
they had followed a frequentist test process, make
frequency claims, and thereby encourage readers to
accept a stronger inference than they are justified to
make.
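The simulation below illustrates this mismatch under assumed conditions (pure-noise data and 20 candidate outcome variables): an abductive search across many outcomes routinely turns up at least one conventionally “significant” correlation, which a sincere report could not present as the result of a single prespecified frequentist test.

```python
# Sketch of how searching over many outcomes inflates apparent significance.
# Data are simulated noise; the counts are illustrative, not from any study.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, n_outcomes, trials = 100, 20, 2_000
runs_with_a_hit = 0
for _ in range(trials):
    x = rng.normal(size=n)
    # best (smallest) p value across 20 noise outcomes; [1] is the p value
    best_p = min(stats.pearsonr(x, rng.normal(size=n))[1] for _ in range(n_outcomes))
    runs_with_a_hit += best_p < 0.05
print(f"P(at least one 'significant' result | no true effect) ≈ "
      f"{runs_with_a_hit / trials:.2f}")   # far above the nominal .05
```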
John et al. (2012) found evidence that many scholars surveyed engaged in reporting practices that reflected their knowledge in a biased way. For
example, more than half of those surveyed admitted
to failing to report all of the dependent variables that
had been tested, and 45% admitted to selectively reporting studies that “worked.” Such selective reporting has been noted in many areas of social science.
Richard Bettis (2012) described practices whereby
researchers hunt for interesting explanations and
then misrepresent these as the product of frequentist
tests. In strategy, Goldfarb and King (2016) reported
that the distribution of test statistics from 300 articles
implied a reporting process that selected for statistical significance.
With respect to misrepresentation of belief
claims, prominent scholars have argued that authors commonly fail to represent the plausibility
of rival explanations, and thus mislead readers
about the strength and justification of claims (Antonakis et al., 2010; Hamilton & Nickerson, 2003).
Even when authors provide identification arguments, they may not highlight all of the underlying assumptions, or provide any evidence that
those assumptions are plausible (Angrist &
Pischke, 2009). Scholars may also try multiple
strategies for identification, and then pick the one
that contains a desired collection of results
(Leamer, 2010).
Readers often face great difficulty in determining
the “sincerity” (in the philosophical sense of fully
authentic and neutral reporting) of authors because
they cannot observe the set of estimates from which
the reported one was selected. This has led some authors to adopt a skeptical approach to reported frequency claims. All of the concepts of frequency
analysis, Ed Leamer (1983: 37) wrote, “utterly lose
their meaning by the time the researcher pulls from
the bramble of computer output the one thorn of a
model he likes best, the one he chooses to portray as a
rose.” Heckman and Singer (2017) concurred, noting
that “test statistics are reported as if the hypotheses
being tested did not originate from the data being
assessed.” Such skepticism also extends to reporting
of belief claims: Leamer (2010: 36) wrote,
Most authors leave the rest of us wondering how
hard they had to work to find their favorite outcomes