Description
Below I have attached the article you are going to review. I will also attach a separate file with the specific instructions you must follow for reviewing this article; please follow all instructions, as they are very specific and critical. THE ARTICLE REVIEW IS 1-2 PAGES LONG, WRITTEN IN CHICAGO STYLE, IN 12-POINT FONT, DOUBLE-SPACED WITH ONE-INCH MARGINS ON ALL SIDES. THERE SHOULD ALSO BE PAGE NUMBERS THAT START ON THE FIRST PAGE AND ARE PLACED IN THE UPPER RIGHT CORNER. PLEASE BE SURE TO READ ALL INSTRUCTIONS BEFORE WRITING THE ARTICLE REVIEW.
Unformatted Attachment Preview
Article/Critical Review Instructions
ECON 2300 / ECON 4520
1. Introduction
Article reviews are reading assignments that ask you to link course reading with
current articles, podcasts, and other media that help illustrate the key statistical concepts
with a focus on data literacy (1-2 pages). Please answer one of the following:
● How Does the Article Apply and Illustrate Key Statistical Concepts?
○ Analyze how the author uses statistical methodologies, theories, or
concepts that are relevant to the course material. Identify specific methods,
findings, or interpretations, and link them back to the principles taught in
the course.
● What is the Quality and Relevance of the Data Presented?
○ Evaluate the quality of the data used in the article. Is the data collection
method robust and relevant to the study? How are the data analyzed, and
are the conclusions drawn from the data supported by appropriate
statistical tests? Assess whether the data and methods used are in
alignment with current standards of data literacy.
● How Does the Article Contribute to the Broader Field or Current Discussions?
○ Connect the article’s findings or perspectives to current articles, podcasts,
or other media that are shaping the conversation in the field. How does
this particular work add to or challenge existing knowledge or debates?
Assess its potential impact on future research, policy, or practice.
Make sure to be precise in linking these concepts. I am looking for a synthesis of
classroom materials and the article’s information with a focus on linking theory and
practice.
2. Formatting and Citation
In adhering to the Chicago style guidelines for formatting and citation, the 1-2
page synthesis should be typed in a clear, readable 12-point font. The text should be double-spaced with margins of at least one inch on
all sides. Page numbers should start on the first page and be placed in the upper right
corner. Citations within the text should be formatted using either footnotes or
endnotes, depending on your professor’s preference. Each footnote or endnote should
correspond to a superscript number in the text, leading the reader to the cited source. In
the footnotes or endnotes, include the author’s full name, the title of the work,
publication place, publisher, date of publication, and the page number(s) referenced. If
you are citing a journal article, include the volume, issue, and page numbers. A separate
bibliography page at the end of the document should list all the references in
alphabetical order by the author’s last name, following the same citation details as in the
footnotes or endnotes. Adherence to these guidelines ensures that the synthesis
maintains a formal, scholarly tone and accurately credits all sources, aligning with the
academic integrity standards as described by the University guidelines.
In Chicago style, citations can be handled through either footnotes or endnotes,
depending on the preference of your professor or the specific requirements of the
assignment. When you cite a source, a superscript number is inserted at the end of the
sentence or clause containing the cited material. This number corresponds to a note at
the bottom of the page (footnote) or at the end of the paper (endnote). Please consider
using a citation manager such as Zotero or EndNote to keep track of your citations; it
makes managing your sources much easier.
In this class, we use the Chicago Author-Date citation style. The in-text citation
includes the author’s last name, the publication year, and the page number(s) if
applicable. This style is often used in the sciences and social sciences. Here’s how to
apply it:
2.1 In-Text Citation
When you cite a source in the text, you include the author’s last name and the
publication year. If you are citing a specific page or range of pages, you include that as
well. For example:
Example: (Smith 2020, 15-17)
2.2 Reference
At the end of your paper, you will include a reference list with full citations for
every source cited in the text. The entries are arranged alphabetically by the author’s last
name.
Example: Author’s Last name, First name. Year. “Title of the Article.” Title of
the Journal Volume Number (Issue Number): Page Range.
2.3 Examples
For a synthesis of an article and classroom material, you might cite both within the text
as follows:
According to recent research on data literacy (Johnson 2022, 45), the
methodology aligns with what was taught in our statistical concepts course
(Miller 2019, 67-68).
In the reference list, these citations would appear as:
Johnson, Mary A. 2022. Data Literacy in the Modern Age. New York: Academic
Press.
Miller, John B. 2019. “Statistical Concepts in Undergraduate Education.” Journal
of Statistics Education 15(3): 65-80.
Remember, consistency is key in citation, and it may be beneficial to consult the latest
edition of the Chicago Manual of Style or your institution’s guidelines for any specific
rules or variations that may apply to your assignment.
3. Writing Style
“The Elements of Style” by William Strunk Jr. and E.B. White is a seminal guide
that emphasizes clarity, conciseness, and precision in writing. Rather than inundating
the reader with complex rules, the authors present fundamental principles that foster a
clean and straightforward writing style. Among the well-known guidelines are the
encouragement to “omit needless words,” the advice to use the active voice rather than
the passive, and the recommendation to choose simple words over complex ones when
possible. Strunk and White’s guidance also extends to grammatical constructions,
advising on proper comma usage, verb tense consistency, and the appropriate use of
parallel structure. This timeless guide serves as a touchstone for many writers, whether
novices or professionals, aiming to produce clear and effective prose.
● Clarity and Brevity: In line with Strunk and White’s principle to “omit needless
words,” your synthesis should be concise and to the point. Eliminate unnecessary
words or phrases and focus on clear, direct statements.
● Use the Active Voice: Strunk and White emphasize the use of the active voice
for a more engaging and direct style. Instead of writing “The experiment was
conducted by the researchers,” write “The researchers conducted the experiment.”
● Adopt a Simple Style: Avoid overly complex or pretentious language. Choose
simple and straightforward words that convey your meaning without confusion,
following Strunk and White’s guidance to “write in a way that comes naturally.”
● Organize Logically: Your synthesis should follow a logical structure. Start with
a clear thesis statement, present your points in a coherent order, and conclude
with a strong summary. This aligns with Strunk and White’s focus on clear and
logical composition.
● Ensure Parallel Construction: Strunk and White advise using parallel structure
when listing items or ideas. For example, “The study was groundbreaking,
innovative, and influential” maintains a parallel structure.
● Be Consistent with Tenses: Maintain consistency in verb tenses, another
principle emphasized in “The Elements of Style.” If you begin in the past tense,
stay in that tense unless there’s a clear reason to switch.
A note on spelling and grammar: as college students you are still learning to write
effectively, so you may not yet have all the grammar and spelling skills you will need in
the working world. However, I am not primarily interested in evaluating your spelling or
grammar; I want to know how you think, and you should work hard to make your
writing as clear as possible so that I can understand your ideas and critical thinking. I do
not take points off for spelling or grammar.
4. Synthesis (ECON 2300)
A synthesis is a combination of various ideas, findings, or aspects from different
sources to form a comprehensive understanding of a subject. For a 1-2 page synthesis,
read and analyze the selected sources, identifying key themes or insights. Summarize
these in a coherent and concise manner, emphasizing how they connect or contrast with
one another. The goal is to create a unified picture that reflects the complexity and
nuances of the subject without presenting a specific argument or thesis. Proper citation
and adherence to stylistic guidelines are essential.
5. Submission
Please submit the 1-2 page Critical/Article Review via Brightspace by the noon
deadline. If you are late, you will receive a 0 unless a prior arrangement has been made
with me before the deadline has passed.
Statistical Science
2011, Vol. 26, No. 1, 1–9
DOI: 10.1214/10-STS337
© Institute of Mathematical Statistics, 2011
Statistical Inference: The Big Picture
Robert E. Kass
Abstract. Statistics has moved beyond the frequentist-Bayesian controversies of the past. Where does this leave our ability to interpret results? I suggest
that a philosophy compatible with statistical practice, labeled here statistical pragmatism, serves as a foundation for inference. Statistical pragmatism
is inclusive and emphasizes the assumptions that connect statistical models
with observed data. I argue that introductory courses often mischaracterize
the process of statistical inference and I propose an alternative “big picture”
depiction.
Key words and phrases: Bayesian, confidence, frequentist, statistical education, statistical pragmatism, statistical significance.
1. INTRODUCTION
The protracted battle for the foundations of statistics, joined vociferously by Fisher, Jeffreys, Neyman, Savage and many disciples, has been deeply illuminating, but it has left statistics without a philosophy that matches contemporary attitudes. Because each camp took as its goal exclusive ownership of inference, each was doomed to failure. We have all, or nearly all, moved past these old debates, yet our textbook explanations have not caught up with the eclecticism of statistical practice.
The difficulties go both ways. Bayesians have denied the utility of confidence and statistical significance, attempting to sweep aside the obvious success of these concepts in applied work. Meanwhile, for their part, frequentists have ignored the possibility of inference about unique events despite their ubiquitous occurrence throughout science. Furthermore, interpretations of posterior probability in terms of subjective belief, or confidence in terms of long-run frequency, give students a limited and sometimes confusing view of the nature of statistical inference. When used to introduce the expression of uncertainty based on a random sample, these caricatures forfeit an opportunity to articulate a fundamental attitude of statistical practice.
Most modern practitioners have, I think, an open-minded view about alternative modes of inference, but are acutely aware of theoretical assumptions and the many ways they may be mistaken. I would suggest that it makes more sense to place in the center of our logical framework the match or mismatch of theoretical assumptions with the real world of data. This, it seems to me, is the common ground that Bayesian and frequentist statistics share; it is more fundamental than either paradigm taken separately; and as we strive to foster widespread understanding of statistical reasoning, it is more important for beginning students to appreciate the role of theoretical assumptions than for them to recite correctly the long-run interpretation of confidence intervals. With the hope of prodding our discipline to right a lingering imbalance, I attempt here to describe the dominant contemporary philosophy of statistics.
2. STATISTICAL PRAGMATISM
I propose to call this modern philosophy statistical
pragmatism. I think it is based on the following attitudes:
1. Confidence, statistical significance, and posterior
probability are all valuable inferential tools.
2. Simple chance situations, where counting arguments may be based on symmetries that generate
equally likely outcomes (six faces on a fair die; 52
cards in a shuffled deck), supply basic intuitions
about probability. Probability may be built up to important but less immediately intuitive situations using abstract mathematics, much the way real numbers are defined abstractly based on intuitions coming from fractions. Probability is usefully calibrated
in terms of fair bets: another way to say that the probability of rolling a 3 with a fair die is 1/6 is that 5 to 1 odds against rolling a 3 would be a fair bet (see the short sketch following this list).
3. Long-run frequencies are important mathematically, interpretively, and pedagogically. However, it
is possible to assign probabilities to unique events,
including rolling a 3 with a fair die or having a confidence interval cover the true mean, without considering long-run frequency. Long-run frequencies
may be regarded as consequences of the law of large
numbers rather than as part of the definition of probability or confidence.
4. Similarly, the subjective interpretation of posterior
probability is important as a way of understanding
Bayesian inference, but it is not fundamental to its
use: in reporting a 95% posterior interval one need
not make a statement such as, “My personal probability of this interval covering the mean is 0.95.”
5. Statistical inferences of all kinds use statistical
models, which embody theoretical assumptions. As
illustrated in Figure 1, like scientific models, statistical models exist in an abstract framework; to
distinguish this framework from the real world inhabited by data we may call it a “theoretical world.”
Random variables, confidence intervals, and posterior probabilities all live in this theoretical world.
When we use a statistical model to make a statistical inference we implicitly assert that the variation
exhibited by data is captured reasonably well by
the statistical model, so that the theoretical world
corresponds reasonably well to the real world. Conclusions are drawn by applying a statistical inference technique, which is a theoretical construct, to
some real data. Figure 1 depicts the conclusions as
straddling the theoretical and real worlds. Statistical inferences may have implications for the real
world of new observable phenomena, but in scientific contexts, conclusions most often concern scientific models (or theories), so that their “real world”
implications (involving new data) are somewhat indirect (the new data will involve new and different
experiments).
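Attitude 2's calibration of probability in terms of fair bets is easy to check numerically. The short sketch below is an added illustration, not part of Kass's text; it converts a probability into the fair odds against the event, reproducing the die example: p = 1/6 corresponds to 5 to 1 odds against rolling a 3.

```python
# Minimal sketch (added illustration): probability to fair odds against an event.
from fractions import Fraction

def odds_against(p):
    """Fair odds against an event of probability p (e.g., 5 means 5 to 1 against)."""
    return (1 - p) / p

p_three = Fraction(1, 6)      # probability of rolling a 3 with a fair die
print(odds_against(p_three))  # prints 5: a 5 to 1 bet against rolling a 3 is fair
```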
The statistical models in Figure 1 could involve large
function spaces or other relatively weak probabilistic
assumptions. Careful consideration of the connection
between models and data is a core component of both the art of statistical practice and the science of statistical methodology. The purpose of Figure 1 is to shift the grounds for discussion.
FIG. 1. The big picture of statistical inference. Statistical procedures are abstractly defined in terms of mathematics but are used, in conjunction with scientific models and methods, to explain observable phenomena. This picture emphasizes the hypothetical link between variation in data and its description using statistical models.
Note, in particular, that data should not be confused
with random variables. Random variables live in the
theoretical world. When we say things like, “Let us assume the data are normally distributed” and we proceed to make a statistical inference, we do not need
to take these words literally as asserting that the data
form a random sample. Instead, this kind of language
is a convenient and familiar shorthand for the much
weaker assertion that, for our specified purposes, the
variability of the data is adequately consistent with
variability that would occur in a random sample. This
linguistic amenity is used routinely in both frequentist
and Bayesian frameworks. Historically, the distinction
between data and random variables, the match of the
model to the data, was set aside, to be treated as a
separate topic apart from the foundations of inference.
But once the data themselves were considered random
variables, the frequentist-Bayesian debate moved into
the theoretical world: it became a debate about the
best way to reason from random variables to inferences
about parameters. This was consistent with developments elsewhere. In other parts of science, the distinction between quantities to be measured and their theoretical counterparts within a mathematical theory can
be relegated to a different subject—to a theory of errors. In statistics, we do not have that luxury, and it
seems to me important, from a pragmatic viewpoint, to
bring to center stage the identification of models with
data. The purpose of doing so is that it provides different interpretations of both frequentist and Bayesian
inference, interpretations which, I believe, are closer to
the attitude of modern statistical practitioners.
FIG. 2. (A) BARS fits to a pair of peri-stimulus time histograms displaying neural firing rate of a particular neuron under two alternative experimental conditions. (B) The two BARS fits are overlaid for ease of comparison.
A familiar practical situation where these issues arise
is binary regression. A classic example comes from
a psychophysical experiment conducted by Hecht,
Schlaer and Pirenne (1942), who investigated the sensitivity of the human visual system by constructing an
apparatus that would emit flashes of light at very low
intensity in a darkened room. Those authors presented
light of varying intensities repeatedly to several subjects and determined, for each intensity, the proportion
of times each subject would respond that he or she had
seen a flash of light. For each subject the resulting data
are repeated binary observations (“yes” perceived versus “no” did not perceive) at each of many intensities
and, these days, the standard statistical tool to analyze
such data is logistic regression. We might, for instance,
use maximum likelihood to find a 95% confidence interval for the intensity of light at which the subject
would report perception with probability p = 0.5. Because the data reported by Hecht et al. involved fairly
large samples, we would obtain essentially the same
answer if instead we applied Bayesian methods to get
an interval having 95% posterior probability. But how
should such an interval be interpreted?
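As a concrete sketch of the kind of analysis described here, the code below fits a logistic regression to made-up yes/no detection counts and estimates the intensity at which the detection probability is 0.5, together with an approximate 95% confidence interval from the delta method. The data values, variable names, and use of statsmodels are illustrative assumptions; they are not taken from Hecht, Schlaer and Pirenne or from the article.

```python
# Illustrative sketch (hypothetical data, not the article's analysis):
# logistic regression for repeated yes/no responses at several intensities,
# estimating the intensity at which P(yes) = 0.5 with a delta-method 95% CI.
import numpy as np
import statsmodels.api as sm

intensity = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5])   # stimulus intensities
n_trials  = np.array([50, 50, 50, 50, 50, 50])          # presentations per intensity
n_yes     = np.array([ 2,  8, 21, 34, 45, 49])          # "yes, I saw it" counts

X = sm.add_constant(intensity)                          # intercept + intensity
endog = np.column_stack([n_yes, n_trials - n_yes])      # (successes, failures)
fit = sm.GLM(endog, X, family=sm.families.Binomial()).fit()  # maximum likelihood

b0, b1 = fit.params
x50 = -b0 / b1                                          # intensity where P(yes) = 0.5

# Delta-method standard error for x50 = -b0/b1, then an approximate 95% interval.
grad = np.array([-1.0 / b1, b0 / b1**2])
cov = np.asarray(fit.cov_params())
se = np.sqrt(grad @ cov @ grad)
print(f"x50 = {x50:.2f}, approximate 95% CI ({x50 - 1.96*se:.2f}, {x50 + 1.96*se:.2f})")
```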
A more recent example comes from DiMatteo, Genovese and Kass (2001), who illustrated a new nonparametric regression method called Bayesian adaptive regression splines (BARS) by analyzing neural firing rate data from inferotemporal cortex of a macaque
monkey. The data came from a study ultimately reported by Rollenhagen and Olson (2005), which investigated the differential response of individual neurons under two experimental conditions. Figure 2 displays BARS fits under the two conditions. One way
to quantify the discrepancy between the fits is to estimate the drop in firing rate from peak (the maximal firing rate) to the trough immediately following the peak
in each condition. Let us call these peak-minus-trough differences, under the two conditions, φ1 and φ2. Using BARS, DiMatteo, Genovese and Kass reported a posterior mean of φ̂1 − φ̂2 = 50.0 with posterior standard deviation (±20.8). In follow-up work, Wallstrom,
Liebner and Kass (2008) reported very good frequentist coverage probability of 95% posterior probability
intervals based on BARS for similar quantities under
simulation conditions chosen to mimic such experimental data. Thus, a BARS-based posterior interval
could be considered from either a Bayesian or frequentist point of view. Again we may ask how such an inferential interval should be interpreted.
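A full BARS fit is beyond a short example, but the way such a posterior summary is typically computed from posterior simulation output can be sketched generically. The posterior draws below are fabricated placeholders standing in for whatever samples a fitted model would provide; none of the numbers come from DiMatteo, Genovese and Kass.

```python
# Generic sketch (fabricated draws, not BARS output): summarizing the
# peak-minus-trough contrast phi1 - phi2 from posterior samples.
import numpy as np

rng = np.random.default_rng(0)
# Placeholder posterior draws of the peak-minus-trough drop in firing rate
# under each condition; a real analysis would take these from the fitted model.
phi1_draws = rng.normal(180.0, 15.0, size=5000)
phi2_draws = rng.normal(130.0, 14.0, size=5000)

diff = phi1_draws - phi2_draws                  # draws of phi1 - phi2
lo, hi = np.percentile(diff, [2.5, 97.5])       # central 95% posterior interval
print(f"posterior mean {diff.mean():.1f}, sd {diff.std():.1f}, "
      f"95% interval ({lo:.1f}, {hi:.1f})")
```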
3. INTERPRETATIONS
Statistical pragmatism involves mildly altered interpretations of frequentist and Bayesian inference. For
definiteness I will discuss the paradigm case of confidence and posterior intervals for a normal mean based
on a sample of size n, with the standard deviation being known. Suppose that we have n = 49 observations
that have a sample mean equal to 10.2.
FREQUENTIST ASSUMPTIONS. Suppose X1, X2, . . . , Xn are i.i.d. random variables from a normal distribution with mean μ and standard deviation σ = 1. In other words, suppose X1, X2, . . . , Xn form a random sample from a N(μ, 1) distribution.
Noting that x̄ = 10.2 and √49 = 7, we define the inferential interval
I = (10.2 − 2/7, 10.2 + 2/7).
The interval I may be regarded as a 95% confidence interval. I now contrast the standard frequentist interpretation with the pragmatic interpretation.
FREQUENTIST INTERPRETATION OF CONFIDENCE INTERVAL. Under the assumptions above, if we were to draw infinitely many random samples from a N(μ, 1) distribution, 95% of the corresponding confidence intervals (X̄ − 2/7, X̄ + 2/7) would cover μ.
PRAGMATIC INTERPRETATION OF CONFIDENCE INTERVAL. If we were to draw a random sample according to the assumptions above, the resulting confidence interval (X̄ − 2/7, X̄ + 2/7) would have probability 0.95 of covering μ. Because the random sample lives in the theoretical world, this is a theoretical statement. Nonetheless, substituting
(1) X̄ = x̄
together with
(2) x̄ = 10.2
we obtain the interval I, and are able to draw useful conclusions as long as our theoretical world is aligned well with the real world that produced the data.
The main point here is that we do not need a long-run interpretation of probability, but we do have to
be reminded that the unique-event probability of 0.95
remains a theoretical statement because it applies to
random variables rather than data. Let us turn to the
Bayesian case.
BAYESIAN ASSUMPTIONS. Suppose X1, X2, . . . , Xn form a random sample from a N(μ, 1) distribution and the prior distribution of μ is N(μ0, τ²), with τ² ≫ 1/49 and 49τ² ≫ |μ0|.
The posterior distribution of μ is normal. The posterior mean is
μ̄ = [τ²/(1/49 + τ²)] · 10.2 + [(1/49)/(1/49 + τ²)] · μ0
and the posterior variance is
v = (49 + 1/τ²)⁻¹,
but because τ² ≫ 1/49 and 49τ² ≫ |μ0| we have μ̄ ≈ 10.2 and v ≈ 1/49. Therefore, the inferential interval I defined above has posterior probability 0.95.
BAYESIAN INTERPRETATION OF POSTERIOR INTERVAL. Under the assumptions above, the probability that μ is in the interval I is 0.95.
PRAGMATIC INTERPRETATION OF POSTERIOR INTERVAL. If the data were a random sample for
which (2) holds, that is, x̄ = 10.2, and if the assumptions above were to hold, then the probability that μ is
in the interval I would be 0.95. This refers to a hypothetical value x̄ of the random variable X̄, and because
X̄ lives in the theoretical world the statement remains
theoretical. Nonetheless, we are able to draw useful
conclusions from the data as long as our theoretical
world is aligned well with the real world that produced
the data.
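The numbers in this normal-mean example can be verified directly. The sketch below is an added check, with arbitrary prior settings chosen only to satisfy the stated conditions τ² ≫ 1/49 and 49τ² ≫ |μ0|; it computes the interval I and the exact posterior interval and shows that they essentially coincide.

```python
# Added check for the normal-mean example: n = 49, sample mean 10.2, sigma = 1.
import math

n, xbar, sigma = 49, 10.2, 1.0

# Frequentist 95% interval I = (xbar - 2/7, xbar + 2/7), using the critical
# value 2 (approximately 1.96), as in the article.
half = 2 * sigma / math.sqrt(n)
I = (xbar - half, xbar + half)

# Exact posterior under a N(mu0, tau^2) prior; mu0 and tau^2 are arbitrary
# illustrative values satisfying tau^2 >> 1/49 and 49*tau^2 >> |mu0|.
mu0, tau2 = 0.0, 100.0
post_var  = 1.0 / (n / sigma**2 + 1.0 / tau2)           # v = (49 + 1/tau^2)^(-1)
post_mean = post_var * (n * xbar / sigma**2 + mu0 / tau2)
post_int  = (post_mean - 2 * math.sqrt(post_var),
             post_mean + 2 * math.sqrt(post_var))

print("confidence interval I:", tuple(round(v, 3) for v in I))
print("posterior interval:   ", tuple(round(v, 3) for v in post_int))
```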
Here, although the Bayesian approach escapes the
indirectness of confidence within the theoretical world,
it cannot escape it in the world of data analysis because
there remains the additional layer of identifying data
with random variables. According to the pragmatic interpretation, the posterior is not, literally, a statement
about the way the observed data relate to the unknown
parameter μ because those objects live in different
worlds. The language of Bayesian inference, like the
language of frequentist inference, takes a convenient
shortcut by blurring the distinction between data and
random variables.
The commonality between frequentist and Bayesian
inferences is the use of theoretical assumptions, together with a subjunctive statement. In both approaches
a statistical model is introduced—in the Bayesian case
the prior distributions become part of what I am here
calling the model—and we may say that the inference
is based on what would happen if the data were to be
random variables distributed according to the statistical
model. This modeling assumption would be reasonable
if the model were to describe accurately the variation in
the data.
4. IMPLICATIONS FOR TEACHING
It is important for students in introductory statistics
courses to see the subject as a coherent, principled
whole. Instructors, and textbook authors, may try to
help by providing some notion of a “big picture.” Often
this is done literally, with an illustration such as Figure 3 (e.g., Lovett, Meyer and Thille, 2008). This kind
of illustration can be extremely useful if referenced repeatedly throughout a course.
Figure 3 represents a standard story about statistical
inference. Fisher introduced the idea of a random sample drawn from a hypothetical infinite population, and
Neyman and Pearson’s work encouraged subsequent
mathematical statisticians to drop the word “hypothetical” and instead describe statistical inference as analogous to simple random sampling from a finite population. This is the concept that Figure 3 tries to get across.
My complaint is that it is not a good general description of statistical inference, and my claim is that Figure 1 is more accurate. For instance, in the psychophysical example of Hecht, Schlaer and Pirenne discussed
in Section 2, there is no population of “yes” or “no”
replies from which a random sample is drawn. We do
not need to struggle to make an analogy with a simple
random sample. Furthermore, any thoughts along these
lines may draw attention away from the most important
theoretical assumptions, such as independence among
the responses. Figure 1 is supposed to remind students
to look for the important assumptions, and ask whether
they describe the variation in the data reasonably accurately.
FIG. 3. The big picture of statistical inference according to the
standard conception. Here, a random sample is pictured as a sample from a finite population.
One of the reasons the population and sample picture in Figure 3 is so attractive pedagogically is that it
reinforces the fundamental distinction between parameters and statistics through the terms population mean
and sample mean. To my way of thinking, this terminology, inherited from Fisher, is unfortunate. Instead
of “population mean” I would much prefer theoretical
mean, because it captures better the notion that a theoretical distribution is being introduced, a notion that is
reinforced by Figure 1.
I have found Figure 1 helpful in teaching basic statistics. For instance, when talking about random variables
I like to begin with a set of data, where variation is
displayed in a histogram, and then say that probability may be used to describe such variation. I then tell
the students we must introduce mathematical objects
called random variables, and in defining them and applying the concept to the data at hand, I immediately
acknowledge that this is an abstraction, while also stating that—as the students will see repeatedly in many
examples—it can be an extraordinarily useful abstraction whenever the theoretical world of random variables is aligned well with the real world of the data.
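The classroom device described here, showing variation in a histogram and then introducing a theoretical distribution to describe it, can be demonstrated in a few lines. The sketch below is an added illustration using simulated data; it overlays a fitted normal density on the histogram so that students can judge how well the theoretical world matches the real world of the data.

```python
# Added illustration (simulated data): compare observed variation with the
# normal distribution introduced to describe it.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

rng = np.random.default_rng(1)
data = rng.normal(loc=10.2, scale=1.0, size=200)    # stand-in for real measurements

mu_hat, sigma_hat = data.mean(), data.std(ddof=1)   # fit the theoretical model
grid = np.linspace(data.min(), data.max(), 200)

plt.hist(data, bins=20, density=True, alpha=0.5, label="data (real world)")
plt.plot(grid, norm.pdf(grid, mu_hat, sigma_hat), label="fitted normal (theoretical world)")
plt.xlabel("observed value")
plt.legend()
plt.show()
```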
I have also used Figure 1 in my classes when describing attitudes toward data analysis that statistical
training aims to instill. Specifically, I define statistical
thinking, as in the article by Brown and Kass (2009),
to involve two principles:
1. Statistical models of regularity and variability in
data may be used to express knowledge and uncertainty about a signal in the presence of noise, via
inductive reasoning.
2. Statistical methods may be analyzed to determine
how well they are likely to perform.
Principle 1 identifies the source of statistical inference to be the hypothesized link between data and statistical models. In explaining, I explicitly distinguish
the use of probability to describe variation and to express knowledge. A probabilistic description of variation would be “The probability of rolling a 3 with a fair
die is 1/6” while an expression of knowledge would be
“I’m 90% sure the capital of Wyoming is Cheyenne.”
These two sorts of statements, which use probability
in different ways, are sometimes considered to involve
two different kinds of probability, which have been
called “aleatory probability” and “epistemic probability.” Bayesians merge these, applying the laws of probability to go from quantitative description to quantified
belief, but in every form of statistical inference aleatory
probability is used, somehow, to make epistemic statements. This is Principle 1. Principle 2 is that the same sorts of statistical models may be used to evaluate statistical procedures—though in the classroom I also explain that performance of procedures is usually investigated under varying circumstances.
FIG. 4. A more elaborate big picture, reflecting in greater detail the process of statistical inference. As in Figure 1, there is a hypothetical link between data and statistical models but here the data are connected more specifically to their representation as random variables.
For somewhat more advanced audiences it is possible to elaborate, describing in more detail the process
trained statisticians follow when reasoning from data.
A big picture of the overall process is given in Figure 4. That figure indicates the hypothetical connection
between data and random variables, between key features of unobserved mechanisms and parameters, and
between real-world and theoretical conclusions. It further indicates that data display both regularity (which is
often described in theoretical terms as a “signal,” sometimes conforming to simple mathematical descriptions
or “laws”) and unexplained variability, which is usually taken to be “noise.” The figure also includes the
components exploratory data analysis—EDA—and algorithms, but the main message of Figure 4, given by
the labels of the two big boxes, is the same as that in
Figure 1.
5. DISCUSSION
According to my understanding, laid out above, statistical pragmatism has two main features: it is eclectic
and it emphasizes the assumptions that connect statistical models with observed data. The pragmatic view acknowledges that both sides of the frequentist-Bayesian
debate made important points. Bayesians scoffed at the
artificiality in using sampling from a finite population
to motivate all of inference, and in using long-run behavior to define characteristics of procedures. Within
the theoretical world, posterior probabilities are more
direct, and therefore seemed to offer much stronger
inferences. Frequentists bristled, pointing to the subjectivity of prior distributions. Bayesians responded by
treating subjectivity as a virtue on the grounds that all
inferences are subjective yet, while there is a kernel
of truth in this observation—we are all human beings,
making our own judgments—subjectivism was never
satisfying as a logical framework: an important purpose of the scientific enterprise is to go beyond personal decision-making. Nonetheless, from a pragmatic
perspective, while the selection of prior probabilities is
important, their use is not so problematic as to disqualify Bayesian methods, and in looking back on history
the introduction of prior distributions may not have
been the central bothersome issue it was made out to
be. Instead, it seems to me, the really troubling point
for frequentists has been the Bayesian claim to a philosophical high ground, where compelling inferences
could be delivered at negligible logical cost. Frequentists have always felt that no such thing should be possible. The difficulty begins not with the introduction of
prior distributions but with the gap between models and
data, which is neither frequentist nor Bayesian. Statistical pragmatism avoids this irritation by acknowledging
explicitly the tenuous connection between the real and
theoretical worlds. As a result, its inferences are necessarily subjunctive. We speak of what would be inferred if our assumptions were to hold. The inferential
bridge is traversed, by both frequentist and Bayesian
methods, when we act as if the data were generated
by random variables. In the normal mean example discussed in Section 3, the key step involves the conjunction of the two equations (1) and (2). Strictly speaking,
according to statistical pragmatism, equation (1) lives
in the theoretical world while equation (2) lives in the
real world; the bridge is built by allowing x̄ to refer to
both the theoretical value of the random variable and
the observed data value.
In pondering the nature of stat