retail note 2

Description

Organize notes based on PPT and make them as concise as possible



BU.450.740
Retail Analytics
Week 2
Retail Pricing: Theory and Practice
Prof. Mitsukuni Nishida
Johns Hopkins Carey Business School
1
Some Announcements
• Homework 1: Will be graded in a week
• Phase 1 report is due midnight tonight
• Homework 2 (individual) is due next week
Word on course material sharing websites (e.g., coursehero): Don’t.
2
Today’s Agenda
• Pillar 1: Pricing tactics
• Pillar 2: “Turbo” version of Microecon & Marketing to establish
foundations for pricing
– Rationality and Net-benefit principles
– Consumer behavior
• Concepts: Demand curve, elasticities, competitive advantage
• Applications: pricing based on competitive advantage, value
pricing, demand estimation
– Firm’s behavior: Pricing for profit maximization (next week)
• Pillar 3: Quantitatively examining sales of a retailer in RStudio
– Linear regressions: simple & multivariate
3
RETAIL NEWS:
Google’s Wing debuts bigger
delivery drones
4
Wing’s new fleet
– Alphabet-owned Wing introduces a new drone capable of carrying up to five
pounds, doubling the payload capacity of its current drones, to better serve
partners like Walmart, DoorDash, and Apian by accommodating a wider
variety of deliveries
– Round trips of 12 miles and cruise at 65 mph
– Current fleets can handle 70% of all US orders; the remaining 30% of trips
require two drones (= “Needs” drive decisions)

Regulatory Advances: Tipping point for drone delivery?
– Federal Aviation Administration (FAA) granted permissions for some drone
operators, including Wing and Zipline, to fly beyond the visual line of sight
(BVLOS)
– “This marks a paradigm shift in the way US regulators are approaching
approvals for these types of advanced BVLOS drone operations,” Wing CEO
Adam Woodworth wrote in a blog post. “We believe 2024 will be the year of
drone delivery.”

Walmart’s partnership
– Walmart has expanded its drone delivery partnerships with Wing and Zipline,
aiming to reach up to 75% of households in the Dallas-Fort Worth area
5
Digression on Zipline CEO
6
PILLAR 1: COMMON RETAIL
PRICING PRACTICES
7
Pricing Practices
1. Add margin to cost
– Rule of thumb pricing method for retailers, but what is “margin”?
– “Gross Margin” = Unit price paid – Cost of goods sold
(COGS)
– Gross Margin% = (Revenue – COGS)/Revenue
• What does the gross margin at the firm level look like for major
retailers? See next slides
Note:
– COGS: “direct costs of producing the goods” materials, labor, etc
– Gross Profit = Revenue – COGS
– Operating Profit = Gross Profit – Operating Expenses (OEs)
• OEs are “cost for running a product, business,” such as advertising,
office expenses, utilities, insurance, property taxes, maintenance
fees, etc.
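The margin definitions above can be sketched in R as follows. The figures are hypothetical and chosen only for illustration; they are not any retailer's actual numbers.

```r
# Hypothetical income-statement figures (illustrative only)
revenue            <- 1000  # total sales
cogs               <- 750   # cost of goods sold
operating_expenses <- 200   # advertising, utilities, insurance, etc.

gross_profit     <- revenue - cogs                    # Revenue - COGS
operating_profit <- gross_profit - operating_expenses # Gross Profit - OEs

gross_margin_pct     <- gross_profit / revenue        # (Revenue - COGS) / Revenue
operating_margin_pct <- operating_profit / revenue    # Operating profit / Revenue

print(c(gross = gross_margin_pct, operating = operating_margin_pct))
```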
8
Gross margin % = (Revenue – Cost of goods sold)/Revenue
[Chart: Kroger, Costco, Walmart]
9
Operating margin % = Operating profit/Revenue
[Chart: Kroger, Costco, Walmart]
10
Source: macrotrends
Specialty Retailer’s Profits
Tapestry, Inc. (formerly known as Coach, Inc.)
11
Pricing Practices (cont’d)
1. Add margin to cost
2. Match competitors’ price
– Would this “price-match” policy really benefit consumers? Or is it
a way to make more profits?
– Will cover this practice, in a game-theoretical setting, in Week 5
12
13
Pricing Practices (cont’d)
1. Add margin to cost
2. Match competitors’ price
3. Algorithmic pricing
– Fully or partially automated pricing decision making through statistical
analysis, machine learning, and other AI methods
• Large retailers: Amazon, Walmart, Best Buy, Target, etc.
• Medium-size retailers adopt it as well
– The methods extend to other sectors, such as hotel and
transportation industries (e.g., Uber)
• How frequently do retailers change their prices on average?
– Average monthly duration of regular price changes?
14
Monthly duration of regular price
changes by retail sector
Source: Katsov (2018)
Average duration for regular prices
• 6.7 months in 2008-2010
• 3.6 months in 2014-2017
15
16
Amazon changes product prices 2.5 million times a day
• On average, a product’s price changes about every 10 minutes
• 50+ times more often than Walmart and Best Buy
17
iPhone 5S Dynamic Pricing over
time at Amazon US in 2014
• A – July 23: Apple announces possible event for the launch of the new iPhone in
mid September. The data for the US store show two minima for 2 following days
• B- August 8: Media confirms that the Apple keynote will be held on September
9. We observed a price drop on the same day in Amazon US of ~$30
• C – September 8: The price for the iPhone falls to a minimum the day before the
Keynote where Tim Cook unveiled the iPhones 6 and 6 plus
18
Amazon UK plays different pricing
• A – July 23: Apple announces possible event for the launch of the new iPhone in
mid September. The data for the US store show two minima for 2 following days
• B- August 8: Media confirms that the Apple keynote will be held on September
9. We see a price drop on the same day in Amazon US of ~$30
• C – September 8: The price for the iPhone falls to a minimum the day before the
Keynote where Tim Cook unveiled the iPhones 6 and 6 plus
19
Amazon matches the lowest price of
other sellers with a premium
Source: Katsov (2018)
20
Pricing Practices: Summary
1. Add margin to cost
2. Match competitors’ price
3. Algorithmic pricing
All of these practices have foundation in price elasticity analysis
• Benmark et al. (2018) report that a medium-size retailer experienced a
10% improvement in gross margin via dynamic pricing & price
elasticity analysis
• To understand and apply the mechanics of pricing, we need to
revisit price theory to understand
1. Demand: Consumer’s preference, budget, & utility-maximizing behavior
2. Supply: Firm’s profit-maximizing behavior
21
Devices to Implement Dynamic Pricing
22
PILLAR 2: PRINCIPLE OF
MICROECONOMICS
RATIONALITY &
NET-BENEFIT PRINCIPLE
23
Rationality (1 of 3)
• Central assumption in economics:
Economic agents (consumers, firms, workers) are rational
• Yet everybody knows some friends or family
members who could hardly be called “rational”
• What does “rationality” mean?
24
Rationality (2 of 3)
• Rationality in economics does not mean
– People are motivated only by money
– They don’t take other people’s welfare into account
– They know every possible consequence of their
actions
– They have perfect knowledge about what is involved
in this decision
– They have an absolute capacity to process information
25
Rationality (3 of 3)
• Rationality:
People do the best they can, given what they know and their
other constraints
– Consumers and workers maximize their well-being
(“utility,” which may depend on others’ well-being)
– Firms maximize their (long-run) profits
– Investors maximize their returns
26
Does anyone behave this way?
• Field experiments revealed economic incentives DO matter
Setup:
Kyoto, Japan in 2012&13
700 households
3 groups
(1) Moral suasion, no incentive
(2) Economic incentive:
Normal: 25 yen/kwh
Peak: 65, 85, 105 yen/kwh
(3) Control (i.e., no moral suasion
& no incentive)
• Economic incentives induce larger treatment effects & significant
habit formation (3 days vs. 15 days)
Reference: Ito, K., T. Ida, M. Tanaka (2017) “Moral Suasion and Economic Incentives: Field
Experimental Evidence from Energy Demand,” American Economic Journal: Economic Policy
27
Net-Benefit Principle
• The best choice or action from a set of actions is the one that
yields the highest expected net benefits
• i.e., one should do action y* if
Benefit(y*) > Cost(y*) and
[Benefit(y*)-Cost(y*)] > [Benefit(y)-Cost(y)]
for all possible alternatives y
• Why could assuming rationality be useful?
• Revealed preference approach
– By observing the choices people made, we can infer their
underlying preference and constraints
– I.e., believe more in what people actually DID, rather than
what people said
28
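The net-benefit principle can be sketched as a small selection rule in R. The actions, benefits, and costs below are hypothetical numbers made up for illustration.

```r
# Hypothetical actions with assumed expected benefits and costs
actions <- data.frame(
  action  = c("discount", "ad_campaign", "do_nothing"),
  benefit = c(120, 200, 0),
  cost    = c(70, 170, 0),
  stringsAsFactors = FALSE
)

# Net benefit of each action: Benefit(y) - Cost(y)
actions$net_benefit <- actions$benefit - actions$cost

# Choose y* only if its net benefit is positive and the largest of all
best   <- actions[which.max(actions$net_benefit), ]
chosen <- if (best$net_benefit > 0) best$action else NA_character_
print(chosen)
```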
PILLAR 2: WHERE DOES
PRICE COME FROM?
MARKET-STRUCTURE
PERSPECTIVE
29
Where does price come from?
Market Level
• Market level: Where demand meets supply
– Demand curve: derived from consumer’s utility maximization
– Supply curve: derived from firm’s cost minimization
30
Market Structure: A Spectrum
Spectrum: Perfect competition, Monopolistic competition, Oligopoly, Monopoly
– Large # of firms, no market power: perfect competition (“price taker”)
– Few firms, some market power: oligopoly (“strategic interaction”)
– Only one firm, full market power: monopoly (“price setter”)
31
Under perfect competition, Price is given
in Market Equilibrium: Supply = Demand
[Figure: supply curve S and demand curve D crossing at (Q*, P*); vertical axis P, horizontal axis Q; surplus region above P*, shortage region below P*]
⚫ If P < P*, a shortage (excess demand) exists. This tends to push the price up toward P*
⚫ If P > P*, a surplus (excess supply) exists. This pushes the price down toward P*
This model is most applicable to commodity and financial markets
32
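Under Monopoly,
price is given in
profit maximization
Profit = π = TR – TC
To find the profit-maximizing quantity, set
the first derivative of π with
respect to Q equal to 0:
$$\frac{\partial \pi}{\partial Q} = \frac{\partial TR}{\partial Q} - \frac{\partial TC}{\partial Q} = 0$$
i.e., MR = MC
[Figure: MR and MC curves crossing at Q*, with the corresponding price P* and profit π; vertical axis P, horizontal axis Q]
Goal: To maximize the profits,
we find Q such that
MR = MC
33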
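A worked instance of the MR = MC rule, as a sketch only: it assumes a linear inverse demand P = a – bQ and a constant marginal cost with no fixed cost. Neither the functional form nor the numbers come from the slides.

```r
# Assumed linear inverse demand P = a - b*Q and constant marginal cost (hypothetical)
a  <- 10   # demand intercept
b  <- 1    # demand slope
mc <- 2    # constant marginal cost

# Profit = TR - TC = P(Q)*Q - mc*Q
profit <- function(q) (a - b * q) * q - mc * q

# MR = a - 2*b*Q; setting MR = MC gives Q* = (a - mc) / (2*b)
q_star_analytic <- (a - mc) / (2 * b)
p_star          <- a - b * q_star_analytic

# Numerical check: maximize profit directly over Q in [0, a/b]
q_star_numeric <- optimize(profit, interval = c(0, a / b), maximum = TRUE)$maximum

print(c(q_analytic = q_star_analytic, q_numeric = q_star_numeric, price = p_star))
```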
DEMAND: CONSUMER’S
DECISION MAKING
34
Two “Levels” of Demand
• A consumer’s demand curve:
Interpretation 1: Quantities that this agent is willing to buy at every
possible price per unit (all else equal)
Interpretation 2: The maximum willingness to pay for an additional
unit at every possible quantity of good (all else equal)
• The market demand curve:
Total quantity demanded of a product by all consumers in a
market at every possible price (all else equal)
35
My Demand for Coffee
[Figure: my demand curve; vertical axis P (0 to 5), horizontal axis cups of coffee per day (0 to 8)]
– Suppose my consumption of coffee looks like this:
p=0; q=8/day
p=0.50; q=6/day
p=1.25; q=4/day
p=2.00; q=3/day
p=3.00; q=2/day
p=5.00; q=0/day
36
Demand Curve: Perspective 1
• The amount of the good a consumer is willing to purchase for a given price
For a given price p, the consumer demands quantity q.
[Figure: demand curve read from price p on the vertical axis (P) to quantity q on the horizontal axis (Q)]
37
Demand Curve: Perspective 2
• The consumer’s maximum willingness to pay (WTP) for each unit
= how much they value each unit
= how much consumers benefit from their purchase
For a given quantity q, the consumer is willing to pay up to p.
[Figure: demand curve read from quantity q on the horizontal axis (Q) to price p on the vertical axis (P)]
38
From Individual to Market Demand
• We generate market demand from individual demand
• How? Horizontally add individual demand curves
• Suppose only you and I consume coffee.
– Me: P = 6 – Q, or Q = 6 – P
– You: P = 4 – 0.5Q, or Q = 8 – 2P
=> Market: Q = 14 – 3P if P ≤ 4 (both of us buy), and Q = 6 – P if 4 < P ≤ 6 (only I buy)
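The horizontal summation can be sketched in R using the two individual demand curves from the slide. Truncating each person's quantity at zero (nobody buys a negative amount) is an assumption added here to make the sum valid at every price.

```r
# Individual demand curves from the slide, truncated at zero (assumption)
demand_me  <- function(p) pmax(6 - p, 0)      # Me:  Q = 6 - P
demand_you <- function(p) pmax(8 - 2 * p, 0)  # You: Q = 8 - 2P

# Horizontal sum: add quantities at each price
demand_market <- function(p) demand_me(p) + demand_you(p)

prices <- c(0, 2, 4, 5, 6)
print(data.frame(price = prices, market_q = demand_market(prices)))
```

For prices up to 4 the result matches Q = 14 – 3P; above 4 only the first consumer still buys.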
Increasing consumers’ maximum WTP via promotions, not engaging in
price competition
• When product differentiation is weak (e.g., chemicals, sugar, paper, etc.),
the firm should follow a market share strategy => slightly undercutting
rivals’ price
47
APPLICATION OF ADDED
VALUE & COMPETITIVE
ADVANTAGE: VALUE PRICING
48
Value Pricing
• Imagine customers compare Cruze and Corolla from retailers
(i.e., car dealers)
• Imagine you are a manager at a Chevy dealer
• What price range is feasible for you?
49
Value Pricing Provides Max Price Possible
• Net benefit for consumers
– Chevy Cruze: B – P
– Toyota Corolla: B’ – P’
• So consumers will buy from Chevy as long as
– (B – P) > (B’ – P’)
– Or P < (B – B’) + P’ = Maximum price Chevy can set
(B – B’): Differentiation Value; P’: Reference Price
– If (B – B’) < 0, Chevy needs to set a lower price P than Toyota (P’)
– If (B – B’) > 0, Chevy can set a higher price than Toyota
• Chevy’s Minimum price: C = Variable cost (=shutdown point)
• Added value gives a wider range of price relative to rivals
50
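The feasible price range for the Chevy dealer can be sketched in R. All dollar figures below are hypothetical; only the formulas (maximum price = differentiation value + reference price, minimum price = variable cost) come from the slide.

```r
# Hypothetical inputs (illustrative only)
benefit_cruze   <- 23000  # B:  consumer benefit from the Chevy Cruze
benefit_corolla <- 24000  # B': consumer benefit from the Toyota Corolla
price_corolla   <- 21000  # P': reference price (Toyota's price)
variable_cost   <- 17000  # C:  Chevy's variable cost (shutdown point)

# Differentiation value (B - B'); negative means Chevy is valued less
differentiation_value <- benefit_cruze - benefit_corolla

# Consumers buy from Chevy as long as P < (B - B') + P'
max_price <- differentiation_value + price_corolla
min_price <- variable_cost

print(c(min_price = min_price, max_price = max_price))
```

With these numbers the differentiation value is negative, so the maximum price sits below Toyota's price.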
SHAPE OF DEMAND:
ELASTICITY
51
Why are elasticities important?

A good manager needs to provide quantitative answers to:
– How large should the price cut be to achieve 3% sales growth?
– If we cut price by 6.5%, how many more units will we sell?

How can managers answer these questions? Elasticities– a
quantitative foundation

Implication to corporate decision making
“A great business has pricing power or the power to raise prices
without losing business to a competitor” (Warren Buffett)
52
Elasticity: Basic Definition
Elasticity is the % change that will occur in one variable in response
to a 1% change in another variable
It is a units-free measure of the sensitivity of one variable to another
(Own) Price Elasticity

Measures how quantity demanded changes due to changes in price
– Example: (10% decrease in Q) due to (5% increase in P)
– EP = (– 10) /5 = –2

P & Q are always positive numbers & the demand curve is downward
sloping
=> EP is always negative

Exceptions observed: Demand for rice in Hunan and Gansu in China
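A minimal R sketch of the elasticity arithmetic, using the slide's example (EP = –2) to answer the two managerial questions raised earlier. It treats the elasticity as constant over the price change, which is an approximation assumed here, and the helper name `price_elasticity` is hypothetical.

```r
# Elasticity = % change in quantity / % change in price
price_elasticity <- function(pct_change_q, pct_change_p) pct_change_q / pct_change_p

# Slide example: a 5% price increase leads to a 10% drop in quantity
e_p <- price_elasticity(-10, 5)

# If we cut price by 6.5%, by what % do units sold change?
pct_q_from_cut <- e_p * (-6.5)

# What % price change is needed for 3% sales (quantity) growth?
pct_p_for_growth <- 3 / e_p

print(c(elasticity = e_p, pct_q = pct_q_from_cut, pct_p = pct_p_for_growth))
```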
54
Cross-Price Elasticity
⚫ △% in the quantity demanded of product X resulting from
△% in the price of product Y
EXY = (△% in Q of product X) / (△% in P of product Y)
⚫ If EXY > 0 (e.g., X and Y are Prius and Yaris), the two products are substitutes
⚫ If EXY < 0 (e.g., X and Y are movie and popcorn), the two products are complements
⚫ Example: a 5% increase in Py => a 10% decrease in Qx,
EXY = (–10)/5 = –2
55
Own Price Elasticity: Extreme Cases
Perfectly inelastic demand (vertical D): $E_P = \frac{\Delta Q}{\Delta P}\cdot\frac{P}{Q} = 0$
Perfectly elastic demand (horizontal D): $E_P = \frac{\Delta Q}{\Delta P}\cdot\frac{P}{Q} = -\infty$
Baby milk and chewing gum. Which
product is more elastic than the other?
Examples of perfectly elastic demand?
56
DEMAND ESTIMATION
57
Own-Price, Cross-Price, and Income
Elasticities: Examples

                     P of Sentra   P of Escort   P of LS400   P of 735i
Demand for Sentra      -6.528         0.078         0.000       0.000
Demand for Escort       0.054        -6.031         0.001       0.000
Demand for LS400        0.000         0.001        -3.085       0.093
Demand for 735i         0.000         0.001         0.032      -3.515

             Price elasticity   Cross-price elasticity   Income elasticity
Coca-Cola        -1.47                  0.52                   0.58
Pepsi            -1.55                  0.64                   1.38

                     P of apples   P of bananas   P of peaches
Demand for Apples      -0.586        -0.207          0.118
Demand for Bananas     -0.409        -1.199          0.546
Demand for Peaches      0.015         1.082         -1.105
58
What Does Hurricane Katrina in 2005 Tell Us
about the Elasticity of Retail Gasoline?
[Figure: supply shifts from S0 to S1 along demand curve D; price rises from Pold to Pnew and quantity falls from Qold to Qnew; axes Price and Quantity]
% change in price = 17.5%, % change in quantity = -8.3%
=> Elasticity = -8.3%/17.5% = -0.47: inelastic in the short run
59
Example: Market Demand for Fresh Roses
[Figure: Prices and Quantities of Fresh-Cut Roses; vertical axis: price (dollars per stem), 0.00 to 0.60; horizontal axis: Quantity (millions of stems per month), 0 to 10; two groups of points: February 1991-1993 and August 1991-1993. Source: Besanko-Braeutigam (2005)]
Upward Sloping Demand Curve???
Valentine’s Day Effect SHIFTS the demand curve OUT!
60
Example: Market Demand for Coffee
⚫ Suppose you got data from Baltimore/Washington DC, during July and then October…
⚫ If you use OLS regression to find a relationship, you would get …
P = 4.5 – 0.5Q
[Figure: fitted line; vertical axis P (0 to 5), horizontal axis # of K coffees (0 to 8)]
61
Example: Market Demand for Coffee
Summer day: Q = 8 – 2P, i.e., P = 4 – 0.5Q
School year day: Q = 10 – 2P, i.e., P = 5 – 0.5Q
[Figure: two parallel demand curves; vertical axis P (0 to 5), horizontal axis # of K coffees (0 to 8)]
But daily demand for coffee on summer days (July) may look like the lower curve,
while daily demand during the school year (Oct.) is more like the upper curve
These are two separate demand curves that should be estimated separately
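The pooling problem can be reproduced in R with the two demand curves from the slide. The price grid and the noise-free data are assumptions made so the result is exact; real data would be noisy.

```r
# Same (hypothetical) price grid observed in both seasons, no noise
prices <- seq(0.5, 3.5, by = 0.5)
summer <- data.frame(season = "summer", P = prices, Q = 8 - 2 * prices)
school <- data.frame(season = "school", P = prices, Q = 10 - 2 * prices)
coffee <- rbind(summer, school)

# Pooled regression mixes the two curves: Q = 9 - 2P, i.e., P = 4.5 - 0.5Q
pooled <- lm(Q ~ P, data = coffee)

# Separate regressions recover each season's own demand curve
summer_fit <- lm(Q ~ P, data = subset(coffee, season == "summer"))
school_fit <- lm(Q ~ P, data = subset(coffee, season == "school"))

print(rbind(pooled = coef(pooled), summer = coef(summer_fit), school = coef(school_fit)))
```

Here the pooled line is the average of the two curves and describes neither season. With prices that differ systematically across seasons, the pooled slope would be distorted as well.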
62
Factors Affecting Market Demand

Market demand for coffee takes some things as given:
⚫ Market population
⚫ Income of local population
⚫ Current price of other goods, e.g.
– Pepsi (substitute)
– Muffins (complements)
⚫ Information about say health effects, etc.
⚫ Advertising

Whenever these change, the whole demand curve shifts
63
Demand Estimation:
Simple vs Multivariate Regressions
Estimated equation: Q = 184.47 – 7.56*P (or P = 24.40 – 0.132*Q)
[Figure: scatter plot of Y and Predicted Y against X Variable 1; vertical axis 0 to 60, horizontal axis 0.00 to 25.00]
Interpretation of the coefficient on P:
Every $1000 increase in price
=> approx. 8 fewer minivans sold
64
Challenge in Estimating Demand:
“controlling for” factors other than price
Using historical data, estimate demand as
P = a – bQ
[Figure: estimated demand curve Dest fitted through the observed points (Q05, P05), (Q06, P06), (Q07, P07), (Q08, P08); axes P and Q]
65
Challenge in Estimating Demand:
“controlling for” factors other than price
(cont’d)
A = expenditure on advertising
If true demand is
P = a – b Q + c A, and you estimate
P = a – b Q, you will overestimate the effect of a change
in price on quantity demanded.
[Figure: true demand curves D05, D06, D07, D08 shift from year to year; Dest is fitted through (Q05, P05), …, (Q08, P08); at price P09, Q09(predicted) from Dest differs from Q09(realized); axes P and Q]
66
Demand Estimation:
Multivariate regression analysis
• Looking at relations between two variables in isolation can be
misleading…
• Suppose auto companies sell cars in urban markets where
price is low and ads expenditure is high
• How would you separate out how much of a change in Q sold
was due to pricing versus advertising (or something else)?
=> You must “control” for all the things that move the demand
curve around to get true price-quantity relation
• In most cases, you use multiple regression analysis, with
other explanatory variables besides price on the right-hand
side
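The need to control for advertising can be illustrated with a small simulation in R. Everything below is an assumption made for illustration: the data-generating process, the parameter values (true b = 2, c = 3), and the variable names.

```r
set.seed(42)
n <- 200

# Simulated (hypothetical) data: advertising A raises quantity and shifts demand out
A <- runif(n, 0, 10)                 # advertising expenditure
Q <- 10 + 2 * A + rnorm(n)           # quantity sold
P <- 50 - 2 * Q + 3 * A + rnorm(n)   # true inverse demand: P = a - b*Q + c*A

# Short regression omits A; long regression controls for it
short_fit <- lm(P ~ Q)
long_fit  <- lm(P ~ Q + A)

b_short <- coef(short_fit)[["Q"]]
b_long  <- coef(long_fit)[["Q"]]

# A flatter inverse-demand slope implies a larger quantity response to price (1 / slope)
print(c(slope_short = b_short, slope_long = b_long,
        dQdP_short = 1 / b_short, dQdP_long = 1 / b_long))
```

In this sketch the short regression gives a flatter slope on Q, so it overstates how much quantity responds to price, while the long regression lands near the true value of –2.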
67
(end of slides)
BU.450.740
Retail Analytics
Week 2
Regression Analysis in R
Prof. Mitsukuni Nishida
Johns Hopkins Carey Business School
1
Today’s Agenda
• Some logistics (to make your life easier in R computing)
– Draft an R script and implement the script
– Generate a business report in R (next week)
• Motivation: Why linear regression?
• Run regressions in R
– Simple regression
– Multivariate regression
• Interpreting outcomes in R
– Hypothesis testing: coefficient-level and equation-level
– Examining goodness of fit
– Omitted variable bias (next week)
2
BASIC LOGISTICS IN
COMPUTING IN R
3
Preparation
• Open RStudio
1. First clear the existing workspace, such as loaded data, scripts,
etc. by
– rm(list=ls())
2. Set the working directory by
– setwd("C:/Users/mnishid2/Dropbox/teaching/2023_24/Retail Analytics/Coding in R/w2")
3. Install the required R packages “tidyverse” & “dplyr” for easy data
manipulation and visualization
– install.packages("tidyverse")
– install.packages("dplyr")
4
DRAFT R SCRIPT
5
Draft R script
Let’s create a script to display a scatter plot of X and Y, where X and
Y are random numbers with mean 0 and standard deviation 1
• In console window, we can write something like
• It produces a graph
6
Draft R script
• Now imagine you will be producing plots with different
parameterization
• Typing a set of commands is tedious!
• Solution? A script
7
Draft and save R script
• We initiate the script by
– File- New File – R Script
• We write
– x = rnorm(100)
– y = rnorm(100)
– plot(x,y)
• We then save the script by
– File- Save As… – (Choose your favorite file name)
8
Run R script
• We run the script by clicking “Run” in Source pane (pane 1) three
times
– R runs the script line by line
• If we want to run the whole script, click “Source”
9
Maintain R script
• For easier understanding of the code among co-workers (including
yourself in future), I recommend adding comments
• R ignores anything after “#”
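A commented version of the scatter-plot script might look like the sketch below. The `set.seed()` call is an addition not in the slides; it only makes the random draws reproducible.

```r
# Scatter plot of two independent standard-normal samples
set.seed(123)     # fix the random seed (added here for reproducibility)

x <- rnorm(100)   # 100 draws with mean 0 and standard deviation 1
y <- rnorm(100)   # another 100 independent draws

plot(x, y)        # scatter plot of y against x
```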
10
LINEAR REGRESSION
11
Regression
• A useful and widely used tool for predicting outcomes
• Reason 1: A baseline approach for supervised learning (SL)
– SL learns a function that maps an input (X) to an output (Y)
• e.g., Ads & Sales are input & output
• Reason 2: A jumping-off point for many fancy approaches: they are
generalizations of linear regressions
12
Never Underestimate Power of
Linear Regressions!
Why are we obsessed with linear regressions?
• Practically, this method is the gold standard for our retail analytics
• Think of this skill as a “litmus test” for a data scientist
– Properly conducting and interpreting linear regressions is a
scarce resource (still)
• Consequently,
– If you are able to (1) construct regression equations and (2)
interpret results properly, they will believe fancy ML analyses
you provide & later respect your skill
– The converse is true
13
A Guidebook for you…
You can download the 2nd edition
(free!) at
https://www.statlearning.com/
14
Simple Linear Regression
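Ordinary Least Squares (i.e., linear regression) chooses a regression line
consisting of b0 (beta_zero) and b1 (beta_one)
[Figure: scatter plot with the fitted line; for the 22nd observation, the outcome Y22, the fitted value Y22_hat, and the residual e22 are marked]
Criteria for picking parameters?
Pick ones such that we minimize mean squared errors (MSE)
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2 = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_1\right)^2 = \frac{1}{n}\sum_{i=1}^{n}\left(e_i\right)^2$$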
15
Multivariate Regressions
• What is “Multivariate” (or “multiple”)?
– More than one regressors/controls/explanatory variables/“Xs”
• Again, OLS picks parameters such that MSE is minimized
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2 = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_1 - \cdots - \hat{\beta}_p X_p\right)^2$$
16
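To see that OLS picks the MSE-minimizing line, the sketch below compares the textbook closed-form solution for a simple regression with `lm()` on simulated data. The data-generating process (Y = 2 + 3X + e) and the variable names are assumptions for illustration.

```r
set.seed(1)
x <- rnorm(100)
y <- 2 + 3 * x + rnorm(100)   # simulated data (assumed true line: 2 + 3x)

# Mean squared error for a candidate intercept b[1] and slope b[2]
mse <- function(b) mean((y - b[1] - b[2] * x)^2)

# Closed-form OLS solution for one regressor
b1_hat <- cov(x, y) / var(x)
b0_hat <- mean(y) - b1_hat * mean(x)

# lm() should return the same coefficients
fit <- lm(y ~ x)
print(rbind(closed_form = c(b0_hat, b1_hat), lm = unname(coef(fit))))

# Any other line has a larger MSE, e.g. after shifting the intercept by 0.1
print(c(mse_ols = mse(c(b0_hat, b1_hat)), mse_shifted = mse(c(b0_hat + 0.1, b1_hat))))
```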
Why OLS Useful? (1) Consistency
• As the number of observations increases toward infinity (i.e., the sample
size grows), the estimated parameters converge to the true
parameters
• Relevant in the age of “Big data”
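Consistency can be illustrated by re-estimating the same model on larger and larger simulated samples. The true model (Y = 2 + 3X + e), the sample sizes, and the helper name `slope_at_n` are assumptions for this sketch.

```r
set.seed(2024)

# Estimate the slope from a fresh simulated sample of size n (true slope = 3)
slope_at_n <- function(n) {
  x <- rnorm(n)
  y <- 2 + 3 * x + rnorm(n)
  coef(lm(y ~ x))[["x"]]
}

sample_sizes <- c(20, 200, 2000, 20000)
estimates    <- sapply(sample_sizes, slope_at_n)

# The absolute error tends to shrink as n grows
print(data.frame(n = sample_sizes, slope = estimates, abs_error = abs(estimates - 3)))
```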
17
Why OLS Useful? (2) Unbiasedness
• We generate 100 random Xs and 100 Ys based on
– Y = 2 + 3X + e
• We run 10 regressions based on 10 different data sets.
Red line: true relationship (chosen by
“god”, never observed by humans)
Blue line: linear regression line from a
set of observations
Each blue line does not match the true line.
But on average, the regression lines
match the true line (unbiasedness),
something that is never revealed to humans!
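The unbiasedness experiment can be sketched in R. The slide uses 10 data sets; the sketch below uses 500 replications (an assumption) so that the average is stable, and it assumes standard-normal X and e.

```r
set.seed(1)

# One replication: draw 100 observations from Y = 2 + 3X + e and return the OLS slope
one_slope <- function() {
  x <- rnorm(100)
  y <- 2 + 3 * x + rnorm(100)
  coef(lm(y ~ x))[["x"]]
}

slopes <- replicate(500, one_slope())

# Individual estimates scatter around 3, but their average is close to 3
print(c(min = min(slopes), mean = mean(slopes), max = max(slopes)))
```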
18
Advanced: Conditions for Unbiasedness
1. Linear in parameters
The model in population is a linear combination of explanatory
variables
2. Random sampling
We have a random sample of N observations of (Y, X)
3. Zero conditional mean
The error epsilon has an expected value of zero given X: E[e|X]=0
To translate the condition 3:
– This condition says X variables are exogenous such that X
variables do not correlate with e (epsilon or error term)
– This condition can be violated in several ways
– For instance, when key variables to explain Y are omitted from
the estimation equation (“omitted variable bias”, covered next
week)
19
READ TODAY’S DATA
20
Read the data
• Tell R that we use the packages we just installed
– library(tidyverse)
– library(dplyr)
• Load “marketing.rda” by “File-Open File” or
– load("C:/Users/mnishid2/Dropbox/teaching/2023_24/Retail Analytics/Coding in R/w2/marketing.rda")
What is this data (“marketing”)? It’s 200 months of sales and
advertising (ads) in media from a retailer
• Variables
youtube: Expenditure on youtube ads in USD
facebook: Expenditure on facebook ads in USD
newspaper: Expenditure on newspaper ads in USD
sales: Sales of the product in quantity
21
Data Cleaning: Missing Obs & Obvious Errors?
• Let’s scroll down to the end to check that there are no outliers
for each variable
• Also, by summarizing data, let’s check if any obvious errors exist
(e.g., negative Ad expenditures)
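These checks can also be done programmatically. The sketch below uses a tiny hypothetical stand-in data frame, `marketing_demo`, with the same column names as the course data, and plants one missing value and one negative value so the checks have something to find.

```r
# Hypothetical stand-in for the course's "marketing" data (values are made up)
marketing_demo <- data.frame(
  youtube   = c(276.1, 53.4, NA, 181.8),     # one missing value planted
  facebook  = c(45.4, 47.2, 55.1, -49.6),    # one negative value planted
  newspaper = c(83.0, 54.1, 83.2, 70.2),
  sales     = c(26.5, 12.5, 11.2, 22.2)
)

# Count missing observations per variable
missing_per_column <- colSums(is.na(marketing_demo))

# Count obviously wrong (negative) values per variable
negative_per_column <- colSums(marketing_demo < 0, na.rm = TRUE)

print(rbind(missing = missing_per_column, negative = negative_per_column))
```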
22
DESCRIPTIVE ANALYSES
23
Descriptive Analysis
• Summarize the data variables
– summary(marketing)
• If there are missing values, R will tell you via “NA’s”
24
Descriptive Analysis
• Plot youtube and sales
– plot(marketing$youtube, marketing$sales)
• Export in jpeg
25
Descriptive Analysis
• Plot all combinations at once
– pairs(marketing, pch = ".")
26
SIMPLE REGRESSION
(i.e., one regressor)
27
Simple Linear Regression
• We construct a linear regression model
$Y_i = \beta_0 + \beta_1 X_1 + \varepsilon$
– Where Y: sales, X1: youtube ads expenditure
• We use lm() to run a regression
– lm(sales ~ youtube, data = marketing)
Y (“regressand”)
X (“regressor”)
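Since the course file marketing.rda is not reproduced here, the sketch below runs the same kind of regression on simulated stand-in data. The data frame `marketing_demo`, its parameter values, and the object name `reg_demo` are all hypothetical.

```r
set.seed(10)

# Simulated stand-in for the marketing data (not the real course data)
marketing_demo <- data.frame(youtube = runif(200, 0, 350))
marketing_demo$sales <- 8.4 + 0.047 * marketing_demo$youtube + rnorm(200, sd = 3)

# Regress sales (regressand) on youtube (regressor)
reg_demo <- lm(sales ~ youtube, data = marketing_demo)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
print(summary(reg_demo)$coefficients)
```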
28
Simple Linear Regression
• We need more information about the regression outcome.. What
should we do?
• We store the results in “reg_sales_yt” when running a regression, then summarize them
– reg_sales_yt <- lm(sales ~ youtube, data = marketing)
– summary(reg_sales_yt)
…
– Pr(>|t|): p-value corresponding to the t-statistic
30
Interpreting Regression: Magnitude
• Magnitude of coefficients: b0 and b1 are 8.439 and 0.047
– Thus, the estimated regression equation is
sales = 8.439 + 0.047 * youtube
• What are implications for the company?
31
Interpreting Regression: Magnitude
• sales = 8.439 + 0.047 * youtube
• Coefficient b1 (=0.047) can be interpreted as “average effect on y
(sales, measured in units) of a one unit increase in the control
(i.e., youtube ads in USD), holding all other factors fixed.”
– If youtube advertising budget is zero, we expect sales of 8.439
units
– If youtube advertising budget is $1,000, we expect sales
of 55.439 units (= 8.439 + 0.047 * 1000)
• R can predict sales units for two budgets: $0 and $2,000
– pred_data <- data.frame(youtube = c(0, 2000))
– reg_sales_yt %>% predict(pred_data)
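The prediction arithmetic can be checked by hand in R using the rounded coefficients reported on the slide. With the fitted model object, `predict()` would do the same calculation using the unrounded estimates.

```r
# Rounded coefficients from the slide
b0 <- 8.439   # intercept
b1 <- 0.047   # slope on youtube ads

# Predicted sales (units) at three advertising budgets
budgets         <- c(0, 1000, 2000)
predicted_sales <- b0 + b1 * budgets

print(data.frame(youtube = budgets, predicted_sales = predicted_sales))
```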
32
Construct CI to Describe Precision
• Remaining question: how precise are these predictions?
• To answer this question, we construct confidence interval (CI) for
each parameter
• What is CI?
– CI measures level of confidence that parameter lies in the interval
– Technically, 95% CI is a range of values such that with 95%
probability, the range contains the true unknown value of b1 (See
ISLR p.77)
• Assuming b1 follows a normal distribution, 95% CI for b1 will be
[b1_hat – 1.96 * Std_error, b1_hat + 1.96 * Std_error]
• Substituting the parameters yields
[0.047 – 1.96 *0.0026, 0.047 + 1.96 *0.0026]
Or [0.041904, 0.052096]
33
Manually Construct CI in R
• Create a matrix to store the regression information
– Coef1 = summary(reg_sales_yt)$coef
(Coef1[,1]: coefficient estimates; Coef1[,2]: standard errors)
• Compute the lower and upper bound for CI
– lolim = Coef1[,1] – 1.96 * Coef1[,2]
– uplim = Coef1[,1] + 1.96 * Coef1[,2]
– cbind(lolim, uplim)
• Or use
– confint(reg_sales_yt)
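The manual and built-in intervals can be compared on simulated stand-in data (hypothetical; not the course's marketing data). Because `confint()` uses the t-distribution critical value rather than 1.96, its interval is slightly wider.

```r
set.seed(10)

# Simulated stand-in data (not the real course data)
youtube  <- runif(200, 0, 350)
sales    <- 8.4 + 0.047 * youtube + rnorm(200, sd = 3)
reg_demo <- lm(sales ~ youtube)

# Manual 95% CI using the normal approximation (critical value 1.96)
coefs     <- summary(reg_demo)$coefficients
lolim     <- coefs[, 1] - 1.96 * coefs[, 2]
uplim     <- coefs[, 1] + 1.96 * coefs[, 2]
manual_ci <- cbind(lolim, uplim)

# Built-in CI based on the t distribution
exact_ci <- confint(reg_demo)

print(manual_ci)
print(exact_ci)
```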
34
Construct CI: Implementing in R
• Based on CI for b1: [0.04223072, 0.05284256], we can tell
management the following message:
“For each $1,000 increase in youtube advertising, there will be an
average increase in sales of between 42 and 53 units. ”
35
Remark 1: How to interpret “95%” CI?
• What does this CI really mean by stating “95%”?
• Formally speaking, CI measures the frequency of possible
confidence intervals that contain the true value of the unknown
population parameter
For instance, imagine we estimate CI for b1 (estimate here: b1 = 0.047)
100 times from 100 data sets.
If we are looking at 95% CI, about 95 times out of
100 the CI contains the true but
unobserved parameter (i.e., god knows this
value but observers like us do not know it)
So the estimated CI has a 95% chance of
containing the true b1
36
Remark 2: Why use 1.96 for CI?
• Quick answer: 1.96 is rule of thumb when no access to stats table
– We assumed a normal distribution for b1 for approximation,
because t distribution has a bell shape and for values of n greater
than approximately 30 it is quite similar to the normal distribution
• Under classical linear model assumptions, b1 follows a t-distribution
with N – K – 1 degrees of freedom. Then CI is
[b1_estimated – c * SE, b1_estimated + c * SE]
where b1_estimated: 0.047
c: 97.5th %tile of t distribution with N – K – 1 degrees of freedom
SE: estimated standard errors (= 0.0026)
N: number of observations (=200)
K: number of regressors (=1), so N – K – 1 = 198
• We look up in the stats table to find out the value of c = 1.972
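Instead of a printed table, the critical value can be obtained in R with `qt()`. The sketch assumes 198 degrees of freedom, i.e., 200 observations minus one slope and one intercept, which matches the N – 2 used for the t-test in these slides.

```r
n_obs        <- 200   # number of observations
n_regressors <- 1     # youtube is the only regressor

# Degrees of freedom: N - K - 1
deg_freedom <- n_obs - n_regressors - 1

# 97.5th percentile of the t distribution vs. the normal approximation
c_t      <- qt(0.975, df = deg_freedom)
c_normal <- qnorm(0.975)

print(round(c(df = deg_freedom, t_critical = c_t, normal_critical = c_normal), 3))
```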
37
HYPOTHESIS TESTING:
VERIFYING RELATIONSHIP
BETWEEN Y AND X1
38
Interpreting Regression: Verifying
Relationship via Hypothesis Testing
• Can we conclude whether there is a statistical relationship
between “youtube” ad expenditure and “sales”?
• To answer this question, we can use the p-value of the estimated
coefficient b1 to perform a hypothesis test
• The p-value on “youtube” coefficient is 1.46e-42 (i.e., 1.46 × 10^-42)
• What does this mean?
39
Interpreting Regression: Verifying
Relationship via Hypothesis Testing
• Consider the null hypothesis
– H0: There is no relationship between X (youtube) and Y (sales)
– In a simple regression, H0: b1 = 0
• Consider the alternative hypothesis
– H1: There is some relationship between X and Y
– Namely, H1: b1 ≠ 0
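• If H0 is true, we expect that the t statistic for b1 follows a t-distribution with N-2 degrees of freedom
$$t = \frac{\hat{\beta}_j}{SE(\hat{\beta}_j)} = 17.66 \quad (j = 1)$$
(computed from the unrounded estimates; the rounded values 0.047/0.0026 give about 18.1)
• By looking at the t-table, we can calculate the probability of observing
|t| ≥ 17.66 under H0. In our case, p = 1.46e-42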
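The t-statistic and its two-sided p-value can be reproduced by hand and compared with what `summary()` reports. The sketch uses simulated stand-in data (hypothetical, not the course's marketing data), so the numbers differ from those on the slide.

```r
set.seed(10)

# Simulated stand-in data (not the real course data)
youtube  <- runif(200, 0, 350)
sales    <- 8.4 + 0.047 * youtube + rnorm(200, sd = 3)
reg_demo <- lm(sales ~ youtube)
coefs    <- summary(reg_demo)$coefficients

# t statistic: estimate divided by its standard error
t_stat <- coefs["youtube", "Estimate"] / coefs["youtube", "Std. Error"]

# Two-sided p-value from the t distribution with N - 2 degrees of freedom
p_manual  <- 2 * pt(-abs(t_stat), df = df.residual(reg_demo))
p_summary <- coefs["youtube", "Pr(>|t|)"]

print(c(t = t_stat, p_manual = p_manual, p_summary = p_summary))
```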
40
Interpreting Regression: Verifying
Relationship via Hypothesis Testing
• P-value of 1.46e-42 means… what?
– It means that a very unlikely event has happened IF H0 (null
hypothesis) is true
– In other words, a small p-value indicates it is unlikely to
observe such a substantial association between youtube and
sales, in the absence of any real association between the two
• Hence, if we see a very small p-value for H0, we infer there is a
statistical association between youtube and sales
– Namely, we reject the null hypothesis at the 1%/5% level
Note: Typical p-value cutoffs for rejecting the null hypothesis are either 5%
or 1%
41
DIAGNOSING FIT OF
REGRESSION: R-SQUARED
42
Interpreting Regression: R-squared
• R^2 (“R squared”) statistic is a measure of the linear relationship
between X and Y, and 0