BU.450.740
Retail Analytics
Week 4
Marketing via GIS: Introduction
Prof. Mitsukuni Nishida
Johns Hopkins Carey Business School
Today’s Agenda for Geographical
Analysis in Retailing
• What are GIS and ArcGIS & why do we use them?
• Promotion decisions through geographical analysis
WHAT IS GIS AND WHY IS IT
USEFUL IN RETAILING?
What is GIS?
Geographic Information System (GIS)
• A system to capture, store, manipulate, analyze, manage, and
present spatial/geographic data
• Several platforms, such as ArcGIS and QGIS, allow you to
conduct spatial analysis
Examples of spatial analysis
1. Calculate minimum distance
between points
2. Compute population density for a
given grid
3. Count how many points exist within a
certain boundary
4. Compute travel time
5. Compute spatial statistics re: density
… and more
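The first and third examples above can be sketched in base R without a GIS package. This is only an illustration: the coordinates, the store IDs, the helper `haversine_miles()`, and the 2-mile radius are all assumptions made up for the sketch, and a circular radius stands in for a true boundary polygon.

```r
# Great-circle (haversine) distance in miles between lon/lat points.
haversine_miles <- function(lon1, lat1, lon2, lat2) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  earth_radius_miles <- 3958.8
  2 * earth_radius_miles * asin(pmin(1, sqrt(a)))
}

# Hypothetical store and customer coordinates (illustrative only).
stores <- data.frame(id = c("S1", "S2"),
                     lon = c(-76.62, -76.60),
                     lat = c(39.29, 39.33))
customers <- data.frame(lon = c(-76.61, -76.59, -76.70),
                        lat = c(39.30, 39.32, 39.40))

# Distance matrix: one row per customer, one column per store.
dist_mat <- outer(
  seq_len(nrow(customers)), seq_len(nrow(stores)),
  function(i, j) haversine_miles(customers$lon[i], customers$lat[i],
                                 stores$lon[j], stores$lat[j])
)

# Example 1: minimum distance from each customer to any store.
min_dist <- apply(dist_mat, 1, min)

# Example 3: number of customers within 2 miles of each store.
n_within_2mi <- colSums(dist_mat <= 2)
names(n_within_2mi) <- stores$id

print(round(min_dist, 2))
print(n_within_2mi)
```

A GIS platform does the same kind of calculation with real boundaries, projections, and road networks; the sketch only shows the underlying arithmetic.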
Why ArcGIS?
• Standard platform in spatial analysis for two decades
– Opportunity to experience full capacity of GIS analysis
– GUI is (relatively) easy to use
• Licensed to Hopkins faculty and students
Note: We won’t use R for GIS analysis, although it is technically possible
Johns Hopkins CSSE provides online dashboard for
tracking coronavirus
“WalMap” by Walgreens provides visualized local
trends of Flu & others
CASE: WALGREENS’
PROMOTION STRATEGY
As a manager, what is your strategic decision after finishing designing such flyers?
Walgreens’ Flyers in Newspapers
“Walgreen treated each U.S. locale
equally for newspaper advertising.
…Mike Feldner, the marketing
manager, first created this picture,
noticed something interesting: .. No
dots (customers) for a store more
than two miles from the store.
…Walgreens stopped spending
advertising dollars in all zip codes
without a store within two miles of the
zip code.
Impact
• Sales revenues: almost zero
• Cost savings: more than US $5M
Source: Jeffery (2010)
Huge savings did not require a lot of
money.. analysis was done on a
personal PC’s ArcGIS.”
WG’s Sr. Mkt Analyst -> WG’s Project Manager -> Kellogg MBA -> WG’s Manager
for Media Analytics & Customer Insights -> WG’s Senior Director -> Geography
Head at Mu Sigma -> Chief Marketing Officer at Redbox Automated Retail, LLC
Location Analysis: Applications in Retailing
• Targeted promotional campaign (today)
• Location of new stores: demographics & rival stores
• Optimized route planning
• …
DEVELOPING A TARGETED
PROMOTIONAL CAMPAIGN
Background: Promotion Campaign of Camper
• Assume you are sales manager for Outdoor Living Inc.
– Outdoor Living Inc. (OLI) produces camping and recreational
equipment
• OLI has introduced a new, moderately priced camper to attract
middle-income families who are new to camping
• We will plan promotional campaign for this camper
• All the following exercises using ArcGIS Map (not AGO!) as
platform are covered in Chapter 3 of the textbook by Miller (2007)
http://downloads2.esri.com/ESRIpress/images/120/ch03_TUTMKTG.pdf
Marketing Scenario
• Market segmentation: one of the most powerful tools in marketing
• According to market research, Outdoor Living Inc. has defined its
target segment as
– Family
– Income between $35,000 – $60,000
• As a sales manager, your tasks are
– To identify geographic concentrations of target customers
– To design promotional activities to reach them efficiently
• Newspaper and TV advertising
• Direct mail with price incentives
• Outdoor shows & demonstrations
How Do We Proceed?
1. We explore the demographic characteristics [This week]
• Load, display, and explore maps of Florida
• View thematic maps of Florida’s counties and their
demographic characteristics
• Examine data distribution in densely populated areas
2. Select counties for local advertising campaign [Homework 4]
3. Select stores for outdoor show demo
EXERCISE 3.1
EXPLORE DEMOGRAPHIC
CHARACTERISTICS
Login Procedure for ArcGIS Online (AGO)
1. Visit JHU Sheridan Library’s GIS website
https://guides.library.jhu.edu/gis
2. Click ArcGIS Online for JHU on the left tab (or click
https://gisanddata.maps.arcgis.com/)
3. Sign in with JHED account information
4. On the top tab, hit “Gallery” next to “Home” and “Map”
5. In the search box, type in “Retail Analytics”
6. Hit 2024 Retail Analytics Week 4
AGO dashboard
Things to Try to Get Used to ArcGIS
Let’s play with Layers
• Turning on and off layers
• Moving up and down layers
• Look at what data each layer contains
Operations
• Zooming in and out
• Selecting features
Exercise 3.1: Checking Population
• Turn on “Total Population by County”
– Check the box next to “Total population by county”
• Question 1: “In general, what parts of the state contain the highest
population measure?” Ans. Central and southern parts of Florida
• Question 2: “Which county has the highest population, and what is
the population?” Ans. Miami-Dade County, 2,388,709
– Click the layer “Total_population_by_county”, and click on “Show
table” to open attribute table
– Click on “TOTPOP_CY” and choose “Sort Descending”
– Select the top row, which contains the highest population
– When done with this question, clear the selection
Explore Demographic Characteristics with
Thematic Layers
• Question
– What is the average family size (“AVGFMSZ_CY”) in the county
southwest of Miami?
– Click inside the county southwest of Miami
– Ans. 2.74
Change data represented by a layer
• We just realized a layer may contain various variables
• How can we change the variable to display within a layer?
• In particular, how do we show “Family Households by County”
(FAMHH_CY) instead of total population?
• Choose “Open in Map Viewer Classic”
• Click on “Change Style” below the layer name
• Choose “FAMHH_CY” and choose “Counts and Amounts (Color)”
• Click DONE
• Choose More Options under the layer name and rename the layer
Display Data Distribution in Smaller
Areas
• How do we move layers so we obtain detailed information?
• Turn on “Per Capita Income (PCI) by County”
• Then turn on “Per Capita Income in Tampa”
• Move the “PCI in Tampa” layer to the top
• By toggling the Tampa layer on and off, we see that income
varies within a county
EXERCISE 3.2
SELECT COUNTIES FOR
LOCAL AD CAMPAIGN
Local Advertising Media: Radio and TV
• Our goal and budget constraint
– We choose counties as appropriate geographic units for local ad
– Budget allows us to run local radio and TV ads in 10 counties
– The costs for advertising are linear in the number of counties
selected
• Our question: Which 10 counties should these be?
For details about the data and analysis steps via ArcGIS Map, please see
Exercise 3-2 on pages 58-65 in Chapter 3 of Miller (2007) available at
http://downloads2.esri.com/ESRIpress/images/120/ch03_TUTMKTG.pdf
Homework Questions about Local
Advertising Media: Radio and TV
1. Which variable should we use to select these 10 counties for ads?
– Ans. (provide a variable name and state your reasoning)
2. Which ten counties should be included in Outdoor Living’s local
advertising campaign?
– Ans. (list all selected 10 counties)
3. How many target-market families will be reached by the campaign?
– Ans. (provide your answer)
4. Produce a map with those 10 selected counties
– Ans. (provide your map)
Q1. Which variable should we use to
select these 10 counties for ads?
• We will be narrowing down to the choice between the following two
variables that are relevant for our promotional activities
– NumFamInTM (Number of families in target markets)
– PctFamInTM (Percentage of families in target markets)
• To maximize the number of target families reached by this TV and
Radio Ads campaign, we should choose (variable name) because
(your reasoning behind this choice)
• Hints
– For TV/radio Ads, the costs of promotions are linear in the
number of counties (i.e., number of markets) selected
– For comparison, for direct-mail ads, the costs of
promotions are linear in the number of mailings sent, so the fraction
of families who will be interested in this product is key
Q2. Which ten counties should be
included in Outdoor Living’s local
advertising campaign?
How do we proceed for the analysis?
Steps
1. Generate a map layer on “Number of Families in Target Market”
• We can first modify the layer “Total population by county” and
convert it to display “Number of Families in Target Market”
(NumFamInTM)
2. Open attribute table and select “NumFamInTM” and select “Sort by
Descending”
3. Click 10 counties from the top county to the 10th county by
“CTRL+left click”
Q3. How many target-market families
will be reached by this campaign?
Steps
1. Open attribute table and select “NumFamInTM” and select “Sort by
Descending”
2. Click 10 counties from the top county to the 10th county by
“CTRL+left click”
3. Record the values for the top 10 counties
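The same sort-select-record steps can be sketched in R. The county table below is randomly generated for illustration, not the Florida data from the exercise; only the column name `NumFamInTM` comes from the slides.

```r
# Hypothetical county table; names and values are made up for illustration.
set.seed(1)
counties <- data.frame(
  NAME = paste("County", LETTERS[1:15]),
  NumFamInTM = sample(1000:90000, 15)
)

# Step 1: sort descending by NumFamInTM, as in the attribute table.
ranked <- counties[order(counties$NumFamInTM, decreasing = TRUE), ]

# Step 2: keep the top 10 rows (the CTRL+left click selection).
top10 <- head(ranked, 10)

# Step 3: record the values and total the families reached.
families_reached <- sum(top10$NumFamInTM)

print(top10)
print(families_reached)
```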
Q4. Produce a map with those 10
selected counties
Steps
• Select Print – Map with Legend
• Save the map via Browser’s print function
• To add text and other items, export the map to a PowerPoint slide
(or other software if needed)
(end of slides)
BU.450.740
Retail Analytics
Week 4
Nonlinear Regression in R
Prof. Mitsukuni Nishida
Johns Hopkins Carey Business School
Today’s Agenda for Computation in R
• Nonlinear model: Logistic regression
• Implementations in R
– Application: Convenience store’s entry decision
PROBABILISTIC MODELING:
LOGISTIC REGRESSIONS
Logit Model: Classification Method
• If outcome/response variable Y_i is quantitative & hence continuous
(e.g., sales, income, price etc.), we use linear regressions
• If Y_i is qualitative/categorical & hence discrete (e.g., purchase or
non-purchase, abnormal/normal etc.), use classification methods
– Y_i = 1 if a firm enters a market, = 0 otherwise
– Y_i = 1 if a consumer returns a product, = 0 otherwise
• Today we introduce the logistic regression algorithm
– A tool as basic as “Hello, World” in ML
Introducing Logit Model
• We model how outcome Y is explained by Xs
• The difference from linear regression is that Y is now a binary variable
(= 1 or = 0). What we model is thus
– Y: outcome variable
• Y = 1 implies: a firm enters a market, a person purchases a product,
a borrower defaults, and
• Y = 0 implies otherwise (i.e., a firm does not enter a market, a person
does not purchase a product, a borrower does not default)
– By modeling Y = 1 or 0, we can interpret predicted Y as
“Predicted probability of Y = 1 given Xs”
– Xs: variables that explain the probability of Y = 1 (“regressors”)
• Linear regression: Pr(Y=1) = b0 + b1*X
• Logit regression: Pr(Y=1) = exp(b0 + b1*X) / (1+ exp(b0 + b1*X))
– Why this formula? This formula ensures the right-hand side falls
between 0 and 1 unlike linear regression
– Why logit? The exponential form gives a closed-form expression for
the probability, so computation is fast
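A minimal sketch of the two formulas above, using hypothetical coefficients `b0` and `b1` chosen only for illustration. It shows the linear form leaving the [0, 1] interval while the logit form stays inside it.

```r
# Hypothetical coefficients, chosen only for illustration.
b0 <- -2
b1 <- 0.5
x <- seq(-10, 20, by = 1)

# Linear probability model: Pr(Y=1) = b0 + b1*X.
p_linear <- b0 + b1 * x

# Logit model: Pr(Y=1) = exp(b0 + b1*X) / (1 + exp(b0 + b1*X)).
p_logit <- exp(b0 + b1 * x) / (1 + exp(b0 + b1 * x))

print(range(p_linear))  # extends below 0 and above 1
print(range(p_logit))   # stays strictly between 0 and 1

# Base R's plogis() computes the same logistic function.
print(all.equal(p_logit, plogis(b0 + b1 * x)))
```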
When to Choose Logit over Linear Regression?
[Figure: linear fit versus logistic curve Pr(Y=1) = exp(b0 + b1*X) / (1+ exp(b0 + b1*X)). For some values of X the linear fit gives p > 1 (probability exceeds 1) or p < 0 (negative probability).]
Actual Fit of Linear and Logistic Regression
• Data: Whether a customer defaults on a credit card. Y = 1 if default, = 0 otherwise
• We want to predict probability of default as a function of bank balance
[Figure: predicted probabilities of default (Y) against bank balance (X), in two panels. First panel: linear regression, which gives p < 0 (negative probability) for some balances. Second panel: logistic regression.]
Source: ISLR (“An Introduction to Statistical Learning”) 2nd edition page 133, Figure 4.2
READ TODAY’S DATA &
DATA CLEANING
Convenience-Store Chain’s Entry
• Empirical context: C-Store Chains in Japan 1982-2010
• Geographical markets: Prefecture (1 to 47)
• Outcome variable Y_i is discrete and binary (“d_index_entry”)
– Y_i = 1 if a firm enters a market that year
– Y_i = 0 if a firm does not enter a market
• Input variables (factors) that may affect the entry decision
– Population (“pop”) at the market level
– Income per capita (“incomepc”) at the market level
– # of competitor chains (“no_rival_chains”)
– Year id (“time”)
– Market id (“marketid”): Prefecture id
– Chain id (“chain_id”), such as 7-Eleven (1), LAWSON (2),
Family Mart (3) etc
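As a rough sketch of how such an entry model could be estimated in R, the block below fits a logistic regression with `glm()`. The data are simulated and the generating coefficients are invented; only the variable names `d_index_entry` and `pop` follow the slide, and using population as the single regressor is an assumption made to keep the example short.

```r
set.seed(42)
n <- 2000

# Simulated stand-in for the course data (not the actual C-store dataset).
sim <- data.frame(pop = runif(n, 0.5, 13))  # hypothetical population units

# Hypothetical "true" coefficients used only to generate entry decisions.
true_index <- -4 + 0.3 * sim$pop
sim$d_index_entry <- rbinom(n, size = 1, prob = plogis(true_index))

# Logistic regression of the entry dummy on population.
fit <- glm(d_index_entry ~ pop,
           family = binomial(link = "logit"), data = sim)
print(summary(fit))

# Estimated b0 and b1, and predicted entry probabilities.
b0 <- coef(fit)[["(Intercept)"]]
b1 <- coef(fit)[["pop"]]
p_hat <- predict(fit, type = "response")
print(range(p_hat))
```

With `type = "response"`, `predict()` returns probabilities rather than the linear index `b0 + b1*pop`.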
Read the data
• Tell R that we use the package we just installed
• Load “” by “File-Open File” or
– cstore_raw 0.044
• Calculate the population after a 10% increase for these three levels
– # Increase the population by 10%
– pop_25_10p = pop_25 * 1.1
– pop_50_10p = pop_50 * 1.1
– pop_75_10p = pop_75 * 1.1
Logistic Regression: Evaluating Outcomes
• Calculate the entry probability after 10% increase in population
– p_pop25_10p = exp(b0+b1*pop_25_10p)/(1+exp(b0+b1*pop_25_10p))
– p_pop50_10p = exp(b0+b1*pop_50_10p)/(1+exp(b0+b1*pop_50_10p))
– p_pop75_10p = exp(b0+b1*pop_75_10p)/(1+exp(b0+b1*pop_75_10p))
• Calculate the percentage change in probability of entering a market
– inc_p_pop25_10p = p_pop25_10p/p_pop25 - 1
– inc_p_pop50_10p = p_pop50_10p/p_pop50 - 1
– inc_p_pop75_10p = p_pop75_10p/p_pop75 - 1
– print(inc_p_pop25_10p*100) => 1.60%
– print(inc_p_pop50_10p*100) => 2.11%
– print(inc_p_pop75_10p*100) => 3.00%.
• Namely, for a market at the 75th percentile of population, a
10% increase in population leads to a 3.00% relative increase in the
probability of entering the market.
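The calculation above can be condensed into one vectorized sketch. The coefficients and population quartiles below are hypothetical placeholders, not the estimates from the course data, so the printed numbers will differ from the 1.60%, 2.11%, and 3.00% on the slide.

```r
# Hypothetical coefficients and population quartiles (not the course estimates).
b0 <- -3.5
b1 <- 0.2
pop_q <- c(p25 = 1.2, p50 = 1.8, p75 = 3.0)

# Entry probability at a given population, using the logit formula.
entry_prob <- function(pop) exp(b0 + b1 * pop) / (1 + exp(b0 + b1 * pop))

p_base <- entry_prob(pop_q)        # baseline probabilities
p_up   <- entry_prob(pop_q * 1.1)  # after a 10% population increase

# Relative (percent) change in the probability of entry.
pct_change <- (p_up / p_base - 1) * 100
print(round(pct_change, 2))
```

Because the ratio `p_up / p_base` is used, the result is a relative change in probability, not a change in percentage points.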
Logistic Regression: Fit
• Run the following
– # Plot the fit of logistic regression
– newdat 20 + 101/10 = 30.1>30
• Industry currently uses Nash equilibrium to (1) predict traffic flows and (2)
“nudge” people to follow Nash equilibrium
COMPETITION IN QUANTITY
(NOT IN FINAL EXAM)
Oligopoly
• Two major oligopoly models:
– Cournot: competition in quantity
– Bertrand: competition in price
• Example: Coke and Pepsi (duopoly)
– Coke’s decision is likely to affect Pepsi’s profits
– So Coke should predict how Pepsi would react
Cournot Duopoly: Example
• Two firms in DRAM industry with a homogeneous product
• Firms simultaneously choose output levels Q1 and Q2
• One-shot static game
• Resulting total output determines the price P
• Each firm
– takes the other firm’s output as given
– chooses its output that maximizes its profits
Cournot Duopoly: Example (cont’d)
• Demand curve: P = 100 – Q1 – Q2
• (Total costs for firm1) = 10*Q1
– Both firms have constant marginal cost of $10, no fixed costs
• Firm 1 chooses Q1 to maximize profits taking Q2 as given
• Derive the best response function for each firm
– Step1: Construct the revenue & cost functions for firm 1
(Revenue) = P*Q1 = (100 – Q1 – Q2 ) Q1
(Total Cost) = TC = 10 Q1
– Step 2: Find my own quantity Q1 such that MR = MC:
100 – 2Q1 – Q2 = 10, or Q1 = 45 – 0.5Q2
• Best response function (or “reaction function”): Q1 = 45 – 0.5Q2.
– This function describes my optimal behavior (Q1) given my
rival’s choice (Q2)
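Written out as a profit-maximization problem, the two steps above amount to the following derivation. The profit symbol \pi_1 is notation introduced here for convenience; everything else comes from the demand and cost functions on the slide.

```latex
\begin{aligned}
\pi_1(Q_1, Q_2) &= (100 - Q_1 - Q_2)\,Q_1 - 10\,Q_1 \\
\frac{\partial \pi_1}{\partial Q_1} &= \underbrace{100 - 2Q_1 - Q_2}_{MR} - \underbrace{10}_{MC} = 0 \\
Q_1 &= 45 - 0.5\,Q_2
\end{aligned}
```

The second derivative with respect to Q_1 is -2 < 0, so this first-order condition identifies a maximum. By symmetry, firm 2's best response is Q_2 = 45 - 0.5 Q_1.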
Cournot Equilibrium
• Definition: Cournot equilibrium is a pair of (Q*1,Q*2) such that
– Q*1 is firm 1’s best response to Q*2
– Q*2 is firm 2’s best response to Q*1
• In other words, this equilibrium is a pair of (Q*1,Q*2) such that
no firm can increase profits by unilaterally changing its output
• How can we derive the equilibrium?
• Find (Q1, Q2) that satisfy the system of reaction functions
– Solve for (Q1, Q2) that satisfy Q1 = 45 – 0.5Q2 and Q2 = 45 – 0.5Q1
– We obtain (Q1, Q2) = (30, 30)
• Implication on (Q1, Q2) space?
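As a quick numerical check, the two reaction functions can be solved as a linear system in R. The sketch also computes the price and profits implied by the demand curve P = 100 – Q1 – Q2 and the marginal cost of 10, and confirms with `optimize()` that firm 1 has no profitable unilateral deviation. Variable names are illustrative.

```r
# Reaction functions rewritten as a linear system:
#   Q1 + 0.5 * Q2 = 45
#   0.5 * Q1 + Q2 = 45
A <- matrix(c(1,   0.5,
              0.5, 1), nrow = 2, byrow = TRUE)
b <- c(45, 45)
q_star <- solve(A, b)            # equilibrium quantities (Q1, Q2)

price  <- 100 - sum(q_star)      # demand curve P = 100 - Q1 - Q2
profit <- (price - 10) * q_star  # marginal cost 10, no fixed costs

print(q_star)   # 30 30
print(price)    # 40
print(profit)   # 900 900

# Firm 1's best output when firm 2 produces its equilibrium quantity.
profit_1 <- function(q1) (100 - q1 - q_star[2]) * q1 - 10 * q1
best_dev <- optimize(profit_1, interval = c(0, 90), maximum = TRUE)
print(best_dev$maximum)  # approximately 30, so no profitable deviation
```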
Cournot Reaction Functions
Price and profits under Cournot competition?
Cournot Equilibrium: Properties 1
• If one of the firms is off the equilibrium, both firms will have to
adjust their outputs
– Equilibrium is the point where adjustments will not be
needed
• In the Cournot model, “aggressive” behavior by one firm
(output expanding) is met by “passive” behavior by rivals
(output reduction)
Cournot Equilibrium: Properties 2
• Price: P_perfectcomp < P_cournot < P_monopoly
• Quantity: Q_perfectcomp > Q_cournot > Q_monopoly
• Price-cost margin drops as the number of firms increases
PCM = (P – MC)/P
– If every firm is symmetric, PCM = – 1/(Nη), where N is the
number of firms and η is the price elasticity of market demand (η < 0)
– As the number of firms increases, (1) output will
increase and (2) prices and profit margin per firm will
decline
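These properties can be illustrated by extending the earlier duopoly example to N symmetric firms. The extension is an assumption made for this sketch: demand is taken to be P = 100 – Q with Q the total output, each firm has a marginal cost of 10, and the function name `cournot_n()` is hypothetical. Under these assumptions the symmetric equilibrium output per firm is (100 – 10)/(N + 1).

```r
# Symmetric N-firm Cournot with demand P = a - Q and constant marginal cost mc.
# This N-firm setup is an assumed extension of the two-firm example.
cournot_n <- function(N, a = 100, mc = 10) {
  q   <- (a - mc) / (N + 1)  # output per firm from the symmetric FOC
  Q   <- N * q               # total output
  P   <- a - Q               # market price
  eta <- -P / Q              # price elasticity of demand (dQ/dP = -1)
  data.frame(N = N, q = q, Q = Q, P = P,
             PCM = (P - mc) / P,
             PCM_formula = -1 / (N * eta),
             profit_per_firm = (P - mc) * q)
}

res <- cournot_n(c(1, 2, 5, 10, 100))
print(res)
```

N = 1 reproduces the monopoly outcome and N = 2 the duopoly outcome (30 units per firm at a price of 40). As N grows, the price approaches the marginal cost of 10, which is the perfect-competition benchmark, and the two PCM columns agree in every row.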
Cournot vs. Bertrand
• Some industries are better described by Cournot, some by
Bertrand
– Cournot: Industries where quantity adjustment is more
difficult than price adjustment in the short run: wheat,
cement, steel, cars, computers, video game consoles, ...
– Bertrand: Industries where price adjustment is more
difficult than quantity adjustment: software, insurance,
banking, regulated industries, ...
• Cournot and Bertrand are basic tools to understand
oligopolistic competition
(end of slides)