Description
This biostatistics assignment covers correlation and linear regression. Please answer each question carefully and correctly. There are three files: two contain the data, and the third contains the directions and the questions to be answered.
gestage  bwt
30       1820
31       1900
32       2210
33       2250
34       2500
35       2775
36       2901
37       3090
37       3200
38       3270
38       3347
39       3400
39       3424
39       3500
40       3493
40       3600
40       3616
41       3680
41       3720
42       3800
study  iron  lead
1      17    8
1      22    17
1      35    18
1      43    25
1      80    58
1      85    59
1      91    41
1      72    30
1      96    43
1      100   58
1      11    3
2      11    10
2      25    22
2      30    33
2      44    40
2      50    50
2      65    52
2      71    54
2      82    55
2      90    57
2      100   60
2      80    53
PHSL 6504
HWK 10: Correlation and Linear Regression
Reading Assignment
• Chapters 17 and 18 in the textbook.
The Data Sets
1) Birth weight and gestational age study (note: for some gestational ages there are multiple observations, but
all of these data points are independent):
gestage (wks)   bwt (grams)
30              1820
31              1900
32              2210
33              2250
34              2500
35              2775
36              2901
37              3090, 3200
38              3270, 3347
39              3400, 3424, 3500
40              3493, 3600, 3616
41              3680, 3720
42              3800
2) Lead and iron exercise (percent of dose absorbed): each value is the amount of the metal (mg)
measured in one ml of blood
Study population 1
iron   lead
17     8
22     17
35     18
43     25
80     58
85     59
91     41
72     30
96     43
100    58
11     3
Study population 2
iron   lead
11     10
25     22
30     33
44     40
50     50
65     52
71     54
82     55
90     57
100    60
80     53
INSTRUCTIONS:
1. Make a scatter-plot of the relationship of gestational age (in weeks, on the x-axis) to birth weight (in grams,
on the y-axis).
2. Perform a linear regression analysis for the model: bwt = gestage. Add the fitted regression line to the plot.
Answer questions 1-6 on the worksheet about the data plot and regression results
3. Make separate scatter-plots of the iron and lead data from the 2 different study populations (it does not matter
which variables are plotted on the x- and y-axes). Compute the correlation coefficients and test the null
hypothesis that ρ =0 for each population separately. Answer questions 7-10 on the worksheet.
NOTE: You do not have to include the graphs.
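The regression in instruction 2 can be sketched in plain Python as follows. This is only an illustration, not the course's prescribed software or method: the function name fit_simple_regression and the dictionary keys are made up for this sketch, and the data are transcribed from the birth weight table above. The p-value is not computed here; it would come from comparing the t statistic with a t distribution on n - 2 degrees of freedom, using a table or a statistics package.

```python
from math import sqrt

# Birth weight data transcribed from the assignment (20 independent infants).
gestage = [30, 31, 32, 33, 34, 35, 36, 37, 37, 38,
           38, 39, 39, 39, 40, 40, 40, 41, 41, 42]
bwt = [1820, 1900, 2210, 2250, 2500, 2775, 2901, 3090, 3200, 3270,
       3347, 3400, 3424, 3500, 3493, 3600, 3616, 3680, 3720, 3800]


def fit_simple_regression(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Hypothetical helper written for this sketch only.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual sum of squares around the fitted line.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - sse / syy

    # Standard error of the slope and t statistic for H0: slope = 0,
    # to be compared with a t distribution on n - 2 degrees of freedom.
    se_slope = sqrt(sse / (n - 2) / sxx)
    t_stat = slope / se_slope

    return {"slope": slope, "intercept": intercept,
            "r_squared": r_squared, "t": t_stat, "df": n - 2}


fit = fit_simple_regression(gestage, bwt)
for name, value in fit.items():
    print(f"{name}: {value:.4g}")
```

A predicted birth weight at a given gestational age is then fit["intercept"] + fit["slope"] * age. Whether that value is trustworthy for an age outside the observed 30 to 42 week range is a separate question that the fit itself cannot answer.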
Homework 5
Name ________________
Questions about the analysis of birth weight and gestational age:
(1) What is your impression from the scatter plot with the fitted regression line, in terms of how well
the linear regression model appears to fit the data?
_______________________________________________
__________________________________________________________________________________
(2) Report the estimated slope______ and estimated intercept______
Interpretation_________________________________________________________________
__________________________________________________________________________________
(3) Report and interpret the R² value.
R² =______ Interpretation______________________________________________________
(4) Report the test statistic and the p-value for the test that the slope equals 0
t = _____ p =_____ Interpretation __________________________________________________
__________________________________________________________________________________
(5) Use the regression equation for a baby born at 38 weeks. What is the predicted birth weight for
such an infant? ______________
(6) Suppose we want to predict the birth weight of a child born at 28 weeks. How would you do it, or
why would you not? _______________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Correlation results of the analysis of lead and iron data.
(7) Based on the scatter plots of each of the 2 study populations, is it appropriate to perform a
correlation analysis? Describe your impressions of the data.
Population 1 _________________________________________________________________
__________________________________________________________________________________
Population 2 _________________________________________________________________
__________________________________________________________________________________
(8) Report the correlation coefficients:
r =______ in population 1 and r =______ in population 2
(9) Report the test statistic and the p-value from the test that ρ =0 in population 1; interpret the results:
__________________________________________________________________________________
__________________________________________________________________________________
(10) “Outliers” are said to be a potential problem for correlation and regression analyses – why do you
think this is so? How would an outlier data point affect the results of a correlation analysis and how
might it affect the results of a linear regression?
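For the lead and iron questions above, the correlation coefficient and the test statistic for H0: ρ = 0 can be sketched in plain Python as follows. The names pearson_r, t_statistic, populations, and results are made up for this illustration, and the values are transcribed from the two study population tables. As with any such test, the p-value would be read from a t distribution with n - 2 degrees of freedom, using a table or a statistics package; it is not computed here.

```python
from math import sqrt

# Iron and lead values transcribed from the assignment, paired row by row.
populations = {
    1: {"iron": [17, 22, 35, 43, 80, 85, 91, 72, 96, 100, 11],
        "lead": [8, 17, 18, 25, 58, 59, 41, 30, 43, 58, 3]},
    2: {"iron": [11, 25, 30, 44, 50, 65, 71, 82, 90, 100, 80],
        "lead": [10, 22, 33, 40, 50, 52, 54, 55, 57, 60, 53]},
}


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: rho = 0, referred to a t distribution with n - 2 df."""
    return r * sqrt((n - 2) / (1 - r ** 2))


results = {}
for label, data in populations.items():
    n = len(data["iron"])
    r = pearson_r(data["iron"], data["lead"])
    results[label] = {"r": r, "t": t_statistic(r, n), "df": n - 2}
    print(f"population {label}: r = {r:.3f}, "
          f"t = {results[label]['t']:.3f}, df = {n - 2}")
```

Because r is symmetric in its two arguments, it does not matter which metal is treated as x and which as y, which is consistent with the instructions. To explore question 10, one could rerun pearson_r after adding or removing a single extreme point and compare the results.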