PH 627: Advanced Statistical Methods in Public Health
M2 Assignment
- Shown below are results based on a portion of the Honolulu Heart data set. Total sample size is 100. The
dependent variable is cholesterol (mg/dL). The independent variables are systolic blood pressure (mmHg),
ponderal index (computed as height in centimeters divided by the cube root of weight in kilograms), age
(years), smoking (1 = smoker, 0 = nonsmoker), and blood glucose (mg/dL). Output is from SAS PROC REG.
a. Construct a table to report the results including the regression estimate, the 95% confidence interval and
the p-value. Report the results using units that are more suitable for the problem: for SBP use 10 mmHg,
for age use 5 years, and for blood glucose use 20 mg/dL.
b. For each independent variable, note the statistical significance of the variable, and, if significant, provide
an interpretation of the relationship of the independent variable to cholesterol.
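As a hedged illustration of the unit change requested in part (a): a slope reported per 1 unit can be converted to a per-k-unit slope by multiplying the estimate and both confidence limits by k, while the t value and p-value stay the same. The sketch below applies this to the SBP row of the SAS output; the function name `rescale` is hypothetical and chosen only for this example.

```python
def rescale(estimate, lower, upper, unit):
    """Convert a per-1-unit slope and its CI to a per-`unit` change.

    Multiplying a predictor's scale by a positive constant multiplies the
    slope and its confidence limits by the same constant; the t statistic
    and p-value are unchanged.
    """
    if unit <= 0:
        raise ValueError("unit must be positive")
    return (estimate * unit, lower * unit, upper * unit)


# SBP row from the PROC REG output, re-expressed per 10 mmHg.
sbp_per_10 = rescale(0.210, -0.152, 0.573, 10)
print("SBP per 10 mmHg: %.2f (%.2f, %.2f)" % sbp_per_10)
```

The same call with `unit=5` and `unit=20` would handle age and blood glucose, using the corresponding rows of the output.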
Analysis of Variance
                           Sum of        Mean
Source            DF      Squares      Square    F Value    Pr > F
Model              5        25021  5004.22307      3.779    0.0037
Error             94       124466  1324.11409
Corrected Total   99       149487
Parameter Estimates
                  Parameter   Standard
Variable    DF     Estimate      Error   t Value   Pr > |t|   95% Confidence Limits
INTERCEP     1      318.850    91.4003      3.49     0.0007     137.373     500.328
SBP          1        0.210     0.1826      1.15     0.2524      -0.152       0.573
PONDERED     1       -4.139     2.0604     -2.01     0.0474      -8.231      -0.049
AGE          1        0.208     0.7496      0.28     0.7817      -1.280       1.697
SMOKING      1       -5.023     7.6725     -0.65     0.5142     -20.258      10.210
BLDGLUC      1        0.191     0.0712      2.69     0.0084       0.050       0.333
- Using the following ANOVA table based on data in Problem 3 of the M1 Assignment about the regression
relationship of respiratory cancer mortality rates (Y) to air pollution index (X1), mean age (X2), and
percentage of workforce employed in a certain industry (X3), test the following hypotheses:
Source      df          SS
X1           1   1,523.658
X2|X1        1     181.743
X3|X1,X2     1     130.529
Residual    19     551.723
Total       22   2,387.653
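The sequential sums of squares in the table above can be turned into partial F statistics. The sketch below is one way to do this, not a definitive solution. It assumes the convention that the test of X2 given X1 uses the residual from the model containing only X1 and X2 (so the X3 sum of squares and its 1 df are folded back into the residual); some texts use the full-model residual instead. The helper name `partial_f` is hypothetical.

```python
def partial_f(ss_added, df_added, ss_resid, df_resid):
    """Partial F statistic: (extra SS / its df) / (residual SS / its df)."""
    return (ss_added / df_added) / (ss_resid / df_resid)


# Values copied from the ANOVA table.
SS_X1 = 1523.658
SS_X2_GIVEN_X1 = 181.743
SS_X3_GIVEN_X1_X2 = 130.529
SS_RESID = 551.723
DF_RESID = 19

# X2 given X1 (assumed convention: residual from the X1, X2 model).
f_x2 = partial_f(SS_X2_GIVEN_X1, 1,
                 SS_RESID + SS_X3_GIVEN_X1_X2, DF_RESID + 1)

# X3 given X1 and X2 (residual from the full model).
f_x3 = partial_f(SS_X3_GIVEN_X1_X2, 1, SS_RESID, DF_RESID)

# X2 and X3 jointly, given X1 (multiple partial F with 2 numerator df).
f_x2_x3 = partial_f(SS_X2_GIVEN_X1 + SS_X3_GIVEN_X1_X2, 2,
                    SS_RESID, DF_RESID)

print(f"F(X2|X1) = {f_x2:.3f}")
print(f"F(X3|X1,X2) = {f_x3:.3f}")
print(f"F(X2,X3|X1) = {f_x2_x3:.3f}")
```

Each statistic is then compared with the F critical value for its numerator and denominator degrees of freedom. A test that a partial correlation is zero is equivalent to the matching partial F test.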
a. $H_0\colon \rho_{YX_2 \mid X_1} = 0$
b. $H_0\colon \rho_{YX_3 \mid X_1, X_2} = 0$
c. H0: “The addition of X2 and X3 to a model already containing X1 does not significantly improve the
prediction of Y.”
- An experiment involved a quantitative analysis of factors found in high-density lipoprotein (HDL) in a
sample of human blood serum. Three variables thought to be predictive of or associated with HDL
measurement (Y) were the total cholesterol (X1) and total triglyceride (X2) concentrations in the sample, plus
the presence or absence of a certain sticky component called sinking pre-beta, or SPB (X3), which was
coded 0 if absent and 1 if present. Use the provided SAS Output to complete this problem.
a. Test whether X1, X2, or X3 alone significantly helps in predicting Y.
b. Test whether X1, X2, and X3 taken together significantly help to predict Y.
c. Test whether the true coefficients of the product terms X1X3 and X2X3 are simultaneously zero in the
model containing X1, X2, and X3 plus these product terms. State the null hypothesis in terms of a multiple
partial correlation coefficient. If this test is not rejected, what can you conclude about the relationship of
Y to X1 and X2 when X3 equals 1, as compared with when X3 equals 0? In other words, are X1 and X2
modifying the effect of X3 in predicting Y? Explain.
d. Using α = 0.05, test whether X3 is associated with Y, after the combined contribution of X1 and X2 is
taken into account. State the appropriate null hypothesis in terms of a partial correlation coefficient.
- Radial keratotomy is a type of refractive surgery in which radial incisions are made in the cornea of myopic
patients in an effort to reduce their myopia. The Prospective Evaluation of Radial Keratotomy (PERK) study
began in 1983 to investigate the effects of radial keratotomy. Lynn et al. (1987) examined the factors
associated with the five-year postsurgical change in refractive error (Y, measured in diopters, D). Two
independent variables under consideration were baseline refractive error (X1, in diopters) and baseline
curvature of the cornea (X2, in diopters).
Use the given output tables to answer the following questions.
a. State the model that relates change in refraction (Y) to baseline refraction (X1), baseline curvature (X2),
and the interaction of X1 and X2. Perform a significance test for the interaction. What do you conclude
about the interaction?
b. Is it appropriate to assess confounding given your answer to part (a)? Explain.
c. If your answer to part (b) is yes, does X2 confound the relationship between Y and X1?
d. If your answer to part (b) is yes, does X1 confound the relationship between Y and X2?
e. Based on your answers to parts (a) through (d), which variable(s) should be included in the model to improve
precision?
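For part (a), one conventional way to write a two-predictor model with an interaction term, together with the hypothesis usually tested for the interaction, is sketched below. The symbols $\beta_0, \ldots, \beta_3$ and $E$ are notation assumed here for illustration; they do not come from the output tables.

```latex
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + E,
\qquad
H_0\colon \beta_3 = 0 \quad \text{versus} \quad H_A\colon \beta_3 \neq 0
```

Under this formulation, the interaction is assessed with the t test (or the equivalent partial F test) for $\beta_3$ in the fitted model.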