### Question 3

Given the following information, answer the questions that follow:

From the household budget survey of 1980 by the Dutch Central Bureau of Statistics, J. S. Cramer obtained the following logit model based on a sample of 2820 households. (The results given here are based on the method of maximum likelihood and are after the third iteration.) The purpose of the logit model was to determine car ownership as a function of (logarithm of) income. Car ownership was a binary variable: [tex]\( Y = 1 \)[/tex] if a household owns a car, zero otherwise.

[tex]\[
\begin{array}{l}
\hat{L}_i = -2.77231 + 0.347582 \ln(\text{Income}) \\
t = (-3.35) \quad (4.05) \\
\chi^2(1 \, \text{df}) = 16.681 \, (p \text{ value } = 0.0000)
\end{array}
\][/tex]

where [tex]\(\hat{L}_i\)[/tex] is the estimated logit and [tex]\(\ln(\text{Income})\)[/tex] is the logarithm of income. The [tex]\(\chi^2\)[/tex] measures the goodness of fit of the model.

a. Interpret the estimated logit model.

b. From the estimated logit model, how would you obtain the expression for the probability of car ownership?

c. What is the probability that a household with an income of 20,000 will own a car? And at an income level of 25,000? What is the rate of change of probability at the income level of 20,000?

d. Comment on the statistical significance of the estimated logit model.



Answer:

Sure, let's go step-by-step through each part:

### a. Interpret the estimated logit model.

The estimated logit model is given by:
[tex]\[ \hat{L}_i = -2.77231 + 0.347582 \cdot \ln(\text{Income}) \][/tex]

Where:
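- [tex]\(\hat{L}_i\)[/tex] is the estimated logit, that is, the log of the odds of car ownership, [tex]\(\ln\left(\hat{P}_i / (1 - \hat{P}_i)\right)\)[/tex], where [tex]\(\hat{P}_i\)[/tex] is the estimated probability that household [tex]\(i\)[/tex] owns a car. The model makes this log-odds a linear function of log-income.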
- The coefficient for the intercept is -2.77231.
- The coefficient for the logarithm of income is 0.347582.

Interpretation:
- The intercept [tex]\(-2.77231\)[/tex] is the estimated log-odds of car ownership when [tex]\( \ln(\text{Income}) = 0 \)[/tex]. Since [tex]\(\ln(1) = 0\)[/tex], this corresponds to an income of 1 unit of currency, which lies far below any realistic household income, so the intercept has little practical meaning on its own.
- The coefficient of [tex]\(\ln(\text{Income})\)[/tex], 0.347582, is positive, so the log-odds of car ownership rise with income. A one-unit increase in [tex]\(\ln(\text{Income})\)[/tex] (income multiplied by [tex]\(e \approx 2.718\)[/tex]) raises the log-odds by 0.347582, which multiplies the odds by [tex]\(e^{0.347582} \approx 1.42\)[/tex]. Equivalently, a 1% increase in income raises the log-odds by roughly 0.0035.
- The t-values (in parentheses under the coefficients) indicate the statistical significance of these coefficients: -3.35 for the intercept and 4.05 for the slope on log-income. Because the model is estimated by maximum likelihood, these ratios are judged against the standard normal distribution in large samples.
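
The odds interpretation of the slope can be checked numerically. The following is a minimal sketch that uses only the slope reported in the question; the helper name `odds_multiplier` is hypothetical and not part of the original exercise.

```python
import math

# Slope on ln(Income), taken from the estimated logit model above.
BETA_LN_INCOME = 0.347582


def odds_multiplier(income_ratio: float) -> float:
    """Factor by which the odds of car ownership change when income is
    multiplied by `income_ratio` (hypothetical helper for illustration).

    The log-odds change by BETA * ln(income_ratio), so the odds are
    multiplied by exp(BETA * ln(income_ratio)) = income_ratio ** BETA.
    """
    if income_ratio <= 0:
        raise ValueError("income_ratio must be positive")
    return income_ratio ** BETA_LN_INCOME


if __name__ == "__main__":
    # One-unit rise in ln(Income), i.e. income multiplied by e.
    print(f"income x e   : odds x {odds_multiplier(math.e):.4f}")
    # A 10% rise and a doubling of income.
    print(f"income x 1.1 : odds x {odds_multiplier(1.1):.4f}")
    print(f"income x 2   : odds x {odds_multiplier(2.0):.4f}")
```

Because the regressor is in logs, the effect on the odds depends on the proportional change in income, not on the absolute change.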

### b. From the estimated logit model, how would you obtain the expression for the probability of car ownership?

To convert the logit function into a probability, we use the logistic function. The probability [tex]\( \hat{P}_i \)[/tex] of car ownership can be obtained by transforming the logit [tex]\(\hat{L}_i\)[/tex]:

[tex]\[ \hat{P}_i = \frac{1}{1 + e^{-\hat{L}_i}} \][/tex]

Substituting [tex]\( \hat{L}_i \)[/tex] in, we get:

[tex]\[ \hat{P}_i = \frac{1}{1 + e^{-(-2.77231 + 0.347582 \cdot \ln(\text{Income}))}} \][/tex]
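
For completeness, the logistic form follows from the definition of the logit as the log of the odds. A short derivation in the same notation, solving for [tex]\( \hat{P}_i \)[/tex]:

[tex]\[
\begin{array}{l}
\hat{L}_i = \ln\left(\dfrac{\hat{P}_i}{1 - \hat{P}_i}\right) \\
\dfrac{\hat{P}_i}{1 - \hat{P}_i} = e^{\hat{L}_i} \\
\hat{P}_i = \dfrac{e^{\hat{L}_i}}{1 + e^{\hat{L}_i}} = \dfrac{1}{1 + e^{-\hat{L}_i}}
\end{array}
\][/tex]

Since [tex]\( e^{-0.347582 \cdot \ln(\text{Income})} = \text{Income}^{-0.347582} \)[/tex], the same probability can also be written as:

[tex]\[ \hat{P}_i = \frac{1}{1 + e^{2.77231} \cdot \text{Income}^{-0.347582}} \][/tex]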

### c. What is the probability that a household with an income of 20,000 will own a car? And at an income level of 25,000? What is the rate of change of probability at the income level of 20,000?

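Substituting each income level into the expressions from part (b):

- For an income of 20,000:
[tex]\[ \ln(20{,}000) \approx 9.9035 \][/tex]
[tex]\[ \hat{L}_{20k} = -2.77231 + 0.347582 \times 9.9035 \approx 0.6700 \][/tex]
[tex]\[ \hat{P}_{20k} = \frac{1}{1 + e^{-0.6700}} \approx 0.6615 \][/tex]

- For an income of 25,000:
[tex]\[ \ln(25{,}000) \approx 10.1266 \][/tex]
[tex]\[ \hat{L}_{25k} = -2.77231 + 0.347582 \times 10.1266 \approx 0.7475 \][/tex]
[tex]\[ \hat{P}_{25k} = \frac{1}{1 + e^{-0.7475}} \approx 0.6786 \][/tex]

- The rate of change of probability at the income level of 20,000 comes from differentiating [tex]\( \hat{P} \)[/tex]. With respect to [tex]\(\ln(\text{Income})\)[/tex]:
[tex]\[ \frac{d\hat{P}}{d\ln(\text{Income})} = 0.347582 \cdot \hat{P}(1 - \hat{P}) = 0.347582 \times 0.6615 \times 0.3385 \approx 0.0778 \][/tex]
With respect to income itself, the chain rule divides this by the income level:
[tex]\[ \frac{d\hat{P}}{d\,\text{Income}} = \frac{0.0778}{20{,}000} \approx 3.89 \times 10^{-6} \][/tex]
So at an income of 20,000, an extra 1,000 of income raises the probability of car ownership by roughly 0.004.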
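
The figures in this part can be reproduced with a few lines of code. This is a minimal sketch using only the two coefficients from the question; the function names are illustrative choices, not part of the original exercise.

```python
import math

# Coefficients from the estimated logit model in the question.
INTERCEPT = -2.77231
BETA_LN_INCOME = 0.347582


def logit(income: float) -> float:
    """Estimated log-odds of car ownership at a given income."""
    return INTERCEPT + BETA_LN_INCOME * math.log(income)


def probability(income: float) -> float:
    """Estimated probability of car ownership (logistic transform)."""
    return 1.0 / (1.0 + math.exp(-logit(income)))


def slope_wrt_log_income(income: float) -> float:
    """dP/d ln(Income) = beta * P * (1 - P)."""
    p = probability(income)
    return BETA_LN_INCOME * p * (1.0 - p)


def slope_wrt_income(income: float) -> float:
    """dP/dIncome, obtained by the chain rule: divide by income."""
    return slope_wrt_log_income(income) / income


if __name__ == "__main__":
    for income in (20_000, 25_000):
        print(f"income={income}: logit={logit(income):.4f}, "
              f"P={probability(income):.4f}")
    print(f"dP/dln(Income) at 20,000: {slope_wrt_log_income(20_000):.4f}")
    print(f"dP/dIncome at 20,000:     {slope_wrt_income(20_000):.3e}")
```

Keeping the two derivatives as separate functions makes it explicit which "rate of change" is being reported, since they differ by a factor equal to the income level.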

### d. Comment on the statistical significance of the estimated logit model.

The model's statistical significance can be inferred from:

- The t-values for the intercept [tex]\(-2.77231\)[/tex] and the slope [tex]\(0.347582\)[/tex] are -3.35 and 4.05, respectively. A large t-value in absolute terms indicates that the coefficient is significantly different from zero. Both absolute values exceed the usual critical values (1.96 at the 5% level and about 2.58 at the 1% level), so both the intercept and the slope are statistically significant.

- The chi-square ([tex]\( \chi^2 \)[/tex]) value for the goodness of fit is 16.681 with 1 degree of freedom and a reported p-value of 0.0000, meaning the p-value is zero to four decimal places. The null hypothesis here is that the model fits no better than a model with no predictors, which with a single regressor amounts to a zero slope on log-income. The very low p-value (much less than 0.05) leads us to reject that null hypothesis, so including log-income gives a statistically significant improvement in fit.

Thus, the model is statistically significant in explaining the probability of car ownership as a function of log-income.
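
The p-values behind these conclusions can be approximated with the standard library alone. The sketch below assumes the reported t-ratios can be treated as standard normal in large samples and that the chi-square statistic has 1 degree of freedom, as stated in the question; the function names are illustrative.

```python
import math


def normal_two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def chi2_1df_p(stat: float) -> float:
    """Upper-tail p-value for a chi-square statistic with 1 df.

    A chi-square(1) variable is a squared standard normal, so
    P(X > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2)).
    """
    if stat < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.erfc(math.sqrt(stat / 2.0))


if __name__ == "__main__":
    # Statistics reported in the question.
    print(f"intercept, t = -3.35: p = {normal_two_sided_p(-3.35):.6f}")
    print(f"slope,     t =  4.05: p = {normal_two_sided_p(4.05):.6f}")
    print(f"chi-square = 16.681 : p = {chi2_1df_p(16.681):.6f}")
```

The 1-df shortcut avoids any dependency on a statistics package; for other degrees of freedom a library routine for the chi-square distribution would be needed.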