Annie's friend Ricky thinks there is a negative linear relationship between how much students like the Beastie Boys and how much they like Jedi Mind Tricks. His reasoning is that the two bands are from different decades.

Plot Ricky's data below.
\begin{tabular}{|l|c|c|}
\hline
Student & \begin{tabular}{c}
Number of Beastie Boys \\
Songs Students Like [tex]$(x)$[/tex]
\end{tabular} & \begin{tabular}{c}
Number of Jedi Mind Tricks \\
Songs Students Like [tex]$(y)$[/tex]
\end{tabular} \\
\hline
Student 1 & 12 & 4 \\
\hline
Student 2 & 10 & 12 \\
\hline
Student 3 & 3 & 15 \\
\hline
Student 4 & 13 & 6 \\
\hline
Student 5 & 17 & 4 \\
\hline
Student 6 & 11 & 10 \\
\hline
Student 7 & 4 & 16 \\
\hline
Student 8 & 15 & 8 \\
\hline
Student 9 & 7 & 14 \\
\hline
Student 10 & 6 & 11 \\
\hline
\end{tabular}

1. Which set of data, Annie's or Ricky's, appears to have the strongest linear relationship (correlation), regardless of positive or negative relationship? [tex]$\qquad$[/tex]
2. For your answer to \#1, what is your best estimate for the correlation coefficient, [tex]$r$[/tex]? [tex]$\qquad$[/tex]
3. Draw a line of fit for the set of data you chose as having the highest correlation (Annie's or Ricky's) and determine the equation of the line you drew using the space below.



Answer:

Let's go through the problem step by step.

First, we'll plot the given data points for Ricky's data to visually inspect the relationship.

### Ricky's Data
The given data for Ricky is:
[tex]\[ \begin{array}{|l|c|c|} \hline \text{Student} & x & y \\ \hline \text{Student 1} & 12 & 4 \\ \hline \text{Student 2} & 10 & 12 \\ \hline \text{Student 3} & 3 & 15 \\ \hline \text{Student 4} & 13 & 6 \\ \hline \text{Student 5} & 17 & 4 \\ \hline \text{Student 6} & 11 & 10 \\ \hline \text{Student 7} & 4 & 16 \\ \hline \text{Student 8} & 15 & 8 \\ \hline \text{Student 9} & 7 & 14 \\ \hline \text{Student 10} & 6 & 11 \\ \hline \end{array} \][/tex]

### Step-by-Step Solution:

#### 1. Plot Ricky's Data:
To create a scatter plot, put [tex]\( x \)[/tex] (Number of Beastie Boys Songs Students Like) on the horizontal axis and [tex]\( y \)[/tex] (Number of Jedi Mind Tricks Songs Students Like) on the vertical axis, then plot the ten points:

[tex]\[ \{ (12, 4), (10, 12), (3, 15), (13, 6), (17, 4), (11, 10), (4, 16), (15, 8), (7, 14), (6, 11) \} \][/tex]

#### 2. Determine the Correlation Coefficient:
We'll calculate the Pearson correlation coefficient [tex]\( r \)[/tex], which quantifies the strength and direction of the linear relationship between [tex]\( x \)[/tex] and [tex]\( y \)[/tex]. The formula for [tex]\( r \)[/tex] is:

[tex]\[ r = \frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n \sum x^2 - (\sum x)^2][n \sum y^2 - (\sum y)^2]}} \][/tex]

where [tex]\( n \)[/tex] is the number of data points.

From the table, the required sums are:
[tex]\[ \sum x = 98, \quad \sum y = 100, \quad \sum xy = 817, \quad \sum x^2 = 1158, \quad \sum y^2 = 1174 \][/tex]

Substitute into the formula:
[tex]\[ r = \frac{10(817) - (98)(100)}{\sqrt{[10(1158) - 98^2][10(1174) - 100^2]}} \][/tex]
[tex]\[ r = \frac{8170 - 9800}{\sqrt{[11580 - 9604][11740 - 10000]}} \][/tex]
[tex]\[ r = \frac{-1630}{\sqrt{1976 \cdot 1740}} \][/tex]
[tex]\[ r = \frac{-1630}{\sqrt{3438240}} \][/tex]
[tex]\[ r \approx \frac{-1630}{1854.249} \approx -0.879 \][/tex]

Thus, the correlation coefficient [tex]\( r \)[/tex] for Ricky's data is approximately -0.879, indicating a strong negative linear relationship.
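As a cross-check on the hand arithmetic, the sums and the correlation coefficient can be recomputed directly from the table. The following is a minimal Python sketch using only the standard library; the variable names (`points`, `sum_xy`, and so on) are illustrative choices, not part of the original problem.

```python
import math

# Ricky's data, copied from the table as (x, y) pairs:
# x = Beastie Boys songs liked, y = Jedi Mind Tricks songs liked.
points = [(12, 4), (10, 12), (3, 15), (13, 6), (17, 4),
          (11, 10), (4, 16), (15, 8), (7, 14), (6, 11)]

n = len(points)
sum_x = sum(x for x, _ in points)
sum_y = sum(y for _, y in points)
sum_xy = sum(x * y for x, y in points)
sum_x2 = sum(x * x for x, _ in points)
sum_y2 = sum(y * y for _, y in points)

# Pearson correlation coefficient, using the same formula as above.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print(round(r, 3))
```

Run on the table's values, this prints the sums 98, 100, 817, 1158, 1174 and a correlation of -0.879.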

#### 3. Comparing with Annie's Data (Hypothetical):
To determine which set of data, Annie's or Ricky's, has the strongest linear relationship, we also need the correlation coefficient for Annie's data, [tex]\( r_{Annie} \)[/tex]. Annie's data is not given in this part of the problem, so the comparison here is conditional.

Because the question asks for the strongest relationship regardless of sign, compare absolute values: Ricky's data has [tex]\( |r| \approx 0.879 \)[/tex], so if [tex]\( |r_{Annie}| < 0.879 \)[/tex], Ricky's data shows the stronger linear relationship; otherwise Annie's does.

#### 4. Estimate for Correlation Coefficient [tex]\( r \)[/tex]:
Computed from the ten data points in the table, our best estimate for Ricky's correlation coefficient [tex]\( r \)[/tex] is approximately -0.879.

#### 5. Equation of the Line of Fit:
To find the line of best fit (least squares regression line) for Ricky's data:
[tex]\[ y = mx + b \][/tex]

Here [tex]\( m \)[/tex] is the slope, computed from the table's sums ([tex]\( \sum x = 98 \)[/tex], [tex]\( \sum y = 100 \)[/tex], [tex]\( \sum xy = 817 \)[/tex], [tex]\( \sum x^2 = 1158 \)[/tex]):
[tex]\[ m = \frac{n(\sum xy) - (\sum x)(\sum y)}{n (\sum x^2) - (\sum x)^2} \][/tex]
[tex]\[ m = \frac{10(817) - (98)(100)}{10(1158) - (98^2)} \][/tex]
[tex]\[ m = \frac{8170 - 9800}{11580 - 9604} = \frac{-1630}{1976} \approx -0.8249 \][/tex]

Now, calculating [tex]\( b \)[/tex] (the y-intercept):
[tex]\[ b = \frac{\sum y - m \sum x}{n} \][/tex]
[tex]\[ b \approx \frac{100 - (-0.8249 \cdot 98)}{10} \approx \frac{100 + 80.840}{10} \approx 18.084 \][/tex]

Thus, the equation of the line of best fit is approximately:
[tex]\[ y = -0.825x + 18.084 \][/tex]
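The slope and intercept formulas can be checked the same way. The sketch below is one possible Python version using only the standard library; `fit_line` is a hypothetical helper name introduced for illustration, and it simply applies the two formulas for [tex]\( m \)[/tex] and [tex]\( b \)[/tex] given above.

```python
# Ricky's data, copied from the table as (x, y) pairs.
points = [(12, 4), (10, 12), (3, 15), (13, 6), (17, 4),
          (11, 10), (4, 16), (15, 8), (7, 14), (6, 11)]

def fit_line(pts):
    """Return (m, b) for the least squares line y = m*x + b."""
    n = len(pts)
    sum_x = sum(x for x, _ in pts)
    sum_y = sum(y for _, y in pts)
    sum_xy = sum(x * y for x, y in pts)
    sum_x2 = sum(x * x for x, _ in pts)
    # Slope: same numerator as r, divided by the x-variation term.
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: forces the line through the point of means.
    b = (sum_y - m * sum_x) / n
    return m, b

m, b = fit_line(points)
print(round(m, 3), round(b, 3))
```

A quick sanity check is that a least squares line always passes through the point of means, here [tex]\( (9.8, 10) \)[/tex], so `m * 9.8 + b` should come out to 10 up to floating-point rounding.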

### Conclusion
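- Ricky's data has a correlation coefficient of approximately -0.879, a strong negative linear relationship; it is the stronger of the two data sets provided [tex]\( |r_{Annie}| < 0.879 \)[/tex].
- The line of fit for Ricky's data is approximately [tex]\( y = -0.825x + 18.084 \)[/tex].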