\begin{tabular}{|l|l|l|l|l|l|l|l|l|}
\hline
\begin{tabular}{l}
Time spent \\
revising for an \\
exam (Hours)
\end{tabular} & 9 & 0.5 & 1 & 4 & 6 & 2 & 3 & 7 \\
\hline
Test result (\%) & 90 & 20 & 38 & 62 & 68 & 32 & 46 & 70 \\
\hline
\end{tabular}

a) Complete a scatter graph with titles.

b) Draw a line of best fit.

c) What is the correlation?

d) Using your best fit line, estimate the percentage if a student revises for [tex]$8$[/tex] hours.



Answer:

Certainly! Let's solve each part of the question step-by-step.

### a) Scatter Graph with Titles:
To complete the scatter graph, we need to plot the Time spent revising for an exam (in hours) on the x-axis and the Test results (in percentage) on the y-axis.

Here are the coordinates for the scatter plot:
- (9, 90)
- (0.5, 20)
- (1, 38)
- (4, 62)
- (6, 68)
- (2, 32)
- (3, 46)
- (7, 70)

After plotting these points, we should add appropriate titles:
- Title: "Time Spent Revising vs Test Results"
- X-axis label: "Time spent revising (Hours)"
- Y-axis label: "Test result (%)"
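The plotting steps above could be sketched in Python as follows. This is a minimal sketch that assumes the third-party matplotlib library is installed; the function names `scatter_points` and `plot_scatter` and the output file name `scatter.png` are hypothetical choices made for illustration.

```python
# Data from the table: hours revised (x) and test result in % (y).
hours = [9, 0.5, 1, 4, 6, 2, 3, 7]
results = [90, 20, 38, 62, 68, 32, 46, 70]


def scatter_points(xs, ys):
    """Pair each x with its y and sort by x, ready for plotting."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    return sorted(zip(xs, ys))


def plot_scatter(xs, ys, path="scatter.png"):
    """Draw the titled scatter graph and save it to `path` (assumed name)."""
    # Imported here so the data helper above works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    ax.set_title("Time Spent Revising vs Test Results")
    ax.set_xlabel("Time spent revising (Hours)")
    ax.set_ylabel("Test result (%)")
    fig.savefig(path)
    plt.close(fig)
    return path
```

Calling `plot_scatter(hours, results)` would write the titled graph to `scatter.png`, while `scatter_points(hours, results)` lists the coordinates in order of increasing revision time.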

### b) Line of Best Fit:
To draw the line of best fit, we use simple linear regression, which finds the slope (m) and y-intercept (b) that minimize the sum of the squared vertical distances between the points and the line.

First, let's calculate the necessary parameters for linear regression:

Let the dataset for time [tex]\(x_i\)[/tex] and results [tex]\(y_i\)[/tex] be:
```
x = [9, 0.5, 1, 4, 6, 2, 3, 7]
y = [90, 20, 38, 62, 68, 32, 46, 70]
```

Using the formulas for slope (m) and intercept (b) in simple linear regression:
[tex]\[ m = \frac{ N(\sum xy) - (\sum x)(\sum y) }{ N(\sum x^2) - (\sum x)^2 } \][/tex]
[tex]\[ b = \frac{ (\sum y)(\sum x^2) - (\sum x)(\sum xy) }{ N(\sum x^2) - (\sum x)^2 } \][/tex]

Where:
- [tex]\(N\)[/tex] is the number of data points
- [tex]\(\sum xy\)[/tex] is the sum of the products of corresponding [tex]\(x\)[/tex] and [tex]\(y\)[/tex] values
- [tex]\(\sum x\)[/tex] is the sum of [tex]\(x\)[/tex] values
- [tex]\(\sum y\)[/tex] is the sum of [tex]\(y\)[/tex] values
- [tex]\(\sum x^2\)[/tex] is the sum of the squares of [tex]\(x\)[/tex] values

Let's compute these sums:

[tex]\[ \begin{aligned} N &= 8 \\ \sum x &= 9 + 0.5 + 1 + 4 + 6 + 2 + 3 + 7 = 32.5 \\ \sum y &= 90 + 20 + 38 + 62 + 68 + 32 + 46 + 70 = 426 \\ \sum xy &= (9 \cdot 90) + (0.5 \cdot 20) + (1 \cdot 38) + (4 \cdot 62) + (6 \cdot 68) + (2 \cdot 32) + (3 \cdot 46) + (7 \cdot 70) = 2206 \\ \sum x^2 &= (9^2) + (0.5^2) + (1^2) + (4^2) + (6^2) + (2^2) + (3^2) + (7^2) = 196.25 \end{aligned} \][/tex]

Now, substituting these values into the formulas:

[tex]\[ m = \frac{8(2206) - (32.5)(426)}{8(196.25) - (32.5)^2} = \frac{17648 - 13845}{1570 - 1056.25} = \frac{3803}{513.75} \approx 7.402 \][/tex]

[tex]\[ b = \frac{(426)(196.25) - (32.5)(2206)}{8(196.25) - (32.5)^2} = \frac{83602.5 - 71695}{513.75} = \frac{11907.5}{513.75} \approx 23.18 \][/tex]

Therefore, the equation for the line of best fit is:
[tex]\[ y = 7.402x + 23.18 \][/tex]

On the graph, this line crosses the y-axis at about 23 and passes through the mean point [tex]\((\bar{x}, \bar{y}) = (4.0625, 53.25)\)[/tex], so those two points are enough to draw it.
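A short script is a convenient way to check the sums and the resulting slope and intercept. The sketch below applies the sum-based formulas for m and b to the table data; the function name `least_squares_line` is a hypothetical choice for illustration.

```python
# Data from the table: hours revised (x) and test result in % (y).
x = [9, 0.5, 1, 4, 6, 2, 3, 7]
y = [90, 20, 38, 62, 68, 32, 46, 70]


def least_squares_line(xs, ys):
    """Return (slope, intercept) from the sum-based formulas for m and b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    sum_x2 = sum(a * a for a in xs)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; slope is undefined")
    m = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_y * sum_x2 - sum_x * sum_xy) / denom
    return m, b


m, b = least_squares_line(x, y)
print(f"m = {m:.3f}, b = {b:.2f}")  # prints: m = 7.402, b = 23.18
```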

### c) Correlation:
The correlation coefficient (r), also known as Pearson's correlation coefficient, is calculated as:
[tex]\[ r = \frac{N(\sum xy) - (\sum x)(\sum y)}{\sqrt{[N\sum x^2 - (\sum x)^2][N\sum y^2 - (\sum y)^2]}} \][/tex]

First, we calculate [tex]\(\sum y^2\)[/tex]:
[tex]\[ \sum y^2 = 90^2 + 20^2 + 38^2 + 62^2 + 68^2 + 32^2 + 46^2 + 70^2 = 26452 \][/tex]

Then, substituting into the formula with [tex]\(\sum xy = 2206\)[/tex] and [tex]\(\sum x^2 = 196.25\)[/tex]:
[tex]\[ r = \frac{8 \cdot 2206 - 32.5 \cdot 426}{\sqrt{(8 \cdot 196.25 - 32.5^2)(8 \cdot 26452 - 426^2)}} \][/tex]

[tex]\[ r = \frac{17648 - 13845}{\sqrt{(1570 - 1056.25)(211616 - 181476)}} \][/tex]

[tex]\[ r = \frac{3803}{\sqrt{513.75 \cdot 30140}} \][/tex]

[tex]\[ r = \frac{3803}{\sqrt{15484425}} \][/tex]

[tex]\[ r \approx 0.966 \][/tex]

So the correlation coefficient is approximately 0.966, indicating a strong positive linear relationship: the more time a student spends revising, the higher the test result tends to be.
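The Pearson formula can be evaluated directly from the table data as a check on the arithmetic. This is a minimal sketch using only the standard library; the function name `pearson_r` is a hypothetical choice for illustration.

```python
import math

# Data from the table: hours revised (x) and test result in % (y).
x = [9, 0.5, 1, 4, 6, 2, 3, 7]
y = [90, 20, 38, 62, 68, 32, 46, 70]


def pearson_r(xs, ys):
    """Pearson's correlation coefficient from the sum-based formula."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    sum_x2 = sum(a * a for a in xs)
    sum_y2 = sum(b * b for b in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when x or y has zero variance")
    return numerator / denominator


r = pearson_r(x, y)
print(f"r = {r:.3f}")  # prints: r = 0.966
```

A value close to +1 means the points lie near an upward-sloping straight line.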

### d) Estimate the percentage if a student revises for 8 hours:
Using the least-squares line of best fit for this data, [tex]\( y = 7.402x + 23.18 \)[/tex], we can estimate the test result for a student who revises for 8 hours (x = 8):

[tex]\[ y = 7.402 \cdot 8 + 23.18 \][/tex]

[tex]\[ y = 59.216 + 23.18 \][/tex]

[tex]\[ y \approx 82.4 \][/tex]

Thus, if a student revises for 8 hours, the estimated test result would be approximately 82%. Because 8 hours lies within the range of the data (0.5 to 9 hours), this estimate is an interpolation and is reasonably reliable.
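The estimate can also be reproduced in a few lines of Python. The sketch below fits the line with the equivalent mean-based form of the least-squares formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x) and flags whether the requested value lies inside the observed range; the names `fit_line` and `estimate` are hypothetical choices for illustration.

```python
# Data from the table: hours revised (x) and test result in % (y).
x = [9, 0.5, 1, 4, 6, 2, 3, 7]
y = [90, 20, 38, 62, 68, 32, 46, 70]


def fit_line(xs, ys):
    """Least-squares slope and intercept via deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    s_xx = sum((a - mean_x) ** 2 for a in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    m = s_xy / s_xx
    return m, mean_y - m * mean_x


def estimate(xs, ys, x_new):
    """Predicted y at x_new, plus whether x_new is inside the data range."""
    m, b = fit_line(xs, ys)
    within_range = min(xs) <= x_new <= max(xs)
    return m * x_new + b, within_range


estimate_8, interpolated = estimate(x, y, 8)
# prints: 82.4% (within data range: True)
print(f"{estimate_8:.1f}% (within data range: {interpolated})")
```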