We are testing the association between college major and gender. The null and alternative hypotheses are defined as follows:
- \( H_0 \): College major and gender are independent.
- \( H_1 \): College major and gender are dependent.
The observed frequencies from the survey are organized in a contingency table as follows:
\[
\begin{array}{|c|c|c|c|c|}
\hline
& \text{Math/Science} & \text{Arts/Humanities} & \text{Business/Econ.} & \text{Other} \\
\hline
\text{Men} & 92 & 93 & 99 & 58 \\
\hline
\text{Women} & 113 & 113 & 94 & 33 \\
\hline
\end{array}
\]
The expected frequency for each cell under \( H_0 \) is calculated using the formula:
\[
E_{ij} = \frac{R_i \times C_j}{N}
\]
where \( R_i \) is the total for row \( i \), \( C_j \) is the total for column \( j \), and \( N \) is the grand total. Here the row totals are 342 (men) and 353 (women), the column totals are 205, 206, 193, and 91, and \( N = 695 \). The expected frequencies are:
\[
\begin{array}{|c|c|c|c|c|}
\hline
& \text{Math/Science} & \text{Arts/Humanities} & \text{Business/Econ.} & \text{Other} \\
\hline
\text{Men} & 100.8777 & 101.3698 & 94.9727 & 44.7799 \\
\hline
\text{Women} & 104.1223 & 104.6302 & 98.0273 & 46.2201 \\
\hline
\end{array}
\]
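The expected-frequency table can be reproduced with a short script. This is a minimal sketch using only the Python standard library; the function name `expected_counts` and the variable names are illustrative choices, not part of the original solution.

```python
# Observed counts from the contingency table (rows: Men, Women;
# columns: Math/Science, Arts/Humanities, Business/Econ., Other).
observed = [
    [92, 93, 99, 58],
    [113, 113, 94, 33],
]


def expected_counts(table):
    """Return E_ij = (row total * column total) / grand total for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for label, row in zip(["Men", "Women"], expected):
    print(label, [round(e, 4) for e in row])
```

Rounded to four decimals, the printed rows should agree with the table above.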
The Chi-Square test statistic \( \chi^2 \) is calculated using the formula:
\[
\chi^2 = \sum \frac{(O - E)^2}{E}
\]
where \( O \) is the observed frequency and \( E \) is the expected frequency. The contributions from each cell are:
- For cell (1, 1): \( O = 92, E = 100.8777 \Rightarrow \frac{(92 - 100.8777)^2}{100.8777} = 0.7813 \)
- For cell (1, 2): \( O = 93, E = 101.3698 \Rightarrow \frac{(93 - 101.3698)^2}{101.3698} = 0.6911 \)
- For cell (1, 3): \( O = 99, E = 94.9727 \Rightarrow \frac{(99 - 94.9727)^2}{94.9727} = 0.1708 \)
- For cell (1, 4): \( O = 58, E = 44.7799 \Rightarrow \frac{(58 - 44.7799)^2}{44.7799} = 3.9029 \)
- For cell (2, 1): \( O = 113, E = 104.1223 \Rightarrow \frac{(113 - 104.1223)^2}{104.1223} = 0.7569 \)
- For cell (2, 2): \( O = 113, E = 104.6302 \Rightarrow \frac{(113 - 104.6302)^2}{104.6302} = 0.6695 \)
- For cell (2, 3): \( O = 94, E = 98.0273 \Rightarrow \frac{(94 - 98.0273)^2}{98.0273} = 0.1655 \)
- For cell (2, 4): \( O = 33, E = 46.2201 \Rightarrow \frac{(33 - 46.2201)^2}{46.2201} = 3.7813 \)
Summing these contributions gives:
\[
\chi^2 = 10.9193
\]
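As a check on the arithmetic, the per-cell contributions and their sum can be computed directly from the observed counts. The sketch below is standard-library Python; `chi_square_statistic` is a hypothetical helper name introduced here for illustration.

```python
# Observed counts (rows: Men, Women).
observed = [
    [92, 93, 99, 58],
    [113, 113, 94, 33],
]


def chi_square_statistic(table):
    """Return (statistic, contributions), where each contribution is (O - E)^2 / E."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    contributions = []
    for i, row in enumerate(table):
        contrib_row = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count under independence
            contrib_row.append((o - e) ** 2 / e)
        contributions.append(contrib_row)
    statistic = sum(sum(r) for r in contributions)
    return statistic, contributions


chi2, contributions = chi_square_statistic(observed)
for row in contributions:
    print([round(c, 4) for c in row])
print("chi-square:", round(chi2, 4))
```

Because the script works with unrounded expected counts, individual contributions may differ from hand-rounded values in the last decimal place.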
The degrees of freedom \( df \) for the Chi-Square test are calculated as:
\[
df = (r - 1)(c - 1)
\]
where \( r \) is the number of rows and \( c \) is the number of columns. In this case:
\[
df = (2 - 1)(4 - 1) = 3
\]
The critical value for \( \chi^2 \) at \( \alpha = 0.05 \) with 3 degrees of freedom is:
\[
\chi^2_{0.05,\,3} = 7.8147
\]
The p-value is the probability that a Chi-Square variable with 3 degrees of freedom exceeds the calculated statistic:
\[
p = P(\chi^2_3 > 10.9193) \approx 0.0122
\]
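Both the p-value and the critical value can be checked numerically without a statistics library. The sketch below relies on the closed form of the upper-tail probability for exactly 3 degrees of freedom, \( P(X > x) = \operatorname{erfc}(\sqrt{x/2}) + \sqrt{2x/\pi}\, e^{-x/2} \), and it does not generalize to other degrees of freedom. The function names and the bisection bounds are assumptions made for illustration.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 degrees of freedom."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


def chi2_critical_df3(alpha, lo=0.0, hi=100.0, tol=1e-10):
    """Find x with P(X > x) = alpha by bisection (the tail probability decreases in x)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi2_sf_df3(mid) > alpha:
            lo = mid  # tail still too heavy, so the critical value lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


p_value = chi2_sf_df3(10.9193)
critical = chi2_critical_df3(0.05)
print("p-value:", round(p_value, 4))
print("critical value:", round(critical, 4))
```

In practice a library routine such as a chi-square survival function would normally be used instead; the closed form is shown only to keep the example self-contained.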
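Since the p-value \( 0.0122 \) is less than the significance level \( \alpha = 0.05 \) (equivalently, the test statistic \( 10.9193 \) exceeds the critical value \( 7.8147 \)), we reject the null hypothesis \( H_0 \).
At the 5% significance level, there is sufficient evidence to conclude that college major and gender are dependent.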
Thus, the answers are:
a. The correct statistical test to use is the Chi-Square Test of Independence.
b. \( H_{0} \): College major and gender are independent.