IFT Notes for Level I CFA® Program

R11 Hypothesis Testing

Part 3


 

4. Hypothesis Tests Concerning Variance and Correlation

 

Instructor’s Note:

Focus on the basics of this topic; the probability of being tested on the details is low.

4.1.     Tests Concerning a Single Variance

In tests concerning the variance of a single normally distributed population, we use the chi-square test statistic, denoted by χ².

Properties of the chi-square distribution

The chi-square distribution is asymmetrical and, like the t-distribution, is a family of distributions. This means that a different distribution exists for each possible value of degrees of freedom, n – 1. Since the variance is a squared term, its minimum value is 0. Hence, the chi-square distribution is bounded below by 0. The graph below shows the shape of a chi-square distribution:

[Figure: shape of a chi-square distribution]

There are three hypotheses that can be formulated (σ² represents the true population variance and σ₀² represents the hypothesized variance):

  1. \rm H_0: \sigma^2 = {\sigma}^2_0 \ versus \ H_a: \sigma^2 \neq {\sigma}^2_0. This is used when we believe the population variance is different from the hypothesized variance. It is a two-tailed test.
  2. \rm H_0: \sigma^2 \geq {\sigma}^2_0 \ versus \ H_a: \sigma^2 < {\sigma}^2_0. This is used when we believe the population variance is less than the hypothesized variance. It is a one-tailed test.
  3. \rm H_0: \sigma^2 \leq {\sigma}^2_0 \ versus \ H_a: \sigma^2 > {\sigma}^2_0. This is used when we believe the population variance is greater than the hypothesized variance. It is a one-tailed test.

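After drawing a random sample from a normally distributed population, we calculate the test statistic, which has n – 1 degrees of freedom, using the following formula:

\rm \chi^2 = \frac{(n-1)s^2}{{\sigma}^2_0}

where:

n = sample size

s² = sample variance

σ₀² = hypothesized value of the population variance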

We then determine the critical values using the level of significance and degrees of freedom. The chi-square distribution table is used to look up the critical value.

Example

Consider Fund Alpha, which we discussed in an earlier example. This fund has been in existence for 20 months. During this period, the standard deviation of monthly returns was 5%. You want to test a claim by the fund manager that the standard deviation of monthly returns is less than 6%.

Solution:

The null and alternate hypotheses are: H0: σ² ≥ 36 versus Ha: σ² < 36

Note that the hypothesized standard deviation is 6%. Since the test concerns the population variance, we square this number to arrive at a hypothesized variance of 36 (in percent squared). Likewise, the sample variance is 5² = 25.

We then calculate the value of the chi-square test statistic:

χ² = (n – 1)s² / σ₀² = 19 × 25 / 36 = 13.19

Next, we determine the rejection point based on df = 19 and significance = 0.05. Using the chi-square table, we find that this number is 10.117.

Because this is a left-tailed test, we reject H0 only if the test statistic is lower than the rejection point. Since the test statistic (13.19) is higher than the rejection point (10.117), we cannot reject H0. In other words, the sample standard deviation is not small enough to validate the fund manager’s claim that the population standard deviation is less than 6%.
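The Fund Alpha calculation can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library. The function name `chi_square_statistic` is an illustrative choice, and the rejection point of 10.117 is copied from the table lookup above rather than computed.

```python
def chi_square_statistic(n, sample_variance, hypothesized_variance):
    """Chi-square test statistic for a single variance (n - 1 degrees of freedom)."""
    return (n - 1) * sample_variance / hypothesized_variance


# Fund Alpha: 20 monthly observations, sample std dev 5%, hypothesized std dev 6%.
n = 20
sample_variance = 5 ** 2         # 25, in percent squared
hypothesized_variance = 6 ** 2   # 36, in percent squared

chi2 = chi_square_statistic(n, sample_variance, hypothesized_variance)

# Lower-tail rejection point for df = 19 at the 0.05 level,
# taken from the chi-square table as quoted in the example (not computed here).
rejection_point = 10.117

# Left-tailed test (Ha: variance < 36): reject H0 only if the statistic
# falls below the rejection point.
reject_h0 = chi2 < rejection_point

print(f"chi-square = {chi2:.2f}, df = {n - 1}, reject H0: {reject_h0}")
```

Working in percent squared (25 and 36) or in decimals (0.0025 and 0.0036) gives the same statistic, because the units cancel in the ratio.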

4.2.     Tests Concerning the Equality (Inequality) of Two Variances

In order to test the equality or inequality of two variances, we use an F-test, which is based on the ratio of sample variances.

The assumptions for an F-test to be valid are:

  • The samples must be independent.
  • The populations from which the samples are taken are normally distributed.

Properties of the F-distribution

The F-distribution, like the chi-square distribution, is a family of asymmetrical distributions bounded from below by 0. Each F-distribution is defined by two values of degrees of freedom, called the numerator and denominator degrees of freedom. As shown in the figure below, the F-distribution is skewed to the right and is truncated at zero on the left hand side.

 

[Figure: F-distribution, skewed to the right, with the rejection region in the right tail]

As shown in the graph, the rejection region is always in the right side tail of the distribution.

When working with F-tests, there are three hypotheses that can be formulated:

  1. \rm H_0 : \sigma_1^2 = \sigma_2^2 \ versus \ H_a: \sigma_1^2 \neq \sigma_2^2 . This is used when we believe the two population variances are not equal.
  2. \rm H_0 : \sigma_1^2 \leq \sigma_2^2 \ versus \ H_a: \sigma_1^2 > \sigma_2^2 . This is used when we believe the variance of the first population is greater than the variance of the second population.
  3. \rm H_0 : \sigma_1^2 \geq \sigma_2^2 \ versus \ H_a: \sigma_1^2 < \sigma_2^2 . This is used when we believe the variance of the first population is less than the variance of the second population.

The term σ₁² represents the variance of the first population and σ₂² represents the variance of the second population.

The formula for the test statistic of the F-test is:

\rm F = \frac{{s_1}^2}{{s_2}^2}

where:

\rm {s_1}^2 = the sample variance of the first sample with n1 observations

\rm {s_2}^2 = the sample variance of the second sample with n2 observations

A convention is to put the larger sample variance in the numerator and the smaller sample variance in the denominator.

df1 = n1 – 1 numerator degrees of freedom

df2 = n2 – 1 denominator degrees of freedom

The test statistic is then compared with the critical values found using the two degrees of freedom and the F-tables.

Finally, a decision is made whether to reject or not to reject the null hypothesis.

Example

You are investigating whether the population variance of the Indian equity market changed after the deregulation of 1991. You collect 120 months of data before and after deregulation.

Variance of returns before deregulation was 13. Variance of returns after deregulation was 18.  Check your hypothesis at a confidence level of 99%.

Solution:

Null and alternate hypotheses: H0: σ₁² = σ₂² versus Ha: σ₁² ≠ σ₂²

F-statistic: \rm F = \frac{18}{13} = 1.38

df = 119 for the numerator and denominator

α = 0.01, which means 0.005 in each tail. From the F-table: critical value = 1.6

Since the F-stat is less than the critical value, do not reject the null hypothesis.
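The same steps can be sketched in Python using only the standard library. The helper name `f_statistic` is hypothetical, and the critical value of 1.6 is the F-table figure quoted above, not something the code derives.

```python
def f_statistic(variance_a, variance_b):
    """Ratio of two sample variances, with the larger one in the numerator
    (the convention described in the text)."""
    larger = max(variance_a, variance_b)
    smaller = min(variance_a, variance_b)
    return larger / smaller


# Indian equity market example: 120 monthly observations in each period.
n_before, n_after = 120, 120
variance_before, variance_after = 13.0, 18.0

f_stat = f_statistic(variance_before, variance_after)

# The post-deregulation variance is larger, so it sits in the numerator.
df_numerator = n_after - 1
df_denominator = n_before - 1

# Upper critical value for alpha = 0.01 (0.005 in each tail),
# as read from the F-table in the example.
critical_value = 1.6

reject_h0 = f_stat > critical_value

print(f"F = {f_stat:.2f}, df = ({df_numerator}, {df_denominator}), reject H0: {reject_h0}")
```

Because the larger variance always goes in the numerator, the statistic is at least 1 and only the right-tail critical value needs to be checked.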

Summary of types of test statistics

Hypothesis test of          Use
One population mean         t-statistic or z-statistic
Two population means        t-statistic
One population variance     Chi-square statistic
Two population variances    F-statistic

4.3.     Tests Concerning Correlation

The strength of the linear relationship between two variables is assessed through the correlation coefficient. The significance of a correlation coefficient is tested by using hypothesis tests concerning correlation.

The null and alternative hypotheses are formulated as follows (ρ represents the population correlation coefficient):

  • \rm H_0: \rho = 0
  • \rm H_a: \rho \neq 0

This test is used when we believe the population correlation is not equal to 0, or it is different from the hypothesized correlation. It is a two-tailed test.

As long as the two variables are normally distributed, we can use the sample correlation, r, for our hypothesis testing. The formula for the t-test is:

\rm t = \frac {r \sqrt{n-2}} {\sqrt{1-r^2}}

where n is the sample size. If H0 is true, the test statistic follows a t-distribution with n – 2 degrees of freedom.

The magnitude of r needed to reject the null hypothesis H0: ρ = 0 decreases as sample size n increases due to the following:

  1. As n increases, the number of degrees of freedom increases and the absolute value of the critical value tc decreases.
  2. As n increases, the absolute value of the numerator increases, leading to larger-magnitude t-values.

In other words, as n increases, the probability of Type-II error decreases, all else equal.
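To see both effects at once, the t-statistic formula above can be rearranged for r. This is an algebraic rearrangement offered as an illustration, not a separate result stated in these notes. Squaring both sides and solving for r gives the smallest magnitude of r that leads to rejection for a given critical value tc:

\rm t^2 (1 - r^2) = r^2 (n - 2) \ \Rightarrow \ r^2 = \frac{t^2}{n - 2 + t^2}

\rm |r| > \frac{t_c}{\sqrt{n - 2 + {t_c}^2}}

For instance, assuming n = 60 and tc = 2.00, the threshold is 2.00 / √62 ≈ 0.254. A larger n increases the denominator and lowers tc, so the threshold shrinks.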

Example

The sample correlation between oil prices and monthly returns of energy stocks in Country A is 0.7986 for the period from January 2014 through December 2018. Can we reject a null hypothesis that the underlying or population correlation equals 0 at the 0.05 level of significance?

Solution:

\rm H_0: \rho = 0 → true correlation in the population is 0.

\rm H_a: \rho \neq 0 → correlation in the population is different from 0.

From January 2014 through December 2018, there are 60 months, so n = 60. We use the following statistic to test the above.

\rm t = \frac {0.7986 \sqrt{60-2}} {\sqrt{1-0.7986^2}} = \frac {6.0820}{0.6019} = 10.1052

At the 0.05 significance level, the critical value for this test statistic is 2.00 (n = 60, degrees of freedom = 58). When the test statistic is either larger than 2.00 or smaller than –2.00, we can reject the hypothesis that the correlation in the population is 0. The test statistic is 10.1052, so we can reject the null hypothesis.
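A short Python sketch of this calculation follows, using only the standard library. The function name `correlation_t_statistic` is illustrative, and the critical value of 2.00 is taken from the example rather than computed.

```python
import math


def correlation_t_statistic(r, n):
    """t-statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Country A example: 60 monthly observations, sample correlation 0.7986.
r = 0.7986
n = 60

t_stat = correlation_t_statistic(r, n)

# Two-tailed critical value at the 0.05 level for df = 58, as quoted in the example.
critical_value = 2.00

# Two-tailed test: reject H0 if the statistic lies beyond either critical value.
reject_h0 = abs(t_stat) > critical_value

print(f"t = {t_stat:.4f}, df = {n - 2}, reject H0: {reject_h0}")
```

Taking the absolute value of the statistic handles both tails in one comparison, so a strongly negative sample correlation would be rejected in the same way.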

5. Other Issues: Nonparametric Inference

The hypothesis-testing procedures we have discussed so far have two characteristics in common:

  • They are concerned with parameters, such as the mean and variance.
  • Their validity depends on a set of assumptions.

Any procedure which has either of the two characteristics is known as a parametric test.

Nonparametric tests are not concerned with a parameter and/or make few assumptions about the population from which the sample is drawn. We use nonparametric procedures in three situations:

  • Data do not meet distributional assumptions.
  • Data are given in ranks. (Example: relative size of the company and use of derivatives.)
  • The hypothesis does not concern a parameter. (Example: Is a sample random or not?)

The Spearman rank correlation coefficient test is a popular nonparametric test. The coefficient is calculated based on the ranks of two variables within their respective samples.
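As an illustration of how ranks enter the calculation, the sketch below computes the coefficient with the common shortcut formula 1 – 6Σd² / (n(n² – 1)), where d is the difference between the two ranks of each observation. The formula is not given in these notes, and this version assumes there are no tied values. The function names and the data are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n. Assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rank_correlation(x, y):
    """Spearman rank correlation via the no-ties shortcut formula."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    rank_x, rank_y = ranks(x), ranks(y)
    # Sum of squared differences between the paired ranks.
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical data: company size and a derivatives-usage score for five firms.
company_size = [120, 450, 300, 80, 610]
derivatives_use = [2.0, 5.5, 6.1, 1.2, 7.4]

r_s = spearman_rank_correlation(company_size, derivatives_use)
print(f"Spearman rank correlation = {r_s:.2f}")
```

Only the ordering of the values matters here, so the result is unchanged if either variable is rescaled by any increasing transformation.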

