Hypothesis testing is the process of making judgments about a larger group (a population) on the basis of observing a smaller group (a sample). The results of such a test then help us evaluate whether our hypothesis is true or false.
For example, let’s say you are a researcher and you believe that the average return on all Asian stocks was greater than 2%. To test this belief you can draw samples from a population of all Asian stocks and employ hypothesis testing procedures. The results of this test can tell you if your belief is statistically valid.
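A hypothesis is defined as a statement about one or more populations. Though the curriculum suggests a seven-step process for hypothesis testing, you can arrive at a decision quickly using these four steps:
Step 1: State the hypotheses.
Step 2: Compute the test statistic.
Step 3: Determine the critical value based on the significance level.
Step 4: Compare the test statistic with the critical value and decide whether or not to reject the null hypothesis.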
Let’s understand these steps with the help of a few examples.
Suppose you are a researcher and believe that the average return on all Asian stocks was greater than 2%. In this case, you are making a statement about the population mean (µ) of all Asian stocks.
Step 1: Stating the hypothesis
The first step is stating the two hypotheses: the null hypothesis (H0) and the alternative hypothesis (Ha).
Null hypothesis (H0): It is the hypothesis that the researcher wants to reject.
Alternative hypothesis (Ha): It is the hypothesis that the researcher wants to prove. If the null hypothesis is rejected then the alternative hypothesis is considered valid.
For our example, the null and alternative hypotheses are:
H0: µ ≤ 2%
Ha: µ > 2%
(The value 2% is known as µ0, the hypothesized value of the population mean.)
An easy way to differentiate between the two hypotheses is to remember that the null hypothesis always contains some form of the equal sign.
Step 2: Compute the test statistic.
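A test statistic is calculated from sample data and is compared to a critical value to decide whether or not we can reject the null hypothesis. The formula for computing the test statistic is:
Test statistic = (sample statistic - hypothesized value) / standard error of the sample statistic
For a test of the population mean with a known population standard deviation σ and a sample of size n, this becomes:
z = (x̄ - µ0) / (σ / √n)
where x̄ is the sample mean.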
Continuing with our example, let’s further suppose that the sample mean of 36 observations of Asian stocks is 4 and the standard deviation of the population is 4. Therefore,
z = (4 - 2) / (4 / √36) = 2 / (4 / 6) = 3
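As a quick check of the arithmetic, the calculation can be sketched in Python. This is a minimal illustration that assumes the z-statistic for a mean with known population standard deviation, z = (sample mean - µ0) / (σ / √n); the function name `z_statistic` is made up for this example.

```python
import math

def z_statistic(sample_mean, hypothesized_mean, population_sd, n):
    """z = (sample mean - hypothesized mean) / (population sd / sqrt(n))."""
    standard_error = population_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# Values from the Asian stocks example: sample mean 4, mu0 = 2, sigma = 4, n = 36
z = z_statistic(sample_mean=4, hypothesized_mean=2, population_sd=4, n=36)
print(round(z, 2))  # 3.0
```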
Step 3: Determine the critical value based on significance level
Continuing with our example of Asian stocks, suppose we want to test our hypothesis at the 5% significance level. This is a one-tailed test because we are trying to assess whether the population mean is greater than 2% or not. Hence, we are only interested in the right tail of the distribution. If we were trying to assess whether the population mean is less than 2% we would be interested in the left tail.
The sign in the alternative hypothesis points to the direction of the tail that we should use in our test. Since in our example the alternative hypothesis has a ‘>’ sign it points to the right, therefore we are interested in the right tail.
The critical value is also known as the rejection point for the test statistic. Graphically, this point separates the acceptance and rejection regions for a set of values of the test statistic. This is shown below:
The region to the left of the critical value is the ‘acceptance region’. This represents the set of values of the test statistic for which we do not reject the null hypothesis. The region to the right of the critical value is known as the ‘rejection region’.
Using the z-table and a 5% level of significance, the critical value is z0.05 = 1.645 (often rounded to 1.65).
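Instead of a printed z-table, the critical value can be looked up with the standard normal distribution in Python's standard library. The sketch below assumes a right-tailed test; `right_tail_critical_value` is an illustrative name, not something from the curriculum.

```python
from statistics import NormalDist

def right_tail_critical_value(significance_level):
    """z-value that leaves `significance_level` probability in the right tail."""
    return NormalDist().inv_cdf(1 - significance_level)

critical_value = right_tail_critical_value(0.05)
print(round(critical_value, 3))  # 1.645
```

The unrounded value is slightly below 1.645, which is why tables show either 1.645 or 1.65.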
Step 4: Compare the test statistic with the critical value and determine whether or not to reject the null hypothesis
If the test statistic > critical value, i.e. if the test statistic lies in the rejection region we will reject H0.
On the other hand, if the test statistic < critical value, i.e. if the test statistic lies in the acceptance region, then we cannot reject H0.
In our example, because the test statistic z = 3 is greater than the critical value of 1.645, we reject the null hypothesis in favor of the alternative hypothesis that the average return on all Asian stocks is greater than 2%.
We use a left tailed test to determine whether the estimated value of a population parameter is less than a hypothesized value.
An analyst believes that the average return on all Asian stocks was less than 2%. The sample size is 36 observations with a sample mean of -3. The standard deviation of the population is 4. Will he reject the null hypothesis at the 5% level of significance?
In this case, our null and alternative hypotheses are:
H0: µ ≥ 2
Ha: µ < 2
The standard error of the sample mean is:
σ / √n = 4 / √36 = 0.67
The test statistic is:
z = (-3 - 2) / (4 / 6) = -7.5
The critical value corresponding to a 5% level of significance is -1.65.
When we consider the left tail of the distribution, our decision rule is as follows: Reject the null hypothesis if the test statistic is less than the critical value and vice versa. Since our calculated test statistic of -7.5 is less than the critical value of -1.65, we reject the null hypothesis.
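The left-tailed example can be reproduced with a few lines of Python. This is a sketch under the same assumptions as the text (known population standard deviation, normal test statistic); the variable names are chosen only for this illustration.

```python
import math
from statistics import NormalDist

# Inputs from the left-tailed example
sample_mean, mu0, sigma, n, alpha = -3, 2, 4, 36, 0.05

standard_error = sigma / math.sqrt(n)          # about 0.67
z = (sample_mean - mu0) / standard_error       # about -7.5
critical_value = NormalDist().inv_cdf(alpha)   # left tail: about -1.645

# Left-tailed rule: reject H0 when the test statistic is below the critical value
reject_null = z < critical_value
print(round(z, 2), round(critical_value, 3), reject_null)
```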
We use a two-tailed hypothesis test to determine whether the estimated value of a population parameter is different from the hypothesized value. In a two-tailed test, we reject the null in favor of the alternative if the evidence indicates that the population parameter is either smaller or larger than the value of the parameter under H0.
For example, we believe that the average return on all Asian stocks was not 0%. We take a sample of 36 observations with a sample mean of 1 and a population standard deviation of 4. In this case our null and alternative hypotheses are:
H0: µ = 0
Ha: µ ≠ 0
The standard error of the sample mean is unchanged at 0.67:
σ / √n = 4 / √36 = 0.67
The test statistic is:
z = (1 - 0) / (4 / 6) = 1.5
In a two-tailed test, two critical values exist, one positive and one negative. For a two-sided test at the 5% level of significance, we split the significance level equally between the two tails and calculate the z-values that correspond to a 5% / 2 = 2.5% level of significance in each tail. These are +1.96 and -1.96. Therefore, we reject the null hypothesis if we find that the test statistic is less than -1.96 or greater than +1.96. We fail to reject the null hypothesis if -1.96 ≤ test statistic ≤ +1.96. In our example, the test statistic of (1 - 0) / 0.67 ≈ 1.5 lies between -1.96 and +1.96, so we fail to reject the null hypothesis. Graphically, this can be shown as:
The above figure also illustrates the relationship between confidence intervals and hypothesis tests. A confidence interval specifies the range of values that may contain the hypothesized value of the population parameter. The 5% level of significance in the hypothesis tests corresponds to a 95% confidence interval. When the hypothesized value of the population parameter is outside the corresponding confidence interval, the null hypothesis is rejected. When the hypothesized value of the population parameter is inside the corresponding confidence interval, the null hypothesis is not rejected.
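The link between the two-tailed test and the confidence interval can be checked numerically. The sketch below uses the two-tailed example (sample mean 1, µ0 = 0, σ = 4, n = 36) and assumes a z-based 95% interval built around the sample mean; both routes should give the same decision.

```python
import math
from statistics import NormalDist

# Inputs from the two-tailed example
sample_mean, mu0, sigma, n, alpha = 1, 0, 4, 36, 0.05

standard_error = sigma / math.sqrt(n)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

# Route 1: two-tailed test on the z-statistic
z = (sample_mean - mu0) / standard_error
reject_by_test = abs(z) > z_crit

# Route 2: 95% confidence interval around the sample mean
lower = sample_mean - z_crit * standard_error
upper = sample_mean + z_crit * standard_error
reject_by_interval = not (lower <= mu0 <= upper)

print(round(z, 2), (round(lower, 2), round(upper, 2)))
print(reject_by_test, reject_by_interval)
```

Here the hypothesized value 0 falls inside the interval, so neither route rejects the null hypothesis.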
Summary: One-tailed versus two-tailed tests
One-tailed tests are used to test if the population parameter is greater than or less than a hypothesized value. They can be either right-tailed tests or left-tailed tests.
Two-tailed tests are used to test if the population parameter is not equal to the hypothesized value. They have two rejection points: one positive and one negative, because here we are interested in both tails.
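The three cases can be folded into one helper. This is an illustrative sketch, not a prescribed procedure: the function `z_test` and its `tail` argument are hypothetical names, and it assumes a known population standard deviation so that the z-distribution applies.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05, tail="right"):
    """Return (z, reject_null) for a z-test of a mean with known sigma."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    std_normal = NormalDist()
    if tail == "right":      # Ha: mu > mu0, one rejection point on the right
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif tail == "left":     # Ha: mu < mu0, one rejection point on the left
        reject = z < std_normal.inv_cdf(alpha)
    elif tail == "two":      # Ha: mu != mu0, alpha split across both tails
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("tail must be 'right', 'left', or 'two'")
    return z, reject

# The three examples from the text
right = z_test(4, 2, 4, 36, tail="right")
left = z_test(-3, 2, 4, 36, tail="left")
two = z_test(1, 0, 4, 36, tail="two")
print(right[1], left[1], two[1])
```

Note that the two-tailed branch uses alpha / 2 in each tail, which is why its rejection points (±1.96) sit further out than the one-tailed rejection point (1.645) at the same 5% significance level.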
The p-value is the smallest level of significance at which the null hypothesis can be rejected. It can be used in the hypothesis testing framework as an alternative to using rejection points.
For example, if the p-value of a test is 4%, then the null hypothesis can be rejected at the 5% level of significance (because 4% < 5%), but not at the 1% level of significance (because 4% > 1%).
Relationship between test-statistic and p-value
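A test statistic far out in the tail named by the alternative hypothesis (large in absolute value for a two-tailed test) implies a low p-value.
A test statistic close to the hypothesized value (small in absolute value) implies a high p-value.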
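To make the relationship concrete, p-values for a z-statistic can be computed from the standard normal distribution. The helper `p_value` below is a hypothetical name used for illustration; the inputs z = 3 (right-tailed example) and z = 1.5 (two-tailed example, computed as 1 / 0.67) come from the examples above.

```python
from statistics import NormalDist

def p_value(z, tail):
    """p-value of a z-statistic for a right-, left-, or two-tailed test."""
    std_normal = NormalDist()
    if tail == "right":
        return 1 - std_normal.cdf(z)
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'right', 'left', or 'two'")

p_right = p_value(3.0, "right")   # about 0.0013: reject H0 at the 5% level
p_two = p_value(1.5, "two")       # about 0.134: do not reject H0 at the 5% level
print(round(p_right, 4), round(p_two, 4))
```

Comparing the p-value with the significance level leads to the same decision as comparing the test statistic with the critical value.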
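In reaching a statistical decision, we can make two possible errors:
Type I error: rejecting the null hypothesis when it is actually true.
Type II error: failing to reject the null hypothesis when it is actually false.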
The following table shows the possible outcomes of a test.
| Decision | H0 true | H0 false |
|---|---|---|
| Do not reject H0 | Correct decision | Type II error |
| Reject H0 (accept Ha) | Type I error | Correct decision |
The significance level “α” is the probability of type I error.
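A small simulation can illustrate why α is the probability of a Type I error. The sketch below is an assumption-laden illustration, not part of the original material: it assumes returns are normally distributed, sets the true mean exactly equal to µ0 = 2 so that the null hypothesis holds, and counts how often a right-tailed test at the 5% level rejects anyway. The function name, trial count, and seed are arbitrary choices.

```python
import math
import random
from statistics import NormalDist, fmean

def simulated_type_i_rate(mu0=2.0, sigma=4.0, n=36, alpha=0.05,
                          trials=5000, seed=0):
    """Share of simulated samples that wrongly reject a true H0."""
    rng = random.Random(seed)
    critical_value = NormalDist().inv_cdf(1 - alpha)
    standard_error = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        # Assumption: normally distributed returns with true mean mu0 (H0 holds)
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / standard_error
        if z > critical_value:
            rejections += 1   # a Type I error: H0 is true but was rejected
    return rejections / trials

rate = simulated_type_i_rate()
print(rate)
```

The printed rate should land close to 0.05, with some random variation from the finite number of trials.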
Power of a test
The power of a test is the probability of correctly rejecting the null (rejecting the null when it is false). It is expressed as:
Power of a test = 1 – P (Type II error)
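Power can only be computed once a specific true value of the population mean is assumed. The sketch below does this for the right-tailed z-test; the true mean of 3 is a hypothetical value chosen for illustration and does not appear in the examples above, and `power_right_tailed` is an illustrative name. It relies on the fact that the test fails to reject when the z-statistic is at or below the critical value, which under the assumed true mean has probability Φ(critical value - (true mean - µ0) / standard error).

```python
import math
from statistics import NormalDist

def power_right_tailed(true_mean, mu0, sigma, n, alpha=0.05):
    """Power = 1 - P(Type II error) for a right-tailed z-test."""
    std_normal = NormalDist()
    standard_error = sigma / math.sqrt(n)
    critical_value = std_normal.inv_cdf(1 - alpha)
    # How far the true mean sits from mu0, in standard-error units
    shift = (true_mean - mu0) / standard_error
    # Type II error: the test statistic stays at or below the critical value
    type_ii_probability = std_normal.cdf(critical_value - shift)
    return 1 - type_ii_probability

# Hypothetical true mean of 3 versus the hypothesized mu0 = 2
power = power_right_tailed(true_mean=3, mu0=2, sigma=4, n=36)
print(round(power, 3))
```

Under these assumptions the power is roughly 0.44, so a true mean of 3 would go undetected more often than not with a sample of 36. Increasing the sample size or the gap between the true and hypothesized means raises the power.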
A statistical decision simply consists of rejecting or not rejecting the null hypothesis. An economic decision, by contrast, takes into consideration all economic issues relevant to the decision, such as transaction costs, risk tolerance, and the impact on the existing portfolio. Sometimes a test may indicate a result that is statistically significant but not economically significant.