Statistics is a powerful tool for analyzing data and drawing conclusions. As investors, we are concerned about the returns on our investments and the distribution of those returns. We need this information to evaluate our investments.
For example, we often look at central tendency. Suppose the stock market has returned 14% per year on average over the last 10 years. The average matters, but so does the dispersion, which tells us how spread out the returns have been. One of the simplest measures of dispersion is the range. Suppose that over the last 10 years annual returns have ranged between -20% and +35%. This spread tells us about the riskiness of our investments.
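As a minimal sketch of these two summary measures, the Python snippet below computes the arithmetic mean and the range of a set of annual returns. The ten return figures are hypothetical, chosen only so that they average 14% and span -20% to +35% as in the example above; they are not actual market data.

```python
from statistics import mean

# Hypothetical annual returns in percent (illustrative only, not market data).
annual_returns = [12, -20, 18, 35, 22, 5, 27, 9, 16, 16]

# Central tendency: the arithmetic mean of the ten returns.
average_return = mean(annual_returns)

# Dispersion: the range is the highest return minus the lowest return.
lowest, highest = min(annual_returns), max(annual_returns)
return_range = highest - lowest

print(f"Mean return: {average_return}%")
print(f"Lowest: {lowest}%, highest: {highest}%, range: {return_range} percentage points")
```

Two return series can share the same mean and still differ widely in range, which is why both measures are reported together.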
In this reading, we will study statistical methods that allow us to summarize return distributions.
Specifically, we will explore four properties of return distributions:
The term statistics can have two broad meanings, one referring to data and the other to a method used to analyze data. Statistical methods include descriptive statistics and inferential statistics.
The focus of this reading is on descriptive statistics. We will cover inferential statistics in a later reading.
A ‘population’ is defined as all members of a specified group. A ‘parameter’ describes the characteristics of a population.
A ‘sample’ is a subset drawn from a population. A ‘sample statistic’ describes the characteristic of a sample.
All stocks listed on a country's exchange are an example of a population. If 30 stocks are selected from among the listed stocks, those 30 stocks form a sample.
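The distinction between a parameter and a sample statistic can be sketched in Python as follows. The population here is hypothetical: 500 randomly generated prices stand in for all stocks listed on an exchange, and the names `population_prices` and `sample_prices` are illustrative assumptions.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the draw is repeatable

# Hypothetical population: prices of all 500 stocks listed on an exchange.
population_prices = [round(random.uniform(10, 100), 2) for _ in range(500)]

# Sample: 30 stocks drawn from the population without replacement.
sample_prices = random.sample(population_prices, k=30)

population_mean = mean(population_prices)  # a parameter (describes the population)
sample_mean = mean(sample_prices)          # a sample statistic (describes the sample)

print(f"Population mean price: {population_mean:.2f}")
print(f"Sample mean price:     {sample_mean:.2f}")
```

The sample mean will generally differ from the population mean, and a different sample of 30 stocks would give a different value.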
The different types of measurement scales are:
A frequency distribution is a tabular display of data summarized into a relatively small number of intervals. To construct a frequency distribution, we can use the following procedure:
Say you are evaluating 100 stocks with prices ranging from 46 to 65, divided into the following four price intervals, each with a width of 5: 46-50, 51-55, 56-60, and 61-65. Assume the numbers of stocks whose prices fall in these intervals are 25, 35, 29, and 11, respectively.
Calculate the cumulative frequency, relative frequency, and the cumulative relative frequency for the stock prices given the set of intervals above.
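|Interval|Absolute Frequency|Cumulative Frequency|Relative Frequency|Cumulative Relative Frequency|
|---|---|---|---|---|
|46-50|25|25|0.25|0.25|
|51-55|35|60|0.35|0.60|
|56-60|29|89|0.29|0.89|
|61-65|11|100|0.11|1.00|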
Absolute frequency: The actual number of observations in a given interval is called the absolute frequency, or simply the frequency. In this example, the absolute frequencies are given.
Cumulative frequency: For an interval, it is calculated as the sum of the absolute frequencies of all intervals lower than and including that interval.
Relative frequency: It is the absolute frequency of each interval divided by the total number of observations.
Cumulative relative frequency: For an interval, it is calculated as the sum of the relative frequencies of all intervals lower than and including that interval.
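The four definitions above can be sketched in a few lines of Python, using the interval counts from the stock price example. The variable names are illustrative assumptions.

```python
from itertools import accumulate

intervals = ["46-50", "51-55", "56-60", "61-65"]
absolute = [25, 35, 29, 11]  # absolute frequencies given in the example

total = sum(absolute)  # total number of observations (100)

# Cumulative frequency: running sum of the absolute frequencies.
cumulative = list(accumulate(absolute))

# Relative frequency: absolute frequency divided by the total.
relative = [f / total for f in absolute]

# Cumulative relative frequency: the running sum of the relative frequencies,
# computed here as cumulative frequency / total to avoid float rounding drift.
cumulative_relative = [c / total for c in cumulative]

print(f"{'Interval':<10}{'Abs':>5}{'Cum':>5}{'Rel':>7}{'CumRel':>8}")
for row in zip(intervals, absolute, cumulative, relative, cumulative_relative):
    print("{:<10}{:>5}{:>5}{:>7.2f}{:>8.2f}".format(*row))
```

The last cumulative frequency always equals the total number of observations, and the last cumulative relative frequency always equals 1, which makes a quick check on the arithmetic.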
A histogram is a bar chart of data that has been grouped together into a frequency distribution. The height of each bar is equal to the absolute frequency of each interval.
The advantage of the visual display is that we can quickly see where most of the observations lie.
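A rough text-based version of such a chart can be sketched in Python, again using the stock price intervals from the example. This sketch draws the bars horizontally, so the length of each bar (rather than its height) equals the absolute frequency. The helper `text_histogram` is a hypothetical name introduced for illustration.

```python
intervals = ["46-50", "51-55", "56-60", "61-65"]
absolute = [25, 35, 29, 11]  # absolute frequencies from the example


def text_histogram(labels, frequencies):
    """Return one line per interval; the number of '#' marks equals the frequency."""
    return [f"{label} | {'#' * freq} {freq}" for label, freq in zip(labels, frequencies)]


bars = text_histogram(intervals, absolute)
print("\n".join(bars))

# The interval with the longest bar is where most of the observations lie.
modal_interval = intervals[absolute.index(max(absolute))]
print("Most observations fall in:", modal_interval)
```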
A frequency polygon plots the midpoint of each interval on the X-axis and the absolute frequency of that interval on the Y-axis. Neighboring points are then connected with straight lines.
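The points of a frequency polygon can be derived as in the following Python sketch, which uses the stock price intervals from the example. It only computes the coordinates and the line segments joining them; drawing them is left to whatever plotting tool is at hand.

```python
# (lower limit, upper limit) of each interval from the example
interval_limits = [(46, 50), (51, 55), (56, 60), (61, 65)]
absolute = [25, 35, 29, 11]  # absolute frequencies from the example

# X-coordinates: the midpoint of each interval.
midpoints = [(lower + upper) / 2 for lower, upper in interval_limits]

# Each point pairs an interval midpoint with that interval's absolute frequency.
polygon_points = list(zip(midpoints, absolute))

# Straight line segments join each point to the next one.
segments = list(zip(polygon_points, polygon_points[1:]))

for start, end in segments:
    print(f"line from {start} to {end}")
```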
Another graphical tool is the cumulative frequency distribution. Such a graph can plot either the cumulative frequency or cumulative relative frequency against the upper interval limit. The cumulative frequency distribution allows us to see how many or what percent of the observations lie below a certain value. The figure below is an example of a cumulative frequency distribution.
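The coordinates behind such a graph can be sketched in Python for the stock price example. Each point pairs an upper interval limit with the cumulative relative frequency at that limit. The helper `share_at_or_below` is a hypothetical name, and in this sketch it is defined only at the upper interval limits themselves.

```python
from itertools import accumulate

upper_limits = [50, 55, 60, 65]  # upper limit of each interval in the example
absolute = [25, 35, 29, 11]      # absolute frequencies from the example

total = sum(absolute)
cumulative = list(accumulate(absolute))
cumulative_relative = [c / total for c in cumulative]

# Points to plot: (upper interval limit, cumulative relative frequency).
curve = list(zip(upper_limits, cumulative_relative))


def share_at_or_below(limit):
    """Look up the share of observations at or below an upper interval limit."""
    return dict(curve)[limit]  # raises KeyError for values that are not upper limits


print(f"{share_at_or_below(55):.0%} of the stocks are priced at 55 or below")
```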