IFT Notes for Level I CFA® Program

LM02 Organizing, Visualizing, and Describing Data

Part 1

1. Introduction

This reading presents tools and techniques for organizing, visualizing and describing data. These tools and techniques can help us convert raw data into useful information for investment analysis.

2. Data Types

Data can be defined as a collection of numbers, characters, words and text – as well as images, audio, and video – in a raw or organized format to represent facts or information.

Data can be classified in three ways:

Numerical versus categorical data
Cross-sectional versus time-series versus panel data
Structured versus unstructured data

2.1 Numerical versus Categorical Data

From a statistical perspective, data can be classified into numerical data and categorical data.

Numerical data: Numerical data (also called quantitative data) are values that represent measured or counted quantities as numbers. Numerical data can be further classified into two types:

Continuous data: Data that can be measured and can take on any numerical value in a specified range of values. For example, the future value (FV) of a sum of money invested today. The FV can take on a range of values depending on the investment period and interest rate.
Discrete data: Data that can take numerical values that result from a counting process. The data is limited to a finite number of values. For example, the frequency of discrete compounding (m). The frequency could be monthly (m = 12), quarterly (m = 4), semi-yearly (m = 2), or yearly (m = 1).

Categorical data: Categorical data (also called qualitative data) are values that describe a quality or characteristic of a group of observations. It can usually take only a limited number of values that are mutually exclusive. Categorical data can be further classified into two types:

Nominal data: Categorical values that cannot be organized in a logical order. For example, classification of publicly listed stocks into different sectors, such as energy, information technology, financials, and health care.
Ordinal data: Categorical values that can be organized in a logical order or ranked. For example, Standard & Poor’s star ratings for mutual funds. One star represents the group of mutual funds with the worst performance. Similarly, groups with two, three, four, and five stars represent groups with increasingly better performance.

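Although the categories represented by ordinal data can be ranked, the numerical differences between the categories are not necessarily equal, so the size of those differences cannot be used to draw inferences. For example, the performance gap between one-star and two-star funds need not match the gap between four-star and five-star funds.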
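The difference between nominal and ordinal data can be sketched in a few lines of Python. This is only an illustration: the fund names and their ratings are invented, and `STAR_ORDER` and `rank` are hypothetical helper names, not part of any library.

```python
# Ordinal data: star ratings have a meaningful order, from worst to best.
STAR_ORDER = ["1 star", "2 stars", "3 stars", "4 stars", "5 stars"]

# Nominal data: sector labels have no inherent order, so a set is enough.
SECTORS = {"energy", "information technology", "financials", "health care"}


def rank(rating: str) -> int:
    """Return the position of a rating in the ordering (1 = worst group)."""
    return STAR_ORDER.index(rating) + 1


# Hypothetical fund ratings, invented for illustration only.
fund_ratings = {"Fund A": "4 stars", "Fund B": "2 stars", "Fund C": "5 stars"}

# Ranking is a valid operation on ordinal data.
best_to_worst = sorted(fund_ratings, key=lambda f: rank(fund_ratings[f]), reverse=True)
print(best_to_worst)  # ['Fund C', 'Fund A', 'Fund B']

# The ranks only encode order. A rank difference of 2 between Fund A and
# Fund B says nothing about how large the performance gap actually is.
```

Sorting the sector labels would only give alphabetical order, which carries no information about the sectors themselves.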

Example: Identifying Data Types

Identify the data type for each of the following items:

Number of coupon payments for a bond
Dividends paid by a stock
Credit ratings for corporate bonds
Hedge fund classification types

Solution:

Based on our above discussion, we can classify these items as follows:

Number of coupon payments for a bond – Discrete data
Dividends paid by a stock – Continuous data
Credit ratings for corporate bonds – Ordinal data
Hedge fund classification types – Nominal data

2.2 Cross-Sectional versus Time-Series versus Panel Data

Based on how data is collected, it can be classified into three types: cross-sectional, time-series, and panel.

Before we describe these data types, we need to understand two terms: ‘variable’ and ‘observation’.

A variable (also called a field, attribute, or feature) is a characteristic or quantity that can be measured, counted, or categorized. A variable is subject to change. For example, the returns on Microsoft stock in a given quarter can be considered a variable.
An observation is a value of a specific variable collected at a point in time or over a specified period of time. For example, if the returns on Microsoft stock in 2019 Q1 were 3%, then 3% is an observation.

Time-series data: Time-series data consists of observations for a single subject taken at specific and equally spaced intervals of time. For example, the quarterly returns of Microsoft stock from 2019 to 2020.

Cross-sectional data: Cross-sectional data consists of observations for multiple subjects taken at a specific point in time. For example, the quarterly returns in 2019 Q1 of a group of similar stocks – Microsoft, Oracle, and HP.

Panel data: Panel data is a combination of time-series and cross-sectional data. It consists of observations through time on one or more variables for multiple subjects. It is generally presented as a table. For example, the quarterly returns of Microsoft, Oracle, and HP from 2019 to 2020.
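One way to see how the three data types relate is to store a small panel and slice it in both directions. The sketch below assumes made-up return figures: only Microsoft's 3% return for 2019 Q1 comes from the text above, and the function names `time_series` and `cross_section` are hypothetical.

```python
# Panel data: quarterly returns (in %) keyed by (subject, period).
# Only Microsoft's 2019 Q1 value is taken from the text; all other
# numbers are invented for illustration.
panel = {
    ("Microsoft", "2019 Q1"): 3.0,
    ("Microsoft", "2019 Q2"): 1.5,
    ("Oracle", "2019 Q1"): 2.0,
    ("Oracle", "2019 Q2"): -0.5,
    ("HP", "2019 Q1"): -1.0,
    ("HP", "2019 Q2"): 0.8,
}


def time_series(data, subject):
    """One subject observed across periods."""
    return {period: value for (s, period), value in data.items() if s == subject}


def cross_section(data, period):
    """Multiple subjects observed at one point in time."""
    return {s: value for (s, p), value in data.items() if p == period}


msft = time_series(panel, "Microsoft")
q1 = cross_section(panel, "2019 Q1")
print(msft)  # {'2019 Q1': 3.0, '2019 Q2': 1.5}
print(q1)    # {'Microsoft': 3.0, 'Oracle': 2.0, 'HP': -1.0}
```

Fixing the subject yields a time series, fixing the period yields a cross-section, and the full dictionary is the panel.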

2.3 Structured versus Unstructured Data

Based on whether data is available in a highly organized form or not, it can be classified into structured and unstructured data.

Structured data: Structured data is highly organized in a pre-defined manner, usually with repeating patterns. It is easier to enter, store, query and analyze, without much manual processing. Examples:

Market data: Daily closing stock prices and trading volumes.
Fundamental data: Data contained in financial statements, such as earnings per share.
Analytical data: Data derived from analytics, such as cash flow projections.

Unstructured data: Unstructured data does not follow any conventionally organized forms. It is typically alternative data and is usually collected from unconventional sources. Based on the source, unstructured data can be classified into:

Produced by individuals (e.g., via social media posts or web searches);
Generated by business processes (e.g., via credit card transactions or corporate regulatory filings); and
Generated by sensors (e.g., via satellite imagery or foot traffic tracked by mobile devices).

2.4 Data Summarization

Raw data typically cannot be used by humans or computers directly to extract information and insights. The data usually has to be organized first. In the following sections we will discuss various techniques for organizing and summarizing data.

3. Organizing Data for Quantitative Analysis

Raw data is typically organized into either a one-dimensional array or a two-dimensional rectangular array (also called a data table) for quantitative analysis.

A one-dimensional array is suitable for representing a single variable. For example, the closing price for the first 10 trading days for a company after it went public.
A two-dimensional array consists of columns and rows to hold multiple variables and multiple observations, respectively. For example, quarterly revenue, EPS, and DPS for a company for the past two years.
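As a rough sketch, the two layouts might look like this in plain Python. All figures are invented, only four quarters are shown for brevity, and `column` is a hypothetical helper.

```python
# One-dimensional array: a single variable (closing price) observed on
# the first 10 trading days. Values are invented for illustration.
closing_prices = [20.1, 20.4, 19.8, 20.0, 20.9, 21.3, 21.0, 21.6, 22.0, 21.8]

# Two-dimensional array (data table): columns hold variables, rows hold
# observations. Values are invented for illustration.
columns = ["quarter", "revenue", "eps", "dps"]
rows = [
    ["2019 Q1", 100.0, 1.10, 0.25],
    ["2019 Q2", 104.0, 1.15, 0.25],
    ["2019 Q3", 98.0, 1.05, 0.25],
    ["2019 Q4", 110.0, 1.30, 0.30],
]


def column(name):
    """Extract one variable (a column) from the data table."""
    idx = columns.index(name)
    return [row[idx] for row in rows]


print(len(closing_prices))  # 10
print(column("eps"))        # [1.1, 1.15, 1.05, 1.3]
```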

4. Summarizing Data Using Frequency Distributions

A frequency distribution (also called a one-way table) is a tabular display of data summarized into a relatively small number of intervals.

Frequency distributions for categorical variables: The steps for constructing a frequency distribution for a categorical variable are:

Count the number of observations for each unique value of the variable.
Construct a table listing each unique value and the corresponding counts.
Sort the records by number of counts in descending or ascending order.

A sample frequency distribution of 200 companies across four sectors is presented below:

Sector	Absolute Frequency	Relative Frequency
Technology	22	11%
Healthcare	50	25%
Financial	58	29%
Industrial	70	35%
Total	200	100%

Points to note:

Absolute frequency: The actual number of observations in a given interval is called the absolute frequency.
Relative frequency: It is the absolute frequency of each interval divided by the total number of observations.
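The three construction steps for a categorical variable can be sketched with Python's standard `collections.Counter`. The raw list of 200 labels is rebuilt here from the counts in the table above, since the underlying observations are not given.

```python
from collections import Counter

# Rebuild 200 raw sector labels that match the counts in the table above.
observations = (
    ["Technology"] * 22
    + ["Healthcare"] * 50
    + ["Financial"] * 58
    + ["Industrial"] * 70
)

# Step 1: count the observations for each unique value.
counts = Counter(observations)
total = sum(counts.values())

# Steps 2 and 3: list each value with its count, sorted by count
# (ascending here, to match the table).
table = sorted(counts.items(), key=lambda item: item[1])

# Relative frequency = absolute frequency / total number of observations.
relative = {sector: absolute / total for sector, absolute in table}

for sector, absolute in table:
    print(f"{sector:<12}{absolute:>5}{relative[sector]:>8.0%}")
```

Passing `reverse=True` to `sorted` would give the descending order that step 3 also allows.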

Frequency distributions for numerical variables: The steps for constructing a frequency distribution for numerical variables are:

Sort the data in ascending order.
Calculate the range of data.
Decide on the number of bins (k).
Determine bin width.
Determine bins.
Determine the number of observations in each bin.
Construct a table of the bins listed from smallest to largest.

A sample frequency distribution for 100 stocks with prices ranging between 45.00 and 65.00 is presented below.

Stock Price (Min – Max)	Absolute Frequency	Cumulative Frequency	Relative Frequency	Cumulative Relative Frequency
45.00 – 50.00	25	25	0.25	0.25
50.00 – 55.00	35	60	0.35	0.60
55.00 – 60.00	29	89	0.29	0.89
60.00 – 65.00	11	100	0.11	1.00

Points to note:

Range of the data = Maximum value – Minimum value = 65.00 – 45.00 = 20
We decided to have 4 bins
Bin width = Range / Number of bins = 20/ 4 = 5
The end points of the bins are found by repeatedly adding the bin width, starting from the minimum value: 45.00 + 5.00 = 50.00, 50.00 + 5.00 = 55.00, 55.00 + 5.00 = 60.00, 60.00 + 5.00 = 65.00. Thus, we get the bins listed in the table above.
The lower end point of each bin is included in the bin, whereas the upper end point is excluded. For example, the observation 50.00 will fall in the 50.00 – 55.00 bin and not in the 45.00 – 50.00 bin. The last bin is the exception: it also includes its upper end point, so the observation 65.00 will fall in the 60.00 – 65.00 bin.
Cumulative frequency: For an interval, it is calculated as the sum of the absolute frequencies of all intervals lower than and including that interval.
Cumulative relative frequency: For an interval, it is calculated as the sum of the relative frequencies of all intervals lower than and including that interval.
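The steps and the bin-boundary rule can be put together in a short sketch. The function name `frequency_distribution` and the ten prices are hypothetical, chosen so that the minimum, maximum, and bin width match the example above; the function also assumes the values are not all identical, so the bin width is non-zero.

```python
def frequency_distribution(values, k):
    """Build a frequency distribution with k equal-width bins.

    Returns one tuple per bin: (lower, upper, absolute frequency,
    cumulative frequency, relative frequency, cumulative relative frequency).
    """
    data = sorted(values)              # sort in ascending order
    lo, hi = data[0], data[-1]
    width = (hi - lo) / k              # bin width = range / number of bins
    counts = [0] * k
    for x in data:
        # Lower end point included, upper end point excluded, except that
        # the last bin also takes the maximum value.
        i = min(int((x - lo) / width), k - 1)
        counts[i] += 1
    n = len(data)
    rows, cumulative = [], 0
    for i, absolute in enumerate(counts):
        cumulative += absolute
        rows.append((lo + i * width, lo + (i + 1) * width,
                     absolute, cumulative, absolute / n, cumulative / n))
    return rows


# Hypothetical stock prices, invented for illustration.
prices = [45.0, 47.5, 50.0, 52.0, 54.9, 55.0, 58.0, 60.0, 62.5, 65.0]
rows = frequency_distribution(prices, k=4)
for lower, upper, absolute, cum, rel, cum_rel in rows:
    print(f"{lower:.2f} - {upper:.2f}  {absolute:>2} {cum:>3} {rel:.2f} {cum_rel:.2f}")
```

Note that 50.00 lands in the second bin and 65.00 in the last bin, as described in the points above. With real data, floating-point rounding near bin edges may need extra care.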

Instructor’s Note: On the exam you are unlikely to be asked to construct a frequency distribution. However, you may be tested on the process and the terminology.

Example:

The actual number of observations in a given interval is called the:

A. absolute frequency.
B. relative frequency.
C. cumulative relative frequency.

Solution:

A is correct. The actual number of observations in a given interval is known as the absolute frequency. Relative frequency is the absolute frequency of each interval divided by the total number of observations. Cumulative relative frequency is the running total of the relative frequencies.

Example:

Which of the following is most likely to be accurate?

A. An observation can fall in more than one interval.
B. The data is sorted in descending order for the construction of a frequency distribution.
C. The cumulative relative frequency tells the observer the fraction of the observations that are less than the upper limit of each interval.

Solution:

C is correct. The cumulative relative frequency tells the observer the fraction of the observations that are less than the upper limit of each interval. An observation cannot fall in more than one interval. The data is sorted in ascending order for the construction of a frequency distribution.
