Structured data
For structured data, data preparation and wrangling involve data cleansing and data preprocessing.
Data cleansing involves resolving:
Data preprocessing involves performing the following transformations:
Scaling adjusts the range of a feature by shifting and rescaling its values. Two common methods used for scaling are:
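As an illustration, the sketch below assumes the two methods meant here are min-max normalization (rescaling to the range 0 to 1) and standardization (centering at 0 with unit standard deviation); the source does not name them at this point. The function names and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values to the range [0, 1]: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Center values at 0 and scale by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature has zero deviation; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical feature values, used only for illustration.
incomes = [20.0, 30.0, 40.0, 50.0, 60.0]
normalized = min_max_normalize(incomes)
standardized = standardize(incomes)
```

Normalization depends only on the minimum and maximum, so a single outlier can compress all other values; standardization uses the mean and standard deviation and is less affected by outliers.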
Unstructured data
For unstructured data, data preparation and wrangling involve a set of text-specific cleansing and preprocessing tasks.
Text cleansing involves removing the following unnecessary elements from the raw text:
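A minimal sketch of text cleansing, assuming the unnecessary elements are HTML tags, numbers, punctuation, and extra white space (the source does not list them at this point). The function name `cleanse_text` and the sample sentence are hypothetical.

```python
import re
import string


def cleanse_text(raw):
    """Strip HTML tags, digits, punctuation, and redundant white space."""
    # Remove HTML tags first, while the angle brackets are still intact.
    text = re.sub(r"<[^>]+>", " ", raw)
    # Remove runs of digits.
    text = re.sub(r"\d+", " ", text)
    # Replace every ASCII punctuation character with a space.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    text = text.translate(table)
    # Collapse repeated white space and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical raw text, used only for illustration.
raw = "<p>Revenue rose 12% this quarter,   beating estimates!</p>"
cleaned = cleanse_text(raw)
```

The order of the steps matters in this sketch: stripping punctuation before tags would remove the angle brackets and leave the tag names behind as ordinary words.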
Text preprocessing involves performing the following transformations:
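A minimal sketch of text preprocessing, assuming the transformations include lowercasing, tokenization, stop-word removal, and building a bag of words (the source does not list them at this point). The stop-word set, function names, and sample sentence are hypothetical and deliberately tiny.

```python
from collections import Counter

# Hypothetical stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is"}


def preprocess(text):
    """Lowercase the text, split it on white space, and drop stop words."""
    tokens = text.lower().split()
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(tokens):
    """Count how often each token occurs, ignoring word order."""
    return Counter(tokens)


# Hypothetical cleansed text, used only for illustration.
tokens = preprocess("The profit rose and the profit outlook is strong")
bow = bag_of_words(tokens)
```

Here `bow` records that "profit" occurs twice and every other remaining token once; word order is discarded, which is the defining trade-off of a bag-of-words representation.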