In penalized regression the regression coefficients are chosen to minimize sum of squared residuals plus a penalty term that increases with the number of included variables. So, a feature must make a sufficient contribution to the model fit to offset the penalty from including it. Because of this penalty, the model remains parsimonious and only the most important variables for explaining Y remain in the model. A popular type of penalized regression is LASSO.
Support vector machine (SVM) is a linear classifier that aims to seek the optimal hyperplane – the one that separates the two sets of data points by the maximum margin.
K-nearest neighbor (KNN) classifies a new observation by finding similarities (“nearness”) between it and its k-nearest neighbors in the existing data set.
Classification and regression tree (CART) can be applied to predict a categorical variable or a continuous target variable. A binary CART tree is a combination of an initial root node, decision nodes, and terminal nodes. The root node and each decision node represent a single feature (f) and a cutoff value (c) for that feature. The CART algorithm iteratively partitions the data into sub-groups until terminal nodes are formed that contain the predicted label.
A random forest classifier is a collection of many decision trees generated by a bagging method or by randomly reducing the number of features available during training.
In ensemble learning, we combine predictions from a collection of models. This method typically produces more accurate and more stable predictions than the best single model.