Table 3 Advantages and disadvantages of classification methods chosen for the pipeline configuration

From: Pipeline design to identify key features and classify the chemotherapy response on lung cancer patients using large-scale genetic data

| Classification method | Advantages | Disadvantages |
|---|---|---|
| Linear SVM | By introducing a kernel, SVMs gain flexibility in the form of the threshold separating the classes: it need not be linear, or even have the same functional form for all of the data, since the kernel is non-parametric and operates locally. | Lack of transparency: the resulting model is hard to interpret. |
| | Since the kernel implicitly contains a non-linear transformation, no assumptions about the functional form of the transformation that makes the data linearly separable are necessary. | The SVM moves the over-fitting problem from optimizing the parameters to model selection. |
| | SVMs give good out-of-sample generalization if the parameters (e.g. the regularization constant C) are chosen appropriately. This means that, by choosing a suitable generalization grade, SVMs can be robust even when the training sample has some bias. | |
| | SVMs deliver a unique solution, since the optimization problem is convex. | |
| RF | The final classification is decided by voting, which decreases the variance of the model without increasing the bias. | It is hard to visualize the model or understand why it made a given prediction, compared with a single decision tree. |
| | At each node of each tree, the best split is sought among a random subset of the features, with a different subset drawn at every node. This prevents the strongest features from being selected in every tree, which would make the trees highly correlated with each other. | A large number of trees may make the algorithm slow for real-time prediction. |
| | It is fast even on large data sets. | RFs have been observed to over-fit on some data sets with noisy classification/regression tasks. |
| | It estimates which variables are important for the classification. | |
| KNN | The cost of the learning phase is zero (lazy learning). | At each prediction, the algorithm must compute the distance to, and sort, all training examples, which is slow when the training set is large. |
| | No assumptions about the concepts to be learned need to be made. | Nothing is learned from the training data, so the algorithm may generalize poorly and is not robust to noisy data. |
| | Complex concepts can be learned by local approximation using simple procedures. | Changing k can change the predicted class label. |
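The kernel advantage listed for the SVM can be illustrated with a minimal sketch. The code below is not an SVM (no margin maximization); it uses the simpler kernel perceptron, which shares the relevant property: by working in the dual with an RBF kernel, it separates XOR-labelled data that no linear threshold in the input space can separate, without ever specifying the non-linear transformation explicitly. The `rbf` function and `gamma=1.0` are illustrative choices, not taken from the paper.

```python
import math

def rbf(a, b, gamma=1.0):
    """RBF kernel: an implicit non-linear feature map, evaluated locally."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

# XOR labels: not linearly separable in the original input space.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [-1, 1, 1, -1]

def train_kernel_perceptron(X, y, kernel=rbf, epochs=10):
    """Dual learner: alpha[i] counts the mistakes made on example i."""
    alpha = [0] * len(X)
    for _ in range(epochs):
        for i, (xi, yi) in enumerate(zip(X, y)):
            f = sum(a * yj * kernel(xj, xi)
                    for a, xj, yj in zip(alpha, X, y))
            if yi * f <= 0:  # misclassified: give this example more weight
                alpha[i] += 1
    return alpha

def predict(alpha, X, y, x, kernel=rbf):
    f = sum(a * yj * kernel(xj, x) for a, xj, yj in zip(alpha, X, y))
    return 1 if f > 0 else -1

alpha = train_kernel_perceptron(X, y)
preds = [predict(alpha, X, y, x) for x in X]  # matches y exactly
```

A plain linear perceptron run on the same four points never converges, which is the point the table's first SVM advantage is making.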
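The RF advantage "voting decreases the variance without increasing the bias" can be made concrete with a simple probability computation. Under the idealized assumption that the trees err independently (real trees are correlated, which is exactly why RF draws a random feature subset at each node), the majority vote of many weak classifiers is far more accurate than any single one:

```python
from math import comb

def majority_vote_accuracy(n_trees, p):
    """Probability that a majority vote of n_trees independent classifiers,
    each individually correct with probability p, is correct (n_trees odd)."""
    need = n_trees // 2 + 1  # votes required for a majority
    return sum(comb(n_trees, m) * p**m * (1 - p)**(n_trees - m)
               for m in range(need, n_trees + 1))

single = 0.7                                   # accuracy of one tree
ensemble = majority_vote_accuracy(11, single)  # ~0.92 for 11 trees
```

The more the trees' errors are correlated, the smaller this gain becomes, which motivates the per-node feature subsampling described in the table.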
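Two of the KNN rows (zero training cost, and the sensitivity of the prediction to k) can be seen in a from-scratch sketch. The toy 1-D data set below is invented for illustration; all the work happens at query time, and the same query flips class when k goes from 1 to 3:

```python
from collections import Counter

def knn_predict(train, query, k):
    """Classify `query` by majority label among its k nearest neighbours.
    There is no training phase: every prediction scans the full data set."""
    neighbours = sorted(train, key=lambda p: abs(p[0] - query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Toy 1-D training set: two 'A' points and one 'B' point.
train = [(1.0, 'A'), (2.0, 'A'), (3.0, 'B')]

knn_predict(train, 3.2, k=1)  # 'B': the single nearest point decides
knn_predict(train, 3.2, k=3)  # 'A': the two 'A' votes outvote 'B'
```

The sort over all training points at each call is also the first KNN disadvantage in the table: prediction cost grows with the size of the training set.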