Feature selection to improve accuracy and reduce training time
When working on a problem, you want to get the most out of the data at your disposal, and you want the best accuracy you can obtain.
Usually, the largest wins come from better understanding the problem you are solving. This is why we stress spending a lot of time up front defining your problem, analysing the data, and preparing datasets for your models.
A critical part of data preparation is creating transformed versions of the datasets, such as rescaled attribute values and attributes decomposed into their constituent parts, all with the intent of exposing more useful structure to the modelling algorithms.
An important suite of methods to use when preparing a dataset is automatic feature selection. In this article you will discover feature selection, the benefits of simple feature selection, and how to make best use of these algorithms in Weka on your dataset.
Not all attributes are equal
Whether you select and collect sample data yourself or it is provided to you by domain experts, the selection of attributes is critically important. It can mean the difference between successfully and meaningfully modelling the problem and not.
Including redundant attributes can mislead modelling algorithms. Instance-based methods such as k-nearest neighbours use small neighbourhoods in the attribute space to determine classification and regression predictions. These predictions can be greatly skewed by redundant attributes.
Keeping irrelevant attributes in your dataset can result in overfitting. Decision tree algorithms like C4.5 seek to make optimal splits on attribute values. Attributes that are more correlated with the prediction are split on first. Deeper in the tree, less relevant and irrelevant attributes are used to make prediction decisions that may only be beneficial by chance in the training dataset. This overfitting of the training data can negatively affect the modelling power of the method and cripple predictive accuracy.
It is important to remove redundant and irrelevant attributes from your dataset before evaluating algorithms. This task should be tackled in the Prepare Data step of the applied machine learning process.
Feature selection (or attribute selection) is a process by which you automatically search for the best subset of attributes in your dataset. The notion of “best” is relative to the problem you are trying to solve, but typically means highest accuracy.
A useful way to think about the problem of selecting attributes is as a state-space search. The search space is discrete and consists of all possible combinations of attributes you could choose from the dataset. The objective is to navigate the search space and locate the best, or a good-enough, combination that improves performance over selecting all attributes.
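To make the search framing concrete, here is a minimal Python sketch (not part of Weka; the attribute names and the toy scoring function are hypothetical) that exhaustively enumerates every attribute subset and keeps the highest-scoring one:

```python
from itertools import combinations

def best_subset(attributes, score):
    """Exhaustively search every non-empty attribute subset and
    return the highest-scoring one (feasible only for small n)."""
    best, best_score = None, float("-inf")
    for k in range(1, len(attributes) + 1):
        for subset in combinations(attributes, k):
            s = score(subset)
            if s > best_score:
                best, best_score = tuple(sorted(subset)), s
    return best, best_score

# Hypothetical scoring function: pretend only 'age' and 'income' carry
# signal, and every extra attribute costs a small penalty.
informative = {"age", "income"}
score = lambda subset: len(informative & set(subset)) - 0.1 * len(subset)

print(best_subset(["age", "income", "zip_code", "row_id"], score))
# -> (('age', 'income'), 1.8)
```

With n attributes there are 2^n − 1 candidate subsets, which is why exhaustive search is only a baseline and heuristic search methods are used in practice.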
Three key benefits of performing feature selection on your data are:
- Reduces overfitting: Less redundant data means less opportunity to make decisions based on noise.
- Improves accuracy: Less misleading data means modelling accuracy improves.
- Reduces training time: Less data means algorithms train faster.
Attribute Selection in Weka
Weka provides an attribute selection tool. The process is split into two parts:
- Attribute evaluator: The method by which attribute subsets are evaluated.
- Search method: The method by which the space of possible subsets is searched.
The attribute evaluator is the method by which a subset of attributes is assessed. For example, subsets might be evaluated by building a model and measuring its accuracy.
Some examples of attribute evaluation methods are:
- CfsSubsetEval: Values subsets that correlate highly with the class value and have low correlation with one another.
- ClassifierSubsetEval: Evaluates subsets using a predictive algorithm and a separate dataset that you specify.
- WrapperSubsetEval: Evaluates subsets using a classifier that you specify and n-fold cross-validation.
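The idea behind CfsSubsetEval can be made concrete with the standard CFS merit formula, merit = k·r̄cf / √(k + k(k−1)·r̄ff), which rewards high average feature-class correlation (r̄cf) and penalises high average feature-feature correlation (r̄ff). Below is a rough Python sketch of that formula; the helper names are my own, and Weka's implementation measures correlation differently and differs in detail:

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def cfs_merit(features, target):
    """CFS-style merit of a subset: rewards strong feature-class
    correlation, penalises correlation among the features themselves."""
    k = len(features)
    r_cf = sum(abs(pearson(f, target)) for f in features) / k
    if k == 1:
        r_ff = 0.0
    else:
        pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
        r_ff = sum(abs(pearson(features[i], features[j]))
                   for i, j in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)

target = [0, 1, 1, 2, 3, 3]
x1 = [0, 1, 1, 2, 3, 3]   # perfectly correlated with the class
x2 = [0, 1, 1, 2, 3, 4]   # nearly a copy of x1 -> redundant
x3 = [1, 3, 0, 3, 1, 2]   # weakly related to the class

print(cfs_merit([x1], target))      # ~1.0: a single perfect predictor
print(cfs_merit([x1, x2], target))  # lower: x2 is redundant
print(cfs_merit([x1, x3], target))  # lower still: x3 adds little signal
```

Note how adding either a redundant or an irrelevant attribute lowers the merit, which is exactly the behaviour described above.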
The search method is the structured way in which the space of possible attribute subsets is navigated based on the subset evaluation. Baseline methods include random search and exhaustive search, although graph search algorithms such as best-first search are common.
Some examples of search methods are:
- Exhaustive: Tests all combinations of attributes.
- BestFirst: Uses a best-first search strategy to navigate attribute subsets.
- GreedyStepWise: Uses a forward (additive) or backward (subtractive) step-wise strategy to navigate attribute subsets.
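The forward variant of greedy step-wise search can be sketched in a few lines of Python. This is a minimal illustration with a pluggable scoring function, not Weka's GreedyStepWise implementation, and the toy score values are made up:

```python
def greedy_forward(attributes, score):
    """Forward greedy step-wise search: repeatedly add the attribute
    that most improves the subset score; stop when no addition helps."""
    selected, current = [], score([])
    remaining = list(attributes)
    while remaining:
        best_score, best_attr = max((score(selected + [a]), a)
                                    for a in remaining)
        if best_score <= current:
            break  # no single addition improves the score
        selected.append(best_attr)
        remaining.remove(best_attr)
        current = best_score
    return selected, current

# Hypothetical score: 'age' and 'income' add value, anything else costs.
useful = {"age": 0.5, "income": 0.25}
score = lambda subset: sum(useful.get(a, -0.125) for a in subset)

print(greedy_forward(["zip_code", "age", "income", "row_id"], score))
# -> (['age', 'income'], 0.75)
```

Greedy search evaluates only O(n²) subsets instead of 2^n, at the cost of possibly missing the global optimum.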
How to use attribute selection in Weka
In this section, we want to share with you three clever ways of using attribute selection in Weka.
1. Explore attribute selection
When you are just starting out with attribute selection, we recommend playing with a few of the methods in the Weka Explorer. Load your dataset and click the “Select attributes” tab. Try different Attribute Evaluators and Search Methods on your dataset and review the results in the output window. The idea is to get a feel for, and build an intuition about, 1) how many and 2) which attributes are selected for your problem. You can use this information going forward into either or both of the next steps.
2. Prepare data with attribute selection
The next step is to use attribute selection as part of your data preparation. There is a filter you can apply while pre-processing your dataset that will run an attribute selection scheme and then trim your dataset down to just the selected attributes. The filter is called “AttributeSelection”, under the Unsupervised Attribute Filters.
You can then save the dataset for use in experiments when spot-checking algorithms.
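The effect of the filter can be illustrated with a small sketch (plain Python, not Weka; ARFF handling is omitted and the data is made up): once a selection scheme has chosen which attributes to keep, every row is projected down to just those columns plus the class.

```python
def trim_dataset(header, rows, keep):
    """Project a dataset down to the selected attributes, mimicking
    what an attribute selection filter does after its search finishes."""
    idx = [header.index(name) for name in keep]
    return [header[i] for i in idx], [[row[i] for i in idx] for row in rows]

header = ["age", "income", "zip_code", "class"]
rows = [[25, 40000, 90210, "no"],
        [52, 95000, 10001, "yes"]]

# Suppose the selection scheme kept 'age' and 'income'; the class stays.
new_header, new_rows = trim_dataset(header, rows, ["age", "income", "class"])
print(new_header)  # -> ['age', 'income', 'class']
print(new_rows)    # -> [[25, 40000, 'no'], [52, 95000, 'yes']]
```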
3. Run algorithms with attribute selection
Lastly, there is one more clever way to incorporate attribute selection: combine it with the algorithm directly.
There is a meta-algorithm you can run and include in experiments that selects attributes before running the wrapped algorithm. The algorithm is called “AttributeSelectedClassifier”, under the “meta” group of algorithms. You can configure this algorithm to use your algorithm of choice, as well as the Attribute Evaluator and Search Method of your choice.
You can include multiple versions of this meta-algorithm, configured with different variations of the attribute selection scheme, and see how they compare to one another.
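The wrapping idea can be sketched in a few lines of Python. This is a toy stand-in for the concept, not Weka's AttributeSelectedClassifier; the selector and model used here are hypothetical placeholders:

```python
class SelectThenClassify:
    """Toy meta-classifier: run attribute selection on the training data,
    train the wrapped model on the selected attributes only, and apply
    the same projection at prediction time."""

    def __init__(self, selector, model):
        self.selector = selector  # callable: (X, y) -> column indices
        self.model = model        # any object with fit(X, y) / predict(X)

    def _project(self, X):
        return [[row[i] for i in self.indices] for row in X]

    def fit(self, X, y):
        self.indices = self.selector(X, y)
        self.model.fit(self._project(X), y)
        return self

    def predict(self, X):
        return self.model.predict(self._project(X))

# Toy pieces for demonstration only.
class MajorityClass:
    def fit(self, X, y):
        self.label = max(set(y), key=y.count)
    def predict(self, X):
        return [self.label for _ in X]

keep_first_two = lambda X, y: [0, 1]  # stand-in for a real selection scheme

clf = SelectThenClassify(keep_first_two, MajorityClass())
clf.fit([[1, 2, 3], [4, 5, 6], [7, 8, 9]], ["a", "a", "b"])
print(clf.predict([[0, 0, 0]]))  # -> ['a']
```

Because selection happens inside fit, it is re-run on each training fold during cross-validation, which keeps the test data out of the selection decision.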
In this article by AICorespot, you discovered feature selection as a collection of methods that can improve model accuracy, decrease model training time, and reduce overfitting.
You also discovered that feature selection methods are built into Weka, and you learned three clever ways of applying them to your dataset in Weka: exploring, preparing data, and coupling selection with your algorithm in a meta-classifier.
Wikipedia’s entry on Feature Selection has some good information.
If you are looking for the next step, we recommend the book “Feature Extraction: Foundations and Applications”. It is a collection of articles by academics covering a range of topics on and related to feature selection. It is expensive, but worth it for the difference it can make in solving your problem.
Another book you might find useful (and less expensive on Kindle) is “Computational Methods of Feature Selection”.