Model Prediction Accuracy Versus Interpretation in Machine Learning
In the book Applied Predictive Modeling, Kuhn and Johnson discuss early on the trade-off between model prediction accuracy and model interpretation.
For a given problem, it is important to have a clear idea of which should be prioritized, accuracy or interpretability, so that this trade-off can be made explicitly rather than implicitly.
In this blog post you will discover and consider this important trade-off.
Accuracy and Interpretability
Model performance is estimated in terms of its accuracy in predicting the occurrence of an event on unseen data. A more accurate model is seen as a more valuable model.
Model interpretability provides insight into the relationship between the inputs and the output. An interpretable model can answer questions about why the independent features predict the dependent attribute.
The issue is that as model accuracy increases, so does model complexity, at the cost of interpretability.
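To make this concrete, here is a minimal sketch of the trade-off, assuming scikit-learn is available; the bundled breast cancer dataset and the two models are illustrative choices, not ones from the book:

```python
# A minimal sketch of the accuracy/complexity trade-off (scikit-learn assumed).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# An interpretable model: one coefficient per input feature.
simple = LogisticRegression(max_iter=5000).fit(X_train, y_train)

# A more complex model: hundreds of trees, with no single set of
# coefficients to read off.
ensemble = RandomForestClassifier(
    n_estimators=200, random_state=42).fit(X_train, y_train)

print("logistic regression accuracy:", simple.score(X_test, y_test))
print("random forest accuracy:      ", ensemble.score(X_test, y_test))
```

The forest will often score at least as well as the linear model, but its decision logic is spread across hundreds of trees rather than a short list of coefficients.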
Model Complexity
A model with higher accuracy can mean more opportunities, benefits, time, or money for a company, and so predictive accuracy is optimized.
The optimization of accuracy leads to further increases in the complexity of models, in the form of additional model parameters (and the resources needed to tune those parameters).
Unfortunately, the predictive models that are most powerful are usually the least interpretable.
A model with fewer parameters is easier to interpret. This is intuitive. A linear regression model has one coefficient per input feature plus an intercept term: you can look at each term and understand how it contributes to the output. Moving to logistic regression gives more power in terms of the underlying relationships that can be modelled, at the expense of a function transform of the output that must now also be understood along with the coefficients.
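As a minimal sketch of reading a linear model term by term (scikit-learn and its diabetes dataset assumed as illustrative choices):

```python
# Fit a linear regression and inspect its terms (scikit-learn assumed).
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

data = load_diabetes()
model = LinearRegression().fit(data.data, data.target)

# One coefficient per input feature plus an intercept: each term can be
# read directly as that feature's contribution to the output.
for name, coef in zip(data.feature_names, model.coef_):
    print(f"{name:>4}: {coef:+.2f}")
print(f"intercept: {model.intercept_:+.2f}")
```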
A decision tree (of modest size) may be understandable; a bagged decision tree requires a different perspective to interpret why an event is predicted to occur. Going further, an optimized blend of multiple models into a single prediction may be beyond meaningful or timely interpretation.
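The contrast can be seen in a short sketch (again assuming scikit-learn; the iris dataset and tree sizes are illustrative): a single shallow tree prints as a handful of rules, while a bagged ensemble is a vote over dozens of trees with no single rule set to read.

```python
# Contrast a single readable tree with a bagged ensemble (scikit-learn assumed).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.ensemble import BaggingClassifier

X, y = load_iris(return_X_y=True)

# A modest tree: the full decision logic fits on a few lines.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree))

# Fifty bagged trees (the default base estimator is a decision tree):
# each member is readable on its own, but the combined prediction is a
# vote with no single interpretable structure.
bagged = BaggingClassifier(n_estimators=50, random_state=0).fit(X, y)
print("number of trees to inspect:", len(bagged.estimators_))
```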
Accuracy Trumps Interpretability
In their book, Kuhn and Johnson come down on the side of model accuracy at the expense of interpretation.
They comment:
“As long as complex models are properly validated, it may be improper to use a model that is built for interpretation rather than predictive performance.”
Interpretation is secondary to model accuracy, and they mention examples like classifying email as spam or non-spam and the valuation of a house as problems where this is the case. Medical examples are mentioned twice, and in both cases they are used to defend the absolute requirement and desirability of accuracy over interpretability, as long as the models are appropriately validated.
We suspect that “but I validated my model” would be no defence at an inquest when a model makes predictions that result in the loss of life. Nevertheless, there is no doubt that this is an important issue that requires careful consideration.
Conclusion
Whenever you model a problem, you are making a decision on the trade-off between model accuracy and model interpretability.
You can use knowledge of this trade-off to inform the choice of methods you apply to your problem, and to be clear about your goals when presenting results.