Machine learning that matters
While reading about bootstrapping ML, Louis pointed us to a research paper we had to look into and read. The paper is entitled Machine Learning that Matters (PDF) by Kiri Wagstaff of JPL, published in 2012.
Kiri’s thesis is that the ML research community has lost its way. She argues that the majority of machine learning research is performed for the sake of machine learning itself. She highlights three critical problems:
- Over-emphasis on benchmark data: Research concentrates on datasets in the UCI repository, yet very little of it has any influence on the fields those datasets come from. She highlights the lack of standards for experiment reproducibility, which undermines the benefit of using common datasets, and the skew towards regression and classification problems. She argues that using the UCI repository is worse than using synthetic data, since we have no control over how the data was generated.
- Over-emphasis on abstract metrics: A heavy focus on algorithm racing or bake-offs, and the use of generic metrics such as RMSE and F-measure that have no direct meaning in the problem domain.
- Lack of follow-through: It is easy to download datasets and run algorithms in Weka. It is hard to interpret the results and relate them back to the field, yet that is exactly what is needed to have an impact.
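The abstract-metrics problem is easy to see in code. As a minimal sketch (the counts below are invented for illustration), two classifiers with an identical F-measure can have very different consequences in the domain, because the metric weighs all errors equally:

```python
def f_measure(tp, fp, fn):
    """Generic F1 score: harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Two hypothetical diagnostic classifiers with swapped error types.
# Classifier A: 15 false positives (unneeded follow-up tests), 5 misses.
# Classifier B: 5 false positives, 15 false negatives (missed diagnoses).
a = f_measure(tp=90, fp=15, fn=5)
b = f_measure(tp=90, fp=5, fn=15)
```

Both classifiers score F1 = 0.9, yet in a medical setting classifier B’s extra missed diagnoses may be far more costly than classifier A’s extra false alarms. The metric by itself cannot tell the difference; only interpretation in the domain can.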
The crux of the issue is that she describes machine learning work as three classes of activity, and the typical “machine learning contribution” concentrates on algorithm selection and experimentation while ignoring problem definition and result interpretation.
Change in Mindset
Kiri argues that the research community needs to change the way it formulates, attacks, and evaluates machine learning research projects. She comments on three areas to address:
- Meaningful evaluation methods: Measure the direct impact of the machine learning system in the domain, for instance dollars saved, lives preserved, time saved, or effort reduced. Choosing a direct impact measure has a flow-on effect on the design of the experiment and the selection of the data.
- Involvement of the outside world: Bring in domain experts to help define the problem and the data, and more importantly, rely on them to interpret the significance of the results within the domain. This avoids solving problems of little consequence (iris plant classification) and produces systems reliable and useful enough to be adopted in practice.
- Eyes on the prize: Select research problems for their impact. Take the status quo in the problem domain as a baseline and report results as a degree of improvement over it. Engage the community and motivate it through compelling adoption.
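The first and third points can be read together: instead of reporting an abstract score, report domain impact against the status quo. Below is a minimal sketch of that idea, with a hypothetical fraud-detection scenario and cost figures that would in practice come from a domain expert (all numbers here are invented for illustration):

```python
def dollars_saved(fp, fn, cost_missed, cost_false_alarm, baseline_cost):
    """Evaluate a detector by domain impact rather than an abstract score.

    baseline_cost is what the status quo (no detector) would cost;
    the detector's cost is its missed cases plus its false alarms.
    """
    system_cost = fn * cost_missed + fp * cost_false_alarm
    return baseline_cost - system_cost

# Hypothetical example: 100 fraud cases, each costing $5,000 if missed;
# each false alarm costs $50 of analyst time. The status quo catches nothing.
baseline = 100 * 5_000
saved = dollars_saved(fp=200, fn=20,
                      cost_missed=5_000, cost_false_alarm=50,
                      baseline_cost=baseline)
```

Reporting “$390,000 saved over the status quo” communicates impact in terms the domain cares about, and, as Kiri notes, choosing such a measure up front shapes both the experiment design and the data collected.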
Kiri throws down the gauntlet, listing six problems as examples of research projects where machine learning could make a difference:
1. A law passed or legal decision made that relies on the outcome of an ML analysis.
2. $100 million saved through improved decision making provided by an ML system.
3. A conflict between nations averted through high-quality translation provided by an ML system.
4. A 50% reduction in cybersecurity break-ins through machine learning defences.
5. A human life saved through a diagnosis or intervention recommended by an ML system.
6. Improvement of 10% in one nation’s Human Development Index (HDI) attributable to a machine learning system.
She intentionally left the problems open-ended to avoid prescribing a single problem or technical capability. Real challenges are hard. These examples are meant to inspire rather than be a comprehensive, prioritized list of problems to work on.
Lastly, Kiri closes with a comment on the obstacles that may stand in the way of effectively addressing research problems that make a difference:
- Jargon: The over-use of machine learning terminology, which is a useful shorthand within the field but essentially impenetrable outside of it. Plainer language is required when addressing a wider audience.
- Risk: When a machine learning system is making consequential decisions, who is accountable when it makes mistakes? Who maintains the system going forward? We cannot help but feel that civil engineering and the safety-critical manufacturing industries have worked through similar problems.
- Complexity: Machine learning methods are still not fire-and-forget, and a PhD is still needed to understand and apply them. We need better tools. We believe commoditized machine learning is moving very quickly.
We think it is a good paper that could steer young analysts away from algorithm racing toward more impactful work. It reminds us of O’Reilly’s call to arms, “work on stuff that matters”. We would have liked some more concrete examples though, perhaps less idealistic and more business-focused, such as IBM Watson, Siri, and large-scale image classification.
We also cannot help but feel that there are classes of problems where beginners can make progress and gain direct personal benefit, like classifying their own photos, organizing their documents, or trading on the stock market.