5 mistakes programmers make when getting started in machine learning
There is no single correct way to get started in machine learning. We all learn a little differently, and we have different objectives for what we want to do with or for machine learning.
A common objective is to get productive with machine learning quickly. If that is your objective, this post highlights five common mistakes programmers make on the path to quickly becoming productive machine learning practitioners.
1] Placing machine learning on a pedestal
Machine learning is just another family of techniques that you can use to build solutions to complex problems.
Because it is a burgeoning field, machine learning is usually communicated through academic publications and textbooks for postgraduate students. This gives it an aura of being elite and of having a high barrier to entry.
A shift in mindset is needed to be effective at machine learning, from technology to process and from precision to “good enough”, but the same could be said of other complex methods that programmers are interested in adopting.
2] Writing machine learning code
Starting in machine learning by writing code can make things difficult, because it means you are solving at least two problems rather than one: how a technique works so that you can implement it, and how to apply the technique to a given problem.
It is much easier to work on one problem at a time and to use machine learning and statistical environments and libraries of algorithms to learn how to apply a technique to a problem. This lets you spot check a range of algorithms relatively quickly and tune the one or two that show promise, rather than investing large amounts of time interpreting ambiguous research papers that contain algorithm descriptions.
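A spot check can be as small as a loop that scores every candidate on the same cross-validation folds. The sketch below is only an illustration of that shape. In practice the candidates would come from a library; to keep the example dependency-free, two trivial stand-in models are defined here, and the data set, function names, and fold count are all assumptions made up for this example.

```python
import random
from collections import Counter

def majority_class(train, test_rows):
    """Stand-in baseline: always predict the most common training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return [label for _ in test_rows]

def one_nearest_neighbour(train, test_rows):
    """Stand-in model: predict the label of the closest training row."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(train, key=lambda item: dist(item[0], row))[1] for row in test_rows]

def cross_val_accuracy(model, data, k=5, seed=0):
    """Mean accuracy of `model` over k folds of a seeded shuffle."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        preds = model(train, [x for x, _ in held_out])
        correct = sum(p == y for p, (_, y) in zip(preds, held_out))
        scores.append(correct / len(held_out))
    return sum(scores) / k

def spot_check(candidates, data):
    """Score every candidate on identical folds and return {name: accuracy}."""
    return {name: cross_val_accuracy(model, data) for name, model in candidates.items()}

# Hypothetical toy data: two well-separated clusters (60 rows of class 0, 40 of class 1).
rng = random.Random(42)
data = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(60)]
data += [((rng.gauss(6, 1), rng.gauss(6, 1)), 1) for _ in range(40)]

results = spot_check({"majority": majority_class, "1-NN": one_nearest_neighbour}, data)
for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.2f}")
```

Because every candidate sees the same folds, the scores are directly comparable, and only the one or two front-runners need further tuning.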
Implementing an algorithm can be treated as a separate project to complete at a later time, for example as a learning exercise or when a prototype system needs to be put into operation. Learn one thing at a time. A good starting point is a GUI-based machine learning framework, whether or not you are a programmer.
3] Doing things manually
Applied machine learning is surrounded by a process that includes problem definition, data preparation, and presentation of results, among other tasks. These steps, together with the testing and tuning of algorithms, can and should be automated.
Automation is a big part of modern software development for builds, tests, and deployment. There is a great advantage in scripting data preparation, algorithm evaluation and tuning, and the preparation of results, because it brings rigor and speeds up improvement. Remember and reuse the lessons learned in professional software development.
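As a minimal sketch of what such scripting might look like, the example below chains data loading, data preparation, tuning, and reporting into one repeatable function. Everything in it is an assumption chosen for illustration: the synthetic data, the threshold rule used as a "model", the candidate thresholds, and the function names.

```python
import json
import random

def load_data(seed):
    """Stand-in for reading a real data set: one numeric feature, binary label."""
    rng = random.Random(seed)
    rows = [(rng.gauss(2.0, 1.0), 0) for _ in range(100)]
    rows += [(rng.gauss(5.0, 1.0), 1) for _ in range(100)]
    rng.shuffle(rows)
    return rows

def fit_scaler(rows):
    """Data prep: learn the feature's minimum and maximum from training rows only."""
    xs = [x for x, _ in rows]
    return min(xs), max(xs)

def scale(rows, lo, hi):
    """Rescale the feature using the training range (test values may fall outside [0, 1])."""
    return [((x - lo) / (hi - lo), y) for x, y in rows]

def accuracy(threshold, rows):
    """Accuracy of the rule 'predict 1 when the feature exceeds the threshold'."""
    return sum((x > threshold) == bool(y) for x, y in rows) / len(rows)

def run_pipeline(seed=1, thresholds=(0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8)):
    """Run every step in order and return a report that can be saved or diffed."""
    rows = load_data(seed)
    split = len(rows) * 7 // 10  # 70% train, 30% test
    lo, hi = fit_scaler(rows[:split])
    train, test = scale(rows[:split], lo, hi), scale(rows[split:], lo, hi)
    # Tuning: keep the threshold with the best training accuracy.
    best = max(thresholds, key=lambda t: accuracy(t, train))
    return {
        "seed": seed,
        "best_threshold": best,
        "train_accuracy": round(accuracy(best, train), 3),
        "test_accuracy": round(accuracy(best, test), 3),
    }

report = run_pipeline()
print(json.dumps(report, indent=2))
```

Because the seed is fixed and no step is done by hand, rerunning the script reproduces the same report, so a change in the numbers can be traced to a change in the code or the data.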
The failure to start with automation (such as Makefiles or a similar build system) is probably because many programmers come to machine learning from books and courses that place less focus on the applied nature of the field. In fact, bringing automation to applied machine learning is a huge opportunity for programmers.
4] Reinventing solutions to common problems
Hundreds or thousands of people have probably implemented the algorithm you are implementing before you, or have solved a problem similar to the one you are solving. Exploit the lessons they have learned.
There is a wealth of knowledge out there on solving applied machine learning problems. Granted, much of it may be tied up in books and research publications, but you can access it. Do your homework: search Google, Google Books, and Google Scholar, and reach out to the machine learning community.
If you are undertaking implementation of an algorithm:
- Do you need to implement it? Can you reuse an existing open-source implementation of the algorithm in a library or tool?
- Do you need to implement it from scratch? Can you code review, learn from, or port an existing open-source implementation?
- Do you have to interpret the canonical algorithm description? Are there algorithm descriptions in other books, research papers, theses, or blog articles that you can review and learn from?
If you are tackling a problem:
- Do you have to test all algorithms on the problem? Can you exploit studies of this problem, or of similar problems of the same general type, that suggest algorithms and classes of algorithms that perform well?
- Do you have to collect your own data? Are there publicly available data sets or APIs that you can use directly, or as a proxy for your problem, to quickly learn which methods are likely to perform well?
- Do you have to optimize the parameters of the algorithm? Are there heuristics for configuring the algorithm, presented in papers or studies of the algorithm, that you can use?
What would your strategy be if you had a problem with a programming library or a specific type of data structure? Use the same tactics in machine learning. Reach out to the community and ask for resources that you may be able to exploit to speed up your learning and your progress on the project. Start with forums and Q&A sites, and contact specialists and subject matter experts as the next step.
5] Ignoring the math
You do not need the mathematical theory to get started, but mathematics is a big part of machine learning. The reason is that it provides perhaps the most efficient and unambiguous way to describe problems and the behaviour of systems.
Ignoring the mathematical treatment of algorithms can lead to problems such as having a limited understanding of a method or adopting a limited interpretation of an algorithm. For example, many machine learning algorithms have an optimization at their core that is updated incrementally. Knowing the nature of the optimization being solved (for example, whether the function is convex) allows you to use efficient optimization algorithms that exploit this knowledge.
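To make the convexity point concrete, here is a small sketch with two made-up one-dimensional functions (they are illustrations, not taken from any particular learning algorithm). On the convex function, plain gradient descent reaches the same global minimum from any starting point. On the non-convex one, the answer depends on where the search starts.

```python
def gradient_descent(grad, x0, lr=0.01, steps=2000):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def convex_grad(x):
    """Gradient of f(x) = (x - 3)^2, a convex function with one minimum at x = 3."""
    return 2 * (x - 3)

def nonconvex_grad(x):
    """Gradient of g(x) = x^4 - 4x^2 + x, which has two separate local minima."""
    return 4 * x**3 - 8 * x + 1

convex_results = [gradient_descent(convex_grad, start) for start in (-2.0, 2.0)]
nonconvex_results = [gradient_descent(nonconvex_grad, start) for start in (-2.0, 2.0)]

print("convex:    ", [round(x, 3) for x in convex_results])
print("non-convex:", [round(x, 3) for x in nonconvex_results])
```

If you know the objective is convex, a single run of a simple method is enough. If it is not, you may need restarts or a different optimizer, and the mathematics is what tells you which case you are in.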
Internalizing the mathematical treatment of algorithms is a slow process that takes time to master. Especially if you are implementing advanced algorithms from scratch, including their internal optimization algorithms, take the time to learn the algorithm from a mathematical perspective.
Conclusion
In this post, you learned about five common mistakes that programmers make when getting started in machine learning. The five lessons are:
- Don’t put machine learning on a pedestal
- Don’t write machine learning code
- Don’t do things manually
- Don’t reinvent solutions to common problems
- Don’t ignore the math