Machine Learning Research Advice

A few years ago, a post titled "Confession as AI Researcher" on the machine learning Reddit community got a lot of attention. Its author was asking for machine learning research advice.

The community offered many suggestions. Here I summarize the points made by different community members, along with my own thoughts. I hope this helps someone on their journey into AI research.

The Reddit post describes a problem many AI researchers face: dealing with the mathematics in AI research papers. The original poster (OP) struggles to come up with innovative ideas and also cannot follow the advanced math in papers.

The main pattern I could see across the threads is that the advice falls mainly into three camps: a top-down approach, a bottom-up approach, and getting help from an advisor.

Do a Top-Down Approach

According to the top-down supporters, the frustration is pretty normal and eases as you learn more. Their major claim is that OP is overestimating the technical depth of papers and underestimating his or her own ability to eventually understand and build upon them. You just need enough mathematical maturity to attack any concept. In their view, concept dependencies form a directed acyclic graph or a finite tree, so you will reach the leaves eventually. Their main suggestion is to trust this and gradually build your intuition toolset (or idea castle), which you can reuse later, much like stored subproblem results in dynamic programming.
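To make the graph and dynamic programming analogy concrete, here is a minimal sketch in Python. The prerequisite graph below is entirely hypothetical (the concept names and edges are my own illustration, not something from the Reddit thread), and the two helper functions are made up for this example. The first walks the graph top-down and learns each concept once, reusing what is already in the toolset; the second counts how many visits the same walk would take if nothing were remembered.

```python
# Hypothetical prerequisite graph (an assumption for illustration only):
# each concept maps to the concepts it directly depends on.
PREREQS = {
    "variational inference": ["KL divergence", "expectation"],
    "KL divergence": ["expectation", "logarithm"],
    "expectation": ["probability"],
    "probability": [],
    "logarithm": [],
}


def learn_top_down(goal, prereqs):
    """Start from the goal, recurse into prerequisites, learn each concept once.

    The `seen` set plays the role of the intuition toolset: a concept that is
    already in it is never studied again (memoization).
    """
    learned = []
    seen = set()

    def visit(concept):
        if concept in seen:
            return  # already in the toolset, reuse it
        seen.add(concept)
        for dep in prereqs[concept]:
            visit(dep)
        learned.append(concept)  # all prerequisites are known at this point

    visit(goal)
    return learned


def count_visits_without_reuse(goal, prereqs):
    """Count visits if every shared prerequisite is relearned from scratch."""
    return 1 + sum(count_visits_without_reuse(dep, prereqs) for dep in prereqs[goal])


if __name__ == "__main__":
    order = learn_top_down("variational inference", PREREQS)
    print(order)
    print(len(order), "concepts learned with reuse")
    print(count_visits_without_reuse("variational inference", PREREQS), "visits without reuse")
```

On this toy graph the walk with reuse touches 5 concepts, while the walk without reuse makes 7 visits because "expectation" and "probability" are shared. The gap widens as more prerequisites are shared, which is the intuition behind trusting that the leaves are reachable.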

Another major point is that the purpose is well defined: the learner is in the driver's seat and is more likely to retain the lessons. In my experience, though, even a top-down approach requires some background knowledge that may only be achievable through the bottom-up method. One minor suggestion from this camp is to look for resources like ‘math for ML’, which are likely to have high utility density. Top-down supporters recommend bottom-up only after exhausting all top-down methods without seeing any progress.

Learn the largest features that are new to you first; the important ones keep reappearing. To be creative is to have a lot of ideas, and an idea can be the same idea applied in a different context. Understanding comes from frequent exposure to different contexts. Creativity comes from exploring the difference between what you are doing and what everyone else is doing.

Over time things will become clearer, first slowly here and there, and then suddenly whole subtopics will start making sense. Building the intuition toolset requires striking a balance between depth and black-boxing. You need to learn to find this balance as you go.

When reading a paper, focus on what the authors actually do rather than on how they explain it or the words they use. Ask yourself why they are doing this and why they need it; you can then often figure out what is going on despite the dense terminology.

Systematic Planning and Execution

The most upvoted answer to the post suggests systematic task assessment and planning. Your job as a PhD candidate is to find a place in the sandbox you feel comfortable playing in. The essential idea is to define your goals, trace back to the things needed to reach them, and acquire those. My favorite sentence from it is “You need to form your own intuitions. If you use second hand thinking as a substitute of your own, you might never move beyond the border in the unknown”.

Do a Bottom-Up Approach

Bottom-up fans claim that the concept dependencies for AI math are huge and likely form an exponentially growing tree. In their view, it is impossible to learn concepts on the go as you encounter them, because one unknown leads to ‘n’ more unknowns. Their suggestion is to take specific courses or books that help you build the fundamentals. Although this takes some time, they argue it is better than giving up. A breadth of knowledge is also essential for doing a top-down approach at a later stage. You will develop the ability to make good judgments about what you need in order to achieve your goals. You will be able to judge whether a reference is at, above, or below your level, and when to black-box a concept. The resources they suggested include the following.

Papula, 3 German math books
John Baez’s Resources
Real analysis resources
Matrix calculus
Convex optimization
Stochastic optimization
Numerical optimization
Statistics
Probability
Algebra
Abstract algebra
Topology
Proofs
Group theory
Functional analysis courses
Rudin’s books (~3 months) (1)
Analysis by Abbott (1)
Mathematical Analysis by Tom Apostol (slowly carefully struggle through exercises) (1)
Princeton Companion to Mathematics (1)
Statistical Machine Learning Course by Larry Wasserman (1)
Reading List by Michael Jordan (1)
Linear Algebra by Gilbert Strang (1)
Numerical linear algebra
Probabilistic Graphical Models
Linear Algebra by Otto Bretscher (1)
Fundamentals of Analysis by Michael Reed (1)
Book by Stein and Shakarchi (1)
Statistical Inference by Casella and Berger (1)
Optimization Book/Course by Stephen Boyd (1)
Information Theory book by Cover and Thomas (1)
Book Ladder from a course page (1), link
Information Geometry Tutorials by Baez (1)
What is mathematics book (1)
Probabilistic Machine Learning by Murphy (1)
Statistical Learning Theory course from MIT (especially lecture 3) (1)
Deep Learning Coursera Specialization (1)
Book by Yoshua Bengio (1)
Essence of Linear Algebra Playlist by Grant Sanderson 3B1B (1)
Machine Learning Course by Andrew Ng (1)
Probabilistic Graphical Models by Koller & Friedman (1)
Introduction to Information Theory Book

Get Help From Mentors or Advisor

This is more closely related to the top-down approach, because a mentor or advisor is supposed to have the professional skill to identify your weaknesses and suggest resources to fill those gaps. Supporters of this approach firmly believe that it is the duty of the mentor or advisor to help someone like OP. You can also ask peers for informal explanations to get through a concept more easily. Online reading groups are one option for building a peer network.

Summary

You can apply one or a combination of these suggestions, depending on your situation and mathematical maturity. For me, the top-down approach seems most suitable, with a fixed amount of time dedicated to the bottom-up approach as well. Mentorship support seems a distant possibility for me. Systematic planning is always helpful to me. I hope this post saves time for anyone who would otherwise go through the whole Reddit discussion. In the best case, the machine learning research advice collected here may actually make an impact on their career and life.