Learning to notebook
I’ve decided to improve my overall knowledge of machine learning. I work in the data space, and while my work thus far has focused on building the infrastructure to ingest, store and analyze data, it hasn’t covered machine learning. Given current industry trends in the space, I’d be doing myself a disservice by not becoming knowledgeable on the subject.
I am also a sucker for learning new things, and getting good at using Jupyter notebooks is a fun starting point. My goal is to know enough to understand what is going on within machine learning teams, without reaching the level of expertise of a machine learning engineer.
What’s my plan to get there?
I don’t fully have one yet. There’s a lot of material available advising what one should do. For starters, I’m not going to bother with Kaggle. I appreciate Julia Evans’ post on the subject. My superficial understanding is that doing Kaggle is akin to being good at solving problems on Project Euler. While they are fun problems to solve, being good at them doesn’t necessarily translate into being good at either programming or machine learning.
Instead, I decided to check out what is being taught at my undergrad university. Looking at it, things have really changed in the CS syllabus! When I studied there (2010-2013), machine learning was a part of the AI course and there wasn’t much emphasis on doing any practical work. Now, it’s a core part of the curriculum!
So, my path is going to be to start with the courses listed below and see how it goes, since I’m familiar and comfortable with the Cambridge style of teaching. I especially like the emphasis they are placing on using Python notebooks. The order I’ll be going through them is:
- Scientific computing - this is Jupyter notebook 101
- Machine Learning and Real world data
- Data Science - Seems like a good recap of the maths, which I will likely mostly skim
- Data Science principles and practice - It’s the follow-up to the first data science course
- Machine Learning and Bayesian Inference
Am I really starting from scratch?
Probably not. I know Python well, and years ago I got about halfway through Andrew Ng’s course (perhaps I’ll come back to it), though I’ve forgotten most of it. I also learned a lot of the fundamental maths at play while I was at university: linear regression, probability, SVMs and neural networks. Though I’ll need a refresher on a lot of these, it’s not the same as learning them anew. Overall, I hope to do most of my learning through practical implementation.
So here we go. Stay tuned for posts as I go through this. They’ll show up under the learning tag.