I’m very, very slowly working my way through the first of Andrej Karpathy’s “Neural Networks: Zero to Hero” video series as part of one of my missions this year — to pick up some AI programming skills.
I’m deliberately moving very, very slowly because I’m trying to make sure that I’m not missing any key details, and also because it’s a field that I don’t know much about and one that’s dense with branches of math that I haven’t touched in a long time, such as calculus.
Whenever I do this, I use an older method for note-taking, namely an actual paper notebook, and preferably one that’s graph-ruled. Fortunately, the nearby Walmart is a reliable source. The notebooks they carry even look like the notebook that the hacker character Elliot uses in the Mr. Robot TV series, minus the damage incurred from the considerably more dangerous (if fictional) life that he leads:
For me, taking handwritten notes and working out all the details behind Karpathy’s description of the system that he’s building step by step in his video helps me get a solid understanding of how and why it works. It’s also so much easier to write equations and draw diagrams by hand…
…and of course, if you’re so inclined, you can easily enhance handwritten notes with comics:
In his videos, Karpathy does his level best to spell out every last little bit about how AI systems work. It’s impressive — he does explain AI programming better than most lecturers I’ve seen or books I’ve read.
But he’s also someone with a bachelor’s degree in computer science and physics, a master’s in which he worked on computer simulations, and a PhD from Stanford where his area of study was some mash-up of computer vision and natural language processing. While he says that all you need to understand his video is a “basic knowledge of Python and a vague recollection of calculus from high school,” it’s pretty clear that he operates on a different level than the rest of us.
I have years of Python experience and took university-level calculus, but I still decided not to skip or skim over anything Karpathy covered. I followed the math in his “first principles” review of what a derivative is, and walked through the process of stepping through the neural network he introduced, which is essentially depth-first binary tree traversal. It’s straight from the standard “algorithms and data structures” course from the sophomore year of a computer science degree program. But hey, it never hurts to review stuff to make sure you truly understand it:
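If you’d like to try those two review topics yourself, here’s a minimal Python sketch. Everything in it is my own illustration rather than code from the video: the function `f`, the step size `h`, the `Node` class, and the little tree are all assumptions I picked to keep things short. The first half estimates a derivative the “first principles” way, by nudging the input a tiny bit and measuring how much the output changes. The second half walks a small binary tree depth-first.

```python
def f(x):
    # An arbitrary example function (my pick, not necessarily the video's).
    return 3 * x**2 - 4 * x + 5


def slope_at(func, x, h=1e-6):
    # "First principles" derivative: rise over run for a very small run h.
    return (func(x + h) - func(x)) / h


class Node:
    # A bare-bones binary tree node (hypothetical, for illustration only).
    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left
        self.right = right


def depth_first(node, visited=None):
    # Visit a node, then everything under its left child,
    # then everything under its right child.
    if visited is None:
        visited = []
    if node is None:
        return visited
    visited.append(node.label)
    depth_first(node.left, visited)
    depth_first(node.right, visited)
    return visited


# Calculus says the derivative of f is 6x - 4, which is 14 at x = 3.
# The numerical estimate should land very close to that.
estimate = slope_at(f, 3.0)
print(estimate)

# A tiny tree: "L" at the root, with "d" and "f" as its children, and so on.
tree = Node("L", Node("d", Node("e", Node("a"), Node("b")), Node("c")), Node("f"))
order = depth_first(tree)
print(order)
```

The estimate won’t be exactly 14, because `h` is small but not zero; shrinking `h` gets you closer, up to the limits of floating-point arithmetic.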
I’m thinking of putting out a series of videos to help people who don’t have a computer science and math background understand some of the stuff that Karpathy covers. It’ll be kind of like the help that Homer Simpson could’ve used when trying to understand marketing:
I’ll close with the first video in Karpathy’s series. If you have any questions about it, feel free to ask me — I’ll see if I can help!