I was watching some of Richard Schneeman’s Rails screencasts, and he mentioned that one thing he wished he’d done when he first learned to program was keep a blog and include, from time to time, the things he didn’t yet know or know how to do. So, here it is: the things I don’t know.
Algorithms & Data Structures
I knew from the start that this was an area I’d be weak in—it’s generally not something you learn “on the job” and is extensively covered in undergraduate computer science classes, of which I only took one—so I’ve made a conscious effort to learn as much as possible in this area. I picked up Kyle Loudon’s Mastering Algorithms with C and George Heineman’s Algorithms in a Nutshell, and while I feel like I’m really starting to internalize the concepts and form the right mental models, I’ve still got a long way to go. (I’ll pick up the canonical CLRS as soon as I can find it for under $80.)
I’ve learned a bunch of sorting algorithms and know their big-O time complexities, but I’ve never sat down and implemented a search algorithm from scratch. I know what linked lists, (de)queues, stacks, B-trees, binary trees, tries, hash tables (chained and open-addressed), and graphs are, but I’ve only implemented a few of these in C. (I think more C practice in general will help with a lot of the topics covered in this post; see below.)
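As a starting point for that "search algorithm from scratch," here is a minimal sketch of binary search over a sorted list. It's written in Python rather than C purely for brevity, and the function name `binary_search` is just an illustrative choice.

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if absent."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2          # midpoint of the current search window
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1              # target can only be in the right half
        else:
            hi = mid - 1              # target can only be in the left half
    return -1                         # window is empty: target not present
```

Each pass halves the search window, which is where the O(log n) running time comes from. The input has to be sorted already for this to work.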
Design & Architecture
I picked up the GoF book and am working through it, and while I understand the thought and history behind the patterns, they’re not yet obvious to me when I read code “in the wild.” (That is, I probably wouldn’t immediately say, “Oh, this is the Decorator pattern” after reading someone else’s program.) I think this will come with practice, and one of my immediate goals is to replace the janky global variable in Ruben with a singleton.
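To make the singleton idea concrete, here is a rough sketch in Python. The class name `Config` and its `settings` attribute are hypothetical stand-ins, since Ruben's actual global isn't shown here. This is only one of several ways to express the pattern.

```python
class Config:
    """Hypothetical shared state that might otherwise live in a global variable."""

    _instance = None  # the single shared instance, created lazily

    def __new__(cls):
        # Create the instance on first use; hand back the same one afterwards.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.settings = {}
        return cls._instance


first = Config()
second = Config()
first.settings["debug"] = True  # visible through every reference
```

Both names refer to the same object, so state set through one is visible through the other. A singleton is still shared mutable state, so it tidies up access to a global more than it removes the underlying coupling.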
In terms of general architecture, I think this will improve with continued reading and writing. Ten years ago, I had no idea how to structure a poem or book of poems; I learned by reading tens of thousands of poems and hundreds of books of poetry, as well as by writing my own. Similarly, I expect to have to read thousands of programs across hundreds of projects (while continually writing my own) before I feel like the process is intuitive. The process will also become faster, since I’ll know (through having read others’ mistakes and having made my own) what works well and what doesn’t; as the saying goes, “In the beginner’s mind there are many possibilities, in the expert’s mind there are few.”
Distributed Programming & the Internet
I understand the ideas behind databases and RDBMSs and have written a bit of SQL, but I don’t understand the actual relational calculus. (I’m not sure this is truly necessary, but if it will deepen my understanding of SQL and relational databases, I feel like it can’t hurt.) There are some concepts, such as queries and JOINs, that I’ve reinforced by writing code and working with a database, and others (like database normalization/denormalization) that I’ve never put into practice. A good example of something I figured out through experience: until last week, I didn’t understand that auto-incrementing primary keys increase monotonically; that is, IDs for deleted rows aren’t reused. (You might notice that the previous post had an ID of 6, but this one has an ID of 11. Evidence of learning!)
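The ID behavior is easy to reproduce in a small sketch. This one uses Python's built-in `sqlite3` module with an in-memory database; the `posts` table and its columns are made up for illustration, and this blog's real database may behave differently in the details. In SQLite specifically, the `AUTOINCREMENT` keyword is what guarantees that IDs are never reused. Without it, SQLite can hand out the ID of a deleted row again if that row had the largest ID.

```python
import sqlite3


def demo_ids():
    """Insert three rows, delete the last, insert again; return (new_id, all_ids)."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY AUTOINCREMENT, title TEXT)"
    )
    for title in ("first", "second", "third"):
        conn.execute("INSERT INTO posts (title) VALUES (?)", (title,))

    conn.execute("DELETE FROM posts WHERE id = 3")  # remove the newest row

    cur = conn.execute("INSERT INTO posts (title) VALUES (?)", ("fourth",))
    new_id = cur.lastrowid  # ID assigned to the row we just inserted
    ids = [row[0] for row in conn.execute("SELECT id FROM posts ORDER BY id")]
    conn.close()
    return new_id, ids
```

Calling `demo_ids()` returns `(4, [1, 2, 4])`: the new row gets ID 4, and the gap left by the deleted row 3 stays a gap.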
I’ve done a bit of work with MongoDB and I understand the broad differences between relational databases like PostgreSQL and document-oriented NoSQL databases like Mongo, but again, I’m somewhat lacking on the details. I do understand that Mongo’s storage of entire documents can obviate computationally expensive JOINs and make reading a whole record fast, and I understand how NoSQL solutions like Mongo are designed to scale horizontally more easily than a traditional relational database, but if you plunked me down at the command line and asked me to shard the database or deploy a replica set, I wouldn’t know where to begin. Some of this is particular to Mongo, though, and I feel like practicing more with that technology will help close some of the practical knowledge gaps.
Speaking of the machine—I want to learn way more UNIX stuff. I know what the kernel is and what it does, as well as the general concerns of the operating system (device, file system, memory, and process management), but I don’t know how, say, the scheduler actually works. I’ve never written a shell script of more than a dozen lines or written anything interesting, like a cron job or bootstrap script, so I want to dig into that more. I think Kernighan and Pike’s The UNIX Programming Environment and Cameron Newham’s Learning the bash Shell would be great for this. I’m especially excited to pick up Kernighan & Pike, since I really enjoyed the straightforward (and often witty) K&R.
Lastly, threads and concurrency are still sort of a mystery to me. This is yet another arena where I understand what threads and processes are and what concurrency is, but having never written a program in which I had to manage threads and processes manually, I don’t think I fully understand them. I’m sure I will after the first time I spend hours trying to debug a program with subtle race conditions, but until then, I’m going to seek out projects designed to help me work with threads/concurrency and better understand the topic.
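A small practice project along these lines might be a shared counter incremented from several threads. The sketch below uses Python's standard `threading` module; the function name and parameters are illustrative. The lock is the important part. `counter += 1` is a read-modify-write sequence, so without the lock two threads can read the same old value and one update can be lost.

```python
import threading


def count_with_lock(num_threads=4, increments=10_000):
    """Increment a shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            with lock:        # only one thread at a time may update the counter
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()              # wait for every worker to finish
    return counter
```

With the lock in place the result is always `num_threads * increments`. Removing the `with lock:` line is a cheap way to go looking for the kind of subtle race condition mentioned above, though whether lost updates actually show up depends on the interpreter and timing.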
As mentioned, I want to learn shell scripting as part of my UNIX/systems programming development. From what I’ve seen, this shouldn’t be too difficult after mastering a scripting language like Ruby.
Finally, there are two everyday tools I want to get much better at using: Vim and Git. Both are a matter of daily practice and reading the documentation/doing tutorials for each; practice will instill the necessary muscle memory, and better learning the commands will give me a much wider range of options when trying to accomplish a particular task. In terms of books, I’ve started reading Drew Neil’s Practical Vim and I’m planning to pick up a copy of Scott Chacon’s Pro Git.
I didn’t realize how long this post would be until I wrote it, which is somewhat humbling but simultaneously really galvanizing—now that everything I currently feel like I need to learn is in one place, I can iterate on this list and backfill my knowledge in a systematic way. Of course, this isn’t everything I want or need to know, and I’m sure that (hydra-like) each thing I learn will be replaced by two more things I want to learn. But that’s the whole point, right? One of the major perks of software engineering is that there’s always something new to pick up and learn, and I can’t imagine not being excited about that.