SRE Study Guide

I feel like at some point in the past, I used to be a pretty decent sysadmin... I mean DevOps... I mean SRE. Over the years, specializing in HPC and then being relegated to management have displaced most of the knowledge I used to have. For no particular reason, I thought it might be nice to see what resources are available for brushing up on one’s Linux chops.

Since the only person reading this is me (hi me!), I’ll just make relevant posts as I find materials.

Without much thought, I roughly see these breaking down into a few categories, to be used later as tags:

Linux Internals

Just general Linux-y and Unix-y things: kernel, processes, concurrency, IPC, etc.

System Design

Something along the lines of what Google calls Non-Abstract Large System Design (NALSD). I’m not sure exactly how this differs from the HPC stuff I’ve been doing, but I think I’ll find out along the way. Thinking about these systems without being restricted by MPI, InfiniBand, POSIX filesystems, etc. will be interesting.

Troubleshooting

’Nuff said.

Sysadmin Skillz

Should this be combined with Troubleshooting?

Algorithms/Data Structures

Not sure how much of this I consider to be SRE vs. SWE, but I guess we’ll see how it progresses.