Read White Papers

This is something I hadn't considered until a colleague told me to read Jeff Dean and Sanjay Ghemawat's "MapReduce: Simplified Data Processing on Large Clusters". You'll notice after reading a few paragraphs that it's surprisingly approachable. The entire paper reads like this:

The master pings every worker periodically. If no response is received from a worker in a certain amount of time, the master marks the worker as failed. Any map tasks completed by the worker are reset back to their initial idle state, and therefore become eligible for scheduling on other workers. Similarly, any map task or reduce task in progress on a failed worker is also reset to idle and becomes eligible for rescheduling.

It is especially crucial to home in on passages like the one above, because these specific technical considerations are what demonstrate engineering proficiency. Thinking through failure cases is a mark of a good developer, as is finding elegant solutions to them. White papers are chock-full of these considerations, and they usually include multiple citations that you can branch off from.
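To see how much design is packed into that one quoted paragraph, it can help to sketch it in code. The following is a minimal, single-process illustration in Python, not the paper's implementation: the names `Master`, `Task`, `record_pong`, and `check_workers` are all made up for this example, and the "ping" is reduced to recording a timestamp whenever a worker answers.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Set


class State(Enum):
    IDLE = "idle"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"


@dataclass
class Task:
    kind: str                     # "map" or "reduce"
    state: State = State.IDLE
    worker: Optional[str] = None  # worker the task is (or was) assigned to


@dataclass
class Master:
    timeout: float  # seconds of silence before a worker is marked failed (assumed value)
    last_seen: Dict[str, float] = field(default_factory=dict)
    failed: Set[str] = field(default_factory=set)
    tasks: List[Task] = field(default_factory=list)

    def record_pong(self, worker: str, now: float) -> None:
        """Remember when a worker last answered a ping."""
        self.last_seen[worker] = now

    def check_workers(self, now: float) -> None:
        """Mark silent workers as failed and make their tasks schedulable again."""
        for worker, seen in self.last_seen.items():
            if worker not in self.failed and now - seen > self.timeout:
                self.failed.add(worker)
                self._reset_tasks(worker)

    def _reset_tasks(self, worker: str) -> None:
        for task in self.tasks:
            if task.worker != worker:
                continue
            in_progress = task.state is State.IN_PROGRESS
            completed_map = task.kind == "map" and task.state is State.COMPLETED
            # Per the quoted passage: in-progress tasks and completed map
            # tasks go back to idle. Completed reduce tasks are not mentioned
            # there, so this sketch leaves them untouched.
            if in_progress or completed_map:
                task.state = State.IDLE
                task.worker = None


master = Master(timeout=10.0)
master.tasks = [
    Task("map", State.COMPLETED, "w1"),
    Task("reduce", State.IN_PROGRESS, "w1"),
    Task("reduce", State.COMPLETED, "w1"),
    Task("map", State.IN_PROGRESS, "w2"),
]
master.record_pong("w1", now=0.0)
master.record_pong("w2", now=8.0)
master.check_workers(now=15.0)  # w1 has been silent for 15s, w2 for only 7s
```

After the last call, `w1` is in `master.failed`, its completed map task and in-progress reduce task are idle again, and everything else is unchanged. Even this toy version raises the questions the paper spends its time on: what the timeout should be, and why a completed map task needs to be redone at all.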

For more white papers, here's a decent list to get you started. The point isn't to skim over these papers today and forget about them. They are challenging to read, so read them over weeks and months. Revisit them when you get a chance, or as needed if you are working on a similar problem and want to learn how others have tackled it. It's like strength-training for developers.