AlgoDaily - A Systems Design Interview Primer For New Engineers

Objective: In this lesson, we'll introduce systems design interviews, and focus on these outcomes:

You'll learn what systems design is.
We'll walk you through the various concepts you should know to design highly reliable and performant systems.
We'll show you how to get better at answering systems design interview questions.
You'll see how it helps your career in the long run.

When I first got into programming, one of the biggest impediments to leveling up was my lack of understanding of software systems and how they worked. I felt this frustration in multiple sprints. There were always some intriguing, big, and hairy tasks in the backlog that I wanted to tackle-- but I didn't have enough knowledge to even think through how my piece of a project could, or should, integrate with the rest of the systems.

It turns out this is a very common experience. A large part of the reason that large software companies ask new university graduates mostly algorithm questions is because they can't give them systems design problems! As you become more senior, your job incrementally shifts from solving "small code" problems (figuring out syntax, structuring code well, etc.) to larger scale ones (how should we design our database schemas? What tools are available to ensure our requests are as fast as possible?)

This tutorial aims to be a primer focusing on systems design interview preparation, but can also be used to simply get better at systems design (a required skill) as a working developer.

What is Systems Design in Software?

When I speak of "systems design", I'm talking about the process by which we as engineers make decisions regarding the elements of a complex application. These system elements-- such as the data models and structures, overall architecture, modules and components, and the different interfaces of those components-- have to be carefully contemplated to ensure speed, reliability, and stability down the line.

When one first starts their development career, it's easy to gloss over these high level technical decisions. After all, you're already trying to hold the fundamentals of syntax, modules, and perhaps object-oriented programming in your head-- not to mention having to learn how to review code written by others, how to structure code well for others, working with source control, etc. This can already be overwhelming.

Around your third to fifth year of software engineering though, you'll have learned enough "small code" problem solving to provide a foundation for thinking through the bigger picture. It's also when you'll have sufficient experience with different parts of a system (application, database, message queue, etc.) and know enough about their pros and cons to start making good trade-offs.

These trade-offs are especially important in business and enterprise software (read: most jobs), which has an (often contractual) expectation of reliability and good service. Corporations will not be happy paying for services that are often down or fragile.

Additionally, poor systems design causes frustration for other people on a software team-- systems that aren't designed well have bugs that are hard to track down, difficult-to-maintain code bases, and an increased level of effort for adding new functionality and features. It also makes it more challenging to on-board a new engineer, as there might be more complexity than is necessary in the setup and learning of an application.

What Does a Systems Design Question Look Like?

It's pretty easy to tell when you're getting a systems design question during an interview-- most interviewers will start off with a high level overview of an application or service. They might ask how familiar you are with it, and will then ask you to design it.

Here are some sample questions:

How would you build Google Analytics?
Choose a web application that you use and walk me through its moving parts.
How would you design Instagram/Yelp/Youtube/Facebook?
Why do you think X framework fits better than Y framework on this application?
Suppose we want to build a ticketing system. How do we handle X, Y, Z..?
If your web app failed to give responses, how do you find out what happened, and how do you plan to fix the issue?
We want to design a service that does X.

Beyond testing your knowledge of technical concepts, trade-offs, identifying bottlenecks, and thoughts on maintainability, interviewers are also looking to see how well you understand and clarify requirements.

This is why questions like "How would you build Google Analytics?" are less frequent than "Suppose we wanted to build an analytics service..." Interviewers are vague on purpose, and are expecting to hear questions like:

What are the use cases? What are we trying to build here? Let's say we're building a CMS to help bloggers edit their content. Will it only be bloggers using the application, or could it be illustrators, marketing folks, and operations people that will also have their own needs?
How long do we need to store data for? Identifying the data store that we'll be using (in-memory vs. on-disk, NoSQL vs. SQL, columnar vs. time-series, etc.) is dependent on a multitude of factors, one of which is the length of time we'll need to store information.
What is the scale of the metrics we'll be getting? (What's our database strategy?)
Does there need to be a web client? (Do we need to design components?)
What should the user interaction be? (Do we want MVC on the frontend?)
How up to date should the metrics be?
Do we want to expose logs? (For maintainability)
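The answers to the retention and scale questions above usually feed a quick back-of-the-envelope estimate. Here is a minimal sketch of that arithmetic in Python. The `Workload` fields and every number in it are made-up assumptions for illustration, not figures from any real service.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60


@dataclass(frozen=True)
class Workload:
    """Hypothetical answers gathered from the interviewer."""
    events_per_day: int      # how many metric events arrive daily
    bytes_per_event: int     # average size of one stored event
    retention_days: int      # how long raw events must be kept
    replication_factor: int  # copies kept for redundancy


def average_writes_per_second(w: Workload) -> float:
    """Average write rate over a day; this says nothing about peak traffic."""
    return w.events_per_day / SECONDS_PER_DAY


def storage_bytes(w: Workload) -> int:
    """Raw bytes kept over the full retention window, including replicas."""
    return (w.events_per_day * w.bytes_per_event
            * w.retention_days * w.replication_factor)


if __name__ == "__main__":
    # Assumed numbers for an analytics service, purely for illustration.
    workload = Workload(
        events_per_day=100_000_000,
        bytes_per_event=200,
        retention_days=365,
        replication_factor=3,
    )
    print(f"avg writes/sec: {average_writes_per_second(workload):,.0f}")  # 1,157
    print(f"storage: {storage_bytes(workload) / 1e12:.1f} TB")            # 21.9 TB
```

Even a rough figure like this helps narrow the choice between in-memory and on-disk storage before any boxes get drawn on the whiteboard.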

As far as the actual "solution", interviewers are usually looking for some sort of diagram of all the moving parts of the system.

You usually have 45 minutes to an hour to get a working solution on a whiteboard.

How Do I Get Better?

The most obvious way to improve your systems design knowledge (and arguably the only real way to internalize the concepts) is to get more development experience working on complex, data-intensive applications using various solutions.

As you implement more designs, you'll naturally see what works in what scenario, and what doesn't. During the NoSQL hype, a ton of companies found that they did indeed prefer a relational database, and learned a painful lesson in the costs of switching back to one.

Additionally, certain themes carry over between seemingly separate aspects of software development. For example, patterns for multi-threaded concurrency in programming are surprisingly similar to those for multi-datacenter concurrency, and the way tasks are broken down and scheduled in an ETL process resembles how components are rendered in UI-rich applications.

Build Something For Yourself

It is crucial to actually do the work of building something-- it's in the doing that you make numerous realizations around the "why" of software design. It's especially a good learning experience when it's your own project because of the emotional investment.

To put it bluntly, you need to feel the pain of your site being down to understand why there needs to be a load balancer. You need to lose part of your data during an outage to get the importance of redundancy. You have to spend hours digging through multiple services and components in an effort at debugging to fully grasp why it's important to have good logging.

The only requirement is to work on projects that are composed of multiple moving pieces. A good start is any CRUD web application that provides some kind of tool or service to an end user. Some ideas and tips to maximize systems learning:

Try to use a data store like a modern relational database
Make sure to use a modern web framework with an ORM (and without)
Try using a frontend framework with a REST API
Use a job queue that does some kind of background processing
Add a cache layer that scales reading of data
Incorporate a load balancer into your application
Build a microservice that your application depends on (e.g. thumbnail service for photos)
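To make the "cache layer" idea from the list above a bit more concrete, here is a minimal sketch of a read-through cache with a time-to-live, written in plain Python. The class name `ReadThroughCache`, the `loader` callback, and the injectable `clock` are all hypothetical choices for illustration. In a real project you would more likely reach for a dedicated cache server, but the control flow is much the same.

```python
import time
from typing import Callable, Dict, Tuple


class ReadThroughCache:
    """Tiny in-process cache placed in front of a slow loader function."""

    def __init__(self, loader: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader      # e.g. a function that queries the database
        self._ttl = ttl_seconds    # how long a cached value stays fresh
        self._clock = clock        # injectable so tests can control time
        self._entries: Dict[str, Tuple[float, str]] = {}  # key -> (expiry, value)
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            # Fresh entry: serve it without touching the slow backend.
            self.hits += 1
            return entry[1]
        # Missing or expired: load from the backend and remember the result.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key after a write so readers do not see stale data."""
        self._entries.pop(key, None)


if __name__ == "__main__":
    def slow_lookup(user_id: str) -> str:
        time.sleep(0.1)  # stand-in for a database round trip
        return f"profile for {user_id}"

    cache = ReadThroughCache(slow_lookup, ttl_seconds=30)
    cache.get("42")  # miss: pays the round trip
    cache.get("42")  # hit: served from memory
    print(cache.hits, cache.misses)
```

Building even a toy version like this surfaces the real design questions quickly: how stale can a read be, and who is responsible for invalidating an entry after a write?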

Opportunities in Open Source

If you don't want to start from scratch, choose some piece of software that you're fascinated by, and see if there's an open source library with similar features. Then try to take it apart, understand what each piece does, and contribute something new to the repository and community.

GitHub's search engine is a great place to start. Some amazing open source projects that are worth learning from are listed below. Notice the diversity in projects-- this is especially important to get insight into parts of software you might not normally encounter.

https://github.com/skulpt/skulpt - Python to JS Compiler
https://github.com/uber/ludwig - Tensorflow Toolbox without code
https://github.com/freeCodeCamp/freeCodeCamp - Learning curriculum for JS
https://github.com/firefox-devtools/debugger - Firefox's debugger (written in React)
https://github.com/spring-projects/spring-boot - Create stand-alone Spring applications
https://github.com/elastic/elasticsearch - RESTful search engine

However, it can often be intimidating to learn just by jumping into complex projects. Additionally, certain people like to learn the theory while simultaneously building things. Combining the two approaches will accelerate your understanding of these concepts.

Read White Papers

This is something that I hadn't considered until a colleague of mine told me to read Jeff Dean and Sanjay Ghemawat's MapReduce: Simplified Data Processing on Large Clusters. You'll notice after reading a few paragraphs that it's surprisingly approachable. The entire paper reads like this:

The master pings every worker periodically. If no response is received from a worker in a certain amount of time, the master marks the worker as failed. Any map tasks completed by the worker are reset back to their initial idle state, and therefore become eligible for scheduling on other workers. Similarly, any map task or reduce task in progress on a failed worker is also reset to idle and becomes eligible for rescheduling.

It is especially crucial to home in on passages like the above, as it is these specific technical considerations that demonstrate engineering proficiency. Thinking through failure cases and scenarios is a mark of a good developer, as is finding elegant solutions to them. White papers are chock-full of these considerations, and usually include multiple citations that you can branch off of.
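One way to internalize a passage like the MapReduce excerpt is to sketch it in code. The following is a minimal, single-process Python model of the rule the quote describes. The `Master`, `Task`, and `State` names, and the explicit `now` timestamps, are my own assumptions for illustration and are not taken from the paper or any real implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional, Set


class State(Enum):
    IDLE = "idle"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"


@dataclass
class Task:
    task_id: str
    kind: str                     # "map" or "reduce"
    state: State = State.IDLE
    worker: Optional[str] = None  # worker currently (or last) assigned


class Master:
    """Toy model of the failure-handling rule quoted above."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout = timeout_seconds
        self.last_seen: Dict[str, float] = {}  # worker -> time of last ping reply
        self.failed: Set[str] = set()
        self.tasks: List[Task] = []

    def record_reply(self, worker: str, now: float) -> None:
        """Called whenever a worker answers a ping."""
        self.last_seen[worker] = now

    def check_workers(self, now: float) -> List[Task]:
        """Mark silent workers as failed and return the tasks reset to idle."""
        reset: List[Task] = []
        for worker, seen in self.last_seen.items():
            if worker in self.failed or now - seen <= self.timeout:
                continue
            self.failed.add(worker)
            for task in self.tasks:
                if task.worker != worker:
                    continue
                in_progress = task.state is State.IN_PROGRESS
                completed_map = (task.kind == "map"
                                 and task.state is State.COMPLETED)
                # Per the quote: completed map tasks and any in-progress task
                # on a failed worker become eligible for rescheduling.
                if in_progress or completed_map:
                    task.state = State.IDLE
                    task.worker = None
                    reset.append(task)
        return reset
```

Note that completed reduce tasks are left alone in this sketch. The quoted passage only mentions resetting completed map tasks, and working out why the two kinds are treated differently is exactly the sort of question that makes the rest of the paper worth reading.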

For more white papers, here's a decent list to get you started. The point isn't to skim over these papers today and forget about them. They are challenging to read, so read them over weeks and months. Revisit them when you get a chance, or as needed if you are working on a similar problem and want to learn how others have tackled it. It's like strength-training for developers.

Study Design Docs

Design docs are widely used in software engineering teams to communicate design decisions. They usually consist of an in-depth explanation of the problem being solved, the scope of the solution, the actual design decisions (including data models, high level architecture and schematics, libraries used, etc.), and (most importantly) a discussion of why the decisions were made.

The first place to look for good design docs is at your current company or university. These can be valuable resources for new engineers, especially during on-boarding, and particularly in regards to applications you will be maintaining. I often read the design docs of systems I am tasked with working on to get an overview of how they came to be, and why they were built in such a manner. A side benefit is that it also points me to the right person (the author) to talk to if I have further questions about the architecture.

Note: I've tried to read the company design docs for applications that I'm not directly involved in, and find it hard to retain anything or stay motivated while reading. Though it sounds good in theory, it's much more useful to read design docs on systems that you're actually interested in, as the material can otherwise be dry.

The AOSA (The Architecture of Open Source Applications) books are tremendous for this. From their page:

"Architects look at thousands of buildings during their training, and study critiques of those buildings written by masters. In contrast, most software developers only ever get to know a handful of large programs well—usually programs they wrote themselves—and never study the great programs of history. As a result, they repeat one another's mistakes rather than building on one another's successes."

They also talk about their motivations:

"Our goal is to change that. In these two books, the authors of four dozen open source applications explain how their software is structured, and why. What are each program's major components? How do they interact? And what did their builders learn during their development? In answering these questions, the contributors to these books provide unique insights into how they think. If you are a junior developer, and want to learn how your more experienced colleagues think, these books are the place to start. If you are an intermediate or senior developer, and want to see how your peers have solved hard design problems, these books can help you too."

I am especially fond of the Audacity chapter of the book. I've used Audacity for over a decade, and have never considered how intricate the design of the (very simple) UI was.

Note that the AOSA books are 100% free online at their site, but you can also purchase the physical versions on their site and Amazon.

Another great place to read up on "design docs" is the HighScalability blog. Though not design docs in the proper sense, these real-life architecture breakdowns are extremely useful in understanding modern web and cloud systems at high scale.

I found this blog to be among the most approachable resources, especially for people who are new to development but are tasked with working on high-traffic systems. It also sports a collection of really interesting tweets at any given time.

Amazing Resources For Further Enhancement

I'll also share a few resources that I really would have appreciated when I first started.

Firstly, this System Design Primer repository on GitHub is perfect for review right before an interview. It basically sums up all the things that interviewers are looking for in systems design interviews. If you can touch upon several of the major concepts, you'll get a pass. Thank you Donne Martin for creating this!

https://github.com/donnemartin/system-design-primer

Second, my favorite book in computer programming is Designing Data-Intensive Applications by Martin Kleppmann. It provides a gradual but deep overview of systems design, as you start with understanding the how/why of data models, and work your way into batch processing and distributed system considerations. It's a stellar read. The usual advice about choosing a good reading cadence applies.

In Conclusion

As with all things in technology, systems design can be tricky at first, but it's only due to lack of experience. A lot of the knowledge just comes by working-- so keep applying a solid deep work philosophy towards your career, study the resources above, and you'll be well prepared for any systems design interview that comes your way (and be a better engineer)!

One Pager Cheat Sheet

This tutorial aims to provide an introduction to systems design, its various concepts and how to answer systems design interview questions, and how these skills can help with career growth as a developer.
Systems design in software engineering is the process of making decisions regarding the elements of a complex application to ensure the speed, reliability and stability of its services.
Interviewers typically use systems design questions to gauge candidates' ability to identify and communicate requirements, and demonstrate their knowledge of technical concepts, trade-offs and bottlenecks, as well as thoughts on maintainability, by developing a diagram of all the moving parts of the system.
Get development experience working on complex, data-intensive applications and look for similarities between seemingly separate aspects of software development to improve upon systems design knowledge.
By actually building projects composed of multiple moving pieces, you can gain invaluable experience that will help you understand the "why" of software design.
Exploring open source projects is a great way to deeply understand code and can provide a range of unique opportunities to contribute.
Reading white papers and learning from the technical considerations in them is crucial for honing engineering proficiency, and is like strength-training for developers.
Design docs are an in-depth explanation of how and why a particular system was built, and can be extremely valuable resources for engineers, particularly during on-boarding. The AOSA books and the HighScalability blog are excellent sources for such documentation.
Designing Data-Intensive Applications by Martin Kleppmann and the GitHub repository System Design Primer are both great resources to check out when preparing for a systems design interview.
Utilizing a solid deep work philosophy and reading the given resources will help you to become well prepared for any systems design interview that comes your way.