AlgoDaily - Data Partitioning For Systems Design

Home > Systems Design and Architecture 🔥 > Database Concepts and Applications > Data Partitioning For Systems Design

Types of Data Partitioning

There are a few common ways databases and other data stores partition data across clusters:

Horizontal Partitioning (Sharding)

This refers to putting different rows of data in different tables or databases. For example, a users table with millions of rows could be split across multiple "shards" (tables) based on user IDs or regions. Reads and writes can scale as shards are added.

Sharding is commonly used by web-scale companies like Facebook to distribute user data across databases. The downside is that joins and transactions across shards become challenging.

Vertical Partitioning

Here, different columns of a table are stored separately. For example, frequently accessed columns like UserID can be put in one table while less frequently accessed columns like Address are put in a separate table.

This makes sense when some columns are accessed much more than others. However, database joins become expensive. Vertical partitioning is losing favor compared to sharding.

Directory-Based Partitioning

A lookup service or directory is used to divide data across nodes. For example, a hash function assigns a partition key to each data item which is then used to lookup the database node where it is stored.

This provides transparency to applications since the lookup abstraction hides the physical data placement. However, the lookup service can become a bottleneck.

Types of Data Partitioning

Horizontal Partitioning (Sharding)

Vertical Partitioning

Directory-Based Partitioning

Programming Categories

Popular Lessons