One Pager Cheat Sheet

  • This tutorial covers the two types of storage layer (Ephemeral and Persistent) and their respective use cases, so that you can address the challenges of storage layer design and choose suitable solutions for various workloads.
  • Designing a storage layer means addressing four concerns: business continuity, performance, access to large data volumes, and correct operation.
  • The four types of data store (In Memory Key Value, Relational, Large Scale (NoSQL) Key Value, and Distributed File System) have different performance characteristics and scale to different capacities, so each one suits a different kind of data management problem.
  • In Memory Key Value stores provide the fastest data lookups because they serve data from memory, which keeps latency low, and because they can leverage parallelism.
  • Transactions are atomic, consistent, isolated, and durable (ACID) operations, which ensure that data stays in a consistent state, including across restarts.
  • The performance of a data store is usually measured by benchmarking its latency (how long it takes to process a request) and its throughput (how many requests it can process per unit of time), though it is difficult to achieve both low latency and high throughput at the same time.
  • The most common choice is a relational data store, whose strong ACID guarantees, SQL support, and performance address the correctness, performance, and data access challenges.
  • Latency and throughput are not necessarily inversely proportional: throughput can still be high despite high latency if the number of operations in flight is high enough.
  • Optimize system performance for a particular workload by running benchmarks and adjusting settings according to the product's documentation.
  • A well-implemented Ephemeral Storage Layer can dramatically improve performance by serving responses from a cache server, which reduces latency and increases throughput.
  • Designing a persistent storage layer requires weighing the solutions available for serving large volumes of data, such as sharding, distributed file systems, and large scale NoSQL data stores, against the requirements for ACID guarantees, scalability, and low latency.
  • Key value stores are sharded horizontally: different subsets of the data are stored and managed separately, which makes data retrieval more efficient.
  • Regular backups, or replication across multiple data centers, support business continuity by minimizing downtime in the event of a disaster.
  • Geographical distribution is used to solve latency issues, and it can also form a business continuity strategy based on disjoint data sets.
  • Data replication provides a way to restore data quickly in case of system or hardware failures, reducing the risk of data loss by creating multiple copies of the data stored in different locations.
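
The point above about latency and throughput can be made concrete with a little arithmetic. The sketch below assumes a simplified steady-state model (Little's law: throughput equals the number of operations in flight divided by the latency of each one). The function name and the figures are illustrative and do not come from the tutorial.

```python
def throughput_per_second(latency_ms: int, operations_in_flight: int) -> float:
    """Estimate steady-state throughput with Little's law.

    Assumption: every operation takes exactly `latency_ms` milliseconds and
    `operations_in_flight` operations are always being processed at once.
    """
    if latency_ms <= 0 or operations_in_flight <= 0:
        raise ValueError("latency and concurrency must be positive")
    return operations_in_flight * 1000 / latency_ms


# A fast store that handles one request at a time: 1 ms latency.
low_latency_serial = throughput_per_second(latency_ms=1, operations_in_flight=1)

# A slower store that handles many requests at once: 100 ms latency.
high_latency_parallel = throughput_per_second(latency_ms=100, operations_in_flight=500)

print(low_latency_serial)     # 1000.0 requests per second
print(high_latency_parallel)  # 5000.0 requests per second
```

In this model the second store is 100 times slower per request yet delivers five times the throughput, because it keeps 500 operations in flight.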
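
The Ephemeral Storage Layer point can be illustrated with a cache-aside lookup: check the cache first and fall back to the persistent store only on a miss. This is a minimal sketch, not the tutorial's implementation. The `CacheAside` class, the `slow_store_lookup` function, and the 60 second expiry are all hypothetical, and an in-process dictionary stands in for a real cache server.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Cache-aside lookup: try the cache first, then the persistent store."""

    def __init__(
        self,
        load_from_store: Callable[[str], str],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load_from_store
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expiry time, cached value)
        self._entries: Dict[str, Tuple[float, str]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            # Fresh entry: answer from memory without touching the store.
            self.hits += 1
            return entry[1]
        # Missing or expired: read the store and remember the result.
        self.misses += 1
        value = self._load(key)
        self._entries[key] = (now + self._ttl, value)
        return value


store_reads = []  # records every read that reaches the (hypothetical) store


def slow_store_lookup(key: str) -> str:
    store_reads.append(key)
    return f"value-for-{key}"


cache = CacheAside(slow_store_lookup, ttl_seconds=60.0)
first = cache.get("user:42")   # miss: goes to the store
second = cache.get("user:42")  # hit: served from the cache
print(len(store_reads), cache.hits, cache.misses)
```

Only the first read reaches the store. Repeated reads of the same key are answered from memory until the entry expires, which is where the latency and throughput gains come from.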
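
Horizontal sharding of a key value store can be sketched as follows. The tutorial summary does not say how keys are assigned to shards, so this example assumes hash-based partitioning. The `ShardedKeyValueStore` class is hypothetical, and plain dictionaries stand in for separate shard servers.

```python
import hashlib
from typing import Dict, List, Optional


class ShardedKeyValueStore:
    """Key value store split horizontally: each shard holds a subset of keys."""

    def __init__(self, shard_count: int) -> None:
        if shard_count < 1:
            raise ValueError("shard_count must be at least 1")
        self._shards: List[Dict[str, str]] = [{} for _ in range(shard_count)]

    def shard_index(self, key: str) -> int:
        # Use a stable hash so a key maps to the same shard on every run
        # (Python's built-in hash() is randomized per process for strings).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self._shards)

    def put(self, key: str, value: str) -> None:
        self._shards[self.shard_index(key)][key] = value

    def get(self, key: str) -> Optional[str]:
        # Only the shard that owns the key is consulted.
        return self._shards[self.shard_index(key)].get(key)

    def shard_sizes(self) -> List[int]:
        return [len(shard) for shard in self._shards]


store = ShardedKeyValueStore(shard_count=4)
for i in range(100):
    store.put(f"user:{i}", f"profile-{i}")

print(store.get("user:7"))   # profile-7
print(store.shard_sizes())   # four counts that add up to 100
```

Each lookup touches a single shard, so shards can be stored and managed separately. A simple modulo scheme like this one moves most keys when the shard count changes, which is one reason production systems often prefer other partitioning schemes.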
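
The replication point can be sketched in the same spirit: every write goes to several copies in different locations, so a read can still succeed after one location fails. This is a simplified illustration under stated assumptions. The `Replica` and `ReplicatedStore` classes and the location names are hypothetical, writes are synchronous, and there is no catch-up for a replica that was unavailable during a write.

```python
from typing import Dict, List, Optional


class Replica:
    """One copy of the data, held in a given location."""

    def __init__(self, location: str) -> None:
        self.location = location
        self.available = True
        self._data: Dict[str, str] = {}

    def write(self, key: str, value: str) -> None:
        self._data[key] = value

    def read(self, key: str) -> Optional[str]:
        return self._data.get(key)


class ReplicatedStore:
    """Writes go to every available replica; reads use the first available one."""

    def __init__(self, replicas: List[Replica]) -> None:
        if not replicas:
            raise ValueError("at least one replica is required")
        self._replicas = replicas

    def put(self, key: str, value: str) -> None:
        for replica in self._replicas:
            if replica.available:
                replica.write(key, value)

    def get(self, key: str) -> Optional[str]:
        for replica in self._replicas:
            if replica.available:
                return replica.read(key)
        raise RuntimeError("no replica is available")


primary = Replica("datacenter-a")
secondary = Replica("datacenter-b")
replicated = ReplicatedStore([primary, secondary])

replicated.put("order:1", "paid")
primary.available = False                 # simulate losing datacenter-a
recovered = replicated.get("order:1")     # served by datacenter-b
print(recovered)                          # paid
```

The value survives the loss of the first location because a second copy exists elsewhere. A real system would also need to resynchronize a replica once it comes back.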