How AWS SDMs Dive Deep (2) - Datastore
An efficient way for new SDMs to learn how a service works is to study its datastore. Data lasts a lot longer than application logic. The datastore is always the performance and scalability bottleneck of your service, whether you like it or not. It is the heart of your service, and it is worth the time to dive deep. You can start with the following questions to explore your datastores.

Where is your Online Transaction Processing (OLTP) datastore? Is it relational (RDBMS) or NoSQL? How big is your datastore? How fast is it growing? How many transactions per second does it handle? What is your read-to-write ratio? Most services’ data is read heavy; how about yours?

What do your datastore schemas look like? What are the most important tables? Are some tables more static than others (sometimes we call them metadata)?

How is your datastore queried? Where are the main clients of the datastore? How many connections from clients can your datastore handle? Do you have a connection pool mechanism?

Does your service partition data? What is your partition key?

What is your data redundancy solution? How is your data replicated? Do you use leader-follower replication between datastore nodes? How is the leader selected? Do you replicate across Availability Zones and/or across Regions? In the CAP (Consistency, Availability and Partition Tolerance) theorem, which two does your service choose?

What is your data durability story? How is your data archived and restored? Do you have mechanisms to validate that the restore process actually works? How often do you back up your datastore? How much data loss can you tolerate? What is your data retention policy?

Do you have hot data that is read a lot more than other data? Do you have a caching solution? How do you keep the cached data up to date? Do you have a cold-start problem warming up the cache when you bootstrap a new host in your service?

Do you have an Online Analytical Processing (OLAP) datastore? How do you keep your OLTP data and OLAP data in sync?
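The sizing questions above (how big, how fast it grows, transactions per second, read-to-write ratio) can be turned into a quick back-of-the-envelope calculation. Here is a minimal Python sketch. The DatastoreStats type, its field names, and the sample numbers are all hypothetical, and it assumes steady compound monthly growth, which a real datastore will rarely follow exactly.

```python
import math
from dataclasses import dataclass


@dataclass
class DatastoreStats:
    """Hypothetical measurements; pull the real values from your own metrics."""
    size_gb: float              # current size on disk
    monthly_growth_rate: float  # 0.05 means 5% growth per month
    reads_per_second: float
    writes_per_second: float


def read_write_ratio(stats: DatastoreStats) -> float:
    """Reads per write; values well above 1 indicate a read-heavy workload."""
    return stats.reads_per_second / stats.writes_per_second


def months_until_full(stats: DatastoreStats, capacity_gb: float) -> int:
    """Whole months of compound growth after which size reaches or exceeds capacity_gb."""
    if stats.monthly_growth_rate <= 0:
        raise ValueError("this sketch assumes a positive growth rate")
    if stats.size_gb >= capacity_gb:
        return 0
    months = math.log(capacity_gb / stats.size_gb) / math.log(1 + stats.monthly_growth_rate)
    return math.ceil(months)


# Made-up sample numbers, for illustration only.
stats = DatastoreStats(size_gb=500, monthly_growth_rate=0.05,
                       reads_per_second=9000, writes_per_second=1000)
print(f"transactions per second: {stats.reads_per_second + stats.writes_per_second:.0f}")
print(f"read/write ratio: {read_write_ratio(stats):.1f}")
print(f"months until 1000 GB: {months_until_full(stats, capacity_gb=1000)}")
```

Numbers like these will not answer the questions for you, but they give you something concrete to bring to the service owners.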
Who are the consumers of your analytics datastore?

Do you have an Event Sourcing datastore to track data mutation history? Can you query it efficiently?

Where is your log data? How do you stream logs out of your service hosts? Do you have a data lake solution to turn logs into a gold mine for operational excellence? Are the logs sufficient to support your oncall’s troubleshooting?

What is the cost structure of your datastore solution?

As someone who makes a living on cryptography, I have to ask: is your data encrypted? How is it encrypted? How do you manage your encryption keys? (Hint: use AWS KMS to manage your encryption keys.)

...

So here is my second suggestion to new SDMs at AWS: if you want to understand a service, start with its datastores. In KMS, reading the book below is mandatory for all team members.