Learn system design from Git - Multilayered Architecture
Git is a good example of multilayered architecture.
At the bottom of Git is its content-addressable key/value datastore, implemented on top of the file system. The abstractions in this layer are blob, tree, commit, and reference. Notice that the data layer has no concept of file, directory, version, or branch, which are typical in version control systems. One distinctive design choice in this layer is that the file name of a blob is stored in the tree that points to it, not in the blob itself. This decouples file names from file contents. Unix/Linux file systems use a similar approach: a directory entry associates a human-friendly name with an inode, and the inode describes the file on disk without storing its name.
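To make the data layer concrete, here is a minimal in-memory sketch of a content-addressable store. The `ObjectStore` class and its `put`/`get` methods are hypothetical names invented for illustration. The blob key follows the way Git hashes a blob (SHA-1 over a `blob <size>\0` header plus the content), but the tree encoding below is a simplified text format assumed for readability, not Git's real binary tree format.

```python
import hashlib


class ObjectStore:
    """Toy content-addressable key/value store (illustrative, not Git's code)."""

    def __init__(self):
        self._objects = {}  # key (hex digest) -> (kind, payload)

    def put(self, kind: str, payload: bytes) -> str:
        # The key is derived from the content itself: header + payload.
        data = f"{kind} {len(payload)}".encode() + b"\x00" + payload
        key = hashlib.sha1(data).hexdigest()
        self._objects[key] = (kind, payload)  # identical content -> same key
        return key

    def get(self, key: str):
        return self._objects[key]


store = ObjectStore()

# The blob holds only bytes; it knows nothing about its own file name.
blob_id = store.put("blob", b"hello world\n")

# The name lives in the tree. Assumed simplified encoding: "<name> <blob id>".
tree_a = store.put("tree", f"README.md {blob_id}\n".encode())
tree_b = store.put("tree", f"NOTES.md {blob_id}\n".encode())

# Storing the same content again yields the same key, so nothing is duplicated.
same_id = store.put("blob", b"hello world\n")
```

Two trees give the same content two different names, yet the store holds a single blob. Renaming a file therefore produces a new tree but never a new blob.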
Above the data layer sits the domain logic layer: a branch is a named mutable reference to a commit; a tag is a named immutable reference to a commit. HEAD normally refers to the branch the user has checked out, and through it to the latest commit on that branch. In Git, a commit is a checkpoint or snapshot of the entire project tracked by the repository, rather than a single mutation. This design makes complex operations in the domain logic layer, such as branching, merging, and rebasing, much easier than in other version control systems. If you have used version control systems such as ClearCase, Perforce, or Subversion, you will really appreciate the beauty of Git.
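The relationship between commits, branches, tags, and HEAD can be sketched in a few lines. Everything here (`Commit`, `Repo`, and the method names) is hypothetical and chosen for illustration. In particular, the commit id is computed from a Python `repr`, which is an assumption made to keep the sketch short and is not how Git serializes commits.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    tree: str        # id of a snapshot of the whole project
    parents: tuple   # ids of parent commits
    message: str


class Repo:
    """Toy domain layer: names that point at commits (illustrative only)."""

    def __init__(self):
        self.commits = {}   # commit id -> Commit
        self.branches = {}  # branch name -> commit id (mutable)
        self.tags = {}      # tag name -> commit id (never moved here)
        self.head = "main"  # HEAD names the checked-out branch

    def commit(self, tree: str, message: str) -> str:
        parent = self.branches.get(self.head)
        parents = (parent,) if parent else ()
        c = Commit(tree, parents, message)
        cid = hashlib.sha1(repr(c).encode()).hexdigest()  # simplified id
        self.commits[cid] = c
        self.branches[self.head] = cid  # the branch moves forward
        return cid

    def branch(self, name: str) -> None:
        # Creating a branch copies one pointer; no project data is copied.
        self.branches[name] = self.branches[self.head]

    def tag(self, name: str) -> None:
        if name in self.tags:
            raise ValueError(f"tag {name!r} already exists")
        self.tags[name] = self.branches[self.head]

    def checkout(self, name: str) -> None:
        if name not in self.branches:
            raise KeyError(name)
        self.head = name


repo = Repo()
c1 = repo.commit("tree-1", "initial snapshot")
repo.tag("v1.0")
repo.branch("feature")
repo.checkout("feature")
c2 = repo.commit("tree-2", "add feature")
```

After these calls, `main` and the tag `v1.0` still point at the first commit, while `feature` has advanced to the second. Because each commit is a full snapshot, branching costs one dictionary entry.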
Above the domain logic layer we have the user interface layer. The most popular user interface for Git is the CLI (command-line interface), but Git also has nice graphical user interfaces, such as the one from GitHub, and integrations with IDEs like Visual Studio. When Git was first released there were many complaints that its user interface was not friendly enough - it was designed by a legendary Linux kernel developer for other kernel developers, so friendliness was probably not the top concern. But thanks to the clean separation of the user interface layer from the domain logic layer, the Git community has been able to improve the user interface incrementally and dramatically over the years. Nowadays it is arguably the best-designed user interface for a version control system.
Git has a client-server tier that allows it to be the best distributed version control system on earth (I am obviously biased here). The client-server tier is elegantly abstracted as the “remote” construct. Different network protocols can be plugged into this tier: HTTP, SSH, etc. Git was designed to let developers work offline and then collaborate through eventual consistency. Each local Git repository has everything a developer needs to work independently. As a distributed computing system, Git prioritizes Availability and Partition Tolerance at the cost of real-time Consistency (see the CAP theorem). Git’s event sourcing architecture makes concurrent mutations easy to reason about and reconcile.
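The offline-then-reconcile idea can be illustrated with a small sketch. It assumes a heavily simplified model in which a repository is just a dictionary mapping a commit id to a tuple of parent ids. The functions `ancestors`, `fetch`, and `reconcile` are hypothetical helpers and do not correspond to Git's internal code or wire protocol.

```python
def ancestors(commits: dict, tip: str) -> set:
    """Return every commit reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        stack.extend(commits[cid])  # walk to the parents
    return seen


def fetch(local: dict, remote: dict) -> None:
    """Copy commits the local side lacks. Commits are immutable, so a
    plain union can never overwrite or conflict with existing history."""
    for cid, parents in remote.items():
        local.setdefault(cid, parents)


def reconcile(commits: dict, local_tip: str, remote_tip: str) -> str:
    """Classify how two branch tips relate once both histories are known."""
    if remote_tip in ancestors(commits, local_tip):
        return "up-to-date"     # local already contains the remote work
    if local_tip in ancestors(commits, remote_tip):
        return "fast-forward"   # local can simply move its pointer
    return "diverged"           # both sides added commits; a merge is needed


# Alice and Bob start from the same history, then work offline.
shared = {"a": (), "b": ("a",)}
alice = dict(shared)
alice["c"] = ("b",)   # Alice commits c on top of b
bob = dict(shared)
bob["d"] = ("b",)     # Bob commits d on top of b

fetch(alice, bob)     # Alice later syncs with Bob's repository
status = reconcile(alice, "c", "d")
```

Here `status` is `"diverged"`, because each side added a commit on top of `b`. Since history is an append-only graph of immutable commits, detecting this case is a graph walk, and no work from either side is lost while the two are disconnected.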
KISS - Keep It Simple, Stupid and Keep It Layered!