Git and the Anatomy of the Git Repository

Learn about Git and the Git repository.

Introduction

Git is an excellent example of the aphorism, “Necessity is the mother of invention.” Git was born of necessity at the hands of none other than the creator of Linux, Linus Torvalds. It was developed to replace BitKeeper, the source control management (SCM) system that Linus and his fellow hobbyists and researchers used to develop the Linux kernel. According to Linus, BitKeeper wasn’t perfect, but it was one of very few distributed SCM systems at the time. It also had one other attractive attribute: although it was proprietary software, it was free of charge for open-source developers. That changed in 2005 when the copyright holder revoked the free license.

It was apparent that no other vendor was likely to fill the space BitKeeper had left, so Linus set about creating its replacement. Following his penchant for naming his projects after himself, Linus self-deprecatingly named it Git, which is British slang for an unpleasant person. Linus had several goals for this new system. Above all, it had to apply patches (updates from other developers) much faster than BitKeeper could. The first benchmarks were run in April 2005, after less than a month of development. Linus achieved his goal: Git applied patches hundreds of times faster than BitKeeper.

Anatomy of a Git repository

A typical Git repository is a directory structure such as the one depicted below. The root directory contains all the files and subdirectories tracked by the repository. This is called the working directory, and it is where a developer edits code files (or files of any other type). The .git directory inside the root directory is the repository itself. It contains all the data structures that the Git program uses to store the repository’s history and metadata.
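As a rough illustration of the layout described above, the following Python sketch finds the root of a repository by walking up from a starting path until it meets a .git directory, then lists what that directory contains. The function names find_repo_root and list_git_dir are hypothetical and exist only for this example. The sketch assumes an ordinary repository in which .git is a directory; it does not handle setups where .git is a file, as in linked worktrees.

```python
from pathlib import Path
from typing import List, Optional


def find_repo_root(start: Path) -> Optional[Path]:
    """Return the nearest directory at or above `start` that contains `.git`.

    Returns None when no such directory exists.
    """
    current = start.resolve()
    # Check the starting directory first, then each ancestor in turn.
    for candidate in (current, *current.parents):
        if (candidate / ".git").is_dir():
            return candidate
    return None


def list_git_dir(root: Path) -> List[str]:
    """Return the sorted names of the top-level entries inside `.git`."""
    return sorted(entry.name for entry in (root / ".git").iterdir())


if __name__ == "__main__":
    root = find_repo_root(Path.cwd())
    if root is None:
        print("Not inside a Git repository")
    else:
        print("Working directory root:", root)
        print("Entries in .git:", list_git_dir(root))
```

Run inside a repository, this typically prints entries such as HEAD, config, objects, and refs. The exact contents vary with the Git version and the repository's history.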

Git works by recording snapshots of the state of files in the working directory. Git refers to these snapshots as commits. This strategy differs from that of many older, centralized systems such as CVS and Subversion, which record deltas, that is, the changes made to each file from one version to the next. Recording snapshots makes common operations fast, and because Git stores identical content only once and compresses its data, a repository usually takes up little space on disk. The Git command-line program and graphical programs compatible with Git are used to manage the commits recorded in a Git repository.
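To make the snapshot idea more concrete, here is a minimal Python sketch of how Git names the content of a file. It assumes the default SHA-1 object format, in which a file's ID is the SHA-1 hash of the header "blob <size>" followed by a NUL byte and the file's bytes. The snapshot function is a hypothetical simplification for illustration: real Git records a snapshot as tree and commit objects, not as a Python dictionary.

```python
import hashlib
from typing import Dict


def git_blob_id(content: bytes) -> str:
    """Compute the object ID Git assigns to file content (SHA-1 object format)."""
    header = f"blob {len(content)}\0".encode("ascii")
    return hashlib.sha1(header + content).hexdigest()


def snapshot(files: Dict[str, bytes]) -> Dict[str, str]:
    """Hypothetical, simplified snapshot: map each path to the ID of its content."""
    return {path: git_blob_id(data) for path, data in files.items()}


if __name__ == "__main__":
    first = snapshot({"README": b"hello\n", "main.py": b"print('v1')\n"})
    second = snapshot({"README": b"hello\n", "main.py": b"print('v2')\n"})

    # The unchanged file has the same ID in both snapshots, so its content
    # only needs to be stored once.
    print(first["README"] == second["README"])    # True
    # The edited file gets a new ID, so a new object is stored for it.
    print(first["main.py"] == second["main.py"])  # False
    # The ID of empty content is a well-known constant.
    print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because an ID depends only on content, two snapshots that share an unchanged file point at the same stored object. This is one reason full snapshots need not cost much more space than deltas.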
