MIT Missing Semester 6: Version Control

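I felt my understanding of Git and version control was still too shallow. Watching Missing Semester gave me some inspiration and insights, so I am recording them here. This lecture mainly covers Git's underlying data model (tree, blob, object, reference) and some commonly used commands such as merge and branch.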

Git Data Model

Suppose we have the following file layout:

├── root
│   ├── file
│   └── foo
│       └── bar.txt
Git treats a directory as a tree and a file as a blob. A directory (tree) can contain sub-directories (sub-trees) and/or files (blobs).
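To make the tree/blob idea concrete, here is a minimal Python sketch of the layout above. It is not how Git stores things on disk: blobs are modelled as bytes, trees as plain dicts, and the file contents and the helper name list_paths are made up for illustration.

```python
# Blobs are modelled as bytes; trees as dicts mapping a name to a blob or a sub-tree.
# The file contents below are placeholders, not taken from any real repository.
root = {
    "file": b"contents of file",             # blob
    "foo": {                                 # sub-tree
        "bar.txt": b"contents of bar.txt",   # blob
    },
}

def list_paths(tree, prefix=""):
    """Return the path of every blob in the tree, visiting names in sorted order."""
    paths = []
    for name, entry in sorted(tree.items()):
        path = prefix + name
        if isinstance(entry, dict):
            # A dict is a sub-tree: recurse into it.
            paths.extend(list_paths(entry, path + "/"))
        else:
            # Anything else is a blob: record its path.
            paths.append(path)
    return paths

print(list_paths(root))  # ['file', 'foo/bar.txt']
```

The recursion mirrors the definition: a tree is just a mapping whose values are either blobs or further trees.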

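A naive version control system takes a snapshot of the current files and directory structure, together with metadata such as the author and a timestamp. Snapshots taken at different times then form a linear structure, much like a linked list.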

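Git instead models history as a directed acyclic graph (DAG): a commit can have multiple parents, which is what a merge produces. Below is a pseudocode description of the important concepts and operations in Git: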

type blob = array<byte>

type tree = map<string, tree | blob>

type commit = struct {
    parents: array<commit>,
    author: string,
    message: string,
    snapshot: tree
}

type object = tree | blob | commit

objects = map<string (hash number), object>

def store(o):
    key = SHA1(o)
    objects[key] = o

def load(key):
    return objects[key]
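As a runnable counterpart to the store and load pseudocode above, here is a minimal sketch of a content-addressed store in Python. It only handles blobs, keeps everything in an in-memory dict, and assumes Git's blob encoding (a "blob <size>" header, a NUL byte, then the content) when computing the SHA-1. Real Git also compresses objects and writes them to disk, which is left out here.

```python
import hashlib

# In-memory stand-in for Git's object database (illustration only).
objects = {}

def store(data):
    """Store a blob under the SHA-1 of its Git-style encoding and return the key."""
    # Git hashes a short header ("blob <size>\0") followed by the raw content.
    header = b"blob " + str(len(data)).encode() + b"\0"
    key = hashlib.sha1(header + data).hexdigest()
    objects[key] = data
    return key

def load(key):
    """Look an object up by its hash."""
    return objects[key]

key = store(b"hello\n")
assert load(key) == b"hello\n"
# Identical content always maps to the same key, so it is stored only once.
assert store(b"hello\n") == key
assert len(objects) == 1
```

Because the key is derived from the content, two snapshots that contain the same file share a single blob, which is why taking many snapshots stays cheap.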

Workflow

Some useful commands

git status: shows the current state of the working directory and the staging area.

git log --all --graph --oneline: You can view the DAG commit graph in the terminal.

git merge: merges the specified branch into the current branch, i.e. the one HEAD points to.

git checkout: updates the working directory to match the specified branch or commit in the repository.

git fetch: if the local repository has an online hosted remote, this downloads the remote's branches into the local repository, where they show up as remote-tracking branches named in the remote/branch form.

git pull: equivalent to git fetch followed by git merge. It fetches the remote branch and updates the corresponding current local branch.
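To tie the commands back to the data model, here is a toy Python sketch of branches as references and of a merge as a commit with two parents. The Repo class, its method names, and the branch names master and feature are all hypothetical. The sketch also always creates a merge commit, whereas real Git may fast-forward instead, and it ignores file contents entirely.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Commit:
    message: str
    parents: tuple = ()   # no parents: root commit; two parents: merge commit

class Repo:
    def __init__(self):
        self.refs = {}          # branch name -> Commit (a reference)
        self.head = "master"    # name of the current branch

    def commit(self, message):
        """Create a commit on the current branch and move the branch forward."""
        parent = self.refs.get(self.head)
        new = Commit(message, (parent,) if parent is not None else ())
        self.refs[self.head] = new
        return new

    def branch(self, name):
        """A new branch is just another reference to the current commit."""
        self.refs[name] = self.refs[self.head]

    def checkout(self, name):
        """Make `name` the current branch (working directory not modelled)."""
        self.head = name

    def merge(self, name):
        """Merge `name` into the current branch via a two-parent commit."""
        new = Commit("merge " + name, (self.refs[self.head], self.refs[name]))
        self.refs[self.head] = new
        return new

def ancestors(commit):
    """Return every commit reachable from `commit`, each listed once."""
    seen, stack = [], [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.append(c)
            stack.extend(c.parents)
    return seen

repo = Repo()
repo.commit("initial")
repo.branch("feature")
repo.checkout("feature")
repo.commit("add feature")
repo.checkout("master")
repo.commit("fix bug")
merged = repo.merge("feature")

print(len(merged.parents))     # 2
print(len(ancestors(merged)))  # 4
```

The walk in ancestors is roughly what git log --all --graph has to do: follow parent pointers through the DAG while visiting shared history (here the initial commit) only once.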