The real workhorse enabling core Git operations

As a data scientist or data engineer, working with Git is a breeze once you know the core elements that a Git operation such as git-clone or git-push uses under the hood. Although core Git operations have received plenty of hands-on attention from fellow enthusiasts, the focus here is on a core entity called Git _references_, which enable several of those operations.

To understand Git _references_ and their importance, let us consider the remote repository below, which has a fairly complex structure

Illustrates a remote repository structure with five branches including the main branch. Image by author, made using diagrams.

and clone it as follows

$ git clone https://github.com/<git_username>/my-repo.git

As expected, the cloning operation results in a local repository with a default local main branch and a remote connection named origin, as shown below

Illustrates the cloning operation of a remote repository with five branches including the main branch. Image by author, made using diagrams.
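To check this yourself without touching GitHub, here is a minimal sketch that reproduces the situation locally. It assumes a throwaway repository in a temporary directory as a stand-in for the remote, and the names branch_2 to branch_4 are made up for illustration (only branch_1 appears in this article).

```shell
#!/bin/sh
set -eu

# Assumption: a throwaway local repository plays the role of the GitHub remote.
work="$(mktemp -d)"
git init --quiet "$work/seed"
cd "$work/seed"
git config user.name "Example User"
git config user.email "example@example.com"

# Make sure the first commit lands on a branch called main,
# whatever the machine's default branch name is.
git symbolic-ref HEAD refs/heads/main
echo "hello" > README.md
git add README.md
git commit --quiet -m "Initial commit"

# Four extra branches, giving five in total (hypothetical names).
for b in branch_1 branch_2 branch_3 branch_4; do
    git branch "$b"
done

# Clone the stand-in remote, as with the git clone command above.
git clone --quiet "$work/seed" "$work/my-repo"
cd "$work/my-repo"

# List the local branches and the configured remote connections.
local_branches="$(git branch --format='%(refname:short)')"
remotes="$(git remote)"
echo "Local branches: $local_branches"   # prints "Local branches: main"
echo "Remotes: $remotes"                 # prints "Remotes: origin"
```

Running it should show a single local branch, main, and a single remote connection, origin, which matches the illustration above.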

However, one thing to notice in the cloning illustration above is that, by default, the local repository my-repo on your machine contains only a main branch. On closer inspection, the local directory my-repo also contains only a copy of the files present in the remote main branch. Remote branches like _branch_1_ and their file contents do not show up anywhere in that directory. This leads to the question

#git #programming #data-science

Demystifying Git References Aka Refs