Understanding Git is essential to open source development, but it can be intimidating to learn. Let this tutorial be your first step to getting to know Git.
Git has become the default way to store and transport code in the DevOps generation. Over 93% of developers report that Git is their primary version control system. Almost anyone who has used version control is familiar with git commit and git push. For most users, that's all they ever plan to do with Git, and they're comfortable with that. It just works for their needs.
However, from time to time, almost everyone encounters the need to do something a little more advanced, like git rebase or git cherry-pick, or to work in a detached HEAD state. This is where many devs start to get a bit nervous.
I'm here to tell you it is OK! Everyone who has used or will ever use Git will likely go through those same pangs of panic.
Git is awesome, but it's also intimidating to learn, and it can feel downright confusing sometimes. Git is unlike almost anything else in computer science. You typically learn it piecemeal, specifically in the context of other coding work. Most developers I have met have never formally studied Git beyond perhaps a quick tutorial.
Git is open source, meaning you have the freedom to examine the code and see how it works. It's written mainly in C which, for many devs and people learning computer science, can make it hard to understand. At the same time, the documentation uses terms like massage parameters and commit-ish. It can feel a little baffling. You might feel like Git was written for an advanced Linux professional. That is because it originally was.
Git started as a specific set of scripts for Linus Torvalds to use to manage patches.
Here's how he introduced what would become Git to the Linux kernel mailing list:
So I'm writing some scripts to try to track things a whole lot faster. Initial indications are that I should be able to do it almost as quickly as I can just apply the patch, but quite frankly, I'm at most half done, and if I hit a snag, maybe that's not true at all. Anyway, the reason I can do it quickly is that my scripts will _not_ be an SCM, they'll be a very specific "log Linus' state" kind of thing. That will make the linear patch merge a lot more time-efficient, and thus possible.
One of the first things I do when I get confused about how Git works is to imagine why and how Linus would apply it to managing patches. It has grown to handle a lot more than that and is indeed a full SCM, but remembering the first use case is helpful in understanding the "why" sometimes.
The core conceptual unit of work in Git is the commit. Commits are snapshots of the files being tracked within your project folder (where the .git folder lives).
(Git-scm.com, CC BY-NC-SA 3.0)
It's important to remember that Git stores compressed snapshots of the file system, not diffs. Any time you change a file, a whole new compressed version of that file is made and stored in that commit. It does this by creating a super compressed Binary Large Object (blob) out of the file, and then keeping track of it by generating a checksum made with the SHA hashing algorithm. The permanence of your Git history is one of the reasons it's vital never to store or hardcode sensitive data in your Git projects. Anyone who can clone the repo has full access to all the versions of the files.
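To make the checksum idea concrete, here is a minimal sketch of how a blob ID can be reproduced by hand. It assumes a repository using the default SHA-1 object format and a system that provides sha1sum (on macOS, shasum plays the same role). The blob_id function name is my own, not a Git command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reproduce Git's blob ID for a file without calling git.
# Git hashes the header "blob <size-in-bytes>" plus a NUL byte, followed by
# the raw file content.
blob_id() {
  local file=$1
  local size
  # Byte count of the file; tr strips the padding some wc versions add.
  size=$(wc -c < "$file" | tr -d ' ')
  { printf 'blob %s\0' "$size"; cat "$file"; } | sha1sum | cut -d ' ' -f 1
}

# Demo on a throwaway file containing "hello" and a newline.
demo_file=$(mktemp)
printf 'hello\n' > "$demo_file"
blob_id "$demo_file"
rm -f "$demo_file"
```

For that demo file the sketch prints ce013625030ba8dba906f756967f9e9ca394464a, which is the same ID git hash-object reports for the same content. Two files with identical content always get the same ID, which is part of why unchanged files cost nothing extra to store.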
Git is really efficient. If a file does not change between commits, Git does not make a whole new compressed version for storage. Instead, it just refers back to the previous commit. Every commit knows which commit came directly before it, called its parent (a merge commit has more than one). You can easily see this chain of commits when you run
(Git-scm.com, CC BY-NC-SA 3.0)
You have full control over these chains of commits and can do some pretty cool things with them. You can create, delete, merge and reorder them as you see fit. You can even effectively travel through time, explore and even write your commit histories. But it all relies on understanding how Git sees chains of commits, which are generally referred to as branches.
Branching lets you work with multiple chains of commits inside a project. Working with multiple branches (especially when you work with rebase) is where many users start to sweat. A common mental model most people have about what branches even are adds to the confusion.
When thinking about branching, most people conjure up images of swim lanes, diverging, and intersecting dots. While those models can be helpful when understanding specific branching strategies and workflows, thinking of Git as a series of numbered dots on a graph can muddy the waters when trying to think about how Git does what it does.
An alternative mental model I find helpful is to think of branches existing in a big spreadsheet. The first column is the parent commit ID, the second is the new commit ID, and then there are columns for metadata, including all the pointers.
(Dwayne McDaniel, CC BY-SA 4.0)
Pointers keep track of where you are, and which branch is which. Pointers are convenient human-readable references to specific commits. Any reference that leads back to a specific commit is referred to as commit-ish.
The special pointer used to name a branch always points to the newest commit on the chain. You can arbitrarily assign a pointer to any commit with git tag, which doesn't move. When you git checkout or git switch between branches, you're really telling Git that you want to change the point of reference of Git and move a very special pointer called HEAD in every
One of the best ways to understand what is going on with Git is to dig into the .git folder. If you've never opened this folder before, I highly encourage you to open it up and see what's there. If you're nervous that you might break something, clone an arbitrary open source project to play around with until you feel confident enough to look into your own project repos.
(Dwayne McDaniel, CC BY-SA 4.0)
One of the first things you notice is how small the files are.
Things are measured in terms of bytes or kilobytes, at the largest. Git is extremely efficient!
Here in the .git folder, you find the special file HEAD. It's a very small file, only a handful of bytes in size. If you open it up, you see it's only one line long.
ref: refs/heads/main
One of the phrases you will often encounter when reading about Git is "everything is local." From Git's perspective, wherever HEAD is pointing is "here." HEAD is the point of reference for how Git interacts with other branches, other refs, and other copies of itself.
In this example, the ref: is pointing at another pointer, a branch name pointer. Following that path, you can find a file that looks much like the spreadsheet from earlier. Git just takes the latest commit ID from the file and knows that is the commit HEAD is referring to.
If HEAD refers to a specific commit with no other pointer attached, then HEAD is referred to as "detached." Working in a detached HEAD state is completely safe, but it limits what you can do: any new commits you make there belong to no branch and are easy to lose track of once you move away. To get out of a detached HEAD state, just check out another pointer, like the branch name, for example, git checkout main.
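If you want to see the difference between an attached and a detached HEAD for yourself, a small sketch like the one below reads a HEAD file and reports which state it describes. The head_state function and the demo file are hypothetical, made up for illustration. In a real repo you would pass the path to .git/HEAD.

```shell
#!/usr/bin/env bash
# Hypothetical helper: describe what a HEAD file points at.
# Usage: head_state path/to/.git/HEAD
head_state() {
  local content
  content=$(cat "$1")
  case "$content" in
    # A symbolic ref: HEAD points at a branch name pointer.
    "ref: "*) echo "attached to ${content#ref: }" ;;
    # Anything else is a raw commit ID: HEAD is detached.
    *)        echo "detached at $content" ;;
  esac
}

# Demo with a stand-in file, so nothing real is touched.
demo_head=$(mktemp)
echo "ref: refs/heads/main" > "$demo_head"
head_state "$demo_head"
rm -f "$demo_head"
```

Git can answer the same question directly: git symbolic-ref -q HEAD prints the branch ref when HEAD is attached and exits with a non-zero status when it is detached.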
Another critical file for helping Git keep track of things is the .git/config file. This is just one of the places Git loads and stores configuration. You're likely already familiar with the --global level of Git config, stored in your home directory in your .gitconfig file. There are actually five places Git loads config, each overriding the previous configuration. The order Git loads config is:
--system: loads config specific to your operating system
--global: affects you as a user; user.email is stored here
--local: sets repo-specific info, like remotes and hooksPath
--worktree: the worktree is what is compressed into an individual commit
--blob: individual compressed files can have their own settings
You can see all config for a repo by running
git config --list --show-origin
You can leverage your local config to use multiple Git personas. Override the values from your global .gitconfig in a repo's local config. Leveraging the local config is particularly useful when dividing your time between work projects, personal repos, and any open source contributions.
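Here is a rough sketch of what a per-repo persona can look like. It builds a throwaway repo so it can run anywhere. The name and email address are placeholders, and in practice you would run the two git config --local commands inside your own project instead.

```shell
#!/usr/bin/env bash
# Throwaway repo so the demo does not touch a real project.
repo=$(mktemp -d)
git init -q "$repo"

# Placeholder identity for this repo only; it is written to "$repo/.git/config".
git -C "$repo" config --local user.name "Work Persona"
git -C "$repo" config --local user.email "me@work.example"

# Show the winning value for user.email and the file it came from.
git -C "$repo" config --show-origin --get user.email
```

Because the local level loads after the global one, commits made in that repo use the local identity, while every other repo keeps whatever your global .gitconfig says.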
Git has a powerful built-in automation platform called Git hooks. Git hooks allow you to execute scripts that run when certain events happen in Git. You can write the scripts in any scripting language that is available in your environment. There are 17 hooks available.
If you look in any repo's .git/hooks folder, you see a collection of .sample files. These are pre-written samples meant to get you started. Some of them contain some odd-looking code. Odd, perhaps, until you remember that these mainly were added to serve the original use case for Linux kernel work and were written by people living in a sea of Perl and Bash scripts. You can make the scripts do anything you want.
Here's an example of a hook I use for personal repos:
#!/usr/bin/env bash
curl https://icanhazdadjoke.com
echo ""
In this example, every time I run git commit, but before the commit message is committed to my Git history, Git executes the script. (Thanks to Edward Thomson's git-dad for the inspiration.)
Of course, you can do practical things, too, like checking for hardcoded secrets before making a commit. To read more about Git Hooks and to find many, many example scripts, check out Matthew Hudson's fantastic GitHooks.com site.
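As a sketch of the secrets idea, a pre-commit hook could look something like the following. The patterns and the function names are my own illustrative assumptions, nowhere near a complete secret scanner, so treat this as a starting point rather than real protection.

```shell
#!/usr/bin/env bash
# Sketch of a pre-commit hook that blocks commits containing likely secrets.
# The patterns below are illustrative assumptions, not an exhaustive list.

# Reads text on stdin; succeeds (exit 0) if any line looks like a secret.
looks_like_secret() {
  grep -E -q \
    -e 'AKIA[0-9A-Z]{16}' \
    -e '-----BEGIN [A-Z ]*PRIVATE KEY-----' \
    -e '(password|secret|api_key)[[:space:]]*=[[:space:]]*[^[:space:]]+'
}

# Checks only the lines being added by the staged changes.
check_staged_changes() {
  if git diff --cached --unified=0 | grep '^+' | looks_like_secret; then
    echo "Possible hardcoded secret in staged changes; commit aborted." >&2
    return 1
  fi
}

# In .git/hooks/pre-commit, finish the script with:
#   check_staged_changes || exit 1
```

Git aborts the commit whenever the pre-commit hook exits with a non-zero status, which is what makes this kind of check possible. Remember that the hook file has to be executable, and that hooks live in your local .git folder, so each clone needs its own copy.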
Now you have a better understanding of how Git sees the world and works behind the scenes, and you've seen how you can make it do your bidding with scripts. In my next article, I'll address some advanced tools and commands in Git.
Original article source at: https://opensource.com/
Git has become ubiquitous as the preferred version control system (VCS) used by developers. Using Git adds immense value, especially for engineering teams where several developers work together, since it becomes critical to have a system for integrating everyone's code reliably.
As with every powerful tool, especially one that involves collaboration with others, it is better to establish conventions to follow, lest we shoot ourselves in the foot.
At DeepSource, we’ve put together some guiding principles for our own team that make working with a VCS like Git easier. Here are 5 simple rules you can follow:
Oftentimes programmers working on something get sidetracked into doing too many things when working on one particular thing — like when you are trying to fix one particular bug and you spot another one, and you can’t resist the urge to fix that as well. And another one. Soon, it snowballs and you end up with so many changes all going together in one commit.
This is problematic, and it is better to keep commits as small and focused as possible for many reasons, including:
Additionally, it helps you mentally parse changes you’ve made using
(This suite of tools is 100% compatible with branches. If you think this is confusing, you can suggest a new name here.)
git-branchless is a suite of tools which enhances Git in several ways:
It makes Git easier to use, both for novices and for power users. Examples:
git undo: a general-purpose undo command. See the blog post git undo: We can do better.
git restack: to repair broken commit graphs.
It adds more flexibility for power users. Examples:
git sync: to rebase all local commit stacks and branches without having to check them out first.
git move: The ability to move subtrees rather than "sticks" while cleaning up old branches, not touching the working copy, etc.
git next/prev: to quickly jump between commits and branches in a commit stack.
git co -i/--interactive: to interactively select a commit to check out.
It provides faster operations for large repositories and monorepos, particularly at large tech companies. Examples:
git status or invalidate build artifacts).
git-branchless provides the fastest implementation of rebase among Git tools and UIs, for the above reasons.
Undo almost anything:
git reflog is a tool to view the previous position of a single reference (like
HEAD), which can be used to undo operations. But since it only tracks the position of a single reference, complicated operations like rebases can be tedious to reverse-engineer.
git undo operates at a higher level of abstraction: the entire state of your repository.
git reflog also fundamentally can't be used to undo some rare operations, such as certain branch creations, updates, and deletions. See the architecture document for more details.
git undo handle?
git undo relies on features in recent versions of Git to work properly. See the compatibility chart.
git undo can't undo the following. You can find the design document to handle some of these cases in issue #10.
git reset HEAD^.
git uncommit command instead. See issue #3.
git status shows a message like
path/to/file (both modified), so that you can resolve that specific conflict differently. This is tracked by issue #10 above.
git undo is not intended to handle changes to untracked files.
Comparison to other Git undo tools
gitjk: Requires a shell alias. Only undoes most recent command. Only handles some Git operations (e.g. doesn't handle rebases).
git-extras/git-undo: Only undoes commits at current
git-annex undo: Only undoes the most recent change to a given file or directory.
thefuck: Only undoes historical shell commands. Only handles some Git operations (e.g. doesn't handle rebases).
Visualize your commit history with the smartlog (
Why not `git log --graph`?
git log --graph only shows commits which have branches attached with them. If you prefer to work without branches, then
git log --graph won't work for you.
To support users who rewrite their commit graph extensively,
git sl also points out commits which have been abandoned and need to be repaired (descendants of commits marked with
rewritten as abcd1234). They can be automatically fixed up with
git restack, or manually handled.
Edit your commit graph without fear:
Why not `git rebase -i`?
Interactive rebasing with
git rebase -i is fully supported, but it has a couple of shortcomings:
git rebase -i can only repair linear series of commits, not trees. If you modify a commit with multiple children, then you have to be sure to rebase all of the other child commits appropriately.
When you use
git rebase -i with
git-branchless, you will be prompted to repair your commit graph if you abandon any commits.
Short version: run
cargo install --locked git-branchless, then run
git branchless init in your repository.
git-branchless is currently in alpha. Be prepared for breaking changes, as some of the workflows and architecture may change in the future. It's believed that there are no major bugs, but it has not yet been comprehensively battle-tested. You can see the known issues in the issue tracker.
git-branchless follows semantic versioning. New 0.x.y versions, and new major versions after reaching 1.0.0, may change the on-disk format in a backward-incompatible way.
To be notified about new versions, select Watch » Custom » Releases in GitHub's notifications menu at the top of the page. Or use GitPunch to deliver notifications by email.
There's a lot of promising tooling developing in this space. See Related tools for more information.
Thanks for your interest in contributing! If you'd like, I'm happy to set up a call to help you onboard.
For contributing documentation, see the Wiki style guide.
Contributors should abide by the Code of Conduct.
There is no doubt that Git plays a significant role in software development. It allows developers to work on the same code base at the same time. Still, developers struggle with code quality. Why? They fail to follow Git best practices. In this post, I will explain seven core best practices of Git and a Bonus Section.
Committing something to Git means that you have changed your code and want to save these changes as a new trusted version.
Version control systems will not limit you in how you commit your code.
But is it good? Not quite.
When you do an atomic commit, you’re committing only one change. It might be across multiple files, but it’s one single change.
Many developers make some changes, then commit, then push. And I have seen many repositories with unwanted files like dll, pdf, etc.
Before checking your code into the repository, you can ask yourself two questions:
You can use a .gitignore file to keep unwanted files out of the repository. If you work on more than one repo, a global .gitignore file (which you never add or push) is an easy way to cover them all. A .gitignore file adds clarity and helps you keep your code clean: you decide what can be committed, and Git automatically ignores unwanted files such as autogenerated .dll and .class files.
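A minimal sketch of setting this up might look like the following. The write_gitignore function, the patterns, and the global file path in the closing comment are all assumptions chosen for illustration, and the demo writes into a throwaway folder rather than a real project.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a starter .gitignore into the given project folder.
# Note that it overwrites any .gitignore already there.
write_gitignore() {
  cat > "$1/.gitignore" <<'EOF'
# Autogenerated build output
*.dll
*.class
# Stray documents
*.pdf
EOF
}

# Demo in a throwaway folder.
project=$(mktemp -d)
write_gitignore "$project"
cat "$project/.gitignore"

# For a per-user ignore file shared by all your repos, you could point Git at
# one (the path here is an assumption):
#   git config --global core.excludesFile "$HOME/.gitignore_global"
```

Keep in mind that .gitignore only affects untracked files. A file that has already been committed stays tracked until you remove it from the index.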
Recently, researchers from Google proposed an answer to a fundamental question in the machine learning community: What is being transferred in Transfer Learning? They described various tools and analyses to address this question.
The ability to transfer the domain knowledge of one machine in which it is trained on to another where the data is usually scarce is one of the desired capabilities for machines. Researchers around the globe have been using transfer learning in various deep learning applications, including object detection, image classification, medical imaging tasks, among others.
In this post we’ll learn about Git — What is git and its terminal command in 3 minutes. So let’s start with What is Git.
In 2005, Linus Torvalds created Git.
Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
Git is easy to learn and has a tiny footprint with lightning fast performance. It outclasses SCM tools like Subversion, CVS, Perforce, and ClearCase with features like cheap local branching, convenient staging areas, and multiple workflows.
$ brew install git
$ yum -y install git
$ git --version
$ git init
$ git init [project-name]
This creates a new local repository with the given name.
$ git clone [url]
$ git status
This command lists all new or modified files.
$ git add [file-name]
If you want to add more than one file
$ git add .
$ git commit -m "Message"
Messages are a good way to comment on changes.