I was not particularly inspired by any SCM system until I dove into Git, created by Linus Torvalds, the creator of Linux. In this tutorial, I discuss what's unique about Git and I demonstrate how to set up a repository on GitHub, one of the main free Git hosting services. Then I explain how to make a local copy of the GitHub repository, make some changes locally, and push them back to GitHub. The second installment of this tutorial will build on this base, explain branching and merging, and discuss a workflow that I use, which might be of interest to you. As a side note, I learned much of what I know about Git from the book Pro Git, which is hosted free online. I recommend that you use the book to fill out the material presented here and as a reference for later work with Git.
Git has numerous benefits that make it my preferred distributed version control system (DVCS):
- When you create a new branch, Git doesn't copy all your files over. A branch will point to the original files and only track the changes (commits) specific to that branch. This makes it blazingly fast to create branches compared to other approaches, such as Subversion (which laboriously copies the files).
- Git lets you work on your own copy of a project, merging your commits into the central repository, often on GitHub.com, when you want your commits to be available to others. GitHub.com, by the way, will host your project for free as long as it's open source. (And cheaply, if it's not. Another alternative is Bitbucket, which allows unlimited private Git repositories.) This means you can reliably access your code from anywhere with an Internet connection. If you lose that Internet connection, you can continue to work locally and sync up your changes when you're able to reconnect.
- When you screw up, you can usually undo your changes. You might need to call in an expert in serious cases, but there's always hope. This is the best "key benefit" a version control system can have.
- Git also lets you keep your commit history very organized. If you have lots of little changes, it lets you easily rewrite history so you see it as one big change (via something called "rebasing"). You can add/remove files in each commit, and certainly change the descriptions of each.
- It's open source, fast, and very flexible, so it's widely adopted and well-supported.
- With Git, you can create "hooks," which enable actions to occur automatically when you work on your code. A common use case is to create a hook to check the description submitted with each commit to make sure it conforms to a particular format. Perhaps you have your bugs described in a bug tracking system, and each bug has an ID #. Git can ensure each message has an entry for
- Another under-appreciated feature is how Git tracks files. It uses the SHA-1 algorithm to take the contents of files and produce a large hexadecimal number (hash code). The same file will always produce the same hash code. This way, if you move a file to a different folder, Git can detect that the file moved, and not think that you deleted one file and added another. This allows Git to avoid keeping two copies of the same file.
- While Git is not necessarily the most intuitive version control system out there, once you get used to it, you're able to browse through its internal directories and it makes complete sense. Wondering where the file with the hash code "d482acb1302c49af36d5dabe0bccea04546496f7" is? Check out this file: "<your project>/.git/objects/d4/82acb1302c49af36d5dabe0bccea04546496f7". There are also lots of lower-level commands that let you build the operations you want, in case, for instance, Git's "merge" command doesn't work how you'd like it to.
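To make the hash-to-path relationship above concrete, here is a minimal sketch for OS X or Linux. It assumes the "sha1sum" utility is available (on some systems the equivalent is "shasum"), and the function names blob_hash and object_path are made up for illustration. It relies on the fact that Git hashes a file's contents prefixed with a short header ("blob", the size in bytes, and a NUL byte), and that an object stored individually lives in a folder named after the first two characters of its hash. Git may also pack objects together, in which case no such individual file exists.

```shell
# Hypothetical helper: compute the SHA-1 that Git assigns to a file's contents.
# Git hashes the header "blob <size>\0" followed by the contents themselves.
blob_hash() {
  content_file=$1
  size=$(wc -c < "$content_file" | tr -d ' ')
  { printf 'blob %s\000' "$size"; cat "$content_file"; } | sha1sum | cut -d ' ' -f 1
}

# Hypothetical helper: map a 40-character hash to the path of its
# individually stored object, relative to the project folder.
object_path() {
  hash=$1
  prefix=$(printf '%s' "$hash" | cut -c 1-2)   # first two characters: folder
  rest=$(printf '%s' "$hash" | cut -c 3-)      # remaining 38: file name
  printf '.git/objects/%s/%s\n' "$prefix" "$rest"
}

object_path d482acb1302c49af36d5dabe0bccea04546496f7
# prints .git/objects/d4/82acb1302c49af36d5dabe0bccea04546496f7
```

Because the hash depends only on the contents, running blob_hash on two files with identical contents gives the same result regardless of their names or folders.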
Let's jump in. Say you're about to start a new project, in whatever programming language, and you want to use version control. For this demonstration, I'm going to create a silly sample application in Scala that's very easy to understand. I'll assume you're familiar with your operating system's command-line interface, and that you're able to write something in the language of your choice.
GitHub is one of the go-to places to get your code hosted for free and it's what I'll use here. (Bitbucket, Google Code, and SourceForge are some of the other free repository hosts that support Git.) All these hosts give you a home for your code that you can access from anywhere. Initial steps:
- Go to http://GitHub.com and "Sign up for GitHub"
- You'll need Git. Follow this step-by-step installation process
- Review how to create a new repository
- Finally, you're going to want to get used to viewing files that start with a ".". These files are hidden by default, so at the command line, when you're listing the contents of a directory, you need to include an "-a" option. That's "ls -a" in OS X and Linux, and "dir /a" for Windows. In your folder options, you can turn on "Show hidden files and folders" as well.
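To see the difference hidden files make, here is a small sketch for OS X or Linux. It works in a throwaway directory (demo_dir is a name chosen for illustration) so nothing in your own folders is touched.

```shell
# Create a scratch directory holding one hidden and one ordinary file.
demo_dir=$(mktemp -d)
touch "$demo_dir/.gitignore" "$demo_dir/README"

# A plain listing skips names that start with a dot.
ls "$demo_dir"

# With -a, the listing also includes ".", "..", and ".gitignore".
ls -a "$demo_dir"
```

The first listing shows only README; the second adds the dot entries, which is how you'll spot files such as .gitignore later on.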
Once you get this far, there's nothing stopping you (outside of setting aside some time to explore what Git has to offer). Let's look at some of the typical actions.
Clone a Repository
Cloning a repository lets you grab the source code from an existing project (yours or someone else's) that you have access to. If the project isn't yours, you can't change it directly; you first "fork" it, which means creating your own copy of it under your own account, after which you can modify it to your heart's content. I keep all of my projects locally (on my computer) in a "projects" folder in my home directory, "/Users/sdanzig/projects", so I'm going to use "projects" for this demo.
First, I fork my repository…
I created a sample project, called potayto, on GitHub, as you should now know how to do. Let's get this project onto your hard drive so you can add comments to my source code for me. First, log into your GitHub account, then go to my repository at https://GitHub.com/sdanzig/potayto and click Fork:
Figure 1: Forking (cloning) a repository.
Then select your user account on GitHub and copy it there. When this is complete, it's as though it were your own repository and you can actually make changes to the code on GitHub. Now, let's copy the repository onto your local hard drive, so we can both edit and compile the code there.
Figure 2: Copying a repository.
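Copying your fork to your hard drive is done with Git's "clone" command. The following is a minimal sketch rather than a transcript of the figure: the helper names fork_url and clone_fork are made up for illustration, the "projects" folder follows the convention used in this tutorial, and you have to supply your own GitHub user name.

```shell
# Hypothetical helper: build the HTTPS URL of your fork of potayto.
# $1 is your GitHub user name (an assumption; substitute your own).
fork_url() {
  printf 'https://github.com/%s/potayto.git\n' "$1"
}

# Hypothetical helper: clone the fork into ~/projects/potayto.
# This needs Git installed and a working Internet connection.
clone_fork() {
  mkdir -p "$HOME/projects" &&
    cd "$HOME/projects" &&
    git clone "$(fork_url "$1")"
}
```

Calling clone_fork with your user name as its only argument should leave a potayto folder, complete with its history, inside your projects folder.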
There are a few key things to know about what Git is doing with your files. Type "cd potayto". There are useful things to see here when you list the contents of the potayto folder, being careful to show the hidden files and folders:
Figure 3: Examining the contents of a Git repository.
The src folder contains the source code, and its structure conforms to the Maven standard directory structure. You'll also see a .git folder, which contains a complete record of all the changes that were made to the potayto project, as well as a .gitignore text file. We're not going to dive into the contents of .git in this tutorial, but it's easier to understand than you might think. If you're curious, please refer to the free online book.