A few introductory words ...
My Notions on the Matter

For those who want to know what SCM systems are out there and how they compare: there is a list of SCM systems as well as a comparison available. There is also a comparison between SVN and GIT available.

As of now (August 2008), I mainly use GIT (= a random three-letter combination) to manage code and to do all kinds of work related to software/data on my computer systems. Before that, my main code revision and management system for about two years had been SVN (Subversion). And even before that I used a greater variety of SCM (Software Configuration Management) systems including CVS (Concurrent Versions System), GNU Arch and Darcs.

The situation now is that I use mainly GIT and SVN and a little bit of CVS and GNU Arch every now and then. GIT is used for my own projects and those I actively contribute to. I also contribute to projects using SVN, but for the most part my SVN usage is limited to getting the SVN HEAD from the remote repository into my local working copy. CVS and GNU Arch are used only for updating local working copies; I do not use them for active development anymore.

Roughly speaking, the reason why I ended up actively using only two SCM systems (one, to be more precise) is that, for some time now, I have been trying to consolidate pretty much everything. Also, I abandoned every redundancy I could identify because I do not need/want two or more things providing the same functionality. The gain from doing so is that one frees up a lot of time for other things and gets to know the things that are left in more detail, and is thus able to work more efficiently.

Why GIT?

It is important to note that GIT is very different from most SCM systems that we may be familiar with. Subversion, CVS, Perforce, Mercurial and the like all use Delta Storage systems: they store the differences between one commit and the next.
GIT does not do this — it stores a snapshot of what all the data in our project looks like in the tree structure each time we commit. This is a very important concept to understand when using GIT. Some of the reasons why I finally favor GIT over all other SCM systems can be told in brief:
There are other reasons as well, but those are the main reasons why I find GIT the best solution for me and what I do on a daily basis. I even import code from other SCMs into GIT, work on the code and, when I am done, push the code from GIT back to whatever upstream SCM system a particular project uses. I cover this further down ...

GIT Glossary and Principles

I intentionally put this here rather than at the end of the page. It is best to skim over it once, then read the remainder of the page, and finally read it a second time in depth.

Glossary
Principles

Aside from all the terms used with GIT, it is important to understand the core principles of how GIT works in order to use it successfully.

The nature of a DSCM System

Of course, there are fundamental differences in how centralized and decentralized SCM systems are built and work. This subsection names two major differences and, from my point of view, advantages of DSCM systems.

Everything is Local

This is basically true of all the distributed SCM systems, but in my experience even more so with GIT. There is very little outside of That may not sound like a big deal, but many of us often work offline. Being able to branch, merge, commit and browse the history of a project while on the plane, train or riding with the AEP (Autonomous Expedition Platform) vehicle through the Outback while your buddy is driving, is a big plus that comes with a DSCM system such as GIT.
Even in Mercurial, common commands like This means that it is very easy to have copies of not only our branches, but also of everyone else's branches that we are working with in our GIT repository without having to mess up their stuff.

No Single Point of Failure

I already mentioned that above, but it is actually so great that I am talking about it again. One of the coolest features of any of the distributed SCMs, GIT included, is that it is distributed. This means that instead of doing a checkout of the current tip of the source code, we do a clone of the entire repository. Even if we are using a centralized workflow, every user has what is essentially a full backup of the main repository, each of which could be pushed up to replace the main repository in the event of a hardware failure or software-triggered corruption. There is basically no single point of failure with GIT unless there is only a single point, e.g. a repository that has not been mirrored/cloned by someone else.

Repository Layout

It is quite interesting and helpful to grasp the big picture about GIT aside from the daily usage of GIT. Understanding the layout of a GIT repository and its meaning and implications for daily usage can be very helpful in avoiding misuse of GIT that may badly affect one's work.

Tree vs. Commit

A tree is a particular object type. It represents a particular state of a working directory, whereas a commit represents that state in time and explains how we got there. We create a commit object by giving it the tree that describes the state at the time of the commit, and a list of parent commits (those whose states led up to the current one).

Working Tree vs. Index vs. HEAD

When we have a piece of code/data under GIT's control and make changes to it (e.g. editing some text file, removing/adding/altering a bitmap, etc.), the journey those changes take is in essence like this:
I will go into more detail later when we talk about GIT's workflow. Anyway, there are a number of commands which are useful for keeping track of what we are about to commit:
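The difference between working tree, index and HEAD can be made visible with git diff. What follows is a minimal sketch, assuming a throwaway repository created with mktemp; the file name notes.txt and the identity values are hypothetical and only serve the illustration.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository (hypothetical location)
cd "$repo"
git init -q
git config user.name "Demo User"       # hypothetical identity, local to this repository
git config user.email "demo@example.invalid"

echo "first" > notes.txt
git add notes.txt
git commit -q -m "initial commit"

echo "second" >> notes.txt             # the change exists only in the working tree
wt_vs_index=$(git diff --name-only)             # working tree vs. index

git add notes.txt                      # the change moves into the index
index_vs_head=$(git diff --cached --name-only)  # index vs. HEAD
wt_vs_head=$(git diff --name-only HEAD)         # working tree vs. HEAD

git commit -q -m "second commit"       # the change arrives in HEAD
after_commit=$(git diff --name-only HEAD)       # nothing left to commit

echo "working tree vs. index before git add: $wt_vs_index"
echo "index vs. HEAD after git add:          $index_vs_head"
echo "working tree vs. HEAD after git add:   $wt_vs_head"
echo "working tree vs. HEAD after commit:    ${after_commit:-(no differences)}"
```

Plain git diff compares the working tree with the index, git diff --cached compares the index with HEAD, and git diff HEAD compares the working tree with HEAD.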
Now, the alert reader might already have asked himself: we can
commit changes all the way from the working tree, over the index,
right into
However, how do we get changes from the working tree into the index
without committing them all the way through to
The index holds a snapshot of the content of the working tree, and it
is this snapshot that is taken as the contents of the next commit.
Thus after making any changes to the working directory, and before
running Of course, as the best practices example outlined,
Detailed Look at the Index

Many a time the subject comes up on the mailing list or IRC (Internet Relay Chat) channel: "Why keep the index?" or "The index is a performance trick?". The truth is, the index is a staging area. Every SCM system has it, but GIT explicitly exposes it to us.

A staging area

For those familiar with CVS, SVN or similar archaic stuff, what
happens when we do With the second command, we can finally commit. But what happens to
the other modified files? Are they committed? The answer is no, the
last revision is updated with the new version of So really, it is neither a new concept, nor an intimidating one. The
index comes naturally to us when we issue And here comes the difference to CVS: once we put something into the
index, a simple One special case exists though. Let us assume we issue This operation — save the current staging area, construct a new one, commit it, and then restore the staging area — seems a bit illogical, since we would usually expect only one staging area. However, in practice it happens quite often that we forget to commit something very important. So, all we have to do is to just edit the respective files, commit just these, and continue with what we were doing before. In essence: The index is a staging area for the next commit, but for
convenience, passing filenames explicitly to

Merges

Normally, a GIT user will rarely be exposed to the index unless he is committing a revision. But there is one notable exception: merging. When we merge the work of others, sometimes conflicts happen. These are put in the index. Strictly speaking, the whole merge is done inside the index by inserting the current version, the version of the branch-to-be-merged, and the merge base into the index, and merging them using a three-way-diff. If there are no conflicts, these three entries are collapsed into a single entry. Otherwise the three entries stay there, with the common ancestor being replaced by the result of the merge. Again, GIT is intelligent about what to show us upon a Now we know what the index is good for. As mentioned above, the index is neither a new concept nor an intimidating one. The index is our friend and companion!

File stages

Assuming two branches contain the same file i.e. Recall that the commit which will be committed after we resolve this conflict will have two parents instead of the usual one:
During the merge, the index holds three versions of each file. Each of these three file stages represents a different version of the file:
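The three stages can be inspected directly while a merge is stopped on a conflict. The following is a minimal sketch, assuming a throwaway repository; the file name hello.txt, the branch name topic and the identity values are hypothetical. Stage 1 holds the common ancestor, stage 2 the version from our current branch and stage 3 the version from the branch being merged.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository (hypothetical location)
cd "$repo"
git init -q
git config user.name "Demo User"       # hypothetical identity
git config user.email "demo@example.invalid"

echo "base" > hello.txt
git add hello.txt
git commit -q -m "common ancestor"

git checkout -q -b topic               # hypothetical branch name
echo "theirs" > hello.txt
git commit -q -am "change on topic"

git checkout -q -                      # back to the branch we started on
echo "ours" > hello.txt
git commit -q -am "change on the original branch"

# The merge stops with a conflict; "|| true" keeps the script going.
git merge topic >/dev/null 2>&1 || true

# Unmerged index entries: mode, blob ID, stage number, path
git ls-files -u

stage1=$(git show :1:hello.txt)        # common ancestor
stage2=$(git show :2:hello.txt)        # our branch
stage3=$(git show :3:hello.txt)        # the branch being merged
unmerged=$(git ls-files -u | wc -l | tr -d ' ')
echo "stage 1: $stage1, stage 2: $stage2, stage 3: $stage3"
```

Once the conflict is resolved in the working tree and the file is added again, the three entries collapse into a single stage 0 entry.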
Each time we resolve the conflicts in a file and update the index

Installing and Configuring GIT

This section explains how to install GIT and how to configure it afterwards.

Installing GIT

Installing GIT is trivial. Just issue

wks:/home/sa# apt-get install git-core
Reading package lists... Done
Building dependency tree
Reading state information... Done
git-core is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 2 not upgraded.
wks:/home/sa#

which does the trick and installs GIT. Note that I already had it installed. One might find it a bit strange: "Just one package and that is it? ... I do not believe ...". This person might take a look at

,----[ apt-file list git-core | grep bin/ ]
| git-core: usr/bin/git
| git-core: usr/bin/git-add
| git-core: usr/bin/git-add--interactive
[skipping a lot of lines ...]
| git-core: usr/bin/git-am
| git-core: usr/bin/git-whatchanged
| git-core: usr/bin/git-write-tree
`----

That is the current (Sat Aug 25 16:53:27 UTC 2007) status of the notable contents of the

I strongly recommend also installing the package

sa@wks:~$ acsn git | grep ^git-doc
git-doc - fast, scalable, distributed revision control system (documentation)
sa@wks:~$

For later use you might install more as you need it. Debian GNU/Linux provides a bunch of GIT related packages

,----[ apt-cache search --names-only git | grep ^git ]
| git - GNU Interactive Tools, a file browser/viewer and process viewer/killer
| git-arch - fast, scalable, distributed revision control system (arch interoperability)
| git-buildpackage - Suite to help with Debian packages in Git repositories
| git-completion - content addressable filesystem (bash completion)
| git-core - fast, scalable, distributed revision control system
| git-cvs - fast, scalable, distributed revision control system (cvs interoperability)
| git-daemon-run - fast, scalable, distributed revision control system (git-daemon service)
| git-doc - fast, scalable, distributed revision control system (documentation)
| git-email - fast, scalable, distributed revision control system (email add-on)
| git-gui - fast, scalable, distributed revision control system (GUI)
| git-load-dirs - Import upstream archives into git
| git-svn - fast, scalable, distributed revision control system (svn interoperability)
| gitk - fast, scalable, distributed revision control system (revision tree visualizer)
| gitweb - fast, scalable, distributed revision control system (web interface)
| git-p4 - fast, scalable, distributed revision control system (p4 interoperability)
`----

Do not get confused about the package

sa@wks:~$ acsn git | grep '^git '
git - GNU Interactive Tools, a file browser/viewer and process viewer/killer
sa@wks:~$

Configure GIT

We will postpone this until we have seen how to carry out basic tasks with GIT.
Taxonomy

The GIT community identifies several sets of commands depending on their abstraction level (high level versus low level) and on whether they belong to the core git package or to some ancillary tools. We distinguish high-level (porcelain) commands and low-level (plumbing) commands:
These matters are beyond the scope of this page and will not be covered since they are only of interest to the power user or developer. However, the interested reader might issue

Using GIT

There is lots and lots of information available on all sorts of tasks one might carry out with GIT. Because of that, I will neither provide another tutorial nor write more documentation. If you are new to GIT then you might want to take a look at the GIT Wiki's documentation page and/or read the GIT user manual. I also strongly recommend reading the man page i.e.

However, I will provide some short screen dumps and information on topics that I needed for myself. This section is split into two subsections: one covering knowledge that everyone needs on a daily basis, and a second covering some things that look a bit deeper into what can be done with GIT.

Workflow

This is probably one of the most interesting subsections to read for folks who are planning on using GIT or maybe have already started using GIT. Here I will talk about the workflow with GIT from different angles:
Low-level Look at the Local Workflow

Generally, all GIT operations work on the index file. Some operations work purely on the index file (showing the current state of the index), but most operations move data to and from the index file, either from the database or from the working directory. Thus there are four main combinations:
Below we will look at all of those four combinations, but before we do so, there is a sketch picturing the local workflow right below:
This piece of ASCII art illustrates how various pieces fit together.
It features the current states (boxes) and the commands to make the
transition from one state to another with the name of the objects at
the current states. Please note that all the commands mentioned below
are not intended to be used by the end user i.e. instead of
                   git-commit-tree
                     commit obj
                       +----+
                       |    |
                       |    |
                       V    V
                    +-----------+
                    | Object DB |
                    |  Backing  |
                    |   Store   |
                    +-----------+
                       ^
       git-write-tree  |     |
             tree obj  |     |
                       |     |  git-read-tree
                       |     |  tree obj
                             V
                +------------------+
                |      Index       |
                +------------------+
                       ^
     git-update-index  |
             blob obj  |     |
                       |     |
git-checkout-index -u  |     |  git-checkout-index
                 stat  |     |  blob obj
                             V
                    +-----------+
                    |  Working  |
                    | Directory |
                    +-----------+
Working Directory to Index

We update the index with information from the working directory with the However, to avoid common mistakes with filename globbing etc., the command will not normally add totally new entries or remove old entries, i.e. it will normally just update existing cache entries. To tell git that yes, we really do realize that certain files no longer exist, or that new files should be added, we should use the As a special case, we can also do

Index to Object Database

We write our current index file to a tree object with

Object Database to Index

We read a tree file from the object database (also known as the GIT back end), and use that to populate our current index. This overwrites the index, i.e. we should not do this if our index contains any unsaved state that we might want to restore later! The low-level operation to accomplish this would be

Index to Working Directory

We update our working directory from the index by checking out files. This is not a very common operation, since normally we would just keep our files updated: rather than write to our working directory, we would tell the index files about the changes in our working directory (i.e.

However, if we decide to jump to a new version, or check out somebody else's version, or just restore a previous tree, we would populate our index file with
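The four transitions can be walked through once with the plumbing commands named in the sketch above (git update-index, git write-tree, git commit-tree, git read-tree, git checkout-index). This is only a minimal sketch under stated assumptions: the throwaway repository, the file name file_a, the identity values and the particular flags (--add, -f, -a) are my choices for illustration and not necessarily what one would use day to day.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository (hypothetical location)
cd "$repo"
git init -q
git config user.name "Demo User"       # hypothetical identity
git config user.email "demo@example.invalid"

# Working directory -> index: --add is required because file_a is new to the index.
echo "hello" > file_a
git update-index --add file_a

# Index -> object database: write the index as a tree object,
# then wrap that tree in a commit object (no parents, since it is the first one).
tree=$(git write-tree)
commit=$(echo "initial commit" | git commit-tree "$tree")
git update-ref HEAD "$commit"          # let the current branch point at the new commit

# Object database -> index: populate (and overwrite) the index from a tree object.
git read-tree "$tree"

# Index -> working directory: check out all files from the index,
# forcibly overwriting the local edit made here.
echo "scribble" > file_a
git checkout-index -f -a

echo "tree object:   $tree ($(git cat-file -t "$tree"))"
echo "commit object: $commit ($(git cat-file -t "$commit"))"
echo "file_a now contains: $(cat file_a)"
```

The example also shows the distinction between a tree and a commit: git write-tree only records a directory state, while git commit-tree adds the who, when and why on top of it.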
High-Level Look at the Workflow

I suppose this is probably the most interesting subsection within the workflow section: a high-level view of the workflow, involving not just the local repository but also interaction with remote repositories, this time using GIT's high-level commands, also known as porcelains. Instead of explaining things with words, I opted for one picture that pretty much tells us all there is to know about one's everyday workflow with GIT.
I used Inkscape to create this work. I got asked a lot if I could provide a PDF; here it is, optimized for DIN A4 for those who would like to print it. However, the PDF export scrambles the fonts a bit, so I would recommend sticking with the bitmap. Update: I also found another nice picture on the net depicting GIT's high-level workflow.

Workflow Models

One of the amazing things about GIT is that, because of its distributed nature and super branching system, we can easily implement pretty much any workflow we can think of.

Subversion-Style Workflow

A very common GIT workflow, especially among people transitioning from a centralized system, is a centralized workflow. GIT will not allow us to push if someone has pushed since the last time we fetched, so a centralized model where all developers push to the same server works just fine.
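The rejected push in the centralized model can be reproduced locally. The following is a minimal sketch, assuming a bare repository on the local disk stands in for the shared server; the names central.git, alice and bob as well as the file names are hypothetical.

```shell
set -eu
base=$(mktemp -d)                               # scratch area (hypothetical location)
git init -q --bare "$base/central.git"          # stands in for the shared server

# Alice clones the still empty central repository and publishes the first commit.
git clone -q "$base/central.git" "$base/alice" 2>/dev/null
cd "$base/alice"
git config user.name "Alice"
git config user.email "alice@example.invalid"
echo "v1" > shared.txt
git add shared.txt
git commit -q -m "first commit"
git push -q origin HEAD

# Bob clones the central repository.
git clone -q "$base/central.git" "$base/bob"
cd "$base/bob"
git config user.name "Bob"
git config user.email "bob@example.invalid"

# Alice pushes again before Bob does.
cd "$base/alice"
echo "alice" > alice.txt
git add alice.txt
git commit -q -m "change by alice"
git push -q origin HEAD

# Bob commits on top of his now outdated clone; his push is refused.
cd "$base/bob"
echo "bob" > bob.txt
git add bob.txt
git commit -q -m "change by bob"
if git push -q origin HEAD 2>/dev/null; then
    push_rejected=no
else
    push_rejected=yes
fi
echo "first push by bob rejected: $push_rejected"

# Bob integrates what he missed (here: fetch plus rebase) and pushes again.
git pull -q --rebase
git push -q origin HEAD
central_commits=$(git -C "$base/central.git" rev-list --count HEAD)
echo "commits in the central repository: $central_commits"
```

Whether to integrate with a rebase or with a merge is a matter of taste; the rebase is used here only to keep the resulting history linear.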
Integration Manager Workflow

Another common GIT workflow is where there is an integration manager: a single person who commits to the blessed repository, and then a number of developers who clone from that repository, push to their own independent repositories and ask the integrator to pull in their changes. This is the type of development model we often see with open source repositories. I also use this model to maintain and further develop this website/platform, i.e. I am the integration manager who solely maintains the blessed repository that all contributors pull/fetch from. They make changes, I then fetch from their independent repositories and so forth. Of course I am also a contributor aside from being the integration manager ;-] ... Thanks to GIT's mighty branching powers, that is no problem ...

Dictator and Lieutenants Workflow

For more massive projects, we can set up our developers similar to the way the Linux kernel is run, where people are in charge of a specific subsystem of the project (the lieutenants) and merge in all changes that have to do with that subsystem. Then another integrator (the dictator) can pull changes from only his/her lieutenants and then push to the blessed repository that everyone then clones from again.

Again, GIT is entirely flexible about this, so we can mix and match and choose the workflow that is right for us.

Mandatory Knowledge

This subsection is about what I need on a daily basis, and thus it is knowledge that one should have without ever having to look things up.

Getting Help

The best help is what we have at hand at all times. With the If we have access to the Internet then we might also want to check at
sa@wks:~$ git --help
usage: git [--version] [--exec-path[=GIT_EXEC_PATH]] [-p|--paginate|--no-pager] [--bare] [--git-dir=GIT_DIR] [--work-tree=GIT_WORK_TREE] [--help] COMMAND [ARGS]

The most commonly used git commands are:
   add           Add file contents to the index
   apply         Apply a patch on a git index file and a working tree
   archive       Create an archive of files from a named tree
   bisect        Find the change that introduced a bug by binary search
   branch        List, create, or delete branches
   checkout      Checkout and switch to a branch
   cherry-pick   Apply the change introduced by an existing commit
   clone         Clone a repository into a new directory
   commit        Record changes to the repository
   diff          Show changes between commits, commit and working tree, etc
   fetch         Download objects and refs from another repository
   grep          Print lines matching a pattern
   init          Create an empty git repository or reinitialize an existing one
   log           Show commit logs
   merge         Join two or more development histories together
   mv            Move or rename a file, a directory, or a symlink
   prune         Prune all unreachable objects from the object database
   pull          Fetch from and merge with another repository or a local branch
   push          Update remote refs along with associated objects
   rebase        Forward-port local commits to the updated upstream head
   reset         Reset current HEAD to the specified state
   revert        Revert an existing commit
   rm            Remove files from the working tree and from the index
   show          Show various types of objects
   show-branch   Show branches and their commits
   status        Show the working tree status
   tag           Create, list, delete or verify a tag object signed with GPG

(use 'git help -a' to get a list of all installed git commands)
sa@wks:~$

The person who knows and understands these commands (the main set of porcelains) can pretty much do anything he ever wants to do. All the rest that git offers is thought to be beyond the scope of the everyday user's needs. If we need anything aside from the above then we can simply go look it up in the man pages or elsewhere.
I know the above commands and use them on a daily basis from the CLI (Command Line Interface) or, even better, from the emacs frontend, and it is not often that I have to use other commands except for repository maintenance. In order to get help about a particular command e.g.

Creating A New Repository

In this subsection, I will show how to create a new repository in a few ways, depending on the situation we start from.

From a Common Directory:

Usually people have their directory structure already in place when they start out using GIT; thus they want to bring their file system or parts of it under version control with GIT.

1 sa@wks:~$ cd /tmp/
2 sa@wks:/tmp$ mkdir commondir
3 sa@wks:/tmp$ cd commondir/
4 sa@wks:/tmp/commondir$ cp /ws/local/scm.muse .
5 sa@wks:/tmp/commondir$ la
6 total 96
7 drwxr-xr-x 2 sa sa 4096 2007-09-13 14:10 .
8 drwxrwxrwt 20 root root 12288 2007-09-13 14:10 ..
9 -rw-r--r-- 1 sa sa 74463 2007-09-13 14:10 scm.muse
10 sa@wks:/tmp/commondir$

Nothing unusual here. All I did was to create a new directory (line 2) and copy a file into it (line 4). For now the directory contains only this particular file as we can see in lines 6 to 9. There is no repository in place so far.

11 sa@wks:/tmp/commondir$ git init
12 Initialized empty git repository in .git/
13 sa@wks:/tmp/commondir$ git add .
14 sa@wks:/tmp/commondir$

line 11. This command creates an empty GIT repository, basically a
We now have a fully functional GIT repository with content already under version control. GIT needs to store all information about the repository in

15 sa@wks:/tmp/commondir$ la .git/
16 total 44
17 drwxr-xr-x 7 sa sa 4096 2007-09-13 15:09 .
18 drwxr-xr-x 3 sa sa 4096 2007-09-13 15:08 ..
19 drwxr-xr-x 2 sa sa 4096 2007-09-13 15:08 branches
20 -rw-r--r-- 1 sa sa 92 2007-09-13 15:08 config
21 -rw-r--r-- 1 sa sa 58 2007-09-13 15:08 description
22 -rw-r--r-- 1 sa sa 23 2007-09-13 15:08 HEAD
23 drwxr-xr-x 2 sa sa 4096 2007-09-13 15:08 hooks
24 -rw-r--r-- 1 sa sa 104 2007-09-13 15:09 index
25 drwxr-xr-x 2 sa sa 4096 2007-09-13 15:08 info
26 drwxr-xr-x 5 sa sa 4096 2007-09-13 15:09 objects
27 drwxr-xr-x 4 sa sa 4096 2007-09-13 15:08 refs
28 sa@wks:/tmp/commondir$

Time for a short recap. We created a directory, populated it with content (

29 sa@wks:/tmp/commondir$ git status
30 # On branch master
31 #
32 # Initial commit
33 #
34 # Changes to be committed:
35 # (use "git rm --cached <file>..." to unstage)
36 #
37 # new file: scm.muse
38 #
39 sa@wks:/tmp/commondir$

As we can see, we are currently on/in the master branch (line 30) of our repository. Lines 34 to 37 tell us that scm.muse has been staged, i.e. it is waiting to be committed from the index (formerly known as directory cache) to GIT's back end (which roughly speaking consists of the references and the object database). Nothing has changed in the working tree since then, so the index and the working tree are the same at this point in time.
40 sa@wks:/tmp/commondir$ git diff
41 sa@wks:/tmp/commondir$ git diff --cached
42 sa@wks:/tmp/commondir$ git diff HEAD
43 sa@wks:/tmp/commondir$

To prove what I said above (all three stages in the repository (working tree, index, back end) contain the same at this point in time i.e. the working tree is clean) I issued lines 40 to 43. Line 40 shows that there are no differences between the working tree
and the index. Line 41 tells us that there are no differences between
the index and the latest commit (if there is no explicit commit
specified — as is here — it points to the current active branch
Finally, we have to commit the changes. Of course, we have not made
changes so far but the GIT back end is empty at that point — it does
not know about the index and the repository contents. Running Line 44 shows how to commit changes made to the repository. Actually
what we do is using

44 sa@wks:/tmp/commondir$ git commit -m "This is the inital commit."
45 Created initial commit a4325c8: This is the inital commit.
46 1 files changed, 2235 insertions(+), 0 deletions(-)
47 create mode 100644 scm.muse
48 sa@wks:/tmp/commondir$

Note the Line 45 shows the SHA1 hash (

sa@wks:/tmp/commondir$ lsO scm.muse
name      file type     octal permissions  human readable permissions  group name owner  user name owner  size in bytes
scm.muse  regular file  644                -rw-r--r--                  sa                sa               81330
sa@wks:/tmp/commondir$
Last but not least, we check the last commit we did. Line 49 issues the command. Line 50 shows the commit's unique identifier and line 51 who committed the changes. Line 52 is a time stamp and in line 54 we can see the commit/log message supplied in line 44.

49 sa@wks:/tmp/commondir$ git log
50 commit a4325c8a50f4b277fbc3b255b8d77ceb17e5daad
51 Author: suno ano <sa@wks>
52 Date: Sat Sep 15 10:31:41 2007 +0100
53
54 This is the inital commit.
55 sa@wks:/tmp/commondir$

From a tarball:

Aside from extracting the tarball, this is the same as the former example.

1 sa@wks:/tmp$ mkdir test
2 sa@wks:/tmp$ mv my_tarball.tar.bz2 test/
3 sa@wks:/tmp$ cd test/
4 sa@wks:/tmp/test$ la
5 total 1552
6 drwxr-xr-x 2 sa sa 4096 2007-09-15 18:23 .
7 drwxrwxrwt 21 root root 12288 2007-09-15 18:23 ..
8 -rw-r--r-- 1 sa sa 1568674 2007-09-15 18:22 my_tarball.tar.bz2
9 sa@wks:/tmp/test$ tar -xjf my_tarball.tar.bz2
10 sa@wks:/tmp/test$ la
11 total 1556
12 drwxr-xr-x 3 sa sa 4096 2007-09-15 18:24 .
13 drwxrwxrwt 21 root root 12288 2007-09-15 18:23 ..
14 -rw-r--r-- 1 sa sa 1568674 2007-09-15 18:22 my_tarball.tar.bz2
15 drwxr-xr-x 2 sa sa 4096 2007-09-15 18:22 nose
16 sa@wks:/tmp/test$ cd nose/
17 sa@wks:/tmp/test/nose$ la
18 total 9236
19 drwxr-xr-x 2 sa sa 4096 2007-09-15 18:22 .
20 drwxr-xr-x 3 sa sa 4096 2007-09-15 18:24 ..
21 -rw-r--r-- 1 sa sa 732731 2007-09-15 18:20 bashref.html
22 -rw-r--r-- 1 sa sa 24071 2007-09-15 18:20 crypto.html
23 -rw-r--r-- 1 sa sa 3581730 2007-09-15 18:20 elisp.html
24 -rw-r--r-- 1 sa sa 3035824 2007-09-15 18:20 emacs.html
25 -rw-r--r-- 1 sa sa 823905 2007-09-15 18:20 emacs-lisp-intro.html
26 -rw-r--r-- 1 sa sa 1159227 2007-09-15 18:20 texinfo.html
27 -rw-r--r-- 1 sa sa 52965 2007-09-15 18:20 vserver_configuration.html
28 sa@wks:/tmp/test/nose$ git init
29 Initialized empty Git repository in .git/
30 sa@wks:/tmp/test/nose$ git add .
31 sa@wks:/tmp/test/nose$ git commit -m "Intial commit from just extracted tarball."
32 Created initial commit 2b36f4f: Intial commit from just extracted tarball.
33 7 files changed, 166507 insertions(+), 0 deletions(-)
34 create mode 100644 bashref.html
35 create mode 100644 crypto.html
36 create mode 100644 elisp.html
37 create mode 100644 emacs-lisp-intro.html
38 create mode 100644 emacs.html
39 create mode 100644 texinfo.html
40 create mode 100644 vserver_configuration.html
41 sa@wks:/tmp/test/nose$ git log HEAD
42 commit 2b36f4f83dc95d0e05a23f974415f9bd6b55fa66
43 Author: suno ano <sa@wks>
44 Date: Sat Sep 15 18:25:34 2007 +0100
45
46 Intial commit from just extracted tarball.
47 sa@wks:/tmp/test/nose$

In line 9 we extract the tarball. There was nothing but the tarball in
the

From a remote repository:

There is just one command we need to know. In line 1 we are issuing

1 sa@wks:/tmp$ git clone git://git.kernel.org/pub/scm/git/git.git
2 Initialized empty Git repository in /tmp/git/.git/
3 remote: Counting objects: 92034, done.
4 remote: Compressing objects: 100% (24736/24736), done.
5 remote: Total 92034 (delta 67243), reused 90062 (delta 65711)
6 Receiving objects: 100% (92034/92034), 19.30 MiB | 1743 KiB/s, done.
7 Resolving deltas: 100% (67243/67243), done.
8 sa@wks:/tmp$ du -sh git/
9 36M git/
10 sa@wks:/tmp$

As of now (February 2009) the whole GIT source tree has a size of about 36 MiB as line 9 shows. Note that there is no need to run

Importing/Exporting data from/to SVN

Before we actually start, folks familiar with SVN but not GIT might read this. Also, I am not going to explicitly cover grafts here. Install
[Image: ... long face after a big *PENG* ...]
We all make mistakes all the time. In order for life to evolve we need mistakes to happen.
Experience is that marvelous thing that enables you to recognize a mistake
when you make it again.
— Franklin P. Jones
GIT can ease the pain after a mistake has been made. In most cases, GIT can make it go away entirely by using one of three commands:
git reset
git revert
git checkout

In case we messed up the working tree, but have not yet committed our mistake, we can return the entire working tree as well as the index to the last committed state with git reset --hard HEAD. This is called a hard reset and cannot be undone! Aside from --hard, there are two other switches to git reset which produce different results.
1 sa@wks:/tmp$ mkdir test
2 sa@wks:/tmp$ cd test/
3 sa@wks:/tmp/test$ touch file_{a,b}
4 sa@wks:/tmp/test$ ll
5 total 0
6 -rw-r--r-- 1 sa sa 0 2009-02-18 18:24 file_a
7 -rw-r--r-- 1 sa sa 0 2009-02-18 18:24 file_b
8 sa@wks:/tmp/test$ git add .
9 fatal: Not a git repository (or any of the parent directories): .git
10 sa@wks:/tmp/test$ git init
11 Initialized empty Git repository in /tmp/test/.git/
12 sa@wks:/tmp/test$ git add .
13 sa@wks:/tmp/test$ git cwh -m "initial commit"
14 [master (root-commit)]: created 8eda414: "initial commit"
15 0 files changed, 0 insertions(+), 0 deletions(-)
16 create mode 100644 file_a
17 create mode 100644 file_b
18 sa@wks:/tmp/test$ rm file_b
19 sa@wks:/tmp/test$ echo "I will regret this" > file_a
20 sa@wks:/tmp/test$ ll
21 total 4.0K
22 -rw-r--r-- 1 sa sa 19 2009-02-18 18:25 file_a
23 sa@wks:/tmp/test$ cat file_a
24 I will regret this
25 sa@wks:/tmp/test$ git reset --hard HEAD
26 HEAD is now at 8eda414 initial commit
27 sa@wks:/tmp/test$ ll
28 total 0
29 -rw-r--r-- 1 sa sa 0 2009-02-18 18:25 file_a
30 -rw-r--r-- 1 sa sa 0 2009-02-18 18:25 file_b
31 sa@wks:/tmp/test$ cat file_a
In line 13 a usual commit happens. Then we mess up in lines 18 and 19.
No problem, line 25 returns the working tree right to the state it had
been in at line 13, i.e. file_b is around again and file_a is empty.
The above example is trivial because no commit happened which would
already commit a mistake — the mess was contained to the working tree
and had not made it to the index or HEAD. But what if? What if we had
committed a mistake?
If we make a commit that we later wish we had not, there are two fundamentally different ways to fix the problem:
git push.

Creating a new commit that reverts an earlier change is easy. In order
to do so, the working tree must be clean. Then, all we need to do is
to pass a ref of the bad commit to git revert,
e.g. git revert HEAD to revert the most recent commit.
32 sa@wks:/tmp/test$ echo "making a mistake and committing it" > file_b
33 sa@wks:/tmp/test$ git cwh -m 'omg, I am committing a mistake'
34 [master]: created 0708ce3: "omg, I am committing a mistake"
35 1 files changed, 1 insertions(+), 0 deletions(-)
36 sa@wks:/tmp/test$ cat file_b
37 making a mistake and committing it
38 sa@wks:/tmp/test$ git revert HEAD
39
40
41 [ here the default editor opened ...]
42
43
44 [master]: created 1bf6c3d: "Revert "omg, I am committing a mistake""
45 1 files changed, 0 insertions(+), 1 deletions(-)
46 sa@wks:/tmp/test$ cat file_b
47 sa@wks:/tmp/test$ gllol
48 1bf6c3d8de2942b9c7dc3c9a6f8dbdeedb39f32b 2 minutes ago CN: Suno Ano AN: Suno Ano S: Revert "omg, I am committing a mistake"
49 0708ce3f00eb4108bc36735641eaae46b948f84b 5 minutes ago CN: Suno Ano AN: Suno Ano S: omg, I am committing a mistake
50 8eda414d07791a8a1160a7b4ac5c913b1a06643d 42 minutes ago CN: Suno Ano AN: Suno Ano S: initial commit
51 sa@wks:/tmp/test$
As we can see in lines 32 to 51, reverting a change works just fine. Lines 36 and 46 prove it. What can now also be seen from lines 48 to 50 is that the command issued in line 25 did not create any commit, whereas the commands in lines 33 and 38 each did. Those commits are now part of the history, which is perfectly fine even though they represent a back-and-forth action. The point is, prior to it and also after it, the history is coherent and intact.
Of course, we can also revert an earlier change, for example the next-to-last commit with git revert HEAD^. In this case GIT will attempt to undo the old change while leaving intact any changes made since then. If more recent changes overlap with the changes to be reverted, we will be asked to fix conflicts manually, just as when resolving a merge.
Going back even further is also possible (any commit ID works) but may become more and more tricky depending on the complexity of the project. However, the point is again that as long as the history is coherent and intact, we will not make mistakes we cannot recover from. GIT prevents us from doing so as long as we are not explicitly fiddling with the history on a low level.
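As a rough illustration of reverting a commit that is not the tip, here is a minimal sketch. It assumes a reasonably recent git (for git -C and rev-list --count); the scratch repository, the demo identity and the file contents are made up for the example.

```shell
#!/bin/sh
set -eu
# Throwaway identity so the demo does not depend on ~/.gitconfig (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)                 # hypothetical scratch repository
git -C "$repo" init -q

echo "base" > "$repo/file_a"
git -C "$repo" add file_a
git -C "$repo" commit -q -m 'initial commit'

echo "a mistake" > "$repo/file_b"
git -C "$repo" add file_b
git -C "$repo" commit -q -m 'omg, I am committing a mistake'

echo "good work" >> "$repo/file_a"
git -C "$repo" commit -q -a -m 'unrelated later change'

# HEAD^ names the next-to-last commit, i.e. the one that added file_b.
# --no-edit keeps the default "Revert ..." message instead of opening an editor.
git -C "$repo" revert --no-edit HEAD^

# The history only grows: three original commits plus the revert.
git -C "$repo" log --format='%h %s'
```

Because the later commit touched a different file, the revert applies without conflicts; overlapping changes would instead stop and ask for manual resolution.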
We already used git reset above to undo changes that had not made it into the index and therefore also not into the back end of the repository (HEAD). However, git reset can do more: we can set the current head to any specified commit.
Optionally we can also reset the index and working tree to match that
commit if we use the --hard option. --mixed would only reset the index
but not the working tree and --soft would reset neither but only let
HEAD point to the specified commit.
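The difference between the three modes can be made visible with a small experiment. The following is only a sketch under assumptions: the helper function demo, the file name f and the demo identity are invented for illustration. For each mode it builds a two-commit scratch repository, resets to HEAD^, and reports what is still staged and what the working tree contains.

```shell
#!/bin/sh
set -eu
# Throwaway identity so the demo does not depend on ~/.gitconfig (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# demo MODE: hypothetical helper, MODE is one of --soft, --mixed, --hard.
demo() {
    repo=$(mktemp -d)
    git -C "$repo" init -q
    echo one > "$repo/f"
    git -C "$repo" add f
    git -C "$repo" commit -q -m first
    echo two > "$repo/f"
    git -C "$repo" commit -q -a -m second

    git -C "$repo" reset -q "$1" HEAD^

    staged=$(git -C "$repo" diff --cached --name-only)   # index vs. HEAD
    content=$(cat "$repo/f")                             # working tree
    printf '%s: staged=%s working-tree=%s\n' "$1" "$staged" "$content"
    rm -rf "$repo"
}

demo --soft     # HEAD moves; index and working tree keep "two"
demo --mixed    # HEAD and index move; working tree keeps "two"
demo --hard     # HEAD, index and working tree all go back to "one"
```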
If the problematic commit is the most recent commit, and we have not
yet made that commit public, then we may just destroy it using
git reset.
1 sa@wks:/tmp/test$ la
2 total 4
3 drwxr-xr-x 2 sa sa 6 2009-02-19 11:30 .
4 drwxrwxrwt 14 root root 4096 2009-02-19 11:18 ..
5 sa@wks:/tmp/test$ touch our_file
6 sa@wks:/tmp/test$ git init && git add .
7 Initialized empty Git repository in /tmp/test/.git/
8 sa@wks:/tmp/test$ git cwh -m 'initial commit'
9 [master (root-commit)]: created b076931: "initial commit"
10 0 files changed, 0 insertions(+), 0 deletions(-)
11 create mode 100644 our_file
12 sa@wks:/tmp/test$ echo "this will be corrected using --soft" > our_file
13 sa@wks:/tmp/test$ git cwh -m 'wrote some content into our_file'
14 [master]: created cb565f1: "wrote some content into our_file"
15 1 files changed, 1 insertions(+), 0 deletions(-)
16 sa@wks:/tmp/test$ gllol
17 cb565f167f1f8405edd940159763f79d2aef7f61 7 seconds ago CN: Suno Ano AN: Suno Ano S: wrote some content into our_file
18 b07693168359a71b6bb4635de6d62cb6f1119a76 73 seconds ago CN: Suno Ano AN: Suno Ano S: initial commit
19 sa@wks:/tmp/test$ cat .git/HEAD
20 ref: refs/heads/master
21 sa@wks:/tmp/test$ cat .git/refs/heads/master
22 cb565f167f1f8405edd940159763f79d2aef7f61
23 sa@wks:/tmp/test$ git reset --soft HEAD^
24 sa@wks:/tmp/test$ cat .git/ORIG_HEAD
25 cb565f167f1f8405edd940159763f79d2aef7f61
26 sa@wks:/tmp/test$
27 sa@wks:/tmp/test$ gllol
28 b07693168359a71b6bb4635de6d62cb6f1119a76 5 minutes ago CN: Suno Ano AN: Suno Ano S: initial commit
29 sa@wks:/tmp/test$ cat our_file
30 this will be corrected using --soft
31 sa@wks:/tmp/test$ echo 'editing working tree; --soft did not change the working tree nor the index' > our_file
32 sa@wks:/tmp/test$ git commit -a -c ORIG_HEAD
33
34
35 [ here the default editor opened ...]
36
37
38 sa@wks:/tmp/test$ gllol
39 b8c2b79d917608d2ac08597ee8008a862ad47fe1 2 minutes ago CN: Suno Ano AN: Suno Ano S: wrote some content into our_file (corrected version)
40 b07693168359a71b6bb4635de6d62cb6f1119a76 9 minutes ago CN: Suno Ano AN: Suno Ano S: initial commit
41 sa@wks:/tmp/test$ cat .git/ORIG_HEAD
42 cb565f167f1f8405edd940159763f79d2aef7f61
43 sa@wks:/tmp/test$ cat .git/HEAD
44 ref: refs/heads/master
45 sa@wks:/tmp/test$ cat .git/refs/heads/master
46 b8c2b79d917608d2ac08597ee8008a862ad47fe1
This is most often done when we remember that what we just committed is incomplete, or we misspelled our commit message, or both. Again, I want to point out that this way of fixing a mistake is only recommended as long as the tainted commit/history has not been made public.
That we really just replaced one commit with another without resetting the working tree can be seen by comparing lines 17 and 22 with lines 39 and 46. git reset copies the old head to .git/ORIG_HEAD, so we can redo the commit starting from its log message: compare the log message we gave in line 13 with the one after editing in lines 33 to 37, i.e. compare lines 17 and 39.
The bottom line is that we can replace a commit and edit its commit message without destroying the working tree or the index. This is different from git revert, where we fix a mistake by making another commit on top of the current history, which is how it should be done if the repository history including the mistake has already been made public.
Alternatively, we can edit the working directory and update the index to fix our mistake, just as if we were going to create a new commit, and then run git commit --amend. The result with git commit --amend is the same as with git reset --soft HEAD^ above, but it can also be used to amend a merge commit.
The commit that git commit --amend creates replaces the current tip. If the tip was a merge commit, the new commit gets the parents of the current tip as its parents, and the current top commit is discarded.
47 sa@wks:/tmp/test$ git st
48 # On branch master
49 nothing to commit (working directory clean)
50 sa@wks:/tmp/test$ ll
51 total 4.0K
52 -rw-r--r-- 1 sa sa 75 2009-02-19 11:38 our_file
53 sa@wks:/tmp/test$ cat our_file
54 editing working tree; --soft did not change the working tree nor the index
55 sa@wks:/tmp/test$ git commit --amend
56
57
58 [ here the default editor opened ...]
59
60
61 [master]: created 12c9cf6: "wrote some content into our_file (corrected version of the corrected version)"
62 1 files changed, 1 insertions(+), 0 deletions(-)
63 sa@wks:/tmp/test$ gllol
64 12c9cf603286326553dcdc10b90086be5f62cd33 2 minutes ago CN: Suno Ano AN: Suno Ano S: wrote some content into our_file (corrected version of the corrected version)
65 b07693168359a71b6bb4635de6d62cb6f1119a76 2 hours ago CN: Suno Ano AN: Suno Ano S: initial commit
66 sa@wks:/tmp/test$ cat .git/refs/heads/master
67 12c9cf603286326553dcdc10b90086be5f62cd33
68 sa@wks:/tmp/test$ cat our_file
69 a
70 sa@wks:/tmp/test$
Ooops, we did it again ;-] ...
We just swapped the old commit for a new one in line 55, just as we did above in line 23. The new commit message can be seen in line 64 after I issued my well-beloved gllol. The cherry on top is that the author and time stamp are not altered by all the replacing games we just played, i.e. they are reused from the commit in line 13. Of course, we can also amend those (see the manual pages for detailed information).
So, we swapped the commit again and we also edited the commit message
again and made changes to the working tree (our_file). All that is
possible because we were just toying with HEAD but not with the index
nor the working tree.
Again, we should never do this to a commit that may already have been
merged into another branch i.e. which has been made public — one
should use git revert instead in that case.
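To see why amending a published commit is a problem, it helps to watch the commit ID change. The sketch below is illustrative only; the scratch repository, the demo identity and the commit messages are assumptions. It amends the tip and shows that the branch still has one commit, while the replaced commit survives only in the reflog.

```shell
#!/bin/sh
set -eu
# Throwaway identity so the demo does not depend on ~/.gitconfig (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)                 # hypothetical scratch repository
git -C "$repo" init -q
echo "content" > "$repo/our_file"
git -C "$repo" add our_file
git -C "$repo" commit -q -m 'wrote some conten into our_file'   # typo on purpose
before=$(git -C "$repo" rev-parse HEAD)

# Replace the tip commit with one that carries a corrected message.
git -C "$repo" commit -q --amend -m 'wrote some content into our_file'
after=$(git -C "$repo" rev-parse HEAD)

echo "before amend: $before"
echo "after amend:  $after"

# Still exactly one commit on the branch ...
git -C "$repo" rev-list --count HEAD
# ... and the replaced commit is only reachable through the reflog.
git -C "$repo" rev-parse 'HEAD@{1}'
```

Anyone who had already fetched the old ID would now hold a commit that no longer exists on our branch, which is exactly the situation git revert avoids.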
In the process of undoing a previous bad change, we may find it useful
to check out an older version of a particular file using git checkout.
We have used git checkout before to switch branches, but it has quite
different behavior if it is given a path name.
git checkout HEAD^ path/to/file replaces path/to/file by the contents it had in the commit HEAD^ (or any other commit ID for that matter), and also updates the index to match. It does not change branches.
If we just want to look at an older version of the file, without modifying the working directory, git show HEAD^:path/to/file is our friend. Of course, in both cases HEAD^ can be replaced by anything that names a commit.
1 sa@wks:/tmp$ mkdir git_demo && cd git_demo && touch my_file && git init && git add . && git cwh -m 'initial commit'
2 Initialized empty Git repository in /tmp/git_demo/.git/
3 [master (root-commit)]: created 89f870d: "initial commit"
4 0 files changed, 0 insertions(+), 0 deletions(-)
5 create mode 100644 my_file
6 sa@wks:/tmp/git_demo$ la
7 total 8
8 drwxr-xr-x 3 sa sa 31 2009-02-19 14:16 .
9 drwxrwxrwt 14 root root 4096 2009-02-19 14:16 ..
10 drwxr-xr-x 9 sa sa 4096 2009-02-19 14:16 .git
11 -rw-r--r-- 1 sa sa 0 2009-02-19 14:16 my_file
12 sa@wks:/tmp/git_demo$ git st
13 # On branch master
14 nothing to commit (working directory clean)
15 sa@wks:/tmp/git_demo$ type bani
16 bani is aliased to `banshee --query-{artist,title} >& `tty`'
17 sa@wks:/tmp/git_demo$ bani > my_file
18 sa@wks:/tmp/git_demo$ cat my_file
19 artist: Patricia Barber
20 title: Morpheus
21 sa@wks:/tmp/git_demo$ git cwh -m "Suno\'s current track"
22 [master]: created 83e5cf6: "Suno\'s current track"
23 1 files changed, 2 insertions(+), 0 deletions(-)
24 sa@wks:/tmp/git_demo$ banshee --query-{artist,title} | tee -a my_file && cat my_file
25 artist: Paul Hardcastle
26 title: Rain Forest
27 artist: Patricia Barber
28 title: Morpheus
29 artist: Paul Hardcastle
30 title: Rain Forest
31 sa@wks:/tmp/git_demo$ git cwh -m "last two tracks (including current one)"
32 [master]: created 5a978e9: "last two tracks (including current one)"
33 1 files changed, 2 insertions(+), 0 deletions(-)
Nothing special in lines 1 to 23. In line 24 I basically use what the bani alias (an alias to control banshee from the CLI (Command Line Interface)) expands to, but this time without the >& redirection, so that stdout goes into the pipe where tee can grab it and write it both to the terminal and, appended, to the file my_file, as can be seen in lines 25 to 30 (see man bash for more information on redirection).
34 sa@wks:/tmp/git_demo$ git show HEAD^:my_file
35 artist: Patricia Barber
36 title: Morpheus
37 sa@wks:/tmp/git_demo$ cat my_file
38 artist: Patricia Barber
39 title: Morpheus
40 artist: Paul Hardcastle
41 title: Rain Forest
42 sa@wks:/tmp/git_demo$ git show HEAD:my_file
43 artist: Patricia Barber
44 title: Morpheus
45 artist: Paul Hardcastle
46 title: Rain Forest
47 sa@wks:/tmp/git_demo$ git checkout HEAD^ my_file
48 sa@wks:/tmp/git_demo$ cat my_file
49 artist: Patricia Barber
50 title: Morpheus
51 sa@wks:/tmp/git_demo$ git st
52 # On branch master
53 # Changes to be committed:
54 # (use "git reset HEAD <file>..." to unstage)
55 #
56 # modified: my_file
57 #
58 sa@wks:/tmp/git_demo$
Line 34 is a perfect example of how to use git show to take a look at a former revision of some file without changing it in the working directory (lines 37 to 41). In line 47 we check out a former revision. In contrast to line 34, this changes the file my_file in the working directory, as can be seen in lines 48 to 50, and stages the change in the index (lines 51 to 57).
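The same contrast between looking and restoring can be reproduced without a music player. This is a minimal sketch; the scratch repository, the demo identity and the file contents are assumptions made for the example.

```shell
#!/bin/sh
set -eu
# Throwaway identity so the demo does not depend on ~/.gitconfig (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)                 # hypothetical scratch repository
git -C "$repo" init -q
printf 'track one\n' > "$repo/my_file"
git -C "$repo" add my_file
git -C "$repo" commit -q -m 'first track'
printf 'track two\n' >> "$repo/my_file"
git -C "$repo" commit -q -a -m 'second track'

# Look only: git show prints the old version; nothing on disk changes.
old=$(git -C "$repo" show HEAD^:my_file)
untouched=$(cat "$repo/my_file")

# Restore: git checkout overwrites my_file and stages the result.
git -C "$repo" checkout HEAD^ -- my_file
restored=$(cat "$repo/my_file")
staged=$(git -C "$repo" diff --cached --name-only)

printf 'old version:\n%s\n\n' "$old"
printf 'after git show, file still has:\n%s\n\n' "$untouched"
printf 'after git checkout, file has:\n%s\n\n' "$restored"
printf 'staged for commit: %s\n' "$staged"
```

The -- before the path is optional here; it merely makes explicit that my_file is a path and not a branch name.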
Now is a good time to revisit the workflow section. Sharing our changes with others is one of the most important reasons, if not the most important one, why one would want to use an SCM (Software Configuration Management) system.
Before we start with detailed issues, there are some things we should consider and keep in mind whenever we make commits and/or prepare patches.
Suppose we are contributors to a large project, and we want to add a complicated feature. We want to present it to the other developers in a way that makes it easy for them to read our changes, verify that they are correct, and understand why we made each change.
So the ideal is usually to produce a series of patches/commits such that the following checklist gets a nod on every item:
Run git diff --check before committing to catch whitespace errors; maybe use a hook to automate it.
Add a Signed-off-by: Your Name <you@example.com> line to the commit message; use -s when committing or simply create an alias in ~/.gitconfig. The Signed-off-by line confirms that we agree to the Developer's Certificate of Origin.
Below I will introduce some tools that can help us do this, explain how to use them, and then explain some of the problems that can arise because we are rewriting history.
Though not required, it is a good idea to begin the commit message with a single short line summarizing the change, followed by a blank line and then a more thorough description. Tools that turn commits into email, for example, use the first line on the subject line and the rest of the commit in the body.
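The whitespace check and the sign-off from the checklist can be automated roughly as follows. This is only a sketch under assumptions: the scratch repository, the demo identity and the file name notes are invented, and the hook is a deliberately minimal stand-in for whatever a project actually uses. It relies on git diff --check exiting with a non-zero status when it finds whitespace errors.

```shell
#!/bin/sh
set -eu
# Throwaway identity so the demo does not depend on ~/.gitconfig (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)                 # hypothetical scratch repository
git -C "$repo" init -q

# Minimal pre-commit hook: refuse commits whose staged changes contain
# whitespace errors (e.g. trailing blanks).
mkdir -p "$repo/.git/hooks"
cat > "$repo/.git/hooks/pre-commit" <<'EOF'
#!/bin/sh
exec git diff --cached --check
EOF
chmod +x "$repo/.git/hooks/pre-commit"

# A clean change passes the hook; -s appends the Signed-off-by line.
printf 'clean line\n' > "$repo/notes"
git -C "$repo" add notes
git -C "$repo" commit -q -s -m 'add notes'
git -C "$repo" log -1 --format=%B

# A line with a trailing blank is rejected by the hook.
printf 'trailing blank \n' >> "$repo/notes"
git -C "$repo" add notes
if git -C "$repo" commit -q -m 'sloppy change' >/dev/null 2>&1; then
    rejected=no
else
    rejected=yes
fi
echo "sloppy commit rejected: $rejected"
```

Hooks live in the local repository only and are not transferred by clone or push, so every contributor has to install such a hook for themselves.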
GITosis aims to make hosting GIT repositories easier and safer. It manages multiple repositories under one user account, using SSH (Secure Shell) keys to identify users. End users do not need their own fully fledged user account on the server, they will all talk to one shared user account that will not let them run arbitrary commands.
GITosis is written in Python which is why we are going to install it too if not already installed — since we install software via APT (Advanced Packaging Tool), Python will be installed as a dependency of GITosis anyways.
There are other ways of providing a public repository as well e.g. not using SSH for push and pull actions, creating a distinct user account for any contributor, access via HTTP (Hypertext Transfer Protocol) etc. All this works but I do not like it because there is something better ... there is GITosis!
I opted to only cover one particular use case which is the most secure one, the one that scales best, and the one that CLI (Command Line Interface) folks are most comfortable with i.e. I opted to cover setting up a public GIT repository using GITosis.
As we know, GIT does not need to be set up and run in a star topology, simply because it is not a centralized SCM (Software Configuration Management) system like, for example, SVN. Rather, it is a decentralized SCM system, which means that any clone contains the full history (all commits ever made and the metadata that goes with them, e.g. who did what and when) and can therefore be merged/diffed/etc. back and forth with any other clone/branch out there.
We can think of centralized SCM systems as enforcing the unavoidable star topology on their users, and of decentralized systems as allowing everything from fully connected to star or, even better, anything that can be seen below.
The point is, decentralized SCM systems, as opposed to centralized ones, do not enforce silly limits with regards to topology and usage but rather leave the choice to their users.
However, sometimes it makes sense to use even a decentralized SCM system like GIT in a star topology. One such use case is GITosis, where we have one remote machine running GITosis and therefore hosting GIT repositories for us. The GITosis machine forms the center of the star and we, the users, are all leaves around the central server running GITosis:
As already mentioned above, GITosis uses just a single system user
account for all repositories and users with write/commit/push
permissions to one or several of those repositories on the remote
machine e.g. the server within the datacenter that is going to host
our GIT repositories. This remote machine runs GITosis under the
system user account name gitosis. This system user account is
automatically created when we install the Debian package gitosis, i.e.
there is no need for us to issue adduser --system gitosis.
GITosis itself is basically just used to manage/control who can write/commit/push to which repository. It does not need to be concerned with who can read/pull/fetch, since this can easily be handled via GIT-daemon, i.e. GIT-daemon can be used to provide anonymous read/pull/fetch access to our repositories if needed.
In order to differentiate amongst folks with write/commit/push permissions, even though we only have one shared system user account called gitosis, GITosis uses each user's public SSH key. Everybody who wants to write/commit/push to a repository on the remote machine running GITosis has to provide the GITosis administrator with his public key so that he can place it onto the remote machine where GITosis can access it.
For those whose public key has been placed onto the remote machine, and whom GITosis therefore recognizes, read and write access, or rather pull/fetch and push in GIT terms, then happens via SSH (Secure Shell), i.e. it is secure. Read (pull/fetch) access via GIT-daemon can, but does not have to, be set up as well; that is solely in the hands of the administrator of the GITosis server.
The PKA (Public Key Authentication) setup for the system user gitosis makes use of additional security precautions, i.e. even those with write/commit/push permissions cannot execute arbitrary commands on our remote machine running GITosis.
Because of the fact that a system user account is used for the user
gitosis rather than a normal user account and the fact that the
command=<command_issued_when_public_key_authentication_is_ok>
part is present, on top of the PKA setup (password login is disabled),
GITosis is a rather secure thing to use even for a huge community if
need be.
Also, as for any other way of hosting GIT repositories, firewall
settings in table filter, chain OUTPUT and INPUT respectively FORWARD
in case of OpenVZ, have to allow port 9418. This is necessary for
GIT-daemon for example.
For those who have permissions to write/commit/push, the SSH service ports are relevant, thus the firewall has to allow the SSH port for inbound and outbound traffic. Of course, we can and we will use a non-standard listening port for sshd as we will see below. For the more paranoid, even port knocking might be set up if needed. I leave it to the particular user group to decide whether or not they find it too much of a hassle ...
1 sa@wks:~$ ssh dolmen-devel
2
3 / \ _-'
4 _/ \-''- _ /
5 __-' { \
6 / \
7 / "o. |o }
8 | \ ; YOU ARE BEING WATCHED!
9 ',
10 \_ __\
11 ''-_ \.//
12 / '-____'
13 /
14 _'
15 _-'
16
17
18 This computer system is the private property of its owner, whether individual, corporate or government. It is
19 for authorized use only. Users (authorized or unauthorized) have no explicit or implicit expectation of
20 privacy.
21
22 Any or all uses of this system and all files on this system may be intercepted, monitored, recorded, copied,
23 audited, inspected, and disclosed to your employer, to authorized site, government, and law enforcement
24 personnel, as well as authorized officials of government agencies, both domestic and foreign.
25
26 By using this system, the user consents to such interception, monitoring, recording, copying, auditing,
27 inspection, and disclosure at the discretion of such personnel or officials.
28
29
30 UNAUTHORIZED OR IMPROPER USE OF THIS SYSTEM MAY RESULT
31 IN CIVIL AND CRIMINAL PENALTIES AND ADMINISTRATIVE OR
32 DISCIPLINARY ACTION, AS APPROPRIATE !!
33
34
35 By continuing to use this system you indicate your awareness of and consent to these terms and conditions of
36 use. LOG OFF IMMEDIATELY if you do not agree to the conditions stated in this warning. However, if you are
37 authorized personal with no bad intentions please continue. Have a nice day! :-)
38
39 sa@rh0-ve3:~$ su
40 Password:
41 rh0-ve3:/home/sa# type dpl; dpl git* | grep ii
42 dpl is aliased to `dpkg -l'
43 ii git-core 1:1.6.3.3-1 fast, scalable, distributed revision control
44 ii gitosis 0.2+20080825-14 git repository hosting application
45 rh0-ve3:/home/sa# grep git /etc/passwd
46 gitosis:x:105:108:git repository hosting,,,:/srv/gitosis:/bin/sh
47 rh0-ve3:/home/sa# cd /srv/gitosis/
48 rh0-ve3:/srv/gitosis# type la; la
49 la is aliased to `ls -la'
50 total 8
51 drwxr-xr-x 2 gitosis gitosis 4096 2009-07-09 12:53 .
52 drwxr-xr-x 3 root root 4096 2009-07-09 11:33 ..
53 lrwxrwxrwx 1 root root 25 2009-07-09 11:33 git -> /srv/gitosis/repositories
We are going to install GITosis on a remote machine located within a
datacenter. In order to do so, we use SSH (Secure Shell) to leave our
local machine, my workstation with its hostname wks, and log into the
remote machine (rh0-ve3) as can be seen from line 39. What can be seen
in lines 2 to 38 is just the usual banner message. The very short
command from line 1 is possible because all the sshd port information
of the remote machine etc. lives within my ~/.ssh/config.
These days, we make use of some virtualization technology of course. In the current case we are going to use OpenVZ, i.e. our remote machine is a VE (Virtual Environment), which means it shares the resources of an HN (Hardware Node) with other VEs. A VE, however, behaves and feels no different from any non-virtualized Debian machine.
On the remote machine, we start with installing the gitosis package, which I already did as can be seen in line 44. For those who have not installed it already, aptitude install gitosis will do the trick. dpl in line 41 is just an alias in my ~/.bashrc.
By installing gitosis, some work is done for us automatically like for
example setting up the system user account for the system user gitosis
(line 46, see man 5 passwd) and a location (line 53) where our
repositories will live is set up as well.
Next thing to do is to create a public SSH key for PKA (Public Key
Authentication) for the GITosis administrator account on the remote
machine running GITosis. That we need to do on our local machine i.e.
wks in my case and not on rh0-ve3.
As the above link shows, I already have my SSH keypair and thus a
public SSH key (/home/sa/.ssh/ssh_pka_key_for_user_Suno_Ano.pub) which
we are now going to use for being the GITosis administrator as well —
a single SSH keypair can be used for many services and tasks if needed
i.e. there is no need to have n keypairs for n services that require
SSH PKA.
54 rh0-ve3:/srv/gitosis# exit
55 exit
56 sa@rh0-ve3:~$ exit
57 logout
58 Connection to devel.example.com closed.
59 sa@wks:~$ cd .ssh/keypairs/; type pi; pi Suno
60 pi is aliased to `ls -la | grep'
61 -rw------- 1 sa sa 6431 2009-03-13 11:11 ssh_pka_key_for_user_Suno_Ano
62 -rw-r--r-- 1 sa sa 1501 2009-03-13 11:11 ssh_pka_key_for_user_Suno_Ano.pub
64 sa@wks:~/.ssh/keypairs$ scp -P 58445 ssh_pka_key_for_user_Suno_Ano.pub devel.example.com:/tmp
65
66
67 [skipping a lot of lines ...]
68
69
70 ssh_pka_key_for_user_Suno_Ano.pub 100% 1501 1.5KB/s 00:00
71 sa@wks:~/.ssh/keypairs$ ssh dolmen-devel
72
73
74 [skipping a lot of lines ...]
75
76
77 sa@rh0-ve3:~$ cd /tmp/
78 sa@rh0-ve3:/tmp$ pi Suno
79 -rw-r--r-- 1 sa sa 1501 2009-07-12 13:48 ssh_pka_key_for_user_Suno_Ano.pub
80 sa@rh0-ve3:/tmp$ su
81 Password:
82 rh0-ve3:/tmp# dpl sudo* | grep ii
83 ii sudo 1.7.0-1 Provide limited super user privileges to specific users
So, as mentioned, we need to either create the public SSH key on our local machine or, if it already exists there, grab it, and then transfer it onto the remote machine running GITosis.
With line 58 we have finally left rh0-ve3 and thus we are back on wks
again in line 59 where we check for my already existing SSH keypair.
As can be seen from line 62, there it is, my public key that we are
going to use for setting up the GITosis administrator account plus,
later on, we are also going to use it in order to provide myself with
write/commit/push permissions to GIT repositories hosted on our remote
machine rh0-ve3.
With line 64 we use SCP (Secure Copy) to copy the public key (the one with the .pub suffix) from my local machine (wks) onto the remote machine. In terms of security, it is, as usual, very important to keep the private key safe. Copying the private key instead of the public key would be a very dangerous thing to do, since having the private key physically on some remote machine is a huge security risk. That is true even if the private key is protected by a passphrase, which of course it should be.
The -P switch in line 64 specifies a non-standard sshd listening port and :/tmp determines the destination directory on the remote machine, i.e. rh0-ve3. devel.example.com is a regular domain name that resolves to an IPv4 address, e.g. 123.23.43.118. We could of course also specify the IP address directly, but since we already have the domain name pointing to the IP address, why not use it.
For both cases, domain name or IP address, the important thing is that there is an sshd listening on that particular IP and port combination, else our SSH connection/transfer would not succeed. Last but not least, the devel in devel.example.com hints that our GITosis VE is actually used for more than just a fully fledged GIT hosting platform: later we are going to lay a Trac layer on top of the GIT infrastructure and thus have a ticketing/wiki/project management system using GIT as its SCM (Software Configuration Management) backend. Anyway, Trac and GITosis have nothing to do with each other from a technical point of view, other than that GITosis can provide SCM backend functionality to Trac, i.e. one can set up and use GITosis without putting an additional Trac layer on top of it; there is no dependency on Trac whatsoever.
The [skipping a lot of lines ...] in line 67 and further down just
indicates the missing/skipped banner message — there is no point in
showing it over and over again. Line 70 shows that we successfully
transferred the public key ssh_pka_key_for_user_Suno_Ano.pub to /tmp
on the remote machine — line 79 is just about providing proof that it
is really true, we did not screw up here.
When we installed the Debian package gitosis, sudo got installed as a dependency. We will need it now, as can be seen below in line 84.
84 rh0-ve3:/tmp# sudo -H -u gitosis gitosis-init < /tmp/ssh_pka_key_for_user_Suno_Ano.pub
85 Initialized empty Git repository in /srv/gitosis/repositories/gitosis-admin.git/
86 Reinitialized existing Git repository in /srv/gitosis/repositories/gitosis-admin.git/
87 rh0-ve3:/tmp# cd /srv/gitosis/repositories/gitosis-admin.git/
88 rh0-ve3:/srv/gitosis/repositories/gitosis-admin.git# type la; la
89 la is aliased to `ls -la'
90 total 52
91 drwxr-x--- 8 gitosis gitosis 4096 2009-07-12 13:52 .
92 drwxr-xr-x 3 gitosis gitosis 4096 2009-07-12 13:52 ..
93 drwxr-xr-x 2 gitosis gitosis 4096 2009-07-12 13:52 branches
94 -rw-r--r-- 1 gitosis gitosis 66 2009-07-12 13:52 config
95 -rw-r--r-- 1 gitosis gitosis 73 2009-07-12 13:52 description
96 -rw-r--r-- 1 gitosis gitosis 90 2009-07-12 13:52 gitosis.conf
97 drwxr-xr-x 3 gitosis gitosis 4096 2009-07-12 13:52 gitosis-export
98 -rw-r--r-- 1 gitosis gitosis 23 2009-07-12 13:52 HEAD
99 drwxr-xr-x 2 gitosis gitosis 4096 2009-07-12 13:52 hooks
100 -rw-r--r-- 1 gitosis gitosis 272 2009-07-12 13:52 index
101 drwxr-xr-x 2 gitosis gitosis 4096 2009-07-12 13:52 info
102 drwxr-xr-x 4 gitosis gitosis 4096 2009-07-12 13:52 objects
103 drwxr-xr-x 4 gitosis gitosis 4096 2009-07-12 13:52 refs
104 rh0-ve3:/srv/gitosis/repositories/gitosis-admin.git# la hooks/post-update
105 lrwxrwxrwx 1 gitosis gitosis 61 2009-07-12 13:52 hooks/post-update -> /usr/share/pyshared/gitosis/templates/admin/hooks/post-update
106 rh0-ve3:/srv/gitosis/repositories/gitosis-admin.git# la /usr/share/pyshared/gitosis/templates/admin/hooks/post-update
107 -rwxr-xr-x 1 root root 69 2009-04-25 14:38 /usr/share/pyshared/gitosis/templates/admin/hooks/post-update
108 rh0-ve3:/srv/gitosis/repositories/gitosis-admin.git# cd ..
109 rh0-ve3:/srv/gitosis/repositories# !!
110 cd ..
We are back on rh0-ve3, have become root, and then issue line 84. In this command, sudo runs gitosis-init as the system user gitosis even though we are currently logged in as root. gitosis-init itself takes our public SSH key and does its magic with it: in essence, it sprinkles some magic into the home directory of the gitosis user and puts our public SSH key into the list of authorized keys. We use /tmp on the remote machine for two reasons: the key file is not needed anymore after issuing line 84 (it will vanish on reboot), and by using /tmp we are unlikely to run into permission problems such as the user gitosis being unable to read the public key.
That we succeed with line 84 can be seen from lines 85 and 86. After
taking a look around in lines 93 to 103, inside our just created
GITosis administrator area, (yes, that is a standard GIT
repository layout — we are using GIT to manage our GIT hosting
platform ... how cool is that?! ;-]) we check if our post update hook
has the correct permission i.e. can be executed by others than root
itself or members of group root — it is all good, the permissions are
all right as they are 755 in octal notation and thus allow others, and
therefore the gitosis system user, to execute the hook.
111 rh0-ve3:/srv/gitosis# la
112 total 20
113 drwxr-xr-x 5 gitosis gitosis 4096 2009-07-12 13:52 .
114 drwxr-xr-x 3 root root 4096 2009-07-09 11:33 ..
115 lrwxrwxrwx 1 root root 25 2009-07-09 11:33 git -> /srv/gitosis/repositories
116 drwxr-xr-x 2 gitosis gitosis 4096 2009-07-12 13:52 gitosis
117 lrwxrwxrwx 1 gitosis gitosis 56 2009-07-12 13:52 .gitosis.conf -> /srv/gitosis/repositories/gitosis-admin.git/gitosis.conf
118 drwxr-xr-x 3 gitosis gitosis 4096 2009-07-12 13:52 repositories
119 drwx------ 2 gitosis gitosis 4096 2009-07-12 13:52 .ssh
120 rh0-ve3:/srv/gitosis# la repositories/
121 total 12
122 drwxr-xr-x 3 gitosis gitosis 4096 2009-07-12 13:52 .
123 drwxr-xr-x 5 gitosis gitosis 4096 2009-07-12 13:52 ..
124 drwxr-x--- 8 gitosis gitosis 4096 2009-07-12 13:52 gitosis-admin.git
125 rh0-ve3:/srv/gitosis# cat .gitosis.conf
126 [gitosis]
127
128 [group gitosis-admin]
129 writable = gitosis-admin
130 members = sunoano
131
132 rh0-ve3:/srv/gitosis# la .ssh/
133 total 12
134 drwx------ 2 gitosis gitosis 4096 2009-07-12 13:52 .
135 drwxr-xr-x 5 gitosis gitosis 4096 2009-07-12 13:52 ..
136 -rw-r--r-- 1 gitosis gitosis 1652 2009-07-12 13:52 authorized_keys
137 rh0-ve3:/srv/gitosis# cat .ssh/authorized_keys
138 ### autogenerated by gitosis, DO NOT EDIT
139 command="gitosis-serve sunoano",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa AAAAB3NzaC [skipping a lot of characters ...] TuB4zOt+Ay9dfoq5nMIekW2TNts24F/9k2NQ== PKA (Public Key Authentication) SSH keypair for user Suno Ano; reach me at sunoano
If we compare lines 51 to 53 with lines 113 to 119, we can see that the command from line 84 also created a symbolic link to our .gitosis.conf file, it created .ssh, the place where the public keys are kept on the remote machine, and then there is repositories, the place where all GIT repositories will live from now on, including the one used to administer GITosis itself as can be seen in line 124.
The most important file it includes can be seen in lines 126 to 131. Line 129 says that the repository gitosis-admin.git is writable, and line 130 says that only sunoano (that is me) can write to it, i.e. only I can use git push to put new configuration settings for the GITosis platform running on rh0-ve3 into place. This works because of my public SSH key, as can be seen in line 139.
Security is good, as mentioned above already, even I cannot issue
arbitrary commands because of the
command=<command_issued_when_public_key_authentication_is_ok>
part.
140 rh0-ve3:/srv/gitosis# grep AllowUsers /etc/ssh/sshd_config
141 AllowUsers foo@xxx.xx.* bar@xx.xxx.* baz@xxx.xxx.xxx.xxx
142 rh0-ve3:/srv/gitosis# nano /etc/ssh/sshd_config
143
144
145 [ here we use nano to edit /etc/ssh/sshd_config ... ]
146
147
148 rh0-ve3:/srv/gitosis# grep AllowUsers /etc/ssh/sshd_config
149 AllowUsers foo@xxx.xx.* bar@xx.xxx.* baz@xxx.xxx.xxx.xxx gitosis@*
150 rh0-ve3:/srv/gitosis# /etc/init.d/ssh reload
151 Reloading OpenBSD Secure Shell server's configuration: sshd.
152 rh0-ve3:/srv/gitosis# exit
153 exit
154 sa@rh0-ve3:~$ exit
155 logout
156 Connection to devel.example.com closed.
Since AllowUsers is used for the SSH setup, we have to explicitly grant our system user gitosis access to rh0-ve3. The before (line 141) and the after (line 149) can be seen above. Line 150 shows how to activate the new sshd setting without rebooting the entire VE (Virtual Environment) or even restarting sshd. Restarting would kill our currently active SSH connection to rh0-ve3 as well ...
157 sa@wks:~$ cd 0/
158 sa@wks:~/0$ mkdir -p gitosis_projects/dolmen
159 sa@wks:~/0$ cd gitosis_projects/dolmen/
160 sa@wks:~/0/gitosis_projects/dolmen$ la
161 total 8
162 drwxr-xr-x 2 sa sa 4096 2009-07-12 18:36 .
163 drwxr-xr-x 3 sa sa 4096 2009-07-12 18:36 ..
164 sa@wks:~/0/gitosis_projects/dolmen$ nano /home/sa/.ssh/config
165
166
167 [ here we use nano to edit ~/.ssh/config ... ]
168
169
170 sa@wks:~/0/gitosis_projects/dolmen$ grep -C4 gitosis ~/.ssh/config
171 ###_ , devel.example.com
172 # description: just a dummy stanza to make git push/pull work with
173 # devel.example.com
174 Host devel.example.com
175 User gitosis
176 Port 58445
177 Hostname devel.example.com
178 IdentityFile %d/.ssh/keypairs/ssh_pka_key_for_user_Suno_Ano
179 TCPKeepAlive yes
Back on the local machine, I decided to have a dedicated directory to host all my GITosis administrator related data for several projects that I am going to migrate to GIT, the first of which is Dolmen.
As I mentioned above already, we are using a non-standard listening
port for the sshd running on rh0-ve3. Because of that, we need to put
a new stanza into ~/.ssh/config that will provide information to the
local SSH client running on wks — that is, for example, the sshd
listening port on rh0-ve3 and which keyfile to use. The important
lines here are those from line 174 to line 178. Line 179 is a
nice-to-have, especially if we want to avoid indefinitely hanging SSH
sessions and the like.
Line 174 is what we specify on the CLI (Command Line Interface) i.e.
ssh devel.example.com. This is how our local SSH client finds this
particular stanza, so it knows which hostname/IP to use (line 177), which
port to use (line 176) and that it should log in as user gitosis
when asking for access on the remote machine i.e. the sshd listening
on port 58445 at devel.example.com.
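To make the lookup a bit more concrete, here is a minimal Python sketch of how a stanza could be selected by the name given on the command line. It is a simplification and not how OpenSSH is implemented: it only handles literal Host names and the rule that the first value found for a keyword wins, while the real client also knows wildcard patterns, Match blocks and more. The function name resolve_host is made up for this illustration, and the sample text is modelled on the two stanzas in my ~/.ssh/config.

```python
def resolve_host(config_text, alias):
    """Return the options of the Host stanzas whose name equals alias."""
    options = {}
    active = False
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        keyword, _, value = line.partition(" ")
        keyword, value = keyword.lower(), value.strip()
        if keyword == "host":
            # A new stanza starts; it applies only if it names our alias.
            active = alias in value.split()
            continue
        if active:
            options.setdefault(keyword, value)  # first value found wins
    return options


SSH_CONFIG = """\
# dummy stanza to make git push/pull work
Host devel.example.com
User gitosis
Port 58445
Hostname devel.example.com
TCPKeepAlive yes

Host dolmen-devel
User sa
HostName devel.example.com
Port 58445
"""

gitosis_opts = resolve_host(SSH_CONFIG, "devel.example.com")
admin_opts = resolve_host(SSH_CONFIG, "dolmen-devel")
print(gitosis_opts["user"], gitosis_opts["port"])
print(admin_opts["user"], admin_opts["hostname"])
```

Both names end up at the same host and port; only the user differs, which is the whole point of having two stanzas.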
Line 178 is useful if we have several SSH keys loaded into our
SSH-agent (on our local machine, wks in the current case) and, in
addition, MaxAuthTries is set to a low number on the remote system —
in short, line 178 enables our local SSH client to pick the right
private/public keypair combination right away without the need to
iterate a few times until it finds the correct counterpart to the
public key on the remote machine. If MaxAuthTries 1 is set on the
remote machine, we only have one attempt after which the remote
machine's sshd cancels the connection. Therefore, using IdentityFile
is, if not mandatory anyways, good practice.
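Why IdentityFile matters with a low MaxAuthTries can be shown with a toy model in Python. This is a deliberately simplified sketch: it assumes every offered key costs exactly one attempt, whereas the real sshd counts authentication failures in a slightly more involved way. The key names key_a and key_b and the function can_authenticate are hypothetical.

```python
def can_authenticate(offered_keys, accepted_key, max_auth_tries):
    """Simplified model: each offered key costs one authentication attempt."""
    for attempt, key in enumerate(offered_keys, start=1):
        if attempt > max_auth_tries:
            return False  # the server drops the connection
        if key == accepted_key:
            return True
    return False


wanted = "ssh_pka_key_for_user_Suno_Ano"
agent_keys = ["key_a", "key_b", wanted]

# Without IdentityFile the client may walk through the agent's keys in order.
without_identity_file = can_authenticate(agent_keys, wanted, max_auth_tries=1)
# With IdentityFile the right key is offered first.
with_identity_file = can_authenticate([wanted], wanted, max_auth_tries=1)
print(without_identity_file, with_identity_file)
```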
Note, I do have another stanza of course as well
Host dolmen-devel
User sa
HostName devel.example.com
Port 58445
IdentityFile %d/.ssh/keypairs/ssh_pka_key_for_user_Suno_Ano
TCPKeepAlive yes
which I use for regular SSH access to rh0-ve3. The point is, it is the
same SSH keypair in both cases: once I log in as user gitosis and am
therefore limited by the command= option, and once I log in as myself,
sa, and can therefore administer this VE as usual after becoming
root from sa or using sudo for example.
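Again ... we have two different stanzas in ~/.ssh/config, same VE, same sshd, but different users and thus permissions. The stanza Host devel.example.com logs in as the system user gitosis and is what GIT uses for cloning, pulling and pushing, limited by the command= option. The stanza Host dolmen-devel logs in as sa and is used to administer
rh0-ve3 (we use an OpenVZ VE (Virtual Environment) but that does
not matter; a non-virtualized box would feel/behave no
different) as usual e.g. login via SSH, become root, issue
aptitude update && aptitude full-upgrade for example.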
180 sa@wks:~/0/gitosis_projects/dolmen$ git clone gitosis@devel.example.com:gitosis-admin.git
181 Initialized empty Git repository in /home/sa/0/gitosis_projects/dolmen/gitosis-admin/.git/
182
183 / \ _-'
184 _/ \-''- _ /
185 __-' { \
186 / \
187 / "o. |o }
188 | \ ; YOU ARE BEING WATCHED!
189 ',
190 \_ __\
191 ''-_ \.//
192 / '-____'
193 /
194 _'
195 _-'
196
197
198 This computer system is the private property of its owner, whether individual, corporate or government. It is
199 for authorized use only. Users (authorized or unauthorized) have no explicit or implicit expectation of
200 privacy.
201
202 Any or all uses of this system and all files on this system may be intercepted, monitored, recorded, copied,
203 audited, inspected, and disclosed to your employer, to authorized site, government, and law enforcement
204 personnel, as well as authorized officials of government agencies, both domestic and foreign.
205
206 By using this system, the user consents to such interception, monitoring, recording, copying, auditing,
207 inspection, and disclosure at the discretion of such personnel or officials.
208
209
210 UNAUTHORIZED OR IMPROPER USE OF THIS SYSTEM MAY RESULT
211 IN CIVIL AND CRIMINAL PENALTIES AND ADMINISTRATIVE OR
212 DISCIPLINARY ACTION, AS APPROPRIATE !!
213
214
215 By continuing to use this system you indicate your awareness of and consent to these terms and conditions of
216 use. LOG OFF IMMEDIATELY if you do not agree to the conditions stated in this warning. However, if you are
217 authorized personal with no bad intentions please continue. Have a nice day! :-)
218
219 remote: Counting objects: 5, done.
220 remote: Compressing objects: 100% (5/5), done.
221 remote: Total 5 (delta 0), reused 5 (delta 0)
222 Receiving objects: 100% (5/5), done.
223 sa@wks:~/0/gitosis_projects/dolmen$ la
224 total 12
225 drwxr-xr-x 3 sa sa 4096 2009-07-12 18:45 .
226 drwxr-xr-x 3 sa sa 4096 2009-07-12 18:36 ..
227 drwxr-xr-x 4 sa sa 4096 2009-07-12 18:45 gitosis-admin
228 sa@wks:~/0/gitosis_projects/dolmen$ cd gitosis-admin/
229 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin$ la
230 total 20
231 drwxr-xr-x 4 sa sa 4096 2009-07-12 18:45 .
232 drwxr-xr-x 3 sa sa 4096 2009-07-12 18:45 ..
233 drwxr-xr-x 8 sa sa 4096 2009-07-12 18:45 .git
234 -rw-r--r-- 1 sa sa 90 2009-07-12 18:45 gitosis.conf
235 drwxr-xr-x 2 sa sa 4096 2009-07-12 18:45 keydir
236 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin$ la keydir/
237 total 12
238 drwxr-xr-x 2 sa sa 4096 2009-07-12 18:45 .
239 drwxr-xr-x 4 sa sa 4096 2009-07-12 18:45 ..
240 -rw-r--r-- 1 sa sa 1501 2009-07-12 18:45 sunoano.pub
241 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin$ cat keydir/sunoano.pub
242 ssh-rsa AAAAB3NzaC1yc [skipping a lot of characters ...] q5nMIekW2TNts24F/9k2NQ== PKA (Public Key Authentication) SSH keypair for user Suno Ano; reach me at sunoano
243 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin$ cat gitosis.conf
244 [gitosis]
245
246 [group gitosis-admin]
247 writable = gitosis-admin
248 members = sunoano
249
With line 180 we transfer those bits and pieces needed for
administering GITosis onto my local machine (wks) — note that
devel.example.com in line 180 triggers all the stuff we put in place
with lines 174 to 179 e.g. we use port 58445 without explicitly
specifying it in line 180.
What follows is the usual banner message and some GIT specific chatter in lines 219 to 222. From now on, everybody will see this banner message when cloning/pulling/fetching from one of our GIT repositories via SSH — of course, one might alter the banner message to whatever he might think fits better e.g. some message telling those who clone that this is GITosis they are talking to, company info, some URL to some website, etc.
The result of line 180 can be seen from line 227 onwards, for
example ../gitosis-admin/keydir which is used to collect and store
the public SSH keys of anybody who has write/commit/push permissions
to one of our GIT repositories.
As we can see, gitosis.conf is now also present on our local machine
i.e. with GITosis we do not even need to enter rh0-ve3 via SSH and
edit gitosis.conf on the remote machine (lines 126 to 131) but we can
do all management tasks locally and when done, use git push in order
to push them to rh0-ve3 and thereby make the settings active on the
remote machine i.e. our GITosis hosting platform running on rh0-ve3.
That part is pure GIT power: a decentralized SCM system does not need to be connected to some central instance all the time in order for us to get some work done.
We could for example configure a new repository while sitting on some
airplane without connectivity to the Internet and then, once we have
Internet connectivity again, just issue git push and the new
repository with all its permissions and users will be available
immediately .... that is just plain cool! GIT is just plain cool I
should say. Try this with some centralized SCM like for example SVN
;-]
250 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin$
251
252
253 [ here we use nano to edit /home/sa/0/gitosis_projects/dolmen/gitosis-admin/gitosis.conf ... ]
254
255
256 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin$ cat gitosis.conf
257 [gitosis]
258
259 [group gitosis-admin]
260 writable = gitosis-admin
261 members = sunoano
262
263 [group dolmen]
264 members = sunoano
265 writable = dolmen
266 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin$ git dwh
267 diff --git a/gitosis.conf b/gitosis.conf
268 index b8000ed..621dc63 100644
269 --- a/gitosis.conf
270 +++ b/gitosis.conf
271 @@ -4,3 +4,6 @@
272 writable = gitosis-admin
273 members = sunoano
274
275 +[group dolmen]
276 +members = sunoano
277 +writable = dolmen
278 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin$ git cwh -m 'allow Suno Ano write access to dolmen'
279 [master d939ac7] allow Suno Ano write access to dolmen
280 1 files changed, 3 insertions(+), 0 deletions(-)
Next we edit our local gitosis.conf in order to provide
write/commit/push permissions to our first user. The entry we make in
line 264 has to be the same name as the name of the public keyfile
(line 240) of this user but without the .pub extension. This is how
permitting write/commit/push for a new user works — collecting their
public key files in ../keydir and adding the name of their keyfile to
the members line in gitosis.conf, that is all, very simple and
straight forward — can be done on any airplane if need be, I know ;-]
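The rule that every name on a members line needs a matching key file in keydir can be checked mechanically before pushing. What follows is a minimal sketch, assuming gitosis.conf parses as plain INI (the examples in this article do). The helper names members_without_key and key_names_in are made up for this illustration and are not part of GITosis.

```python
import configparser
from pathlib import Path


def members_without_key(conf_text, key_names):
    """Return members of all [group ...] sections that lack a key file."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    missing = set()
    for section in parser.sections():
        if not section.startswith("group "):
            continue
        for member in parser.get(section, "members", fallback="").split():
            if member not in key_names:
                missing.add(member)
    return sorted(missing)


def key_names_in(keydir):
    """Names of the public key files in keydir, without the .pub extension."""
    return {path.stem for path in Path(keydir).glob("*.pub")}


GITOSIS_CONF = """\
[gitosis]

[group gitosis-admin]
writable = gitosis-admin
members = sunoano

[group dolmen]
members = sunoano
writable = dolmen
"""

# With keydir/sunoano.pub present nothing is missing.
print(members_without_key(GITOSIS_CONF, {"sunoano"}))
```

Inside the clone of gitosis-admin one would call members_without_key(Path("gitosis.conf").read_text(), key_names_in("keydir")) and only push if the result is empty.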
With our current setup, we will now also specify the name of a new
project called Dolmen i.e. we will have dolmen.git, a
bare GIT repository, on the remote machine once we are done.
Therefore we create a new group in line 263 — it makes sense to name
the group dolmen as well, same name as the project name in line 265.
However, the group name does not need to be the same as the
repository/project name.
Naming generally works like this: The repository name on the remote
machine rh0-ve3 (dolmen.git) has the suffix .git. The project name
(line 265) comes without the suffix, and the directory on the
filesystem which we create with line 297 and which contains the data
like for example source code for Dolmen, also has the name dolmen.
Starting with line 266 we make use of some of my
aliases in ~/.gitconfig like for example git dwh which is short for
git diff HEAD.
With line 278 we commit the changes to our local clone of
gitosis-admin and with line 281 we push them to rh0-ve3 i.e. our
GITosis hosting platform.
281 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin$ git push
282
283
284 [skipping a lot of lines ...]
285
286
287 Counting objects: 5, done.
288 Delta compression using up to 4 threads.
289 Compressing objects: 100% (3/3), done.
290 Writing objects: 100% (3/3), 393 bytes, done.
291 Total 3 (delta 0), reused 0 (delta 0)
292 To gitosis@devel.example.com:gitosis-admin.git
293 3c86640..d939ac7 master -> master
294 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin$ gllol
295 d939ac7fa4f9a29e517541f494e00285d18a4b63 10 seconds ago CN: Suno Ano AN: Suno Ano S: allow Suno Ano write access to dolmen
296 3c866407f4fabd7c7afbcf434b76024b1476e58d 24 hours ago CN: Gitosis Admin AN: Gitosis Admin S: Automatic creation of gitosis repository.
That the push was successful can be seen from lines 287 to 293.
Internally, for this push, the remote sshd checked whether we are in
possession of the private key ssh_pka_key_for_user_Suno_Ano before
GITosis let us write to gitosis-admin. Also, as
before, the SSH settings in ~/.ssh/config were responsible that GIT
knew where to put its stuff.
gllol in line 294 is a somewhat fancy command which I have come to like a lot since it shows me what I need to know quite easily and with not much effort.
297 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin$ mkdir dolmen; cd dolmen
298 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin/dolmen$ git init
299 Initialized empty Git repository in /home/sa/0/gitosis_projects/dolmen/gitosis-admin/dolmen/.git/
300 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin/dolmen$ git remote add origin gitosis@devel.example.com:dolmen.git
301 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin/dolmen$ echo "WRITEME" > README
302 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin/dolmen$ la
303 total 16
304 drwxr-xr-x 3 sa sa 4096 2009-07-13 16:11 .
305 drwxr-xr-x 5 sa sa 4096 2009-07-13 16:09 ..
306 drwxr-xr-x 7 sa sa 4096 2009-07-13 16:10 .git
307 -rw-r--r-- 1 sa sa 8 2009-07-13 16:11 README
308 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin/dolmen$ git add README
309 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin/dolmen$ git status
310 # On branch master
311 #
312 # Initial commit
313 #
314 # Changes to be committed:
315 # (use "git rm --cached <file>..." to unstage)
316 #
317 # new file: README
318 #
319 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin/dolmen$ git cwh -m 'initial commit'
320 [master (root-commit) ac84821] initial commit
321 1 files changed, 1 insertions(+), 0 deletions(-)
322 create mode 100644 README
323 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin/dolmen$ gllol
324 ac8482172485bc3322ab7e22a189dd320bf666f9 2 seconds ago CN: Suno Ano AN: Suno Ano S: initial commit
325 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin/dolmen$ cat .git/config
326 [core]
327 repositoryformatversion = 0
328 filemode = true
329 bare = false
330 logallrefupdates = true
331 [remote "origin"]
332 url = gitosis@devel.example.com:dolmen.git
333 fetch = +refs/heads/*:refs/remotes/origin/*
We have already specified a new group above in line 263 and specified that our new repository will allow write/commit/push actions. That is very cool but then, it would be even cooler if we actually had that repository too no? ;-]
With line 297/298 we create it on our local machine, add the remote
information in line 300 and add some file in line 301. Line 300, where
we set the origin, is yet another line which implicitly uses our SSH
settings in ~/.ssh/config.
After looking at the current status with line 309, we commit the
changes with line 319 which works fine as can be seen. Again, git cwh
-m is an alias in ~/.gitconfig and is just the short version of git
commit -a -s -m.
Now is a good time to take a look at the config file of our just
created repository. As we can see in line 332, the remote called origin
now points at dolmen.git on rh0-ve3 respectively
devel.example.com i.e. at our own GITosis hosting platform. The bare
repository itself does not exist there yet; it is created with the
first push (line 340).
334 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin/dolmen$ git push origin master:refs/heads/master
335
336
337 [skipping a lot of lines ...]
338
339
340 Initialized empty Git repository in /srv/gitosis/repositories/dolmen.git/
341 Counting objects: 3, done.
342 Writing objects: 100% (3/3), 233 bytes, done.
343 Total 3 (delta 0), reused 0 (delta 0)
344 To gitosis@devel.example.com:dolmen.git
345 * [new branch] master -> master
Last but not least, after making our local repository think it got cloned
from devel.example.com:dolmen.git with line 300, we push again with a
somewhat special argument, a refspec, in line 334 and thus make the
whole thing complete. Now we have a master branch on
both sides, locally and remotely, which are in sync, and therefore
the repository is ready to be used ... off to the races ladies and
gentlemen, start your engines ;-]
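The refspec from line 334 has the general shape [+]<src>:<dst>, where src names the local ref and dst the ref to update on the other side. The following Python sketch merely splits such a string into its parts to show how it reads; it is not how GIT parses refspecs internally, and parse_refspec is a name invented for this example.

```python
def parse_refspec(refspec):
    """Split a refspec of the form [+]<src>:<dst> into its parts."""
    force = refspec.startswith("+")  # a leading + allows non-fast-forward updates
    body = refspec[1:] if force else refspec
    src, sep, dst = body.partition(":")
    if not sep:
        # Without a colon, git push normally updates the same-named ref.
        dst = src
    return {"force": force, "src": src, "dst": dst}


push_spec = parse_refspec("master:refs/heads/master")               # line 334
fetch_spec = parse_refspec("+refs/heads/*:refs/remotes/origin/*")   # line 333
print(push_spec)
print(fetch_spec)
```

Read this way, line 334 says: take the local branch master and create or update refs/heads/master in the remote repository.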
346 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin/dolmen$ ssh dolmen-devel
347
348
349 [skipping a lot of lines ...]
350
351
352 sa@rh0-ve3:~$ su
353 Password:
354 rh0-ve3:/home/sa# cd /srv/gitosis/repositories/
355 rh0-ve3:/srv/gitosis/repositories# la
356 total 16
357 drwxr-xr-x 4 gitosis gitosis 4096 2009-07-13 14:13 .
358 drwxr-xr-x 5 gitosis gitosis 4096 2009-07-12 13:52 ..
359 drwxr-x--- 7 gitosis gitosis 4096 2009-07-13 14:13 dolmen.git
360 drwxr-x--- 8 gitosis gitosis 4096 2009-07-13 13:41 gitosis-admin.git
361 rh0-ve3:/srv/gitosis/repositories# cd dolmen.git/
362 rh0-ve3:/srv/gitosis/repositories/dolmen.git# la
363 total 40
364 drwxr-x--- 7 gitosis gitosis 4096 2009-07-13 14:13 .
365 drwxr-xr-x 4 gitosis gitosis 4096 2009-07-13 14:13 ..
366 drwxr-xr-x 2 gitosis gitosis 4096 2009-07-13 14:13 branches
367 -rw-r--r-- 1 gitosis gitosis 66 2009-07-13 14:13 config
368 -rw-r--r-- 1 gitosis gitosis 73 2009-07-13 14:13 description
369 -rw-r--r-- 1 gitosis gitosis 23 2009-07-13 14:13 HEAD
370 drwxr-xr-x 2 gitosis gitosis 4096 2009-07-13 14:13 hooks
371 drwxr-xr-x 2 gitosis gitosis 4096 2009-07-13 14:13 info
372 drwxr-xr-x 7 gitosis gitosis 4096 2009-07-13 14:13 objects
373 drwxr-xr-x 4 gitosis gitosis 4096 2009-07-13 14:13 refs
374 rh0-ve3:/srv/gitosis/repositories/dolmen.git# cat config
375 [core]
376 repositoryformatversion = 0
377 filemode = true
378 bare = true
379 rh0-ve3:/srv/gitosis/repositories/dolmen.git# la info/
380 total 12
381 drwxr-xr-x 2 gitosis gitosis 4096 2009-07-13 14:13 .
382 drwxr-x--- 7 gitosis gitosis 4096 2009-07-13 14:13 ..
383 -rw-r--r-- 1 gitosis gitosis 240 2009-07-13 14:13 exclude
384 rh0-ve3:/srv/gitosis/repositories/dolmen.git# la refs/
385 total 16
386 drwxr-xr-x 4 gitosis gitosis 4096 2009-07-13 14:13 .
387 drwxr-x--- 7 gitosis gitosis 4096 2009-07-13 14:13 ..
388 drwxr-xr-x 2 gitosis gitosis 4096 2009-07-13 14:13 heads
389 drwxr-xr-x 2 gitosis gitosis 4096 2009-07-13 14:13 tags
390 rh0-ve3:/srv/gitosis/repositories/dolmen.git# la refs/heads/
391 total 12
392 drwxr-xr-x 2 gitosis gitosis 4096 2009-07-13 14:13 .
393 drwxr-xr-x 4 gitosis gitosis 4096 2009-07-13 14:13 ..
394 -rw-r--r-- 1 gitosis gitosis 41 2009-07-13 14:13 master
395 rh0-ve3:/srv/gitosis/repositories/dolmen.git# cat refs/heads/master
396 ac8482172485bc3322ab7e22a189dd320bf666f9
397 rh0-ve3:/srv/gitosis/repositories/dolmen.git# exit
398 exit
399 sa@rh0-ve3:~$ exit
400 logout
401 Connection to devel.example.com closed.
402 sa@wks:~/0/gitosis_projects/dolmen/gitosis-admin/dolmen$ cd /tmp/test/
Lines 346 to 397 are just to take another look around on rh0-ve3 after
we pushed all the local configurations we did upstream i.e. from the
local machine (wks) to the remote machine (rh0-ve3).
Note that this time, in line 346, we use dolmen-devel to refer to my
usual stanza in ~/.ssh/config i.e. the one where we use my standard
user sa rather than our system user gitosis.
The thing that is most interesting here is with line 396 — the
object type we are looking at here is a so-called commit object, the
one (ac84821 ...) we created with line 319 on our local machine and
which is now available, after the push in line 334, on rh0-ve3 as
well.
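A branch file such as refs/heads/master holds nothing but the id of the commit object the branch points to: 40 hexadecimal digits plus a newline, which matches the file size of 41 bytes in line 394. The short Python sketch below validates such content and compares it with the abbreviated id from line 320. The id is the one gllol printed in line 324; parse_ref is a hypothetical helper name.

```python
import re


def parse_ref(content):
    """Return the 40-character object id stored in a ref file's content."""
    object_id = content.strip()
    if not re.fullmatch(r"[0-9a-f]{40}", object_id):
        raise ValueError(f"not a SHA-1 object id: {object_id!r}")
    return object_id


# 40 hex digits + newline = 41 bytes, as listed for refs/heads/master.
ref_content = "ac8482172485bc3322ab7e22a189dd320bf666f9\n"
object_id = parse_ref(ref_content)

# The abbreviated id from the commit message in line 320 is a prefix of it.
print(len(ref_content), object_id.startswith("ac84821"))
```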
There are two ways this can be done
git pull is all that is needed to
update a local clone to the current status of the upstream
repository. If at some point, they would like to become
contributors, the GITosis administrator can collect their public
SSH key and add them to the members line in gitosis.conf.
403 sa@wks:/tmp/test$ git clone gitosis@devel.example.com:dolmen.git
404 Initialized empty Git repository in /tmp/test/dolmen/.git/
405
406 / \ _-'
407 _/ \-''- _ /
408 __-' { \
409 / \
410 / "o. |o }
411 | \ ; YOU ARE BEING WATCHED!
412 ',
413 \_ __\
414 ''-_ \.//
415 / '-____'
416 /
417 _'
418 _-'
419
420
421 This computer system is the private property of its owner, whether individual, corporate or government. It is
422 for authorized use only. Users (authorized or unauthorized) have no explicit or implicit expectation of
423 privacy.
424
425 Any or all uses of this system and all files on this system may be intercepted, monitored, recorded, copied,
426 audited, inspected, and disclosed to your employer, to authorized site, government, and law enforcement
427 personnel, as well as authorized officials of government agencies, both domestic and foreign.
428
429 By using this system, the user consents to such interception, monitoring, recording, copying, auditing,
430 inspection, and disclosure at the discretion of such personnel or officials.
431
432
433 UNAUTHORIZED OR IMPROPER USE OF THIS SYSTEM MAY RESULT
434 IN CIVIL AND CRIMINAL PENALTIES AND ADMINISTRATIVE OR
435 DISCIPLINARY ACTION, AS APPROPRIATE !!
436
437
438 By continuing to use this system you indicate your awareness of and consent to these terms and conditions of
439 use. LOG OFF IMMEDIATELY if you do not agree to the conditions stated in this warning. However, if you are
440 authorized personal with no bad intentions please continue. Have a nice day! :-)
441
442 remote: Counting objects: 3, done.
443 remote: Total 3 (delta 0), reused 0 (delta 0)
444 Receiving objects: 100% (3/3), done.
445 sa@wks:/tmp/test$ la
446 total 12
447 drwxr-xr-x 3 sa sa 4096 2009-07-13 17:19 .
448 drwxrwxrwt 26 root root 4096 2009-07-13 17:19 ..
449 drwxr-xr-x 3 sa sa 4096 2009-07-13 17:19 dolmen
450 sa@wks:/tmp/test$ cat dolmen/README
451 WRITEME
452 sa@wks:/tmp/test$
We issue line 403 and what happens is just plain lovely! Cloning as a
user who has write/commit/push permissions worked just fine as line
444 shows — on my local machine, wks, I have provided SSH-agent with
my private key before so ...
We now have a clone of Dolmen in /tmp which also contains README, the
file we created in line 301 on our local machine, then pushed to the
remote machine rh0-ve3 and now, again, cloned i.e. transferred it from
the remote machine onto our local machine wks.
Ok, great, cloning works but what about doing some changes locally and
then pushing them back again onto our GITosis hosting platform that, among
other GIT repositories, houses dolmen.git? Or in other words, what
would the girl/boy see if she/he were a contributor with
write/commit/push permissions to Dolmen? Let us find out ...
453 sa@wks:/tmp/test$ cd dolmen/
454 sa@wks:/tmp/test/dolmen$ la
455 total 16
456 drwxr-xr-x 3 sa sa 4096 2009-07-13 17:19 .
457 drwxr-xr-x 3 sa sa 4096 2009-07-13 17:19 ..
458 drwxr-xr-x 8 sa sa 4096 2009-07-13 17:19 .git
459 -rw-r--r-- 1 sa sa 8 2009-07-13 17:19 README
460 sa@wks:/tmp/test/dolmen$ git st
461 # On branch master
462 nothing to commit (working directory clean)
463 sa@wks:/tmp/test/dolmen$ echo "PLEASE WRITEME" > README
464 sa@wks:/tmp/test/dolmen$ cat README
465 PLEASE WRITEME
466 sa@wks:/tmp/test/dolmen$ git st
467 # On branch master
468 # Changed but not updated:
469 # (use "git add <file>..." to update what will be committed)
470 # (use "git checkout -- <file>..." to discard changes in working directory)
471 #
472 # modified: README
473 #
474 no changes added to commit (use "git add" and/or "git commit -a")
475 sa@wks:/tmp/test/dolmen$ git cwh -m 'did some changes to README'
476 [master a137d2d] did some changes to README
477 1 files changed, 1 insertions(+), 1 deletions(-)
478 sa@wks:/tmp/test/dolmen$ git push
479
480 / \ _-'
481 _/ \-''- _ /
482 __-' { \
483 / \
484 / "o. |o }
485 | \ ; YOU ARE BEING WATCHED!
486 ',
487 \_ __\
488 ''-_ \.//
489 / '-____'
490 /
491 _'
492 _-'
493
494
495 This computer system is the private property of its owner, whether individual, corporate or government. It is
496 for authorized use only. Users (authorized or unauthorized) have no explicit or implicit expectation of
497 privacy.
498
499 Any or all uses of this system and all files on this system may be intercepted, monitored, recorded, copied,
500 audited, inspected, and disclosed to your employer, to authorized site, government, and law enforcement
501 personnel, as well as authorized officials of government agencies, both domestic and foreign.
502
503 By using this system, the user consents to such interception, monitoring, recording, copying, auditing,
504 inspection, and disclosure at the discretion of such personnel or officials.
505
506
507 UNAUTHORIZED OR IMPROPER USE OF THIS SYSTEM MAY RESULT
508 IN CIVIL AND CRIMINAL PENALTIES AND ADMINISTRATIVE OR
509 DISCIPLINARY ACTION, AS APPROPRIATE !!
510
511
512 By continuing to use this system you indicate your awareness of and consent to these terms and conditions of
513 use. LOG OFF IMMEDIATELY if you do not agree to the conditions stated in this warning. However, if you are
514 authorized personal with no bad intentions please continue. Have a nice day! :-)
515
516 Counting objects: 5, done.
517 Writing objects: 100% (3/3), 286 bytes, done.
518 Total 3 (delta 0), reused 0 (delta 0)
519 To gitosis@devel.example.com:dolmen.git
520 ac84821..a137d2d master -> master
521 sa@wks:/tmp/test/dolmen$ gllol
522 a137d2d85c68dd44a6b755ea8c020d6b1116c283 10 seconds ago CN: Suno Ano AN: Suno Ano S: did some changes to README
523 ac8482172485bc3322ab7e22a189dd320bf666f9 2 days ago CN: Suno Ano AN: Suno Ano S: initial commit
We do an edit in line 463, check the status with line 466 and commit the change/edit from line 463 with line 475. Now, will the push towards upstream work? It sure does as we can see from line 478 onward ... piece of cake! ;-]
The closer look in line 521 and following provides us with more details — we just went full circle ... we cloned the upstream repository, made local edits/changes which we committed and finally pushed them back into the upstream repository onto our GITosis hosting platform. Excellent!
Next we are going to set up GIT-daemon for anonymous read/pull/fetch access, the second one of two possible choices.
524 sa@wks:/tmp/test/dolmen$ ssh dolmen-devel
525
526
527 [skipping a lot of lines ...]
528
529
530 sa@rh0-ve3:~$ su
531 Password:
532 rh0-ve3:/home/sa# aptitude install git-daemon-run
533 Reading package lists... Done
534 Building dependency tree
535 Reading state information... Done
536
537
538 [skipping a lot of lines ...]
539
540
541 Reading extended state information
542 Initializing package states... Done
543 Writing extended state information... Done
544
At first we need to enter rh0-ve3 again and install GIT-daemon. The
Debian package for GIT-daemon is called git-daemon-run.
545 rh0-ve3:/home/sa# cd /usr/share/doc/git-daemon-run/
546 rh0-ve3:/usr/share/doc/git-daemon-run# la
547 total 300
548 drwxr-xr-x 2 root root 4096 2009-07-15 06:37 .
549 drwxr-xr-x 273 root root 12288 2009-07-15 06:37 ..
550 -rw-r--r-- 1 root root 15971 2009-06-26 08:18 changelog.Debian.gz
551 -rw-r--r-- 1 root root 259657 2009-06-26 08:18 changelog.gz
552 -rw-r--r-- 1 root root 3412 2009-06-26 08:18 copyright
553 -rw-r--r-- 1 root root 1143 2009-06-26 08:18 README.Debian
554 rh0-ve3:/usr/share/doc/git-daemon-run# cat /var/log/git-daemon/current
555 2009-07-15_06:37:39.93133 git-daemon starting.
556 rh0-ve3:/usr/share/doc/git-daemon-run# sv stat git-daemon
557 run: git-daemon: (pid 6156) 3275s; run: log: (pid 6155) 3275s
558 rh0-ve3:/usr/share/doc/git-daemon-run# ls -la /etc/init.d/ | grep git
559 rh0-ve3:/usr/share/doc/git-daemon-run# ln -s /usr/bin/sv /etc/init.d/git-daemon
560 rh0-ve3:/usr/share/doc/git-daemon-run# ls -la /etc/init.d/ | grep git
561 lrwxrwxrwx 1 root root 11 2009-07-15 07:33 git-daemon -> /usr/bin/sv
562 rh0-ve3:/usr/share/doc/git-daemon-run# cd /srv/gitosis/repositories/
563 rh0-ve3:/srv/gitosis/repositories# la
564 total 16
565 drwxr-xr-x 4 gitosis gitosis 4096 2009-07-13 14:13 .
566 drwxr-xr-x 5 gitosis gitosis 4096 2009-07-12 13:52 ..
567 drwxr-x--- 7 gitosis gitosis 4096 2009-07-15 11:36 dolmen.git
568 drwxr-x--- 8 gitosis gitosis 4096 2009-07-15 11:33 gitosis-admin.git
569 rh0-ve3:/srv/gitosis/repositories# sudo -u gitosis touch dolmen.git/git-daemon-export-ok
570 rh0-ve3:/srv/gitosis/repositories# la dolmen.git/
571 total 40
572 drwxr-x--- 7 gitosis gitosis 4096 2009-07-15 11:36 .
573 drwxr-xr-x 4 gitosis gitosis 4096 2009-07-13 14:13 ..
574 drwxr-xr-x 2 gitosis gitosis 4096 2009-07-13 14:13 branches
575 -rw-r--r-- 1 gitosis gitosis 66 2009-07-13 14:13 config
576 -rw-r--r-- 1 gitosis gitosis 73 2009-07-13 14:13 description
577 -rw-r--r-- 1 gitosis gitosis 0 2009-07-15 11:36 git-daemon-export-ok
578 -rw-r--r-- 1 gitosis gitosis 23 2009-07-13 14:13 HEAD
579 drwxr-xr-x 2 gitosis gitosis 4096 2009-07-13 14:13 hooks
580 drwxr-xr-x 2 gitosis gitosis 4096 2009-07-13 14:13 info
581 drwxr-xr-x 10 gitosis gitosis 4096 2009-07-15 05:15 objects
582 drwxr-xr-x 4 gitosis gitosis 4096 2009-07-13 14:13 refs
As usual, /usr/share/doc/git-daemon-run provides useful stuff (lines
550 to 553). From line 557 we can see that GIT-daemon is currently up
and running — installing git-daemon-run also started the daemon.
One important thing to note here is that GIT-daemon on Debian makes
use of runit — a UNIX init scheme with service supervision; it is a
replacement for SysV-init and other init schemes.
In order to make it work as usual i.e. in order to do
/etc/init.d/git-daemon restart for example, a symbolic link is
created in line 559.
We are going to set up GIT-daemon in such a way that, by default, it does not grant read/pull/fetch access to a newly created GIT repository. That is recommended since it avoids situations where a repository is available to the entire world when it should not be.
If the default is to not allow read/pull/fetch access, we need to
explicitly allow read/pull/fetch access for each repository. Line 569
shows how this is done — GIT-daemon looks for a file called
git-daemon-export-ok within each repository and only if it is present,
is it made available to the public.
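Which repositories are currently exported can therefore be read straight off the filesystem. The Python sketch below lists the *.git directories below a base path that contain the marker file. It only mimics the check described above and is no substitute for GIT-daemon's own logic; exported_repositories is a made-up name, and the demo runs in a temporary directory instead of /srv/gitosis/repositories.

```python
import tempfile
from pathlib import Path


def exported_repositories(base_path):
    """Names of *.git directories below base_path containing the marker file."""
    return sorted(
        repo.name
        for repo in Path(base_path).glob("*.git")
        if repo.is_dir() and (repo / "git-daemon-export-ok").is_file()
    )


# Demo in a throwaway directory that imitates the layout on rh0-ve3.
with tempfile.TemporaryDirectory() as base:
    (Path(base) / "dolmen.git").mkdir()
    (Path(base) / "gitosis-admin.git").mkdir()
    (Path(base) / "dolmen.git" / "git-daemon-export-ok").touch()
    demo_result = exported_repositories(base)

print(demo_result)
```

On rh0-ve3 the call would be exported_repositories("/srv/gitosis/repositories"), run as a user who may read that directory.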
Note that when using GITosis next to GIT-daemon, there is an even
smarter way to do this. As shown in line 569, we had to log
into the remote machine rh0-ve3 and create git-daemon-export-ok
manually. When using GITosis in its default setup, the manually
created git-daemon-export-ok will vanish with any git push from the
GITosis administrator's local machine. However, if we alter
gitosis.conf on wks to read
[gitosis]

[group gitosis-admin]
writable = gitosis-admin
members = sunoano

[group dolmen]
writable = dolmen
members = sunoano user2 user3 user4

[repo dolmen]
daemon = yes
i.e. if we add a [repo dolmen] stanza which contains daemon =
yes and then git push this configuration to the remote machine
rh0-ve3, a manually created git-daemon-export-ok will not vanish.
Even better, we do not need to create it manually as we did above
with line 569; GITosis will create git-daemon-export-ok for us
automatically, without the need for the GITosis administrator to log
into rh0-ve3 via SSH.
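Before pushing, it can be handy to see at a glance which repositories the configuration opens up to anonymous readers. A small Python sketch, again assuming gitosis.conf can be read as plain INI; daemon_enabled_repos is a hypothetical helper and not part of GITosis.

```python
import configparser


def daemon_enabled_repos(conf_text):
    """Return the names of all [repo ...] sections with daemon switched on."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    enabled = []
    for section in parser.sections():
        kind, _, name = section.partition(" ")
        if kind == "repo" and parser.getboolean(section, "daemon", fallback=False):
            enabled.append(name.strip())
    return sorted(enabled)


CONF_WITH_DAEMON = """\
[gitosis]

[group dolmen]
writable = dolmen
members = sunoano user2 user3 user4

[repo dolmen]
daemon = yes
"""

print(daemon_enabled_repos(CONF_WITH_DAEMON))
```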
Again, this is good since we could do it on an airplane for example
where we do not currently have access to the Internet and then later
if we have Internet again, issue git push and thus transfer all the
new settings to rh0-ve3 and activate them in the process.
583 rh0-ve3:/srv/gitosis/repositories# cat /etc/sv/git-daemon/run
584 #!/bin/sh
585 exec 2>&1
586 echo 'git-daemon starting.'
587 exec chpst -ugitdaemon \
588 /usr/lib/git-core/git-daemon --verbose --base-path=/var/cache /var/cache/git
589 rh0-ve3:/srv/gitosis/repositories#
590
591
592 [ here we use nano to edit /etc/sv/git-daemon/run ... ]
593
594
595 rh0-ve3:/srv/gitosis/repositories# cat /etc/sv/git-daemon/run
596 #!/bin/sh
597 exec 2>&1
598 echo 'git-daemon starting.'
599 exec chpst -ugitosis /usr/lib/git-core/git-daemon --base-path=/srv/gitosis/repositories
600 rh0-ve3:/srv/gitosis/repositories# sv restart git-daemon
601 ok: run: git-daemon: (pid 8282) 0s
602 rh0-ve3:/srv/gitosis/repositories# sv stat git-daemon
603 run: git-daemon: (pid 8282) 6s; run: log: (pid 6155) 18182s
604 rh0-ve3:/srv/gitosis/repositories# netstat -tulpen | grep 9418
605 tcp 0 0 0.0.0.0:9418 0.0.0.0:* LISTEN 105 680150 8282/git-daemon
606 tcp6 0 0 :::9418 :::* LISTEN 105 680151 8282/git-daemon
607 rh0-ve3:/srv/gitosis/repositories# type psa; psa git
608 psa is aliased to `ps aux | grep'
609 root 6154 0.0 0.0 108 28 ? Ss 06:37 0:00 runsv git-daemon
610 gitlog 6155 0.0 0.0 128 40 ? S 06:37 0:00 svlogd -tt /var/log/git-daemon
611 gitosis 8282 0.0 0.0 48972 1520 ? S 11:40 0:00 /usr/lib/git-core/git-daemon --base-path=/srv/gitosis/repositories
612 root 8288 0.0 0.0 7264 788 pts/1 S+ 11:41 0:00 grep git
613 rh0-ve3:/srv/gitosis/repositories# exit
614 exit
615 sa@rh0-ve3:~$ exit
616 logout
617 Connection to devel.example.com closed.
618 sa@wks:/tmp/test/dolmen$ cd ../..; mkdir test_git-daemon; cd test_git-daemon
619 sa@wks:/tmp/test_git-daemon$ git clone git://devel.dolmen-project.org/dolmen.git
620 Initialized empty Git repository in /tmp/test_git-daemon/dolmen/.git/
621 remote: Counting objects: 6, done.
622 remote: Compressing objects: 100% (2/2), done.
623 remote: Total 6 (delta 0), reused 0 (delta 0)
624 Receiving objects: 100% (6/6), done.
625 sa@wks:/tmp/test_git-daemon$ diff dolmen/README ../test/dolmen/README
626 sa@wks:/tmp/test_git-daemon$ cat dolmen/README
627 PLEASE WRITEME
628 sa@wks:/tmp/test_git-daemon$
Now we need to tell GIT-daemon where to find our repositories, which is
done by altering /etc/sv/git-daemon/run as can be seen above. Line 599
specifies the path to the GIT-daemon binary and, via --base-path, the
path to our repositories. Another important part is chpst -ugitosis,
which looks up the UID (User ID) and GID (Group ID) of our system user
gitosis in /etc/passwd and starts GIT-daemon with those values
(see line 46, 605/606 as well as 609).
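The lookup step that chpst -ugitosis performs can be illustrated with a few lines of Python. This is only a sketch of the lookup itself, assuming a Unix-like system; the function name resolve_user is made up for this example, and chpst additionally switches to those IDs before it executes the daemon.

```python
import pwd


def resolve_user(name):
    """Return (uid, gid) for a system user, read from the passwd database."""
    entry = pwd.getpwnam(name)  # raises KeyError if the user does not exist
    return entry.pw_uid, entry.pw_gid


# 'root' exists on practically every Unix-like system; on rh0-ve3 we
# would look up 'gitosis' instead.
print(resolve_user("root"))
```

On rh0-ve3, resolve_user("gitosis") should return the same IDs that line 923 lists for the gitosis user.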
After restarting GIT-daemon in line 600 and checking if it is up and
running with line 602, we also take a look at services on rh0-ve3 that
listen on port 9418 in lines 605 and 606 — of course we will see it
is GIT-daemon since 9418 is GIT-daemon's standard port.
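The same check can be made from any other machine without netstat. The following is a minimal sketch, assuming plain TCP over IPv4; is_port_open is a hypothetical helper name, not part of GIT.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise
        return sock.connect_ex((host, port)) == 0
```

Calling is_port_open("devel.dolmen-project.org", 9418) from wks would tell us whether GIT-daemon is reachable through any firewall in between; a host name that cannot be resolved still raises an exception.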
If we have a firewall in place, it has to allow access to port 9418 in
the filter table: in the INPUT and OUTPUT chains, or in the FORWARD
chain in the case of OpenVZ.
Line 611 is yet another quick check in order to be sure everything is up and running as expected ... we are done setting up GIT-daemon. Again, well done! ;-]
The rest is all about testing anonymous read/pull/fetch access and then comparing the result with our former work. There is no difference, i.e. line 627 shows the change we made above in line 463 when we were testing the write/commit/push functionality.
We wrote a Python script to automate the manual steps shown in lines
263 to 265, 281, 278, 297, 298, 300, 301, 319, 334. One way to use it
is ./create_repo.py <new_repository_name> and then it performs all the
steps from lines 297 to 334 for us automatically.
With our current setup it puts git-daemon-export-ok in place, i.e. it
grants read/pull/fetch permissions to anonymous users. However,
create_repo.py has a -p (or --private) switch that skips creating
git-daemon-export-ok, i.e. it does not grant read/pull/fetch access to
the general public (see above).
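The effect of the --private switch can be sketched as follows. This is not the code of create_repo.py, only a hypothetical illustration of the marker-file logic; set_export and main are names invented here, and Python 3.8 or later is assumed.

```python
import argparse
from pathlib import Path

EXPORT_MARKER = "git-daemon-export-ok"


def set_export(repo_dir, private=False):
    """Create the export marker for a public repository, remove it for a private one."""
    marker = Path(repo_dir) / EXPORT_MARKER
    if private:
        marker.unlink(missing_ok=True)  # needs Python 3.8+
    else:
        marker.touch()
    return marker.exists()


def main(argv=None):
    """Hypothetical command line front end mirroring the -p/--private switch."""
    parser = argparse.ArgumentParser(description="toggle anonymous read access")
    parser.add_argument("repository", help="path to a bare repository")
    parser.add_argument("-p", "--private", action="store_true",
                        help="do not create git-daemon-export-ok")
    args = parser.parse_args(argv)
    exported = set_export(args.repository, private=args.private)
    print("exported" if exported else "private")
```

Calling main(["-p", "/srv/gitosis/repositories/foo.git"]) would leave foo.git without the marker, so GIT-daemon would refuse to serve it to anonymous users.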
Later, when we set up GITweb, we will see that git-daemon-export-ok
not only provides read/pull/fetch access to anonymous users via
GIT-daemon but also gives anonymous users the joy of being able to
browse those GIT repositories using their web browser.
Browsing our GIT Repositories: Next we are going to put an additional
HTTP layer (GITweb) on top of the GITosis and GIT-daemon setup so
users can download snapshots in tar.gz or .zip format, browse, search,
etc. our GIT repositories ... all that using their web browser i.e. no
need to install any additional software.
We have successfully set up GITosis and GIT-daemon above. Now we want our users to be able to browse the GIT repositories on the server via GITweb. Before we start though, let us take a look at some screenshots taken while setting up GITweb on http://gitweb.dolmen-project.org — those mark our progress so the reader can acquire some taste for what is to come:
629 sa@wks:/tmp/test_git_daemon$ cd
630 sa@wks:~$ ssh dolmen-devel
631 sa@rh0-ve3:~$ su
632 Password:
633 rh0-ve3:/home/sa# type dpl; dpl gitweb | grep ii
634 dpl is aliased to `dpkg -l'
635 ii gitweb 1:1.6.3.3-2 fast, scalable, distributed revision control system (web interface)
636 rh0-ve3:/home/sa# dpl apache* | grep ii
637 ii apache2-mpm-worker 2.2.11-6 Apache HTTP Server - high speed threaded mod
638 ii apache2-utils 2.2.11-6 utility programs for web servers
639 ii apache2.2-bin 2.2.11-6 Apache HTTP Server common binary files
640 ii apache2.2-common 2.2.11-6 Apache HTTP Server common files
Two things need to be installed, as can be seen above: gitweb and some
httpd, for example apache2-mpm-worker. Once that is done, it does not
take long until we have things up and running.
First, however, we make a little detour by enabling HTTP (Hypertext
Transfer Protocol) for cloning/fetching/pulling our repositories
hosted on rh0-ve3, where GITosis is used to manage authentication and
repositories.
In other words, we will be able to do git clone
http://devel.dolmen-project.org/dolmen.git once we have finished our
detour. Note the http instead of git, as used for example in line
619.
641 rh0-ve3:/home/sa# cd /srv/gitosis/repositories/
642 rh0-ve3:/srv/gitosis/repositories# la
643 total 168
644 drwxr-xr-x 42 gitosis gitosis 4096 2009-08-05 09:38 .
645 drwxr-xr-x 5 gitosis gitosis 4096 2009-07-12 13:52 ..
646 drwxr-xr-x 7 gitosis gitosis 4096 2009-07-31 17:27 dolmen.app.authentication.git
647 drwxr-xr-x 7 gitosis gitosis 4096 2009-07-31 17:27 dolmen.app.content.git
648
649
650 [skipping a lot of lines ...]
651
652
653 drwxr-xr-x 7 gitosis gitosis 4096 2009-07-31 17:29 snappy.transform.git
654 drwxr-xr-x 7 gitosis gitosis 4096 2009-07-30 15:55 snappy.video.player.git
655 drwxr-xr-x 7 gitosis gitosis 4096 2009-07-30 15:55 snappy.video.transforms.git
656 rh0-ve3:/srv/gitosis/repositories# cat snappy.video.transforms.git/hooks/post-update.sample
657 #!/bin/sh
658 #
659 # An example hook script to prepare a packed repository for use over
660 # dumb transports.
661 #
662 # To enable this hook, rename this file to "post-update".
663
664 exec git-update-server-info
665 rh0-ve3:/srv/gitosis/repositories# mv snappy.video.transforms.git/hooks/post-update{.sample,}
The first of two things to do is to enable the post-update hook within
each repository we have. We do this by removing the .sample suffix
from the filename as shown in line 665. The hook runs
git-update-server-info (line 664).
As said, that has to be done for each repository, but it is only
shown here once for the snappy.video.transforms.git repository. After
the next commit/push to the repository the post-update hook will kick
in and do its magic and we can pull/fetch/clone over HTTP ... if Apache
has been configured to allow it that is ;-]
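Since the rename has to be repeated for every repository, it lends itself to a small loop. The following is a sketch under the assumption that all bare repositories sit directly below one directory and end in .git, as in the listing above; enable_post_update_hooks is a name chosen for this example.

```python
from pathlib import Path


def enable_post_update_hooks(projectroot):
    """Rename hooks/post-update.sample to hooks/post-update in every *.git directory."""
    enabled = []
    for repo in sorted(Path(projectroot).glob("*.git")):
        sample = repo / "hooks" / "post-update.sample"
        hook = repo / "hooks" / "post-update"
        # leave repositories alone that already have an active hook
        if sample.is_file() and not hook.exists():
            sample.rename(hook)
            hook.chmod(0o755)  # a hook only runs if it is executable
            enabled.append(repo.name)
    return enabled
```

Run as a user with write access, enable_post_update_hooks("/srv/gitosis/repositories") returns the names of the repositories it changed; a second run returns an empty list.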
The second thing to do is to give Apache read access to the
directory containing all our GIT repositories
(/srv/gitosis/repositories). We do this with a virtual host entry:
666 rh0-ve3:/srv/gitosis/repositories# grep -A15 'clone via http' /etc/apache2/sites-available/default
667 ### clone via http
668
669 <VirtualHost *:80>
670 ServerName devel.dolmen-project.org
671 DocumentRoot "/srv/gitosis/repositories"
672
673 <Directory "/srv/gitosis/repositories">
674 Options FollowSymlinks
675 Allow from all
676 AllowOverride all
677 Order allow,deny
678 </Directory>
679 </VirtualHost>
680
681
682 rh0-ve3:/srv/gitosis/repositories# apache2ctl graceful
683 apache2: Could not reliably determine the server's fully qualified domain name, using xx.xxx.xxx.xxx for ServerName
684 rh0-ve3:/srv/gitosis/repositories# netstat -tulpen | grep apach
685 tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 0 1112381 614/apache2
686 rh0-ve3:/srv/gitosis/repositories# exit
687 exit
688 sa@rh0-ve3:~$ exit
689 logout
690 Connection to devel.dolmen-project.org closed.
691 sa@wks:~$ cd /tmp
692 sa@wks:/tmp$ git clone http://devel.dolmen-project.org/misc.git
693 Initialized empty Git repository in /tmp/misc/.git/
694 got c29800bf59b3c329e8b012c04af452a9b49de7c6
695 walk c29800bf59b3c329e8b012c04af452a9b49de7c6
696 got 6508b60b0fc7aa20ddbcd1c3f7d2f2ff8b1e2fc0
697 got f100a6b63b7a13e8f3d154813e0d8e7260983d47
698
699
700 [skipping a lot of lines ...]
701
702
703 got a862fa709e3d8717717a7b49d45d675cb594c80d
704 walk a862fa709e3d8717717a7b49d45d675cb594c80d
705 got 16e0a5e646e103bac9581e67dc8ff129c98cffd0
706 sa@wks:/tmp$ la misc/
707 total 132
708 drwxr-xr-x 3 sa sa 4096 2009-08-07 12:11 .
709 drwxrwxrwt 29 root root 4096 2009-08-07 12:11 ..
710 -rwxr-xr-x 1 sa sa 6162 2009-08-07 12:11 create_repo.py
711 -rw-r--r-- 1 sa sa 65450 2009-08-07 12:11 dolmen_logo_big.png
712 -rw-r--r-- 1 sa sa 37254 2009-08-07 12:11 dolmen.svg
713 drwxr-xr-x 8 sa sa 4096 2009-08-07 12:11 .git
714 -rw-r--r-- 1 sa sa 284 2009-08-07 12:11 .gitignore
There is not much to say here except that it works, as can be seen from
lines 692 to 705. Note that we left rh0-ve3 and entered wks again
before we cloned in line 692. As mentioned, the hook is triggered by a
push to the repository, i.e. after enabling the hook there has to be a
push to make it work. That push is not shown here; one either has
to make an explicit push or wait until one happens for some other
reason.
715 sa@wks:/tmp$ ssh dolmen-devel
716 sa@rh0-ve3:~$ su
717 Password:
718 rh0-ve3:/home/sa# cd /usr/share/gitweb/
719 rh0-ve3:/usr/share/gitweb# la
720 total 24
721 drwxr-xr-x 2 root root 4096 2009-08-04 16:22 .
722 drwxr-xr-x 83 root root 4096 2009-08-04 16:22 ..
723 -rw-r--r-- 1 root root 164 2009-06-29 01:20 git-favicon.png
724 -rw-r--r-- 1 root root 208 2009-06-29 01:20 git-logo.png
725 -rw-r--r-- 1 root root 7431 2009-06-29 01:20 gitweb.css
726 rh0-ve3:/usr/share/gitweb# touch {footer,home}.html
727 rh0-ve3:/usr/share/gitweb# la
728 total 36
729 drwxr-xr-x 2 root root 4096 2009-08-08 09:06 .
730 drwxr-xr-x 83 root root 4096 2009-08-04 16:22 ..
731 -rw-r--r-- 1 root root 139 2009-08-08 09:02 footer.html
732 -rw-r--r-- 1 root root 164 2009-06-29 01:20 git-favicon.png
733 -rw-r--r-- 1 root root 208 2009-06-29 01:20 git-logo.png
734 -rw-r--r-- 1 root root 7514 2009-08-08 08:59 gitweb.css
735 -rw-r--r-- 1 root root 7499 2009-08-08 09:02 home.html
736 rh0-ve3:/usr/share/gitweb# cd /tmp/
737 rh0-ve3:/tmp# git clone git://devel.dolmen-project.org/misc.git
738 Initialized empty Git repository in /tmp/misc/.git/
739 remote: Counting objects: 32, done.
740 remote: Compressing objects: 100% (31/31), done.
741 remote: Total 32 (delta 12), reused 0 (delta 0)
742 Receiving objects: 100% (32/32), 282.30 KiB, done.
743 Resolving deltas: 100% (12/12), done.
744 rh0-ve3:/tmp# cd /usr/share/gitweb/
745 rh0-ve3:/usr/share/gitweb# cp /tmp/misc/a_dolmen_with_tree_in_front.jpg .
746 rh0-ve3:/usr/share/gitweb# cp /tmp/misc/dolmen_logo_big.png .
747 rh0-ve3:/usr/share/gitweb# la
748 total 316
749 drwxr-xr-x 2 root root 4096 2009-08-08 09:19 .
750 drwxr-xr-x 83 root root 4096 2009-08-04 16:22 ..
751 -rw-r--r-- 1 root root 210461 2009-08-08 09:19 a_dolmen_with_tree_in_front.jpg
752 -rw-r--r-- 1 root root 65450 2009-08-08 09:19 dolmen_logo_big.png
753 -rw-r--r-- 1 root root 139 2009-08-08 09:02 footer.html
754 -rw-r--r-- 1 root root 164 2009-06-29 01:20 git-favicon.png
755 -rw-r--r-- 1 root root 208 2009-06-29 01:20 git-logo.png
756 -rw-r--r-- 1 root root 7514 2009-08-08 08:59 gitweb.css
757 -rw-r--r-- 1 root root 7499 2009-08-08 09:02 home.html
758 rh0-ve3:/usr/share/gitweb# cd /usr/lib/cgi-bin/
759 rh0-ve3:/usr/lib/cgi-bin# la
760 total 212
761 drwxr-xr-x 2 root root 4096 2009-08-08 10:17 .
762 drwxr-xr-x 42 root root 16384 2009-08-03 17:00 ..
763 -rwxr-xr-x 1 root root 190145 2009-06-29 01:20 gitweb.cgi
764 rh0-ve3:/usr/lib/cgi-bin# ln -s /usr/share/gitweb/footer.html
765 rh0-ve3:/usr/lib/cgi-bin# ln -s /usr/share/gitweb/home.html
766 rh0-ve3:/usr/lib/cgi-bin# ln -s /usr/share/gitweb/git-favicon.png
767 rh0-ve3:/usr/lib/cgi-bin# ln -s /usr/share/gitweb/git-logo.png
768 rh0-ve3:/usr/lib/cgi-bin# ln -s /usr/share/gitweb/gitweb.css
769 rh0-ve3:/usr/lib/cgi-bin# ln -s /usr/share/gitweb/a_dolmen_with_tree_in_front.jpg
770 rh0-ve3:/usr/lib/cgi-bin# ln -s /usr/share/gitweb/dolmen_logo_big.png
771 rh0-ve3:/usr/lib/cgi-bin# la
772 total 212
773 drwxr-xr-x 2 root root 4096 2009-08-08 10:18 .
774 drwxr-xr-x 42 root root 16384 2009-08-03 17:00 ..
775 lrwxrwxrwx 1 root root 49 2009-08-08 10:18 a_dolmen_with_tree_in_front.jpg -> /usr/share/gitweb/a_dolmen_with_tree_in_front.jpg
776 lrwxrwxrwx 1 root root 37 2009-08-08 10:18 dolmen_logo_big.png -> /usr/share/gitweb/dolmen_logo_big.png
777 lrwxrwxrwx 1 root root 29 2009-08-08 10:17 footer.html -> /usr/share/gitweb/footer.html
778 lrwxrwxrwx 1 root root 33 2009-08-08 10:17 git-favicon.png -> /usr/share/gitweb/git-favicon.png
779 lrwxrwxrwx 1 root root 30 2009-08-08 10:17 git-logo.png -> /usr/share/gitweb/git-logo.png
780 -rwxr-xr-x 1 root root 190145 2009-06-29 01:20 gitweb.cgi
781 lrwxrwxrwx 1 root root 28 2009-08-08 10:17 gitweb.css -> /usr/share/gitweb/gitweb.css
782 lrwxrwxrwx 1 root root 27 2009-08-08 10:17 home.html -> /usr/share/gitweb/home.html
When we install gitweb, it places its CGI (Common Gateway Interface)
script (line 763) into /usr/lib/cgi-bin and all the rest into
/usr/share/gitweb (lines 723 to 725). We can elevate security by
using chattr on /usr/lib/cgi-bin/gitweb.cgi. There is also a config
file, /etc/gitweb.conf, which is shown in lines 783 to 870.
What we do above is simply link the files from /usr/share/gitweb to
/usr/lib/cgi-bin as lines 764 to 770 show. In addition, we create
the files footer.html and home.html, which we will later
fill with HTML code in order to provide a footer and a message on the
main page of our GITweb site.
We also grab two images directly from one of our already existing GIT repositories as can be seen in lines 737 to 746.
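The seven ln -s calls from lines 764 to 770 can also be expressed as one loop. This is a sketch only, assuming we want every regular file of one directory linked into another; link_into is a hypothetical helper name.

```python
import os
from pathlib import Path


def link_into(source_dir, target_dir):
    """Symlink every regular file from source_dir into target_dir, skipping existing names."""
    created = []
    for src in sorted(Path(source_dir).iterdir()):
        dst = Path(target_dir) / src.name
        # lexists is also True for dangling symlinks, which we do not overwrite
        if src.is_file() and not os.path.lexists(dst):
            os.symlink(src.resolve(), dst)  # absolute link target, as in the listing
            created.append(src.name)
    return created
```

On rh0-ve3 the call would be link_into("/usr/share/gitweb", "/usr/lib/cgi-bin"), run as root.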
783 rh0-ve10:/usr/lib/cgi-bin# cat /etc/gitweb.conf
784 ### GITweb config file for gitweb.dolmen-project.org
785
786 # directory to use for temp files
787 $git_temp = "/tmp";
788
789 # HTML text to include/render
790 $home_text = "home.html";
791 $site_footer = "footer.html";
792
793 # Sorting key for main page
794 $default_projects_order = "age";
795
796 # Project root for GITweb. This is the parent directory for all of
797 # your GIT repositories. As an example, 'gitosis-admin.git' should
798 # reside in this directory.
799 $projectroot = "/srv/gitosis/repositories";
800 $projects_list = $projectroot;
801
802 # Web display files. These are all _relative_ paths from the active
803 # gitweb.cgi file. If all three of these files are located in the same
804 # directory as gitweb.cgi (/usr/lib/cgi-bin), then the below settings
805 # should work fine. Remember that if they are in a different
806 # directory, you will need to give your Apache user/group read access
807 # to them!
808 $stylesheet = "/gitweb.css";
809 $logo = "/git-logo.png";
810 $favicon = "/git-favicon.png";
811
812 # Site name
813 $site_name = "The Dolmen Project's GIT Repositories";
814
815 # URL formatting. You can use this to make pretty URLs if you like. I
816 # am doing this using Apache rewrite rules, and so am not using these
817 # settings.
818 #$my_uri = "http://gitweb.dolmen-project.org/";
819 #$home_link = $my_uri;
820
821 # Base URL for repositories. This is used to prefix each of the GIT
822 # repositories on the webpages. So in my case, if you were viewing a
823 # GIT repository/tree called 'foo.git', the webpage would tell you
824 # that the tree was located at:
825 # 'ssh://gitosis@devel.dolmen-project.org:1234/foo.git'. Note that
826 # escaping the '@' character is necessary to render the URL properly.
827 @git_base_url_list = ("git://devel.dolmen-project.org");
828
829 # Length of the project description column in the webpage.
830 $projects_list_description_width = 70;
831
832 # Only export repositories we are allowing to be publicly cloned.
833 # What this setting actually says is that if the given file _exists_
834 # in the GIT repository, then the repository/tree can be exported to
835 # the web. So, for example, the file:
836 # /srv/git/repositories/configs.git/git-daemon-export-ok file exists,
837 # so configs.git will be exported via Gitweb. This file can be created
838 # with a simple '$ touch git-daemon-export-ok'. I am using this
839 # filename as it doubles for the same use with the GIT export daemon
840 # which we set via gitosis.conf. If this setting does not exist, then
841 # all trees will be exported by default. Note that there ARE other
842 # methods for controlling which repositories get exported. This is
843 # just the one I prefer.
844 $export_ok = "git-daemon-export-ok";
845
846 # Enable PATH_INFO so the server can produce URLs of the form:
847 # http://devel.dolmen-project.org/project.git/xxx/xxx This allows for
848 # pretty URLs *within* the GIT repository, where my Apache rewrite
849 # rules are not active.
850 $feature{'pathinfo'}{'default'} = [1];
851
852 # Enable blame, pickaxe search, snapshot, search, and grep support,
853 # but still allow individual projects to turn them off. These are
854 # features that users can use to interact with your GIT repositories.
855 # They consume some CPU whenever a user uses them, so you can turn
856 # them off if you need to. Note that the 'override' option means that
857 # you can override the setting on a per-repository basis.
858 $feature{'blame'}{'default'} = [1];
859 $feature{'blame'}{'override'} = [1];
860
861 $feature{'pickaxe'}{'default'} = [1];
862 $feature{'pickaxe'}{'override'} = [1];
863
864 $feature{'search'}{'default'} = [1];
865
866 $feature{'grep'}{'default'} = [1];
867 $feature{'grep'}{'override'} = [1];
868
869 $feature{'snapshot'}{'default'} = ['zip', 'tgz'];
870 $feature{'snapshot'}{'override'} = [1];
Next we take a look at the config file for GITweb in lines 783 to 870. In addition to the very verbose comments, there are a few important things to say about it:
If git-daemon-export-ok exists within a repository (see line 844), then
not only can anonymous users clone/pull/fetch the repository but
they can also browse it on http://gitweb.dolmen-project.org.
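To see at a glance which repositories are public, we can look for the marker file ourselves. The sketch below only inspects the directories directly beneath the project root, which matches the layout shown here but is a simplification of how GITweb searches; exported_repositories is a name made up for this example.

```python
from pathlib import Path


def exported_repositories(projectroot, export_ok="git-daemon-export-ok"):
    """Return names of repositories under projectroot that contain the export marker."""
    root = Path(projectroot)
    return sorted(repo.name for repo in root.iterdir()
                  if repo.is_dir() and (repo / export_ok).is_file())
```

exported_repositories("/srv/gitosis/repositories") should list the repositories that are visible to anonymous users through both GIT-daemon and GITweb.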
871 rh0-ve3:/usr/lib/cgi-bin#
872
873
874 [ here we use nano to edit home.html, footer.html, gitweb.css ... ]
875
876
877 rh0-ve3:/usr/lib/cgi-bin# grep -A33 '### Gitweb' /etc/apache2/sites-available/default
878 ### Gitweb
879
880 <VirtualHost *:80>
881 ServerName gitweb.dolmen-project.org
882 DocumentRoot "/usr/lib/cgi-bin"
883 DirectoryIndex gitweb.cgi
884 SetEnv GITWEB_CONFIG /etc/gitweb.conf
885
886 <Directory "/usr/lib/cgi-bin">
887 Options FollowSymlinks ExecCGI
888 Allow from all
889 AllowOverride all
890 Order allow,deny
891
892 <Files gitweb.cgi>
893 SetHandler cgi-script
894 </Files>
895
896 RewriteEngine on
897 RewriteCond %{REQUEST_FILENAME} !-f
898 RewriteCond %{REQUEST_FILENAME} !-d
899 RewriteRule ^.* /gitweb.cgi/$0 [L,PT]
900 </Directory>
901
902 <Directory "/srv/gitosis/repositories">
903 Allow from all
904 </Directory>
905
906 # I only used those debug rewrite rules
907 #RewriteLog /var/log/httpd/rewrite_log
908 #RewriteLogLevel 9
909
910 #ErrorLog /var/log/httpd/gitweb
911 </VirtualHost>
Next we alter a few files if we want or need to, e.g. we change the CSS (Cascading Style Sheets) a bit and provide a footer and maybe a main page text. I did so, as can be seen from the screenshots.
We are done except for one last thing: we need to configure yet another Apache virtual host to make it work. I am not going into details about lines 880 to 911 since there is a lot of information about Apache on the Internet already. In addition, I recommend making some changes to elevate security a little; that step is entirely optional, however.
Apache's rewrite functionality is not part of its core, which is why we enable the rewrite module in line 912 and then gracefully restart Apache in line 915.
912 rh0-ve3:/usr/lib/cgi-bin# a2enmod rewrite
913 Enabling module rewrite.
914 Run '/etc/init.d/apache2 restart' to activate new configuration!
915 rh0-ve3:/usr/lib/cgi-bin# apache2ctl graceful
916 apache2: Could not reliably determine the server's fully qualified domain name, using xx.xxx.xxx.xxx for ServerName
917
918
919 [ here we use nano to edit /etc/passwd ... ]
920
921
922 rh0-ve3:/usr/lib/cgi-bin# grep gitosis /etc/passwd
923 gitosis:x:105:108:Dolmen Project,,,:/srv/gitosis:/bin/sh
924 rh0-ve3:/usr/lib/cgi-bin# cd /srv/gitosis/repositories/
925 rh0-ve3:/srv/gitosis/repositories# la
926 total 168
927 drwxr-xr-x 42 gitosis gitosis 4096 2009-08-05 09:38 .
928 drwxr-xr-x 5 gitosis gitosis 4096 2009-07-12 13:52 ..
929 drwxr-xr-x 7 gitosis gitosis 4096 2009-07-31 17:27 dolmen.app.authentication.git
930 drwxr-xr-x 7 gitosis gitosis 4096 2009-07-31 17:27 dolmen.app.content.git
931
932
933 [skipping a lot of lines ...]
934
935
936 drwxr-xr-x 7 gitosis gitosis 4096 2009-08-07 07:16 misc.git
937 drwxr-xr-x 7 gitosis gitosis 4096 2009-08-07 07:16 snappy.git
938 drwxr-xr-x 7 gitosis gitosis 4096 2009-07-30 15:46 snappy.site.git
939 drwxr-xr-x 7 gitosis gitosis 4096 2009-07-31 17:29 snappy.transform.git
940 drwxr-xr-x 7 gitosis gitosis 4096 2009-07-30 15:55 snappy.video.player.git
941 drwxr-xr-x 7 gitosis gitosis 4096 2009-07-30 15:55 snappy.video.transforms.git
942 rh0-ve3:/srv/gitosis/repositories# cat misc.git/description
943 Repository for miscellaneous stuff with regards to the Dolmen Project.
944 rh0-ve3:/srv/gitosis/repositories# exit
945 exit
946 sa@rh0-ve3:~$ exit
947 logout
948 Connection to devel.dolmen-project.org closed.
949 sa@wks:~$ grep -A2 '\[repo misc\]' 0/gitosis_projects/dolmen/gitosis-admin/gitosis.conf
950 [repo misc]
951 daemon = yes
952 description = Repository for miscellaneous stuff with regards to the Dolmen Project.
953 sa@wks:~$
We are done except for two minor things: we want to give the
repositories a one-line description and provide owner information when
they are shown on GITweb. One easy way to set the owner is shown in
line 923: GITweb falls back to the full-name (GECOS) field of the user
who owns the repository directory, Dolmen Project in our case. This way
we can set a default owner and, if needed, overwrite that default with
settings in ../foo.git/config as shown below. Note that this has to be
done within the repository (probably a bare repository) on the server,
not in one's local clone.
[gitweb]
owner = "Suno Ano"
Also on the server, each repository has a ../foo.git/description file
as can be seen in lines 942 and 943. We can either leave it as is or
provide a one-line description i.e. edit this file on the server,
rh0-ve3 in our case.
However, since we are using GITosis, there is a smarter way to do it
as is shown in lines 950 to 952 — note that we logged out of rh0-ve3
and thus we are back on wks again. This way, anybody with
administrator permissions to our GITosis platform can easily set new
descriptions by editing gitosis.conf without the need to even log into
the server using SSH (Secure Shell) for example.
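Because gitosis.conf uses an INI-like syntax, the per-repository settings are easy to inspect with a short script. The following sketch uses Python's configparser on a sample that mirrors lines 950 to 952; repo_settings and SAMPLE are names invented for this illustration, and this is not GITosis's own code.

```python
import configparser

SAMPLE = """\
[gitosis]

[repo misc]
daemon = yes
description = Repository for miscellaneous stuff with regards to the Dolmen Project.
"""


def repo_settings(conf_text):
    """Map repository name -> (daemon enabled, description) from gitosis.conf text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    settings = {}
    for section in parser.sections():
        if section.startswith("repo "):
            name = section[len("repo "):].strip()
            daemon = parser.getboolean(section, "daemon", fallback=False)
            description = parser.get(section, "description", fallback="")
            settings[name] = (daemon, description)
    return settings
```

Feeding the contents of our real gitosis.conf to repo_settings shows quickly which repositories still lack a description.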
We are done, the final version looks like this:
Note how all repositories have the default owner set but just a few so far have their individual description ...
One's local repository can be used by others to pull changes from (git
pull), but normally one would have a private repository and a public
repository. The public repository is where everybody pulls from, and
the owner does the opposite: he pushes his changes from his private
repository to his public repository using git push.
Pushing synchronizes the local branch(es) with the corresponding remote branch(es). Note that this generally works only over SSH (Secure Shell) or over HTTP with a special web server setup. It is highly recommended to set up SSH to use keys (also known as PKA (Public Key Authentication)) and the SSH-agent mechanism so that there is no need to type in a password all the time.
GIT can work with the same workflow as Subversion, with a group of
developers using a single repository (a bare repository) for
exchanging their work. The only change is that their changes are not
submitted automatically; they have to use git push. The developers
must have either an entry in htaccess (for HTTP DAV) or a user account
for SSH. It is possible for the server admin to restrict their shell
account to GIT pushing and fetching by using the git-shell login
shell.
It is also possible to exchange patches using email. GIT has very good
support for patches incoming by mail. We can apply them by feeding
mailboxes with patch emails to git am. The person who wants to send
patches can use git format-patch and possibly git send-email to do so.
In order to maintain a set of patches it is best to use the StGIT
tool.
WRITEME
Aside from using popular sites such as GIThub or setting up our own public repository with GITosis, we can share changes using email. In order to do so, the sender needs to create appropriate patches from his changes and the receiver needs to process those changes which were sent to him via email.
In fact, sharing changes via email seems to be the most common way changes are shared among folks as of now (February 2009). The reason is probably that it is one of the least cumbersome setups one can have, second only to using sites like GIThub, which in my opinion is going to become the most common method for sharing changes in the future.
Most folks are just minor contributors and therefore they use git
clone to clone some public repository and then git pull, or git fetch
followed by git merge, to update their local repository on a regular
basis. This is very easy and straightforward to use, even for
non-experts. All they have to do next is to make changes/improvements
to the code and then, of course, get those changes back upstream.
Mostly, whenever someone makes changes, he creates a topic branch, makes changes, tests, merges back into master and then deletes the topic branch after he is done. Now he needs to share his changes with the upstream repository. Probably the easiest way to do so is to send the changes to someone who is considered a major contributor to the project. Another thing many folks do is send their patches directly to a project's mailing list, which is good for reasons of scrutiny etc.
We are now going to take a look at how to create patches that can be used to share changes via email and also, we are going to take a look at how to process such emails assuming we are on the receiver's end of the pipe.
First of all we shall pay attention to some things considered best practices when it comes to creating patches:
Use git format-patch -M to create the patch.
Put any additional notes between the --- line and the diffstat information.
If we use git send-email, please test it first by sending email to yourself.
WRITEME
WRITEME
GitHub is a web-based hosting service for projects that use GIT as their SCM (Software Configuration Management) system. There are others too like for example GITorious, http://repo.or.cz/ etc. (see here for more information).
I have chosen to host/manage the source code for this website/platform on GIThub simply because I figured that as of now (February 2009) it has the best tool set with regards to social interaction for folks so they can contribute.
What still sucks though is the lack of a decent ticketing system but then, we will see what the situation looks like in a year from now; I am pretty sure the folks at GIThub are very skilled and hardworking geeks ;-] Another thing that I would like to see is the whole source code for GIThub to be released under some FLOSS (Free/Libre Open Source Software) license.
The fact that some project uses a web-based source code hosting system like for example GIThub also enables non-geeks and/or folks with just little time, to contribute to the project — they might for example fix typos using the web interface i.e. there is no need to be a GIT expert, Debian developer or maybe some long-time GNU Emacs user or some other kind of geek of that magnitude.
Before that can be done, we need to create an account on GIThub. The information on how to do that on GIThub is fool-proof so I am not going to repeat anything here.
Once we have an account on GIThub, we need to put the public key of an SSH (Secure Shell) key pair into the account on GIThub.
1 sa@wks:~/.ssh$ ssh-keygen -b 8192 -t rsa
2 Generating public/private rsa key pair.
3 Enter file in which to save the key (/home/sa/.ssh/id_rsa): github_id_rsa
4 Enter passphrase (empty for no passphrase):
5 Enter same passphrase again:
6 Your identification has been saved in github_id_rsa.
7 Your public key has been saved in github_id_rsa.pub.
8 The key fingerprint is:
9 44:42:af:ea:d9:bf:b7:99:4b:24:ad:1a:ad:00:80:70 sa@wks
10 The key's randomart image is:
11 +--[ RSA 8192]----+
12 | . |
13 | . . |
14 | . E . + |
15 | o . + . . |
16 | . o S . o |
17 | . . . + |
18 | . o o o |
19 | . o . =.o . |
20 | o ..*o..o. |
21 +-----------------+
I opted to create a new key pair especially for use with GIThub (lines
1 to 21). What we can also see from line 1 is that I created a key pair
which has a higher number of bits (8192) than the default, which is
2048 bits.
The name chosen in line 3 is of course one that indicates its usage.
I have tens of key pairs for different purposes, so github_id_rsa makes
sense. The passphrase supplied in lines 4 and 5 has been created with
one of my aliases in my ~/.bashrc file
sa@wks:~$ type pwg
pwg is aliased to `pwgen -sncB 55 1'
sa@wks:~$ pwg
jwKJcgs7uvnijwp73v4uxbbojghiaeesepwT3gUovKjbhFmzdmgNP7c
sa@wks:~$

22 sa@wks:~/.ssh$ pi github
23 -rw------- 1 sa sa 6431 2009-02-25 15:43 github_id_rsa
24 -rw-r--r-- 1 sa sa 1412 2009-02-25 15:43 github_id_rsa.pub
25 sa@wks:~/.ssh$ cat github_id_rsa.pub
26 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAABAEAretHEeiycQbbEvoQqB9l+9UP4iHFDwDJgQ33b44pMY0lXauE
27 OiLHZM3oqmgqPDpzF2O4qFJil1L+b9owEhkD51UIHe3kdoaTxdwxsm/1+dLl06yL3ZdmDbkRt3Vc9bFla0Sm
28
29
30 [skipping a lot of lines ...]
31
32
33 QNIL0n0WCC6llFA+8H+4xsA0/fHd24UoXR9E7Mjy6XxGF49nJVZYy6kj8g6RywwnNNP4sHcanVRh+Lz3s09D
34 WiSE0lTR87qbVNwG/zEhwWAU8hIsGnZZxBZyg8sDabPjIHm4Cb5Pzt6XCQ== sa@wks
In our current case github_id_rsa.pub (line 24) is the public key and
github_id_rsa is the private key from the just created key pair. The
public key is put onto GIThub and the private key kept locally to
identify ourselves against GIThub for certain operations.
We get our public key onto GIThub by copying and pasting the output from lines 26 to 34 into the specified field on the account page (screenshot below). The private key, however, must never be shown to anyone and must be kept secure!
35 sa@wks:~/.ssh$ cd ../0/0/
36 sa@wks:~/0/0$ la
37 total 12
38 drwxr-xr-x 7 sa sa 71 2009-02-25 19:00 .
39 drwxr-xr-x 32 sa sa 4096 2009-02-25 23:01 ..
40 drwxr-xr-x 5 sa sa 54 2008-02-04 20:47 blog
41 drwxr-xr-x 5 sa sa 43 2008-03-12 16:28 misc
42 drwxr-xr-x 7 sa sa 88 2008-06-02 09:52 pim
43 -rw-r--r-- 1 sa sa 1844 2009-02-25 19:00 README
44 drwxr-xr-x 8 sa sa 111 2008-08-29 21:35 ws
45 sa@wks:~/0/0$ git init && git add . && git cwh -m 'inital commit'
46 Initialized empty Git repository in /tmp/0/.git/
47 [master (root-commit)]: created b79875f: "inital commit"
48 1143 files changed, 101682 insertions(+), 0 deletions(-)
49 create mode 100644 README
50 create mode 100644 blog/local/weblog.business.muse
51 create mode 100644 blog/local/weblog.debian.muse
52
53
54 [skipping a lot of lines ...]
55
56
57 create mode 100644 ws/latex/latex2png-dm-crypt_luks__3904075528.png
58 create mode 100644 ws/latex/latex2png-dm-crypt_luks__3905517320.png
59 create mode 100644 ws/latex/latex2png-dm-crypt_luks__976832061.png
60 create mode 100644 ws/latex/latex2png-misc__2526884390.png
61 create mode 100644 ws/latex/latex2png-planner__2617796.png
After we have uploaded the public key (github_id_rsa.pub), we need to
initialize the GIT repository, add all files (recursively) and create
the initial commit which we do with line 45.
62 sa@wks:~/0/0$ git st
63 # On branch master
64 nothing to commit (working directory clean)
65 sa@wks:~/0/0$ gllol
66 b79875f3b2267915179313184ac84436984ad33d 14 seconds ago CN: Suno Ano AN: Suno Ano S: inital commit
67 sa@wks:~/0/0$ git remote add origin git@github.com:sunoano/0.git
68 sa@wks:~/0/0$ ssh-add ~/.ssh/github_id_rsa
69 Enter passphrase for /home/sa/.ssh/github_id_rsa:
70 Identity added: /home/sa/.ssh/github_id_rsa (/home/sa/.ssh/github_id_rsa)
71 sa@wks:~/0/0$ git push origin master
72 Counting objects: 1054, done.
73 Compressing objects: 100% (1049/1049), done.
74 Writing objects: 100% (1054/1054), 110.71 MiB | 88 KiB/s, done.
75 Total 1054 (delta 203), reused 0 (delta 0)
76 To git@github.com:sunoano/0.git
77 * [new branch] master -> master
78 sa@wks:~/0/0$
Before we can push our local repository onto GIThub, we need to add a remote in line 67. In effect, we make our just created local repository think it got cloned from a remote bare repository on GIThub.
Next we need to tell the SSH authentication agent about our new key
pair (line 68) since, with every git push now, GIThub checks that we
hold the private key matching the public key we uploaded before. Folks
who forget about line 68 get a Permission denied (publickey) error when
they try to push.
Note that the SSH-agent keeps its information in RAM (Random Access Memory), which is volatile, so this information does not survive a reboot or any other kind of power outage. Line 68 therefore needs to be issued again after each reboot.
The passphrase requested in line 69 is the one we supplied in lines 4 and 5. Finally, in line 71 we trigger the initial push, which might take a while. When this command finishes, which it did here, we have successfully uploaded a GIT repository to GIThub in order to start collaborating with others, as is intended for example with this website/platform.
Update: After restructuring my SSH setup, I am now using the following stanza within ~/.ssh/config
sa@wks:~$ grep -A9 -m1 ', github' .ssh/config
###_ , github
# description: just a dummy stanza to make git push work with
# github.com i.e. to pick the right keyfile
Host github.com
User git
Port 22
Hostname github.com
IdentityFile %d/.ssh/github_id_rsa
TCPKeepAlive yes
IdentitiesOnly yes
sa@wks:~$
However, if we were just using a standard SSH setup for
/etc/ssh/ssh_config and/or ~/.ssh/config respectively, then the
approach shown in lines 68 and 69 above i.e. letting the SSH-agent
sort out authentication for us would work perfectly fine.
Please go here for a practical example of how to do this — it is about how GitHub is used in order to contribute to this website/platform.
I consider the contents of this section nice to know but in no way mandatory for folks who would like to complete a full workflow circle with GIT.
We can use git archive in order to create a tar or zip archive from
any commit of a project that uses GIT as its SCM system.
1 sa@wks:~/0/openvz/vzpkg_test$ gllol | head -n2
2 617669671fadd24edb1f3176153dd5fdd7f86053 5 months ago CN: Robert Nelson AN: Robert Nelson S: Fix read_vz_conf return code so it doesn't cause "set -e" scripts to fail.
3 5615b8134d16020617ba5b30fcbf1cd2fa6360ca 5 months ago CN: Robert Nelson AN: Robert Nelson S: Fix return value from read_vzpkg_conf.
4 sa@wks:~/0/openvz/vzpkg_test$ git archive -l
5 tar
6 zip
7 sa@wks:~/0/openvz/vzpkg_test$ git archive --format=tar --prefix=openvz_vzpkg2/ HEAD | gzip > vzpkg2_`date +%F`.tar.gz
8 sa@wks:~/0/openvz/vzpkg_test$ git archive --format=tar --prefix=openvz_vzpkg2/ HEAD | bzip2 > vzpkg2_`date +%F`.tar.bz2
9 sa@wks:~/0/openvz/vzpkg_test$ pi vzpkg2
10 -rw-r--r-- 1 sa sa 45574 2009-02-27 10:41 vzpkg2_2009-02-27.tar.bz2
11 -rw-r--r-- 1 sa sa 47614 2009-02-27 10:39 vzpkg2_2009-02-27.tar.gz
12 sa@wks:~/0/openvz/vzpkg_test$ tar -tjf vzpkg2_2009-02-27.tar.bz2 | head -n4
13 openvz_vzpkg2/
14 openvz_vzpkg2/COPYING
15 openvz_vzpkg2/Makefile
16 openvz_vzpkg2/NEWS
17 sa@wks:~/0/openvz/vzpkg_test$
The above example creates a tarball release for an OpenVZ utility
called vzpkg2. As we can see in lines 5 and 6, as of now (February
2009) git archive is able to create tar as well as zip archives.
I opted to create tar archives which I further compressed using gzip
in line 7 and bzip2 in line 8. The result can be seen in lines 10 and
11 respectively. With line 12, we take a look inside the archive from
line 10 and can see that the --prefix option from line 8 worked fine
since each filename (or path for that matter) is preceded with
openvz_vzpkg2/.
The tarball is created using HEAD although, as we already know, HEAD
can be replaced by anything that names a commit.
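To illustrate that any commit name works in place of HEAD, here is a minimal sketch that archives a tagged commit. The repository, the file, the tag v0.1 and the committer identity are made up for this illustration and are not part of the vzpkg2 example above.

```shell
#!/bin/sh
# Sketch: create a tarball from a tag instead of HEAD.
# All names below (demo, README, v0.1) are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q

# Throwaway identity so the commit works on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

echo 'hello' > README
git add README
git commit -q -m 'initial commit'
git tag v0.1

# A tag, a branch name or a SHA-1 can be used where HEAD was used before.
git archive --format=tar --prefix=demo-0.1/ v0.1 | gzip > demo-0.1.tar.gz

# List the archive contents; every path carries the prefix.
listing=$(gzip -dc demo-0.1.tar.gz | tar -tf -)
echo "$listing"
```

Naming the release by its tag rather than by HEAD has the advantage that the tarball can be recreated later, even after the branch has moved on.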
Mostly, when releasing a new version of a software project, we may
want to simultaneously make a changelog to include in the release
announcement. Linus Torvalds, for example, makes new kernel releases
by tagging them, then running $ release-script 2.6.29 2.6.30-rc6
2.6.30-rc7 where release-script is a shell script that looks like:
#!/bin/sh
stable="$1"
last="$2"
new="$3"
echo "# git tag v$new"
echo "git archive --prefix=linux-$new/ v$new | gzip -9 > ../linux-$new.tar.gz"
echo "git diff v$stable v$new | gzip -9 > ../patch-$new.gz"
echo "git log --no-merges v$new ^v$last > ../ChangeLog-$new"
echo "git shortlog --no-merges v$new ^v$last > ../ShortLog"
echo "git diff --stat --summary -M v$last v$new > ../diffstat-$new"
and then he just cuts and pastes the output after verifying that it looks good.
Last but not least, we should of course digitally sign the just created tarball using GPG (GNU Privacy Guard) so that users of this tarball can verify its integrity and authenticity:
sa@wks:~/0/openvz/vzpkg_test$ gpg --detach-sign --armor vzpkg2_2009-02-27.tar.gz
You need a passphrase to unlock the secret key for
user: "Suno Ano (http://sunoano.name) <suno.ano@foo.bar>"
1024-bit DSA key, ID C0EC7E38, created 2009-02-06
sa@wks:~/0/openvz/vzpkg_test$ pi gz
-rw-r--r-- 1 sa sa 47614 2009-02-27 10:39 vzpkg2_2009-02-27.tar.gz
-rw-r--r-- 1 sa sa 197 2009-02-27 11:14 vzpkg2_2009-02-27.tar.gz.asc
sa@wks:~/0/openvz/vzpkg_test$ cat *.asc
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEABECAAYFAkmnvOIACgkQSOlKxsDsfjgd7wCfVwH6ZxhnALPuS7CsdZIy7ozv
RbcAmwatP59hppcEWMnn0Q7O7N8WUFQ+
=YG39
-----END PGP SIGNATURE-----
sa@wks:~/0/openvz/vzpkg_test$ gpg --verify vzpkg2_2009-02-27.tar.gz.asc vzpkg2_2009-02-27.tar.gz
gpg: Signature made Fri 27 Feb 2009 11:13:54 AM CET using DSA key ID C0EC7E38
gpg: Good signature from "Suno Ano (http://sunoano.name) <suno.ano@foo.bar>"
sa@wks:~/0/openvz/vzpkg_test$
Let us assume somebody hands us a copy of a file (received_file), and
asks which commits modified a file such that it contained the given
content either before or after the commit. The way we can find out is
1 sa@wks:/tmp$ mkdir demo; cd demo
2 sa@wks:/tmp/demo$ la
3 total 4
4 drwxr-xr-x 2 sa sa 6 2009-02-27 12:59 .
5 drwxrwxrwt 14 root root 4096 2009-02-27 13:00 ..
6 sa@wks:/tmp/demo$ echo 'some blabla' > our_file
7 sa@wks:/tmp/demo$ ll
8 total 4.0K
9 -rw-r--r-- 1 sa sa 12 2009-02-27 13:00 our_file
10 sa@wks:/tmp/demo$ git init; git add .; git cwh -m 'initial commit'
11 Initialized empty Git repository in /tmp/demo/.git/
12 [master (root-commit)]: created 8de1f5c: "initial commit"
13 1 files changed, 1 insertions(+), 0 deletions(-)
14 create mode 100644 our_file
15 sa@wks:/tmp/demo$ echo 'more text' >> our_file
16 sa@wks:/tmp/demo$ git cwh -m 'added more content to file our_file'
17 [master]: created 45cb121: "added more content to file our_file"
18 1 files changed, 1 insertions(+), 0 deletions(-)
19 sa@wks:/tmp/demo$ gllol
20 45cb1212da5d48b062622785841fa4ea489c6b6c 35 minutes ago CN: Suno Ano AN: Suno Ano S: added more content to file our_file
21 8de1f5cc61482c965ded2d9abfaf8f0648d9cac9 36 minutes ago CN: Suno Ano AN: Suno Ano S: initial commit
22 sa@wks:/tmp/demo$ cp our_file received_file
23 sa@wks:/tmp/demo$ git log --raw -r --abbrev=40 --pretty=oneline our_file | grep -B1 $(git hash-object received_file)
24 45cb1212da5d48b062622785841fa4ea489c6b6c added more content to file our_file
25 :100644 100644 99e51806133707c0c518ed4ad2586b799196ab5a 538e6b4f74c64a7af7eaae4289f77d88405441b8 M our_file
26 sa@wks:/tmp/demo$
The answer is in line 24: it shows that received_file, the file
someone gave us, represents the state of our_file right after we
issued line 16 (see line 20). The commit ID we were looking for is
45cb1212da5d48b062622785841fa4ea489c6b6c.
Figuring out why and how this works is left as an exercise to the
reader — the person who understands line 25 and its meaning will
understand the whole thing we just did. The man pages for git log, git
diff-tree, and git hash-object may prove helpful.
Even if it might look quite similar, recovering lost changes is not to be confused with fixing mistakes.
Say we modify a branch with git reset --hard <some_commit_id> (like we
did above), and then realize that the branch was the only reference we
had to that point in history. Why is this a problem?
git reset --hard <some_commit_id> resets
not just the pointer which points to the current tip of the currently
active branch (HEAD) to point to some new commit object, but it also
resets the index and the working tree, thus practically deleting any
history from <some_commit_id> onwards. As I mentioned before already,
a git reset --hard cannot be undone! See ... there it is ... problem!
Fortunately, GIT also keeps a log, called a reflog, of all the
previous values of each branch. So in this case we can still find the
old history using, for example, git log master@{1}.
This lists the commits reachable from the previous value of the master branch head.
This syntax can be used with any GIT command that accepts a commit ID,
not just with git log. Some other examples are:
git show master@{2}              # see where the branch pointed 2,
git show master@{3}              # 3, ... changes ago
git show master@{one.week.ago}   # where master used to point to one week ago
gitk master@{yesterday}          # see where it pointed yesterday,
gitk master@{"1 week ago"}       # ... or last week
git log --walk-reflogs master    # show reflog entries for master
A separate reflog is kept for HEAD itself, so git show HEAD@{"1 week
ago"} will show where HEAD pointed to one week ago, not what the
current branch pointed to one week ago. This allows us to see the
history of what we have checked out.
By default, reflog entries are kept for 90 days (30 days for entries
that are no longer reachable from the current tip), after which they
may be pruned. man 1 git reflog and man 1 git gc have more details about
how to control pruning. There is more information to be found within
the Specifying Revisions section of man 1 git rev-parse.
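The following is a minimal sketch of the recovery path described above: a commit is dropped with git reset --hard and then found again through the branch reflog. The repository, file name and committer identity are assumptions made for this illustration only.

```shell
#!/bin/sh
# Sketch: recover a commit dropped by "git reset --hard" via the reflog.
# The repository and its contents are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

echo 'one' > file
git add file
git commit -q -m 'first'
echo 'two' >> file
git commit -q -a -m 'second'

lost=$(git rev-parse HEAD)              # the commit we are about to drop
branch=$(git symbolic-ref --short HEAD) # master, main, ... whatever init chose

git reset -q --hard HEAD~1              # 'second' is gone from the branch

# The reflog still knows the previous value of the branch tip.
git --no-pager log --pretty=oneline "$branch@{1}"

# Point the branch back to where it was before the reset.
git reset -q --hard "$branch@{1}"
restored=$(git rev-parse HEAD)
```

Note that this only brings back committed history. Changes that were never committed or stashed before the git reset --hard are not recorded in any reflog.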
I decided to configure the values for how long a reflog entry is kept
before it gets pruned by e.g. git gc
sa@wks:~/0/0$ git config --global gc.reflogexpire 365
sa@wks:~/0/0$ git config --global gc.reflogexpireunreachable 180
sa@wks:~/0/0$ git config --get gc.reflogexpireunreachable
180
sa@wks:~/0/0$ git config --get gc.reflogexpire
365
sa@wks:~/0/0$
Last but not least, an important fact about the reflog history: it is very different from normal GIT history. While normal history is shared by every repository that works on the same project, the reflog history is not shared, i.e. it tells us only about how the branches in our local repository have changed over time.
In some situations the reflog may not be able to save us. For example, suppose we delete a branch (the reflog is also deleted when deleting the branch), then we realize that we need the history it contained.
If we have not yet pruned the repository by running git gc or git
prune directly, then there may still be a chance to find the lost
commits in the dangling objects that git fsck reports
git fsck
dangling commit 7281251ddd2a61e38657c827739c57015671a6b3
dangling commit 2706a059f258c6b245f298dc4ff2ccd30ec21a63
dangling commit 13472b7c4b80851a1bc551779171dcb03655e9b5
[skipping a lot of lines ...]
We can examine one of those dangling commits with, for example, gitk
7281251ddd --not --all which does what it sounds like i.e. it says
that we want to see the commit history that is described by the
dangling commit(s), but not the history that is described by all our
existing branches and tags. Thus we get exactly the history reachable
from that commit that is lost.
Notice that it might not be just one commit. GIT only reports the tip of the line as being dangling, but there might be a whole deep and complex commit history that was dropped.
If we decide we want the history back, we can always create a new
reference pointing to it, for example, a new branch git branch
recovered-branch 7281251ddd.
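As a minimal sketch of this recovery, the script below deletes a branch and brings its history back from the objects git fsck reports. The branch name topic, the file and the identity are hypothetical. One assumption worth stating: because the deleted branch had been checked out, the reflog of HEAD still references its tip, so the sketch passes --no-reflogs to make git fsck ignore reflog entries when deciding what is dangling.

```shell
#!/bin/sh
# Sketch: recover the tip of a deleted branch with git fsck.
# Branch and file names are made up for this illustration.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

echo 'base' > file
git add file
git commit -q -m 'base'
main=$(git symbolic-ref --short HEAD)

git checkout -q -b topic
echo 'work' >> file
git commit -q -a -m 'work on topic'
tip=$(git rev-parse HEAD)

git checkout -q "$main"
git branch -q -D topic          # branch and its reflog are gone

# Ask fsck for dangling commits, ignoring what only reflogs still reference.
found=$(git fsck --no-reflogs 2>/dev/null |
        awk '$1 == "dangling" && $2 == "commit" { print $3 }')
echo "dangling commit: $found"

# Give the lost history a name again.
git branch recovered-branch "$found"
```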
Other types of dangling objects (e.g. blobs and trees) are also possible, and dangling objects can arise in other situations.
This one I love! It is not just totally practical because it reflects how humans think and work, but it also allows me to obey best practices ...
We use git stash whenever we want to record the current state of the
working directory and the index, but want to go back to a clean
working directory in order to do some intermediate work that just
sprung into our face.
git stash save will save our changes away to the stash, and reset our
working tree and the index to match the tip of our current branch.
Then we can make our fixes or complete some intermediate work as
usual. After that, we can go back to what we were working on before
with git stash apply.
For example, while we are in the middle of working on something complicated, we might find an unrelated but obvious and trivial bug or something that can be seen as a recursion of what we are currently working on i.e. some intermediate step which is a logical unit for itself and deserves a separate commit.
Usually, a human's workflow is that we want to go from A to B but then
figure that there is a C that needs to be done before B can be
finished. So we use git stash to stash away the partial work already
done for B, complete C (by starting out with a clean slate) and then,
after finishing C, we finish B.
Below is an example where we use git stash to save the current state
of our work. After fixing a trivial typo, we unstash the work in
progress from before and continue with it as usual.
It is just for the sake of brevity that I do not provide a demo on completing some intermediate step (optionally after doing so on a different branch and then coming back) which, as we might have figured, is a logical unit of its own.
1 sa@wks:/tmp$ mkdir demo; cd demo; touch afile; git init; git cwi; git cwh -m 'initial commit'
2 Initialized empty Git repository in /tmp/demo/.git/
3 [master (root-commit)]: created 10decd6: "initial commit"
4 0 files changed, 0 insertions(+), 0 deletions(-)
5 create mode 100644 afile
6 sa@wks:/tmp/demo$ echo 'some teeeeeext' > afile; git cwh -m 'added some text'
7 [master]: created aff733c: "added some text"
8 1 files changed, 1 insertions(+), 0 deletions(-)
9 sa@wks:/tmp/demo$ gllol
10 aff733c4edacc845a26c4546c3cf3275043244b5 2 seconds ago CN: Suno Ano AN: Suno Ano S: added some text
11 10decd6a751c109afc0ce81d54cb2420af7e728e 17 seconds ago CN: Suno Ano AN: Suno Ano S: initial commit
12 sa@wks:/tmp/demo$ head -n3 afile; git dwh | wc -l
13 some teeeeeext
14 0
The whole example is self-explanatory so I will just mention the most
important steps taken during this demo. As said, our intention is to
fix some trivia (typo in line 13) using git stash. After committing in
line 6, the working tree is clean at this point as we can see in
line 14.
15 sa@wks:/tmp/demo$ for ((i=0; i < 500; i+=1)); do echo $i; done >> afile; head -n3 afile; git dwh | wc -l
16 some teeeeeext
17 0
18 1
19 506
20 sa@wks:/tmp/demo$ git stash list
21 sa@wks:/tmp/demo$ git stash save 'fixing some trivia'
22 Saved working directory and index state "On master: fixing some trivia"
23 HEAD is now at aff733c added some text
24 sa@wks:/tmp/demo$ head -n3 afile; git dwh | wc -l
25 some teeeeeext
26 0
27 sa@wks:/tmp/demo$ git branch -a
28 * master
29 sa@wks:/tmp/demo$ nano afile
30
31
32 [ here the default editor opened ...]
33
34
Line 15 simulates some work in progress before we figure out that we need to set it aside in order to address some trivia or intermediate work. While line 19 shows us that the working tree as well as the index are not clean because of the work we did, we can see (line 26) that line 21 does what it is intended to do: it sets the working tree and the index back to the state of the last commit. In order to fix the typo I have chosen to use nano, as can be seen in line 29.
35 sa@wks:/tmp/demo$ head -n3 afile; git dwh | wc -l
36 some text
37 7
38 sa@wks:/tmp/demo$ git cwh -m 'some intermediate step (a logical unit of itself)'
39 [master]: created 4e97eff: "some intermediate step (a logical unit of itself)"
40 1 files changed, 1 insertions(+), 1 deletions(-)
41 sa@wks:/tmp/demo$ gllol
42 4e97effd3505fdad3fc66781ea6be14ec19ef914 4 seconds ago CN: Suno Ano AN: Suno Ano S: some intermediate step (a logical unit of itself)
43 aff733c4edacc845a26c4546c3cf3275043244b5 5 minutes ago CN: Suno Ano AN: Suno Ano S: added some text
44 10decd6a751c109afc0ce81d54cb2420af7e728e 6 minutes ago CN: Suno Ano AN: Suno Ano S: initial commit
45 sa@wks:/tmp/demo$ git stash apply
46 Auto-merging afile
47 CONFLICT (content): Merge conflict in afile
48 sa@wks:/tmp/demo$ head -n8 afile
49 <<<<<<< Updated upstream:afile
50 some text
51 =======
52 some teeeeeext
53 0
54 1
55 2
56 3
After the trivia is fixed, we commit this logical unit in line 38. Note, only for this demo do we make a separate commit for a single typo. Usually we should always create one commit for one logical unit.
As can be seen above, after issuing line 45, it might happen that we run into a merge conflict which we simply resolve manually (line 57).
57 sa@wks:/tmp/demo$ nano afile
58
59
60 [ here the default editor opened ...]
61
62
63 sa@wks:/tmp/demo$ head -n5 afile
64 some text
65 0
66 1
67 2
68 3
69 sa@wks:/tmp/demo$ git stash list
70 stash@{0}: On master: fixing some trivia
71 sa@wks:/tmp/demo$ git cwh -m 'finished yet another logial unit'
72 [master]: created 821616a: "finished yet another logial unit"
73 1 files changed, 501 insertions(+), 0 deletions(-)
74 sa@wks:/tmp/demo$ gllol
75 821616a41e03562159427896669b318731608154 2 seconds ago CN: Suno Ano AN: Suno Ano S: finished yet another logial unit
76 4e97effd3505fdad3fc66781ea6be14ec19ef914 2 minutes ago CN: Suno Ano AN: Suno Ano S: some intermediate step (a logical unit of itself)
77 aff733c4edacc845a26c4546c3cf3275043244b5 7 minutes ago CN: Suno Ano AN: Suno Ano S: added some text
78 10decd6a751c109afc0ce81d54cb2420af7e728e 7 minutes ago CN: Suno Ano AN: Suno Ano S: initial commit
79 sa@wks:/tmp/demo$ git dwh
80 sa@wks:/tmp/demo$
Aside from seeing the final result of our actions in lines 64 to 68, what is interesting are lines 75 to 78: they show a succession of commits, each representing one logical unit. We are not forced to squeeze the logical units from lines 75 and 76 into a single commit, be it because we do not know better or because the SCM system we use is incapable of keeping them apart.
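The stash workflow can also be condensed into a script. The sketch below is a variant under simplifying assumptions: the intermediate fix touches a different file than the work in progress, so no merge conflict arises, and it uses git stash pop, which applies the stash and removes it from the stash list in one step. File names and identity are made up; on older GIT releases git stash push is spelled git stash save.

```shell
#!/bin/sh
# Sketch: set work aside, commit an unrelated fix, resume the work.
# File names and contents are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

echo 'some teeeeeext' > afile
echo 'notes' > bfile
git add afile bfile
git commit -q -m 'initial commit'

# Unfinished work (B) in bfile ...
echo 'work in progress' >> bfile

# ... is put aside; working tree and index match HEAD again.
git stash push -q -m 'unfinished work on bfile'
clean=$(git status --porcelain)

# The intermediate step (C): fix the typo and commit it on its own.
echo 'some text' > afile
git commit -q -a -m 'fixed typo in afile'

# Resume B: re-apply the stashed changes and drop the stash entry.
git stash pop -q
git --no-pager diff --stat
```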
In an earlier section we saw how to
fix a mistake by editing the history, which for example works by
replacing the most recent commit using git commit --amend. This will
replace the old commit by a new commit incorporating our changes, also
giving us a chance to edit the old commit message.
We can also use a combination of this and git rebase to edit commits
further back in our history. For example, first we tag the problematic
commit with git tag bad mywork~5 — go five commits back into the
past, take this commit and create the tag bad from it.
Then we check out that commit using git checkout, edit it using git
commit --amend, and rebase the rest of the series on top of it (note
that we could check out the commit on a temporary branch, but instead
we are using a detached head):
git checkout bad
[ make changes here and update the index ... ]
git commit --amend
git rebase --onto HEAD bad mywork
I think some explanation for git rebase --onto <newbase> <upstream>
<branch> might help understanding what is going on. In our current
case, what happens is:
1. switch to branch mywork (if not already on branch mywork)
2. save all commits that are not in bad but in mywork to a temporary
   area, i.e. what git log bad..mywork would show us
3. reset mywork to HEAD, which at this point in time points to the
   amended version of mywork~5, that is, the commit we fixed
4. reapply the saved commits to mywork, one by one, in order
Afterwards, the temporary tag is no longer needed and can be removed
with git tag -d bad.
When we are done, we will be left with branch mywork checked out, with
the top patches of mywork reapplied on top of our modified commit.
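To make the sequence concrete, here is a minimal sketch that replays it in a throwaway repository. The seven commits, their file names and the identity are assumptions for illustration. Each commit touches its own file, so the reapplied commits cannot conflict with the fix.

```shell
#!/bin/sh
# Sketch: fix a commit five steps back using tag + amend + rebase --onto.
# Repository contents are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

# Name the initial (still unborn) branch mywork.
git symbolic-ref HEAD refs/heads/mywork

for i in 1 2 3 4 5 6 7; do
    echo "content $i" > "file$i"
    git add "file$i"
    git commit -q -m "commit $i"
done

git tag bad mywork~5            # marks "commit 2"
git checkout -q bad             # detached HEAD at the problematic commit

echo 'fixed content 2' > file2  # make the change and update the index
git add file2
git commit -q --amend -m 'commit 2 (fixed)'

# Reapply everything in bad..mywork on top of the amended commit.
git rebase -q --onto HEAD bad mywork
git tag -d bad >/dev/null

git --no-pager log --pretty=oneline
```

After the script has run, mywork again holds seven commits, but every commit from the fixed one onwards has a new object name.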
Note that the immutable nature of GIT history means that we have not really modified existing commits. Instead, we have replaced the old commits with new commits having new object names.
The primary problem with rewriting the history of a branch has to do with merging. Suppose somebody fetches our branch and merges it into their branch, with a result something like this:
 o--o--O--o--o--o <-- origin
        \        \
         t--t--t--m <-- their branch
Then suppose we modify the last three commits:
         o--o--o <-- new head of origin
        /
 o--o--O--o--o--o <-- old head of origin
If we examine all this history together in one repository, it will look like:
         o--o--o <-- new head of origin
        /
 o--o--O--o--o--o <-- old head of origin
        \        \
         t--t--t--m <-- their branch
GIT has no way of knowing that the new head is an updated version of the old head. It treats this situation exactly the same as it would if two developers had independently done the work on the old and new heads in parallel. At this point, if someone attempts to merge the new head into their branch, GIT will attempt to merge together the two (old and new) lines of development, instead of trying to replace the old by the new. The results are likely to be unexpected.
We may still choose to publish branches whose history is rewritten, and it may be useful for others to be able to fetch those branches in order to examine or test them, but they should not attempt to pull such branches into their own work. As I said many times above already:
For true distributed development that supports proper merging, published branches should never be rewritten!
A look under the hood ...
We can examine the data represented in the object database (also known
as GIT back end) and the index with various helper tools. For every
object, we can use git cat-file to examine details about the object —
something we have already used in conjunction with git rev-parse
above.
1 sa@wks:/tmp/spear.clan$ git cat-file -t $(git rev-parse HEAD)
2 commit
3 sa@wks:/tmp/spear.clan$ git cat-file -s $(git rev-parse HEAD)
4 423
5 sa@wks:/tmp/spear.clan$ git cat-file commit $(git rev-parse HEAD)
6 tree 5dccb7bc01b01992f06185cb642f7f4f96b078b3
7 parent b1eba669f85e8d6b978217fcef2827d3f2c26eb2
8 author trollfot <trollfot@82af7df8-bc4b-4ebc-8022-2999806f7efb> 1235418218 +0000
9 committer trollfot <trollfot@82af7df8-bc4b-4ebc-8022-2999806f7efb> 1235418218 +0000
10
11 using the last spear.content way to declare portal_type
12
13
14 git-svn-id: http://tracker.trollfot.org/svn/projects/spear.clan@760 82af7df8-bc4b-4ebc-8022-2999806f7efb
Line 2 shows the type of the object, and once we have the type (which
is usually implicit in where we find the object), we can use line 5 to
show its contents. git cat-file -p $(git rev-parse HEAD) would have
also worked just fine though.
It is especially instructive to look at commit objects, since those
tend to be small and fairly self-explanatory. In particular, if we
follow the convention of having the top commit name in .git/HEAD, we
can do
15 sa@wks:/tmp/spear.clan$ git cat-file commit HEAD
16 tree 5dccb7bc01b01992f06185cb642f7f4f96b078b3
17 parent b1eba669f85e8d6b978217fcef2827d3f2c26eb2
18 author trollfot <trollfot@82af7df8-bc4b-4ebc-8022-2999806f7efb> 1235418218 +0000
19 committer trollfot <trollfot@82af7df8-bc4b-4ebc-8022-2999806f7efb> 1235418218 +0000
20
21 using the last spear.content way to declare portal_type
22
23
24 git-svn-id: http://tracker.trollfot.org/svn/projects/spear.clan@760 82af7df8-bc4b-4ebc-8022-2999806f7efb
25 sa@wks:/tmp/spear.clan$
to see what the top commit was. With this convention obeyed, lines 5 and 15 yield the same result, as can be seen.
Note: Trees have binary content, and as a result there is a special
helper for showing that content, called git ls-tree, which turns the
binary content into a more easily readable form.
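A minimal sketch of inspecting objects, including a tree via git ls-tree, could look as follows. The repository layout (README and docs/notes.txt) and the identity are assumptions made for this illustration.

```shell
#!/bin/sh
# Sketch: look at a commit object and its tree.
# The files in this throwaway repository are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

mkdir docs
echo 'hello' > README
echo 'notes' > docs/notes.txt
git add .
git commit -q -m 'initial commit'

# Object types: HEAD names a commit, HEAD^{tree} names its top-level tree.
commit_type=$(git cat-file -t HEAD)
tree_type=$(git cat-file -t 'HEAD^{tree}')
echo "HEAD is a $commit_type, HEAD^{tree} is a $tree_type"

# Pretty-print the commit object (tree, author, committer, message).
git cat-file -p HEAD

# Readable listing of the binary tree object: mode, type, SHA-1, name.
git ls-tree HEAD
entries=$(git ls-tree --name-only HEAD)
```

Subdirectories show up as entries of type tree, which can in turn be passed to git ls-tree.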
We have seen how GIT stores each object in a file named after the object's SHA1 hash. Unfortunately this system becomes inefficient once a project has a lot of objects. For example, the repository holding the source for this website/platform currently looks like this:
1 sa@wks:~/0/0$ git count-objects
2 1168 objects, 118688 kilobytes
The first number (1168) is the number of objects which are kept in
individual files. The second is the amount of space taken up by those
loose objects.
We can save space and make GIT faster by moving those loose objects
into a so-called pack file, which stores a group of objects in an
efficient compressed format — the details of how pack files are
formatted can be found in ../technical/pack-format.txt.
3 sa@wks:~/0/0$ git repack
4 Counting objects: 1163, done.
5 Compressing objects: 100% (1150/1150), done.
6 Writing objects: 100% (1163/1163), done.
7 Total 1163 (delta 264), reused 0 (delta 0)
8 sa@wks:~/0/0$ git count-objects
9 1168 objects, 118688 kilobytes
10 sa@wks:~/0/0$ git prune
11 sa@wks:~/0/0$ git count-objects
12 0 objects, 0 kilobytes
13 sa@wks:~/0/0$
The actual magic is with lines 3 to 7. Line 10 removes any of the
loose objects that are now contained in the pack. This will also
remove any unreferenced objects (which may be created whenever we use
git reset for example). We can verify that the loose objects are gone
by looking at the .git/objects directory or by running git
count-objects again as we did in line 11.
Although the object files are gone, any commands that refer to those objects will work exactly as they did before because of the pack index.
As mentioned before already, the git gc command performs packing,
pruning and more for us in one shot, so it is normally the only
high-level (porcelain) command we need.
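As a minimal sketch of that one-step approach, the script below compares the number of loose objects before and after git gc, using the verbose form git count-objects -v. The throwaway repository and identity are assumptions for this illustration; since it contains no unreachable objects, all loose objects are expected to end up in the pack.

```shell
#!/bin/sh
# Sketch: pack and prune in one go with git gc.
# The repository contents are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

for i in 1 2 3; do
    echo "revision $i" > file
    git add file
    git commit -q -m "revision $i"
done

# "count" is the number of loose objects, "in-pack" the number of packed ones.
before=$(git count-objects -v | awk '$1 == "count:" { print $2 }')

git gc -q

after=$(git count-objects -v | awk '$1 == "count:" { print $2 }')
in_pack=$(git count-objects -v | awk '$1 == "in-pack:" { print $2 }')
echo "loose before: $before, loose after: $after, packed: $in_pack"
```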
The git fsck command will sometimes complain about dangling objects.
They are not a problem as we will find out ... they can actually be
very useful in case we need to revive deleted stuff.
The most common cause of dangling objects is that we have rebased a branch, or we have pulled from somebody else who rebased a branch. In that case, the old head of the original branch still exists, as does everything it pointed to. The branch pointer itself just does not exist anymore since we replaced it with another one.
There are also other situations that cause dangling objects. For
example, a dangling blob may arise because we did a git add of a file,
but then, before we actually committed it and made it part of the
bigger picture, we changed something else in that file and committed
the updated file instead. The old state that we added originally ends
up not being pointed to by any commit or tree, so it is now a dangling
blob object.
Similarly, when the recursive merge strategy runs, and finds that there are criss-cross merges and thus more than one merge base (which is fairly unusual, but it does happen), it will generate one temporary midway tree (or possibly even more, if we had lots of criss-crossing merges and more than two merge bases) as a temporary internal merge base, and again, those are real objects, but the end result will not end up pointing to them, so they end up dangling in our repository.
Generally, dangling objects are not anything to worry about. They can even be very useful e.g. if we screw something up, the dangling objects can be how we recover our old tree (say, we did a rebase, and realized that we really did not want to — we can look at what dangling objects we have, and decide to reset our head to some old dangling state).
For commits, we can just use something like gitk
<dangling-commit-sha-goes-here> --not --all. This asks for all the
history reachable from the given commit but not from any branch, tag,
or other reference. If we decide it is something we want, we can
always create a new reference to it like this git branch
recovered-branch <dangling-commit-sha-goes-here>
For blobs and trees, we can not do the same, but we can still examine
them. We can just do git show <dangling-blob/tree-sha-goes-here> to
show what the contents of the blob were (or, for a tree, basically
what the ls for that directory was), and that may give us some idea of
what the operation was that left that dangling object floating around.
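The git add scenario mentioned earlier makes for a small, reproducible sketch of examining a dangling blob. File name, contents and identity below are made up for this illustration.

```shell
#!/bin/sh
# Sketch: produce a dangling blob with two "git add" calls, then inspect it.
# File name and contents are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

echo 'first draft' > file
git add file                # writes a blob for 'first draft'

echo 'second draft' > file
git add file                # the index now points to a different blob

git commit -q -m 'initial commit'

# The first blob is referenced by no tree or commit: it dangles.
blob=$(git fsck 2>/dev/null |
       awk '$1 == "dangling" && $2 == "blob" { print $3 }')
echo "dangling blob: $blob"

# git show prints the content that was staged but never committed.
content=$(git show "$blob")
echo "$content"
```

This also shows why dangling blobs can be a last resort for content that was staged once but never made it into a commit.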
Usually, dangling blobs and trees are not very interesting. They are
almost always the result of either being a half-way mergebase (the
blob will often even have the conflict markers from a merge in it, if
we have had conflicting merges that we fixed up by hand), or simply
because we interrupted a git fetch with ^C (Ctrl + c or in Emacs
speech, C-c) or something like that, leaving some of the new objects
in the object database, but just dangling and useless.
Anyway, once we are sure that we are not interested in any dangling state, we can just prune all unreachable objects and they will be gone.
But we should only run git prune on a quiescent repository — it is
kind of like doing a filesystem fsck recovery; we do not want to do
that while the filesystem is mounted.
The same is true of git fsck itself but since git fsck never actually
changes the repository, it just reports on what it found, git fsck
itself is never a dangerous thing to issue on some repository. Running
it while somebody is actually changing the repository can cause
confusing and scary messages, but it will not actually do anything
bad. In contrast, running git prune while somebody is actively
changing the repository is a bad idea.
Go here for information. Sample scripts can be found in
/usr/share/git-core/templates/hooks. Another place to look is directly
within the GIT source code
sa@wks:/tmp/git/contrib/hooks$ la
total 48
drwxr-xr-x 2 sa sa 102 2009-03-01 19:27 .
drwxr-xr-x 19 sa sa 4096 2009-03-01 19:27 ..
-rw-r--r-- 1 sa sa 19324 2009-03-01 19:27 post-receive-email
-rw-r--r-- 1 sa sa 1291 2009-03-01 19:27 pre-auto-gc-battery
-rw-r--r-- 1 sa sa 6920 2009-03-01 19:27 setgitperms.perl
-rw-r--r-- 1 sa sa 11647 2009-03-01 19:27 update-paranoid
sa@wks:/tmp/git/contrib/hooks$
One nice example is with this source code of my website/platform
itself. In order to get rid of trailing whitespace (which we know is
bad), I decided to activate the pre-commit script by deleting the
.sample suffix from it
sa@wks:~/0/0$ ll .git/hooks/ | grep pre-comm
-rwxr-xr-x 1 sa sa 519 2009-02-25 15:52 pre-commit
sa@wks:~/0/0$
Depending on what method is chosen to edit the source code (I use
GNU Emacs) there may or may not be a means of control in place in
order to check for trailing whitespace — in my case, in order to get
rid of it automatically, I use (add-hook 'before-save-hook
'delete-trailing-whitespace) in my .emacs.
However, this hook (pre-commit) checks for trailing whitespace no
matter what way the source was edited i.e. which editor had been used
or who did it.
1 sa@wks:~/0/0$ cat .git/hooks/pre-commit
2 #!/bin/sh
3 #
4 # An example hook script to verify what is about to be committed.
5 # Called by git-commit with no arguments. The hook should
6 # exit with non-zero status after issuing an appropriate message if
7 # it wants to stop the commit.
8 #
9 # To enable this hook, rename this file to "pre-commit".
10
11
12 ## added by Suno Ano
13 git add .    # plain call; with exec the check in line 25 would never run
14
15
16 ## default
17 if git-rev-parse --verify HEAD 2>/dev/null
18 then
19     against=HEAD
20 else
21     # Initial commit: diff against an empty tree object
22     against=4b825dc642cb6eb9a060e54bf8d69288fbee4904
23 fi
24
25 exec git diff-index --check --cached $against --
26 sa@wks:~/0/0$
If we take a closer look, we can also see that I added some additional code in line 13. This line ensures that, for example, new files, images, etc. which I created are not accidentally left out of version control with GIT.
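To see the whitespace check in action, here is a minimal sketch that installs a stripped-down pre-commit hook in a throwaway repository and then attempts a commit containing trailing whitespace. The hook below is a simplified assumption, not the script shown above: it only runs the git diff-index --check test against HEAD and does not add files automatically.

```shell
#!/bin/sh
# Sketch: a pre-commit hook that rejects trailing whitespace.
# Repository, file and hook contents are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

echo 'clean line' > file
git add file
git commit -q -m 'initial commit'

# Install the hook; it must be executable, otherwise GIT ignores it.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# Non-zero exit status aborts the commit.
exec git diff-index --check --cached HEAD --
EOF
chmod +x .git/hooks/pre-commit

# Stage a line that ends in a blank and try to commit it.
printf 'trailing blank \n' >> file
git add file
if git commit -q -m 'should be rejected' >/dev/null 2>&1; then
    result=accepted
else
    result=rejected
fi
echo "commit was $result"
```

If the check ever needs to be skipped on purpose, git commit --no-verify bypasses the pre-commit hook.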
This section collects anything GIT related that does not deserve a section of its own. The subsections here need not have anything to do with one another, except for the fact that GIT is the one thing they have in common.
The most obvious benefits of putting /etc under version control are to
clean up the mess somebody inexperienced created when doing some sort
of trial and error within /etc — those folks do stuff but then can
not remember what they did so reverting their changes becomes quite
impossible. That is not so if /etc is under version control.
Another obvious reason, resulting in pretty much the same actions as
above, i.e. looking at the changes (e.g. via git diff HEAD) and maybe
rolling back, is when for example aptitude full-upgrade or some other
akin tool did something bad.
A third reason why having /etc under version control is so great is a
multi-user environment: certainly, we want to be able to see who did
what and when. This, in combination with sudo, is quite powerful.
Sometimes a business case demands such a standard via contracts anyway.
There are many more reasons, but these three are the ones I have
already experienced myself. Once /etc is under version control using
GIT, pretty much only the stars remain out of reach ...
Come quickly, I am tasting stars!
— Dom Perignon, upon discovering champagne.
isisetup is one possibility to put one's /etc under version control. I
opted for etckeeper simply because I did not want to learn another
UI (User Interface) aside from GIT, and isisetup has its own UI, so ...
The etckeeper program is designed to let us put /etc under version
control. There are a few files involved in the process:
- /etc/.gitignore: stores ignore patterns as we already know; this
  file is specific to /etc and does not affect other repositories
  like for example ~/.gitignore does.
- /etc/.metadata: stores metadata about file owners and permissions.
- /etc/.etckeeper: stores information that can be used to recreate
  the empty directories and symlinks.
- /etc/etckeeper/etckeeper.conf: the configuration file for etckeeper
- /etc/.git: actual repository data for /etc; see repository layout
etckeeper is a collection of tools in order to put /etc under version
control in a GIT (the default), mercurial, bazaar or darcs repository.
It hooks into APT (Advanced Packaging Tool) to automatically commit
changes made to /etc during package upgrades.
It tracks file metadata that GIT does not normally support, but that
is important for /etc, such as the permissions of /etc/shadow. It is
quite modular and configurable, while also being simple to use if one
understands the basics of working with SCM (Software Configuration
Management) systems.
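Why such extra metadata is needed can be seen in a small experiment. The sketch below is not etckeeper's own code or file format; it only shows that GIT records a regular file as either 100644 or 100755, and how the real mode could be kept in a side file. The file name metadata.sh is hypothetical, and stat -c assumes GNU coreutils.

```shell
#!/bin/sh
# Sketch: git does not preserve restrictive permissions such as 600.
# Throwaway directory and file content are made up for the demo.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo Admin"          # hypothetical identity
git config user.email "admin@example.org"

printf 'secret\n' > shadow
chmod 600 shadow
git add shadow
git commit -q -m "Add shadow"

# The index only knows 100644 (or 100755 for executables).
recorded=$(git ls-files -s shadow | cut -d' ' -f1)
echo "mode recorded by git: $recorded"

# Keep the real mode in a side file as a replayable command
# (hypothetical format, GNU stat assumed).
printf "chmod %s '%s'\n" "$(stat -c '%a' shadow)" shadow > metadata.sh
cat metadata.sh
```

After a fresh checkout, running such a side file would restore the permissions that GIT itself dropped.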
etckeeper has special support to handle changes to /etc caused by
installing and upgrading packages. Before APT installs packages,
etckeeper pre-install will check that /etc contains no uncommitted
changes. After APT installs packages, etckeeper post-install will add
any new interesting files to the repository, and commit the changes.
We can also run etckeeper commit by hand to commit changes. In
addition to pre and post hooks, as well as the possibility to manually
trigger things, there is also a cron job, that will use etckeeper to
automatically commit any changes to /etc each day.
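The kind of check done before packages get installed can be approximated with git status. The following is a sketch under assumptions, not etckeeper's actual implementation: has_uncommitted is a hypothetical helper, and a throwaway directory stands in for /etc.

```shell
#!/bin/sh
# Sketch: detect uncommitted changes in a repository before touching it.
set -e

# Hypothetical helper: succeeds if the repository in $1 has
# modified, staged or untracked files.
has_uncommitted() {
    [ -n "$(git -C "$1" status --porcelain)" ]
}

demo=$(mktemp -d)                          # stand-in for /etc
git -C "$demo" init -q
git -C "$demo" config user.name "Demo Admin"       # hypothetical identity
git -C "$demo" config user.email "admin@example.org"

echo "127.0.0.1 localhost" > "$demo/hosts"
git -C "$demo" add hosts
git -C "$demo" commit -q -m "Initial state"

if has_uncommitted "$demo"; then before=dirty; else before=clean; fi

echo "192.0.2.7 wks" >> "$demo/hosts"      # an edit nobody committed

if has_uncommitted "$demo"; then after=dirty; else after=clean; fi
echo "before edit: $before, after edit: $after"
```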
1 sa@wks:/etc/etckeeper$ dpl etckeeper | grep ^ii
2 ii etckeeper 0.30 store /etc in git, mercurial, bzr or darcs
3 sa@wks:/etc/etckeeper$ type gr && gr HIGHLEVEL etckeeper.conf
4 gr is aliased to `grep -rni --color'
5 29:HIGHLEVEL_PACKAGE_MANAGER=apt
6 sa@wks:/etc/etckeeper$ cd ..
I have already installed etckeeper as can be seen in line 2. The dpl
command in line 1 and gr in line 3 are just aliases in my ~/.bashrc
(line 4 shows what gr expands to). Since I use aptitude I made a
change to /etc/etckeeper/etckeeper.conf as can be seen in line 5. If
this line already looks as shown above, then no action needs to be
taken. HIGHLEVEL_PACKAGE_MANAGER should be apt for all Debian systems
or any system using anything in the APT family for package
management. The variable mostly controls installation of APT config
files.
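For those without a gr alias, the setting can be read with standard tools. The sketch below works on a sample file so it runs anywhere; the sample content is only an illustrative stand-in for /etc/etckeeper/etckeeper.conf, not a copy of it.

```shell
#!/bin/sh
# Sketch: read HIGHLEVEL_PACKAGE_MANAGER from an etckeeper-style config.
set -e
conf=$(mktemp)                 # stand-in for /etc/etckeeper/etckeeper.conf
cat > "$conf" <<'EOF'
# illustrative sample content
VCS="git"
HIGHLEVEL_PACKAGE_MANAGER=apt
LOWLEVEL_PACKAGE_MANAGER=dpkg
EOF

# Print only the value of the variable we care about.
value=$(sed -n 's/^HIGHLEVEL_PACKAGE_MANAGER=//p' "$conf")
echo "HIGHLEVEL_PACKAGE_MANAGER is set to: $value"
```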
7 sa@wks:/etc$ su
8 Password:
9 wks:/etc# etckeeper init
10 Initialized empty Git repository in /etc/.git/
11 wks:/etc# git commit -a -m "Initial Commit"
12
13 [skipping a lot of lines ...]
14
15 create mode 100644 xpdf/xpdfrc-arabic
16 create mode 100644 xpdf/xpdfrc-cyrillic
17 create mode 100644 xpdf/xpdfrc-greek
18 create mode 100644 xpdf/xpdfrc-hebrew
19 create mode 100644 xpdf/xpdfrc-latin2
20 create mode 100644 xpdf/xpdfrc-thai
21 create mode 100644 xpdf/xpdfrc-turkish
22 create mode 100644 yaird/Default.cfg
23 create mode 100644 yaird/Templates.cfg
24 wks:/etc# git gc
25 Counting objects: 2931, done.
26 Compressing objects: 100% (2177/2177), done.
27 Writing objects: 100% (2931/2931), done.
28 Total 2931 (delta 267), reused 0 (delta 0)
29 wks:/etc# ls -lat | head
30 total 1588
31 drwx------ 8 root root 4096 2009-02-14 19:21 .git
32 -rwx------ 1 root root 6101 2009-02-14 19:21 .etckeeper
33 drwxr-xr-x 171 root root 12288 2009-02-14 19:20 .
34 -rw------- 1 root root 433 2009-02-14 19:20 .gitignore
35 drwxr-xr-x 10 root root 4096 2009-02-14 17:47 etckeeper
36 -rw-r--r-- 1 root root 23 2009-02-14 14:29 resolv.conf
37 -rw-r--r-- 1 root root 111633 2009-02-14 13:52 ld.so.cache
38 drwxr-xr-x 2 root root 4096 2009-02-14 13:52 cron.daily
39 drwxr-xr-x 2 root root 4096 2009-02-14 13:52 bash_completion.d
40 wks:/etc# exit
41 exit
42 sa@wks:/etc$ ll etckeeper/post-install.d/
43 total 12K
44 -rwxr-xr-x 1 root root 462 2008-12-17 00:14 50vcs-commit
45 -rwxr-xr-x 1 root root 22 2009-02-15 01:15 99git-gc
46 -rw-r--r-- 1 root root 141 2008-12-17 00:14 README
47 sa@wks:/etc$ cat etckeeper/post-install.d/99git-gc
48 #!/bin/sh
49 echo -e "\ngit repository housekeeping using git gc ..."
50 git gc
51 echo -e "git gc finished successfully ...\n"
52 sa@wks:/etc$
In line 9 I am initializing the GIT repository — using etckeeper init
instead of git init because the latter one would not take care of all
the metadata, creating ignore patterns, empty directories, etc.
Update: As of version 0.38, issuing etckeeper init is not necessary
anymore, as the changelog shows:
sa@wks:~$ zcat /usr/share/doc/etckeeper/changelog.gz | head -n7
etckeeper (0.38) unstable; urgency=low
* Use hostname if hostname -f fails. Closes: #533295
* Automatically commit on initial install, so users can
begin relying on etckeeper right away. Closes: #533290
-- Joey Hess <joeyh@debian.org> Wed, 08 Jul 2009 14:40:58 -0400
sa@wks:~$
We can then run git status to check that it includes all the right
files, and none of the wrong files. Based on our findings we would
then edit /etc/.gitignore. I did so in another terminal window but did
not include this above since it is specific to my setup. When I was
satisfied, I issued line 11 in order to make the initial commit of
/etc.
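The status-then-ignore loop looks roughly like this. It is a sketch in a throwaway directory; the file names and the ignore pattern are examples only, since which files to ignore depends on the individual setup.

```shell
#!/bin/sh
# Sketch: check what git would track, then exclude an unwanted file.
# Throwaway directory stands in for /etc; file names are examples only.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
touch fstab ld.so.cache

git status --short              # both files show up as untracked (??)

# Decide that the cache file should never be tracked.
echo 'ld.so.cache' >> .gitignore

git status --short              # ld.so.cache is gone from the listing

listed=$(git status --short | grep -c 'ld.so.cache' || true)
echo "lines still mentioning ld.so.cache: $listed"
```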
After that has finished we can run git gc in line 24 to do the
housekeeping for us. Actually, we want that to happen after every
apt/aptitude run. Therefore we create a file and put the appropriate
commands in it (lines 48 to 51). Like the other files there, it should
be owned by root and have the octal permissions 755, as can be seen in
line 45.
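Creating such a hook file with the right permissions can be sketched as below. A temporary directory stands in for /etc/etckeeper/post-install.d so that the example runs without root. Note that this variant uses printf instead of echo -e, a deliberate swap because not every /bin/sh understands echo -e; stat -c assumes GNU coreutils.

```shell
#!/bin/sh
# Sketch: create a post-install hook that runs git gc, mode 755.
set -e
hookdir=$(mktemp -d)    # stand-in for /etc/etckeeper/post-install.d

cat > "$hookdir/99git-gc" <<'EOF'
#!/bin/sh
printf '\ngit repository housekeeping using git gc ...\n'
git gc
printf 'git gc finished successfully ...\n\n'
EOF

chmod 755 "$hookdir/99git-gc"

mode=$(stat -c '%a' "$hookdir/99git-gc")   # GNU stat assumed
ls -l "$hookdir"
echo "permissions: $mode"
```

On the real system the file would additionally have to be owned by root, which is the case automatically when it is created as root.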
In lines 31 to 39 we can see things like /etc/.git, /etc/.etckeeper
and /etc/.gitignore that got created in the process.
We have now successfully installed and set up etckeeper. The
repository will track all changes made to /etc, whether via APT
(Advanced Packaging Tool), some daemon or manually. Detailed
information can be found with man 8 etckeeper.
No matter what SCM (Software Configuration Management) system I am
working with, I usually use Emacs as a frontend since it is a lot
faster than using the CLI (Command Line Interface) and much speedier
than using some nonsense GUI (Graphical User Interface). Besides
saving me a lot of time, using Emacs as a frontend also lets me use
the whole mighty range of Emacs magic that I am used to. I use psvn.el
for SVN. For GIT there are currently two choices:
git.el, git-blame.el and vc-git.el, or
DVC
As of now (August 2007) DVC is undergoing heavy development and is not
fully ready for action, which is why I use git.el. At some point in
the not so distant future, I will switch to DVC. At this point I
would like to mention that it is a good idea to read the developers
mailing list2 for DVC in order to be up to date about what is going
on. Update: I am now (February 2008) on DVC exclusively.
DVC is an Emacs frontend for various Decentralized Revision Control systems. It is the successor to, and still includes, Xtla, which is the Emacs frontend to tla and baz (GNU Arch clients).
Take a look at the aliases in my ~/.bashrc (namely mudvc) in order to
see how I keep up with the latest DVC code. Installing and setting
up is a piece of cake as well:
cd ~
bzr get http://bzr.xsteve.at/dvc/
cd ~/dvc
autoconf
./configure
make
Finally, take a look at the settings in my .emacs (plain text version)
to see how I load the code, what keybindings I have etc. Just search
for the string dvc within .emacs.
The git manual says e.g. Emacs ediff can be used to resolve conflicts.
You may also use git-mergetool(1), which lets you merge the unmerged files using external tools such as emacs or kdiff3.
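Selecting Emacs for git mergetool is a one-line configuration. The sketch below sets it per repository in a throwaway directory; emerge is one of the tool names git mergetool accepts and stands for the Emacs Emerge mode, which is not the same thing as ediff. That Emacs is installed is an assumption.

```shell
#!/bin/sh
# Sketch: make Emacs (Emerge mode) the merge tool for one repository.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q

git config merge.tool emerge    # repository-local; add --global for all repos

tool=$(git config merge.tool)
echo "merge tool: $tool"

# With a conflicted merge in progress, running
#   git mergetool
# would now open each unmerged file in Emacs.
```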
1. Well, there is not much to say about that. My personal experience as well as my observations are like this: After two decades or so a person has pretty much seen everything related to his area of expertise and thus is able not just to judge things instantly but also to avoid redundancy, i.e. doing the same work several times where the single correct approach would have been enough. Some call that experience and collected knowledge, others just call it getting older. My non-abstract statement here is: I do not use two or more tools to go from A to B anymore, I automate anything possible, I try to save as much time as possible on repetitive tasks and use the time saved either to avoid working 70+ hours a week or to make progress on really demanding areas of my research interests. Finally, I feel lucky: I am now able to judge things by just glancing at them and make instant decisions. The tools (e.g. OS, Editor, SCM, Hardware, etc.) I use are the best solution; there is no redundancy at all anymore, plus I have tailored the whole thing to fit my needs. The best technical solution is worth nothing if it requires a human to invest too much time in it, e.g. GNU Emacs is only worth going through the initial 6 months of pain because in the long term it probably saves one ten times that amount of time ... Same goes for Debian GNU/Linux, enterprise-class hardware e.g. IBM Blade Center, helicopter flying licence, etc.
2. If you are with Gnus then visit the group buffer, type B, choose
nntp and news.gmane.org as news server. Then search for dvc in
the new buffer you just got (do so by using C-s dvc and repeat
C-s until you find the Emacs devel group for DVC; note, you can
use d with point on that line to get a description). Then
subscribe to the group with point on that line and u. After that
quit the buffer using q. Now you are subscribed to the DVC devel
mailing list via Gmane. I abandoned "normal" subscription to
mailing lists completely in favor of Gmane since it is way easier
to handle, way faster to get things done and, most importantly,
I am confronted with a single UI (User Interface) no matter what
ML (Mailing List) I deal with.
3. Generally, sourceforge.net allows for rsync access. Note that a
remote SVN repository has to be explicitly set up to allow
mirroring via rsync.
4. However, one can set up a post-commit hook that automatically pushes every time one commits in the local repository. The downside is that one thereby loses the flexibility to fix up a botched commit in the local repository before it is published.
5. We do not mirror the SVN repository as is locally. What git svn
clone does is set up a local GIT repository that allows for
bidirectional operations between our local GIT repository and the
remote SVN repository which we cloned from.