This tutorial shows how to set up a centralized git repository hosted on your own git server, which can be any Unix system with ssh, git, and git-annex installed: a VPS, a home server, etc.
This sets up a very simple git server. More complex setups are possible. See for example using gitolite with git-annex.
set up the server
On the server, install git and git-annex if you haven't already. If possible, install them using your distribution's package manager:
server# sudo apt-get install git git-annex
Note that git-annex-shell needs to be located somewhere in the PATH, so that a client can successfully run "ssh yourserver git-annex-shell". Installing git-annex using a package manager will take care of this for you. But if you're not root or otherwise can't install git-annex that way, you may need to do more work; see get git-annex-shell into PATH.
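One way to confirm the PATH requirement described above is to ask the shell where it finds the command. The following is a minimal sketch; check_in_path is a hypothetical helper name introduced here for illustration, and yourserver is a placeholder host name.

```shell
# Report whether a command can be found in PATH.
# check_in_path is a hypothetical helper, not part of git or git-annex.
check_in_path() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found in PATH" >&2
        return 1
    fi
}

# On the server itself:
#   check_in_path git-annex-shell
# From a client, the same lookup over a non-interactive ssh session
# (yourserver is a placeholder):
#   ssh yourserver 'command -v git-annex-shell'
```

The check from the client is the more telling one, since a non-interactive ssh session may see a different PATH than your login shell does.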
Decide where to put the repository on the server, and create a bare git repo there. In your home directory is a simple choice:
server# cd
server# git init annex.git --bare --shared
That's the server setup done!
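If you want to double-check the result, plain git can report whether a repository is bare and how it is shared. This sketch creates a throwaway repository in a temporary directory so it can be run anywhere; the scratch variable and temporary path are assumptions for illustration, and on a real server you would run the two checks against ~/annex.git instead.

```shell
# Create a scratch bare, shared repository, mirroring the command above.
scratch=$(mktemp -d)
git init --quiet --shared --bare "$scratch/annex.git"

# Prints "true" when the repository is bare.
git -C "$scratch/annex.git" rev-parse --is-bare-repository

# Shows the sharing mode that --shared recorded in the repository config.
git -C "$scratch/annex.git" config core.sharedRepository
```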
make a checkout
Now on your laptop, clone the git repository from the server:
laptop# git clone ssh://example.com/~/annex.git
Cloning into 'annex'...
warning: You appear to have cloned an empty repository.
Checking connectivity... done.
Tell git-annex to use the repository, and describe where this clone is located:
laptop# cd annex
laptop# git annex init 'my laptop'
init my laptop ok
add files to the repository
Add some files, however you obtained them.
# git annex add *.mp4
add Haskell_Amuse_Bouche-b9OVqxmI.mp4 (checksum) ok
(Recording state in git...)
# git commit -m "added a video. I have not watched it yet but it sounds interesting"
Feel free to rename the files, etc, using normal git commands:
# git mv Haskell_Amuse_Bouche-b9OVqxmI.mp4 Haskell_Amuse_Bouche.mp4
# git commit -m 'better filenames'
Now push your changes back to the central repository on your server. As well as pushing the master branch, remember to push the git-annex branch, which is used to track the file contents.
# git push origin master git-annex
To ssh://example.com/~/annex.git
* [new branch] master -> master
* [new branch] git-annex -> git-annex
That push went fast, because it didn't upload large videos to the server.
So, to finish up, tell git-annex to sync all the data in the repository to your server:
# git annex sync --content
...
make more checkouts
So far you have a central repository on your server, and a checkout on a laptop. Let's make another checkout elsewhere. Clone the central repository as before.
elsewhere# git clone ssh://example.com/~/annex.git
elsewhere# cd annex
Notice that your clone does not have the contents of any of the files yet.
If you run ls, you'll see broken symlinks. It's easy to download them from your server, either by running git annex sync --content or by asking git-annex to download individual files:
# git annex get Haskell_Amuse_Bouche.mp4
get Haskell_Amuse_Bouche.mp4 (from origin...)
12877824 2% 255.11kB/s 00:00
ok
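To see which files in a fresh clone still lack their content, you can look for the broken symlinks mentioned above using nothing but the shell. This is a rough sketch, assuming annexed files appear as symlinks as they do in this tutorial; is_broken_symlink and list_missing are hypothetical helper names, not git-annex commands.

```shell
# Succeeds when the argument is a symlink whose target does not exist.
# (Hypothetical helper, not part of git-annex.)
is_broken_symlink() {
    [ -L "$1" ] && [ ! -e "$1" ]
}

# Print the names of broken symlinks in the current directory,
# i.e. files whose content has not been downloaded yet.
list_missing() {
    for f in *; do
        if is_broken_symlink "$f"; then
            echo "$f"
        fi
    done
}
```

Running list_missing inside the checkout before and after git annex get should show the downloaded file dropping out of the list.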
I must add that after the previous commands finished, content of the repository was not shown on the server (it was all in .git). I made
and files appeared. Unfortunately, all the timestamps seemed to be new, which I didn't like; this has already been raised here (http://git-annex.branchable.com/todo/does_not_preserve_timestamps/).
As usual, create a directory on the server, run git init, then git-annex init there. Add that locally:
git remote add my-server-name my-server:~/my-repo
Running git-annex sync locally seems to work fine and pushes data to the server.
I needed this workaround before, because I could not get data from my laptop while on the server (I wasn't sure my laptop had a reachable IP address). This is mostly a basic thing in git, but I had errors with git-annex earlier and I try to be cautious now.
When I follow the tutorial and run
it replies
However, one can use git-annex sync instead, and that works fine.
Is it possible to init a git-annex repo on your local machine, add a git remote there, and push local data to the server? Will this 'non-official' approach cause any problems?
git-annex sets up its own ssh connection caching because this makes it a lot faster.
To disable this feature, you can set annex.sshcaching=false, or set remote.origin.annex-ssh-options as you have.
git-annex has no way to know if you have another ssh socket to use, so it seems fine for you to need to configure it if you want it to use one.
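The two settings named above are ordinary git configuration keys, so they can be set with git config. The sketch below sets both in a throwaway repository purely to show the syntax; in practice you would pick one and run it in your own checkout. The ControlPath value is a made-up placeholder, not the commenter's actual setting, and should be replaced with the socket path your existing ssh master connection uses.

```shell
# Scratch repository so the commands can be tried safely.
repo=$(mktemp -d)
git init --quiet "$repo"

# Option 1: turn off git-annex's own ssh connection caching.
git -C "$repo" config annex.sshcaching false

# Option 2: pass extra ssh options for the remote named origin.
# The socket path below is a hypothetical placeholder.
git -C "$repo" config remote.origin.annex-ssh-options \
    '-o ControlMaster=auto -o ControlPath=~/.ssh/example-control-socket'

# Show what was stored.
git -C "$repo" config --get-regexp 'annex'
```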
To avoid frequently typing a PIN + RSA passcode + password, we typically establish an ssh control master just once. This works fine with regular git commands, but git-annex commands apparently try to create a different socket. Even that would be OK, except that it apparently creates a new socket each time we enter a command.
With sufficient "-vvvv" we see things like:
(Note: I have removed references to the actual machine names and user IDs.)
If the command had instead been:
everything would have worked fine. In fact, we are now using:
and this eliminates the issue. But it would be nice if git annex could somehow automatically use the pre-existing connection. Is there a better way to achieve this?