Monday, July 27, 2009

Sphinx|In-Depth django-sphinx Tutorial

In-Depth django-sphinx Tutorial | David Cramer's Blog

Again, I still suck at documentation, and my "tutorials" aren't in-depth enough. So hopefully this covers all of the questions regarding using the django-sphinx module.

The first thing you're going to need to do is install the Sphinx search software. You will be able to get this through http://www.sphinxsearch.com/, or probably even port or aptitude.

Configure Sphinx

Once you have successfully installed Sphinx you need to configure it. Follow the directions on their website for the basic configuration, but most importantly, you need to configure a search index which can relate to one of your models.

Here is an example of an index from Curse's File model, which lets you search via name, description, and tags on a file. Please note that "base" is a base source definition we created which has a few defaults which we use, but this is unrelated to your source definition.

source files_file_en : base
{
    sql_query = \
        SELECT files_file.id, files_file.name, files_data.description, files_file.tags as tag \
        FROM files_file JOIN files_data \
        ON files_file.id = files_data.file_id \
        AND files_data.lang = 'en' \
        AND files_file.visible = 1 \
        GROUP BY files_file.id
    sql_query_info = SELECT * FROM files_file WHERE id=$id
}

Now that you have your source defined, you need to build an index which uses this source. I do recommend placing all of your sphinx information somewhere else, maybe /var/sphinx/data.

index files_file_en
{
    source          = files_file_en
    path            = /var/data/files_file_en
    docinfo         = extern
    morphology      = none
    stopwords       =
    min_word_len    = 2
    charset_type    = sbcs
    min_prefix_len  = 0
    min_infix_len   = 0
}

Configure Django

Now that you've configured your search index you need to set up the configuration for Django. The first step is to install the django-sphinx wrapper. First things first, download the zip archive, or check out the source from http://code.google.com/p/django-sphinx/.

Once you have your files on the local computer or server, you can simply run sudo python setup.py install to install the library.

After installation you need to edit a few settings in settings.py, which, again, being that I suck at documentation, isn't posted on the website.

The two settings you need to add are these:

SPHINX_SERVER = 'localhost'
SPHINX_PORT = 3312

Setup Your Model

Now you are fully able to utilize Sphinx within Django. The next step is to actually attach your search index to a model. To do this, you will need to import djangosphinx and then attach the manager to a model. See the example below:

from django.db import models
import djangosphinx


class File(models.Model):
    # CharField requires max_length; 255 is an illustrative value
    name = models.CharField(max_length=255)
    # We actually store tags for efficiency in tag,tag,tag format here
    tags = models.CharField(max_length=255)

    objects = models.Manager()
    search = djangosphinx.SphinxSearch(index="files_file_en")

The index argument is optional, and there are several other parameters you can pass, but you'll have to look in the code (or pydoc if I did it right, but probably not).

Once we've defined the search manager on our model, we can access it via Model.manager_name and pass it many things like we could with a normal object manager in Django. The typical usage is Model.search.query('my fulltext query') which would then query sphinx, grab a list of IDs, and then do a Model.objects.filter(pk__in=[list of ids]) and return this result set.
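The two-step lookup described above can be sketched in plain Python. This is only an illustration, not django-sphinx's actual code: FAKE_INDEX, FAKE_TABLE, sphinx_ids, fetch_rows and search are all hypothetical stand-ins, with a dict playing the role of the Sphinx index and another playing the database table. The re-ordering at the end is my own assumption, added because a pk__in lookup makes no promise about row order.

```python
# Hypothetical stand-ins: one dict for the Sphinx index, one for the database table.
FAKE_INDEX = {"frotz": [3, 1], "audio": [2]}
FAKE_TABLE = {1: "frotz-1.0.zip", 2: "alsa-audio.tar.gz", 3: "frotz-2.43.zip"}


def sphinx_ids(term):
    """Step 1: ask the full-text engine for matching primary keys, best match first."""
    return FAKE_INDEX.get(term, [])


def fetch_rows(ids):
    """Step 2: fetch rows by primary key, in the spirit of filter(pk__in=ids)."""
    return {pk: FAKE_TABLE[pk] for pk in ids if pk in FAKE_TABLE}


def search(term):
    ids = sphinx_ids(term)
    rows = fetch_rows(ids)
    # Put the rows back into the order the search engine ranked them.
    return [rows[pk] for pk in ids if pk in rows]


print(search("frotz"))
```

The point of the split is that the full-text engine only ever hands back IDs; everything else about the object still comes from the database.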

Search Methods

There are a few additional methods which you can use on your search queryset besides the default query method: order_by, filter, count, and exclude, to name a few. These don't *quite* work the same as Django's as they're used directly within the search wrapper. So here's a brief rundown of these:

  • query
    This is your basic full-text search query. It works exactly the same as passing your query to the full-text engine. Its search type will be based on the search mode, which, by default, is SPH_MATCH_EXTENDED.
  • filter/exclude
    The filter and exclude methods hold the same idea as the normal queryset methods, except that they are applied directly in Sphinx. This means that you can only filter on attribute fields that are present in your search index.
  • order_by
    The order_by method also passes its parameters to Sphinx, with one exception. There are four reserved keywords: @id, @weight, @rank, and @relevance. These are detailed in the Sphinx documentation.
  • select_related
    This method is directly passed onto the Django queryset and holds no value to Sphinx.
  • index_on
    Allows you to specify which index(es) you are querying for. To query for multiple indexes you need to include a "content_type" name in your fields.

Sphinx works with MySQL and Postgres; just remember to run configure with the --with-pgsql option.

It seems the search command-line tool doesn't like Postgres, though.

Saturday, July 25, 2009

Django|snippets: MediaWiki Markup

Django snippets: MediaWiki Markup

MediaWiki-style markup. parse(text) returns safe HTML from wiki markup; the code is based off of MediaWiki.

Thursday, July 23, 2009

SSH| User Howto

SSH User Howto - OIWiki


About SSH

What is Secure Shell (SSH)

See the Wikipedia SSH page if you need a full rundown. Essentially it is a protocol for connecting to and executing commands on a remote system, using a secure encrypted tunnel.

What is SFTP

SFTP is a file transfer protocol that uses SSH to authenticate and encrypt its traffic. It is essentially a sub-service of the SSH server.

Why not use FTP or RSH etc

Both FTP and RSH use no encryption and pass passwords over the network in plain text. This makes it possible for the passwords to be captured in a number of ways, which is obviously bad for the users and for the system's security. Therefore, whenever possible, SSH/SFTP should be used for file transfers and remote connections. Use of FTP or RSH in legacy applications requires Infosec approval.

What are SSH keys?

One of the ways SSH improves security is to identify a user by use of a "key". The benefit of a key is that at no time does a password or even the private key traverse the network; a challenge-response mechanism is used to validate that the incoming user is who they say they are.

This works because there is a "private" and a "public" key. Basically the client proves who they are by exchanging values with the server using the public keys. The client uses the private key to provide a response which validates that it has the private key associated with the public key. When the server validates a correct response, it allows access.

By using keys, there is no need to provide passwords, which allows non-interactive or passwordless interactive logins.

SSH Versions

In UNIX, there are two main flavours of SSH in common usage - OpenSSH and Secure Shell (commercial SSH). Both are compatible, but each is slightly different in syntax, file formats and configuration. The simple way to tell is once you are on a system, use the "ssh -V" command to identify what version you are using.

The below sections outline the basic usage of each from a user perspective, as well as how to work between the two versions when the need arises.

OpenSSH

Identification

OpenSSH is the more common version and "ssh -V" will have either OpenSSH or OpenSSL in the output:

$ ssh -V
Sun_SSH_1.1, SSH protocols 1.5/2.0, OpenSSL 0x0090704f

$ ssh -V
OpenSSH_3.6.1p2, SSH protocols 1.5/2.0, OpenSSL 0x0090701f

Configuration

OpenSSH stores its configuration files under the ".ssh" directory of the user's home directory.

By default, it will identify a user using the keyfiles "~/.ssh/id_dsa" and "~/.ssh/id_rsa".

It will validate an incoming user by matching public keys stored in the "~/.ssh/authorized_keys" file.

Creating Keys

To create a key using OpenSSH, use the ssh-keygen command. The command below says to create a key using DSA encryption of 1024 bit strength with an empty passphrase (-N "") to the file id_dsa. Note that -N takes the passphrase as its argument, so the empty string must be given explicitly:

$ cd ~/.ssh
$ ssh-keygen -t dsa -b 1024 -N "" -f id_dsa

You will see two files created: an id_dsa and an id_dsa.pub. The id_dsa will now be used by the ssh command to attempt to authenticate you to other servers.

If you have both RSA and DSA keys created, it will try them both.

Allowing Access

To allow a remote user to login to your account using SSH, you simply need to append their public key to your ~/.ssh/authorized_keys file. For example:

$ cat bobskey.pub >> ~/.ssh/authorized_keys 

Be sure the public key is in the OpenSSH format, however. If it is in the SecureSSH format, use the ssh-keygen command to convert it:

$ ssh-keygen -i -f secsshkey.pub > opensshkey.pub 

You can then append the converted key to the authorized_keys file.

Secure SSH

Identification

Secure SSH is a commercially produced SSH implementation. Its version output will not reference OpenSSL and generally names a vendor:

$ ssh -V
ssh2: SSH Secure Shell 2.4.0 on alphaev56-dec-osf4.0e

$ ssh -V
ssh: SSH Tectia Server 4.0.5 on powerpc-ibm-aix5.1.0.0
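The identification rule from the two "Identification" sections can be written down as a tiny check. This is just a sketch of the rule of thumb given in this howto (OpenSSH or OpenSSL in the output means the OpenSSH family, anything else is treated as commercial); the function name ssh_flavour is my own, and the sample strings are the ones shown in this page.

```python
def ssh_flavour(version_output):
    """Classify `ssh -V` output using the rule of thumb from this howto."""
    if "OpenSSH" in version_output or "OpenSSL" in version_output:
        return "OpenSSH"
    return "commercial"


samples = [
    "Sun_SSH_1.1, SSH protocols 1.5/2.0, OpenSSL 0x0090704f",
    "OpenSSH_3.6.1p2, SSH protocols 1.5/2.0, OpenSSL 0x0090701f",
    "ssh2: SSH Secure Shell 2.4.0 on alphaev56-dec-osf4.0e",
    "ssh: SSH Tectia Server 4.0.5 on powerpc-ibm-aix5.1.0.0",
]
for line in samples:
    print(ssh_flavour(line), "<-", line)
```

Knowing the flavour tells you which configuration directory (~/.ssh or ~/.ssh2) and which key format to expect.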

Configuration

Secure SSH stores its configuration under the ".ssh2" directory of a user's home directory.

By default it identifies a user using key files listed in the "~/.ssh2/identification" file.

It will validate an incoming user by matching public key files listed in the "~/.ssh2/authorization" file.

Creating Keys

To create a key using Secure SSH, again use the ssh-keygen command. The command below says to create a key using DSA encryption of 1024 bit strength with no passphrase (-P):

$ ssh-keygen -t dsa -b 1024 -P
Generating 1024-bit dsa key pair
   5 oOOo.oOo.oOo
Key generated.
1024-bit dsa, root@o9030004, Thu May 31 2007 13:12:37
Private key saved to //.ssh2/id_dsa_1024_a
Public key saved to //.ssh2/id_dsa_1024_a.pub

To use this key to authenticate to remote servers, append the filename to the identification file as such:

$ echo "IdKey id_dsa_1024_a" >> ~/.ssh2/identification

You can append multiple lines to this file and the SSH client will attempt them in order.

Allowing Access

To allow remote access, you need to copy the public key file (e.g. id_dsa_1024_a.pub) to the remote system and place it under the user's .ssh2 directory. You then need to list the key file in the ~/.ssh2/authorization file as such:

$ echo "Key  id_dsa_1024_a.pub" >> ~/.ssh2/authorization 

Generally it is helpful to identify the server and user that the key is from by the filename, for example "cdun1410-ipg_as.pub" so you know what file is what.

If you are copying the public key from an OpenSSH system, then you need to first convert it to the SECSSH format by using the ssh-keygen command on the OpenSSH system:

openssh$ ssh-keygen -e -f id_dsa.pub > securessh.pub 

You can then copy the resulting key and place it on the Secure SSH system.

Working between Secure and OpenSSH

Both versions are just as secure; it just happens that the commercial version is made by a company called SSH Communications Security. Both are also compatible and able to exchange keys provided that, as above, you convert the key files for use on the systems as needed.

The simple way to tell if you have a SecureSSH or OpenSSH key file is by viewing it.

A SecureSSH key file will have a "BEGIN SSH2" and "END SSH2" line surrounding the actual key text.

An OpenSSH private key will have "BEGIN <keytype> PRIVATE KEY" around the key, and the public keys begin with either "ssh-dss" or "ssh-rsa" followed by the key text in a single line.
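If you would rather not eyeball the files, the same markers can be checked with a few lines of Python. This is a rough sketch under the assumptions stated in the text above (it only looks at the first line of the file), and classify_key is a name made up for this example, not part of any SSH tool.

```python
def classify_key(text):
    """Guess the key file format from the markers described in this howto."""
    lines = text.strip().splitlines()
    first = lines[0] if lines else ""
    # SecureSSH files are wrapped in BEGIN SSH2 / END SSH2 lines.
    if "BEGIN SSH2" in first:
        return "SecureSSH key"
    # OpenSSH private keys are wrapped in BEGIN <keytype> PRIVATE KEY lines.
    if first.startswith("-----BEGIN") and first.rstrip("-").endswith("PRIVATE KEY"):
        return "OpenSSH private key"
    # OpenSSH public keys are a single line starting with the key type.
    if first.startswith(("ssh-dss ", "ssh-rsa ")):
        return "OpenSSH public key"
    return "unknown"


print(classify_key("ssh-rsa AAAAB3Nza user@host"))
```

Anything that comes back as "unknown" is worth opening by hand before you start converting it.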

Common Issues

As a first step, try using ssh with the "-v" flag to get more verbose details on what the SSH client is attempting to do. Pay attention to what key files it attempts to use, and the responses from the remote server.

User Accounts

Even though you may authenticate to a server correctly, SSH is still at the mercy of the user account on the remote system. If the account is locked, expired or otherwise inaccessible it will appear as if the SSH connection is simply disconnecting.

If you can login using a password, then the account is ok. Most likely you have an issue with your configuration or key files as described below.

Key location

As a first step, make sure you are not using the wrong configuration directory. OpenSSH will not look at a ~/.ssh2 directory, and commercial SSH won't look at a ~/.ssh directory.

Key Formats

Ensure that the key files have been converted as appropriate on the server system. See the above sections for details on conversion.

File Permissions

One of the more common gotchas with SSH is that it is militantly pedantic about file permissions. If the file permissions are not secure enough, SSH will ignore the key completely. This applies to the user's configuration directory (~/.ssh or ~/.ssh2) as well as the key files. If the user's home directory or the .ssh directory is writable by anyone other than the user, SSH will ignore it and all its contents completely. This applies to both the SSH client and the SSH server.

Here is what your permissions should be:

  • user home directory - 0755
  • ssh directory - 0755
  • private key files - 0400
  • public key files - 0644
  • other config files - 0644

As a first step, these permissions should be validated and set on both the client and the server to ensure that the SSH command is not ignoring your key files.
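A quick way to validate the modes from the list above is a small script. The sketch below assumes a POSIX system and only checks the configuration directory and the private keys; too_open and check_ssh_permissions are hypothetical helper names, and the default key file names are the OpenSSH ones mentioned earlier. It is stricter than SSH itself: it flags anything beyond the modes listed above, even where the SSH tools might still accept the file.

```python
import os
import stat


def too_open(path, allowed_mode):
    """Return True if path grants any permission bit beyond allowed_mode."""
    actual = stat.S_IMODE(os.stat(path).st_mode)
    return bool(actual & ~allowed_mode)


def check_ssh_permissions(ssh_dir, private_keys=("id_dsa", "id_rsa")):
    """List the paths under ssh_dir that are more open than the howto recommends."""
    problems = []
    if too_open(ssh_dir, 0o755):
        problems.append(ssh_dir)
    for name in private_keys:
        key = os.path.join(ssh_dir, name)
        if os.path.exists(key) and too_open(key, 0o400):
            problems.append(key)
    return problems


if __name__ == "__main__":
    ssh_dir = os.path.expanduser("~/.ssh")
    if os.path.isdir(ssh_dir):
        for path in check_ssh_permissions(ssh_dir):
            print("too open:", path)
```

Run it on both the client and the server account; an empty result means the directory and private keys are at least as tight as the list above.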

Troubleshooting

  1. Ensure you can login to the remote system interactively with a password. If you cannot, you have account issues and should raise a Clarify case to the administrators of the system you are connecting to.
  2. Verify you have created the correct private and public keys on the client system (i.e. the system you are initiating the connection from).
  3. Verify the permissions of those files are correct.
  4. Use ssh -v and verify that the ssh client is attempting to use the key files.
  5. On the remote system, verify the public keys are installed, are converted to the correct format, are in their correct locations and have the correct permissions.
  6. If you still cannot connect, try from another system that you know does work or has worked, to ensure that there is not some other change on the server system preventing your connection.
  7. If you get to here, you should raise a Clarify case to your systems administration group for investigation.

Git| common commands

Git - Fast Version Control System

Commands

Here is a list of the most common commands you're likely to use on a day-to-day basis.

Local Commands

git config Get and set repository or global options
git init Create an empty git repository or reinitialize an existing one
git add Add file contents to the index
git status Show the working tree status
git commit Record changes to the repository
git log Show commit history
git show Show information on any object
git tag Create, list, delete or verify tags

Remotey Commands

git clone Clone a repository into a new directory
git remote Manage set of tracked repositories
git pull Fetch from and merge with another repository or a local branch
git fetch Download objects and refs from another repository
git push Update remote refs along with associated objects

Branchy Commands

git checkout Checkout a branch or paths to the working tree
git branch List, create, or delete branches
git merge Join two or more development histories together
git rebase Forward-port local commits to the updated upstream head

Patchy Commands

git diff Show changes between commits, commit and working tree, etc
git apply Apply a patch on a git index file and a working tree
git format-patch Prepare patches for e-mail submission
git am Apply a series of patches from a mailbox

Git|Everyday GIT With 20 Commands Or So

Everyday GIT With 20 Commands Or So


[Basic Repository] commands are needed by people who have a repository --- that is everybody, because every working tree of git is a repository.

In addition, [Individual Developer (Standalone)] commands are essential for anybody who makes a commit, even for somebody who works alone.

If you work with other people, you will need commands listed in the [Individual Developer (Participant)] section as well.

People who play the [Integrator] role need to learn some more commands in addition to the above.

[Repository Administration] commands are for system administrators who are responsible for the care and feeding of git repositories.

Basic Repository

Everybody uses these commands to maintain git repositories.

Examples

Check health and remove cruft.
$ git fsck (1)
$ git count-objects (2)
$ git gc (3)
  1. running without --full is usually cheap and assures the repository health reasonably well.

  2. check how many loose objects there are and how much disk space is wasted by not repacking.

  3. repacks the local repository and performs other housekeeping tasks.

Repack a small project into single pack.
$ git gc (1)
  1. pack all the objects reachable from the refs into one pack, then remove the other packs.

Individual Developer (Standalone)

A standalone individual developer does not exchange patches with other people, and works alone in a single repository, using the following commands.

Examples

Use a tarball as a starting point for a new repository.
$ tar zxf frotz.tar.gz
$ cd frotz
$ git init
$ git add . (1)
$ git commit -m "import of frotz source tree."
$ git tag v2.43 (2)
  1. add everything under the current directory.

  2. make a lightweight, unannotated tag.

Create a topic branch and develop.
$ git checkout -b alsa-audio (1)
$ edit/compile/test
$ git checkout -- curses/ux_audio_oss.c (2)
$ git add curses/ux_audio_alsa.c (3)
$ edit/compile/test
$ git diff HEAD (4)
$ git commit -a -s (5)
$ edit/compile/test
$ git reset --soft HEAD^ (6)
$ edit/compile/test
$ git diff ORIG_HEAD (7)
$ git commit -a -c ORIG_HEAD (8)
$ git checkout master (9)
$ git merge alsa-audio (10)
$ git log --since='3 days ago' (11)
$ git log v2.43.. curses/ (12)
  1. create a new topic branch.

  2. revert your botched changes in curses/ux_audio_oss.c.

  3. you need to tell git if you added a new file; removal and modification will be caught if you do git commit -a later.

  4. to see what changes you are committing.

  5. commit everything as you have tested, with your sign-off.

  6. take the last commit back, keeping what is in the working tree.

  7. look at the changes since the premature commit we took back.

  8. redo the commit undone in the previous step, using the message you originally wrote.

  9. switch to the master branch.

  10. merge a topic branch into your master branch.

  11. review commit logs; other forms to limit output can be combined and include --max-count=10 (show 10 commits), --until=2005-12-10, etc.

  12. view only the changes that touch what's in curses/ directory, since v2.43 tag.

Individual Developer (Participant)

A developer working as a participant in a group project needs to learn how to communicate with others, and uses these commands in addition to the ones needed by a standalone developer.

  • git-clone(1) from the upstream to prime your local repository.

  • git-pull(1) and git-fetch(1) from "origin" to keep up-to-date with the upstream.

  • git-push(1) to shared repository, if you adopt CVS style shared repository workflow.

  • git-format-patch(1) to prepare e-mail submission, if you adopt Linux kernel-style public forum workflow.

Examples

Clone the upstream and work on it. Feed changes to upstream.
$ git clone git://git.kernel.org/pub/scm/.../torvalds/linux-2.6 my2.6
$ cd my2.6
$ edit/compile/test; git commit -a -s (1)
$ git format-patch origin (2)
$ git pull (3)
$ git log -p ORIG_HEAD.. arch/i386 include/asm-i386 (4)
$ git pull git://git.kernel.org/pub/.../jgarzik/libata-dev.git ALL (5)
$ git reset --hard ORIG_HEAD (6)
$ git gc (7)
$ git fetch --tags (8)
  1. repeat as needed.

  2. extract patches from your branch for e-mail submission.

  3. git pull fetches from origin by default and merges into the current branch.

  4. immediately after pulling, look at the changes done upstream since last time we checked, only in the area we are interested in.

  5. fetch from a specific branch from a specific repository and merge.

  6. revert the pull.

  7. garbage collect leftover objects from reverted pull.

  8. from time to time, obtain official tags from the origin and store them under .git/refs/tags/.

Push into another repository.
satellite$ git clone mothership:frotz frotz (1)
satellite$ cd frotz
satellite$ git config --get-regexp '^(remote|branch)\.' (2)
remote.origin.url mothership:frotz
remote.origin.fetch refs/heads/*:refs/remotes/origin/*
branch.master.remote origin
branch.master.merge refs/heads/master
satellite$ git config remote.origin.push \
           master:refs/remotes/satellite/master (3)
satellite$ edit/compile/test/commit
satellite$ git push origin (4)

mothership$ cd frotz
mothership$ git checkout master
mothership$ git merge satellite/master (5)
  1. mothership machine has a frotz repository under your home directory; clone from it to start a repository on the satellite machine.

  2. clone sets these configuration variables by default. It arranges git pull to fetch and store the branches of mothership machine to local remotes/origin/* tracking branches.

  3. arrange git push to push local master branch to remotes/satellite/master branch of the mothership machine.

  4. push will stash our work away on remotes/satellite/master tracking branch on the mothership machine. You could use this as a back-up method.

  5. on mothership machine, merge the work done on the satellite machine into the master branch.

Branch off of a specific tag.
$ git checkout -b private2.6.14 v2.6.14 (1)
$ edit/compile/test; git commit -a
$ git checkout master
$ git format-patch -k -m --stdout v2.6.14..private2.6.14 |
  git am -3 -k (2)
  1. create a private branch based on a well known (but somewhat behind) tag.

  2. forward port all changes in private2.6.14 branch to master branch without a formal "merging".

Integrator

A fairly central person acting as the integrator in a group project receives changes made by others, reviews and integrates them and publishes the result for others to use, using these commands in addition to the ones needed by participants.

Examples

My typical GIT day.
$ git status (1)
$ git show-branch (2)
$ mailx (3)
& s 2 3 4 5 ./+to-apply
& s 7 8 ./+hold-linus
& q
$ git checkout -b topic/one master
$ git am -3 -i -s -u ./+to-apply (4)
$ compile/test
$ git checkout -b hold/linus && git am -3 -i -s -u ./+hold-linus (5)
$ git checkout topic/one && git rebase master (6)
$ git checkout pu && git reset --hard next (7)
$ git merge topic/one topic/two && git merge hold/linus (8)
$ git checkout maint
$ git cherry-pick master~4 (9)
$ compile/test
$ git tag -s -m "GIT 0.99.9x" v0.99.9x (10)
$ git fetch ko && git show-branch master maint 'tags/ko-*' (11)
$ git push ko (12)
$ git push ko v0.99.9x (13)
  1. see what I was in the middle of doing, if any.

  2. see what topic branches I have and think about how ready they are.

  3. read mails, save ones that are applicable, and save others that are not quite ready.

  4. apply them, interactively, with my sign-offs.

  5. create topic branch as needed and apply, again with my sign-offs.

  6. rebase internal topic branch that has not been merged to the master, nor exposed as a part of a stable branch.

  7. restart pu every time from the next.

  8. and bundle topic branches still cooking.

  9. backport a critical fix.

  10. create a signed tag.

  11. make sure I did not accidentally rewind master beyond what I already pushed out. ko shorthand points at the repository I have at kernel.org, and looks like this:

    $ cat .git/remotes/ko
    URL: kernel.org:/pub/scm/git/git.git
    Pull: master:refs/tags/ko-master
    Pull: next:refs/tags/ko-next
    Pull: maint:refs/tags/ko-maint
    Push: master
    Push: next
    Push: +pu
    Push: maint

    In the output from git show-branch, master should have everything ko-master has, and next should have everything ko-next has.

  12. push out the bleeding edge.

  13. push the tag out, too.

Repository Administration

A repository administrator uses the following tools to set up and maintain access to the repository by developers.

  • git-daemon(1) to allow anonymous download from repository.

  • git-shell(1) can be used as a restricted login shell for shared central repository users.

update hook howto has a good example of managing a shared central repository.

Examples

We assume the following in /etc/services
$ grep 9418 /etc/services
git             9418/tcp                # Git Version Control System

Run git-daemon to serve /pub/scm from inetd.
$ grep git /etc/inetd.conf
git     stream  tcp     nowait  nobody \
  /usr/bin/git-daemon git-daemon --inetd --export-all /pub/scm

The actual configuration line should be on one line.

Run git-daemon to serve /pub/scm from xinetd.
$ cat /etc/xinetd.d/git-daemon
# default: off
# description: The git server offers access to git repositories
service git
{
        disable = no
        type            = UNLISTED
        port            = 9418
        socket_type     = stream
        wait            = no
        user            = nobody
        server          = /usr/bin/git-daemon
        server_args     = --inetd --export-all --base-path=/pub/scm
        log_on_failure  += USERID
}

Check your xinetd(8) documentation and setup; this is from a Fedora system. Others might be different.

Give push/pull only access to developers.
$ grep git /etc/passwd (1)
alice:x:1000:1000::/home/alice:/usr/bin/git-shell
bob:x:1001:1001::/home/bob:/usr/bin/git-shell
cindy:x:1002:1002::/home/cindy:/usr/bin/git-shell
david:x:1003:1003::/home/david:/usr/bin/git-shell
$ grep git /etc/shells (2)
/usr/bin/git-shell
  1. log-in shell is set to /usr/bin/git-shell, which does not allow anything but git push and git pull. The users should get an ssh access to the machine.

  2. in many distributions /etc/shells needs to list what is used as the login shell.

CVS-style shared repository.
$ grep git /etc/group (1)
git:x:9418:alice,bob,cindy,david
$ cd /home/devo.git
$ ls -l (2)
  lrwxrwxrwx   1 david git    17 Dec  4 22:40 HEAD -> refs/heads/master
  drwxrwsr-x   2 david git  4096 Dec  4 22:40 branches
  -rw-rw-r--   1 david git    84 Dec  4 22:40 config
  -rw-rw-r--   1 david git    58 Dec  4 22:40 description
  drwxrwsr-x   2 david git  4096 Dec  4 22:40 hooks
  -rw-rw-r--   1 david git 37504 Dec  4 22:40 index
  drwxrwsr-x   2 david git  4096 Dec  4 22:40 info
  drwxrwsr-x   4 david git  4096 Dec  4 22:40 objects
  drwxrwsr-x   4 david git  4096 Nov  7 14:58 refs
  drwxrwsr-x   2 david git  4096 Dec  4 22:40 remotes
$ ls -l hooks/update (3)
  -r-xr-xr-x   1 david git  3536 Dec  4 22:40 update
$ cat info/allowed-users (4)
refs/heads/master       alice\|cindy
refs/heads/doc-update   bob
refs/tags/v[0-9]*       david
  1. place the developers into the same git group.

  2. and make the shared repository writable by the group.

  3. use update-hook example by Carl from Documentation/howto/ for branch policy control.

  4. alice and cindy can push into master, only bob can push into doc-update. david is the release manager and is the only person who can create and push version tags.

HTTP server to support dumb protocol transfer.
dev$ git update-server-info (1)
dev$ ftp user@isp.example.com (2)
ftp> cp -r .git /home/user/myproject.git
  1. make sure your info/refs and objects/info/packs are up-to-date

  2. upload to public HTTP server hosted by your ISP.

Git| for the lazy - beginner

Git for the lazy - Spheriki


git is a distributed version control system. No, you don't need to know what that means to use this guide. Think of it as a time machine: Subversion or CVS without the suck.

If you make a lot of changes, but decided you made a mistake, this will save your butt.

This guide is for people who want to jump to any point in time with their project/game/whatever, and want something to use for themselves.



Install git

Windows

  1. Download Cygwin.
  2. Put setup.exe in a folder of its own in your documents.
  3. Launch setup.exe.
  4. While installing Cygwin, pick these packages:
    • git from the DEVEL category
    • nano (if you're wimpy) or vim (if you know it), both in the EDITORS category

You'll now have a shortcut to launch Cygwin, which brings up something like the Linux terminal.

Linux

Install the git package using your preferred method (package manager or from source).


Introduce yourself to git

Fire up your Cygwin/Linux terminal, and type:

git config --global user.name "Joey Joejoe"
git config --global user.email "joey@joejoe.com"

You only need to do this once.


Start your project

Start your project using the Sphere editor, or from a ZIP file, or just by making the directory and adding files yourself.

Now cd to your project directory:

cd myproject/ 

Tell git to start giving a damn about your project:

git init 

... and your files in it:

git add . 

Wrap it up:

git commit 

Now type in a "commit message": a reminder to yourself of what you've just done, like:

Initial commit. 

Save it and quit (type Ctrl+o Ctrl+x if you're in nano, :x if you're in vim) and you're done!


Work in bits

When dealing with git, it's best to work in small bits. Rule of thumb: if you can't summarise it in a sentence, you've gone too long without committing.

This section is your typical work cycle:

  1. Work on your project.
  2. Check which files you've changed:
    git status
  3. Check what the actual changes were:
    git diff
  4. Add any files/folders mentioned in step 2 (or new ones):
    git add file1 newfile2 newfolder3
  5. Commit your work:
    git commit
  6. Enter and save your commit message. If you want to back out, just quit the editor.

Repeat as much as you like. Just remember to always end with a commit.


Admire your work

To see what you've done so far, type:

git log 

To just see the last few commits you've made:

git log -n3 

Replace 3 with whatever you feel like.

For a complete overview, type:

git log --stat --summary 

Browse at your leisure.


View changes

To view changes you haven't committed yet:

git diff 

If you want changes between versions of your project, first you'll need to know the commit ID for the changes:

git log --pretty=oneline 
6c93a1960072710c6677682a7816ba9e48b7528f Remove persist.clearScriptCache() function.
c6e7f6e685edbb414c676df259aab989b617b018 Make git ignore logs directory.
8fefbce334d30466e3bb8f24d11202a8f535301c Initial commit.

The 40 characters at the front of each line are the commit ID. You'll also see them when you git commit. You can use them to show differences between commits.

To view the changes between the 1st and 2nd commits, type:

git diff 8fef..c6e7 

Note how you didn't have to type the whole thing, just the first few unique characters are enough.
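To see why a short prefix is enough, here is a small Python sketch of the idea. It is not how git does it internally; resolve is a made-up helper, and COMMITS is just the three IDs from the log output above. A prefix works as long as exactly one commit ID starts with it.

```python
# The three commit IDs from the log output above.
COMMITS = [
    "6c93a1960072710c6677682a7816ba9e48b7528f",
    "c6e7f6e685edbb414c676df259aab989b617b018",
    "8fefbce334d30466e3bb8f24d11202a8f535301c",
]


def resolve(prefix, commits=COMMITS):
    """Expand an abbreviated commit ID, refusing prefixes that are not unique."""
    matches = [c for c in commits if c.startswith(prefix)]
    if len(matches) != 1:
        raise ValueError("%r matches %d commits" % (prefix, len(matches)))
    return matches[0]


print(resolve("8fef"))
print(resolve("c6e7"))
```

In a bigger project you may need a few more characters before a prefix becomes unique.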

To view the last changes you made:

git diff HEAD^..HEAD 


How to fix mistakes

Haven't committed yet, but don't want to save the changes? You can throw them away:

git reset --hard 


You can also do it for individual files, but it's a bit different:

git checkout myfile.txt 


Messed up the commit message? This will let you re-enter it:

git commit --amend 


Forgot something in your last commit? That's easy to fix.

git reset --soft HEAD^ 

Add that stuff you forgot:

git add forgot.txt these.txt 

Then write over the last commit:

git commit 

Don't make a habit of overwriting/changing history if it's a public repo you're working with, though.


For the not so lazy

Just some extra reading here. Skip it if you're lazy.


Writing good commit messages

This part is all opinion, but worth reading.

Your first line should be a summary of the commit changes in a single sentence. It should be 50 characters or less. It should be in present tense: this matches up with git's merge commit messages, which you haven't met yet, but you'll eventually run into when you hit branching and merging.

The remaining body should go into more detail if needed. I use point form: a space, an asterisk (*), another space, followed by the point in detail.

e.g.

Add feature X to subsystem Y.

 * Feature X isn't working well with feature Z. Worth investigating.
 * Feature X still doesn't work for inputs A, B and C.
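
If you want the 50-character rule checked for you, a small shell function can do it. This is only a sketch: check_commit_message is a made-up name, and it checks nothing beyond the length of the first line.

```shell
#!/bin/sh
# check_commit_message FILE
# Succeeds if the first line of FILE is non-empty and at most
# 50 characters long; otherwise prints a reason and fails.
check_commit_message() {
    summary=$(sed -n '1p' "$1")
    if [ -z "$summary" ]; then
        echo "summary line is empty" >&2
        return 1
    fi
    if [ "${#summary}" -gt 50 ]; then
        echo "summary is ${#summary} characters (limit 50)" >&2
        return 1
    fi
    return 0
}
```

git passes the path of the message file as the first argument to the commit-msg hook, so one way to use this would be to put the function in .git/hooks/commit-msg and end the script with check_commit_message "$1".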


Ignoring files

When you check your project status, sometimes you'll get something like this:

git status 
# On branch master
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       bleh.txt
#       module.c~
nothing added to commit but untracked files present (use "git add" to track)

If you don't want git to track these files, you can add entries to .gitignore:

nano .gitignore 

And add the files you want ignored:

bleh.txt
*~

The first line ignores bleh.txt. The second line ignores all files and directories ending with a tilde (~), i.e. backup files.

You can check if you got it right:

git status 
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#
#       modified:   .gitignore
#
no changes added to commit (use "git add" and/or "git commit -a")

Don't forget to commit your changes to .gitignore!

git add .gitignore
git commit

With something like this for your commit message:

Make git ignore bleh.txt and backup files. 

Use .gitignore to keep your status output clean, and stop git from bugging you about stuff you don't care about. It's a good idea to ignore things like executable binaries, object files, etc. Pretty much anything that can be regenerated from source.
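
To check which files a set of patterns actually catches, you can ask git directly. A minimal sketch, assuming a git recent enough to have the check-ignore command; it uses a throwaway repository and the same two patterns as above, plus a made-up module.c that should not be ignored.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# The same two patterns as in the text.
printf 'bleh.txt\n*~\n' > .gitignore
touch bleh.txt module.c module.c~

# check-ignore exits 0 when a path is ignored and 1 when it is not.
ignored=""
for f in bleh.txt module.c module.c~; do
    if git check-ignore -q "$f"; then
        ignored="$ignored $f"
    fi
done
echo "ignored:$ignored"
```

Only bleh.txt and module.c~ should be reported; module.c matches neither pattern.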


Branching and merging

A branch is a separate line of development. If you're going to make a bunch of changes related to a single feature, it might be a good idea to make a "topic branch": a branch related to a topic/feature.

To make a new branch:

git branch feature_x 

To view the current branches:

git branch 
  feature_x
* master

The asterisk (*) shows your current branch. master is the default branch, like the trunk in CVS or Subversion.

To switch to your new branch, just type:

git checkout feature_x 

If you check the branches again, you'll see the switch:

git branch 
* feature_x
  master

Now go through the usual edit/commit cycle. Your changes will go onto the new branch.

When you want to put your branch changes back onto master, first switch to master:

git checkout master 

Then merge the branch changes:

git merge feature_x 

This will combine the changes of the master and feature_x branches. If master hasn't changed since you branched, git will just "fast-forward" it to the tip of feature_x so master is up to date. Otherwise, the changes from master and feature_x will be combined in a new merge commit.

You can see the commit in your project's log:

git log -n1 
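
The whole branch, merge, and delete cycle can be tried out safely in a throwaway repository. In this sketch the file name, identity, and commit messages are invented, and the default branch name is looked up rather than assumed, since newer git setups may not call it master. The --ff-only flag makes git refuse anything but a fast-forward, which is what's expected here because master doesn't change.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit."

# Remember the default branch name (master in the text above).
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Do some work on a topic branch.
git branch feature_x
git checkout -q feature_x
echo "feature" >> app.txt
git commit -q -a -m "Add feature X."

# Switch back and merge; only a fast-forward is allowed.
git checkout -q "$main_branch"
git merge -q --ff-only feature_x

main_tip=$(git rev-parse "$main_branch")
feature_tip=$(git rev-parse feature_x)

# The branch is fully merged, so -d will delete it.
git branch -d feature_x
remaining=$(git branch --list feature_x)
```

After the merge both branch names point at the same commit, which is all a fast-forward is.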

If you're happy with the result, and don't need the branch any more, you can delete it:

git branch -d feature_x 

Now when you see the branches, you'll only see the master branch:

git branch 
* master 

You can make as many branches as you need at once.


Tags

If you hit a new version of your project, it may be a good idea to mark it with a tag. Tags can be used to easily refer to older commits.

To tag the current version of your project as "v1.4.2", for example:

git tag v1.4.2 

You can use these tags in places where those 40-character IDs appear.
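
For example, a tag can stand in for a commit ID in git diff or git rev-parse. The sketch below uses a throwaway repository with an invented file and commit messages.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "release" > version.txt
git add version.txt
git commit -q -m "Release 1.4.2."
git tag v1.4.2

echo "next" >> version.txt
git commit -q -a -m "Start work on the next version."

# The tag resolves to the commit it was placed on,
# which is now the parent of HEAD.
tagged=$(git rev-parse v1.4.2)
parent=$(git rev-parse HEAD^)

# Everything that changed since the tagged version.
since_release=$(git diff v1.4.2..HEAD)
echo "$since_release"
```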


What now?

git can help with working with other people too. Of course, then you do have to learn about distributed version control. Until then, just enjoy this page.

But if you want to learn:

Main git selling points (ripped off the main site):

  • Distributed development, i.e. working with other people.
  • Strong support for non-linear development, i.e. working with other people at the same time!
  • Efficient handling of large projects, i.e. fast!
  • Cryptographic authentication of history, for the paranoid.
  • Scriptable toolkit design, you can script pretty much any git task.


If something doesn't seem right or is confusing, contact me at my blog. --tunginobi 10:14, 28 February 2009 (GMT)

Git|大步向前走: [Linux][Software] A Brief Introduction to Using git-svn

大步向前走: [Linux][Software] A Brief Introduction to Using git-svn


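The git version control software has been out for a while now. Its author is Linus Torvalds, so for one thing the quality is good, because he writes code very well, and for another, thanks to his reputation, it is widely supported on Linux (??). (git support on Win32 is scarce, but it is worth a look.)

What I want to introduce here is git-svn, a transitional way of using git on top of svn, along with a comparison between git and svn.
First, the advantages of this approach:
1. It keeps svn's strengths. svn has a bunch of properties that I'm very happy using, such as keywords, an ignore list that is easy to set up, executable, permission, and ownership -> so the repo itself stays in svn.
2. It is faster than svk. (svk communicates in a transactional style, so if it dies halfway through it starts over from the beginning. If a single commit runs to gigabytes and it dies, I think you'll want to cry, because it also eats a lot of memory and the performance is just as bad. Git is much better.)
3. The repo is distributed, so you can commit locally and only "merge" to the server once testing is done. ("merge" here is the svk/svn term.)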

Usage:

Before getting started, let's do a little setup and write a few things to $HOME/.gitconfig:
[user]
name = Anton Yu
email = xxxx@gmail.com
[color]
diff = auto
status = auto
branch = auto
[alias]
st = status
rb = svn rebase
ci = commit -a
co = checkout

These settings include the aliases, your personal email, and git's colored diff output, among other things.
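
The same settings can also be written with git config instead of editing the file by hand. In this sketch they go to a scratch file so your real configuration is left alone; replacing --file "$cfg" with --global would write to $HOME/.gitconfig instead.

```shell
#!/bin/sh
set -e
# Scratch file standing in for $HOME/.gitconfig.
cfg=$(mktemp)

git config --file "$cfg" user.name "Anton Yu"
git config --file "$cfg" user.email "xxxx@gmail.com"
git config --file "$cfg" color.diff auto
git config --file "$cfg" color.status auto
git config --file "$cfg" color.branch auto
git config --file "$cfg" alias.st status
git config --file "$cfg" alias.rb "svn rebase"
git config --file "$cfg" alias.ci "commit -a"
git config --file "$cfg" alias.co checkout

# Read one value back to confirm it was stored.
rb_alias=$(git config --file "$cfg" --get alias.rb)
echo "git rb runs: git $rb_alias"
```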

First, pick a place to put the code:
$ mkdir gtalkbot
Then find the svn URL, for example:
https://gtalkbot.googlecode.com/svn/trunk/
Run:
$ git svn init https://gtalkbot.googlecode.com/svn/trunk/ gtalkbot/
$ cd gtalkbot
$ git svn fetch


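When that finishes, you'll find that you now have a .git directory here. Unlike svn, though, git doesn't put one in every subdirectory; only the root dir has a .git. That makes it easy to do an (export) without any special command.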

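To update, run git rb (which automatically runs git svn rebase)
To delete a file, run git rm $file
To add a file, run git add $file
To commit locally, run git ci (the alias above, i.e. git commit -a)
To commit to svn, run git svn dcommit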

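However, props stop working under git, even "svn:executable". So I don't recommend that svn users move over to this SCM platform without thinking it through carefully. After all, its life as a piece of software has only just begun.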

GIT|SVN + GIT = Having Your Cake and Eating It Too

SVN + GIT = Having Your Cake and Eating It Too - Step Third - JavaEye技术网站


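Keywords: svn git
I've been using git for a while now, and from the very first day I planned to gradually give up svn.

Everything svn can do, git can do too, and do better. On top of that, git has many features svn can't come close to matching, so what reason is there to keep using svn?

Well, plenty of reasons. For example: git's performance problems on Windows, the fact that TortoiseGIT hasn't been developed yet (or maybe there's no such plan at all?), other people on the team not being used to git... and so on.

So, can we have our cake and eat it too?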

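=== The Pain and Itch of SVN ===

svn's biggest problem is that it doesn't support distributed development. Distributed doesn't necessarily mean a large-scale collaborative setting like the Linux kernel.

For example, say you want to take unfinished work home, but you can't reach the company's svn server from home, so you can't commit. That, too, is really a distributed development scenario.

You might say: then just don't commit... I can't do that. I have a bad habit: I often make small changes, then regret them ten minutes later and want to change them back. Only by committing often can I get back to the previous change, or the one before that.

Of course, because of this bad habit I don't commit to trunk or the main branch, or I'd get beaten to death :-)
So I often have a lot of temporary branches to merge, and very frequently... merging in svn is no fun.

It has to be said that svn's repository design is poor. It's slow, and especially once a project grows and its development cycle gets long, the repository balloons quickly. Having .svn directories all over the project tree is annoying too.

On the other hand, TortoiseSVN really is convenient, and many people use SVN precisely for that convenience.
The IDEs that support SVN are too many to count as well.

SVN: both a pain and an itch....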

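=== The Power of GIT ===

git is fast, really fast, faster than Little Li's flying dagger... (on Linux, of course).
Try checking out the various tags of the Linux kernel; you can't help admiring that speed, heh.

For small projects speed doesn't really matter, since a few seconds make no difference, but git has plenty of other cool things.

git diff is powerful, really powerful. It compares any two historical versions, and does it quickly.

Creating a branch in git is almost too easy, and merging branches is a pleasure as well, not to mention three-way merge. There are plenty of other cool features too, such as merging with someone else's git tree... All of these come, more or less, from git being distributed.

As for features such as committing via email, which small teams generally never need, I won't go into them.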

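=== Having Your Cake and Eating It Too ===

First, keep using svn as usual; the main version management stays in svn (to look after the team).

Then create a git repository in the project directory: git init.
This adds just one .git directory at the project root, unlike svn or cvs, which leave their junk in every subdirectory.

Next, create a .gitignore file and list in it the files git should not manage, for example .svn. Alternatively, go into .git/info and edit the exclude file.

Add everything to git: git add . (then git commit to record the first snapshot).

Done. It's that simple.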
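
The setup steps above can be sketched as a short script. Since a real svn checkout isn't available here, a scratch directory with a fake .svn folder stands in for the working copy; the file names, identity, and commit message are made up.

```shell
#!/bin/sh
set -e
# Stand-in for an svn working copy: a scratch directory with a
# fake .svn folder, just to show that git skips it.
work=$(mktemp -d)
cd "$work"
mkdir .svn src
echo "svn metadata" > .svn/entries
echo "int main(void) { return 0; }" > src/main.c

git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Keep svn's bookkeeping out of git.
echo ".svn" > .gitignore

git add .
git commit -q -m "Import svn working copy."

# Files git is now tracking: .gitignore and src/main.c only.
tracked=$(git ls-files)
echo "$tracked"
```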

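From now on, manage all your small, temporary changes with git. It's fast and precise, and it doesn't affect anyone else, because you're only using your local git repository, which has nothing to do with other people.

Everyone creates their own git tree and nobody gets in anyone else's way. Of course, if you'd like to be able to merge someone else's tree some day, set up a bare public tree; everyone clones it and works on their own branch, committing offline as usual and pushing when needed.

Working at home? No problem, you can still commit; git is distributed.

Back at the office and want to commit to svn? No problem: in git, check out the "working code" version you want, commit it in svn, then check out HEAD again in git (that is, switch back to the tip of your branch, e.g. git checkout master) and carry on.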
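
The git half of that last step might look like the sketch below. It runs in a throwaway repository with invented file contents and commit messages; the svn commit itself is only indicated by a comment, because it needs a real svn working copy.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "stable" > app.txt
git add app.txt
git commit -q -m "Working version."
good=$(git rev-parse HEAD)
branch=$(git rev-parse --abbrev-ref HEAD)

echo "half-finished experiment" >> app.txt
git commit -q -a -m "Work in progress."

# Step back to the known-good commit (this detaches HEAD).
git checkout -q "$good"
at_good=$(cat app.txt)
# ... at this point you would run: svn commit -m "..."

# Return to the tip of the branch and carry on.
git checkout -q "$branch"
at_tip=$(tail -n 1 app.txt)
echo "committed to svn: $at_good / back at: $at_tip"
```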

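=== Conclusion ===

Combining svn and git brings the following benefits:
1) No conflict with teammates who use svn alone
2) You get the benefits of git's distributed design
3) It satisfies the requirement that only working code is committed to svn
4) svn manages things at a coarse granularity, which eases the load on the svn repository.
5) svn keeps its advantage of GUI convenience.

So, SVN + GIT = having your cake and eating it too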