28.1.1 Introduction to Version Control
VC allows you to use a version control system from within Emacs, integrating the version control operations smoothly with editing. It provides a uniform interface for common operations in many version control operations.
Some uncommon or intricate version control operations, such as altering repository settings, are not supported in VC. You should perform such tasks outside VC, e.g., via the command line.
This section provides a general overview of version control, and describes the version control systems that VC supports. You can skip this section if you are already familiar with the version control system you want to use.
• Why Version Control? |   | Understanding the problems it addresses. |
• Version Control Systems |   | Supported version control back-end systems. |
• VCS Concepts |   | Words and concepts related to version control. |
• VCS Merging |   | How file conflicts are handled. |
• VCS Changesets |   | How changes are grouped. |
• VCS Repositories |   | Where version control repositories are stored. |
• Types of Log File |   | The VCS log in contrast to the ChangeLog. |
28.1.1.1 Understanding the Problems it Addresses​
Version control systems provide you with three important capabilities:
- Reversibility: the ability to back up to a previous state if you discover that some modification you did was a mistake or a bad idea.
- Concurrency: the ability to have many people modifying the same collection of files knowing that conflicting modifications can be detected and resolved.
- History: the ability to attach historical data to your data, such as explanatory comments about the intention behind each change. Even for a programmer working solo, change histories are an important aid to memory; for a multi-person project, they are a vitally important form of communication among developers.
28.1.1.2 Supported Version Control Systems​
VC currently works with many different version control systems, which it refers to as back ends:
- SCCS was the first version control system ever built, and was long ago superseded by more advanced ones. VC compensates for certain features missing in SCCS (e.g., tag names for releases) by implementing them itself. Other VC features, such as multiple branches, are simply unavailable. Since SCCS is non-free, we recommend avoiding it.
- CSSC is a free replacement for SCCS. You should use CSSC only if, for some reason, you cannot use a more recent and better-designed version control system.
- RCS is the free version control system around which VC was initially built. It is relatively primitive: it cannot be used over the network, and works at the level of individual files. Almost everything you can do with RCS can be done through VC.
- CVS is the free version control system that was, until circa 2008, used by the majority of free software projects. Since then, it has been superseded by newer systems. CVS allows concurrent multi-user development either locally or over the network. Unlike newer systems, it lacks support for atomic commits and file moving/renaming. VC supports all basic editing operations under CVS.
- Subversion (svn) is a free version control system designed to be similar to CVS but without its problems (e.g., it supports atomic commits of filesets, and versioning of directories, symbolic links, meta-data, renames, copies, and deletes).
- Git is a decentralized version control system originally invented by Linus Torvalds to support development of Linux (his kernel). VC supports many common Git operations, but others, such as repository syncing, must be done from the command line.
- Mercurial (hg) is a decentralized version control system broadly resembling Git. VC supports most Mercurial commands, with the exception of repository sync operations.
- Bazaar (bzr) is a decentralized version control system that supports both repository-based and decentralized versioning. VC supports most basic editing operations under Bazaar.
- SRC (src) is RCS, reloaded—a specialized version-control system designed for single-file projects worked on by only one person. It allows multiple files with independent version-control histories to exist in one directory, and is thus particularly well suited for maintaining small documents, scripts, and dotfiles. While it uses RCS for revision storage, it presents a modern user interface featuring lockless operation and integer sequential version numbers. VC supports almost all SRC operations.
28.1.1.3 Concepts of Version Control​
When a file is under version control, we say that it is registered in the version control system. The system has a repository which stores both the file’s present state and its change history—enough to reconstruct the current version or any earlier version. The repository also contains other information, such as log entries that describe the changes made to each file.
The copy of a version-controlled file that you actually edit is called the work file. You can change each work file as you would an ordinary file. After you are done with a set of changes, you may commit (or check in) the changes; this records the changes in the repository, along with a descriptive log entry.
A directory tree of work files is called a working tree.
Each commit creates a new revision in the repository. The version control system keeps track of all past revisions and the changes that were made in each revision. Each revision is named by a revision ID, whose format depends on the version control system; in the simplest case, it is just an integer.
To go beyond these basic concepts, you will need to understand three aspects in which version control systems differ. As explained in the next three sections, they can be lock-based or merge-based; file-based or changeset-based; and centralized or decentralized. VC handles all these modes of operation, but it cannot hide the differences.
28.1.1.4 Merge-based vs Lock-based Version Control​
A version control system typically has some mechanism to coordinate between users who want to change the same file. There are two ways to do this: merging and locking.
In a version control system that uses merging, each user may modify a work file at any time. The system lets you merge your work file, which may contain changes that have not been committed, with the latest changes that others have committed.
Older version control systems use a locking scheme instead. Here, work files are normally read-only. To edit a file, you ask the version control system to make it writable for you by locking it; only one user can lock a given file at any given time. This procedure is analogous to, but different from, the locking that Emacs uses to detect simultaneous editing of ordinary files (see Interlocking). When you commit your changes, that unlocks the file, and the work file becomes read-only again. Other users may then lock the file to make their own changes.
Both locking and merging systems can have problems when multiple users try to modify the same file at the same time. Locking systems have lock conflicts; a user may try to check a file out and be unable to because it is locked. In merging systems, merge conflicts happen when you commit a change to a file that conflicts with a change committed by someone else after your checkout. Both kinds of conflict have to be resolved by human judgment and communication. Experience has shown that merging is superior to locking, both in convenience to developers and in minimizing the number and severity of conflicts that actually occur.
SCCS always uses locking. RCS is lock-based by default but can be told to operate in a merging style. CVS and Subversion are merge-based by default but can be told to operate in a locking mode. Decentralized version control systems, such as Git and Mercurial, are exclusively merging-based.
VC mode supports both locking and merging version control. The terms “commit" and “update" are used in newer version control systems; older lock-based systems use the terms “check in" and “check out". VC hides the differences between them as much as possible.
28.1.1.5 Changeset-based vs File-based Version Control​
On SCCS, RCS, CVS, and other early version control systems (and also in SRC), version control operations are file-based: each file has its own comment and revision history separate from that of all other files. Newer systems, beginning with Subversion, are changeset-based: a commit may include changes to several files, and the entire set of changes is handled as a unit. Any comment associated with the change does not belong to a single file, but to the changeset itself.
Changeset-based version control is more flexible and powerful than file-based version control; usually, when a change to multiple files has to be reversed, it’s good to be able to easily identify and remove all of it.
28.1.1.6 Decentralized vs Centralized Repositories​
Early version control systems were designed around a centralized model in which each project has only one repository used by all developers. SCCS, RCS, CVS, Subversion, and SRC share this kind of model. One of its drawbacks is that the repository is a choke point for reliability and efficiency.
GNU Arch pioneered the concept of distributed or decentralized version control, later implemented in Git, Mercurial, and Bazaar. A project may have several different repositories, and these systems support a sort of super-merge between repositories that tries to reconcile their change histories. In effect, there is one repository for each developer, and repository merges take the place of commit operations.
VC helps you manage the traffic between your personal workfiles and a repository. Whether the repository is a single master, or one of a network of peer repositories, is not something VC has to care about.
28.1.1.7 Types of Log File​
Projects that use a version control system can have two types of log for changes. One is the log maintained by the version control system: each time you commit a change, you fill out a log entry for the change (see Log Buffer). This is called the version control log.
The other kind of log is the file ChangeLog
(see Change Log). It provides a chronological record of all changes to a large portion of a program—typically one directory and its subdirectories. A small program would use one ChangeLog
file; a large program may have a ChangeLog
file in each major directory. See Change Log. Programmers have used change logs since long before version control systems.
Changeset-based version systems typically maintain a changeset-based modification log for the entire system, which makes change log files somewhat redundant. One advantage that they retain is that it is sometimes useful to be able to view the transaction history of a single directory separately from those of other directories. Another advantage is that commit logs can’t be fixed in many version control systems.
A project maintained with version control can use just the version control log, or it can use both kinds of logs. It can handle some files one way and some files the other way. Each project has its policy, which you should follow.
When the policy is to use both, you typically want to write an entry for each change just once, then put it into both logs. You can write the entry in ChangeLog
, then copy it to the log buffer with C-c C-a
when committing the change (see Log Buffer). Or you can write the entry in the log buffer while committing the change (with the help of C-c C-w
), and later use the C-x v a
command to copy it to ChangeLog
(see Change Logs and VC).