Upgrading Drupal with Git
The Drupal development team released Drupal 5.2 on July 26, 2007. It fixes two security vulnerabilities, so it is highly recommended that you upgrade as soon as possible. Many Drupal installations contain extra Drupal modules, and they almost always also contain local customizations.
The question arises: how do you upgrade your Drupal installation
- in a timely manner,
- safely,
- with confidence that none of your local customizations are lost,
- without needing to remember each line that was edited and re-apply those edits to the new version,
- without needing to drastically change your workflow (very little discipline is actually needed)?
We show that Git solves those problems easily.
Table of Contents
- Assumptions
- Lines of development
- Importing Drupal source code
- Importing Drupal modules
- Local customizations of Drupal
- Upgrading Drupal
- Links
- Final words
Git is a distributed version control system created by Linus Torvalds to help with the development of the Linux kernel. Distributed version control systems, such as Git, are contrasted with centralized version control systems, such as Subversion. Linux kernel development is characterized by hundreds of contributors and several dozen development sub-projects, all spread out across the Internet. The repositories contain thousands of files and many thousands of revisions.
We show that Git is also capable of handling much more lightweight problems, without any unnecessary overhead, and with only half a dozen commands to remember.
Assumptions
You will need:
- command line access to any modern Linux distribution or FreeBSD installation;
- basic knowledge of Linux command line and file system;
- sufficiently good knowledge of Drupal installation and maintenance;
- installed Git (consult the documentation);
The code you work with resides in working copies. The entire history of the code that you keep in Git is stored in a repository, which is just a single subdirectory with a special name (.git/) at the top level of the working copy. By default, a repository and its working copy are bound together very tightly (unlike in Subversion).
We assume that you use Drupal 5.x with several modules, and several (or many) local customizations. Drupal and modules could be upgraded separately at any time. Local customizations could be rather extensive.
Our real installation which will be used as an example uses Drupal 5.1 (painlessly upgraded to 5.2 immediately after the release). We use the following:
- Adsense module;
- Google Analytics module;
- Print Friendly pages (which also had a security vulnerability up to a recently released 5.x-1.2 version).
Our local customizations are simple one-line fixes, e.g.:
- single line added to robots.txt, to exclude our “obsolete content” page from attention of spiders;
- several redirects for legacy URLs added to top-level .htaccess file;
- PHP set_time_limit() function is not allowed by our hosting provider, so its occurrences are commented out;
- sites/default/settings.php, where we have specified our MySQL server login and password (this is obviously the most common customization).
The changes are few, but we expect them to accumulate over time as we add bug fixes and the like.
Lines of development
Basically, you have three lines of development: Drupal, modules and your own. Correspondingly, we shall use three working copies (each with its own repository):
- drupal, containing pristine Drupal 5.x versions;
- drupal-and-modules, containing most recent version of Drupal together with current version of modules you use;
- drupal-production, containing most recent version of Drupal, current version of modules, and your local customizations.
Those three repositories will be chained together, with changes propagated from left to right. So,
- after you commit new version of Drupal to the drupal working copy, Git can merge changes automatically to drupal-and-modules, and then to drupal-production;
- also, when you commit new modules to drupal-and-modules, Git can merge changes automatically to drupal-production;
- your local customizations are committed to drupal-production;
- drupal-production is what actually gets copied to the web server.
That will be the complete workflow, basically.
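The chain of three repositories can be tried out end to end on throwaway data before touching a real site. The following is only a sketch under stated assumptions: it uses the modern `git <command>` spelling rather than the dashed `git-<command>` form used elsewhere in this article, the one-line files are stand-ins for real Drupal code, the commit identity is a placeholder, and `--no-rebase --no-edit` is passed to `git pull` so that recent Git versions merge without prompting.

```shell
#!/bin/sh
set -eu

# Placeholder identity so that commits work in a throwaway environment.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
cd "$work"

# 1. drupal: pristine upstream code (a one-line stand-in here).
git init -q drupal
echo "Drupal 5.1" > drupal/CHANGELOG.txt
git -C drupal add .
git -C drupal commit -q -m "Drupal 5.1 imported"

# 2. drupal-and-modules: a clone of drupal, plus the modules.
git clone -q drupal drupal-and-modules
mkdir -p drupal-and-modules/modules/adsense
echo "name = Adsense" > drupal-and-modules/modules/adsense/adsense.info
git -C drupal-and-modules add modules
git -C drupal-and-modules commit -q -m "Adsense imported"

# 3. drupal-production: a clone of drupal-and-modules, plus local edits.
git clone -q drupal-and-modules drupal-production
echo "Disallow: /obsolete/" > drupal-production/robots.txt
git -C drupal-production add robots.txt
git -C drupal-production commit -q -m "Local customizations"

# Upgrade: commit the new upstream version, then pull it down the chain.
echo "Drupal 5.2, 2007-07-26" > drupal/CHANGELOG.txt
git -C drupal commit -q -a -m "Drupal 5.2 imported"
git -C drupal-and-modules pull -q --no-rebase --no-edit
git -C drupal-production pull -q --no-rebase --no-edit
```

After the two pulls, drupal-production should hold the new CHANGELOG.txt, the module, and the local robots.txt at the same time, which is the property the whole workflow relies on.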
NB: if you already have a Drupal installation, you will only need to remember its version, and extra modules you have installed. Then we will reproduce that installation, and cleanly import your existing changes: you will also have a chance to clearly look at them.
Importing Drupal source code
Let's start from the previous version of Drupal, so that we can simulate the upgrade.
$ mkdir ~/src
$ cd ~/src
$ wget http://ftp.drupal.org/files/projects/drupal-5.1.tar.gz
$ mkdir drupal
(that will be our first working copy)
NB: your tar may not support this command line: what we need is to extract the tar archive into a directory named `drupal'
$ tar --strip-components=1 -C drupal -zxv -f drupal-5.1.tar.gz
$ cd drupal
(verify that you actually have the Drupal source correctly extracted: `index.php' and `INSTALL.txt' should be here)
$ git-init
Initialized empty Git repository in .git/
$ git-add .
$ git-commit -a -m "Drupal 5.1 imported"
[…]
 create mode 100644 themes/pushbutton/tabs-option-off.png
 create mode 100644 themes/pushbutton/tabs-option-on.png
 create mode 100644 update.php
 create mode 100644 xmlrpc.php
$ git-tag drupal-5.1
Now you have a working copy, and a repository in the `.git/' directory. Verify that using ls.
$ git-status
# On branch master
nothing to commit (working directory clean)
That's all with the first working copy (drupal) for now.
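If your tar does not understand --strip-components, one possible fallback is to extract the archive normally and then copy the contents of its top-level directory into drupal/. The sketch below is an assumption about how that could be done, not part of the original procedure; it builds a tiny stand-in archive instead of downloading the real one so that it can be run anywhere.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Stand-in for the downloaded archive: a single top-level directory,
# which is how the Drupal tarball is laid out.
mkdir drupal-5.1
echo "<?php" > drupal-5.1/index.php
echo "install notes" > drupal-5.1/INSTALL.txt
echo "# rules" > drupal-5.1/.htaccess
tar -czf drupal-5.1.tar.gz drupal-5.1
rm -r drupal-5.1

# Fallback for --strip-components=1: extract as-is, then copy the
# contents of the top-level directory (dotfiles included) into drupal/.
mkdir drupal
tar -xzf drupal-5.1.tar.gz
cp -R drupal-5.1/. drupal/
rm -r drupal-5.1
```

The `drupal-5.1/.` form makes cp copy hidden files such as .htaccess too, which a plain `drupal-5.1/*` glob would skip.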
Importing Drupal modules
Now we shall create second working copy, called drupal-and-modules.
$ cd ~/src
$ git-clone drupal drupal-and-modules
Initialized empty Git repository in /home/alexm/src/drupal-and-modules/.git/
remote: Generating pack…
Done counting 329 objects.
Deltifying 329 objects…
remote: 100% (329/329) done
Indexing 329 objects…
 100% (329/329) done
Resolving 32 deltas…
 100% (32/32) done
remote: Total 329 (delta 32), reused 0 (delta 0)
Download and extract several modules (the same versions that are used in your current installation), and commit them to Git.
$ cd drupal-and-modules
$ cd modules
$ wget http://ftp.drupal.org/files/projects/adsense-5.x-1.6.tar.gz
$ tar zxfv adsense-5.x-1.6.tar.gz
[…]
adsense/po/adsense.pot
adsense/po/adsense_help-inc.pot
adsense/po/general.pot
adsense/po/ru.po
adsense/LICENSE.txt
$ git-add adsense
$ git-commit -m "Adsense 5.x-1.6 imported" adsense
[…]
 create mode 100644 modules/adsense/po/adsense.pot
 create mode 100644 modules/adsense/po/adsense_help-inc.pot
 create mode 100644 modules/adsense/po/general.pot
 create mode 100644 modules/adsense/po/ru.po
Repeat the “wget, tar, git-add, git-commit” routine with other modules (e.g., google_analytics and print).
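With more than a couple of modules, the routine can be wrapped in a loop. The following sketch makes several assumptions: archive names follow the `module-version.tar.gz` pattern seen for Adsense, module names contain no hyphen, each archive unpacks into a directory named after the module, and the archives are already downloaded (tiny stand-ins are created here in place of wget). It uses the modern `git <command>` spelling and a placeholder commit identity.

```shell
#!/bin/sh
set -eu

# Placeholder identity so that commits work in a throwaway environment.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
cd "$work"
git init -q drupal-and-modules
mkdir drupal-and-modules/modules
cd drupal-and-modules/modules

releases="google_analytics-5.x-1.2 print-5.x-1.1"

# Stand-ins for the archives that wget would normally fetch.
for release in $releases; do
  module=${release%%-*}            # text before the first hyphen
  mkdir "$module"
  echo "name = $module" > "$module/$module.info"
  tar -czf "$release.tar.gz" "$module"
  rm -r "$module"
done

# The routine itself: tar, git add, git commit, once per module.
for release in $releases; do
  module=${release%%-*}
  tar -xzf "$release.tar.gz"
  git add "$module"
  git commit -q -m "$release imported"
  rm "$release.tar.gz"             # the archive is not needed afterwards
done
```

Committing each module separately keeps one log entry per module version, which makes later module upgrades easy to spot in the history.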
So, our second working copy contains Drupal and modules. We may verify it in several ways.
a) We may look at the entire log of our changes, using git-log.
$ git-log
commit 1fb029d1466f1d3ee405bbee3025640f0add0c90
Author: Alexey Mahotkin <alexm@mynd.rinet.ru>
Date: Sun Jul 29 02:32:29 2007 +0400
Print 5.x-1.1 imported
commit 542fbb792f68523120158bf8a65cb754d08ef906
Author: Alexey Mahotkin <alexm@mynd.rinet.ru>
Date: Sun Jul 29 02:31:40 2007 +0400
Google Analytics 5.x-1.2 imported
commit 5c9a015a0070971f8380a4a2aa452363f1437f05
Author: Alexey Mahotkin <alexm@mynd.rinet.ru>
Date: Sun Jul 29 02:23:20 2007 +0400
Adsense 5.x-1.6 imported
commit b2d7641c66ff6a597b822745f5148e406ef86913
Author: Alexey Mahotkin <alexm@mynd.rinet.ru>
Date: Sun Jul 29 02:08:27 2007 +0400
Drupal 5.1 imported
b) We may look at the brief list of files that were added and committed in that working copy, using git-diff and diffstat.
$ git-diff drupal-5.1 | diffstat
 adsense/LICENSE.txt | 274 +++++++
 adsense/README.txt | 96 ++
 adsense/adsense.info | 10
 adsense/adsense.install | 55 +
 […]
 print/print.info | 8
 print/print.module | 382 +++++++++
 print/print.node.tpl.php | 53 +
 print/print.profile.tpl.php | 44 +
 37 files changed, 6206 insertions(+)
c) We may see that git-status shows all the .tar.gz files which we downloaded. They are not needed and could be deleted.
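If you would rather keep the downloaded archives around, Git can be told to stop listing them. One way, shown here as a sketch and not something the original setup requires, is to add a pattern to `.git/info/exclude`, which works like `.gitignore` but is local to the repository and never committed. The empty archive below is a stand-in for a real download.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q drupal-and-modules
cd drupal-and-modules
mkdir modules
: > modules/adsense-5.x-1.6.tar.gz   # stand-in for a downloaded archive

git status --short                   # the archive's directory shows up as untracked

# Ignore all tarballs in this repository only; nothing is added to the tree.
mkdir -p .git/info
echo '*.tar.gz' >> .git/info/exclude

git status --short                   # now prints nothing
```

Using the exclude file rather than a committed `.gitignore` keeps the working copy identical to the upstream file layout, so no extra file has to travel down the chain of repositories.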
Local customizations of Drupal source code
Finally, we create the third working copy, drupal-production, and populate it with our local modifications (or just copy files from existing installation here).
$ cd ~/src
$ git-clone drupal-and-modules drupal-production
Initialized empty Git repository in /home/alexm/src/drupal-production/.git/
remote: Generating pack…
Done counting 380 objects.
Deltifying 380 objects…
Indexing 380 objects…
 100% (380/380) done
remote: Total 380 (delta 47), reused 326 (delta 32)
 100% (380/380) done
Resolving 47 deltas…
 100% (47/47) done
Now, copy your existing installation into drupal-production, or just edit all the necessary files. In the future, you will edit the files in that working copy as needed.
NB: an important security issue: you have to edit the .htaccess file and add the line RewriteRule ^\.git - [F] near the end, before the final RewriteRule, as in this excerpt:
#RewriteRule module.php index.php?q=%1 [L]

RewriteRule ^\.git - [F]

# Rewrite current-style URLs of the form 'index.php?q=x'.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
</IfModule>
This is needed to protect the repository, which you may accidentally (or deliberately) put on the Web. Without this line an attacker could learn your MySQL login and password, which would be unfortunate.
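Because a later merge or a hasty edit could drop that line again, it may be worth checking for it before every deployment. The check below is a hypothetical addition, not part of the original procedure; it looks for the exact rule text in a stand-in .htaccess and sets a variable (`htaccess_ok`, a name invented for this sketch) that a deployment script could test.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Stand-in for the customized .htaccess of drupal-production.
cat > .htaccess <<'EOF'
<IfModule mod_rewrite.c>
  RewriteEngine on
  RewriteRule ^\.git - [F]
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
</IfModule>
EOF

# -F treats the pattern as a fixed string, so ^, \ and [ need no escaping.
if grep -qF 'RewriteRule ^\.git - [F]' .htaccess; then
  htaccess_ok=yes
else
  htaccess_ok=no
  echo "WARNING: .git/ is not protected by .htaccess" >&2
fi
```

This only confirms that the line is present in the file; whether the web server actually honours it still has to be verified by requesting a path under .git/ on the live site.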
If you have new files in your installation, find them with git-status and add them with git-add, so that they are included in the next commit:
$ git-status
# On branch master
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       NEW-FILE
$ git-add NEW-FILE
Now you may take a look at what has actually changed in your existing installation (and/or what edits you've just made), using git-diff. Our own changes look something like this:
$ git-diff
diff --git a/.htaccess b/.htaccess
index 46232a7..cf15272 100644
--- a/.htaccess
+++ b/.htaccess
@@ -74,6 +74,9 @@ DirectoryIndex index.php
 # the rewrite rules are not working properly.
 #RewriteBase /drupal
+ RewriteRule ^index\.html / [L,R=301]
+ RewriteRule ^books\.html /books/ [L,R=301]
+
 # Rewrite old-style URLs of the form 'node.php?id=x'.
 #RewriteCond %{REQUEST_FILENAME} !-f
 #RewriteCond %{REQUEST_FILENAME} !-d
diff --git a/css/styles.css b/css/styles.css
new file mode 100644
index 0000000..25cf3c4
--- /dev/null
+++ b/css/styles.css
@@ -0,0 +1,8 @@
+body {
+ background: white;
+ margin-left: 2em;
[… etc., etc… ]
Now you have to commit all your changes. The simplest way is to commit everything at once, with a single message:
$ git-commit -a -m "Imported existing installation"
Created commit 2d5853b: Imported existing installation
 3 files changed, 4 insertions(+), 2 deletions(-)
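Once the customizations are committed, Git can list them on demand by comparing drupal-production with the repository it was cloned from. The sketch below assumes the clone relationship described above, so that `@{upstream}` resolves to the drupal-and-modules branch; it uses one-line stand-in files, the modern `git <command>` spelling, and a placeholder commit identity.

```shell
#!/bin/sh
set -eu

# Placeholder identity so that commits work in a throwaway environment.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
cd "$work"

git init -q drupal-and-modules
echo "User-agent: *" > drupal-and-modules/robots.txt
echo "<?php" > drupal-and-modules/index.php
git -C drupal-and-modules add .
git -C drupal-and-modules commit -q -m "Drupal and modules"

git clone -q drupal-and-modules drupal-production
cd drupal-production
echo "Disallow: /obsolete/" >> robots.txt
git commit -q -a -m "Imported existing installation"

# Files that differ from the code we cloned from, i.e. our customizations.
changed=$(git diff --name-only '@{upstream}' HEAD)
echo "$changed"

# The commits that carry those customizations.
git log --oneline '@{upstream}..HEAD'
```

Dropping `--name-only` gives the full text of every local change, which is a convenient way to review the customizations before an upgrade.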
Upgrading Drupal
We're just a few steps away from the actual upgrade of your Drupal installation. It will almost certainly be easier than the initial preparation!
Our plan for upgrading is as follows:
- commit new version of Drupal to the drupal working copy (risk-free);
- automatically merge changes to the drupal-and-modules working copy (almost certainly risk-free);
- automatically merge changes to the drupal-production working copy (difficulty will depend on the amount of your changes and the amount of changes between versions).
Upgrading Drupal, pt. I, drupal
We simply unpack new Drupal archive into the first working copy:
$ cd ~/src
$ wget http://ftp.drupal.org/files/projects/drupal-5.2.tar.gz
$ tar --strip-components=1 -C drupal -zxv -f drupal-5.2.tar.gz
$ cd drupal
(verify that you have correctly extracted the new Drupal source by looking at the version number in CHANGELOG.txt)
$ git-status
(if this shows any "Untracked files", you should add them to the commit with git-add)
If you are interested, you may look at the exact changes made by Drupal developers by executing git-diff.
Let’s commit new version:
$ git-commit -a -m "Drupal 5.2 imported"
Created commit f1aa870: Drupal 5.2 imported
 90 files changed, 859 insertions(+), 511 deletions(-)
$ git-tag drupal-5.2
We are ready to move to the next part.
NB: There is one more reassuring thing to mention: Git combines immutability of history with the ability to roll back commits in a controlled way, leaving almost no trace. That means that if you mistakenly committed something that you shouldn't have, you could use
$ git-reset --soft HEAD^
(this rolls back the last commit)
$ git-checkout -f
(this reverts the working copy to the previous (good) commit)
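The rollback can be rehearsed safely in a scratch repository. This sketch repeats the two commands above in the modern `git <command>` spelling on a one-file stand-in repository with a placeholder identity; it only exercises the case of a mistakenly modified tracked file.

```shell
#!/bin/sh
set -eu

# Placeholder identity so that commits work in a throwaway environment.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
cd "$work"
git init -q drupal
cd drupal

echo "Drupal 5.1" > CHANGELOG.txt
git add .
git commit -q -m "Drupal 5.1 imported"

# A commit we regret.
echo "oops, the wrong archive was unpacked" > CHANGELOG.txt
git commit -q -a -m "Mistaken commit"

# Step 1: move the branch back by one commit; files stay as they are.
git reset --soft HEAD^
# Step 2: force the working copy back to what that commit recorded.
git checkout -q -f
```

In current Git versions, `git reset --hard HEAD^` performs both steps at once for a case like this one.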
Upgrading Drupal, pt. II, drupal-and-modules
Let’s merge changes into second working copy, drupal-and-modules:
$ cd ~/src
$ cd drupal-and-modules
$ git-pull
remote: Generating pack…
remote: Done counting 269 objects.
Result has 135 objects.
remote: Deltifying 135 objects…
 100% (135/135) done
Indexing 135 objects…
remote: Total 135 (delta 103), reused 0 (delta 0)
 100% (135/135) done
Resolving 103 deltas…
 100% (103/103) done
103 objects were added to complete this thin pack.
* refs/remotes/origin/master: fast forward to branch 'master' of /home/alexm/src/drupal/
  old..new: b2d7641..989fff6
Merge made by recursive.
 .htaccess | 30 +++-
 CHANGELOG.txt | 18 ++-
 INSTALL.txt | 12 +-
 […]
 themes/garland/style.css | 14 ++-
 themes/garland/template.php | 2 +
 update.php | 12 +-
 90 files changed, 859 insertions(+), 511 deletions(-)
You may now verify by looking into CHANGELOG.txt that this working copy was upgraded to Drupal 5.2. You may also see that all your extra modules: adsense, google_analytics, and print are still there.
As in the previous case, you may roll back this git-pull operation with git-reset --soft HEAD^ and git-checkout -f.
Upgrading Drupal, pt. III, drupal-production
Upgrading the third and final working copy is exactly the same as in the previous case.
The only risk is that your changes may conflict with changes made by Drupal developers. Conflict is a technical term: there is a simple and straightforward procedure for resolving conflicts. The next subsection is dedicated to it.
$ cd ~/src
$ cd drupal-production
$ git-pull
Auto-merged includes/common.inc
Auto-merged includes/locale.inc
Merge made by recursive.
 .htaccess | 30 +++-
 CHANGELOG.txt | 18 ++-
 INSTALL.txt | 12 +-
 […]
 themes/garland/print.css | 2 +
 themes/garland/style.css | 14 ++-
 themes/garland/template.php | 2 +
 update.php | 12 +-
 90 files changed, 859 insertions(+), 511 deletions(-)
The first two lines of git-pull output show that your changes were automatically merged with changes made in Drupal 5.2. You may wish to look at those files and make sure they look ok.
Thus, we are done. You should probably test your newly upgraded Drupal code before actually putting it into production. After that, we can simply copy this working copy to the directory served by the web server.
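One way to publish the code without the .git/ directory ever reaching the web root is `git archive`, which exports the committed files only. This is an alternative to copying the working copy as described above, not the method used for our own site; the `htdocs` directory is a hypothetical web root and the one-line index.php is a stand-in.

```shell
#!/bin/sh
set -eu

# Placeholder identity so that commits work in a throwaway environment.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
cd "$work"

git init -q drupal-production
echo "<?php" > drupal-production/index.php
git -C drupal-production add .
git -C drupal-production commit -q -m "Drupal 5.2 merged"

deploy="$work/htdocs"                # hypothetical web root
mkdir "$deploy"

# Export the committed tree as a tar stream and unpack it into the web root.
git -C drupal-production archive HEAD | tar -xf - -C "$deploy"
```

Only committed content is exported, so uncommitted edits and the repository itself stay behind; note that the export overwrites files but does not delete ones that were removed upstream.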
Upgrading Drupal, pt. IIIb: resolving conflicts
You may skip this section on first reading.
Sometimes merging leads to textual conflicts, which have to be resolved intelligently by a human being.
$ cd ~/src
$ cd drupal-production
$ git-pull
Auto-merged .htaccess
CONFLICT (content): Merge conflict in .htaccess
Auto-merged includes/common.inc
Auto-merged includes/locale.inc
Automatic merge failed; fix conflicts and then commit the result.
First rule of resolving conflicts: DON’T PANIC.
You may always either restore from backup or execute the command git-checkout -f which will restore from the previous (good) commit.
Let’s look at the conflicting file .htaccess, looking for the lines that start with <<<<<<<, =======, and >>>>>>>, called “conflict markers”. The conflicting fragment may look like this:
<<<<<<< HEAD:.htaccess
RewriteRule ^\.git - [F]

# If your site can be accessed both with and without the prefix www. you
# can use one of the following settings to force user to use only one option:
=======
# If your site can be accessed both with and without the 'www.' prefix, you
# can use one of the following settings to redirect users to your preferred
# URL, either WITH or WITHOUT the 'www.' prefix. Choose ONLY one option:
>>>>>>> f1c6337a3a0ab4860578177754358e29186e425d:.htaccess
First part of this fragment is YOUR code, the second part is the code from Drupal. We have to decide which part (or combination thereof) we leave in the file. We have four choices:
- leave our code, removing conflict markers and second part;
- leave Drupal code, removing conflict markers and first part;
- leave some combination of both parts, removing conflict markers and editing both parts;
- remove both parts and conflict markers altogether.
In this case we will go the hardest way, keeping our code (the RewriteRule directive) and the new version of the comment from the Drupal code (the second part of the conflicting fragment).
We remove conflict markers and edit both parts, so that the following remains:
RewriteRule ^\.git - [F]

# If your site can be accessed both with and without the 'www.' prefix, you
# can use one of the following settings to redirect users to your preferred
# URL, either WITH or WITHOUT the 'www.' prefix. Choose ONLY one option:
We mark the conflict as resolved, using
$ git-add .htaccess
Then as usual, we look at the files to verify that they are ok, and commit the results of merge:
$ git-commit -a -m "Drupal 5.2 merged"
Created commit 1fc2102: Drupal 5.2 merged
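The whole conflict procedure can be practised on a scratch pair of repositories before it happens on a real site. The sketch below is a simplified stand-in, assuming a one-line .htaccess that both sides edit; it uses the modern `git <command>` spelling, a placeholder identity, and `--no-rebase --no-edit` so that recent Git versions merge without prompting.

```shell
#!/bin/sh
set -eu

# Placeholder identity so that commits work in a throwaway environment.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
cd "$work"

git init -q drupal-and-modules
echo "# old comment" > drupal-and-modules/.htaccess
git -C drupal-and-modules add .htaccess
git -C drupal-and-modules commit -q -m "Drupal 5.1 imported"

git clone -q drupal-and-modules drupal-production

# Upstream rewrites the comment ...
echo "# new, longer comment" > drupal-and-modules/.htaccess
git -C drupal-and-modules commit -q -a -m "Drupal 5.2 imported"

# ... while we have edited the same line locally.
cd drupal-production
printf '%s\n' 'RewriteRule ^\.git - [F]' '# old comment, edited locally' > .htaccess
git commit -q -a -m "Local customizations"

# git pull exits with a non-zero status when the merge stops on a conflict.
if ! git pull -q --no-rebase --no-edit; then
  git diff --name-only --diff-filter=U    # files that still have conflicts
  # Resolve by hand: keep our rule, take the new upstream comment.
  printf '%s\n' 'RewriteRule ^\.git - [F]' '# new, longer comment' > .htaccess
  git add .htaccess                        # mark the conflict as resolved
  git commit -q -m "Drupal 5.2 merged"     # conclude the merge
fi
```

Here the resolved file is written by the script for the sake of repeatability; in practice this is the step where you open the file in an editor and remove the conflict markers yourself.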
Links
On Git:
- Git homepage;
- Git documentation, including the following pages for every tool we’ve used in this article:
- Git Documentation Wiki: tutorials, handbook and HOWTO documents;
- Articles on Git in the Version Control Blog;
- Drupal home;
- Drupal 5.2 release announcement;
- Adsense module for Drupal;
- Google Analytics module for Drupal;
- Print Friendly Pages for Drupal;
Final words
We successfully use the method described above to upgrade our Drupal installation. If you have questions, comments or suggestions, please comment here or write to us at squadette@gmail.com, and we'll try to help. Please consider digging this article or adding points at reddit if you find it useful.
If you’ve moved your Drupal installation to Git, taking inspiration from this article, please consider linking here from your blog or home site.
Thank you!
January 30th, 2008 at 4:49 pm
Thank you for this fantastic, detailed plan for maintaining Drupal sites in Git.
I would greatly appreciate your thoughts on one further extension of it. Some of the modules Agaric uses are ones that we are developing ourselves — these are contributed to Drupal.org but in the interest of not having to check in every change to CVS we have our own development version. This also applies to other modules which we patch and then contribute the patches– how do you recommend dealing with upstream changes that will be the same (since we contributed them!), or nearly the same, to changes already made on the production site?
Or will this system handle that with no change?
February 1st, 2008 at 5:52 am
Note that the step for going to the ‘modules’ directory misses a step and contains a typo:
First, the directory ‘modules’ must be created (in Drupal 5)
mkdir sites/all/modules
Second, it is 'sites' not 'site'
cd sites/all/modules
February 1st, 2008 at 8:43 pm
Also, this didn’t work: git-commit adsense -m “Adsense 5.x-1.6 imported”
The commit command doesn’t seem to repeat the directory just added, but should just be:
git-commit -m “Adsense 5.x-1.6 imported”
February 5th, 2008 at 7:42 pm
You’ve got some great ideas here for managing a Drupal installation with git, but I think it can be made a little simpler.
I think using branches, instead of repository clones, for drupal, drupal-and-modules and drupal-production would be more natural to git. For example, you would create the original git repository exactly the same way, but instead of "git-clone drupal drupal-and-modules" you would simply stay in your "drupal" repository and create a "drupal-and-modules" branch:
git checkout -b drupal-and-modules
You would then branch off again for drupal production:
# from your drupal repository on the drupal-and-modules branch
git checkout -b drupal-production
Then, when you need to upgrade drupal, you just checkout the master branch
git checkout master
Perform the same upgrade steps, then merge to the other branches when you’re ready.
February 7th, 2008 at 10:36 pm
I’m not very familiar with git. Is there anything here that would be more difficult to do with svn?
February 10th, 2008 at 9:27 pm
benjamin,
thank you for bug reports — I’ve fixed them all.
I believe you should use the new submodules features in Git 1.5.3. You would create ordinary Git repository with your development module, using git-cvsimport for tracking the CVS repository.
Then you would use this repository as a submodule of your drupal-and-modules repository.
Tutorial is available: http://git.or.cz/gitwiki/GitSubmoduleTutorial
So, the changes will flow 1. from Drupal CVS to your module repo; 2. from your module repo to the drupal-and-modules. When you’ll commit something of your work to the Drupal CVS, I guess everything will just merge automatically.
February 10th, 2008 at 9:34 pm
santry,
I believe that for tutorial my approach of multiple clones is much more accessible.
First, it’s easy to “screw up” the repository if you’re doing heavy branches here. My target audience is not Git experts, so they will have trouble if mental model is out of sync with the repository state.
Second, you will usually have several sites under this scheme, and it’s easier to just rsync the directory without worrying which branch is currently checked out there.
Third, I have to admit with shame that I do not completely understand how branches work in Git; I just haven't had the chance to use them in production.
February 10th, 2008 at 9:35 pm
hofo,
in short: merging will be much more tedious and error-prone.
In Git, merging really works like a charm.
In Subversion — wait for 1.5 where repeated merging will be much more integrated.
February 18th, 2008 at 5:18 pm
this line:
$ tar –strip-components=1 -C drupal -zxv -f drupal-5.2.tar.gz
results in this error:
tar: You may not specify more than one `-Acdtrux’ option
Urgh. That’s the line that’s supposed to make all of this worthwhile :-P
(I have read the tar man page more times than... ah, I got it. It needs to be a double dash for a long-form option!)
Needs to be:
$ tar --strip-components=1 -C drupal -zxv -f drupal-5.2.tar.gz
February 23rd, 2008 at 12:46 am
benjamin,
fixed! stupid wordpress.