Mac in the Shell: The Difference Engine
Volume Number: 23 (2007)
Issue Number: 04
Column Tag: Mac in the Shell
Mac in the Shell: The Difference Engine
File differences for non-programmers
by Edward Marczak
Introduction
Far too often, SysAdmins make changes to files, and then are unsure which changes they've made. Or, for people investigating a system for the first time, files may be customized and you need to know how configuration files have been changed from a baseline. There's the venerable Unix tool diff -- probably the most widely used -- to help us find those answers. However, more and more GUI tools have nice ways to point out differences. This month, we look at ways to 'tell the difference'.
Genesis
diff is one of the older Unix utilities around. Originally intended for programmers to find differences in lines of code, it has come to be more generically useful. It can even be used to compare binary files and entire directories of files. Furthermore, a complementary program, patch, can provide an easy way to describe and pass those changes on to others that may require them.
While I extol the virtues of life in the shell, there are just times that a GUI tool works better (OK, emacs users, don't get your knickers in a twist). Most text editors have a way to diff files, and Apple supplies a fantastic differencing tool as part of the Developer Tools install. Let's look at a basic scenario.
Despite the nice GUI that Apple gives us for many tasks, as system administrators, there are still many text files that we need to touch to tap into the full power of the system. Postfix is one subsystem that comes to mind. It has many options and incredible features that are not exposed through the GUI. All of postfix's configuration files live in /etc/postfix, and they're all text (even though they may later be converted to binary or hash files for speed).
So, you start tweaking. And tweaking. And tweaking... and then something breaks. Postfix won't start, or, it does, but stops delivering mail. So, you reach for your backup and get things running again. However, you were tweaking for a reason. You still want to make some change that alters the behavior of postfix. But where did the wheels come off? diff to the rescue! Comparing your updated file with the working version should allow you to see what happened. Let's say, for arguments sake, you were updating the /etc/postfix/main.cf file. Your updated file is now named /etc/postfix/main.cf.updated, and the working version is back in place at /etc/postfix/main.cf.
Easy! Just change into the /etc/postfix directory, and run diff:
diff main.cf.updated main.cf
You'll get some output like this:
398c398
< recipient_delimiter = +
---
> #recipient_delimiter = +
Soooooo... what is that all about? The initial command without options simply compares the two files that you specify. By the way, the OS X man page for diff lists the invocation as "diff [OPTION]... FILES", and the BSD man page similarly states "diff [options] from-file to-file". Some other instructions specifically call the files to be compared "old-file" and "new-file." Better would simply be "left-file" and "right-file." Personally, I always like the newer file on the left (however, see further down where you cannot do this). On to the output: The first line tells you which line(s) would need to be altered to make the files match. In this case, we're told that there's a change ("c") needed on line 398. The "<" points to which file has the differing line, the left ("<") or right (">"). So, we can quickly see that the left file uncommented the recipient delimiter line.
Besides a change, you can also be notified that an add (a) or delete (d) will be required to make the files the same.
So, this is all well and good, but would make for a heck of a short column if that were the end of the story. As the files being compared get longer, and the changes increase, this kind of output can get a little tough to look at. Additionally, the complementary tool to diff is patch, and patch parses a slightly different kind of input: the universal diff. Before we get to that, though, let's look at an option to make diff a little easier on the eyes.
While the in-line view can be useful for a quick idea of what's different, in short files, with few changes, a side-by-side comparison is much more natural. With the -y flag, diff can provide just that. This also has the effect of displaying the entire file, so, piping the output to less is highly recommended (if you're viewing on screen):
$ diff -y -pr -W 70 main.cf.updated main.cf
# Global Postfix configuration # Global Postfix configuration
# of all 300+ parameters. See # of all 300+ parameters. See
# complete list. # complete list.
...
recipient_delimiter = + | #recipient_delimiter = +
...
disable_vrfy_command = <
smtpd_helo_required = yes <
(much output snipped for brevity)
From this example, you can see that instead of a single line that describes the modification, you get to see both files side-by-side, and a center column that describes the alteration needed. The pipe ("|") signifies a change, while the greater-than and less-than "pointers" point to a line that exists only in the left-hand file ("<") or right-hand file (">"). Please note that I only used the -W flag to fit the output for print. "W" restricts the maximum width of the output. In other cases, I wouldn't use it at all, as I like to see as much of the line as possible.
Proverbs
diff is such a popular and useful tool that several variations and extensions have cropped up over time, and I'll just touch on some here.
diffstat will output statistics about the differences found. Easier to see in action than describe:
$ diff -u main.cf.updated main.cf | diffstat
main.cf | 46 +++++++++++++++-------------------------------
1 files changed, 15 insertions(+), 31 deletions(-)
The output shows the file name, total number of changes to that file, a histogram and summary. Run in batch over many files, particularly source code, the overview provided is incredibly useful.
diff3 extends the concept to three files. This turns out to be a topic unto itself. However, the short version is that diff3 diffs the output of "diff file1 file2" with "diff file1 file3". Check the man page for its options.
bzdiff is really just a convenient way to diff bzip compressed files. All options are passed directly to diff--bzdiff just drops the manual steps needed to compare these files.
If you like printed output, try diffpp (pretty print) as a postscript filter:
enscript -G2re -p maindiffs.ps --filter='diff main.cf.updated main.cf | diffpp main.cf' main.cf.updated
then...
open maindiffs.ps
...and you'll have Preview showing you a beautiful "printed" version of the diff file. You can then save the file from Preview as a PDF.
Also probably a column unto itself, don't forget emacs and vim as diff tools. For a very nice color-output diff, try vimdiff:
vimdiff main.cf.updated main.cf
...gives you the output shown in figure 1.
Figure 1 -- Using vimdiff to highlight differences.
Last, but certainly not least, is a utility that is not "diff-derived", but rather a complementary tool. Mentioned earlier, patch will take a unified diff file and apply it to another file ("patch" it), making the appropriate changes.
Back to our example: main.cf. Let's say you've taken the stock OS X Server postfix main.cf file and altered it to add lines that help fight UBE (Unsolicited Bulk E-mail, a.k.a. "spam"). You have another OS X Server that could use the same exact alterations made to its main.cf. The usage is pretty simple. First, make the appropriate diff file:
diff -u main.cf main.cf.updated > maindiffs.diff
Important: unlike viewing only, creating a diff file for patch is a case where you must have the "from-file" (the old) on the left, and the "to-file" (the new, updated file) on the right. Then, ship the diff file to the other machine. To automatically alter the other file, use patch:
patch /etc/postfix/main.cf < maindiffs.diff
The cool thing is that the diff file is simply text. If you ever have multiple patches that you'd like to apply at once, simply combine them and then apply:
cat DiffFile1 DiffFile2 DiffFile3 > one_big_diff_file.diff
Make your life as easy as possible!
You can even use diff and patch to compare and/or patch entire directories. Let's say you've stored the distribution set of postfix config files in /etc/postfix.dist. You've continued to make changes to many files in /etc/postfix. Now, of course, you want to apply these same changes to another server. Make your diff file:
diff -ruN /etc/postfix.dist /etc/postfix > postfix_cfg.patch
Then, on the second machine you can simply:
cd /etc/postfix
patch < /path/to/postfix_cfg.patch
Done! The entire directory, including subdirectories will be updated.
Exodus
Now, while the diff shell tool solves 90% of my needs, there are some other very nice tools out there. There happen to be more than I can cover here, so, I'm going to touch on three that are popular and/or free.
FileMerge
The nicest GUI tool I've seen comes from Apple. FileMerge is installed as part of the developer tools, and can be found in the /Developer/Applications/Utilities folder. Running FileMerge.app brings up a simplistic window, as seen in figure 2.
Figure 2 -- Initial FileMerge window.
Selecting the files to compare is as easy as dragging-and-dropping the appropriate files on the appropriate panel -- or, you can, of course, use the "Left..." and "Right..." buttons to browse for the files you want. Once your files are selected, click on the "Compare" button and you'll get a three-paned window that looks like that in figure 3.
Figure 3 -- FileMerge.app in action.
Blocks of differences are shaded in grey, and are 'warped' to fit the difference. The shaded blocks are flexible and bend as needed as you scroll through the documents. It's actually an effect that you have to see in action to really appreciate. Another nice touch is the marking in the scroll bar that denotes where changes appear.
To merge changes into a new document, simply select a change by clicking on a grey block, and then choose the appropriate action from the "Actions" drop-down menu. Once you're satisfied with the file that you have in front of you, save it to a new document by choosing "Save Merge" from the File menu (or just Apple-s).
Figure 4 -- FileMerge actions
Like all good utilities, FileMerge can also be called via a shell. Using the opendiff binary simply launches FileMerge itself, but is handy when you're already in a shell, or as a way to integrate FileMerge with another GUI tool that can call shell apps.
TextMate
TextMate, from MacroMates -- basically Allan Odgaard -- is a text editor that can best be compared to emacs or the old DOS Brief Editor (which, was written by Dave Nanian who now runs Shirt Pocket Software and blesses us with SuperDuper!). That's all to say, TextMate unto itself could take up an article...or two...or a book! In fact, a book was just released. More than a text editor, TextMate is a programmer's editor. For anyone banging out code all day, or those trying to learn, it's an excellent tool. As such, it can perform your standard diff and patch operations. In short, it calls the shell tools listed above to perform its magic. Why re-invent the wheel, right? Nicely, you can perform the diff with files in several locations -- even the clipboard, as seen in figure 5.
Figure 5 -- TextMate's diff options
In this example, I have two files loaded into TextMate itself, and simply chose "Selected Files in Project Drawer". That gave me the standard, unified diff output shown in figure 6.
Figure 6 -- TextMate's colored unified diff output
There's a reason Allen and TextMate won an Apple Design Award at last year's WWDC (2006). TextMate can even diff a file against itself since it was first loaded so you can see the changes you've made in a single file (details in the TextMate blog: http://macromates.com/blog/archives/2005/08/11/tips-and-tricks/). With its plug-in architecture, TextMate is beginning to be used for many things, even as a blogging tool. It's a free trial download at http://www.macromates.com, so, check it out.
TextWrangler
I think BareBones Software surprised a lot of people when they shipped the free (as in beer) TextWrangler text editor. It took over for BBEdit Lite as BBEdit's little brother...except free. It's one of those apps that continually impresses, and its diff capabilities don't disappoint. You can compare files that you have loaded into TextWrangler, or any files on disk (or even files that you have loaded from a remote server -- very handy!). Start the diff from the Search menu, and you'll be shown three windows. One window for each file you're comparing, side by side, and a differences window. The differences window, shown in figure 7, keeps the two file windows in sync as you highlight changes.
Figure 7 -- TextWrangler's differences window.
You can apply the changes from one file to the other via the "apply" buttons, or via keyboard shortcuts.
TextWrangler's implementation is a little FileMerge-like visually, but has its own way of altering the files. Overall, TextWrangler is a valuable tool in any OS X user's arsenal, and it doesn't cost you a dime. Find it at http://www.barebones.com.
Revelation
Even though difference utilities started out as a programmer's tool, that doesn't mean that there are no other ways to apply them. I skipped over some other comparison tools such as cmp and comm, as diff is really the most useful for general use -- particularly for SysAdmins. I've even found people using diff for natural language processing: http://dspace.wul.waseda.ac.jp/dspace/bitstream/2065/585/1/interactive-11.pdf. Don't let the 'text' moniker fool you either; you can use these comparison tools on binary files as well. Check the man pages for each, or the documentation/help for the GUI tools and you'll find even more power to exploit.
Media of the month: Ghost in the Shell. How is it that I've never recommended this one to you before? Fantastic animation, a story that's ahead of its time, and a major influence on "The Matrix," it's a must see for anyone involved in tech culture.
WWDC is nigh! Unfortunately for me, I won't be able to stay the entire week, so I'm trying to cram as much into the front as possible. For those of you headed to the conference -- and that should be just about everyone reading this magazine -- I hope to see you there!
Ed Marczak owns and operates Radiotope, a technology consultancy that brings enterprise solutions to small and medium-sized businesses. Outside of this piece of the puzzle, he is Executive Editor of MacTech Magazine, a husband and father, and CTO of WheresSpot, among other things. Find the missing tech piece at www.radiotope.com.