Mac in the Shell: More ftp
Volume Number: 23 (2007)
Issue Number: 09
Column Tag: Mac in the Shell
More ftp
I've got a fever,
and the only prescription
is more ftp
by Edward Marczak
Introduction
ftp, whether we're referring to the actual protocol or just file transfer in general, is something we all need on some basis. When I started toying with computers, I saw punch cards, but never really had to deal with them. But that was only one method of file (or, more generally, data) transfer. Then we moved up to tape and floppy disk. Now, very few computers are not connected to a network of some type, and the primary reason is to transfer data in the form of files. Our cover story this month touches on several GUI-based clients, but in this column, those utilities get the "Mac In The Shell" treatment. We need to be able to transfer files easily from a shell!
Why?
I often create automated solutions that run on a server without a GUI. There are also plenty of times when a simple, repeated file transfer shouldn't pop up anything visually on a client machine, either. It should 'just happen' simply and reliably with no pomp and circumstance. Enter curl, ftp and wget.
Of the three, "ftp" is the oldest and simplest. wget brings further power, and curl is a veritable Swiss Army knife of transfer agents. If one of these options can't do what you want, it's most likely not possible (or it's time to consider a different tactic!).
ftp
ftp, the application, implements a client-side version of the ftp protocol (which is detailed by Mary Norbury in this month's cover story, "FTP Clients for Mac OS X"). In its simplest form, you use it interactively:
$ ftp ftp.example.com
Trying ftp.example.com...
Connected to ftp.example.com.
220 example.com FTP server ready.
Name (ftp.example.com:marczak):
331 Password required for marczak.
Password:
230 User marczak logged in.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> cd Public
250 CWD command successful.
ftp> ls
229 Entering Extended Passive Mode (|||50077|)
150 Opening ASCII mode data connection for '/bin/ls'.
total 1
-rw-r--r-- 1 marczak marczak 0 Nov 16 2006 .localized
drwx-wx-wx 3 marczak marczak 102 Nov 16 2006 Drop Box
-rw-r--r-- 1 marczak marczak 43796 Jul 20 07:14 test.jpg
226 Transfer complete.
ftp> bin
200 Type set to I.
ftp> get test.jpg
local: test.jpg remote: test.jpg
229 Entering Extended Passive Mode (|||50079|)
150 Opening BINARY mode data connection for 'test.jpg' (43796 bytes).
100% |********************| 43796 121.76 MB/s 00:00
226 Transfer complete.
43796 bytes received in 00:00 (21.79 MB/s)
ftp> quit
Any techy person over the age of 25 should recognize this immediately. They should also remember that, in the days before big business on the Internet, it was polite to wait until "after hours" before using ftp against a university server! Back in that era of tech history, everyone needed to be familiar with transferring files this way.
The shell-based ftp application has a good lexicon in its interpreter, one that has grown substantially since its inception. However, for purposes of automation, that can get clumsy. You could script it with expect. Some versions of ftp allow you to create a script and have ftp simply run through the motions that the script lays out. However, the version of ftp that ships with OS X (at least in Tiger) omits this option. It does keep the macro definition option, though.
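For example, here is a minimal sketch of a ~/.netrc entry that defines an init macro, reusing the example host, account and file names from the session above (they're just placeholders):
machine ftp.example.com login marczak password sekretpass
macdef init
cd Public
get test.jpg
quit
The macro definition ends at the first blank line, and a macro named init runs automatically right after login, so a plain "ftp ftp.example.com" will connect, change into Public, grab test.jpg and disconnect with no typing at all. Since the file holds a password, lock it down with "chmod 600 ~/.netrc"; most ftp clients will refuse to auto-login from a .netrc that is readable by others.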
Even without that scripting option, never fear! The command-line parameters available to you are greatly expanded, including the ability to pass a user name and password along. This is ideal for scripting within your own scripting environment. So, if I know in advance the names of the files I need to transfer, I could script this in bash thusly:
ftp -V ftp://user:password@server.example.com/directory/file.txt
...which will download file.txt and name the local file "file.txt". Note the -V switch, which is the opposite of -v and keeps the output quiet. I can also go the other way using:
ftp -V -u ftp://user:password@server.example.com/directory/ file.txt file2.txt
...which will upload the specified file(s), in this case file.txt and file2.txt, to the given directory. Don't forget the trailing slash on the target directory!
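To see how that fits into a larger script, here's a rough sketch of a nightly push; the /var/reports directory and the credentials are hypothetical, stand-ins for whatever your environment actually uses:
#!/bin/bash
# Hypothetical nightly push of report files using ftp's URL syntax.
FTP_URL="ftp://user:password@server.example.com/reports/"
for f in /var/reports/*.txt; do
    # -V keeps the output quiet; -u uploads the named file to the URL.
    ftp -V -u "${FTP_URL}" "${f}" || echo "upload failed: ${f}" >&2
done
Drop something like that into cron or launchd and the transfer "just happens" with no pomp and circumstance.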
Don't miss the fact that any valid URL syntax will work, so you can 'ftp' a file from an HTTP server, too:
$ ftp http://www.example.com/directory/cars.jpg
Requesting http://www.example.com/directory/cars.jpg
17746 29.87 KB/s
So, good ol' ftp provides us with some quick and easy ways to move files around. Not ideal ways, perhaps: not only is our password sent in the clear as part of the ftp protocol (which may not be an issue for you), but it also shows up in a process listing. That's not really cool.
ftp does offer many more options, so, check the man page if you need to get in deeper.
wget
wget bills itself as the "non-interactive network downloader." So, unlike ftp, there is no interactive mode you can use to poke around. However, we're here to talk about automated use, so we don't need no stinkin' interactive mode! If you were desperate, though, you could use one of wget's more interesting features: when an ftp directory is requested, it will automatically convert the listing into an HTML page. That might be a little too esoteric...even for me!
Disappointingly, wget is not installed by default under OS X Tiger. However, it's simple to install one way or another. You can grab the source from the GNU page at http://ftp.gnu.org/pub/gnu/wget/ and compile it yourself. Quentin Stafford-Fraser has a pre-compiled binary here: http://www.statusq.org/images/wget.zip. Finally, you can install wget using fink or MacPorts.
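If you already have one of those package managers set up, either of these should do the trick (a sketch only; package names can drift over time):
sudo port install wget
fink install wget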
To get right down to it, like ftp, you can use any rational URL to specify your target:
$ wget ftp://emarczak:sekretpass@ftp.example.com/path/Big_File.zip
--07:23:08-- ftp://emarczak:*password*@ftp.example.com/path/Big_File.zip
=> `Big_File.zip'
Resolving ftp.example.com... 192.168.77.201
Connecting to ftp.example.com|192.168.77.201|:21... connected.
Logging in as emarczak ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD /path ... done.
==> PASV ... done. ==> RETR Big_File.zip ... done.
Length: 10,871,922 (10M) (unauthoritative)
100%[=======================>] 10,871,922 638.53K/s ETA 00:00
07:23:27 (601.74 KB/s) - `Big_File.zip' saved [10871922]
Also like ftp, wget will display your password in a process listing, so use this with care! Here's where the roads diverge, though, and wget has a few more tricks up its sleeve. You can recursively download an entire ftp or http directory with the "-r" switch:
wget -r -t 5 ftp://emarczak:sekretpass@ftp.example.com/path/
I also threw in the "-t" switch, which allows for multiple retries if some part of a file download fails. "-t" also accepts a value of "inf", causing infinite retries. Also useful here is the "-l" (ell) switch, which limits the depth of the traversal. So, to grab just the items from the top level of the directory you specify, use "-l1".
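For instance, to mirror only the top level of the same (made-up) directory, still with up to five retries per file:
wget -r -l1 -t 5 ftp://emarczak:sekretpass@ftp.example.com/path/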
An absolute life saver is the "-c" switch: it tells wget to continue a partial download. So, if your download bombs, or perhaps you're on a laptop and need to run before the transfer is complete, retry the operation with the "-c" switch and pick up right from where you left off. Nice.
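Picking up the Big_File.zip example from earlier, resuming is just:
wget -c ftp://emarczak:sekretpass@ftp.example.com/path/Big_File.zip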
wget will also read a list of URLs from a file, using the "-i" switch. This is handy for scripting, of course. However, it's also a very nice way to keep your password out of a process listing: with the username and password embedded in a file, you're not putting them on the command line. Such a list also comes in handy as a way to store your favorite sites and then recursively mirror them locally using the "-r" switch mentioned above (a short sketch of the "-i" approach follows the next example). In fact, toss in the "-A" switch, which will only accept certain files, and you can download only files of a certain type from a site. Next time you want all of the .mov files from a given site, try this:
wget -r -l4 http://www.example.com/movies -A.mov -np
This will mirror the given site and only transfer files ending in ".mov" on the given pages, up to 4 levels deep. We also ensure that we don't follow links back up to the parent directory ("-np", or "--no-parent").
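As promised, here's a rough sketch of the "-i" approach. The list file, say ~/mirror-list.txt (a made-up name), holds one URL per line, credentials and all:
ftp://emarczak:sekretpass@ftp.example.com/path/
http://www.example.com/movies/
Then protect it and hand it to wget:
chmod 600 ~/mirror-list.txt
wget -r -l1 -i ~/mirror-list.txt
Nothing sensitive appears on the command line, so nothing sensitive shows up in a process listing.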
Again, wget has many, many tricks up its sleeve, too many to list here, but this brief introduction should convince you of its utility beyond the standard "ftp" application. Check out the man page for much more. (Specifically, check out the "-k" switch!)
curl
Like the other utilities mentioned here, curl will accept any valid URL as its file description. Unlike wget, curl is installed as a part of OS X. One very cool curl trick is that it dumps files to standard out unless it's told where to write them. Why is that cool?
Sometimes, you just want to view a remote document, be it an actual file, like a README or index.html, or a directory listing. So, you could easily:
curl http://server2.example.com/instructions/how_to_do_it.txt | less
...which will get the file from the server and pipe it into less. When you quit less, there will be no file remaining to clutter up your disk. I sometimes use that with http://www.whatismyip.com, and then pipe the output to a script that simply reports back the machine's external IP address. This is also a cool way to run a remote script:
curl ftp://server.example.local/script.sh | bash
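Here's a minimal sketch of that IP-reporting idea, assuming a hypothetical service at http://ip.example.com/ that returns nothing but the address as plain text (the -s switch silences curl's progress meter):
EXTERNAL_IP=$(curl -s http://ip.example.com/)
echo "External IP address: ${EXTERNAL_IP}"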
If you are interested in downloading a file, use the "-O" switch (capital O), which names the local file the same as the remote:
curl -O ftp://ftp.example.com/path/to/file/some_file.zip
This will anonymously download some_file.zip, and store it in the current working directory as some_file.zip. I also particularly like the "-L" switch (capital ell) when used with http servers as this will make curl follow http redirects.
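As a quick illustration, assuming a hypothetical download link that redirects to a mirror:
curl -L -O http://www.example.com/downloads/latest.dmg
Without -L, curl would happily save the tiny redirect response under that name instead of the file you actually wanted.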
Of course, curl will upload files, too. The "-T" switch will take care of this for you:
curl -T "pix[1-100].jpg" ftp://ftp.example.com/pictures/
I also threw in the fact that curl supports globbing in URLs, such as the bracketed range above. So, the previous example will upload pix1.jpg, pix2.jpg...up through pix100.jpg. Clearly very handy.
Both upload and download can be resumed using the "-C -" switch (capital C followed by a hyphen). The hyphen tells curl to figure out where to resume from automatically. This does require server-side support, in the form of telling the server at which byte to start appending (the SIZE command for upload) or from which byte to start the transfer (ftp resume or HTTP 1.1 for downloads).
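Resuming the some_file.zip download from above, then, looks like this:
curl -C - -O ftp://ftp.example.com/path/to/file/some_file.zip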
If you're an all-OS X shop, you'll be happy to hear that curl supports Kerberos. You can get your initial ticket the usual way (kinit), and then tell curl to use that authentication via the "--krb" switch.
curl --krb private ftp://krb4site.com -u username:boguspw
If this uses Kerberos, why did I still supply a name and password? This is a bit of a hack: with no password, curl will still want to prompt you for one. If you supply one but use Kerberos, it just ignores the password, so use a bogus one, since whatever you type will appear in process listings.
Conclusion
Being able to script data transfer is an important part of every system administrator's toolkit. While good ol' ftp will do the job in many cases, wget and curl give you much more flexibility. Both utilities overlap in functionality, but curl goes deeper in many cases. Case in point: when I said that curl accepts any valid URL syntax, try TELNET://, dict:// and even LDAP:// (although you'll currently need to build your own curl for LDAP support, as the Apple-supplied version isn't linked correctly with the LDAP framework).
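For a quick taste of the more exotic protocols, dict:// works out of the box; this one-liner assumes the public dictionary server at dict.org is reachable:
curl dict://dict.org/d:automation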
Of course, there are other file transfer options available to you, including scp, sftp, ditto and rsync, to name a few. However, I focused mainly on ftp options here, as ftp is alive and well, but sometimes overlooked. Despite its somewhat deceiving name, sftp is not true ftp, but file transfer over ssh, requiring no ftp server at all. Of course, over "hostile" networks, you should use no less than an encrypted solution. However, with the right internal setup, and in certain other cases, ftp can be the perfect solution.
Media of the month: the ftp RFC: http://www.faqs.org/rfcs/rfc959.html. If you want to get deeper into ftp and understand why it behaves the way it does, this RFC is the way to go.
Please practice this in a test environment and then press it into real-world use where appropriate. Until next month, I think you'll find these great tools for your automation arsenal.
Ed Marczak owns and operates Radiotope, a technology consulting company that guides companies to use what they have as efficiently as possible. He is also the Executive Editor of MacTech Magazine, a husband and father of two. His spare time is spent editing MacTech Magazine and enjoying his family. He finds keeping it all running smoothly good practice. Improve your practice at http://www.radiotope.com.