Introduction to Unix security concepts
Volume Number: 19 (2003)
Issue Number: 4
Column Tag: Unix Security
Security can only be achieved when you know its basics.
by Marcelo Amarante Ferreira Gomes
Why Unix?
By now, most, if not all, MacTech readers know that Mac OS X derives from Unix. It may not be just another Unix, since it is very different from most other unixes in many aspects. But it definitely is Unix. So, if we are to write good software for it, especially if we want that software to be secure, we need to know more than just the basics about the Unix side of Mac OS X.
This is the first article of a series, written with the intent to give you an idea of what computer security really is about, and how to enforce it. We will emphasize the programmer side, hoping to help you write safer applications; but the material in this series will also be of use to system administrators and even power users.
These articles will focus on Mac OS X, since old-timers already know Classic Mac OS, and most newbies are only interested in X. To better explain Mac OS X concepts, though, it is sometimes easier to talk about Unix concepts in general.
There is a lot of historical material in this series of articles. This material is here not only so you can better understand how things evolved and why they are the way they are. It will also let us learn from the errors of the past and avoid repeating them. You will often see typical attacks crackers used and how security evolved in response to them.
This history-telling approach has the added benefit of passing along a little bit of Unix culture to die-hard Classic Mac OS programmers. In order to write successful Mac OS X software, traditional Unix programmers should learn a bit of Classic Mac OS culture, while traditional Classic Mac OS programmers should have a look into Unix culture. For a discussion on this subject, see (Gomes 2001).
This first article is mostly conceptual. It will start by defining computer security and then focus on Unix users and groups. You will see how users and groups are implemented in a typical Unix system, how different the Mac OS X implementation is from the typical, and the impact that each of these subjects has on the security of a system.
Defining Computer Security
Unless you're coming from Mars today, you have probably already heard about computer viruses, Trojan horses, DoS (Denial of Service) or DDoS (Distributed DoS) attacks made by crackers against Internet sites, business offices, or even home machines. You might even have already been plagued by some of these. Many people think of these threats whenever they hear the term computer security. While it does include concern about these, the security of a computer system is much more complex than that.
We could define computer security as the task of enforcing just one general rule: Any given user should only be able to use a given computing resource if that user is allowed to. A user, in this sense, could be a real person, a process running in the machine, a library routine, or any entity that can use a computing resource. Let's call this broader concept of user a cresuser, for computing resource user. As for the computing resource, it could be memory space, disk space, disk files, CPU time, passwords, processes, or just about anything in the hardware, software or workflow. Even the concepts of using or allowing might mean different things, depending on context. For instance, using a disk file may mean reading it, appending data to it, modifying data in it, deleting it, renaming it, or yet other creative meanings.
In a multi-user system, security involves the task of making sure that any given person will only be able to do what the system administrator decided that that particular user could do. Examples of such systems include Mac OS 9 with the multiple users feature enabled, Mac OS X, NT-based versions of Windows (NT/2000/XP), other versions of Windows with network login enabled, any other flavor of Unix (Unix itself, Linux, {Free|Open|Net}BSD, Solaris, AIX, HP-UX, Irix, etc.), and most mainframe environments.
But even single-user systems have security concerns. Mac OS 8.x and earlier, Windows 9x, aging MS-DOS, dinosaur CP/M, or current ones such as PalmOS and Windows CE are all included in this bundle. To fit them in our definition of computer security, you must remember two things. First, even though there's at most one person using the system at any given time, that person is not necessarily allowed to access everything. There may still be the concept of one (or even more than one) administrator and lower-privileged users. And just by adding an interface card and a little software, single-user systems can become part of a larger, multi-user and multi-machine system that we've come to call a network.
The second thing to remember is that our definition of cresuser is much broader than just people; it includes any other entity that could be actively using computer resources. When a new application is installed in a system, it actually becomes a new cresuser in that system. As such, care must be taken so as to limit the reach of the application to only those resources it needs to use. At the same time, we must not impose too restrictive limits; otherwise the application may become so annoying to the person trying to use it, that it effectively becomes useless.
With such a general definition for security, one may get lost while trying to figure out what all of its various meanings might be. And it's also easy to forget to state, or even implement, some of its implications. When we're dealing with computers, there is often more than one way of accomplishing the same task. If a given task is not allowed to a given user, all possible means for that user to accomplish it must be blocked, either explicitly or implicitly.
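One way to picture this rule is as a table of explicit grants in which anything that was never granted is refused. The sketch below is purely illustrative: the class Policy and its methods allow and is_allowed are names invented for this example, not part of any Unix or Mac OS X API.

```python
class Policy:
    """A toy default-deny policy: only explicitly granted uses are allowed."""

    def __init__(self):
        # Set of (cresuser, resource, action) triples that are permitted.
        self._grants = set()

    def allow(self, cresuser, resource, action):
        self._grants.add((cresuser, resource, action))

    def is_allowed(self, cresuser, resource, action):
        # Anything that was never granted is refused, including means of
        # access that nobody thought of when the policy was written.
        return (cresuser, resource, action) in self._grants


policy = Policy()
policy.allow("alice", "report.txt", "read")
policy.allow("alice", "report.txt", "append")

print(policy.is_allowed("alice", "report.txt", "read"))    # granted
print(policy.is_allowed("alice", "report.txt", "delete"))  # never granted
print(policy.is_allowed("bob", "raw disk", "read"))        # never granted
```

The point of the design is that blocking happens implicitly: the administrator lists what is allowed, and every other path to the resource is closed without having to be foreseen.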
For instance, I have actually seen a huge security breach in a Unix system shared among students and faculty staff of a certain university. Of course, students were not allowed to browse teachers' files, since the teachers often used that system to write upcoming tests to be given to those same students. Most security measures were correctly set to block access to directories and files. But the administrators had forgotten about the permission to have raw read access to the hard disk. One student then wrote a program to read the raw information from the disk and interpret the filesystem. That way, he bypassed the Unix kernel filesystem routines, and any security measure built into them, thus getting access to any file on that disk, regardless of the permission bits of those files.
He called his program thundercat, a pun on the name of cat, the Unix utility to concatenate and list the contents of files. The student got so scared when he realized the meaning of his findings that he never did take advantage of them, and stopped using his program on the same day that he found an upcoming test. He has never been caught. Before you wonder, I'll assure you: that guy wasn't me.
That episode teaches us a few important lessons.
1) There is often a way of accessing a given resource or accomplishing a given task that you may not have anticipated. Whenever possible, security should be analyzed by more than one person, so that one may see possible weaknesses that the other had overlooked.
2) You don't always need a technical solution to a technical problem. The simplest and perhaps most effective way of solving this particular problem would have been to have separate and isolated (i.e., not sharing the same network) computers for students and faculty staff. That way, students would surely have a harder time trying to access teachers' files, instead of tripping upon them by accident.
3) Conversely, some non-technical problems can actually cause technical ones. This particular university had a periodical computer security check-up procedure, and it was effective. The problem was that it had not been followed. This is more a human resources problem than a technical one.
4) Last but not least, just because you have not detected a security problem in your system, it doesn't mean you don't have one. Don't ever assume that your system is safe based on the absence of alarms by anti-virus software or other intrusion detection systems. If you feel that something isn't right, trust your instinct, use a little common sense, and investigate.
As a side note to all lessons above, you should be aware that this series of articles concentrates on aspects of security specific to Unix-like systems and their implications on programming. If your particular concerns are more than just a little curiosity, you should read other material that also covers the social side and many other non-technical aspects of security. Good sources are (Schneier 2000) or (Mitnick 2002).
Unix Users, Groups And User Databases
The above definition of security is just too generic. So, let's try to narrow it down to security in a Unix-like system. Before we do, we need a few more concepts. This time, let's use more concrete definitions.
Our concept of cresusers is very broad. We can only talk about cresusers if we have entities to represent them. In any Unix-like system, and that includes Mac OS X, we have users and groups. Neither of these is even close to the concept of cresusers, but they are very important in a Unix environment, especially for those dealing with security. For now, let's study the situation in which cresusers are indeed Unix users, leaving the case of application cresusers, process cresusers, and other types of cresusers for later investigation.
Each Unix user has a name, a numeric user ID and a few other properties, such as the directory they will use by default. By the way, Unix traditionally uses the term directory for what we call folder in Mac OS parlance. Groups also have names and their own IDs, which may or may not overlap the numeric space of user IDs. As you might guess, a group is a convenient way to refer to more than a single user.
The concepts of users and groups may be implemented in a variety of ways. Traditional unixes have used the /etc/passwd file. It is a plain text file with each line representing a user's properties in fields separated by colons, like the one in Listing 1. Each user occupies exactly one line in the file, no matter how long that line gets.
The fields included in the /etc/passwd file are, in this order, user name, encrypted password, numeric user ID, numeric group ID, a short description of who or what the user is in real life, the user's default directory and his/her/its default shell. If you don't know what a shell is, just read on. The section The Unix Command Line introduces this concept.
Some systems defined a syntax for the user description field, including sub-fields separated by commas or parentheses, but those syntaxes were non-standard. Listing 1 shows an example of this file almost as it has always been. The only exception is that the encrypted password field contains an asterisk for every user.
Listing 1: Typical contents of an /etc/passwd file.
##
# User Database
#
# Note that this file is consulted when the system is running in single-user
# mode. At other times this information is handled by lookupd. By default,
# lookupd gets information from NetInfo, so this file will not be consulted
# unless you have changed lookupd's configuration.
##
nobody:*:-2:-2:Unprivileged User:/nohome:/noshell
root:*:0:0:System Administrator:/var/root:/bin/tcsh
daemon:*:1:1:System Services:/var/root:/noshell
www:*:70:70:World Wide Web Server:/Library/WebServer:/noshell
unknown:*:99:99:Unknown User:/nohome:/noshell
marcelo:*:1000:1000:Marcelo Gomes:/home/marcelo:/bin/sh
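To make the field order concrete, here is a minimal sketch that splits one line of this format into its seven fields. The names PasswdEntry and parse_passwd_line are invented for this illustration, and the code assumes the plain seven-field layout shown in Listing 1.

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    name: str
    password: str
    uid: int
    gid: int
    description: str
    home: str
    shell: str


def parse_passwd_line(line):
    """Return a PasswdEntry, or None for comments and blank lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split(":")
    if len(fields) != 7:
        raise ValueError("expected 7 colon-separated fields, got %d" % len(fields))
    name, password, uid, gid, description, home, shell = fields
    # User and group IDs are numeric and may be negative, as for nobody.
    return PasswdEntry(name, password, int(uid), int(gid), description, home, shell)


entry = parse_passwd_line("marcelo:*:1000:1000:Marcelo Gomes:/home/marcelo:/bin/sh")
print(entry.name, entry.uid, entry.home, entry.shell)
```

Parsing the file by hand like this is shown only to explain the format; real programs are better off asking the system's library routines for user information.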
As computing power increased, people started to realize that leaving all encrypted passwords in a single file, accessible by any user, was a security risk. Since anyone who could log into the system could read this file, it was easy for a cracker to try passwords at will, or copy it to his own machine and later try to guess each user's password in the comfort of his home. All he had to do was to write a small program to try each password in turn and see if, by encrypting it with the publicly available DES encryption routines, he'd get the same encrypted password that was listed in that /etc/passwd file.
Some programs then became available to automate this process, not only from crackers, but also from system administrators. After realizing the fragility of passwords that users chose, sysadmins periodically ran such programs against dictionaries of common English words to see how easy it was to guess each user's passwords. Some of them even tried permutations of letters in the username and user description. The general idea behind the algorithm used by these programs became known as the dictionary attack.
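The general idea can be sketched from the system administrator's side, as a weak-password audit. This is only an illustration: toy_hash is a stand-in built on SHA-256 (an assumption made so the example runs anywhere; it is not the DES-based routine those systems used), and the account data and word list are made up.

```python
import hashlib


def toy_hash(password, salt):
    # Stand-in for the system's one-way password function. This is an
    # assumption for the example, not the DES-based routine of the era.
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest()


def audit_passwords(accounts, wordlist):
    """Return {username: guessed_password} for accounts with weak passwords.

    accounts maps a username to a (salt, stored_hash) pair.
    """
    weak = {}
    for user, (salt, stored_hash) in accounts.items():
        # Try the dictionary words, then two simple variants of the username.
        candidates = list(wordlist) + [user, user[::-1]]
        for word in candidates:
            if toy_hash(word, salt) == stored_hash:
                weak[user] = word
                break
    return weak


accounts = {
    "alice": ("ab", toy_hash("sunshine", "ab")),
    "bob": ("cd", toy_hash("T7#qv!Lm2z", "cd")),
    "carol": ("ef", toy_hash("lorac", "ef")),
}
print(audit_passwords(accounts, ["password", "sunshine", "letmein"]))
```

Only the accounts whose passwords appear in the word list, or can be derived from the username, are reported; the one-way function is never reversed, just run forward on each guess.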
Most dictionary attacker programs were designed to send e-mail to the users that had weak passwords. But since these programs were shared among many people, they eventually ended up in the wrong hands.
Crackers with almost no prior programming experience suddenly found themselves with powerful cracking tools in their hands. Dictionary attackers were not all that hard to tweak, and crackers soon found out how to divert to them the e-mail sent automatically whenever a new password was found.
When people realized how easy it was to break into systems like that, they thought of changing the users database format. By that time, it was too late to eliminate or change the format of the /etc/passwd file, since way too many applications were using it.
The solution came from noticing that very few programs actually used the password field. So, it could contain any kind of garbage, or even be left blank, and most applications would still work. The few programs that would get broken could later be fixed with little effort.
Enter the concept of shadow password databases. These databases were implementation-dependent, sometimes consisting of binary files, as opposed to the text-only /etc/passwd. There was still a text file, usually named either /etc/shadow or /etc/master.passwd and very similar to the /etc/passwd file, but containing the real encrypted user passwords. It wasn't readable by anyone except the root user. When there were binary files involved, they were compiled from the /etc/master.passwd or /etc/shadow files. The /etc/passwd file was still there, and still readable by anyone, but had the look of Listing 1, with no password information.
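The difference between the two files comes down to their permission bits. As a rough sketch, the following checks whether the read bit for "others" is set on a file; the helper name is_world_readable is invented for this example, and which of the three paths exist depends on the system at hand.

```python
import os
import stat


def is_world_readable(path):
    """True if the 'other' read bit is set in the file's permission bits."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IROTH)


for path in ("/etc/passwd", "/etc/shadow", "/etc/master.passwd"):
    if os.path.exists(path):
        print(path, "world-readable:", is_world_readable(path))
    else:
        print(path, "not present on this system")
```

On a system with shadow passwords one would expect /etc/passwd to be world-readable and the shadow file not to be.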
From then on, programmers were encouraged to get user information from library routines, instead of directly accessing /etc/passwd or any other file. This practice made it possible for system software writers to change once again the format of user or group databases.
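As an illustration of that practice, Python's standard pwd module wraps the system's user database routines, so the same call works whatever database sits behind them. The wrapper name lookup_user is invented for this sketch.

```python
import pwd


def lookup_user(name):
    """Return basic account properties via the system library, or None."""
    try:
        entry = pwd.getpwnam(name)
    except KeyError:
        # The library raises KeyError when no such user exists.
        return None
    return {
        "name": entry.pw_name,
        "uid": entry.pw_uid,
        "gid": entry.pw_gid,
        "description": entry.pw_gecos,
        "home": entry.pw_dir,
        "shell": entry.pw_shell,
    }


print(lookup_user("root"))
print(lookup_user("no-such-user-hopefully"))
```

Nothing in this code names /etc/passwd, which is exactly the point: the program keeps working if the administrator moves the users to a different database.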
Meanwhile, Sun had also developed Yellow Pages, now called NIS, and later came its evolution, NIS+, both implementing network-wide user databases. NIS and NIS+ have become very popular among many other Unix-like OS vendors, besides Sun. NIS seems to have spread a little more, perhaps because it is simpler to implement and has been around longer than NIS+. These centralized network user databases reinforced the idea that programmers should rely on system library routines to obtain user information, instead of going straight to /etc/passwd.
Other centralized user databases, to be used throughout an entire network, have been developed. The most widespread today seems to be LDAP (Lightweight Directory Access Protocol). LDAP interoperates with many different environments: Unix-like systems, including Mac OS X; Classic Mac OS; and even Windows.
Groups can also be defined in many ways, and have traditionally been implemented with a plain text file under the /etc directory, called /etc/group. Nowadays they are generally implemented in the same database that keeps users, be it NIS, NIS+, LDAP or something else.
The traditional /etc/group file was even simpler than /etc/passwd, containing one group per line with colon-separated fields storing the group's name, encrypted password, numeric group ID, and a comma-separated list of users belonging to that group.
Listing 2: Typical contents of an /etc/group file.
##
# Group Database
#
# Note that this file is consulted when the system is running in single-user
# mode. At other times this information is handled by lookupd. By default,
# lookupd gets information from NetInfo, so this file will not be consulted
# unless you have changed lookupd's configuration.
##
nobody:*:-1:
wheel:*:0:
daemon:*:1:root
sys:*:3:root
tty:*:4:root
operator:*:5:root,marcelo
mail:*:6:
bin:*:7:
staff:*:20:root,marcelo
guest:*:31:root,marcelo
uucp:*:66:marcelo
www:*:70:
marcelo:*:1000:marcelo
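A short sketch shows how the member list in the last field answers the question "which groups list this user?". The function names are invented for the example, and the sample lines are copied from Listing 2.

```python
def parse_group_line(line):
    """Return (name, password, gid, members), or None for comments/blank lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    name, password, gid, members = line.split(":")
    # An empty last field means no users are listed for the group.
    member_list = [m for m in members.split(",") if m]
    return name, password, int(gid), member_list


def groups_listing_user(lines, user):
    """Names of the groups whose member list mentions the given user."""
    found = []
    for line in lines:
        parsed = parse_group_line(line)
        if parsed and user in parsed[3]:
            found.append(parsed[0])
    return found


sample = [
    "wheel:*:0:",
    "operator:*:5:root,marcelo",
    "staff:*:20:root,marcelo",
    "uucp:*:66:marcelo",
    "www:*:70:",
]
print(groups_listing_user(sample, "marcelo"))
```

Keep in mind that a user also belongs to the group named by the numeric group ID in his /etc/passwd line, whether or not that group's line lists him.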
It didn't take too long for people to abandon the concept of a group password. It just doesn't work. When the password leaks, nobody takes responsibility. Besides, it was only useful while systems allowed processes to run with the permissions of just a single group. Soon came versions of Unix and Unix-like systems that allowed a single process to have the added permissions of all groups it could belong to, in fact rendering the concept of group passwords useless.
System administrators in charge of the few systems that actually supported group passwords were encouraged to disable them by changing the encrypted password to an asterisk. The typical /etc/group file ended up looking like Listing 2. The group password field is still there, even in the modern Unix-like systems we have today.
Users And Groups, The Mac OS X Way
Mac OS X has its own means of storing user and group databases. The /etc/passwd and /etc/master.passwd files are still there, but only for times when network service is unavailable, such as in single-user maintenance mode.
For daily usage, Mac OS X employs a daemon, called lookupd, that in turn gets its information from another daemon, called NetInfo. Unless you fiddle with lookupd's configuration, NetInfo is the definitive source of user information for login and other authentication purposes.
However, Mac OS X may be configured to use NIS/NIS+, as described in (Bresink 2002) for versions prior to 10.2 (pre-Jaguar). As of this writing, in November 2002, the use of NIS with Jaguar is discouraged, but a NIS plug-in for Apple Directory Services is already under development. Similar tweaking may be used to make it look up the needed information in an LDAP database, or even files such as /etc/passwd, although you will probably need additional software to integrate these databases with lookupd and/or NetInfo. Apple says Jaguar has better integration with LDAP, but I wasn't able to try it out.
Make no mistake, though. NetInfo is much more than a mere users database. It consists of a sophisticated information repository, having a lot in common with LDAP, and is used in Mac OS X to hold lots of other configuration options, besides users and groups. Even information related to a user or a group is not limited to traditional /etc/passwd information. For instance, Mac OS X allows each user to have his or her own network-shared directory. The path to that directory is stored in NetInfo, along with all other information about that user.
NetInfo is similar to NIS/NIS+ and LDAP in that they are all capable of maintaining a single database to be shared among many machines. Thus, any user can log into any machine in the network by using his/her own login and password. This won't cause major headaches to the network administrator, since he has a single master users database to take care of.
You can get a better feeling for what NetInfo is by reading (Apple 2001) or the manual page about NetInfo: just type man netinfo at the shell prompt (see the section The Unix Command Line below for what a shell prompt is). You could also run the GUI interface to NetInfo, NetInfo Manager. It can be found inside the Utilities sub-folder of your Applications folder. To play it safe, make sure to back up the NetInfo database in /var/db/netinfo/local.nidb and the traditional Unix users database in /etc/passwd and /etc/master.passwd. But keep in mind that the recommended utility for manipulation of user and group information is still the Users control panel.
You should note that NetInfo itself, as well as NIS/NIS+, LDAP or any other network-based users database, may pose a security risk to your system. When the concept of shadow passwords was introduced, the idea was to eliminate the possibility of any user having access to any other users' passwords, even in encrypted form, to avoid the dictionary attack.
With the introduction of NetInfo, crackers once again have this possibility once they obtain network access to the NetInfo server. I'm sure you will understand why I won't describe here how such access can be obtained and what good this access can do for a cracker. Those of you with more experience in network security probably already have an idea of how it can be done. But I will indeed tell you what measures you can take to minimize your risks.
The choice of strong passwords and good system configuration can greatly reduce this risk. The use of strong passwords, containing non-alphanumeric characters, is always recommended, regardless of how you store and manage them. You could even require users to use such passwords if you give them only the choice of changing their passwords through a GUI front-end that checks for the weakness of passwords before committing the change.
A good password, for instance, could be "h4C&3rZ, G0 h0W3!". It has upper and lowercase letters intermixed, numeric and punctuation characters, and has a mnemonic meaning - try to read it as hackers, go home! Of course, this one might be difficult for you to remember, but you can probably make up a password like this that you'll be able to memorize. If your memory is really good, you can even use truly random passwords. Just make sure not to write your passwords down, or you'll lose most of the effectiveness of password protection. Another hint: don't use the above password or any other password that has been published. Many people know them, and they will probably make it into many crackers' dictionaries for their next dictionary attack.
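The kind of check such a front-end might run before accepting a new password can be sketched as follows. The rules used here (minimum length, mixed character classes, absence from a word list) are assumptions chosen for illustration, not a recommended policy, and the function name is made up.

```python
import string


def password_weaknesses(password, dictionary=(), min_length=8):
    """Return a list of reasons the password looks weak (empty if none found)."""
    problems = []
    if len(password) < min_length:
        problems.append("shorter than %d characters" % min_length)
    if not any(c.islower() for c in password):
        problems.append("no lowercase letter")
    if not any(c.isupper() for c in password):
        problems.append("no uppercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("no digit")
    if not any(c in string.punctuation for c in password):
        problems.append("no punctuation character")
    if password.lower() in {word.lower() for word in dictionary}:
        problems.append("found in dictionary")
    return problems


words = ["password", "sunshine", "macintosh"]
print(password_weaknesses("sunshine", words))
print(password_weaknesses("h4C&3rZ, G0 h0W3!", words))
```

A password that passes these tests is not automatically good. A published password such as the one above belongs in the word list, so that the checker rejects it as well.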
Configuration changes may make it harder or impossible for the cracker to get the information he/she wants directly from the server. For instance, to limit the reach of crackers trying to access the database, you may add a firewall to block access to the ports used by NetInfo. Early versions of NetInfo running on NeXT systems used ports 716 through 719. Apple seems to be using 765-768 and 1033 for NetInfo and port 776 for lookupd. You can investigate what ports your system is using by issuing the command netstat -a at the command line of your server. By blocking these ports from the outside, you will limit your headaches to inside crackers. It's always a good idea to limit access to other services as well (in particular, RPC, port 111). If you don't have a firewall, you could install a packet filter or a similar blocking mechanism.
If you run a single machine, not meant to be a server, Apple has done its homework and made the NetInfo database a bit safer by setting the trusted_networks property of NetInfo's root directory to an empty value. This should render NetInfo inaccessible from outside machines, although I'd prefer to have the ports blocked. Notice that you can still have local crackers. Anyone that has access to your machine can access NetInfo's database. This is less of a problem, since if any bad guy has access to your machine, the battle is already lost.
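Once the blocking is in place, it is worth verifying it from a machine outside the firewall. The sketch below simply attempts a TCP connection to each port mentioned above. The function name and the host address are placeholders chosen for this example, and the test covers TCP only, so a service that also answers on UDP would need a separate check.

```python
import socket


def tcp_port_is_open(host, port, timeout=1.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treated alike in this sketch.
        return False


# Port numbers taken from the text above. The host is a placeholder
# (an assumption) to be replaced with the address of your own server.
PORTS_TO_CHECK = [111, 765, 766, 767, 768, 776, 1033]

if __name__ == "__main__":
    host = "127.0.0.1"
    for port in PORTS_TO_CHECK:
        state = "open" if tcp_port_is_open(host, port) else "closed or filtered"
        print(port, state)
```

Run it only against machines you administer. A port reported as open from the outside means the firewall rule for it is not doing its job.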
The Unix Command Line
Even though Apple makes it hidden from the average user, Mac OS X has a command-line interface. It's in this interface that we find most of the similarities between Mac OS X and other Unix-like systems. In that same Applications folder, you'll find the Terminal application. Fire it up, and let's see how the Unix command line feels.
The prompt that you get comes from another application, your default shell, or default command interpreter. The default shell is one of the parameters stored in the user database, and can be different for each user. If you haven't changed any defaults, your commands will be interpreted by /bin/csh, also known as the C shell. The Terminal application just creates one or more windows and acts as an interface between those windows, the keyboard, the shell and you. At the shell prompt, you can type commands to be executed.
You have the option to choose among many shells. Most files under /bin and /usr/bin that end in sh are command interpreters, or shells. Notable exceptions are rsh and ssh, which are meant to establish network connections to contact shells in remote machines.
For instance, try ls -l /bin. This command invokes the ls application, passing it -l and /bin as arguments. ls lists files residing in your filesystem tree. Typically, the entire set of filesystems, the filesystem tree, consists of a single HFS+ hard disk volume and possibly a CD or DVD volume. If no argument is given, ls lists only the names of files in the current directory. In this first command, we have specified the /bin directory, so ls will list the files in that directory. The -l argument makes ls output lots of information about the files, instead of just their names. A typical output might look like listing 3.
Listing 3: Typical output from the ls command.
[localhost:~] marcelo% ls -l /bin
total 8208
-r-xr-xr-x 1 root wheel 13656 Aug 19 2001 [
-r-xr-xr-x 1 root wheel 13880 Aug 19 2001 cat
-r-xr-xr-x 1 root wheel 13764 Aug 19 2001 chmod
-r-xr-xr-x 1 root wheel 18820 Aug 19 2001 cp
-r-xr-xr-x 2 root wheel 318108 Aug 19 2001 csh
-r-xr-xr-x 1 root wheel 14544 Aug 19 2001 date
-r-xr-xr-x 1 root wheel 26544 Aug 19 2001 dd
-r-xr-sr-x 1 root operator 18500 Aug 19 2001 df
-r-xr-xr-x 1 root wheel 9744 Aug 19 2001 domainname
-r-xr-xr-x 1 root wheel 9184 Aug 19 2001 echo
-r-xr-xr-x 1 root wheel 60172 Aug 19 2001 ed
-r-xr-xr-x 1 root wheel 13728 Aug 19 2001 expr
-r-xr-xr-x 1 root wheel 9396 Aug 19 2001 hostname
-r-xr-xr-x 1 root wheel 13676 Aug 19 2001 kill
-r-xr-xr-x 1 root wheel 13604 Aug 19 2001 ln
-r-xr-xr-x 1 root wheel 26984 Aug 19 2001 ls
-r-xr-xr-x 1 root wheel 13832 Aug 19 2001 mkdir
-r-xr-xr-x 1 root wheel 14392 Aug 19 2001 mv
-r-xr-xr-x 1 root wheel 98704 Aug 19 2001 pax
-r-sr-xr-x 1 root wheel 35832 Aug 19 2001 ps
-r-xr-xr-x 1 root wheel 9532 Aug 19 2001 pwd
-r-sr-xr-x 1 root wheel 24296 Aug 19 2001 rcp
-r-xr-xr-x 1 root wheel 14292 Aug 19 2001 rm
-r-xr-xr-x 1 root wheel 9356 Aug 19 2001 rmdir
-r-xr-xr-x 1 root wheel 465476 Aug 19 2001 sh
-r-xr-xr-x 1 root wheel 9352 Aug 19 2001 sleep
-r-xr-xr-x 1 root wheel 23896 Aug 19 2001 stty
-r-xr-xr-x 1 root wheel 9544 Aug 19 2001 sync
-r-xr-xr-x 2 root wheel 318108 Aug 19 2001 tcsh
-r-xr-xr-x 1 root wheel 13656 Aug 19 2001 test
-r-xr-xr-x 1 root wheel 465476 Aug 19 2001 zsh
[localhost:~] marcelo%
Here we can see the file names at the end of each line. Notice that very long lines, such as the one for the domainname file in Listing 3, may get broken up in two. This is done by the routines in the terminal window, and has nothing to do with the ls command. ls does not output a line break in the middle of a logical line, no matter how long it is.
The -l flag makes ls list, besides their names, also the type, access permissions, number of links, owner name, group name, size in bytes and last modification date and time of the files. In our next article, you will see the meaning of each of these bits of information. All of them have implications on your system's security.
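For programmers, the same pieces of information are available through the stat family of system calls. Here is a rough imitation of one line of ls -l output, built on Python's os, stat, pwd and grp modules. The function name and the date format are choices made for this sketch, and the time shown is the file's last modification time.

```python
import grp
import os
import pwd
import stat
import time


def long_listing_line(path):
    """Format one file roughly the way ls -l does."""
    info = os.lstat(path)
    try:
        owner = pwd.getpwuid(info.st_uid).pw_name
    except KeyError:
        owner = str(info.st_uid)       # no name known for this user ID
    try:
        group = grp.getgrgid(info.st_gid).gr_name
    except KeyError:
        group = str(info.st_gid)       # no name known for this group ID
    mtime = time.strftime("%b %d %Y", time.localtime(info.st_mtime))
    return "%s %d %s %s %d %s %s" % (
        stat.filemode(info.st_mode),   # type and permissions, e.g. -r-xr-xr-x
        info.st_nlink,                 # number of links
        owner,
        group,
        info.st_size,                  # size in bytes
        mtime,
        os.path.basename(path),
    )


if os.path.isdir("/bin"):
    for name in sorted(os.listdir("/bin"))[:5]:
        print(long_listing_line(os.path.join("/bin", name)))
```

Note that the file itself stores only numeric user and group IDs. The names are looked up in the user and group databases at listing time, which is why the code falls back to the number when no name is found.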
To Be Continued...
We have seen some basics about security in a Unix-like system, focusing mainly on users and groups, and a more generic concept introduced here, that of cresusers. Some of these concepts have obvious relations to security, which we've covered here. So that we wouldn't stay only on the theoretical side, we've also covered a little bit about shells and the command-line environment.
In our next article, you will see an in-depth explanation of file access permissions and how they relate to Unix users and groups. Then you'll see some less than obvious security implications of file ownership and permission bits. Later articles in this series will cover practical security measures you could take to avoid common pitfalls when implementing your applications under Mac OS X.
Acknowledgements
Jacques do Prado Brandao always has time and patience to read an article on a subject he does not understand. He helped me correct grammatical glitches and enhance the style of my writing.
On the content side, Marcello, my almost homonymous friend, deserves a note here for being the author of thundercat. The guys that gave him advice on programming and where to find documentation also deserve credit. They know who they are.
I won't discuss the merits of either my parents or my wife to have a note here. But I should say that my kids deserve an acknowledgement for always (ok, almost always) being willing to collaborate and give me some silence and peace of mind whenever I needed it. So, here go my thanks to them all.
References
[1] Gomes, Marcelo Amarante Ferreira. "Mac vs. Unix Traditions". MacTech Magazine 17:9 (September 2001), pp. 22-27.
[2] Schneier, Bruce. Secrets and Lies: Digital Security in a Networked World. John Wiley & Sons, 2000.
[3] Mitnick, Kevin. The Art of Deception: Controlling the Human Element of Security. John Wiley & Sons, 2002.
[4] Bresink, Marcel. "Integrating Mac OS X in an NIS environment". Internet page at <http://www.bresink.de/osx/nis.html>, September 19, 2002 (v 1.91).
[5] Apple Computer. "Security: Mac OS X and UNIX". Internet page at <http://developer.apple.com/internet/macosx/securitycompare.html>. The footer states Copyright 2002, but the text states it was written when the latest version of Apache was 1.3.20, that is, in 2001.
[6] NetBSD Documentation <http://www.netbsd.org/Documentation/>.
[7] FreeBSD Documentation Project <http://www.freebsd.org/docproj/>.
[8] The Linux Documentation Project. <http://www.tldp.org/>.
Marcelo Amarante Ferreira Gomes is an independent Macintosh, Unix and Internet consultant. He works most of the time with his Brazilian partners, and plays most of the time with his kids and/or his Lombard PowerBook that has the single most important OS in the world, besides most other junk in this industry. You can reach him at suporte@mac.com... if you're lucky. :-)