Net Program Linux
Volume Number: 16 (2000)
Issue Number: 1
Column Tag: Linux Development
by Giovanna Pavarani
Contributing Editor, Michael Swan
How programs communicate through sockets in a client-server architecture
My first personal computer was a Mac IIsi and I've always had a Mac for fun and personal use, but, as a programmer, I was born in a UNIX environment and that is where I grew up professionally. In particular, I spent a lot of time implementing network protocols and writing software for client-server architectures. When I heard, nearly two years ago, that there was a Linux implementation for PPC I couldn't resist: I cleared a little space on my hard disk to install it and try it.
There is a lot of talk nowadays about Mac OS X, Linux and its Macintosh incarnations, Darwin, OpenSource, networking technologies, and you've seen it in the pages of MacTech Magazine too. So, if you are just curious, or you plan to write networking code in a UNIX-like environment, maybe this article can give you some starting points and/or some hints.
Background
In this article I assume that you have a working UNIX-like environment and some familiarity with it. If that is not the case, don't worry! There are lots of resources on the net to help you meet these requirements. A good starting point could be the official site of the MkLinux community: http://www.mklinux.org. This site can help you choose the right Linux flavor for your machine (PPC or 68K, PCI or NuBus, old or brand new models ...) and gives you pointers to the different distributions for code and installation instructions. For a User's Guide and other useful docs just select the link to the Linux Documentation Project. Then remember to install and configure the network stuff, the system manual pages, the compiler and debugger, and a friendly GUI :-) (all freely included in nearly all distributions). Manual pages are very important because you can use them to learn more about a command or a system call. If you want to know what the socket() call is and how it works, for example, just type man socket at the prompt; man man gives you details on how to use the man facility. Finally, for troubles, doubts or questions not answered by the site or the manual pages, there are very helpful and polite dedicated mailing lists.
On my Mac, I've installed a MkLinux DR3 distribution complete with manual pages, development libraries and the GNU C Compiler and debugger. All the article's examples have been tested on this system and on a RedHat 6.0 Intel box, but the concepts and the code are so general that it's quite easy to use them on another UNIX-like operating system, even one that is not Mac specific. You don't need a real network to test the examples: you can launch the programs on the same machine, provided that you have installed the network protocol stack and configured your computer.
Socket Basics
In a UNIX-like system, communication takes place between endpoints called sockets within a communication domain. Domains are abstractions which imply both an addressing structure (address family) and a set of protocols implementing the various socket types within the domain (protocol family). The two most important domains are the UNIX domain and the INTERNET domain. Both allow processes to communicate, but only the INTERNET domain allows communication between two machines. In the UNIX domain a socket is identified by a path name within the file system name space, and communication is permitted between any two processes that reside on the same machine and are able to access the socket path name. The INTERNET domain allows communication between processes on separate machines using the Internet protocol family; an address consists of a machine network address and an identifying number called a port. The type of a socket describes the semantics of the communication and its properties: reliability, ordering and prevention of message duplication. The best-known socket types are:
- Stream socket. This type of socket provides a bi-directional, reliable, error-free, sequenced and non-duplicated flow of data without record boundaries. Stream communication implies a connection.
- Datagram socket. Datagram sockets provide a bi-directional flow of data but don't guarantee that the datagrams will be received in the same order they were transmitted, or that they won't be duplicated. Record boundaries in data are preserved and a datagram communication doesn't imply a connection.
- Raw socket. Raw sockets allow users direct access to a lower-level protocol; they aren't for the general user but for those that intend to develop new communication protocols or use some of the more particular aspects of an existing protocol.
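The difference in record boundaries between the first two socket types can be observed with a small experiment. The sketch below is only an illustration: first_recv_size() is a hypothetical helper, and it uses socketpair() in the UNIX domain (a call not covered in this article) just to get two connected sockets without any addressing. It sends two 2-byte messages and reports how many bytes the first recv() returns.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send "ab" and then "cd" over a connected pair of sockets of the given
 * type (SOCK_DGRAM or SOCK_STREAM) and return how many bytes the first
 * recv() hands back, or -1 on error. Hypothetical helper for illustration. */
ssize_t first_recv_size(int type)
{
    int pair[2];
    char buf[16];
    ssize_t n;

    if (socketpair(AF_UNIX, type, 0, pair) < 0)
        return -1;
    if (send(pair[0], "ab", 2, 0) != 2 || send(pair[0], "cd", 2, 0) != 2) {
        close(pair[0]);
        close(pair[1]);
        return -1;
    }
    /* The buffer is large enough for both messages. */
    n = recv(pair[1], buf, sizeof buf, 0);
    close(pair[0]);
    close(pair[1]);
    return n;
}
```

With SOCK_DGRAM the first recv() returns exactly one message (2 bytes), because record boundaries are preserved. With SOCK_STREAM the two messages are part of one byte flow, so the same call may return both of them together.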
The most common use of sockets is in a client-server architecture. In this model, client applications request services from a server process. The protocol implemented at both ends of the connection can be symmetric or asymmetric. In a symmetric protocol, both ends may play the server or client roles (e.g. the TELNET protocol); in an asymmetric protocol one end is always the server and the other is always the client (e.g. the FTP protocol).
In this article I'll give you complete examples of client and server processes using stream and datagram sockets in the INTERNET domain.
Opening a socket is quite easy:
int s = socket(int domain, int type, int protocol);
This system call creates a socket in the specified domain (AF_UNIX, Address Family UNIX, or AF_INET, Address Family INTERNET), of the specified type (SOCK_DGRAM, for a datagram socket, or SOCK_STREAM, for a stream socket). We can specify a protocol too, but leaving it at 0 is usually the best choice: the system will choose the appropriate protocol in the domain that can support the requested type, usually UDP for datagrams and TCP for streams.
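As a minimal sketch of the call in use, the hypothetical helper below creates an INTERNET-domain socket with protocol 0 and then asks the system, through getsockopt() with the SO_TYPE option, which type of socket was actually created. The helper name is an assumption made for this illustration.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Create an INTERNET-domain socket of the requested type with the default
 * protocol, then ask the system which type it created.
 * Returns the reported type, or -1 on error. */
int created_socket_type(int type)
{
    int s, actual = -1;
    socklen_t len = sizeof actual;

    if ((s = socket(AF_INET, type, 0)) < 0)
        return -1;
    if (getsockopt(s, SOL_SOCKET, SO_TYPE, &actual, &len) < 0)
        actual = -1;
    close(s);
    return actual;
}
```

Calling created_socket_type(SOCK_STREAM) should report SOCK_STREAM, and likewise for SOCK_DGRAM.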
Before accepting a connection or receiving a datagram, a socket must first bind to a name or address within the communication domain:
bind(int s, struct sockaddr *name, int namelen);
This call binds a name to a previously created socket s: an Internet address and port number in the INTERNET domain, or a family and a path name in the UNIX domain.
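The listings later in this article bind names in the INTERNET domain only, so here is a hedged sketch of the UNIX-domain case. It assumes the usual struct sockaddr_un from <sys/un.h>; bind_unix_socket() and the path passed to it are hypothetical names chosen for the illustration.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Bind a datagram socket to a path name in the UNIX domain.
 * Returns the socket descriptor, or -1 on error. */
int bind_unix_socket(const char *path)
{
    struct sockaddr_un name;
    int s;

    if (strlen(path) >= sizeof name.sun_path)
        return -1;                       /* path too long for sun_path */
    if ((s = socket(AF_UNIX, SOCK_DGRAM, 0)) < 0)
        return -1;

    memset(&name, 0, sizeof name);
    name.sun_family = AF_UNIX;
    strcpy(name.sun_path, path);         /* length was checked above */

    unlink(path);                        /* remove a stale socket file, if any */
    if (bind(s, (struct sockaddr *)&name, sizeof name) < 0) {
        close(s);
        return -1;
    }
    return s;
}
```

After a successful call the path shows up in the file system; the caller should close() the socket and unlink() the path when done.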
To send datagrams a client uses:
sendto(int s, void *msg, int len, unsigned int flags, struct sockaddr *to, int tolen);
where to is the destination address.
To receive datagrams a server uses:
read(int s, void *buf, size_t count);
or, when it needs to know the sender address:
recvfrom(int s, void *msg, int len, int flags, struct sockaddr *from, int *fromlen);
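To see sendto() and recvfrom() working together, here is a sketch that plays both roles in one process over the loopback interface. The helper name udp_roundtrip() is an assumption for this illustration; the receiving socket is bound to a system-chosen port, which is discovered with getsockname().

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Send text from one datagram socket to another over the loopback
 * interface. The received bytes go into out (NUL-terminated) and the
 * sender's address into *from. Returns the datagram length or -1. */
ssize_t udp_roundtrip(const char *text, char *out, size_t outlen,
                      struct sockaddr_in *from)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr, fromlen = sizeof *from;
    ssize_t n = -1;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);

    if (rx < 0 || tx < 0 || outlen == 0)
        goto done;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* let the system pick a port */
    if (bind(rx, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        getsockname(rx, (struct sockaddr *)&addr, &len) < 0)
        goto done;

    if (sendto(tx, text, strlen(text), 0,
               (struct sockaddr *)&addr, sizeof addr) < 0)
        goto done;

    /* recvfrom() fills in the address the datagram came from. */
    n = recvfrom(rx, out, outlen - 1, 0, (struct sockaddr *)from, &fromlen);
    if (n >= 0)
        out[n] = '\0';
done:
    if (rx >= 0) close(rx);
    if (tx >= 0) close(tx);
    return n;
}
```

The address returned in from could be passed straight back to sendto() to answer the sender.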
To establish a connection to a server, a client performs a:
connect(int s, struct sockaddr *server, int serverlen);
The server parameter contains the UNIX pathname or the Internet address and port number of the server to which the client wishes to talk.
To accept a connection from a client, a server must perform two more steps after the bind() call:
listen(int s, int pending);
int newsock = accept(int s, struct sockaddr *from, int *fromlen);
With the first call, a server indicates that it's ready to listen for incoming connection requests on the socket s; pending is the maximum number of connections that may be queued. The second call creates a new socket for the accepted connection. This is a blocking call: it will not return until a connection is available or the call is interrupted by a signal to the process. It is up to the process to check who the connection is from and close it if not welcome.
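The check on who the connection is from could look like the following sketch. Both function names are hypothetical, and the policy (accept only peers at the loopback address 127.0.0.1) is just an assumption chosen to keep the example small.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Decide whether a peer is welcome. This hypothetical policy accepts
 * only connections coming from the loopback address 127.0.0.1. */
int peer_is_welcome(const struct sockaddr_in *peer)
{
    return peer->sin_family == AF_INET &&
           peer->sin_addr.s_addr == htonl(INADDR_LOOPBACK);
}

/* Accept one connection on the listening socket s and close it at once
 * if the peer is not welcome. Returns the new socket, or -1. */
int accept_if_welcome(int s)
{
    struct sockaddr_in from;
    socklen_t fromlen = sizeof from;
    int newsock = accept(s, (struct sockaddr *)&from, &fromlen);

    if (newsock >= 0 && !peer_is_welcome(&from)) {
        close(newsock);
        return -1;
    }
    return newsock;
}
```

A server would call accept_if_welcome() in place of accept() after bind() and listen().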
When a connection is established, data may flow:
write(int s, void *buf, size_t count);
read(int s, void *buf, size_t count);
are used to transfer up to count bytes: write() sends bytes from the buffer buf into the socket s, and read() receives bytes from s into buf;
send(int s, char *msg, int len, int flags);
recv(int s, char *msg, int len, int flags);
allow options to be set by flags; these flags may be:
- MSG_OOB, for out of band data delivered to the user independently of normal data
- MSG_PEEK, for reading data and leaving it as still unread
- MSG_DONTROUTE, for sending data without routing packets
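The effect of MSG_PEEK can be sketched as follows. The helper peek_then_read() is a hypothetical name, and the connected pair of sockets comes from socketpair() in the UNIX domain, used here only to avoid setting up a client and a server.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Write text into one end of a connected socket pair, look at it with
 * MSG_PEEK, then read it for real. Returns 1 when both calls saw the
 * same bytes, 0 when they differ, -1 on error. */
int peek_then_read(const char *text)
{
    int pair[2], same = -1;
    char peeked[64], got[64];
    size_t len = strlen(text);
    ssize_t n1, n2;

    if (len == 0 || len > sizeof peeked)
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) < 0)
        return -1;

    if (send(pair[0], text, len, 0) == (ssize_t)len) {
        n1 = recv(pair[1], peeked, sizeof peeked, MSG_PEEK); /* data stays queued */
        n2 = recv(pair[1], got, sizeof got, 0);              /* data is consumed */
        if (n1 > 0 && n2 > 0)
            same = (n1 == n2 && memcmp(peeked, got, (size_t)n1) == 0);
    }
    close(pair[0]);
    close(pair[1]);
    return same;
}
```

Because the peeked data is left as still unread, the second recv() returns the same bytes again.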
Once a socket is no longer needed it may be discarded:
close(s);
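Since read() and write() transfer up to count bytes, a single call on a stream socket may move fewer bytes than requested. Programs that need an exact amount usually loop. The two helpers below, write_all() and read_all(), are hypothetical names for a minimal sketch of that idea; they work on any descriptor, sockets included.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Keep calling write() until all count bytes have been handed to the
 * system. Returns 0 on success, -1 on error (errno is set by write()). */
int write_all(int s, const void *buf, size_t count)
{
    const char *p = buf;

    while (count > 0) {
        ssize_t n = write(s, p, count);
        if (n < 0) {
            if (errno == EINTR)
                continue;            /* interrupted by a signal: retry */
            return -1;
        }
        p += n;
        count -= (size_t)n;
    }
    return 0;
}

/* Keep calling read() until count bytes have arrived or the peer closes
 * the connection. Returns the number of bytes read, or -1 on error. */
ssize_t read_all(int s, void *buf, size_t count)
{
    char *p = buf;
    size_t total = 0;

    while (total < count) {
        ssize_t n = read(s, p + total, count - total);
        if (n < 0) {
            if (errno == EINTR)
                continue;            /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            break;                   /* end of stream */
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

A return value from read_all() smaller than count means the other end closed the connection early.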
Client-Server Examples
Now that we know the basic calls to work with sockets, let's get practical with a few complete examples. The first two listings are a simple datagram server and client. You may compile them with these two commands (rabbit> is my command prompt, gcc is the GNU C Compiler):
rabbit> gcc -o dataserver dataserver.c
rabbit> gcc -o dataclient dataclient.c
rabbit>
The compiler will create two programs named dataserver and dataclient. When you launch dataserver, it prints out the port number on which it's listening and waits quietly for a message:
rabbit> dataserver
Socket port #1089
When you launch dataclient you have to specify in the command line the server name and the port number:
rabbit> dataclient rabbit 1089
rabbit>
The client sends the message to the server rabbit on the port 1089 and exits; the server receives the message, prints it to the standard output and exits:
rabbit> dataserver
Socket port #1089
--> There was a time...
rabbit>
Listing 1: A Simple Datagram Server
dataserver.c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <string.h>   /* memset() */
#include <unistd.h>   /* read(), close() */
int main(void)
{
    int sock;
    socklen_t length;
    ssize_t n;
    struct sockaddr_in server;
    char msg[1024]; // buffer to store the message
We create a socket in the INTERNET domain (AF_INET), of type datagram (SOCK_DGRAM) and we choose the default protocol (0), in this case UDP. If the call fails, it returns a negative value and the program prints out an error message and exits.
    if ((sock = socket(AF_INET, SOCK_DGRAM, 0)) < 0) {
        perror("socket() call");
        exit(1);
    }
Then we set all the values necessary to bind a name to the socket. We specify again that we are in the INTERNET domain; we use the predefined value INADDR_ANY to specify that we want to receive messages from any network interface in the machine; we don't set any particular port number from which to receive messages but we let the system choose one for us. We could explicitly set the port number (we'll do it in Listing 3), paying attention to the port numbers already allocated to the well-known services. Note that port numbers under 1024 can be used by the root user only.
    memset(&server, 0, sizeof server); // clear the fields we don't set
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = INADDR_ANY;
    server.sin_port = 0;
We bind the address to the socket:
    if (bind(sock, (struct sockaddr *)&server, sizeof server) < 0) {
        perror("bind() call");
        exit(1);
    }
Because the system chose a port number for us, we have to find it and print it:
    length = sizeof(server);
    if (getsockname(sock, (struct sockaddr *)&server, &length) < 0) {
        perror("getsockname() call");
        exit(1);
    }
Numbers on the Internet are represented in big-endian byte order; ntohs() performs the conversion, where one is needed, from the network representation to the host's internal representation.
    printf("Socket port #%d\n", ntohs(server.sin_port));
Now we read the message from the socket, print it to the standard output, close the socket and exit the program:
    n = read(sock, msg, sizeof msg - 1);
    if (n < 0) {
        perror("read() call");
        exit(1);
    }
    msg[n] = '\0'; // make sure the text is terminated before printing
    printf("--> %s\n", msg);
    close(sock);
    exit(0);
}
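The byte-order conversion mentioned in Listing 1 can be made visible with a tiny experiment. The helper below, port_wire_bytes(), is a hypothetical name; it converts a port number with htons() and copies out the two bytes exactly as they sit in memory.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Convert port to network byte order and copy its two bytes, in memory
 * order, into bytes[0] and bytes[1]. */
void port_wire_bytes(uint16_t port, unsigned char bytes[2])
{
    uint16_t net = htons(port);
    memcpy(bytes, &net, 2);
}
```

For port 2907 (0x0B5B) the result is 0x0B followed by 0x5B on any host, because the network representation puts the most significant byte first. On a big-endian machine such as a PowerPC Mac htons() changes nothing, while on an Intel box it swaps the two bytes, which is why code that omits the conversion may work on one machine and fail on the other.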
Listing 2: A Simple Datagram Client
dataclient.c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>   /* exit(), atoi() */
#include <string.h>   /* memset(), memcpy() */
#include <unistd.h>   /* close() */
This is the message we'll send in the datagram:
#define MSG "There was a time..."
int main(int argc, char *argv[])
{
    int sock;
    struct sockaddr_in server;
    struct hostent *server_data;
    if (argc != 3) { // we need the server name and the port number
        fprintf(stderr, "usage: %s server port\n", argv[0]);
        exit(1);
    }
We create a socket as before: a datagram socket in the INTERNET domain with the default protocol:
    if ((sock = socket(AF_INET, SOCK_DGRAM, 0)) < 0) {
        perror("socket() call");
        exit(1);
    }
In the command line we specify the server name, not its address. gethostbyname() returns a structure containing the address of the server.
    if ((server_data = gethostbyname(argv[1])) == 0) {
        fprintf(stderr, "%s: unknown host\n", argv[1]);
        exit(2);
    }
We set the data necessary to send the message to the server: the server address, port and protocol family. Note the conversion from the host's number representation to the network representation, htons().
    memset(&server, 0, sizeof server); // clear the fields we don't set
    memcpy(&server.sin_addr, server_data->h_addr, server_data->h_length);
    server.sin_family = AF_INET;
    server.sin_port = htons(atoi(argv[2]));
Finally we send the message, close the socket and exit the program:
    if (sendto(sock, MSG, sizeof MSG, 0, (struct sockaddr *)&server, sizeof server) < 0)
        perror("sendto() call");
    close(sock);
    exit(0);
}
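Listing 2 looks up the server with gethostbyname(). On current systems the same lookup is often written with getaddrinfo(), which takes the host and the port as strings and returns ready-to-use socket addresses. The sketch below is an alternative offered as an assumption about what a present-day reader might prefer, not the method used in this article; resolve_ipv4() is a hypothetical helper.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

/* Look up an IPv4 datagram address for host and port (both strings) and
 * copy the first result into *out. Returns 0 on success, -1 on failure. */
int resolve_ipv4(const char *host, const char *port, struct sockaddr_in *out)
{
    struct addrinfo hints, *res;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;         /* IPv4 only, like the listings */
    hints.ai_socktype = SOCK_DGRAM;

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;
    memcpy(out, res->ai_addr, sizeof *out);  /* port is already in network order */
    freeaddrinfo(res);
    return 0;
}
```

The filled-in structure can be passed directly to sendto(), for example after resolve_ipv4("rabbit", "1089", &server).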
The next two examples show a stream server and a stream client. They are a little bit different from the previous ones. The server program waits for connections on a specific port number (2907) and doesn't terminate after receiving the first message but waits for the next connection. You may compile them with these two commands:
rabbit> gcc -o streamserver streamserver.c
rabbit> gcc -o streamclient streamclient.c
rabbit>
The compiler will create two programs: streamserver and streamclient. When you launch streamserver, it waits quietly for a connection on port 2907:
rabbit> streamserver
When you launch streamclient you have to specify only the server name in the command line:
rabbit> streamclient rabbit
rabbit>
The client makes a connection to the server rabbit on port 2907, sends the message, closes the connection and exits; the server accepts the connection, reads the message, prints it to the standard output and waits for another connection:
rabbit> streamserver
--> There was a STREAMING time ...
Ending connection
Listing 3: A Simple Stream Server
streamserver.c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <string.h>   /* memset() */
#include <unistd.h>   /* read(), close() */
int main(void)
{
    int sock, readn;
    struct sockaddr_in server;
    int msgsock;
    char msg[1024];
We create a socket in the INTERNET domain, of type stream (SOCK_STREAM) with the default protocol, in this case TCP:
    if ((sock = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
        perror("socket() call");
        exit(1);
    }
This time we choose a specific port for our server to listen on: 2907 is a number greater than 1024, so that we don't have to be root to execute the program, and it's not bound to a well-known service. Of course you can do as in the Listing 1 example; this is just to present you with an alternative. Like every port number stored in the address structure, it must be converted to the network representation with htons().
    memset(&server, 0, sizeof server); // clear the fields we don't set
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = INADDR_ANY;
    server.sin_port = htons(2907);
    if (bind(sock, (struct sockaddr *)&server, sizeof server) < 0) {
        perror("bind() call");
        exit(1);
    }
We mark the socket as ready to receive connections. Since several clients may attempt to connect, the system maintains a queue of pending connections; the listen() call initializes this queue and sets the maximum number of pending connections.
    listen(sock, 5);
We enter an infinite loop. The accept() call will take a pending connection request from the queue if one is available or block waiting for a request. When a request is accepted, a new socket is created. When the connection is closed by the client, the read() call returns zero and the socket is closed.
    for (;;) {
        if ((msgsock = accept(sock, NULL, NULL)) == -1) {
            perror("accept() call");
            continue;
        }
        do {
            memset(msg, 0, sizeof msg);
            // leave room for the terminating zero
            if ((readn = read(msgsock, msg, sizeof msg - 1)) < 0)
                perror("read() call");
            else if (readn == 0)
                printf("Ending connection\n");
            else
                printf("--> %s\n", msg);
        } while (readn > 0); // stop on end of connection or on error
        close(msgsock);
    }
Since we entered an infinite loop, the socket sock is never explicitly closed. All sockets are automatically closed when a process is killed or terminates normally.
    exit(0);
}
Listing 4: A Simple Stream Client
streamclient.c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <string.h>   /* memset(), memcpy() */
#include <unistd.h>   /* write(), close() */
This is the message we'll send over the connection:
#define MSG "There was a STREAMING time ..."
int main(int argc, char *argv[])
{
    int sock;
    struct sockaddr_in server;
    struct hostent *server_data;
    if (argc != 2) { // we need the server name
        fprintf(stderr, "usage: %s server\n", argv[0]);
        exit(1);
    }
We create a socket in the INTERNET domain, of type stream (SOCK_STREAM) with the default protocol:
    if ((sock = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
        perror("socket() call");
        exit(1);
    }
    memset(&server, 0, sizeof server); // clear the fields we don't set
    server.sin_family = AF_INET;
We retrieve the IP address of the server from its name:
    if ((server_data = gethostbyname(argv[1])) == 0) {
        fprintf(stderr, "%s: unknown host\n", argv[1]);
        exit(1);
    }
    memcpy(&server.sin_addr, server_data->h_addr, server_data->h_length);
    server.sin_port = htons(2907);
We initiate the connection and, after it has been established, we send the message:
    if (connect(sock, (struct sockaddr *)&server, sizeof server) < 0) {
        perror("connect() call");
        exit(1);
    }
    if (write(sock, MSG, sizeof MSG) < 0)
        perror("write() call");
The connection is closed by closing the socket. If a process persists in sending messages after the connection is closed, a SIGPIPE signal is sent to the process by the operating system. The process will then terminate unless the signal is handled, for example, with the signal() call.
    close(sock);
    exit(0);
}
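The SIGPIPE behavior described in Listing 4 can be sketched as follows. This is only an illustration under stated assumptions: write_after_peer_close() is a hypothetical helper, and it uses a UNIX-domain socketpair() so that the failure shows up on the very first write. On a TCP connection the error typically appears on a later write, once the system has learned that the peer is gone.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Ignore SIGPIPE, close one end of a connected pair and write to the
 * other. Returns the errno left by the failed write(), 0 if the write
 * unexpectedly succeeded, or -1 if the pair could not be created. */
int write_after_peer_close(void)
{
    int pair[2], err = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) < 0)
        return -1;

    signal(SIGPIPE, SIG_IGN);      /* the process survives; write() fails instead */
    close(pair[1]);                /* the peer goes away */
    if (write(pair[0], "x", 1) < 0)
        err = errno;               /* expected: EPIPE */
    close(pair[0]);
    return err;
}
```

With the signal ignored, the program can test for EPIPE and shut down the connection in an orderly way instead of being terminated.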
Good. Now we know how to write simple programs to deal with datagrams or streams. Let's complicate things a little. What if we need to process data while we are waiting for a datagram or a connection? Or what if we need to open more than one socket? The read() and accept() calls are blocking calls ... There are two solutions: a synchronous solution and an asynchronous one.
The synchronous solution is so called because we control how to multiplex input/output requests among multiple sockets, in a sort of polling policy. Listing 5 shows such a solution. A socket identifier can be treated as a file descriptor, so we can use the select() call. Its first argument is the number of descriptors to examine (the highest descriptor of interest plus one, or simply FD_SETSIZE). The next three arguments are pointers to sets of file descriptors: one for the descriptors on which we wish to read data, one for those on which we wish to write and one for exceptional conditions (e.g. out-of-band data). When we are not interested in a particular set we pass NULL. Each set is an fd_set structure, a bit mask with room for FD_SETSIZE descriptors. Two macros, FD_SET(fd,&mask) and FD_CLR(fd,&mask), add and remove the file descriptor fd in the set mask. Before use, each set must be zeroed with the FD_ZERO(&mask) macro. The last argument of select() points to a struct timeval holding a timeout in seconds and microseconds, the maximum time the select() call may last. If both fields are zero, the selection is like a poll, returning immediately; if the pointer is NULL, select() blocks until a descriptor is ready. On a successful return, the three sets indicate which file descriptors are ready to be read from, written to or have an exceptional condition pending. The macro FD_ISSET(fd,&mask) tests the status of a file descriptor in a set: it returns a non-zero value if fd is a member of mask, 0 otherwise.
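Before the full server, here is a minimal sketch of the select() mechanics on a single descriptor. The helper wait_readable() is a hypothetical name; it builds the read set, fills in both fields of the timeout and reports whether the descriptor became readable in time.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <unistd.h>

/* Wait until fd is readable or the timeout expires.
 * Returns 1 if fd is ready, 0 on timeout, -1 on error. */
int wait_readable(int fd, long seconds, long microseconds)
{
    fd_set readmask;
    struct timeval timeout;
    int ready;

    FD_ZERO(&readmask);
    FD_SET(fd, &readmask);
    timeout.tv_sec = seconds;
    timeout.tv_usec = microseconds;

    /* The first argument is the highest descriptor to check, plus one. */
    ready = select(fd + 1, &readmask, NULL, NULL, &timeout);
    if (ready < 0)
        return -1;
    return FD_ISSET(fd, &readmask) ? 1 : 0;
}
```

Calling wait_readable(sock, 0, 0) behaves like a poll and returns immediately, while wait_readable(sock, 2, 0) waits at most two seconds.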
You may compile the code of Listing 5 with this command:
rabbit> gcc -o selectserver selectserver.c
rabbit>
The compiler will create the program selectserver. When you launch selectserver, it waits quietly for a connection on port 2907 or on port 2908:
rabbit> selectserver
As clients, you can use two copies of the code in Listing 4: one connecting to port 2907 (dataclient1) and one modified to connect to port 2908 (dataclient2):
rabbit> dataclient1 rabbit
rabbit> dataclient2 rabbit
rabbit>
The server accepts connections on the two ports and, when it is not serving a client, prints the message "Do something else".
Listing 5: A Synchronous Stream Server
selectserver.c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <string.h>   /* memset() */
#include <unistd.h>   /* read(), close() */
int main(void)
{
    int sock1, sock2, readn;
    struct sockaddr_in server;
    int msgsock;
    char msg[1024];
    fd_set readmask;
    struct timeval timeout;

    if ((sock1 = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
        perror("socket() call 1");
        exit(1);
    }
    memset(&server, 0, sizeof server);
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = INADDR_ANY;
    server.sin_port = htons(2907); // network byte order
    if (bind(sock1, (struct sockaddr *)&server, sizeof server) < 0) {
        perror("bind() call 1");
        exit(1);
    }

    if ((sock2 = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
        perror("socket() call 2");
        exit(1);
    }
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = INADDR_ANY;
    server.sin_port = htons(2908); // network byte order
    if (bind(sock2, (struct sockaddr *)&server, sizeof server) < 0) {
        perror("bind() call 2");
        exit(1);
    }

    listen(sock1, 5);
    listen(sock2, 5);

    for (;;) {
        // select() modifies the set and may modify the timeout,
        // so both are rebuilt at every iteration
        FD_ZERO(&readmask);
        FD_SET(sock1, &readmask);
        FD_SET(sock2, &readmask);
        timeout.tv_sec = 2;
        timeout.tv_usec = 0;
        if (select(FD_SETSIZE, &readmask, NULL, NULL, &timeout) < 0) {
            perror("select() call");
            continue;
        }
        if (FD_ISSET(sock1, &readmask)) {
            msgsock = accept(sock1, NULL, NULL);
            if (msgsock == -1)
                perror("accept() call 1");
            else {
                do {
                    memset(msg, 0, sizeof msg);
                    if ((readn = read(msgsock, msg, sizeof msg - 1)) < 0)
                        perror("read() call 1");
                    else if (readn == 0)
                        printf("Ending connection 1\n");
                    else
                        printf("2907 --> %s\n", msg);
                } while (readn > 0);
                close(msgsock);
            }
        }
        if (FD_ISSET(sock2, &readmask)) {
            msgsock = accept(sock2, NULL, NULL);
            if (msgsock == -1)
                perror("accept() call 2");
            else {
                do {
                    memset(msg, 0, sizeof msg);
                    if ((readn = read(msgsock, msg, sizeof msg - 1)) < 0)
                        perror("read() call 2");
                    else if (readn == 0)
                        printf("Ending connection 2\n");
                    else
                        printf("2908 --> %s\n", msg);
                } while (readn > 0);
                close(msgsock);
            }
        }
        printf("Do something else\n");
    }
    exit(0);
}
The code in Listing 6 is an example of an interrupt-driven socket. This example doesn't work on my MkLinux box, but works perfectly under RedHat Linux 6.0 on an Intel box; I'd like to know your results if you have occasion to try it on a different system (LinuxPPC, NetBSD...) :) . The signal SIGIO notifies a process when a socket has data waiting to be read. There are three steps that must be completed: installing a signal handler for the SIGIO signal with the signal() call; setting which process id has to receive the SIGIO signal with the fcntl() call; enabling the asynchronous notification with another fcntl() call. You may compile the code with this command:
rabbit> gcc -o interruptserver interruptserver.c
rabbit>
The compiler will create the program interruptserver. When you launch interruptserver, it waits quietly for a datagram, prints it and then exits. You can use the code in Listing 2 as a client program.
Listing 6: An Asynchronous Datagram Server
interruptserver.c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <string.h>   /* memset() */
#include <unistd.h>   /* read(), close(), getpid(), pause() */
int sock;
char msg[1024];
void io_handler(int signo);
int main(void)
{
    socklen_t length;
    struct sockaddr_in server;
We declare that after reception of the SIGIO signal the process must call the io_handler() function:
    signal(SIGIO, io_handler);
We open a datagram socket:
    sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0) {
        perror("socket() call");
        exit(1);
    }
    memset(&server, 0, sizeof server); // clear the fields we don't set
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = INADDR_ANY;
    server.sin_port = 0;
    if (bind(sock, (struct sockaddr *)&server, sizeof server) < 0) {
        perror("bind() call");
        exit(1);
    }
    length = sizeof(server);
    if (getsockname(sock, (struct sockaddr *)&server, &length) < 0) {
        perror("getsockname() call");
        exit(1);
    }
    printf("Socket port #%d\n", ntohs(server.sin_port));
We set ourselves as the process receiving SIGIO. The getpid() call returns the pid of this process, and the F_SETOWN command of fcntl() makes it the owner of the socket.
    if (fcntl(sock, F_SETOWN, getpid()) < 0) {
        perror("fcntl() F_SETOWN, getpid()");
        exit(1);
    }
We allow the receipt of asynchronous signals:
    if (fcntl(sock, F_SETFL, FASYNC) < 0) {
        perror("fcntl() F_SETFL, FASYNC");
        exit(1);
    }
    for (;;)
        pause(); // sleep until a signal arrives
}
This is the signal handler called after the reception of the SIGIO signal. It prints the message, closes the socket and terminates the program.
void io_handler(int signo)
{
    ssize_t n;
    (void)signo; // unused
    printf("I waz called!\n");
    n = read(sock, msg, sizeof msg - 1);
    if (n < 0) {
        perror("read() call");
        exit(1);
    }
    msg[n] = '\0'; // terminate the text before printing
    printf("--> %s\n", msg);
    close(sock);
    exit(0);
}
Conclusions
OK, that's all. I've tried to give you some basic and advanced notions about network programming with sockets, presenting code examples of datagram and stream communication, synchronous I/O multiplexing and interrupt-driven sockets. The choice between datagram and stream sockets should be made by carefully considering the application requirements in terms of semantics and performance. Datagrams are faster because they don't require a connection setup, but the complexity of the program can increase a lot if you can't tolerate lost or out-of-order messages. Stream connection setup takes longer and is often unnecessary for small amounts of data, but it can be the winning solution for reliable delivery of large amounts of data. Synchronous I/O multiplexing and interrupt-driven sockets are used in programs that need more than one communication channel or have to react quickly to asynchronous events such as user commands.
Now it's up to you, have fun!
Bibliography and References
If you are new to the Open Source concept and philosophy, you can start from two MacTech articles and some links there mentioned:
If you are new to Linux on Macintosh, again you can start from two MacTech articles and some links:
If you are new to networks you could start from:
- Tanenbaum, Andrew S. Computer Networks. 3rd edn. Prentice Hall.
Giovanna Pavarani is a Network and Software Engineer in the Central R&D Labs at Italtel S.p.A., Milano, Italy. Giovanna can be reached at pavarang@yahoo.com.