The Road to Code: Primary Objective
Volume Number: 23 (2007)
Issue Number: 12
Column Tag: Road to Code
Primary Objective
Introduction to Objective-C
by Dave Dribin
On the Road Again
Last month in The Road to Code, we had a little detour regarding new Leopard features relevant to developers. This month, we're back on the Road, and we're finally ready to tackle Objective-C, the native language for writing Mac OS X software.
Recap
You may recall that in my last code article, we wrote a reusable rectangle data structure, with the ability to represent geometric rectangles and calculate their area and perimeter. Listing 1 and Listing 2 show the final interface and implementation for our rectangle data structure.
Listing 1: Final rectangle.h
typedef struct
{
float leftX;
float bottomY;
float width;
float height;
} Rectangle;
void rectangleInitWithEdges(Rectangle * r,
float leftX, float bottomY, float rightX, float topY);
void rectangleSetRightX(Rectangle * r, float rightX);
float rectangleArea(Rectangle * r);
float rectanglePerimeter(Rectangle * r);
Listing 2: Final rectangle.c
#include "rectangle.h"
void rectangleInitWithEdges(Rectangle * r,
float leftX, float bottomY, float rightX, float topY)
{
r->leftX = leftX;
r->bottomY = bottomY;
r->width = rightX - leftX;
r->height = topY - bottomY;
}
void rectangleSetRightX(Rectangle * r, float rightX)
{
r->width = rightX - r->leftX;
}
float rectangleArea(Rectangle * r)
{
return r->width * r->height;
}
float rectanglePerimeter(Rectangle * r)
{
return (2*r->width) + (2*r->height);
}
rectangle.h contains the structure definition and type definition of a rectangle as well as function declarations that manipulate the structure. rectangle.c contains the definitions of the functions. Here is a simple example that uses this data structure:
Listing 3: main.c using rectangle data structure
#include <stdio.h>
#include <stdlib.h>
#include "rectangle.h"
int main(int argc, const char * argv[])
{
Rectangle * rectangle;
rectangle = malloc(sizeof(Rectangle));
rectangleInitWithEdges(rectangle, 5, 5, 15, 10);
printf("Area is: %.2f\n", rectangleArea(rectangle));
printf("Perimeter is: %.2f\n", rectanglePerimeter(rectangle));
rectangleSetRightX(rectangle, 20);
printf("Area is: %.2f\n", rectangleArea(rectangle));
printf("Perimeter is: %.2f\n", rectanglePerimeter(rectangle));
free(rectangle);
return 0;
}
The output of this program when run is:
Area is: 50.00
Perimeter is: 30.00
Area is: 75.00
Perimeter is: 40.00
Let's recap some of the design decisions that led us to this version of code. First, the rectangle code is separated into its own set of files: rectangle.h and rectangle.c. The header file, rectangle.h, is called the interface and the source file, rectangle.c, is called the implementation. This allows the rectangle code to be used by different parts of the program without repeating the implementation. To use the rectangle data structure, we just need to include the header, as we did in Listing 3. Also, all of the rectangle functions take a pointer to a Rectangle structure. This allows the functions, like rectangleSetRightX, to modify the structure. Even though functions like rectangleArea do not need to modify the structure, they take a pointer for consistency's sake.
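To see why the pointer matters, here is a minimal sketch that contrasts a setter taking the structure by value with one taking a pointer. The names Rect, setRightXByValue, and setRightXByPointer are made up for this illustration and are not part of rectangle.h.

```c
#include <assert.h>

/* Hypothetical stand-in for the article's Rectangle structure. */
typedef struct
{
    float leftX;
    float bottomY;
    float width;
    float height;
} Rect;

/* Receives a copy of the structure. The caller's rectangle is untouched;
   only the copy's new width is returned. */
float setRightXByValue(Rect r, float rightX)
{
    r.width = rightX - r.leftX;
    return r.width;
}

/* Receives the address of the caller's structure, so the change sticks. */
void setRightXByPointer(Rect * r, float rightX)
{
    r->width = rightX - r->leftX;
}

/* Exercises both setters on the same rectangle. */
void checkPointerVersusValue(void)
{
    Rect rect = { 5, 5, 10, 5 };

    setRightXByValue(rect, 20);
    assert(rect.width == 10);   /* the original is unchanged */

    setRightXByPointer(&rect, 20);
    assert(rect.width == 15);   /* the original is updated */
}
```

Calling checkPointerVersusValue() from main runs both cases; the assertions pass silently when the behavior matches the comments.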
Since all of the functions take pointers, it's best to use dynamic memory (malloc and free). Also, as a user of the data structure, we always use the functions and do not access the structure elements directly. This separation, where only the implementation uses the structure elements, is called encapsulation. Encapsulation is a design goal that provides us with future flexibility. It allows us to change the internal guts of the object (the structure elements) without affecting the users. I showed an example of this in the previous code article.
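As a reminder of what that flexibility looks like, here is a sketch (not the code from the previous article) in which the structure stores the right and top edges instead of a width and height. The type name EdgeRectangle and the edgeRectangle prefix are invented here so the example does not clash with rectangle.h. Callers use the same four operations and get the same answers.

```c
#include <assert.h>

/* Hypothetical alternative internals: edges instead of width and height. */
typedef struct
{
    float leftX;
    float bottomY;
    float rightX;
    float topY;
} EdgeRectangle;

void edgeRectangleInitWithEdges(EdgeRectangle * r,
    float leftX, float bottomY, float rightX, float topY)
{
    r->leftX = leftX;
    r->bottomY = bottomY;
    r->rightX = rightX;
    r->topY = topY;
}

void edgeRectangleSetRightX(EdgeRectangle * r, float rightX)
{
    r->rightX = rightX;
}

float edgeRectangleArea(EdgeRectangle * r)
{
    return (r->rightX - r->leftX) * (r->topY - r->bottomY);
}

float edgeRectanglePerimeter(EdgeRectangle * r)
{
    return (2*(r->rightX - r->leftX)) + (2*(r->topY - r->bottomY));
}

/* Same calls and same results as Listing 3, despite different internals. */
void checkEdgeRectangle(void)
{
    EdgeRectangle r;

    edgeRectangleInitWithEdges(&r, 5, 5, 15, 10);
    assert(edgeRectangleArea(&r) == 50);
    assert(edgeRectanglePerimeter(&r) == 30);

    edgeRectangleSetRightX(&r, 20);
    assert(edgeRectangleArea(&r) == 75);
    assert(edgeRectanglePerimeter(&r) == 40);
}
```

Because the caller in checkEdgeRectangle never touches the structure elements, nothing in it would need to change if the internals were swapped back.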
At the end of that article, I also said that this was actually object-oriented programming. What exactly is object-oriented programming? This doesn't look much different from the straight C code we've been writing so far. Object-oriented programming is a programming methodology that organizes the program into objects. Well, what are objects, then? At the simplest level, an object combines data and the functions that operate on that data into a nice little package. In procedural programming, where we started our journey, the data and functions are completely separate. You have structures that are only data, and you have functions that may or may not operate on that data. However, over the years, it was learned that combining the data and functions makes it easier to create and understand large programs. Object-oriented programming is an evolutionary step from procedural programming. It tries to take the lessons learned from procedural programming and make them easier to implement.
Object-Oriented Terminology
While many of the details of object-oriented programming are similar to the procedural programming we've covered so far, the powers that be decided at some point to create a bunch of new words for stuff that you already know. For starters, the term object-oriented is shortened to OO.
While objects are a collection of data and functions that operate on that data, the definition that describes a kind of object is called a class. Thus our Rectangle structure and functions would be called a Rectangle class in OO terminology. When you create a new Rectangle and use it, it's called an instance of that class. In main.c, the variable rectangle is an instance of the Rectangle class in OO parlance.
Moving on to rectangle.h. We called this the interface for the Rectangle class and this is still the case. The structure elements are called instance variables. This makes sense because each instance of an object has its own data. The functions that operate on the data are called methods. And instead of calling functions, in OO you call a method. Calling a method may also be referred to as sending a message. Take this line of code as an example:
printf("Area is: %.2f\n", rectangleArea(rectangle));
In C terminology, we are calling the rectangleArea function and printing out the return value. In OO terminology, we are calling the rectangleArea method on rectangle, an instance of the Rectangle class. Same thing, different name. Or alternatively, we are sending the rectangleArea message to the rectangle instance. The return value of the method may also be called the response to the message.
I'm going to switch over to OO mode and start using its terminology when I talk about Objective-C. But don't let these new words confuse you. You already know what they mean; you've been using them already!
On To Objective-C
For the rest of this article, we are going to rewrite our Rectangle class in Objective-C. Before proceeding on to Objective-C, I'd like to make a couple of minor tweaks to our C code, as shown in Listing 4 and Listing 5, that will make the transition to Objective-C more seamless. First, I want to use an underscore prefix on all structure elements. This reinforces the point that structure elements are private and should only be accessed via the implementation. The next change is to rename the first argument of all the functions to self. While the name is largely irrelevant, the reason for this will become evident when we re-implement it in Objective-C.
Listing 4: rectangle.h
typedef struct
{
float _leftX;
float _bottomY;
float _width;
float _height;
} Rectangle;
void rectangleInitWithEdges(Rectangle * self,
float leftX, float bottomY, float rightX, float topY);
void rectangleSetRightX(Rectangle * self, float rightX);
float rectangleArea(Rectangle * self);
float rectanglePerimeter(Rectangle * self);
Listing 5: rectangle.c
#include "rectangle.h"
void rectangleInitWithEdges(Rectangle * self,
float leftX, float bottomY, float rightX, float topY)
{
self->_leftX = leftX;
self->_bottomY = bottomY;
self->_width = rightX - leftX;
self->_height = topY - bottomY;
}
void rectangleSetRightX(Rectangle * self, float rightX)
{
self->_width = rightX - self->_leftX;
}
float rectangleArea(Rectangle * self)
{
return self->_width * self->_height;
}
float rectanglePerimeter(Rectangle * self)
{
return (2*self->_width) + (2*self->_height);
}
You'll notice that all of the functions now follow a similar pattern. First, all of them take a Rectangle * as the first argument. Second, all of the names start with rectangle. I've adopted both of these conventions to tie these functions to the Rectangle structure. By establishing this convention, anytime you see a function starting with rectangle you will know it operates on the Rectangle structure. And by operates, I mean it could perform some read-only calculation or modify it. This convention essentially ties together the functions and the data. It is this combination of function and data that makes our rectangle a class in the eyes of an OO programmer.
The problem, of course, is that this is just a convention. The C language provides no assistance to coding with these conventions. I could create a function that operates on the Rectangle structure, but does not begin with rectangle. Or I could create a function that begins with rectangle, but has nothing to do with the Rectangle structure. Both of these undermine the usefulness of the convention. Also, in a program that makes heavy use of rectangles, the programmer must deal with the long function names, all beginning with the same text. The fact that I can even write object-oriented code in C is a testament to the flexibility of the language; however, wouldn't it be nice if the language better supported these object-oriented coding conventions? Well, that's where object-oriented programming languages, such as Objective-C, come in. They are designed to make object-oriented programming easier.
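Both ways of breaking the convention compile without complaint. The sketch below uses hypothetical names (Rect, doubleTheWidth, rectangleCelsiusToFahrenheit) purely to show that C attaches no meaning to the rectangle prefix.

```c
#include <assert.h>

/* Hypothetical stand-in for the Rectangle structure of Listing 4. */
typedef struct
{
    float _leftX;
    float _bottomY;
    float _width;
    float _height;
} Rect;

/* Operates on a Rect, but the name gives no hint of that. */
void doubleTheWidth(Rect * self)
{
    self->_width = self->_width * 2;
}

/* Starts with "rectangle", yet has nothing to do with rectangles. */
float rectangleCelsiusToFahrenheit(float celsius)
{
    return celsius * 9 / 5 + 32;
}

/* The compiler accepts both functions; only the reader is misled. */
void checkConventionIsUnenforced(void)
{
    Rect r = { 5, 5, 10, 5 };

    doubleTheWidth(&r);
    assert(r._width == 20);
    assert(rectangleCelsiusToFahrenheit(100) == 212);
}
```

Nothing here is an error as far as C is concerned, which is exactly the weakness of relying on naming alone.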
There are many object-oriented programming languages, but Objective-C is one of the oldest, dating all the way back to 1986. It was licensed by NeXT in 1988 and then came to Apple when they acquired NeXT in 1996. As NeXTSTEP slowly morphed into Mac OS X, Objective-C remained the language of choice. The standard NeXTSTEP libraries for interacting with the system, for example drawing windows and menus, were renamed Cocoa. One nice feature of Objective-C is that it is a strict superset of C. This means that the Objective-C compiler can compile any C program. As traditional Mac OS software was written in C, this allowed a bit of a transition for existing programs. It also allowed programmers to access the C-based Unix underpinnings.
Because Objective-C is built on top of C, it also means we don't have to forget everything we have learned so far. In fact, much of what we learned is still relevant: the data types, control statements, loops, pointers, and dynamic memory allocation.
Hello Objective-C
So it's time to get started writing Objective-C code. I'm going to walk us through writing a rectangle object that is equivalent to the one we wrote in C. Then, you can compare the two implementations side-by-side. Let's start off by creating a new Xcode project.
Start Xcode and choose New Project... under the File menu. Select Foundation Tool under the Command Line Utility section and click Next. Enter objc_rectangle for the project name and choose a directory for your new project. Just like when we create a new C project (using Standard Tool), Xcode creates the main source file, along with some simple code. This code is also the "Hello World" Objective-C program. Remember that a "Hello World" application is the simplest possible application.
Listing 6: Xcode Objective-C template
#import <Foundation/Foundation.h>
int main (int argc, const char * argv[]) {
NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
// insert code here...
NSLog(@"Hello, World!");
[pool release];
return 0;
}
While this should look very similar to our C "Hello World" program, it does have some differences. It has the main function, just as before. All Objective-C programs start in main, just like C programs. The #import statement on the first line is very similar to the #include statement in C: it makes other functions and classes available to your application. The Foundation framework contains all the necessary functions and classes for Objective-C programs.
The NSLog function is very similar to printf, which displays output to the console. While we'll be using NSLog more later, we'll be using printf in Objective-C for the rest of this article.
The other two lines, which you must always have in any Objective-C application, have to do with the NSAutoreleasePool class. You must always begin the main function of an Objective-C program with the following line:
NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
You must also end any Objective-C program with this line, just before the return statement in the main function:
[pool release];
I'll cover the reason why we need these lines in a later article. For now, just remember that you will need them for your code to work properly.
One final observation before moving forward: the source file that Xcode created for you is named objc_rectangle.m. This file has a ".m" extension instead of the ".c" extension we have been using for C files. The ".m" extension tells the compiler that the code is written in Objective-C, and all of our Objective-C files will have this extension.
Recreating the Rectangle Class in Objective-C
Okay, it's time to recreate our rectangle class in Objective-C. Choose New File... from the File menu. Scroll down to the Cocoa section, select Objective-C class, and click Next. For the filename, enter Rectangle.m, with a capital "R". Make sure the checkbox to create Rectangle.h is checked, and click Finish. Replace the contents of Rectangle.h with Listing 7 and Rectangle.m with Listing 8. And finally, replace the contents of objc_rectangle.m with Listing 9.
Listing 7: Rectangle.h for Objective-C
#import <Foundation/Foundation.h>
@interface Rectangle : NSObject
{
float _leftX;
float _bottomY;
float _width;
float _height;
}
- (id) initWithLeftX: (float) leftX
bottomY: (float) bottomY
rightX: (float) rightX
topY: (float) topY;
- (void) setRightX: (float) rightX;
- (float) area;
- (float) perimeter;
@end
Listing 8: Rectangle.m for Objective-C
#import "Rectangle.h"
@implementation Rectangle
- (id) initWithLeftX: (float) leftX
bottomY: (float) bottomY
rightX: (float) rightX
topY: (float) topY
{
self = [super init];
if (self == nil)
return nil;
_leftX = leftX;
_bottomY = bottomY;
_width = rightX - leftX;
_height = topY - bottomY;
return self;
}
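- (void) setRightX: (float) rightX
{
_width = rightX - _leftX;
}
- (float) area
{
return _width * _height;
}
- (float) perimeter
{
return (2*_width) + (2*_height);
}
@end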
Listing 9: objc_rectangle.m
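#import "Rectangle.h"
int main (int argc, const char * argv[]) {
NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
Rectangle * rectangle = [Rectangle alloc];
rectangle = [rectangle initWithLeftX: 5 bottomY: 5 rightX: 15 topY: 10];
printf("Area is: %.2f\n", [rectangle area]);
printf("Perimeter is: %.2f\n", [rectangle perimeter]);
[rectangle setRightX: 20];
printf("Area is: %.2f\n", [rectangle area]);
printf("Perimeter is: %.2f\n", [rectangle perimeter]);
[rectangle release];
[pool release];
return 0;
}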
Before I explain what all this is, just run it, and make sure you get the same output as our C program. Now take a deep breath, because you're about to take the Red Pill into the world of Objective-C.
Let's look at the interface for our rectangle object in Rectangle.h. The first part of the file contains:
@interface Rectangle : NSObject
{
float _leftX;
float _bottomY;
float _width;
float _height;
}
This looks very similar to our structure definition. The @interface keyword tells the compiler that we are defining the interface of a new class called Rectangle. In fact, everything between @interface and @end is part of the interface. The ": NSObject" means that our class inherits from the NSObject class. We'll get to inheritance later, but every class needs to inherit from another class and NSObject is the one you should use for now. As an aside, most of Apple's classes start with the "NS" prefix. They do this to avoid name conflicts with other, similar classes. "NS" stands for NeXTSTEP and goes back to the pre-Apple Objective-C days.
After the @interface line come the instance variable declarations. Just like in a C structure, they are listed one per line between braces. In the Objective-C community, instance variables are often known as ivars for short.
After the instance variables and until the @end keyword are the method declarations. Here's a simple one:
- (float) area;
Let's compare this to our C version:
float rectangleArea(Rectangle * self);
First, all of our method declarations begin with a dash ("-"), which marks a method that is called on an instance. Then comes the return type in parentheses, followed lastly by the name of the method. We don't need to prefix the name of the method with our class name because we're still in the @interface section of our Rectangle class. The Objective-C language allows multiple classes to have methods with the same name without conflict. This is one of the benefits using an OO language gives you over a non-OO language. It reduces the code clutter and makes the code easier to read. Calling this method in Objective-C uses this syntax:
Rectangle * rectangle;
float area;
// ...
area = [rectangle area];
Before explaining, let's compare this to the corresponding C code:
Rectangle * rectangle;
float area;
// ...
area = rectangleArea(rectangle);
As you can see, calling a method in Objective-C uses the square brackets. This syntax scares many people away, probably even Lisp hackers with their parentheses. But if you're going to program Cocoa on Mac OS X, you're just going to have to get used to this weird syntax. Trust me, you will get used to it. Just to clarify, the object you are calling a method on appears first, followed by the name of the method.
Now, let's compare the implementation of these methods. The Objective-C implementation is found in the Rectangle.m source file. It's important to note from the full code of Listing 8 that this file begins with the @implementation keyword. All method definitions must occur between this keyword and the @end keyword. Here's the Objective-C code for the area method:
- (float) area
{
return _width * _height;
}
And now the C code:
float rectangleArea(Rectangle * self)
{
return self->_width * self->_height;
}
In Objective-C, just like C, we repeat the signature of the method. The bodies are nearly identical, except the Objective-C code accesses the instance variables, _width and _height, without referencing the self pointer. This is another area where using an OO language reduces clutter.
Even though the self pointer is neither used nor passed explicitly, it's still there if you need it. It gets passed automatically. You can think of self as a hidden argument that gets passed to every method. While you don't need self to access instance variables, you will need it to call methods on your own instance. We'll see an example of this a bit later.
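One way to picture the hidden argument is to write out by hand what the language does for you. The sketch below is plain C and is only an analogy, not what the Objective-C compiler literally generates: the Rect type, its area function pointer, and the callArea helper are all invented for illustration.

```c
#include <assert.h>

typedef struct Rect Rect;
struct Rect
{
    float _width;
    float _height;
    /* Hypothetical "method" slot: the function to run for area. */
    float (*area)(Rect * self);
};

/* The method body. Here self is spelled out as an ordinary argument. */
float rectArea(Rect * self)
{
    return self->_width * self->_height;
}

/* Plays the role of the square brackets: the receiver is passed
   along as self without the caller writing it twice. */
float callArea(Rect * receiver)
{
    return receiver->area(receiver);
}

/* A method that calls another method on its own instance needs self. */
float rectDoubleArea(Rect * self)
{
    return 2 * callArea(self);
}

void checkHiddenSelf(void)
{
    Rect r = { 10, 5, rectArea };

    assert(callArea(&r) == 50);
    assert(rectDoubleArea(&r) == 100);
}
```

In this picture, callArea(&r) stands in for [rectangle area], and rectDoubleArea shows the one situation where a method has to mention self by name.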
Objective-C methods with arguments are a little different from other languages. Let's compare the method to set the right X-coordinate. Here's the Objective-C code:
[rectangle setRightX: 20];
And the corresponding C code:
rectangleSetRightX(rectangle, 20);
A colon is used to separate the method name and the argument value, as opposed to parentheses and commas. The implementation of this method should make sense, now:
- (void) setRightX: (float) rightX
{
_width = rightX - _leftX;
}
And for comparison, the C code:
void rectangleSetRightX(Rectangle * self, float rightX)
{
self->_width = rightX - self->_leftX;
}
No surprises this time. Again, we access the instance variables directly. This shows another reason why I like to use an underscore prefix on instance variables. It makes it clear which variables are instance variables and which are not. In the width calculation above, the underscore makes it clear that _leftX is an instance variable while rightX is not. The underscore prefix on instance variables is not required by Objective-C, though. It's just a convention I think makes the code easier to read. Sometimes you'll see other prefixes, like a lower case "m", e.g. mLeftX. This is often a convention used in C++ where instance variables are called member variables. In any case, the underscore has some benefits in advanced Objective-C usages, so I recommend you use that for now.
Where Objective-C syntax really starts getting a little strange is for methods with multiple arguments. We've only got one, so let's look at the interface declaration:
- (id) initWithLeftX: (float) leftX
bottomY: (float) bottomY
rightX: (float) rightX
topY: (float) topY;
Compare this to C:
void rectangleInitWithEdges(Rectangle * self,
float leftX, float bottomY, float rightX, float topY);
You'll notice that each argument has a label in front of it, along with the argument type and variable name. This is really noticeable when the method gets called:
rectangle = [rectangle initWithLeftX: 5
bottomY: 5
rightX: 15
topY: 10];
Compare this to C:
rectangleInitWithEdges(rectangle, 5, 5, 15, 10);
A label, which always ends in a colon, separates each argument. This makes the Objective-C code easier to read, in my opinion. It's much easier to see that 15 is the rightmost X-coordinate. In the C code, I'd have to go look at the header if I forgot what that argument was. This is very unusual, as most modern languages do not have any equivalent syntax for argument labels. The implementation of the method is fairly straightforward:
- (id) initWithLeftX: (float) leftX
bottomY: (float) bottomY
rightX: (float) rightX
topY: (float) topY
{
self = [super init];
if (self == nil)
return nil;
_leftX = leftX;
_bottomY = bottomY;
_width = rightX - leftX;
_height = topY - bottomY;
return self;
}
And the C code:
void rectangleInitWithEdges(Rectangle * self,
float leftX, float bottomY, float rightX, float topY)
{
self->_leftX = leftX;
self->_bottomY = bottomY;
self->_width = rightX - leftX;
self->_height = topY - bottomY;
}
For a moment, ignore the first three lines and the last line in the Objective-C method. The middle four lines are nearly identical to the C code. The reason why we need the extra lines in the Objective-C version is that all methods that begin with the init prefix are special methods. They are called constructors (Cocoa documentation usually calls them initializers) and are always called directly after the alloc method to initialize new objects. It's worth noting that we are using the self pointer here. Remember I said that self is silently passed to all methods? Well, this is one case where we need to use it. The first three lines pertain to inheritance and are pretty much boilerplate code for all constructor methods. And constructors always return a pointer to self, hence the last line. That's also why we set rectangle to the return value of this method in the main function. This is an Objective-C idiom, as constructors in languages like C++, Java, and Ruby do not return themselves. This is another one of those "trust me for now" areas. This will all make more sense as you gain experience in Objective-C.
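The shape of that boilerplate can be mimicked in C. This is only an analogy under stated assumptions: the Rect type and the function name rectInitWithEdges are invented, and a plain NULL check stands in for the call to [super init]. It hints at why returning self is convenient: a failure can be reported as NULL, and allocation and initialization chain together in one expression.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the Rectangle structure of Listing 4. */
typedef struct
{
    float _leftX;
    float _bottomY;
    float _width;
    float _height;
} Rect;

/* Returns self on success, or NULL if there is nothing to initialize. */
Rect * rectInitWithEdges(Rect * self,
    float leftX, float bottomY, float rightX, float topY)
{
    if (self == NULL)
        return NULL;
    self->_leftX = leftX;
    self->_bottomY = bottomY;
    self->_width = rightX - leftX;
    self->_height = topY - bottomY;
    return self;
}

void checkInitReturnsSelf(void)
{
    /* Allocate and initialize in one step, as alloc followed by init does. */
    Rect * r = rectInitWithEdges(malloc(sizeof(Rect)), 5, 5, 15, 10);

    assert(r != NULL);
    assert(r->_width == 10);
    assert(r->_height == 5);
    free(r);

    /* A failed allocation (NULL) simply passes through. */
    assert(rectInitWithEdges(NULL, 5, 5, 15, 10) == NULL);
}
```

The caller only ever uses the returned pointer, which mirrors assigning the result of the init method back to rectangle.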
The final point to cover is dynamic memory usage in Objective-C. In C, we use malloc to allocate memory and free to deallocate it. In Objective-C, we use special methods called alloc to allocate memory and release to deallocate it. Compare the memory allocation in Objective-C:
Rectangle * rectangle;
rectangle = [Rectangle alloc];
To the memory allocation in C:
Rectangle * rectangle;
rectangle = malloc(sizeof(Rectangle));
These are basically the same except that alloc does not need to know how much memory to allocate, i.e. sizeof isn't needed. The reason for this is that alloc is a method of Rectangle, and it knows its own size. The compiler fills this in for us automatically.
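A common slip on the C side is applying sizeof to the pointer variable, which yields the size of a pointer instead of the size of the structure. One hedge against it is to wrap the allocation in a helper so that only one place knows the size, which is roughly the convenience alloc gives you. The Rect type and the helper name rectAlloc below are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the Rectangle structure of Listing 4. */
typedef struct
{
    float _leftX;
    float _bottomY;
    float _width;
    float _height;
} Rect;

/* The one place that knows how big a Rect is. */
Rect * rectAlloc(void)
{
    return malloc(sizeof(Rect));
}

void checkAllocSize(void)
{
    Rect * r = rectAlloc();

    assert(r != NULL);
    /* sizeof(*r) is the structure's size; sizeof(r) would be a pointer's. */
    assert(sizeof(*r) == sizeof(Rect));
    assert(sizeof(Rect) >= 4 * sizeof(float));
    free(r);
}
```

With the helper in place, callers write rectAlloc() and never mention sizeof at all, just as Objective-C callers write [Rectangle alloc].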
The deallocation code is also similar:
[rectangle release];
And in C:
free(rectangle);
The release method tells the system we are done with the object and to deallocate it. Okay, I'm telling a bit of a white lie. The release method in Objective-C is not quite the same as free in C. Memory management in Objective-C is a bit more complicated, but for a simple example like this, consider them equivalent. Memory management in Objective-C is too much to cover in this article.
Conclusion
Well, that covers the basics of Objective-C. This is probably our most difficult topic so far, due mainly to the strange syntax of Objective-C method calling. I suggest re-reading this column over and over again. In fact, you should put a copy underneath your pillow at night. Well, okay, that may be pushing it. But hopefully by having a side-by-side comparison of C code and Objective-C code, you will find the transition to Objective-C a bit easier.
Dave Dribin has been writing professional software for over eleven years. After five years programming embedded C in the telecom industry and a brief stint riding the Internet bubble, he decided to venture out on his own. Since 2001, he has been providing independent consulting services, and in 2006, he founded Bit Maki, Inc. Find out more at http://www.bitmaki.com/ and http://www.dribin.org/dave/.