Friday, 24 February 2012

C++ Basics







What does “::” mean anyway?
By Dave Mark, MacTech Magazine Regular Contributing Author

So far, this column has focused on Macintosh programming, with an emphasis on C. This month, we’re going to switch gears and talk about C++. With Addison-Wesley’s permission, I’ve cobbled together some bits and pieces from Learn C++ on the Macintosh that I thought might interest you. Unfortunately, there’s no way to cover all of C++ in a single column. If you would like to hear more about C++, send your comments and suggestions to Neil at any one of the editorial addresses on page 2 of this issue.

And now, some legal mumbo jumbo... Portions of this article were derived from Learn C++ on the Macintosh, ©1993, by Dave Mark, published by Addison-Wesley Publishing Co. And now back to our regularly scheduled program...


For the past few years, Apple (along with a host of other companies) has shifted away from procedural languages such as C and Pascal and made C++ its primary development language.

Why C++?


C++ is a superset of C and offers all the advantages of a procedural language. You can write a C++ program filled with for loops and if statements just as you would in C. But C++ offers much, much more. C++ allows you to create objects. An object is sort of like a struct on steroids. Understanding the value of objects is the key to understanding the popularity of C++.

Understanding Object Programming

As its name implies, an object programming language allows you to create, interact with, and delete objects. Just as a variable is based on a type definition, an object is based on a class definition. First, you’ll declare a class, then you’ll define an object (or many objects) said to belong to that class. While a C struct definition is limited to data elements only, a class definition can include both data elements (called data members) and functions (called member functions). To make this clearer, let’s start with a simple problem and see how we’d solve it using both structs and objects.

Our First Example


Suppose you wanted to implement an employee data base that tracked an employee’s name, employee ID, and salary. You might design a struct that looks like this:

/* 1 */
const short kMaxNameSize = 20;

struct Employee
{
 char   name[ kMaxNameSize ];
 long   id;
 float  salary;
};
You may have noticed the use of const instead of a #define. Believe it or not, const is part of the ANSI C standard and is not just found in C++. Though many C programmers prefer to use #defines to define a constant, C++ programmers generally prefer const. The major advantage of using const is that a typed constant is created. If you pass a const as a parameter to a function, for example, C++’s parameter checking will ensure that you are passing a constant of the correct type. A #define just does a simple text substitution in the preprocessor, before the compiler proper ever sees the code.
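
To see the typed-constant point in action, here’s a minimal sketch. The names kMaxNameSizeMacro, SizeOfArgument() and CompareConstants() are made up for this illustration. The two overloads simply report which type the compiler thinks it was handed.

```cpp
#include <cassert>
#include <cstddef>

#define kMaxNameSizeMacro 20        // text substitution: the compiler just sees the literal 20
const short kMaxNameSize = 20;      // a typed constant: an object of type short

// Two overloads, so the argument's type decides which one is called.
std::size_t SizeOfArgument( short value ) { return sizeof( value ); }
std::size_t SizeOfArgument( int value )   { return sizeof( value ); }

void CompareConstants( void )
{
    // The const keeps the type it was declared with...
    assert( SizeOfArgument( kMaxNameSize ) == sizeof( short ) );

    // ...while the macro expands to a bare literal, which the compiler treats as an int.
    assert( SizeOfArgument( kMaxNameSizeMacro ) == sizeof( int ) );
}
```

Call CompareConstants() from main() to run the checks.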

The great advantage of the struct declared above is that it lets you bundle several pieces of information together under a single name. For example, if you wrote a routine to print an employee’s data, you could write:

/* 2 */

struct Employee  newHire;
 •
 •
 •
PrintEmployee( newHire.name, newHire.id, newHire.salary );
On the other hand, it is more convenient to pass the data in its bundled form:

/* 3 */
PrintEmployee( &newHire );
Bundling allows you to represent complex information in a more natural, easily accessible form. In the C language, the struct is the most sophisticated bundling mechanism available. As you'll soon see, C++ takes bundling to a new level.
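
Here’s one way the bundled call might look when fleshed out. This is only a sketch: FormatEmployee() and BundledCall() are hypothetical names, and the routine formats the record into a buffer (rather than printing it) so the result is easy to check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

const short kMaxNameSize = 20;

struct Employee
{
    char   name[ kMaxNameSize ];
    long   id;
    float  salary;
};

// Takes the whole bundle through one pointer instead of three separate arguments.
// Writes the formatted record into a caller-supplied buffer.
int FormatEmployee( const Employee *employee, char *buffer, std::size_t bufferSize )
{
    return std::snprintf( buffer, bufferSize, "%s (#%ld): %.2f",
                          employee->name, employee->id, employee->salary );
}

void BundledCall( void )
{
    Employee  newHire;
    char      line[ 64 ];

    std::strcpy( newHire.name, "Dave Mark" );
    newHire.id = 1000;
    newHire.salary = 200.0f;

    FormatEmployee( &newHire, line, sizeof( line ) );
    assert( std::strcmp( line, "Dave Mark (#1000): 200.00" ) == 0 );
}
```

If a field is added to Employee later, the call site stays the same and only FormatEmployee() needs to change.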

When you write your employee management program using structs, you’ll naturally develop a series of functions to access and modify the fields in your various Employee structs. You’ll also develop some functions that manage and organize the structs themselves (i.e., linked list functions). Though this might be a subtle point, it’s important to note that these functions are in no way connected to the Employee structs. Most likely, each function will take a pointer to an Employee as a parameter, but that’s as far as the bundling gets.

Why bundle the functions with the data? Here’s one reason. Think about all the data that you want to be available to all of your Employee structs (a pointer to the head of your Employee linked list, for example). In C, you’d most likely declare these variables as globals, storing them with a bunch of other globals that have nothing to do with the Employee structs. Wouldn’t it be nice if you could bundle Employee linked list globals with the functions that manage your Employee linked lists? Then, you could bundle all your other globals with the functions that they belong with.

C does offer one mechanism to do this. You can define the appropriate globals in the same file with their related functions. This can work pretty well and is the best you can do in C. The trouble comes when you want to reference the global outside the file it is declared in (there’s always some exception) or if you’ve defined a function that needs to refer to globals from two different categories.
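
As a rough sketch of that same-file approach, the listing below keeps a list head next to the functions that use it. All of the names here (gEmployeeListHead, AddEmployee(), CountEmployees(), BuildList()) are hypothetical, and the Employee struct is trimmed down to keep things short.

```cpp
#include <cassert>
#include <cstddef>

struct Employee
{
    long       id;
    Employee  *next;
};

// Visible only inside this source file. It is "bundled" with the list
// functions below by convention, not by anything in the language.
static Employee *gEmployeeListHead = NULL;

void AddEmployee( Employee *employee )
{
    employee->next = gEmployeeListHead;     // push onto the front of the list
    gEmployeeListHead = employee;
}

long CountEmployees( void )
{
    long  count = 0;

    for ( Employee *current = gEmployeeListHead; current != NULL; current = current->next )
        count++;

    return count;
}

void BuildList( void )
{
    Employee  first  = { 1000, NULL };
    Employee  second = { 1001, NULL };

    gEmployeeListHead = NULL;
    AddEmployee( &first );
    AddEmployee( &second );

    assert( CountEmployees() == 2 );
    assert( gEmployeeListHead == &second );

    gEmployeeListHead = NULL;               // the locals are about to go out of scope
}
```

Notice that nothing stops an unrelated function in the same file from touching gEmployeeListHead, which is exactly the weakness described above.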

The point here is that C doesn’t naturally offer a mechanism that allows you to bundle functions and data. C++ does.

Bundling Data and Functions


Just as C bundles data together in a struct declaration, C++ bundles data and functions together in a class declaration. Here’s an example:

/* 4 */

const short kMaxNameSize = 20;

class Employee
{
public:   // class members are private by default; public lets outside code use them
// Data members...
 char   employeeName[ kMaxNameSize ];
 long   employeeID;
 float  employeeSalary;

// Member functions...
 void   PrintEmployee( void );
};
A class declaration is similar in form to a struct declaration. Notice that the keyword struct has been replaced by the keyword class. This example declares a class with the name Employee. The Employee class bundles together three data fields as well as a function named PrintEmployee(). As mentioned earlier, a class’s data fields are known as data members and a class’s functions are known as member functions.

Just as you’d use a struct declaration to define a struct variable, you’ll use your class declaration to define a variable known as an object.

It’s useful to be aware of the difference between a definition and a declaration. For a variable, the definition is the statement that actually allocates memory. For example, the statement:

short myShort;
is a definition. On the other hand, an extern reference to the same variable:

extern short myShort;
is a declaration, since this statement doesn’t cause any memory to be allocated. Here’s another example of a declaration:

typedef short MyType;
For a function, the function prototype is a declaration and the function implementation, complete with function code, is a definition.
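
Putting those pieces side by side may help. In this sketch, gCounter, NextValue() and UseDeclarations() are hypothetical names chosen for the example.

```cpp
#include <cassert>

typedef short MyType;            // declaration: introduces a type name, allocates nothing
extern MyType gCounter;          // declaration: promises that gCounter exists somewhere
MyType NextValue( void );        // declaration: a function prototype

MyType gCounter = 0;             // definition: storage for gCounter is allocated here

MyType NextValue( void )         // definition: the function, complete with its code
{
    return ++gCounter;
}

void UseDeclarations( void )
{
    gCounter = 0;
    assert( NextValue() == 1 );
    assert( NextValue() == 2 );
}
```

The declarations could be repeated in as many source files as you like. Each definition must appear exactly once.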

When you define a struct variable, you allocate a block of memory big enough to hold all the struct’s fields. When you define an object, you allocate a block of memory big enough to hold the object’s data members. In addition to the data members, the compiler will make sure that your object has access to all of the functions belonging to its class.

Creating an Object


There are two ways to create a new object. The simplest method is to define the object directly, just as you would a regular variable:

Employee employee1;
This definition creates an object named employee1 belonging to the Employee class. Figure 1 shows this definition, seen from a memory perspective. employee1 consists of a block of memory large enough to accommodate each of the three Employee data members, as well as a pointer to the single Employee function, PrintEmployee().



Figure 1. An Employee object, created by definition.

Note that the function pointer probably won’t be stored in the object itself. I’m just trying to show that the new object has access to the PrintEmployee() function.
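
If you’re curious, you can check this on your own compiler. The sketch below compares a data-only struct (EmployeeData, a name invented for this example) with a class that has the same data plus a member function. On the mainstream compilers I know of, the two come out the same size, though the language doesn’t strictly promise it.

```cpp
#include <cassert>

const short kMaxNameSize = 20;

struct EmployeeData             // data only
{
    char   employeeName[ kMaxNameSize ];
    long   employeeID;
    float  employeeSalary;
};

class Employee                  // the same data, plus a member function
{
public:
    char   employeeName[ kMaxNameSize ];
    long   employeeID;
    float  employeeSalary;

    void   PrintEmployee( void ) {}
};

void CompareSizes( void )
{
    // An ordinary member function typically adds nothing to each object:
    // the code lives in one place and is shared by every Employee.
    assert( sizeof( Employee ) == sizeof( EmployeeData ) );
}
```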

When you create an object by definition, as we did above, memory for the object is allocated, automatically, when the definition moves into scope. That same memory is freed up when the object drops out of scope.

For example, you might define an object at the beginning of a function:

/* 5 */

void CreateEmployee( void )
{
 Employee employee1;

 •
 •
 •
}
When the function is called, memory for the object is allocated, right along with the function’s other local variables. When the function exits, the object’s memory is deallocated.

[definition] Objects created by definition are known as automatic objects, because memory for them is allocated and deallocated automatically.

Although automatic objects are simple to create, they do have a downside. Once they drop out of scope, they cease to exist. If you want your object to outlive its scope, take advantage of C++’s new operator.

new is a lot like malloc() or the Toolbox call NewPtr(), though the syntax is a bit different. new takes a type instead of a number of bytes. Also, new (and its partner delete) is a built-in C++ operator, as opposed to a special library function.

First, define an object pointer, then call new to allocate the memory for your object. new returns a pointer to the newly created object. Here’s some code that creates an Employee object:

/* 6 */

Employee *employeePtr;

employeePtr = new Employee;
The first line of code defines a pointer designed to point to an Employee object. The second line uses new to create an Employee object. new returns a pointer to the newly created Employee.

Figure 2 shows what this looks like from a memory perspective. employeePtr is a pointer, pointing to an object of the Employee class. As was the case previously, the Employee object consists of a block of memory large enough to accommodate each of the three Employee data members, as well as a pointer to the single Employee function, PrintEmployee().



Figure 2. An object pointer, pointing to an object, pointing to some code.

Once again, this picture may not reflect the reality of your C++ compiler. The function pointer may not be stored with the object itself.

Suppose we create a second Employee object:

/* 7 */

Employee *employee1Ptr, *employee2Ptr;

employee1Ptr = new Employee;
employee2Ptr = new Employee;
Take a look at Figure 3. Notice that the second Employee object gets its own block of memory, with its very own copy of the Employee data members and its own function pointer. Notice also that both objects point to the same copy of PrintEmployee() in memory. Every single Employee object gets its own copy of the Employee data members. At the same time, all Employee objects share a single copy of the Employee member functions.



Figure 3. A second Employee, pointing to the same code.

Accessing an Object’s Data Members and Member Functions


Once you’ve created an object, you can call its functions and modify its data members. Remember, each object you create has its own copy of the data members defined by its class. You’ll refer to an object’s data members and member functions in much the same way as you’d refer to the fields of a struct. If you’ve defined the object directly, use the . operator:

/* 8 */

Employee employee1;

employee1.employeeSalary = 200.0;
If you’re working with an object pointer, use the -> operator:

/* 9 */

Employee *employeePtr;

employeePtr = new Employee;

employeePtr->employeeSalary = 200.0;
To call a member function, use the same technique. If the object was created automatically, you’ll use the . operator:

/* 10 */

Employee employee1;

employee1.PrintEmployee();
If the object was created using new, you’ll use the -> operator:

/* 11 */

Employee *employeePtr;

employeePtr = new Employee;

employeePtr->PrintEmployee();
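
Here’s a small sketch that pulls the four fragments above together. It assumes the members are declared public so that outside code can reach them, and it uses a hypothetical member function, MonthlySalary(), in place of PrintEmployee() so there is a value to check.

```cpp
#include <cassert>

class Employee
{
public:                          // assumption: public, so outside code can use the members
    long   employeeID;
    float  employeeSalary;

    float  MonthlySalary( void ) { return employeeSalary / 12; }
};

void AccessMembers( void )
{
    Employee   employee1;                   // automatic object: use .
    Employee  *employeePtr = new Employee;  // created with new: use ->

    employee1.employeeSalary = 1200.0f;
    employeePtr->employeeSalary = 2400.0f;

    // Each object has its own copy of the data, but both run the same function code.
    assert( employee1.MonthlySalary() == 100.0f );
    assert( employeePtr->MonthlySalary() == 200.0f );

    delete employeePtr;
}
```
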
The Current Object


In the previous examples, each reference to a data member or member function started with an object or object pointer. When you are inside a member function, however, the object or object pointer isn’t necessary.

For example, inside the PrintEmployee() function, you can refer to the data member employeeSalary directly, without referring to an object or object pointer:

/* 12 */

if ( employeeSalary <= 200 )
 cout << "Give this person a raise!!!";
This code is kind of puzzling. What object does employeeSalary belong to? After all, you’re used to saying:

myObject->employeeSalary

instead of just plain:

employeeSalary
The key to this puzzle lies in knowing which object spawned the call of PrintEmployee() in the first place. Although this may not be obvious, a call to a member function must originate with a single object.

Suppose you called PrintEmployee() from a non-Employee function (such as main()). You must start this call off with a reference to an object:

employeePtr->PrintEmployee();

Whenever a class function is called, C++ keeps track of the object used to call the function. This object is known as the current object.

In the call of PrintEmployee() above, the object pointed to by employeePtr is the current object. Whenever this call of PrintEmployee() refers to an Employee data member or function without using an object reference, the current object (in this case, the object pointed to by employeePtr) is assumed.

Suppose PrintEmployee() then called another Employee function. The object pointed to by employeePtr is still considered the current object. A reference to employeeSalary would still modify the current object’s copy of employeeSalary.

The point to remember is, a member function always starts up with a single object in mind. This object, which we’ve called the current object, is always of the same class as the function.
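
To make the current object a little less abstract, here’s a sketch in which one member function calls another. GiveRaise(), AnnualReview() and ReviewOneEmployee() are hypothetical names used only for this example.

```cpp
#include <cassert>

class Employee
{
public:
    float  employeeSalary;

    void   GiveRaise( float amount );
    void   AnnualReview( void );
};

void Employee::GiveRaise( float amount )
{
    employeeSalary += amount;    // the current object's copy of employeeSalary
}

void Employee::AnnualReview( void )
{
    GiveRaise( 50.0f );          // no object named, so the current object is assumed
}

void ReviewOneEmployee( void )
{
    Employee  first, second;

    first.employeeSalary = 200.0f;
    second.employeeSalary = 200.0f;

    first.AnnualReview();        // first is the current object for this call and the nested one

    assert( first.employeeSalary == 250.0f );
    assert( second.employeeSalary == 200.0f );   // second was never the current object
}
```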

The “This” Alternative


In the pursuit of legible code, C++ provides a generic object pointer, available inside any member function, that points to the current object. The generic pointer has the name “this”. For example, inside every Employee function, the line:

this->employeeSalary = 400;
is equivalent to this line:

employeeSalary = 400;
You don’t have to use this, but it does make the code a little easier to read. If you refer to a data member or function using this, it is quite clear that the data member or function is part of the class, and not a local or global variable.

[By the way] Another benefit of this occurs when you declare a local variable with the exact same name as a data member. For example, suppose PrintEmployee() declared a local variable (or had a parameter) named employeeSalary. When employeeSalary comes up in the code, which does it refer to, the local or the data member? As it turns out, the local variable (or parameter) wins out in case of a conflict, but you can avoid the conflict altogether by either using this or by naming your variables more carefully.
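
Here’s a sketch of that naming conflict. SetSalary(), IsRaise() and ShadowedNames() are hypothetical. In both member functions the parameter has the same name as the data member, so the bare name means the parameter and this-> is needed to reach the data member.

```cpp
#include <cassert>

class Employee
{
public:
    float  employeeSalary;

    void   SetSalary( float employeeSalary );
    bool   IsRaise( float employeeSalary );
};

void Employee::SetSalary( float employeeSalary )
{
    // Plain employeeSalary is the parameter; this-> reaches the data member.
    this->employeeSalary = employeeSalary;
}

bool Employee::IsRaise( float employeeSalary )
{
    // Compares the parameter (left) with the current object's data member (right).
    return employeeSalary > this->employeeSalary;
}

void ShadowedNames( void )
{
    Employee  employee1;

    employee1.SetSalary( 200.0f );
    assert( employee1.employeeSalary == 200.0f );
    assert( employee1.IsRaise( 400.0f ) );
    assert( !employee1.IsRaise( 100.0f ) );
}
```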

Deleting an Object


As we mentioned earlier, objects created by definition are created and deleted automatically. For example, suppose the Employee function PrintEmployee() defined its own Employee object, right at the beginning of the function:

Employee localEmployee;
localEmployee is created, automatically, at the beginning of PrintEmployee(), and is deleted as soon as PrintEmployee() exits.

Non-automatic objects are another story altogether. If you create an object with new, you’ll delete the object yourself by using the delete operator. Here’s the syntax:

/* 13 */

Employee *employeePtr;

employeePtr = new Employee;

delete employeePtr;
As you’d expect, delete deletes the specified object, freeing up any memory allocated for the object. Note that this freed up memory only includes memory for the actual object and does not include any extra memory you may have allocated. You’ll have to free up that memory yourself.

Writing Class Functions


Once your class is declared, you’re ready to write your class’s member functions. Member functions behave in much the same way as ordinary functions, with a few small differences. One difference, pointed out earlier, is that a member function automatically has access to the data members and functions of the object that called it.

Another difference lies in the function implementation’s title line. Here’s a sample:

/* 14 */

void  Employee::PrintEmployee( void )
{
 cout << "Employee Name:   " << employeeName << "\n";
}
Notice that the function name is preceded by the class name and two colons. This notation is mandatory and tells the compiler that this function is a member of the specified class.
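
One way to see why the class name matters: two classes can each declare a function with the same name, and the :: prefix tells the compiler which one you are implementing. In this sketch, the Manager class, Describe() and CallBoth() are hypothetical.

```cpp
#include <cassert>
#include <cstring>

class Employee
{
public:
    const char  *Describe( void );
};

class Manager
{
public:
    const char  *Describe( void );
};

// Same function name, two classes. The class name and two colons say
// which class each implementation belongs to.
const char *Employee::Describe( void )
{
    return "employee";
}

const char *Manager::Describe( void )
{
    return "manager";
}

void CallBoth( void )
{
    Employee  employee;
    Manager   manager;

    assert( std::strcmp( employee.Describe(), "employee" ) == 0 );
    assert( std::strcmp( manager.Describe(), "manager" ) == 0 );
}
```

Leave off the Employee:: and the compiler would take Describe() for an ordinary function that belongs to no class at all.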

The Constructor Function


Typically, when you create an object, you’ll want to perform some sort of initialization on the object. For example, you might want to provide initial values for your object’s data members. The constructor function is C++’s built-in initialization mechanism.

The constructor function (or just plain constructor) is a member function that has the same name as the class. For example, the constructor for the Employee class is named Employee(). When an object is created, the constructor for that class gets called, automatically.

Consider this code:

/* 15 */

Employee *employeePtr;

employeePtr = new Employee;
In the second line, the new operator allocates a new Employee object, then immediately calls the object’s constructor. Once the constructor returns, a pointer to the new object is assigned to employeePtr.

This same scenario holds true for an automatic object:

Employee employee1;
As soon as the object is created, its constructor is called.

Here’s our Employee class declaration with the constructor declaration added in:

/* 16 */

const short kMaxNameSize = 20;

class Employee
{
public:   // lets code outside the class create and use Employee objects
// Data members...
 char   employeeName[ kMaxNameSize ];
 long   employeeID;
 float  employeeSalary;

// Member functions...
 Employee( void );
 void   PrintEmployee( void );
};
Notice that the constructor is declared without a return type. Constructors never return a value, which means a constructor has no way to hand a status code back to the code that created the object. This being the case, you won’t want to make calls that can fail inside your constructor. As an example, it’s not a good idea to allocate memory inside your constructor.

[definition] In general, an object’s constructor will initialize each of the object’s data members. The constructor will not make any calls that return a status, or that can fail. As your objects get more complex, you’ll want to move to two-stage construction.

With two-stage construction, you create an additional member function that you call after the constructor returns. Typically, this second routine takes the name I, followed by the class name. For example, the second-stage constructor for the Employee class would be named IEmployee().

This example creates an Employee object using two-stage construction:

/* 17 */

Employee *employeePtr;
short   objectStatus;

employeePtr = new Employee();

objectStatus = employeePtr->IEmployee();
Since IEmployee() can return a status, this is the perfect place to allocate memory, or perform any other initialization that has the potential of failing.

Here’s a sample constructor:

/* 18 */
Employee::Employee( void )
{
 employeeSalary = 200.0;
}
As mentioned earlier, the constructor is declared without a return value. This is proper form.
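
Here’s a short sketch showing the constructor being called for both kinds of object. The class is trimmed down to a single data member, and CreateBothWays() is a hypothetical name.

```cpp
#include <cassert>

class Employee
{
public:
    float  employeeSalary;

    Employee( void );
};

Employee::Employee( void )
{
    employeeSalary = 200.0f;
}

void CreateBothWays( void )
{
    Employee   employee1;                   // constructor called as the object is defined
    Employee  *employeePtr = new Employee;  // constructor called by new

    // Neither line above mentions the constructor, yet both objects were initialized.
    assert( employee1.employeeSalary == 200.0f );
    assert( employeePtr->employeeSalary == 200.0f );

    delete employeePtr;
}
```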

[By the way] Constructors are optional. If you don’t have any initialization to perform, don’t define one. When an object is created, the constructor is only called if it is included in the class declaration.

Adding Parameters to Your Constructor


If you like, you can add parameters to your constructor. Constructor parameters are typically used to provide initial values for the object’s data members. Here’s a new version of the Employee() constructor:

/* 19 */

Employee::Employee( char *name, long id, float salary )
{
 strcpy( employeeName, name );
 employeeID = id;
 employeeSalary = salary;
}
The constructor copies the three parameter values into the corresponding data members. The object that was just created is always the constructor’s current object. In other words, when the constructor refers to an Employee data member, such as employeeName or employeeSalary, it is referring to the copy of that data member in the newly created object.

Notice that this constructor used different names for a parameter and its corresponding data member. Some programmers prefer to use the same name, using this to keep things straight:

/* 20 */

Employee::Employee( char *employeeName, long employeeID,
 float employeeSalary )
{
 strcpy( this->employeeName, employeeName );
 this->employeeID = employeeID;
 this->employeeSalary = employeeSalary;
}
As you write your own code, pick a style you feel comfortable with and be consistent.

This line of code supplies the new operator with a set of parameters to pass on to the constructor:

employeePtr = new Employee( "Dave Mark", 1000, 200.0 );
[By the way] Notice that the parameter list was appended to the class name, making it look just like a function call. Don’t be fooled! This line of code specifies the parameters to be passed to the new object’s constructor function. It does not call the constructor directly. The constructor call happens behind the scenes and no return value is generated. Thought you’d like to know...

This line of code creates an automatic object using parameters:

Employee employee1( "Dave Mark", 1000, 200.0 );
As you might expect, this code creates an object named employee1, then calls the Employee constructor, passing it the three specified parameters.
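
Here’s a fuller sketch you can compile and try. It departs from the listings above in two ways, both my own assumptions: the name parameter is a const char * (so a string literal can be passed on current compilers), and the copy is bounded with strncpy() so that a long name gets truncated instead of overrunning employeeName.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

const short kMaxNameSize = 20;

class Employee
{
public:
    char   employeeName[ kMaxNameSize ];
    long   employeeID;
    float  employeeSalary;

    Employee( const char *name, long id, float salary );
};

Employee::Employee( const char *name, long id, float salary )
{
    // Bounded copy: leave room for, and always write, the terminating zero.
    std::strncpy( employeeName, name, kMaxNameSize - 1 );
    employeeName[ kMaxNameSize - 1 ] = '\0';
    employeeID = id;
    employeeSalary = salary;
}

void CreateWithParameters( void )
{
    Employee   employee1( "Dave Mark", 1000, 200.0f );
    Employee  *employeePtr =
        new Employee( "A name much longer than twenty characters", 1001, 300.0f );

    assert( std::strcmp( employee1.employeeName, "Dave Mark" ) == 0 );
    assert( employee1.employeeID == 1000 );

    // The long name was cut to fit the 20-byte buffer (19 characters plus the zero).
    assert( std::strlen( employeePtr->employeeName )
            == static_cast<std::size_t>( kMaxNameSize - 1 ) );

    delete employeePtr;
}
```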

Just for completeness, here’s the class declaration again, showing the new, parameterized constructor:

/* 21 */

class Employee
{
public:   // lets code outside the class create and use Employee objects
// Data members...
 char   employeeName[ kMaxNameSize ];
 long   employeeID;
 float  employeeSalary;

// Member functions...
 Employee( char *name, long id, float salary );
 void   PrintEmployee( void );
};
The Destructor Function


The destructor function is called automatically, just like the constructor. Unlike the constructor, however, the destructor is called when an object in its class is deleted. Use the destructor to clean up after your object before it goes away. For instance, you might use the destructor to deallocate any additional memory your object may have allocated.

The destructor function is named by a tilde character (~) followed by the class name. The destructor for the Employee class is named ~Employee(). The destructor has no return value and no parameters.

Here’s a sample destructor:

/* 22 */

Employee::~Employee( void )
{
 cout << "Deleting employee #" << employeeID << "\n";
}
If you created your object using new, the destructor is called when you call delete:

/* */
Employee *employeePtr;

employeePtr = new Employee;

delete employeePtr;
If your object was created automatically, the destructor is called just before the object is deleted. For example, if the object was declared at the beginning of a function, the destructor is called when the function exits.
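
To watch the constructor and destructor calls happen, you can count them. The sketch below uses a hypothetical counter, gLiveEmployees, that goes up in the constructor and down in the destructor. WatchLifetimes() is also a made-up name.

```cpp
#include <cassert>

static int gLiveEmployees = 0;       // hypothetical counter, only for observing the calls

class Employee
{
public:
    Employee( void );
    ~Employee( void );
};

Employee::Employee( void )
{
    gLiveEmployees++;
}

Employee::~Employee( void )
{
    gLiveEmployees--;
}

void WatchLifetimes( void )
{
    assert( gLiveEmployees == 0 );

    {
        Employee  automaticEmployee;
        assert( gLiveEmployees == 1 );
    }                                // scope ends: the destructor is called automatically
    assert( gLiveEmployees == 0 );

    Employee *employeePtr = new Employee;
    assert( gLiveEmployees == 1 );   // this object lives on until you delete it

    delete employeePtr;              // the destructor is called here
    assert( gLiveEmployees == 0 );
}
```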

[By the way] If your object was defined as a global variable, its constructor will be called at the beginning of the program and its destructor will be called just before the program exits. Yes, global objects are automatic and have scope, just like local objects.

Here’s an updated Employee class declaration showing the constructor and destructor:

/* 23 */

class Employee
{
public:   // lets code outside the class create and use Employee objects
// Data members...
 char   employeeName[ kMaxNameSize ];
 long   employeeID;
 float  employeeSalary;

// Member functions...
 Employee( char *name, long id, float salary );
 ~Employee( void );
 void   PrintEmployee( void );
};
[By the way] If you use two-stage initialization, check the return status of your extra initializer right away. If your request for additional memory fails, for example, you might want to delete the object you just created.

/* 24 */

Employee *employeePtr;

employeePtr = new Employee();

if ( employeePtr->IEmployee() == false )
 delete employeePtr;
Whether you use two-stage initialization or not, it’s a good idea to keep your constructor and destructor in sync. If you allocated extra memory, be sure your destructor has some way of knowing about it. For example, it’s good practice to initialize your pointers to null. If your destructor encounters a non-null pointer, it knows that additional memory has been allocated that must be deallocated.
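
Here’s a sketch of how these pieces might fit together. The notes data member and its size are hypothetical, and I’ve assumed that IEmployee() returns a bool. The constructor only sets the pointer to null, IEmployee() does the allocation that might fail, and the destructor releases whatever was allocated.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

class Employee
{
public:
    char  *notes;                 // hypothetical extra storage, allocated in stage two

    Employee( void );             // stage one: can't fail, just sets safe defaults
    ~Employee( void );
    bool   IEmployee( void );     // stage two: may fail, reports a status
};

Employee::Employee( void )
{
    notes = NULL;                 // the destructor can tell that nothing has been allocated
}

Employee::~Employee( void )
{
    delete [] notes;              // harmless if notes is still null
}

bool Employee::IEmployee( void )
{
    // The nothrow form of new returns null on failure instead of throwing.
    notes = new (std::nothrow) char[ 256 ];
    return notes != NULL;
}

void TwoStageCreate( void )
{
    Employee *employeePtr = new Employee();
    assert( employeePtr->notes == NULL );

    if ( employeePtr->IEmployee() == false )
    {
        delete employeePtr;       // second stage failed: throw the object away
        return;
    }

    assert( employeePtr->notes != NULL );
    delete employeePtr;           // destructor frees notes, then the object is freed
}
```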
