GENERIC<PROGRAMMING> Traits are useful, but when do you need their unusual
flexibility? If you want to use them, how can you avoid the drudgery
of manually traits-enabling reams of classes in existing
hierarchies? This article answers these questions in the context of
the same SmartPtr example we used last
time. In particular, watch for hierarchy-wide traits—a new,
killer-cool C++ technique that allows you to define traits not only
for individual types, but for entire hierarchies in a single shot.
BACK TO SMARTPTR The previous
column presented a smart pointer that can be used with
single-threaded code and multithreaded code, depending on how the
client instantiates it. Recall SmartPtr's
definition:
The RefCountingTraits
class template customizes SmartPtr to meet
the exact reference counting syntax and semantics that type T uses. If you need to use SmartPtr with single-threaded code, RefCountingTraits will do. Otherwise, you must
provide a separate traits class (MtRefCountingTraits) as the second template
argument. MtRefCountingTraits ensures
multithreading safety for the reference count manipulation.
Client code uses SmartPtr<Widget> for single-threaded
widgets and SmartPtr<Widget,
MtRefCountingTraits> for multithreaded widgets. It would
be that simple, were it not for the homework question at the end of
the previous installment: What's still a potential source of
inefficiencies in the multithreaded version of SmartPtr?
As many readers pointed out, the problem is that MtRefCountingTraits uses what's known as
class-level locking. Herb Sutter puts it in a plastic way:
Static lock, bad juju! (What a useful mnemonic.) Whenever you
perform a serialized operation like MtRefCountingTraits::Refer, class-level locking
locks all objects of type MtRefCountingTraits. This happens because lock_ is a static variable shared by all the
instances of MtRefCountingTraits.
This phenomenon might become a major source of time
inefficiencies if you have many threads that manipulate smart
pointers to widgets intensively. Threads that could have been
totally independent have to wait in line one after another whenever
they copy SmartPtr<Widget,
MtRefCountingTraits> objects.
Object-level locking is the technique that solves this
problem. To use object-level locking, make lock_ a regular (nonstatic) member of MtRefCountingTraits. (That's the only change to
MtRefCountingTraits needed.) This way
serialized code locks each object independently. The downside of
this approach is the increase in size that locks incur to each
object. Let's implement object-level locking for smart pointers to
Widgets.
TRAITS OBJECTS As soon as we try to graft object-level
locking onto SmartPtr, we stumble upon a
problem. Let's recall SmartPtr's
destructor definition:
As you see, there's no RCTraits object at all; SmartPtr's destructor calls Unrefer with static member syntax (RCTraits::Unrefer). SmartPtr uses the RCTraits traits class as the mere depository of
two static functions. Now the traits class has state, so we
start talking about storing a traits object. The obvious
place to store it is right in the SmartPtr
object, so let's modify the code accordingly.
Now each SmartPtr
object holds and uses a Lock object, which
is exactly what's needed. Different SmartPtr objects belonging to different threads
don't share any data and therefore don't have any operations to
synchronize. Problem solved.
However, SmartPtr grew bigger.
"Obviously," I hear you saying, "now each multithreaded SmartPtr has a Lock
object, which is what we wanted in the first place." However, not
only multithreaded SmartPtrs grew. The
single-threaded SmartPtr is bigger, too,
although it has no supplemental data at all (recall from the
previous column that RefCountingTraits has
no data members). Why? Because in C++ even empty objects have a
nonzero size. This rule enables much of the language to remain sound
(otherwise, for instance, how would you build arrays of zero-sized
objects?).
As sensible as this rule might be, in this case it works against
us. SmartPtr<Something,
RefCountingTraits<Something> > is larger than a mere
pointer to T, and it should not be. Now
the size of the single-threaded SmartPtr
is at least sizeof(T*) + 1, but often, due
to alignment and padding issues, it's around 2 *
sizeof(T*). If you have many single-threaded smart pointers,
the size overhead incurred might become significant—to say nothing
about the supplemental cost of passing SmartPtr by value.
Fortunately, the standard provides another rule that can be of
help regarding object sizes. It's known as the empty base
optimization. If a base class B of a
class D is "empty" (i.e., has no nonstatic
data members), the B subobject in a D object can have an effective size zero. This
doesn't break the language rules because the B subobject is "melted" within the D object; of course, as soon as you extract a
standalone B object it has, again, nonzero
size. Whether you get the empty base optimization or not depends
heavily on your compiler—it's an option, not a requirement.
Compilers such as Metrowerks' Code-Warrior 5.x and Microsoft Visual
C++ 6.0 do perform empty base optimization—an added incentive to try
taking advantage of this optimization.*
Applied to the previously listed SmartPtr code, the empty base optimization
suggests it is better for SmartPtr to
inherit from RCTraits than aggregate it.
This way, if RCTraits is empty, the
compiler can optimize away any slack space; if RCTraits is not empty, the situation becomes
much the same as with aggregation.
What kind of inheritance (private, protected, or public) should
we use? Well, let's not forget it's just about an optimization, not
a conceptual change. SmartPtr is
not an RCTraits by any means.
Therefore, the best choice is to use private inheritance.
This is an "inheritance hack" for optimizing
object size. Ironically, we're back to the double-colon call syntax
since RCTraits is now a base class of
SmartPtr.
Traits objects are needed whenever traits may have to hold state.
Traits objects can be part of other objects and/or passed around as
parameters. When a traits object can be empty, you might want to
consider using the inheritance hack to achieve optimal memory layout
of your object, assuming your compiler actually does the empty base
optimization.
Definition: A traits object is an instance of a traits
class.
INTERMEZZO Traits templates, traits classes, traits
objects . . . As we moved from mere static code generation to
entities with state, our means of expression evolved from the most
static (templates) to the much more dynamic (full-fledged traits
objects). Traits templates are exclusively a compile-time mechanism;
they vanish completely even before compilation ends. At the other
end of the spectrum, traits objects are living, dynamic entities
with state and behavior.
One further step to dynamism is to use polymorphic traits and
pointers, or references to traits objects, but that's certainly not
traits' charter anymore. Polymorphic traits are best spelled as the
Strategy design pattern.2
Use whichever traits mechanism suits your needs, and choose the
most static solution that you can. Prefer compile-time solutions to
runtime solutions. Compile time often means that code is better
checked at compile time (essential) and more efficient (doesn't
hurt). On the other hand, of course, dynamism is what brings spice
to life.
HIERARCHY-WIDE TRAITS Traits often apply not only to
individual types, but also to whole class hierarchies. For instance,
the reference counting strategy is usually the same within a class
hierarchy. It would be nice, then, to define traits that apply to a
whole class hierarchy at once without having to hand-code them into
each individual class. However, templates—on which traits build—are
agnostic of inheritance. What to do?
Maybe the first and foremost rule of good design is to be
flexible and not to get stuck with a single strategy. Solving a
design problem is like conquering a fortified castle: If one
strategy doesn't work well, it's best to try another. A bad strategy
might still solve the problem, but with higher costs than an
alternative approach.
Following this line of thought, let's regroup. The need is to
templatize a class template with something that is invariant
within a hierarchy of types. And guess what? Nested classes
(classes defined inside other classes) are invariant inside a
hierarchy, as long as you don't redefine them. Nested classes are
inherited like any other symbols. Looks like a path worth exploring.
For automating such inner type definition, let's put in place a
simple template:
Now say we have a hierarchy rooted in class
Shape (Fig. 1). To publish the hierarchy
root role that Shape has, you derive it
from Hierarchy<RootShape>, shown as
follows. The rest of the hierarchy remains the same.
If you want to protect against implicit (and
undesired) conversions from Shape to HierarchyRoot<Shape>, you define Shape like this:
In any case, the key achievement is that if you
write Rectangle::HierarchyId, you get
the same type as if you wrote Shape::HierarchyId. As long as a type deriving
directly or indirectly from Shape does not
redefine the symbol HierarchyId, that
symbol identifies a type that's invariant in the hierarchy.
Building a SmartPtr that uses
hierarchy-wide traits is just as simple as with regular traits. You
only have to replace T with T::HierarchyId, like so:
Now say you have two hierarchies in your
application: one rooted in Shape and
another one rooted in Widget. Just like
Shape, Widget
inherits HierarchyRoot<Widget> to
state its role of hierarchy root. You can now specialize RefCountingTraits for the two hierarchies in
this way:
That's it—the traits above dispatch correctly on
classes in the two hierarchies, even for types that you have not
defined yet. As the following two sections show, hierarchy-wide
traits are quite flexible.
PERSONALIZED HIERARCHY-WIDE TRAITS Simple traits offer
per-type specialization; hierarchy-wide traits offer per-hierarchy
specialization. Sometimes you might need something in between—a
traits template that you can define for a whole hierarchy, yet
specialize only for an isolated type or two in that hierarchy.
You can achieve that by defining your traits template as shown in
the following:
How does this traits template work? The client
code uses Traits<Shape>,
Traits<Circle>, etc. If you want to specialize the
traits for the whole Shape hierarchy, you
specialize HierarchyTraits<Shape::HierarchyId>. By
default, because Traits<T> inherits
HierarchyTraits<T::HierarchyId>, all
derivatives of Shape use the traits
defined in HierarchyTraits<T::HierarchyId>. (I'd bet
money you have a lot of fun following all these symbols. But it's
actually simple. HierarchyTraits refers to
the hierarchy, and Traits refers to each
type.)
If you want to specialize the traits for a specific class, say
Ellipse, you do this by specializing the
Traits template directly:
You can choose to derive or not from the HierarchyTraits<Ellipse> if you want just
to override a symbol or two, or not to derive at all if you want to
rewrite Ellipse's traits from scratch.
It's up to you.
A note about the use of inheritance with the traits just
mentioned: From the viewpoint of dynamic polymorphism, the fact that
Traits<T> inherits HierarchyTraits<T::HierarchyId> is wrong.
HierarchyTraits is not a polymorphic base
class. We use inheritance here with a different good reason—as a
symbol-propagating device; the intent is to have Traits<T> sip the bouillabaisse of symbols
that HierarchyTraits<T::HierarchyId>
defines. Inheritance has uses not only for dynamic polymorphism, but
also for compile-time type manipulation.
The Traits-HierarchyTraits idiom
presented in this section allows per-type specialization of
hierarchy traits. In the example discussed, Traits<Rectangle> and Traits<Circle> take you to HierarchyTraits<Shape::HierarchyId>, while
Traits<Ellipse> takes you to the
specialized Traits<Ellipse>—all this
with amazingly little scaffolding.
SUBHIERARCHY TRAITS Suppose that in your Shape hierarchy you decide to redefine traits
for a certain subhierarchy of that hierarchy, for instance, the
subhierarchy rooted in Bitmap (see Fig.
2). You will need therefore to specialize traits on Bitmap and all its derivatives, direct or
indirect.
You can do this by deriving Bitmap from
HierarchyRoot<Bitmap>, in addition
to deriving from Shape. Then you provide a
using declaration that
disambiguates the HierarchyId
symbol, like so:
Through the using
declaration, Bitmap gives preference to
the class HierarchyRoot<Bitmap>::HierarchyId over
Shape::HierarchyId. This way you can now
specialize some traits for Bitmap::HierarchyId and all its derivees will
use that specialization. Unless, of course, you decide to define a
new subhierarchy with distinct traits down the road.
CAVEAT EMPTOR The biggest disadvantage of hierarchy-wide
traits is that they require modifying the base class of the
hierarchy, and you sometimes are not in a position to do that.
Fortunately, you get a pretty clear compile-time error ("Class Widget does not define a type HierarchyId") instead of subtle runtime errors.
You can mitigate this problem to a certain extent by specializing
your hierarchy-wide traits template for an insipid type, like void. Then you use HierarchyTraits<void> for hierarchies that
you cannot modify. Not too flexible, but it works when you're stuck.
There are ways to create nonintrusive hierarchy-wide traits, but
such techniques are more fragile and exposed to various kinds of
errors. Ideas from readers are, as always, welcome.
CONCLUSION Traits objects are useful whenever the trait
must hold some state. If a trait class has optional state (some
traits have state, some others are empty) it's best to use the
"inheritance hack" to grab the empty base optimization when
applicable (and if available).
With only a tad of scaffolding, you can define hierarchy-wide
traits. This way you write the traits for a given hierarchy once and
only once. Hierarchy-wide traits offer considerable flexibility by
allowing you to "personalize" traits for specific classes in the
hierarchy. You also can define subhierarchy traits: traits that
apply to a specific subtree of the inheritance graph.
Hierarchy-wide traits make heretical use of inheritance, proving
that inheritance is not only a tool for runtime polymorphism, but
also for compile-time type manipulation. C++ mixes the two natures
of inheritance, which sometimes becomes confusing.
The best news of all, however, is that hierarchy-wide traits use
only simple template amenities, which means that you can use them
with your not-so-compliant compiler today.
Acknowledgment Many thanks to Herb Sutter, who took the
time to review this article and added insightful remarks.
References This article contains text and examples from Design with
C++ (tentative title), by Andrei Alexandrescu, © 2001 Addison
Wesley Longman (in press).
FOOTNOTE
Traits on
Steroids
Andrei
Alexandrescu
The April installment of
Generic<Programming>1
discussed traits templates and traits classes. This article takes
traits further by discussing traits objects and hierarchy-wide
traits.
template <class T, class RCTraits =
RefCountingTraits<T> >
class SmartPtr
{
...
};
class MtRefCountingTraits
{
static void Refer(Widget* p)
{
// serialize access
Sentry s(lock_);
p->AddReference();
}
static void Unrefer(Widget* p)
{
// serialize access
Sentry s(lock_);
if (p->RemoveReference() == 0)
delete p;
}
private:
static Lock lock_;
};
template <class T, class RCTraits =
RefCountingTraits<T> >
class SmartPtr
{
private:
T* pointee_;
public:
...
~SmartPtr()
{
RCTraits::Unrefer(pointee_);
}
};
template <class T, class RCTraits =
RefCountingTraits<T> >
class SmartPtr
{
private:
T* pointee_;
RCTraits rcTraits_;
public:
...
~SmartPtr()
{
rcTraits_.Unrefer(pointee_);
}
};
template <class T, class RCTraits =
RefCountingTraits<T> >
class SmartPtr : private RCTraits
{
private:
T* pointee_;
public:
...
~SmartPtr()
{
RCTraits::Unrefer(pointee_);
}
};
template <class T>
struct HierarchyRoot
{
// HierarchyId is a nested class
struct HierarchyId {};
};
Figure 1. A hierarchy of shapes.
class Shape : public HierarchyRoot<Shape>
{
...
};
class Rectangle : public Shape
{
...
};
class Shape : private HierarchyRoot<Shape>
{
...
public:
using HierarchyRoot<Shape>::HierarchyId;
};
template <class T, class RCTraits =
RefCountingTraits<typename T::HierarchyId> >
class SmartPtr
{
...
};
template <>
class RefCountingTraits<Shape::HierarchyId>
{
...
};
template <>
class RefCountingTraits<Widget::HierarchyId>
{
...
};
template <class HierarchyId>
class HierarchyTraits
{
... most general traits here ...
};
template <class T>
class Traits
: public HierarchyTraits<T::HierarchyId>
{
// empty body - inherits all symbols from base class
};
template <>
class Traits<Ellipse>
{
... specialized stuff for Ellipse ...
};
Figure 2. Hierarchy of shapes with a bitmap
subhierarchy.
class Bitmap : public Shape,
public HierarchyRoot<Bitmap>
{
...
public:
using HierarchyRoot<Bitmap>::HierarchyId;
};
* Conveniently enough, the latest
implementations of the standard C++ library for the mentioned
compilers take advantage of the empty base optimization, which is
applicable to containers. Every standard container aggregates an
allocator object, and the default allocator is usually an empty
class.