The meaning of RAII — or why you never need to worry about resource management again

I tried really hard to come up with some witty title or pun to weave into the title of this post. I couldn’t. RAII is just a ter­ri­ble name, and it isn’t really clever or funny. Unfor­tu­nately, it is also the sin­gle most impor­tant key to C++. It is not just an idiom but a fun­da­men­tal phi­los­o­phy used to solve almost any prob­lem in the lan­guage. So we can’t really avoid it.

If I had to pin­point one thing that marked the dif­fer­ence between a skilled and an unskilled C++ pro­gram­mer, it would be “do they under­stand RAII”. Many peo­ple don’t, hence this post.

RAII is, apart from being badly named, one of those decep­tively sim­ple con­cepts that you think you under­stand when you first hear of it, think “well duh, that’s obvi­ous”, and then pro­ceed to write code as usual, because you just don’t see how widely applic­a­ble it is.

But let’s get the name out of the way first. RAII stands for “Resource Acqui­si­tion Is Ini­tial­iza­tion”. And if you’re not already famil­iar with the idiom, then this has told you noth­ing at all. If you did know about RAII in advance, then you can, when you stop and think about it, kind of see how the name relates to it… vaguely… sort of.

What it actu­ally means is sim­ple: Resources should be man­aged by classes. When the class is ini­tial­ized, the resource is acquired (hence the name). When the class is destroyed, the resource is released. And the life­time of the object should exactly match the desired life­time of the resource. That sounds obvi­ous, and many pro­gram­mers will (assum­ing they’re work­ing in a lan­guage that has classes), say that this is what they always do.

Often, C++ devel­op­ers think this just means “smart point­ers. Wrap your mem­ory allo­ca­tion in a boost::shared_ptr and you’re done”. I see that as one not-very-often used bor­der case though, rather than a typ­i­cal exam­ple of RAII. So let’s take a step back instead.

The key idea is that any kind of resource, not just memory, but file handles, sockets, database connections, or even more abstract resources like loggers or profiling timers or textures, really any concept or process which has a lifetime, should be mapped to an object.

Unlike the typical object-oriented line of thought which goes that “everything must be an object, because then… well, everything will be an object, and your code will be better”, here we actually have a concrete reason: We want to use the object to manage the lifetime of the resource.
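To make the “abstract resource” idea a bit more concrete, here is a minimal sketch of a profiling timer whose lifetime is tied to an object. The names ScopedTimer, timing_log and sum_to are made up for this illustration, and a real profiler would likely report its results differently.

```cpp
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: collects one line per finished measurement so the
// result can be inspected. A real profiler would probably write to a log.
std::vector<std::string>& timing_log() {
    static std::vector<std::string> log;
    return log;
}

// The "resource" is a measurement in progress. It begins when the object is
// created and ends when the object is destroyed.
class ScopedTimer {
public:
    explicit ScopedTimer(std::string label)
        : label_(std::move(label)), start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        auto elapsed = std::chrono::steady_clock::now() - start_;
        auto us = std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
        timing_log().push_back(label_ + " took " + std::to_string(us) + " us");
    }

    // A measurement cannot be meaningfully copied, so copying is disabled.
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};

long long sum_to(int n) {
    ScopedTimer timer("sum_to");   // measurement starts here
    long long total = 0;
    for (int i = 1; i <= n; ++i) total += i;
    return total;                  // timer is destroyed here and records the time
}
```

Nothing in sum_to has to remember to stop the timer. Leaving the function, by whatever route, ends the measurement.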

When I allo­cate mem­ory with new, I have to deal­lo­cate it again sooner or later, with delete. (Or in C, with malloc() and free() respec­tively). And I have to make sure that this is done. And I have to make sure that it is not done twice. And that the object is not accessed after this is done. There are a lot of con­straints we have to obey, all related to the life­time of the resource. And this is why unman­aged pro­grams have a rep­u­ta­tion of leak­ing mem­ory left and right. If we allo­cate mem­ory, and it is to be used by a dynamic num­ber of objects or func­tions all ref­er­enc­ing the same allo­ca­tions, which of the users is respon­si­ble for delet­ing it? And how do we know when it is safe to delete, when no users remain?

Iron­i­cally, most man­aged lan­guages have not solved the prob­lem. They have added a garbage col­lec­tor (which yes, is very use­ful for a wide num­ber of rea­sons), but that only solves one spe­cific instance of the prob­lem. It takes care of avoid­ing mem­ory leaks, but it doesn’t avoid resource leaks in gen­eral.

The garbage col­lec­tor ensures that this code won’t leak memory:

void foo() {
  SomeObject* obj = new SomeObject();
  bar(obj);
}

where with­out a garbage col­lec­tor, we’d (at least with­out RAII) have to write code such as

void foo() {
  SomeObject* obj = new SomeObject();
  try {
    bar(obj);
    delete obj;
  }
  catch(...){ delete obj; throw; } // clean up, then let the exception continue
}

In the garbage col­lected case, we don’t know what bar does, and we don’t need to know. It doesn’t have to delete the object. And nei­ther does the foo func­tion. So we have suc­cess­fully dodged the prob­lem of man­ag­ing the life­time of mem­ory allo­ca­tions. We haven’t really solved the prob­lem though. We still don’t have any good tools to man­age the life­time. We’re just guar­an­teed by the sys­tem that it’ll last long enough.

In C++, this effect can be approximated using some kind of smart pointer (see footnote 1).

Smart point­ers allow us to write code like this:

void foo() {
  boost::shared_ptr<SomeObject> ptr(new SomeObject());
  bar(ptr);
}

and be sure we won’t leak mem­ory. Of course, this solu­tion isn’t per­fect — ref­er­ence count­ing is much more expen­sive than a good garbage col­lec­tor, and if we cre­ate cyclic ref­er­ences, the objects will never be deleted, as the ref­er­ence counts never reach zero. It is a decent approx­i­ma­tion, but nowhere near as good and reli­able as the garbage col­lec­tor in man­aged languages.
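The cyclic reference problem is easy to reproduce. The following is a minimal sketch using std::shared_ptr, the standard library counterpart of boost::shared_ptr. The Node type and the two helper functions are invented for the illustration, and the std::weak_ptr is only there so that the demonstration can clean up after itself.

```cpp
#include <memory>

// Hypothetical node type that counts how many instances currently exist.
struct Node {
    static int alive;
    std::shared_ptr<Node> next;
    Node() { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Two nodes, one link: both are destroyed once the last shared_ptr is gone.
int leaked_by_chain() {
    int before = Node::alive;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
    }
    return Node::alive - before;
}

// Two nodes pointing at each other: each keeps the other's count above zero.
int leaked_by_cycle() {
    int before = Node::alive;
    std::weak_ptr<Node> observer;   // observes a node without owning it
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;                // this closes the cycle
        observer = a;
    }
    int leaked = Node::alive - before;   // both nodes are still alive here

    // Break the cycle by hand so that the demonstration itself does not leak.
    if (auto a = observer.lock()) a->next.reset();
    return leaked;
}
```

In this sketch leaked_by_chain() reports 0 surviving nodes and leaked_by_cycle() reports 2, even though no named shared_ptr to either node remains in scope.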

But the prob­lem shows up again if we use another type of resource. What if we’d opened a data­base con­nec­tion instead? We’d have to write code such as this: (The fol­low­ing Java-like pseudocode is copied almost ver­ba­tim from this StackOverflow.com answer, cour­tesy of Mar­tin York.)

void writeToDb()
{
  Db db = new Db("DBDescriptionString");
  try
  {
    // Use the db object.
  }
  finally
  {
    db.close();
  }
}

(And of course it gets even worse if db.close() can throw excep­tions. Then we have to catch that excep­tion, just to avoid it prop­a­gat­ing out from the finally clause if we reached finally because of an excep­tion being thrown in the try clause.)

The resource man­age­ment prob­lem still exists. We still have to wrap the code in excep­tion han­dling just to make sure that the con­nec­tion is closed as soon as we’re done with it. And we have to do this at every use. And it gets com­pli­cated fast.

Of course, .NET makes this a bit simpler:

using (Db db = new Db("DbDescriptionString"))
{
  // use the database object.
}

But the onus is still on the user of the class to ensure it is closed cor­rectly. There is no obvi­ous way to encode into the Db class that “once we’re done with an object of this type, the con­nec­tion must be closed immediately”.

And in C++, smart point­ers are no longer suit­able solu­tions, since the resource to be man­aged is no longer a pointer allo­cated with new.

Instead, a more basic fla­vor of RAII comes to the fore:

void someFunc()
{
    Db db("DBDescriptionString");
    // Use the db object.
} 

Yes, that’s all. When the db object goes out of scope, at the end of the func­tion, its destruc­tor is called. The destruc­tor inter­nally calls this->Close() for us, so we don’t need to do it! We just have to trust the scop­ing rules of C++, which guar­an­tee that destruc­tors are called on local vari­ables when they go out of scope, and on class mem­bers when the class is destroyed.
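For illustration, a Db class along these lines might look roughly like the sketch below. This is not a real database API: the db_events log stands in for actually opening and closing a connection, and the fail parameter on someFunc is added only so that the exception path can be exercised.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for a real database library, so that the sketch is self-contained.
std::vector<std::string>& db_events() {
    static std::vector<std::string> events;
    return events;
}

class Db {
public:
    // Acquire the resource when the object is initialized.
    explicit Db(const std::string& description) : description_(description) {
        db_events().push_back("open " + description_);
    }

    // Release it when the object is destroyed.
    ~Db() { Close(); }

    // One connection, one owner: copying is disabled.
    Db(const Db&) = delete;
    Db& operator=(const Db&) = delete;

    // Safe to call more than once; only the first call does anything.
    void Close() {
        if (open_) {
            open_ = false;
            db_events().push_back("close " + description_);
        }
    }

private:
    std::string description_;
    bool open_ = true;
};

void someFunc(bool fail) {
    Db db("DBDescriptionString");
    if (fail) throw std::runtime_error("query failed");
    // Use the db object.
}   // db is closed here, whether we return normally or leave by exception
```

Calling someFunc(false) and then someFunc(true) records an open followed by a close for each call, even though the second call exits through an exception.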

So in a sense, the key idea in RAII is sim­ply that “resources should behave sen­si­bly”. They should get copied safely if an assign­ment is made (or oth­er­wise, assign­ments should be pre­vented), they should be avail­able if their own­ing object is suc­cess­fully cre­ated (if it can’t cre­ate the resource, it should throw an excep­tion, abort­ing the cre­ation of the object), and when they are no longer used, they should be cleaned up.
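As a rough sketch of what “behaving sensibly” can mean in code, here is a small wrapper around a C file handle. The class name File and the helper can_open are made up for this example. The constructor throws if the file cannot be opened, so a File object that exists always holds a valid handle, and copying is prevented outright rather than implemented.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>
#include <type_traits>

class File {
public:
    // If the resource cannot be acquired, the object is never created.
    File(const std::string& path, const char* mode)
        : handle_(std::fopen(path.c_str(), mode)) {
        if (!handle_) throw std::runtime_error("could not open " + path);
    }

    // handle_ is never null here, because the constructor would have thrown.
    ~File() { std::fclose(handle_); }

    // Two objects closing the same handle would be a bug, so forbid copies.
    File(const File&) = delete;
    File& operator=(const File&) = delete;

    std::FILE* get() const { return handle_; }

private:
    std::FILE* handle_;
};

// The compiler enforces the "no copies" rule.
static_assert(!std::is_copy_constructible<File>::value, "File must not be copyable");
static_assert(!std::is_copy_assignable<File>::value, "File must not be assignable");

// Returns true if the file could be opened for reading. The handle is closed
// again before the function returns, on either path.
bool can_open(const std::string& path) {
    try {
        File f(path, "r");
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

Deleting the copy operations is one of the two options mentioned above. The other would be to give the class a copy constructor that duplicates the underlying resource, which is what std::vector does.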

The C++ standard library class template std::vector is a wonderful example of RAII in action. The resources being managed by a vector are memory (the array allocated internally to hold the objects contained in the vector) as well as the objects themselves. When the vector is destroyed, every object it holds must be destroyed too, and the array in which they were placed must be deallocated.

In the fol­low­ing exam­ples, assume that a func­tion foo is passed a vec­tor of MyClass objects by value. We don’t know how many, if any, objects are stored in it, but since we are passed a copy of the orig­i­nal vector, we take own­er­ship of it. It exists only in the func­tion foo, and must be destroyed afterwards.

void foo(std::vector<MyClass> vec) {
  ...
 //  when we get to the end of the function, all local variables, including vec, 
 // are automatically destroyed by having their destructors invoked.
 // So no matter how many MyClass objects were stored in the vector, it ensures that they too have their destructors called.
 // And the vector also deallocates its internal array, leaving neither of its resources alive at the end of the function
}

void foo(std::vector<MyClass> vec) {
  throw std::runtime_error("Oops"); // from <stdexcept>
  // as above, vec is automatically destroyed when we leave the function,
  // regardless of *how* we leave it. Even if we leave it because an exception was thrown and not caught.
} 

void foo(std::vector<MyClass> vec) {
  // other is constructed as a copy of vec. std::vector ensures that both of vec's resources are copied as well
  std::vector<MyClass> other = vec;
  // we now have two vectors, each owning a dynamically allocated array and a number of MyClass objects
  // and again, at the end of the function, both are deallocated cleanly
} 

void foo(std::vector<MyClass> vec) {
  std::vector<MyClass> other; // a second, empty, vector

  // perform an assignment, setting vec to be an empty vector
  // std::vector makes sure that if you do this, the resources previously held by vec are cleanly released
  // as part of the assignment, and vec ends up holding copies of the resources held by other
  vec = other;

  // and so when the function ends, the MyClass objects originally held by vec
  // have already been destroyed, so their destructors are *not* invoked now
} 

As the above shows, vec owns its resources, and man­ages them tightly. When­ever a change hap­pens to vec, it reflects this by updat­ing its owned resources. If it is destroyed, it destroys its owned resources. If it is copied, it copies the resources it owns. If it is assigned to hold some­thing else, it first destroys its exist­ing resources. And so on. Noth­ing you do can bring it “out of bal­ance”. It just works. That is RAII. Smart point­ers are just con­ve­nient adapters turn­ing raw point­ers into RAII objects. But RAII is much more than smart pointers.
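The claims above can be checked with a small experiment. In this sketch MyClass is a stand-in that does nothing except count how many instances are currently alive, and the fail flag and the alive_after_foo helper are additions for the purpose of the demonstration.

```cpp
#include <stdexcept>
#include <vector>

// Stand-in element type that counts how many instances currently exist.
struct MyClass {
    static int alive;
    MyClass() { ++alive; }
    MyClass(const MyClass&) { ++alive; }
    ~MyClass() { --alive; }
};
int MyClass::alive = 0;

void foo(std::vector<MyClass> vec, bool fail) {
    std::vector<MyClass> other = vec;           // a second set of copies
    if (fail) throw std::runtime_error("Oops"); // leave early, mid-function
    vec = std::vector<MyClass>();               // vec's elements are destroyed here
}

// Returns the number of live MyClass objects after foo has finished,
// while the caller's own vector of three elements still exists.
int alive_after_foo(bool fail) {
    std::vector<MyClass> original(3);
    try {
        foo(original, fail);   // passed by value, so foo works on a copy
    } catch (const std::runtime_error&) {
        // by the time we get here, foo's vectors have already been destroyed
    }
    return MyClass::alive;
}
```

Both alive_after_foo(false) and alive_after_foo(true) return 3, the elements of the caller's vector. Every copy made inside foo has been destroyed on both paths, and once the caller's vector is gone the count is back to zero.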

It is the broad and gen­eral idea that resources should be mapped to objects, so that the object can not be cre­ated unless it suc­ceeded in acquir­ing its resource, and it can not be destroyed with­out also releas­ing its resource. This effec­tively saves C++ pro­gram­mers from hav­ing to worry about resource management.

Take an example that’s guaranteed to cause pain without the use of RAII: handling exceptions being thrown halfway through constructors. Say you have a class with multiple members which are initialized in its constructor. After the first member has been initialized, but before all of them have been initialized, an exception is thrown. Let’s use the following contrived example:

class Foobar {
  Foo a;
  Bar b;
  MyClass c;

public:
  Foobar() : a(42), b("hello world"), c('a') {}
};

Unfortunately, b’s constructor throws an exception. How to handle this? We know that in C++, partially constructed objects do not automatically have their destructors called when the construction is aborted.

And since we want to avoid any resource leaks, we require that the following must happen:

  • a must have its destructor called (because a was successfully initialized before the error occurred).
  • b must release any resources it acquired in its constructor before it threw the exception.
  • c must do nothing. Its construction had not yet begun when the error occurred, so it would be an error to attempt any kind of cleanup of c.
  • The Foobar object (the object pointed to by the this pointer) must ensure that the above, and nothing else, happens, and it must do so without relying on its own destructor (which won’t be called, as construction did not successfully complete).

And of course, pretending that only b can throw an exception may be a simplification over the real world. Perhaps every member could throw one from its constructor. Care to write a Foobar constructor which takes all this into account, providing enough try/catch blocks to correctly catch every exception that might be thrown, and release exactly the resources that have been allocated until then, and nothing else? A tall order, and an open invitation for bugs. And of course, it’d lead to a huge, bloated and error-prone constructor. It’d also prevent us from using the initializer list. We’d have to perform some kind of “safe” non-throwing default construction of all of a, b and c before entering the constructor body, where exception handling is possible, and from there, attempt to perform assignments to bring the three members into the desired state.

In pseudocode, the con­struc­tor might look some­thing like this:

Foobar() {
  a = new Foo(42);
  try {
    b = new Bar("hello world");
  }
  catch {
    destroy a;
    throw;
  }
  try {
    c = new MyClass('a');
  }
  catch {
    destroy b;
    destroy a;
    throw;
  }
}

Note that all this complexity is only necessary because we want to handle several different resources. a, b and c all contain resources that we must attempt to acquire, and properly release if this fails. If there’d been only one resource, the job would have been much simpler. There wouldn’t be any point at which some resources have been acquired, and others have not. If we succeeded in acquiring that one resource, there’d be no risk of errors occurring afterwards, so we wouldn’t need complex conditional cleanup code. And if we failed to acquire the one resource, there’d be nothing to clean up. After all, the resource was never acquired!

So to keep down the complexity, the only safe way to define a class is to make it own at most one resource. And this one-to-one mapping of resources to classes is exactly what RAII is all about. If a, b and c are all RAII objects, then the simple initializer-list constructor of the Foobar class shown earlier just works, regardless of which members could or couldn’t throw exceptions. According to the rules of C++, we know that in that case,

  • the Foobar destructor (this->Foobar::~Foobar()) will not be called, as *this was not successfully constructed.
  • the a destruc­tor will be called, as this mem­ber was fully con­structed at the time of the error.
  • the b and c destruc­tors will not be called, as these mem­bers were not fully con­structed at the time of the error.

So assum­ing that b’s con­struc­tor takes care of releas­ing any resources suc­cess­fully allo­cated when the error occurred (the num­ber of which, as pointed out above, should ide­ally be zero), we’re actu­ally home free! What hap­pens is exactly what we listed ear­lier as our goal. a has its destruc­tor called, c’s con­struc­tor was never run in the first place, so it doesn’t have to do any­thing, and *this doesn’t have to do any­thing spe­cial in its con­struc­tor. All of its mem­bers take care of their own resources, so the num­ber of resources man­aged by *this is zero!
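Here is a small sketch that lets us watch those rules at work. Foo, Bar and MyClass are dummy types written for this example. Each one only records what happens to it in an events log, and Bar’s constructor always throws.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Records construction and destruction in the order they happen.
std::vector<std::string>& events() {
    static std::vector<std::string> log;
    return log;
}

struct Foo {
    explicit Foo(int) { events().push_back("Foo constructed"); }
    ~Foo() { events().push_back("Foo destroyed"); }
};

struct Bar {
    // Always fails, and has acquired nothing by the time it throws.
    explicit Bar(const char*) { throw std::runtime_error("Bar failed"); }
    ~Bar() { events().push_back("Bar destroyed"); }
};

struct MyClass {
    explicit MyClass(char) { events().push_back("MyClass constructed"); }
    ~MyClass() { events().push_back("MyClass destroyed"); }
};

class Foobar {
    Foo a;
    Bar b;
    MyClass c;

public:
    // No try/catch and no destructor: the members look after themselves.
    Foobar() : a(42), b("hello world"), c('a') {}
};

// Returns true if a Foobar could be constructed.
bool try_make_foobar() {
    try {
        Foobar fb;
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

After try_make_foobar() has returned false, the log contains exactly two entries: “Foo constructed” and then “Foo destroyed”. Bar and MyClass never appear, because neither was ever fully constructed.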

We don’t even need to write a destructor for Foobar now, if all its members are RAII objects. Whether the Foobar object is partially or fully constructed, its members take care of themselves. That is the power of RAII. Once a resource has been mapped to a class, we can use it as much as we like, and even in very complex situations, and never have to worry about the resource being leaked. It is managed by its wrapping RAII object, and the C++ lifetime and scope rules ensure that this wrapper object gets destroyed when it goes out of scope.


  1. A smart pointer is an object which behaves as a pointer (meaning that it overloads the * and -> operators, so it can be dereferenced to yield the pointed-to value), but also enforces some kind of ownership semantics on the value. A plain pointer does nothing when it goes out of scope. If it pointed to some dynamically allocated memory, nothing happens to that memory. And if no one else has a pointer to it, then that memory is lost, and can not be reclaimed. A smart pointer does something when it is destroyed. Some variants simply free the memory they point to (boost::scoped_ptr, std::auto_ptr or std::unique_ptr all fall into this category, although with some important differences), while others implement reference counting, so that the memory is only destroyed when all smart pointers pointing to it have been destroyed. boost::shared_ptr is by far the best known implementation of this concept.


7 Responses to The meaning of RAII — or why you never need to worry about resource management again

  1. […] still be con­fused by this idiom. One thing I read while keep­ing tabs on the web for C++ arti­cles is this one from […]

  2. sheepsimulator says:

    GREAT ARTICLE! I felt like it was directed exactly where I am at in my journey to learn more about OOP. I think I know how to talk about RAII better: resources are mapped to objects.

    Didn’t know that garbage col­lec­tors were less resource-intensive than shared_ptrs; I thought they were about the same, since they did about the same thing.

  3. jalf says:

    They do roughly the same thing, yes, but in very dif­fer­ent ways.

    Con­sider that a ref­er­ence counter has to be updated every time a ref­er­ence is cre­ated or deleted. If a smart pointer points to object a, and you set it to point to b instead, you have to update the ref­er­ence coun­ters for both objects. And every update has to be done atom­i­cally to ensure thread safety as well. That makes it still more costly. When you add it up like that, it’s actu­ally quite a few CPU cycles that are thrown away updat­ing ref­er­ence counters.

    By com­par­i­son, a garbage col­lec­tor doesn’t have to do any­thing when ref­er­ences are cre­ated, mod­i­fied or destroyed. It only has to step in when the heap has been filled so much that a mem­ory allo­ca­tion fails. Once that hap­pens, it has to tra­verse the graph of live objects, mark­ing each as in use. All dead (non­reach­able) objects are never even touched by the GC; so they’re effec­tively free. Tra­vers­ing this graph does take a bit of time, as does the heap com­paction that typ­i­cally fol­lows. But because it hap­pens so rarely com­pared to ref count­ing, it’s still vastly cheaper overall.

  4. @jalf: “But because it hap­pens so rarely com­pared to ref count­ing, it’s still vastly cheaper overall.”

    And when you fac­tor in the “multi core cri­sis” GC is even more vastly cheap than ref-counting. GC can pipeline its work into back­ground threads, while ref-counting adds a lot of interlocking.

    It’s ironic, and it goes against all the “folklore”, but it has to be accepted that C++’s low-level memory model approach is actually the more inefficient way to do things in very many (probably most) applications.

  5. “But the onus is still on the user of the class to ensure it is closed correctly. There is no obvious way to encode into the Db class that ‘once we’re done with an object of this type, the connection must be closed immediately’.”

    Don’t you think the same is true in C++? Given a class C, is it intended to be used like this:

    C c;
    

    Or like this:

    C *p = new C;
    

    If you attempt to use RAII by pro­vid­ing a class like C to be declared on the stack, then any­one is free to abuse it by allo­cat­ing a C on the heap (and hope­fully delet­ing it again at some point), and thus they lose the exception-safe cleanup capa­bil­ity. A basic prin­ci­ple of C++ is that there is no way to stop this.

    Yes, it’s true that in C# it is very easy to for­get to put a using state­ment in. So I tend to use the Scheme-style approach of a “with-whatever” method:

    WithOpenFile(filePath, file => 
    {
        // use file in here.
    });
    

    The WithOpenFile method is equivalent to a RAII wrapper class around a file handle, except that it cannot be abused. So it’s actually more robust than the C++ wrapper-class technique. (With C++1x lambdas this is of course now going to be possible in C++ very soon).

    On the down­side, it’s also less flex­i­ble because you can’t embed it inside another object and so asso­ciate their life­times (C# is severely lack­ing in lan­guage sup­port there anyway).

  6. Ger says:

    Hey, nice arti­cle. One ques­tion: I recently learned about the “rule of three”, and accord­ing to it, a RAII class that has a destruc­tor (to release a resource) would require a copy con­struc­tor and an oper­a­tor=. What do you do if you have a class that requires cleanup but you’d rather not allow copy­ing? Define copy con­struc­tor and oper­a­tor= and throw an excep­tion? Or can you pre­vent copy­ing at com­piler level?

  7. Jim says:

    @Ger

    What one typ­i­cally does is declare a copy con­struc­tor and an oper­a­tor= as pri­vate. This is only required because C++ pro­vides for a DEFAULT copy con­struc­tor / assign­ment oper­a­tor if not oth­er­wise declared. Declar­ing them as pri­vate tells the com­piler that you’ve con­sid­ered them and want to make sure they are not able to be invoked by out­side code.
