<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>jalf.dk &#187; c++</title>
	<atom:link href="http://jalf.dk/blog/tag/c/feed/" rel="self" type="application/rss+xml" />
	<link>http://jalf.dk/blog</link>
	<description>Musings and thoughts on programming and other geeky stuff</description>
	<lastBuildDate>Mon, 12 Jul 2010 15:21:00 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Singletons: Solving problems you didn’t know you never had since 1995</title>
		<link>http://jalf.dk/blog/2010/03/singletons-solving-problems-you-didnt-know-you-never-had-since-1995/</link>
		<comments>http://jalf.dk/blog/2010/03/singletons-solving-problems-you-didnt-know-you-never-had-since-1995/#comments</comments>
		<pubDate>Fri, 12 Mar 2010 04:40:59 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[design patterns]]></category>
		<category><![CDATA[singleton]]></category>
		<category><![CDATA[stackoverflow]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=532</guid>
		<description><![CDATA[Funny how it goes. Some subjects are just flat out impossible to write catchy titles for. Others seem to attract them like flies. A lot of very clever people have written volumes about “The Simpleton Pattern”, and “Singletonitis”. Many people are in love with the Singleton pattern. Others — a small minority, I suspect — [...]]]></description>
			<content:encoded><![CDATA[<p>Funny how it goes. <a href="http://jalf.dk/blog/2010/01/the-meaning-of-raii-or-why-you-never-need-to-worry-about-resource-management-again/">Some subjects</a> are just flat out impossible to write catchy titles for. Others seem to attract them like flies. A lot of very clever people have written volumes about <a href="http://steve.yegge.googlepages.com/singleton-considered-stupid">“The Simpleton Pattern”</a>, and <a href="http://www.gamedev.net/community/forums/mod/journal/journal.asp?jn=259115">“Singletonitis”</a>.</p>

<p>Many people are in love with the <a href="http://en.wikipedia.org/wiki/Singleton_pattern">Singleton pattern</a>. Others — a small minority, I suspect — consider it a mistake, an anti-pattern, or something that was only ever included in <em>the</em> Design Patterns book as a lifeline to procedural programmers who couldn’t really figure out this OOP thing.
<span id="more-532"></span></p>

<p>I won’t pretend to be half as clever as all the people who have already written about the problems with singletons years ago, and I don’t think I have anything <em>new</em> to bring to the table. But it is a pattern I learned to loathe very soon after I first saw it in use. (Singletons do sound attractive when you first hear of them. But they pale a bit when you end up having to tear up and rewrite half your code just because all your singleton classes start revealing their shortcomings.) And for a long time now, I’ve tried to convince other programmers that Singletons have some serious problems. Recently, it seems like I’ve even gotten noticed for it on StackOverflow.</p>

<p>First, <a href="http://stackoverflow.com/users/87234/gman">GMan</a> posts an answer to <a href="http://stackoverflow.com/questions/2080233/is-it-good-programming-to-have-lots-of-singleton-classes-in-project/2080242#2080242">one question</a>, and I comment with a mild disagreement, and the discussion goes on for a few more comments. As singleton rants go, this one is pretty mild, and I don’t really think about it any further. Then, a few weeks later, I discover his blog and <a href="http://blackninjagames.com/?p=24">this post</a>. Wow! A convert. A person I know to be extremely bright and a knowledgeable programmer has changed his mind in response to something <em>I</em> said… I’m flattered.</p>

<p>And today, I noticed another question being posted, which had both Boost and Singletons in the title — how could I resist? Two subjects I enjoy talking about, even if the things I say about them are very different. Surprisingly, the comments there already mentioned me, and some of my earlier answers regarding singletons. Should I be flattered that people have started bringing my name up when discussing Singletons?</p>

<p>Anyway, one of the comments also suggested I write a blog post describing my argument in detail. So I will.</p>

<h1>Two wrongs don’t make a right</h1>

<p>There are a lot of problems with singletons. In fact, it’s surprising that so many people still consider the pattern useful, when it is afflicted with so many weaknesses and flaws. However, for now I will single out the two that I feel are the most fundamental: not just problems with how a singleton works, but with what it is trying to achieve.</p>

<p>A singleton, as defined by the Gang of Four, combines two properties:</p>

<ul>
<li>it guarantees that exactly one instance of an object exists. While that one instance is typically created lazily, so it doesn’t technically exist throughout the entire application’s lifetime, it always seems to the programmer as if precisely one instance exists, and</li>
<li>it guarantees global access to this one instance.</li>
</ul>

<p>Let’s pick those apart a bit. The last one is easiest: it is, by now, fairly common knowledge that <em>global state is bad</em>. We don’t like global variables, we don’t like static class members, we don’t like anything that makes it harder to isolate bits of our code. Dependence on global state causes a lot of problems: it hurts parallelism, as access to global mutable state generally has to be serialized through the use of locks. It makes dependencies harder to detect and control: any function might silently decide to access our singleton. The function signature says nothing about this, so we have to read the source code of the function to determine if this is the case. And because it is so convenient to always just add a reference to a singleton, we tend to do it a lot. When you have a singleton, you quickly end up in a situation where three out of four classes depend on it. How did that happen? Why, logically speaking, do so many classes need direct access to the database? Or the renderer? Is that good design? Not only is this messy, it’s also painfully hard to fix after the fact. Once we have these dependencies on global objects everywhere, that’s a lot of code we need to change to eliminate the global. Almost every class will be impacted by the change, and a huge number of functions have to have their signatures modified to take that extra parameter replacing the global. Or even worse, the function has to be completely rewritten to eliminate the need for whatever service the singleton provided. The more globals you have in your project, the more your dependency graph starts resembling spaghetti. And the harder it gets to clean it up.</p>

<p>It hurts reusability, as code taken from one project and inserted into another may break because it depended on globals not present in the new project. It hurts testability, partly for the same reason (a unit test for a class must suddenly provide a number of globals as well, just for the code under test to compile), but also because global state makes tests less deterministic. One test might change the state of this global, affecting the outcome of the next test to run.</p>

<p>Globals are bad for a lot of reasons. They have their uses, no doubt about that, but we should be suspicious whenever the solution to a problem involves global data. It might be the best solution, but often, it is more trouble than it’s worth.</p>

<p>The other point is more subtle. Why do I object to a class enforcing that “only one instance may exist”? It’s really just common sense. As the Agile movement tells us, we don’t really know what our code is going to look like tomorrow. Over the course of development, we <em>have</em> to adapt to changes, modify our code, revise decisions already made. Why put roadblocks in front of us? Why make it harder to adapt to unforeseen changes or requirements?</p>

<p>Today, I might think that I need only one logger instance. But what if I realize tomorrow that I need two? That’s not so far fetched. We may have one log to which we write ad-hoc messages intended for debugging purposes, solely to be read by developers, and another formalized log, where structured messages are written when predetermined events occur, so that the application can be monitored in production. Sure, we <em>could</em> define the two as completely separate classes, and then we’d only need one instance of each (but then we’d start duplicating code). Or we could use the same log instance to write to both logs (but then the logging code would become more complex, having to interleave two separate and non-overlapping logs).</p>

<p>Once we’ve accepted that an application may need more than one logger, shouldn’t we do ourselves the favor of ensuring that our loggers <em>can</em> be instantiated more than once, just in case it turns out to be the right thing to do? We’re not even adding any complexity, there’s no cost associated with this. On the contrary, we’re <em>removing</em> significant complexity. Thread-safe singletons are surprisingly hard to get right. Dependencies between singletons are tricky and circular ones can cause them to blow up in all sorts of fun ways. And let’s not even get into how to handle anything our singletons might do while the application is shutting down. What if the database singleton tries to write a simple “goodbye” log message to the log singleton? What if the log singleton got destroyed before the database one? Ouch.</p>

<p>Singletons are hard to write and hard to use. Removing them only simplifies our code, so if it also enables us to better adapt to unforeseen requirements, why <em>shouldn’t</em> we remove them?</p>

<p>Not convinced? Let’s think of some other examples then:</p>

<ul>
<li><em>the application configuration should be a singleton, right? We <strong>obviously</strong> can’t have more than one of those!</em> Wrong. We can. We often do. Think about what happens when the user opens the “Options” screen and modifies the settings. During that time, two sets of settings exist: the “applied” settings that are currently in effect, and the “speculative” ones, currently being picked out by the user. Once he clicks OK, the speculative changes should be applied, replacing the ones that were previously in effect. But until then, we have two sets of settings to maintain.</li>
<li><em>a database connection pool then! If we have more than one pool of connections, we can’t efficiently share them!</em> Correct, but perhaps we don’t <em>want</em> to share them. Perhaps I want to ensure that library A has one pool of 10 connections available to it, component B has a smaller pool of 3 connections, and components C, D and E use the global pool with however many connections it supplies. That would ensure that no matter the number of threads running in component B, it’ll never starve out other components trying to access the database. It can never hold more than three connections, leaving room for other components. Of course, in the common case, we do want all connections to be shared in one single pool. But perhaps not <em>always</em>. So yes, there should probably be a globally accessible default pool available. But why shouldn’t it also be possible to define new <em>local</em> pools if the user deems it necessary? Why limit ourselves to one instance?</li>
</ul>

<p>And even if you do come up with some case where we absolutely <em>must</em> never have more than one instance, where it would make the sky come crashing down on us, consider testing. Consider that each of your unit tests should set up the environment it needs, and run within that environment, in isolation from other tests. That means that every test should create its own logger instance, or database pool instance, or whatever else our singletons are doing, just so it can avoid being polluted by stateful changes made by earlier tests. Each unit test for the Direct3D renderer <em>should</em> set up its own renderer object. Each physics simulation test <em>should</em> initialize the physics engine first, and shut it down again after use. Singletons don’t easily allow that. Sure, we can extend them with explicit <code>Create()</code> and <code>Destroy()</code> methods, but then our abstraction is starting to get leaky. We can no longer assume that precisely one instance exists, because we might have just destroyed the one that existed.</p>
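<p>To make the testing point concrete, here is a minimal sketch of what isolated tests might look like once the logger is an ordinary class that gets passed in. <code>Logger</code>, <code>processOrder</code> and the two test functions are hypothetical names I made up for illustration, and plain <code>assert</code> stands in for whatever test framework you actually use:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical logger: an ordinary class, so any test can create its own.
class Logger {
public:
    void write(const std::string& message) { lines_.push_back(message); }
    std::size_t count() const { return lines_.size(); }
private:
    std::vector<std::string> lines_;
};

// Hypothetical code under test. The signature states the dependency:
// callers can see that it logs, and they decide which logger it logs to.
int processOrder(int quantity, Logger& log) {
    if (quantity <= 0) {
        log.write("rejected order");
        return 0;
    }
    log.write("accepted order");
    return quantity;
}

// Each test builds its own environment, so the order the tests run in is irrelevant.
void testAcceptsPositiveQuantity() {
    Logger log;
    assert(processOrder(3, log) == 3);
    assert(log.count() == 1);
}

void testRejectsNonPositiveQuantity() {
    Logger log;  // fresh instance: nothing left over from the previous test
    assert(processOrder(0, log) == 0);
    assert(log.count() == 1);
}
```

<p>With a singleton logger, the second test would see whatever the first one wrote, and the expected count would depend on which tests happened to run before it.</p>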

<p>The “exactly one instance” guarantee removes flexibility from our code that we may need, in order to enforce a constraint that we <em>definitely</em> don’t need. Where’s the harm in allowing the user to create more than one instance <em>if he decides to?</em></p>

<p>C++ programmers are familiar with <code>std::cout</code>, the standard output stream. Funny thing about this, it is a simple global object. We can <em>obviously</em> never have more than one standard output stream. But we <em>can</em> have more than one stream. The standard library just initializes one of them to point to the standard output, and saves it as a global variable. We don’t need it to be a singleton, we don’t even need it to be a static class specially defined for the purpose. We just need a stream, defined somewhere where it’s globally accessible.</p>

<p>True, a sufficiently stupid programmer <em>could</em> create a new stream when he intended to write to <code>std::cout</code>, and true, a singleton implementation would have prevented that. But is it worth it? When was the last time you saw someone <em>accidentally</em> invoke <code>std::ostream() &lt;&lt; "Hello world";</code>, when they intended to write <code>std::cout &lt;&lt; "Hello world";</code>? It’s not the most common typo I’ve seen.</p>

<p>We don’t <em>need</em> to prevent multiple instantiations. If we want only one instance, we just instantiate the class once, and refer to that instance whenever we need it, end of story. We don’t need the compiler to slap us over the wrists if we do create multiple instances, because we never do so by mistake. If we do it, it’s because we have a reason. It’s because our initial assumption that only one instance was needed, turned out to be wrong!</p>
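<p>As a rough sketch of the <code>std::cout</code> approach applied to our own classes: one ordinary class, one globally accessible instance, and nothing preventing a second instance later on. The <code>Logger</code> class and the two string streams standing in for real log destinations are assumptions made for the example:</p>

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical logger that writes to whatever stream it is given.
class Logger {
public:
    explicit Logger(std::ostream& out) : out_(out) {}
    void write(const std::string& message) { out_ << message << '\n'; }
private:
    std::ostream& out_;
};

// Stand-ins for real log destinations, so the sketch has no external dependencies.
std::ostringstream debugSink;
std::ostringstream auditSink;

// One globally accessible instance, in the same spirit as std::cout.
// It is defined after debugSink in this file, so debugSink is initialized first.
Logger debugLog(debugSink);

void exampleUsage() {
    debugLog.write("starting up");

    // Nothing stops us from creating a second logger once we need one.
    Logger auditLog(auditSink);
    auditLog.write("user logged in");

    assert(debugSink.str() == "starting up\n");
    assert(auditSink.str() == "user logged in\n");
}
```

<p>The class itself knows nothing about how many instances exist. That decision is made by whoever creates them.</p>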

<p>So there you have it. A singleton combines two <em>negative</em> qualities. It takes the “you can never create a second instance of this class” constraint, which hardly ever makes sense, and even when it does, does not typically need to be enforced by the compiler, <em>and combines it with a global object</em>, giving us all the downsides of both!</p>

<p>Two wrongs don’t make a right. Not even if they were described as a good idea by some guys 15 years ago. They’re still no greater than the sum of their parts: two wrongs. One bad thing combined with another bad thing, creating a <em>very</em> bad thing.</p>

<p>Too many programmers rely heavily on singletons to solve a problem they never had. They never <em>needed</em> a compile-time guarantee that multiple instances of a class can never be created. They just needed one instance to be created.</p>

<p>Sometimes, we do need globals, yes. In those cases, make old-fashioned globals. Use static class members, or if the language allows it, global (non-member) objects. Or use the Monostate pattern, or whatever you feel is the cleanest solution. But remember that the problem you’re trying to solve is “enabling global access to this data”. No more, no less. You do <em>not</em> want a solution which sneaks completely unrelated constraints and limitations in through the back door.</p>

<p>And while I can’t personally think of many cases where this is true, you <em>might</em> also run into situations where it is truly <em>necessary</em> to prevent more than one instance of a class from ever existing. Again, I can’t think of what situation this might be, but I won’t rule out that it can occur. If it does, then enforce <em>that</em> constraint alone. But don’t go around providing <em>global access</em> to the object as well. Whatever specialized purpose your “one instance only” class serves, it’s highly unlikely that <em>everyone</em> should be allowed access to it. So don’t make it a global.</p>

<p>Most of the time, your classes should have neither of these attributes. Sometimes, rarely, they may need <em>one</em> of them. But the singleton pattern imbues the class with <em>both</em> properties, and <em>that</em> is just a plain bad idea.</p>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2010/03/singletons-solving-problems-you-didnt-know-you-never-had-since-1995/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>The meaning of RAII — or why you never need to worry about resource management again</title>
		<link>http://jalf.dk/blog/2010/01/the-meaning-of-raii-or-why-you-never-need-to-worry-about-resource-management-again/</link>
		<comments>http://jalf.dk/blog/2010/01/the-meaning-of-raii-or-why-you-never-need-to-worry-about-resource-management-again/#comments</comments>
		<pubDate>Sat, 02 Jan 2010 05:00:52 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[.net]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[raii]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=340</guid>
		<description><![CDATA[I tried really hard to come up with some witty title or pun to weave into the title of this post. I couldn’t. RAII is just a terrible name, and it isn’t really clever or funny. Unfortunately, it is also the single most important key to C++. It is not just an idiom but a [...]]]></description>
			<content:encoded><![CDATA[<p>I tried <em>really</em> hard to come up with some witty title or pun to weave into the title of this post. I couldn’t. RAII is just a terrible name, and it isn’t really clever or funny. Unfortunately, it is also <em>the</em> single most important key to C++. It is not just an idiom but a fundamental philosophy used to solve almost any problem in the language. So we can’t really avoid it.</p>

<p>If I had to pinpoint one thing that marked the difference between a skilled and an unskilled C++ programmer, it would be “do they understand RAII”. Many people don’t, hence this post.<span id="more-340"></span></p>

<p>RAII is, apart from being badly named, one of those deceptively simple concepts that you <em>think</em> you understand when you first hear of it, think “well duh, that’s obvious”, and then proceed to write code as usual, because you just don’t see how widely applicable it is.</p>

<p>But let’s get the name out of the way first. <a href="http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization">RAII</a> stands for “Resource Acquisition Is Initialization”. And if you’re not already familiar with the idiom, then this has told you <em>nothing at all</em>. If you did know about RAII in advance, then you can, when you stop and think about it, kind of see how the name relates to it… vaguely… sort of.</p>

<p>What it actually <em>means</em> is simple: Resources should be managed by classes. When the class is initialized, the resource is acquired (hence the name). When the class is destroyed, the resource is released. And the lifetime of the object should exactly match the desired lifetime of the resource. That sounds obvious, and many programmers will (assuming they’re working in a language that <em>has</em> classes), say that this is what they always do.</p>

<p>Often, C++ developers think this just means “smart pointers. Wrap your memory allocation in a <code>boost::shared_ptr</code> and you’re done”. I see that as one not-very-often used border case though, rather than a typical example of RAII. So let’s take a step back instead.</p>

<p>The key idea is that any kind of resource, not just memory, but file handles, sockets, database connections, or even more abstract resources like loggers or profiling timers or textures, really <em>any</em> concept or process which has a lifetime, should be mapped to an object.</p>

<p>Unlike the typical object-oriented line of thought which goes that “everything must be an object, because then… well, everything will be an object, and your code will be better”, here we actually have a concrete <em>reason</em>: We want to use the object to manage the lifetime of the resource.</p>

<p>When I allocate memory with <code>new</code>, I have to deallocate it again sooner or later, with <code>delete</code>. (Or in C, with <code>malloc()</code> and <code>free()</code> respectively). And I have to make sure that this is done. And I have to make sure that it is not done twice. And that the object is not accessed after this is done. There are a lot of constraints we have to obey, all related to the lifetime of the resource. And this is why unmanaged programs have a reputation of leaking memory left and right. If we allocate memory, and it is to be used by a dynamic number of objects or functions all referencing the same allocations, which of the users is responsible for deleting it? And how do we know when it is safe to delete, when no users remain?</p>

<p>Ironically, most managed languages have <em>not</em> solved the problem. They have added a garbage collector (which yes, is very useful for a wide number of reasons), but that only solves one specific instance of the problem. It takes care of avoiding memory leaks, but it doesn’t avoid resource leaks <em>in general</em>.</p>

<p>The garbage collector ensures that this code won’t leak memory:</p>

<pre><code>void foo() {
  SomeObject* obj = new SomeObject();
  bar(obj);
}
</code></pre>

<p>where without a garbage collector, we’d (at least without RAII) have to write code such as</p>

<pre><code>void foo() {
  SomeObject* obj = new SomeObject();
  try {
    bar(obj);
    delete obj;
  }
  catch(...){ delete obj; throw; }
}
</code></pre>

<p>In the garbage collected case, we don’t know what <code>bar</code> does, and we don’t <em>need</em> to know. It doesn’t have to delete the object. And neither does the <code>foo</code> function. So we have successfully dodged the problem of managing the lifetime of memory allocations. We haven’t really <em>solved</em> the problem though. We still don’t have any good tools to <em>manage</em> the lifetime. We’re just guaranteed by the system that it’ll last <em>long enough</em>.</p>

<p>In C++, this effect can be approximated using some kind of smart pointer<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup>.</p>

<p>Smart pointers allow us to write code like this:</p>

<pre><code>void foo() {
  boost::shared_ptr&lt;SomeObject&gt; ptr(new SomeObject());
  bar(ptr);
}
</code></pre>

<p>and be sure we won’t leak memory. Of course, this solution isn’t perfect — reference counting is much more expensive than a good garbage collector, and if we create cyclic references, the objects will never be deleted, as the reference counts never reach zero. It is a decent approximation, but nowhere near as good and reliable as the garbage collector in managed languages.</p>
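<p>The cyclic reference problem is easy to demonstrate. The sketch below uses <code>std::shared_ptr</code> and <code>std::weak_ptr</code> from C++11 rather than the Boost version used above, and the <code>Node</code> type and <code>liveNodes</code> counter are invented for the example:</p>

```cpp
#include <cassert>
#include <memory>

int liveNodes = 0;  // counts Node objects currently alive

// Hypothetical node type that can refer to a peer either strongly or weakly.
struct Node {
    Node() { ++liveNodes; }
    ~Node() { --liveNodes; }
    std::shared_ptr<Node> strongPeer;  // owning reference: takes part in the count
    std::weak_ptr<Node> weakPeer;      // non-owning reference: does not
};

// Two nodes that own each other. Returns how many are still alive
// after the local shared_ptrs have gone out of scope.
int nodesAliveAfterStrongCycle() {
    std::weak_ptr<Node> observer;
    {
        std::shared_ptr<Node> a(new Node());
        std::shared_ptr<Node> b(new Node());
        a->strongPeer = b;
        b->strongPeer = a;
        observer = a;
    }  // a and b are gone, but each node is still owned by the other
    int alive = liveNodes;
    // Break the cycle by hand so that this sketch does not itself leak.
    if (std::shared_ptr<Node> a = observer.lock()) {
        a->strongPeer.reset();
    }
    return alive;
}

// Same shape, but one direction is weak, so nothing keeps the nodes alive.
int nodesAliveAfterWeakCycle() {
    {
        std::shared_ptr<Node> a(new Node());
        std::shared_ptr<Node> b(new Node());
        a->strongPeer = b;
        b->weakPeer = a;
    }
    return liveNodes;
}

void demonstrateCycles() {
    assert(nodesAliveAfterStrongCycle() == 2);  // both nodes outlived their owners
    assert(nodesAliveAfterWeakCycle() == 0);    // both nodes were destroyed
}
```

<p>A tracing garbage collector would reclaim both nodes in the first case without any help. With reference counting, it is up to us to decide which direction of the relationship is the owning one.</p>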

<p>But the problem shows up again if we use another type of resource. What if we’d opened a database connection instead?
We’d have to write code such as this:
(The following Java-like pseudocode is copied almost verbatim from <a href="http://stackoverflow.com/questions/161177/does-c-support-finally-blocks-and-whats-this-raii-i-keep-hearing-about/161247#161247">this StackOverflow.com answer</a>, courtesy of <a href="http://stackoverflow.com/users/14065/martin-york">Martin York</a>.)</p>

<pre><code>void writeToDb()
{
  Db db = new Db("DBDescriptionString");
  try
  {
    // Use the db object.
  }
  finally
  {
    db.close();
  }
}
</code></pre>

<p>(And of course it gets even worse if <code>db.close()</code> can throw exceptions. Then we have to catch <em>that</em> exception, just to avoid it propagating out from the <code>finally</code> clause if we reached <code>finally</code> because of an exception being thrown in the <code>try</code> clause.)</p>

<p>The resource management problem still exists. We still have to wrap the code in exception handling just to make sure that the connection is closed as soon as we’re done with it. And we have to do this at <em>every</em> use. And it gets complicated fast.</p>

<p>Of course, .NET makes this a bit simpler:</p>

<pre><code>using (Db db = new Db("DbDescriptionString"))
{
  // use the database object.
}
</code></pre>

<p>But the onus is still on the user of the class to ensure it is closed correctly. There is no obvious way to encode into the <code>Db</code> class that “once we’re done with an object of this type, the connection must be closed immediately”.</p>

<p>And in C++, smart pointers are no longer suitable solutions, since the resource to be managed is no longer a pointer allocated with <code>new</code>.</p>

<p>Instead, a more basic flavor of RAII comes to the fore:</p>

<pre><code>void someFunc()
{
    Db db("DBDescriptionString");
    // Use the db object.
} 
</code></pre>

<p>Yes, that’s all. When the <code>db</code> object goes out of scope, at the end of the function, its destructor is called. The destructor internally calls <code>this-&gt;Close()</code> for us, so we don’t need to do it! We just have to trust the scoping rules of C++, which guarantee that destructors are called on local variables when they go out of scope, and on class members when the class is destroyed.</p>
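<p>For completeness, here is roughly what such a <code>Db</code> class could look like on the inside. This is only a sketch under my own assumptions: a counter stands in for the real connection, the method is spelled <code>close()</code>, and copying is simply disabled. A real database wrapper would of course do actual work in these places:</p>

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

int openConnections = 0;  // stand-in for real connection state

// Hypothetical Db class: the constructor acquires, the destructor releases.
class Db {
public:
    explicit Db(const std::string& description)
        : description_(description), open_(true) {
        ++openConnections;  // "open" the connection
    }
    ~Db() { close(); }

    // Copying is disabled, so two objects can never close the same connection.
    Db(const Db&) = delete;
    Db& operator=(const Db&) = delete;

    // Safe to call more than once; only the first call has any effect.
    void close() {
        if (open_) {
            open_ = false;
            --openConnections;
        }
    }

private:
    std::string description_;
    bool open_;
};

void someFunc() {
    Db db("DBDescriptionString");
    assert(openConnections == 1);
    // Use the db object.
}  // db's destructor runs here and closes the connection

void someFuncThatThrows() {
    Db db("DBDescriptionString");
    throw std::runtime_error("query failed");
}  // the destructor runs here too, while the exception propagates
```

<p>Note that neither function mentions cleanup at all. The knowledge that a connection must be closed lives in one place, the class, instead of at every call site.</p>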

<p>So in a sense, the key idea in RAII is simply that “resources should behave sensibly”. They should get copied safely if an assignment is made (or otherwise, assignments should be prevented), they should be available if their owning object is successfully created (if it can’t create the resource, it should throw an exception, aborting the creation of the object), and when they are no longer used, they should be cleaned up.</p>

<p>The C++ standard library class template <code>std::vector</code> is a wonderful example of RAII in action. The resources being managed by a <code>vector</code> are memory (the array allocated internally to hold the objects contained in the vector), as well as the objects themselves. When the <code>vector</code> is destroyed, every object it holds must be destroyed too, and the array in which they were placed must be deallocated.</p>

<p>In the following examples, assume that a function <code>foo</code> is passed a vector of <code>MyClass</code> objects by value. We don’t know how many, if any, objects are stored in it, but since we are passed a copy of the original <code>vector</code>, we take ownership of it. It exists only in the function <code>foo</code>, and must be destroyed afterwards.</p>

<pre><code>void foo(std::vector&lt;MyClass&gt; vec) {
  ...
  // when we get to the end of the function, all local variables, including vec,
  // are automatically destroyed by having their destructors invoked.
  // So no matter how many MyClass objects were stored in the vector, it ensures that they too have their destructors called.
  // And the vector also deallocates its internal array, leaving neither of its resources alive at the end of the function
}

void foo(std::vector&lt;MyClass&gt; vec) {
  throw std::runtime_error("Oops");
  // as above, vec is automatically destroyed when we leave the function,
  // regardless of *how* we leave it. Even if we leave it because an exception was thrown and not caught.
} 

void foo(std::vector&lt;MyClass&gt; vec) {
  // other is constructed as a copy of vec. std::vector ensures that both of vec's resources are copied as well
  std::vector&lt;MyClass&gt; other = vec;
  // we now have two vectors, each owning a dynamically allocated array and a number of MyClass objects
  // and again, at the end of the function, both are deallocated cleanly
} 

void foo(std::vector&lt;MyClass&gt; vec) {
  std::vector&lt;MyClass&gt; other; // a second, empty, vector

  // perform an assignment, setting vec to be an empty vector
  // std::vector makes sure that if you do this, the resources previously held by vec are cleanly released
  // before copies are made of the resources held by other
  vec = other;

  // and so when the function ends, the MyClass objects originally held by vec
  // have already been destroyed, so their destructors are *not* invoked now
} 
</code></pre>

<p>As the above shows, <code>vec</code> owns its resources, and manages them tightly. Whenever a change happens to <code>vec</code>, it reflects this by updating its owned resources. If it is destroyed, it destroys its owned resources. If it is copied, it copies the resources it owns. If it is assigned to hold something else, it first destroys its existing resources. And so on. Nothing you do can bring it “out of balance”. It just works. <em>That</em> is RAII. Smart pointers are just convenient adapters turning raw pointers into RAII objects. But RAII is much more than smart pointers.</p>
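<p>If you’d rather see this than take my word for it, a small counting experiment does the trick. The sketch below gives <code>MyClass</code> a live-object counter (my own addition for the example, as is the <code>observeLifetimes</code> function) and records the count after each step:</p>

```cpp
#include <cassert>
#include <vector>

int liveObjects = 0;  // number of MyClass objects currently alive

// Hypothetical element type that keeps the counter up to date.
struct MyClass {
    MyClass() { ++liveObjects; }
    MyClass(const MyClass&) { ++liveObjects; }
    ~MyClass() { --liveObjects; }
};

// Returns the number of live MyClass objects observed after each step.
std::vector<int> observeLifetimes() {
    std::vector<int> counts;
    {
        std::vector<MyClass> vec(3);       // vec owns three objects
        counts.push_back(liveObjects);     // 3

        std::vector<MyClass> other = vec;  // copying vec copies what it owns
        counts.push_back(liveObjects);     // 6

        std::vector<MyClass> empty;
        vec = empty;                       // assignment destroys vec's old objects
        counts.push_back(liveObjects);     // 3 (only the objects in other remain)
    }                                      // every vector is destroyed here
    counts.push_back(liveObjects);         // 0
    return counts;
}

void demonstrateVector() {
    std::vector<int> counts = observeLifetimes();
    assert(counts.size() == 4);
    assert(counts[0] == 3 && counts[1] == 6);
    assert(counts[2] == 3 && counts[3] == 0);
}
```

<p>At no point did we write any cleanup code, and at no point did the count drift away from what the vectors actually held.</p>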

<p>It is the broad and general idea that <em>resources should be mapped to objects</em>, so that the object can not be created unless it succeeded in acquiring its resource, and it can not be destroyed without also releasing its resource. This effectively saves C++ programmers from having to worry about resource management.</p>

<p>Take an example that’s guaranteed to cause pain without the use of RAII: Handling exceptions being thrown halfway through constructors. Say you have a class with multiple members which are initialized in its constructor. After the first member has been initialized, but before all of them have been initialized, an exception is thrown. Let’s use the following contrived example:</p>

<pre><code>class Foobar {
  Foo a;
  Bar b;
  MyClass c;

public:
  Foobar() : a(42), b("hello world"), c('a') {}
};
</code></pre>

<p>Unfortunately, <code>b</code>’s constructor throws an exception. How to handle this? We know that in C++, partially constructed objects do not automatically have their destructors called when the construction is aborted.</p>

<p>And since we want to avoid any resource leaks, we require that the following must happen:
– <code>a</code> must have its destructor called (because <code>a</code> was successfully initialized before the error occurred)
– <code>b</code> must release any resources it acquired in its constructor before it threw the exception
– <code>c</code> must do nothing. Its construction had not yet begun when the error occurred, so it would be an error to attempt any kind of cleanup of <code>c</code>.
– The <code>Foobar</code> object (the object pointed to by the <code>this</code> pointer) must ensure that the above, and nothing else, happens, and it must do so without relying on its own destructor (which won’t be called, as construction did not successfully complete).</p>

<p>And of course, pretending that only <code>b</code> can throw an exception may be a simplification over the real world. Perhaps every member could throw one from its constructor. Care to write a <code>Foobar</code> constructor which takes all this into account, providing enough <code>try</code>/<code>catch</code> blocks to correctly catch every exception that might be thrown, and release exactly the resources that have been allocated until then, and <em>nothing</em> else? A tall order, and an open invitation for bugs. And of course, it’d lead to a huge, bloated and error-prone constructor. It’d also prevent us from using the <em>initializer list</em>. We’d have to perform some kind of “safe” non-throwing default construction of all three of <code>a</code>, <code>b</code> and <code>c</code> before entering the constructor body, where exception handling is possible, and from there, attempt to perform assignments to bring the three members into the desired state.</p>

<p>In pseudocode, the constructor might look something like this:</p>

<pre><code>Foobar() {
  a = new Foo(42);
  try {
    b = new Bar("hello world");
  }
  catch {
    destroy a;
    throw;
  }
  try {
    c = new MyClass();
  }
  catch {
    destroy b;
    destroy a;
    throw;
  }
}
</code></pre>

<p>Note that all this complexity is only necessary because we want to handle several different resources. <code>a</code>, <code>b</code> and <code>c</code> all contain resources that we must attempt to acquire, and properly release if a later acquisition fails. If there’d been only one resource, the job would have been much simpler. There wouldn’t be any point at which <em>some</em> resources have been acquired, and others have not. If we succeeded in acquiring that one resource, there’d be no risk of errors occurring afterwards, so we wouldn’t need complex conditional cleanup code. And if we failed to acquire the one resource, there’d be nothing to clean up. After all, the resource was never acquired!</p>

<p>So to keep down the complexity, the only safe way to define a class is to make it own <em>at most one</em> resource. And this one-to-one mapping of resources to classes is exactly what RAII is all about. If <code>a</code>, <code>b</code> and <code>c</code> had all been RAII objects, then the above code <em>would work</em>. Regardless of which members could or couldn’t throw exceptions. According to the rules of C++, we know that in the above case,</p>

<ul>
<li>the <code>Foobar</code> destructor (<code>this-&gt;Foobar::~Foobar()</code>) will not be called, as <code>*this</code> was not successfully constructed.</li>
<li>the <code>a</code> destructor will be called, as this member was fully constructed at the time of the error.</li>
<li>the <code>b</code> and <code>c</code> destructors will not be called, as these members were not fully constructed at the time of the error.</li>
</ul>

<p>So assuming that <code>b</code>’s constructor takes care of releasing any resources successfully allocated when the error occurred (the number of which, as pointed out above, should ideally be zero), we’re actually home free! What happens is exactly what we listed earlier as our goal. <code>a</code> has its destructor called, <code>c</code>’s constructor was never run in the first place, so it doesn’t have to do anything, and <code>*this</code> doesn’t have to do <em>anything</em> special in its constructor. All of its members take care of their own resources, so the number of resources managed by <code>*this</code> is zero!</p>
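<p>Here is a minimal sketch of that behaviour. The <code>Tracked</code> and <code>Throwing</code> types and the <code>events</code> log are hypothetical helpers invented for this illustration; they simply record which constructors and destructors actually run when the middle member fails.</p>

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Records construction and destruction events so their order can be inspected.
std::vector<std::string> events;

// Hypothetical RAII member: logs when it is constructed and destroyed.
struct Tracked {
    std::string name;
    explicit Tracked(const std::string& n) : name(n) { events.push_back("ctor " + name); }
    ~Tracked() { events.push_back("dtor " + name); }
};

// Hypothetical member whose constructor always fails.
struct Throwing {
    Throwing() { throw std::runtime_error("b failed"); }
    ~Throwing() { events.push_back("dtor b"); }
};

struct Foobar {
    Tracked a;
    Throwing b;
    Tracked c;
    // No try/catch here: the language cleans up whatever was fully constructed.
    Foobar() : a("a"), b(), c("c") {}
    ~Foobar() { events.push_back("dtor Foobar"); }
};

// Attempts to build a Foobar and returns the recorded events.
std::vector<std::string> try_construct() {
    events.clear();
    try {
        Foobar f;
    } catch (const std::runtime_error&) {
        events.push_back("caught");
    }
    return events;
}
```

<p>Calling <code>try_construct()</code> should record the construction of <code>a</code>, then its destruction, then the catch handler. Neither <code>c</code> nor the <code>Foobar</code> destructor ever shows up in the log.</p>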

<p>We don’t even need to write a destructor for <code>Foobar</code> now, if all its members are RAII objects. Whether the <code>Foobar</code> object is partially or fully constructed, its members take care of themselves. That is the power of RAII. Once a resource has been mapped to a class, we can use it as much as we like, even in very complex situations, and never have to worry about the resource being leaked. It is managed by its wrapping RAII object, and the C++ lifetime and scope rules ensure that this wrapper object gets destroyed when it goes out of scope.</p>

<div class="footnotes">
<hr />
<ol>

<li id="fn:1">
<p>A smart pointer is an object which behaves as a pointer (meaning that it overloads the <code>*</code> and <code>-&gt;</code> operators, so it can be dereferenced to yield the pointed-to value), but also enforces some kind of ownership semantics on the value. A plain pointer does nothing when it goes out of scope. If it pointed to some dynamically allocated memory, nothing happens to that memory. And if no one else has a pointer to it, then that memory is lost, and cannot be reclaimed.
A smart pointer does <em>something</em> when it is destroyed. Some variants simply free the memory they point to (<code>boost::scoped_ptr</code>, <code>std::auto_ptr</code> or <code>std::unique_ptr</code> all fall into this category, although with some important differences), while others implement reference counting, so that the memory is only destroyed when <em>all</em> smart pointers pointing to it have been destroyed. <code>boost::shared_ptr</code> is by far the best known implementation of this concept. <a href="#fnref:1" rev="footnote">↩</a></p>
</li>

</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2010/01/the-meaning-of-raii-or-why-you-never-need-to-worry-about-resource-management-again/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Hopes for 2010: Microsoft Visual C++</title>
		<link>http://jalf.dk/blog/2009/12/hopes-for-2010-microsoft-visual-c/</link>
		<comments>http://jalf.dk/blog/2009/12/hopes-for-2010-microsoft-visual-c/#comments</comments>
		<pubDate>Wed, 30 Dec 2009 17:00:24 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[ide]]></category>
		<category><![CDATA[intellisense]]></category>
		<category><![CDATA[msvc]]></category>
		<category><![CDATA[new-year]]></category>
		<category><![CDATA[visual-studio]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=387</guid>
		<description><![CDATA[As I mentioned earlier, I’d like to celebrate the new year by calling out a few products I’d like to see improved in the new year. First in line is Microsoft’s C++ compiler and IDE. From you, what I’d like to see in 2010 is actually fairly simple (at least conceptually): rethink your IDE. The [...]]]></description>
			<content:encoded><![CDATA[<p>As I mentioned <a href="http://jalf.dk/blog/?p=352">earlier</a>, I’d like to celebrate the new year by calling out a few products I’d like to see improved in the new year.</p>

<p>First in line is Microsoft’s C++ compiler and IDE.<span id="more-387"></span></p>

<p>From you, what I’d like to see in 2010 is actually fairly simple (at least conceptually): rethink your IDE. The Visual Studio team as a whole is already <a href="http://blogs.msdn.com/ricom/archive/2009/10/19/my-history-of-visual-studio-part-10-final.aspx">doing this in a big way</a> with VS10. I’m hoping you can find the time to reinvent the C++ IDE specifically as well.</p>

<p>For the past decade (except for the 5 years you wasted trying to eliminate native C++), you’ve been trying very hard to write the ultimate IDE for the wrong language. You’ve got something that works pretty well for C++ code anno 1993 or so, and which completely falls apart when used for more modern C++. Why do you persist in putting so many resources into making Intellisense better, when it <em>still</em> has no way to deal with a simple template function? Isn’t that a hint that you should rethink your approach? Modern C++ has quite a bit in common with dynamic languages. The type of a function parameter may not be known just by looking at the function definition. Often a full-fledged “interactive mode”, as is common in dynamic languages, is desired, for example if we want to see what the actual type of some template parameter is. Or when instantiating template metaprograms, we may wish to step through them line by line at compile-time, rather than being limited to looking at the compiler errors, the metaprogramming equivalent of <code>printf</code>-debugging. What I’d like to see from the MSVC IDE in the coming years is an acceptance of, and support for, modern C++ paradigms: generic programming, template metaprogramming and all the difficulties that implies. Don’t give me an IDE that tries to provide Intellisense for a program with a completely static structure. If you’re going to do Intellisense, make it able to handle the very flexible type system enabled by templates. Take a leaf from the JavaScript support in your IDE, which supports Intellisense even though the language is dynamic and types generally aren’t known until runtime. So to display type information in the IDE while the code is being written, they have to be clever. But they can do it.</p>

<p>In C++, the types aren’t known until compile-time, so from the IDE’s point of view, the problem is similar. To display type information while the code is being written (and before it is compiled), the IDE has to be clever. And at the moment, it isn’t. At the moment, Intellisense just gives up.</p>

<p>Let’s take a simple example. What should the IDE do about this function:</p>

<pre><code>template &lt;typename T&gt;
typename T::return_type foo(T arg){
  bar(arg);
  return arg.baz();
}
</code></pre>

<p>If we play it by the book, and demand a “perfect” solution, there is nothing the IDE can do. It doesn’t know what <code>T</code> is, so it can’t help us with autocompletion, suggesting members after the dot, or anything else.
But if we’re willing to think outside the box, and accept a success rate lower than 100%, there are several strategies the IDE <em>could</em> use to provide meaningful Intellisense information:</p>

<ul>
<li>we could look at the call sites. They must logically provide types that are valid in this context. We could find one call site, and provide Intellisense on the assumption that the type <code>T</code> is whatever was passed at <em>that</em> call site. That wouldn’t be 100% accurate in all cases, of course, but it would give us a type that works with the function, so it would be useful. It could even look at several call sites, and compute the union of the types used. If they all provide a <code>frobnicate()</code> method, then the IDE could assume that <code>T</code> inside the function <code>foo</code> always contains such a member.</li>
<li>We could look at how the type is used in the function. It must have a copy constructor (because it is passed by value), it must have some nested type <code>return_type</code>, and it must have a no-arg <code>baz</code> member function, which returns something convertible to that type. And it must be convertible to whatever arguments <code>bar</code> expects. This probably isn’t a complete description of the type, but it would be enough to give us some limited Intellisense information at least. The compiler might be able to deduce some information about the type. We could even generate some kind of ad-hoc “concepts” implementation — perhaps not as extensive as that which was proposed in C++0x (and subsequently dropped), but a kind of helper datastructure that the IDE can attempt to map onto unknown template types.</li>
<li>Or we could allow the user to specify an example of a valid parameter type, and then use that to generate Intellisense information from.</li>
</ul>
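<p>To make the second strategy a bit more concrete, here is a rough sketch of what such an ad-hoc “concept” could look like if written by hand with SFINAE. The names <code>looks_like_foo_arg</code>, <code>void_of</code>, <code>Widget</code> and <code>NoBaz</code> are made up for this illustration, and the requirement involving <code>bar</code> is left out, since it depends on overloads the sketch knows nothing about.</p>

```cpp
#include <type_traits>
#include <utility>

// Maps any list of well-formed types to void (a hand-rolled void_t).
template <typename...> struct make_void { typedef void type; };
template <typename... Ts> using void_of = typename make_void<Ts...>::type;

// Fallback: T does not meet the requirements inferred from the body of foo.
template <typename T, typename = void>
struct looks_like_foo_arg : std::false_type {};

// Selected only when T::return_type exists and t.baz() is a valid expression.
// The value additionally checks copyability and the conversion of the result.
template <typename T>
struct looks_like_foo_arg<T, void_of<typename T::return_type,
                                     decltype(std::declval<T&>().baz())> >
    : std::integral_constant<bool,
          std::is_copy_constructible<T>::value &&
          std::is_convertible<decltype(std::declval<T&>().baz()),
                              typename T::return_type>::value> {};

// Hypothetical argument types used to exercise the trait.
struct Widget {
    typedef int return_type;
    int baz() const { return 7; }
};

struct NoBaz {
    typedef int return_type;
};

static_assert(looks_like_foo_arg<Widget>::value, "Widget has everything foo uses");
static_assert(!looks_like_foo_arg<NoBaz>::value, "NoBaz lacks the baz() member");
static_assert(!looks_like_foo_arg<int>::value, "int has no nested return_type");
```

<p>An IDE would of course have to derive such a description automatically from the function body rather than have the programmer write it, but the sketch shows that the information is there to be extracted.</p>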

<p>But an alternative approach (and these aren’t mutually exclusive) might be to reduce the reliance on Intellisense, which is essentially a static analysis tool. Perhaps a better approach would be to bring the “Immediate” pane up to date, and make it useful, not just during debugging, but while programming as well. Why can’t I ask, through the Immediate window, for the class <code>std::vector&lt;bool&gt;</code> to be instantiated, for example, so that I can inspect its structure? Perhaps I’m curious what its <code>iterator</code> type will resolve to, or perhaps I want to know the size of the class or other static information. Why can’t I just ask the IDE for this information? Again, modern C++ has a lot in common with dynamic languages. Very little information can be reliably extracted without compiling the code. So give me the tools for optionally and temporarily compiling bits and pieces.</p>
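<p>For comparison, this is roughly what asking such questions looks like today: you write a throwaway translation unit and let the compiler answer through type traits. It is only a sketch of the workaround, not of the interactive feature I am asking for. The typedef <code>bits</code> and the constant names are arbitrary.</p>

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

typedef std::vector<bool> bits;

// The element type is still reported as bool...
constexpr bool value_type_is_bool = std::is_same<bits::value_type, bool>::value;

// ...but the specialization hands out a proxy class rather than a plain bool&.
constexpr bool reference_is_bool_ref = std::is_same<bits::reference, bool&>::value;
constexpr bool reference_is_class = std::is_class<bits::reference>::value;

// The object size is implementation-specific, so it is exposed but not checked.
constexpr std::size_t object_size = sizeof(bits);

static_assert(value_type_is_bool, "vector<bool>::value_type is bool");
static_assert(!reference_is_bool_ref, "vector<bool>::reference is not bool&");
static_assert(reference_is_class, "vector<bool>::reference is a proxy class");
```

<p>Every question costs an edit and a recompile, which is exactly the round trip an interactive pane could remove.</p>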

<p>When I write silly template metaprograms to compute the N’th prime number, and the result is wrong, why doesn’t MSVC provide a compile-time debugger? One which lets me step through the instantiation of this maze of templates, inspect the members of each, and find out where it went wrong, where it instantiated the wrong template, or where I forgot to write the specialization I intended.</p>
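<p>For readers who have not seen one, a metaprogram of that kind might look something like the sketch below. It is one possible formulation, written for this post rather than taken from any real project, and all the template names are invented. Each step is a separate template instantiation, which is precisely what a compile-time debugger would let us step through.</p>

```cpp
// no_divisor_from<N, D>: true when N has no divisor d with D <= d and d * d <= N.
template <unsigned N, unsigned D, bool Done = (D * D > N)>
struct no_divisor_from {
    static const bool value = (N % D != 0) && no_divisor_from<N, D + 1>::value;
};

// Recursion stops once D * D exceeds N.
template <unsigned N, unsigned D>
struct no_divisor_from<N, D, true> {
    static const bool value = true;
};

template <unsigned N>
struct is_prime {
    static const bool value = (N >= 2) && no_divisor_from<N, 2>::value;
};

// next_prime_from<N>: the smallest prime that is greater than or equal to N.
template <unsigned N, bool Prime = is_prime<N>::value>
struct next_prime_from {
    static const unsigned value = next_prime_from<N + 1>::value;
};

template <unsigned N>
struct next_prime_from<N, true> {
    static const unsigned value = N;
};

// nth_prime<K>: the K'th prime, counting from nth_prime<1> == 2. K must be at least 1.
template <unsigned K>
struct nth_prime {
    static const unsigned value = next_prime_from<nth_prime<K - 1>::value + 1>::value;
};

template <>
struct nth_prime<1> {
    static const unsigned value = 2;
};

static_assert(nth_prime<10>::value == 29, "the tenth prime is 29");
```

<p>If the <code>static_assert</code> fires, the only feedback available is the compiler error. There is no way to see which of the dozens of intermediate instantiations produced the wrong value.</p>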

<p>In far too many ways, the C++ IDE really feels like a C IDE. Most of it doesn’t seem to know that there’s this new-fangled thing called “templates”, or that they change how people write code. The Immediate window and the debugger do not recognize template parameter names. If I am debugging a function <code>template &lt;int I&gt; void foo()</code>, why can’t I get the debugger to tell me the value of <code>I</code>? It should be absolutely trivial to do. But the debugger can’t seem to do it. Intellisense can’t seem to do it. The Immediate pane can’t seem to do it. There’s a clear mismatch between the compiler, which is clearly a C++ compiler, and pretty much hasn’t bothered about the C side for close to a decade, and the IDE which still seems to be trying to be the perfect C IDE, completely disregarding every feature unique to C++.</p>

<p>I know you’re used to being told that you have one of the best IDEs in existence. I beg to differ. You may have got one of the best C IDEs, and your C# and VB IDEs kick some serious butt. But your C++ IDE is essentially nonexistent. Your IDE does not support C++. It supports a marginally and conservatively extended C.</p>

<p>So far, I’ve dealt exclusively with the IDE issues, and that’s not a coincidence. On the whole, I’m quite happy with the MSVC compiler. The <a href="http://blogs.msdn.com/vcblog/archive/2009/11/02/visual-c-code-generation-in-visual-studio-2010.aspx">performance</a> of generated code is good; you’re making great progress on <a href="http://blogs.msdn.com/vcblog/archive/2009/04/22/decltype-c-0x-features-in-vc10-part-3.aspx">C++0x support</a>, and overall, you’ve got a compiler I’m happy with. Of course there are still a couple of areas where the lack of standards-conformance is embarrassing (never mind the <code>export</code> keyword, I’m more bothered about two-phase name lookup and other <em>relevant</em> features), there are some features I wish you’d borrow from GCC, and I wish you’d tighten up your warning messages a bit (some of them are nothing more than noise, or are impossible to avoid in “good” healthy code). But on the whole, I have relatively few <em>serious</em> complaints about the compiler. I do, however, have a few suggestions.</p>

<p>It seems to me that the source/header compilation mechanism could use a makeover. We can’t change the actual semantics (yet — hopefully the proposal for a module system for C++ gains traction), but the compiler <em>can</em> change how it actually processes the code. And yet, major compilers still process the source files in the exact same manner they did 20 years ago. Even though this is, on today’s machines, and with today’s huge codebases, ridiculously inefficient.</p>

<p>Ages ago, precompiled headers were invented, but I’m not really a fan of them. It’s a hackish solution which sometimes helps, but may also hurt, due to the tendency towards including everything in one single “blob” header. Even if that header is precompiled, it still means everything that includes it has to deal with these bloated monolithic symbol tables and other data structures. More importantly, it is a fragile solution, as <a href="http://blogs.msdn.com/vcblog/archive/2009/11/12/visual-c-precompiled-header-errors-on-windows-7.aspx">the VC Team’s own blog shows</a>.</p>

<p>But why can’t this mechanism be generalized?
Why can’t the compiler process every header in isolation, build a complete parse tree of each one, and store those on disk? And then, when the header is included, rather than reading and parsing the header again, simply load this parse tree and merge it into the rest of the compilation unit. Of course, it is easy to come up with cases where the file may have to be parsed differently depending on where it is included, but in 99.9% of all cases, the inclusion mechanism is straightforward and simple: The header is typically not included in the middle of a class definition or from inside a namespace. It usually only reacts to a few fixed macros that may be defined before the header’s inclusion. So <em>most</em> of the time, the header could be precompiled in isolation and reused. And for the few cases where the changed state actually matters, where the header is included in the middle of a function definition or with no include guards or where a macro (say <code>CreateWindow</code>, or a similarly common name, <em>cough cough</em>) mangles the contents of the header, in <em>those</em> cases, the compiler can simply fall back to the traditional source code inclusion and subsequent compilation of the translation unit. Even if these precompilation passes aren’t stored to disk in the manner of precompiled headers, they could still be kept in memory, and reused between translation units during a build. If N <code>.cpp</code> files all include a certain header, it would allow that header to be compiled once, rather than N times.</p>

<p>Once again, we have something that feels like a leftover from C. In C, headers were mostly forward declarations and little actual <em>code</em>, so naive processing of headers worked fairly efficiently. In C++, it is getting more and more common to put huge amounts of code in headers, which means that the naive compilation strategy traditionally used for C becomes ridiculously slow and inefficient. Creating a truly <em>general</em> replacement strategy is nearly impossible, true, but it seems like it’d be possible to create a heuristic that’d enable more efficient processing of header files 99% of the time, and which could then fall back to the traditional method of copy/pasting headers into the translation unit for the last percent of cases.</p>

<p>And why does every translation unit have to read every file every time? Can’t their contents be kept in memory, at least for a short time? Those hundreds or thousands of file accesses are painfully slow. Windows already exposes APIs for monitoring file changes, so it should be fairly simple to determine when a source file has been modified, and only then flush it from memory.</p>

<p>And of course, everyone’s favorite nitpick: Why is <code>windows.h</code> so absolutely horrible? Why does it have to be one monolithic header which gives us <em>everything</em> Windows has to offer? Why doesn’t it compile as standard C++? Why does it include so many other headers (as above, slowing down compilation)? Why does it pollute the global namespace with macros for ridiculously common names?</p>

<p>Well, it does, and it’d be silly to expect this to change, due to backwards compatibility concerns.
But why then, is there not a <code>windows.hpp</code> or similar? Why isn’t there a separate cleaned-up, C++-compatible header? One which uses function overloading instead of macros, for example? Or which just defines simple forwarding functions instead of macros? One which compiles even with the non-standard language extensions disabled? Or why isn’t there a <em>set of</em> these headers, allowing us to access the bits of the Windows API we’re interested in, without having to include <em>everything</em>?</p>
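<p>A small sketch of what “overloading instead of macros” could mean in practice. Nothing here is real Windows API: <code>legacy_make_title_a</code>, <code>legacy_make_title_w</code>, the <code>api</code> namespace and <code>Dialog</code> are stand-ins invented for the illustration, so that the example compiles anywhere.</p>

```cpp
#include <string>

// Stand-ins for a C API that ships a narrow and a wide variant of each function.
// These names are hypothetical and are not real Windows functions.
inline std::string legacy_make_title_a(const char* text) {
    return std::string("A:") + text;
}
inline std::wstring legacy_make_title_w(const wchar_t* text) {
    return std::wstring(L"W:") + text;
}

// A C-style header would pick one variant with a macro such as
//   #define legacy_make_title legacy_make_title_a
// which rewrites every use of that identifier, including unrelated class members.

namespace api {
// Overloads make the same selection without involving the preprocessor.
inline std::string make_title(const char* text) { return legacy_make_title_a(text); }
inline std::wstring make_title(const wchar_t* text) { return legacy_make_title_w(text); }
}

// User code may reuse the name for its own member; no macro mangles it.
struct Dialog {
    std::string make_title() const { return api::make_title("dialog"); }
};
```

<p>The forwarding functions are trivially inlinable, so a wrapper header along these lines should cost nothing at runtime. The catch is that someone has to write and maintain it.</p>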

<p>In short, I think the MSVC IDE (and in some cases the compiler too) is in desperate need of a rethink. Out with those 12-year-old project wizards, which create complex predefined project structures accumulating every bad practice and unexpected project setting in one place. I’ve lost count of how many beginners I’ve seen choke because their tiny little projects automatically get precompiled headers thrown in for absolutely no reason. Out with the idea that C++ can best be presented like C#, as a static language where every piece of code can be understood in isolation. Instead, give us an IDE that treats C++ as a more dynamic language, where many types of information are just not available until the program has been compiled, and which embraces the unique features of C++: one which supports and encourages use of templates, one which accepts that in modern C++, most code ends up in header files, and these header files become expensive to compile, and so are an area ripe for optimization. An IDE which treats C++ compilation as an interactive process, where template instantiation can be stepped through and inspected at each stage, and where interactive queries can be made statically or during debugging to inspect not just data, but also types.</p>

<p>Another addition that would really boost the usefulness of MSVC would be to provide facilities for template metaprogramming in unit tests: For example, it is common to use metaprograms to force compilation failures if a template is instantiated with a specific type. But how do we test that this works as intended? Give us the hooks and language extensions necessary to specify that “this function is expected to fail to compile, and if it does, that’s not an error, just remove the function and compile the remainder of the file”. Again, consider the compilation process a part of the language: it is something that must be inspected and debugged, and for which we may wish to write tests.</p>

<pre><code>void my_testcase() __declspec(wontcompile){
  // perhaps we want to ensure that the template can not be instantiated with a reference,
  // so we expect this test to fail
  frobnicator&lt;int&amp;&gt; f; 
}

// if the above test fails to compile (as it should), the compiler should simply ignore it, and proceed to compile the other tests, rather than aborting
void next_testcase() {... } 
</code></pre>
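<p>Without such an extension, the closest approximation I know of only works when the restriction is expressed in a SFINAE-friendly way. The sketch below assumes a hypothetical <code>frobnicator</code> that is simply left undefined for reference types, and a home-made <code>is_complete</code> trait; a <code>frobnicator</code> that rejects references with a hard error inside its body could not be tested like this, which is exactly why compiler support would be welcome.</p>

```cpp
#include <type_traits>

// Hypothetical frobnicator: the primary template is declared but never defined,
// so any instantiation that is not matched by the specialization is incomplete.
template <typename T, typename Enable = void>
struct frobnicator;

// Defined only for non-reference types.
template <typename T>
struct frobnicator<T, typename std::enable_if<!std::is_reference<T>::value>::type> {
    T value;
};

// is_complete<T>: true when sizeof(T) is a valid expression, i.e. T is defined.
template <typename T, typename = void>
struct is_complete : std::false_type {};

template <typename T>
struct is_complete<T, decltype(void(sizeof(T)))> : std::true_type {};

// The "expected to fail" test becomes an ordinary compile-time assertion.
static_assert(is_complete<frobnicator<int> >::value,
              "frobnicator<int> should be usable");
static_assert(!is_complete<frobnicator<int&> >::value,
              "frobnicator<int&> should be rejected");
```

<p>This checks that the template <em>cannot be used</em>, not that a particular diagnostic is produced, so it is a weaker guarantee than the <code>wontcompile</code> idea above.</p>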

<p>Target your IDE at Modern C++, rather than C with classes. Impress the world by being the first IDE to even think about this. Embrace, and provide support for, the changes that have happened in the C++ language, in best practices and in the mindset of the C++ community. Accept that yes, headers are ridiculously heavy these days, and blindly recompiling them for every translation unit doesn’t scale. Accept that the C++ language needs IDE support to inspect what happens in our compile-time metaprograms as well as at runtime. And face up to the fact that traditional Intellisense is a lost cause. There is no way to statically produce all the information we expect from Intellisense. Some can be improvised by various heuristics, or as in the template function case, by assuming some suitable dummy value for the function’s template parameters, but others may be nearly impossible to provide until the program has been compiled. So perhaps you need to think beyond Intellisense to provide this information to the programmer. Perhaps an “interactive mode” would be more suitable. If the IDE can’t provide the information I need automatically, it could at least allow me to query for the information. Perhaps it can’t tell me anything about the template type <code>T</code>, but why can’t I tell it to assume that <code>T</code> is a <code>std::wstring</code>, and provide information based on this assumption? Or something else entirely. You already have a pretty good C++ compiler. It’s time to start working on a C++ IDE, and call it a day on the C IDE you’ve been polishing until now.</p>

<p>So dear MSVC team, in case you can’t think of anything useful to do with your time in the year 2010 (as if… I know you’ve got C++0x support to work on, and that’s infinitely more important to me than IDE improvements), here’s a new year’s resolution for you: Amaze the world, by showing what a C++ IDE <strong>should</strong> work like. Reinvent the role of the C++ IDE, instead of trying to force the C# or C IDE to work for C++ as well.</p>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2009/12/hopes-for-2010-microsoft-visual-c/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Houston, we have a (performance) problem</title>
		<link>http://jalf.dk/blog/2009/12/houston-we-have-a-performance-problem/</link>
		<comments>http://jalf.dk/blog/2009/12/houston-we-have-a-performance-problem/#comments</comments>
		<pubDate>Tue, 15 Dec 2009 13:49:43 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[stm]]></category>
		<category><![CDATA[thesis]]></category>
		<category><![CDATA[transactional-memory]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=403</guid>
		<description><![CDATA[Ouch. These last few days, I’ve been fixing a few lingering bugs in my STM system, and last night, I finally nailed them. Specifically, it is now possible to open variables within a transaction as read-only. An obvious optimization, right? At least that’s the idea. Less work is required by the STM system if we [...]]]></description>
			<content:encoded><![CDATA[<p>Ouch. These last few days, I’ve been fixing a few lingering bugs in my STM system, and last night, I finally nailed them. Specifically, it is now possible to open variables within a transaction as <em>read-only</em>. An obvious optimization, right? At least that’s the idea. Less work is required by the STM system if we can trust that the variable isn’t modified by this transaction.
<span id="more-403"></span></p>

<p>Well, my test case for this feature now takes <em>ages</em> to run. As I mentioned previously, a simple transaction modifying two integer variables under heavy contention can pull off almost two million transactions per second on my laptop.</p>

<p>My new test, in which each thread takes four variables and alternates between modifying two of them and reading the other two, runs perhaps ten thousand (!) times slower.</p>

<p>Of course I have several leads on how to fix this. The problem is largely all the performance-related “extras” I’ve been leaving out. For example, if a transaction fails to acquire a variable it needs, it simply aborts and immediately retries. In many cases, a  better approach would be to block the thread, waiting for that variable to actually become available.</p>

<p>There are several other cases where I have a similar problem: I have to choose between delaying the thread for a moment with <code>sleep()</code> before attempting to continue, blocking it until some condition is true, or aborting the transaction entirely and starting over from scratch. At the moment, I generally just pick the easiest solutions (typically abort, and <em>occasionally</em> call <code>sleep()</code> a few times before we resort to that). Again, implementing some actual meaningful policies here would make a big difference. And tweaking these policies should help still more.</p>

<p>Another problem is that currently, I do not enforce a consistent global order when acquiring objects during a commit. This means I risk livelocks, again causing excessive rollbacks when multiple threads are competing over access to the same variables.</p>
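<p>As a rough illustration of what a consistent global order means, here is a sketch that always acquires per-variable locks in address order. It is not how my STM library is implemented: <code>shared_var</code> and <code>with_all_locked</code> are hypothetical names, and using the address as the ordering key is just one assumed choice.</p>

```cpp
#include <algorithm>
#include <functional>
#include <mutex>
#include <vector>

// Hypothetical transactional variable: a value guarded by its own mutex.
struct shared_var {
    std::mutex mtx;
    int value = 0;
};

// Locks every listed variable in address order, runs `body`, then releases all
// locks when `held` goes out of scope (also if `body` throws).
template <typename Body>
void with_all_locked(std::vector<shared_var*> vars, Body body) {
    // std::less gives a total order over pointers, so every thread sorts the same way.
    std::sort(vars.begin(), vars.end(), std::less<shared_var*>());
    // Drop duplicates so that no mutex is locked twice by the same thread.
    vars.erase(std::unique(vars.begin(), vars.end()), vars.end());

    std::vector<std::unique_lock<std::mutex> > held;
    held.reserve(vars.size());
    for (shared_var* v : vars) {
        held.emplace_back(v->mtx);  // constructing the unique_lock acquires the mutex
    }
    body();
}
```

<p>Because two threads that need the same pair of variables always ask for them in the same order, neither can end up holding one while waiting for the other.</p>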

<p>So I’m still optimistic. It should be possible to get performance back on track. But man, it’s depressing watching performance plummet like this.</p>

<p><strong>Edit</strong><br />
And an update. After poking around a bit, it turned out that most of the time was being spent sleeping. When a transaction attempts to commit, if it cannot acquire all the variables it needs, it retries a few times with a short delay (a couple of milliseconds) in between. If it doesn’t succeed after a few tries, it rolls back the entire transaction and starts over.</p>

<p>It turned out that these few, short <code>sleep()</code> calls brought CPU utilization down to something like 0.01%, and totally destroyed performance. Simply turning the <code>sleep()</code> call into a <em>no-op</em> brought me back to something more or less reasonable. I still need to improve on the above shortcomings, but now at least I can run my tests in less than an hour.</p>
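<p>The shape of that retry loop, reduced to a sketch. The names <code>commit_with_retries</code> and <code>commit_result</code> are invented here, the acquisition step is passed in as a callable rather than being real STM code, and replacing the delay with <code>std::this_thread::yield()</code> is an assumption of the sketch, not what my library does.</p>

```cpp
#include <thread>

enum class commit_result { committed, rolled_back };

// Calls `try_acquire` up to `max_attempts` times. Between failed attempts the
// thread only yields instead of sleeping for a fixed number of milliseconds.
// If every attempt fails, the caller is told to roll back and start over.
template <typename TryAcquire>
commit_result commit_with_retries(TryAcquire try_acquire, int max_attempts) {
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        if (try_acquire()) {
            return commit_result::committed;
        }
        std::this_thread::yield();  // give other threads a chance to release variables
    }
    return commit_result::rolled_back;
}
```

<p>Keeping the policy in one small function like this also makes it easy to swap in something smarter later, such as blocking until the variable is released.</p>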
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2009/12/houston-we-have-a-performance-problem/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using My STM Library</title>
		<link>http://jalf.dk/blog/2009/11/using-my-stm-library/</link>
		<comments>http://jalf.dk/blog/2009/11/using-my-stm-library/#comments</comments>
		<pubDate>Mon, 30 Nov 2009 12:58:22 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[stm]]></category>
		<category><![CDATA[thesis]]></category>
		<category><![CDATA[transactional-memory]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=362</guid>
		<description><![CDATA[As promised yesterday, I’d like to show off a few bits of my STM library. Of course it’s far from done, and is still missing several key features, but the core library is in pretty good shape. So as they say on the internets, “my STM library, let me show you it” In the following, [...]]]></description>
			<content:encoded><![CDATA[<p>As promised yesterday, I’d like to show off a few bits of my STM library. Of course it’s far from done, and is still missing several key features, but the core library is in pretty good shape. So as they say on the internets, <em>“my STM library, let me show you it”</em><span id="more-362"></span></p>

<p>In the following, I’ll show a slightly modified version of one of my test cases. It shows how to call and use my library, from the user’s point of view. The test spawns a number of threads, which each wait on a barrier (because if one thread was allowed to run while another was being constructed, I’d get fewer concurrent transactions, and my test would be less likely to uncover race conditions), and then perform a fixed number of transactions. In each transaction, two transactional variables are opened for writing, one is decremented and the other incremented. If the sum of these variables is nonzero in any iteration, the thread registers a failure.</p>

<p>And if, after all the threads have terminated, the value of each of these variables is not what was expected, a failure is registered as well.</p>

<p>So from a testing point of view, there should be plenty of opportunity for things to go wrong. Just a single small race condition somewhere, and <em>one</em> of the many millions of reads would be inconsistent and the test would fail.</p>

<pre><code>#include &lt;stm.hpp&gt; // my STM library

#include &lt;boost/test/unit_test.hpp&gt; // Boost.Test is used to supply a unit-testing framework
#include &lt;boost/thread.hpp&gt; // Boost.Thread is used as a threading API
#include &lt;algorithm&gt; // std::fill

// define the number of threads to run, and the number of iterations for each
enum { thread_count = 8, iterations = 200000 }; 

// The following are transactional variables. The shared template ensures that the contained value
// can only be accessed as part of a transaction, and provides the necessary metadata
// for checking validity and consistency
// Two such integers are created, both initialized to zero
stm::shared&lt;int&gt; val1(0);
stm::shared&lt;int&gt; val2(0); 

BOOST_AUTO_TEST_SUITE( threads ) // define a test suite named "threads"

// this function defines the body of our transaction. 
// We're passed a transaction, which can be used to open any "shared" variables
bool tx_func(stm::transaction&amp; tx){
    // open both variables for writing
    int&amp; a = val1.open_rw(tx);
    int&amp; b = val2.open_rw(tx);

    // modify the variables freely
    --a;
    ++b;

    return a + b == 0; // Our transaction returns a bool. Other return types (or void) are also supported
}

// this class defines a thread. operator() is called as the thread's entry point
struct thread_functor{
    // In the constructor, the thread object is given a barrier it can synchronize on,
    // and a reference where it can write its result (success/failure)
    thread_functor(boost::barrier&amp; bar, int&amp; res) : bar(bar), res(res) {}

    void operator()(){
        // when the thread is first created, we wait for the barrier
        // This ensures that no transactions are running until all threads have been constructed
        bar.wait(); 
        for (int i = 0; i &lt; iterations; ++i){
            // for each iteration, pass our transaction function to the "atomic" function, which executes it atomically.
            // To get a non-void return type, we have to specify the template parameter explicitly 
            // (this can be avoided in C++0x using the return_of template to deduce the return type implicitly)

            // depending on the return value of the transaction, write success or failure back
            res = (res != 0) &amp;&amp; stm::atomic&lt;bool&gt;(tx_func) ? 1 : 0;
        }
    }

    boost::barrier&amp; bar;
    int&amp; res;
};

// another transaction, to be executed after our helper threads terminate, 
// to verify that the right number of modifications have occurred
// note that here variables are opened for reading only
void verify(stm::transaction&amp; tx) {
    const int&amp; a = val1.open_r(tx);
    const int&amp; b = val2.open_r(tx);

    BOOST_CHECK_EQUAL(-a, thread_count * iterations);
    BOOST_CHECK_EQUAL(b, thread_count * iterations);
}

// finally, we get to our test case itself
BOOST_AUTO_TEST_CASE ( short_concurrent_transactions )
{
    boost::barrier bar(thread_count);
    boost::thread_group gr;
    int res[thread_count]; // array of results
    // set all the results to an initial true/1 value (since each iteration ANDs its outcome with the current result)
    std::fill(res, res+thread_count, 1); 

    for (int i = 0; i &lt; thread_count; ++i){
        gr.create_thread(thread_functor(bar, res[i])); // create the threads, passing the necessary parameters to each
    }

    gr.join_all(); // wait for all threads to terminate

    // verify that each thread returned success
    for (int i = 0; i &lt; thread_count; ++i){
        BOOST_CHECK_EQUAL(res[i], 1);
    }

    // run a final transaction to access both variables and check their final values
    stm::atomic(verify);
}

BOOST_AUTO_TEST_SUITE_END()
</code></pre>

<p>In the above, I used a function object to represent threads, and a regular function to represent transactions. Of course in both cases, either would work — a function object would potentially be more efficient as it is easier for the compiler to inline, but I used a function for brevity.</p>

<p>In C++0x, of course, lambdas could also have been used in both cases.</p>

<p>One of my design goals has been to make basic usage as simple and intuitive as possible, and I think I’ve succeeded so far. Any C++ programmer who is familiar with the STL algorithms or the Boost libraries should find my library’s interface very straightforward. Note especially that all the transaction “magic” of verifying validity and retrying transactions as needed is completely invisible to the user. You simply define a function expressing what your transaction should do, and pass it to the <code>atomic</code> function.</p>

<p>In its current version, this test is able to execute around 1,800,000 transactions per second on my Core Duo 2GHz laptop. (Of course, with transactions as small as these, opening only two variables each, performance is a lot better than it would be in real-world transactions.)</p>

<p>So that’s it for now. Of course I’ve got a few more user-facing features in the pipeline<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup>, and a <em>lot</em> of backend changes, but the basic functionality is there, and I’m pretty happy with it so far.</p>

<div class="footnotes">
<hr />
<ol>

<li id="fn:1">
<p>It should probably be possible to specify that the transaction should <em>not</em> automatically retry if it fails to commit, and instead abort with an exception. It should also be possible to define nested transactions, and to use operations such as the <em>OrElse</em> and <em>Retry</em> primitives introduced in <a href="http://www.haskell.org/haskellwiki/Software_transactional_memory">Haskell STM</a>. <a href="#fnref:1" rev="footnote">↩</a></p>
</li>

</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2009/11/using-my-stm-library/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Being functional in an imperative language</title>
		<link>http://jalf.dk/blog/2009/10/being-functional-in-an-imperative-language/</link>
		<comments>http://jalf.dk/blog/2009/10/being-functional-in-an-imperative-language/#comments</comments>
		<pubDate>Sat, 03 Oct 2009 12:00:25 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[functional]]></category>
		<category><![CDATA[imperative]]></category>
		<category><![CDATA[stm]]></category>
		<category><![CDATA[thesis]]></category>
		<category><![CDATA[transactional-memory]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=334</guid>
		<description><![CDATA[By now, I’ve read an awful lot of papers about STM systems, and certain trends are really starting to stand out, not so much in terms of the algorithms used or the clever schemes invented to make transactions appear atomic, but in how they interface with the actual language. It has really underlined to me [...]]]></description>
			<content:encoded><![CDATA[<p>By now, I’ve read an awful lot of papers about STM systems, and certain trends are really starting to stand out, not so much in terms of the algorithms used or the clever schemes invented to make transactions appear atomic, but in how they interface with the actual language.</p>

<p>It has really underlined to me just how deeply entrenched most Java, C and C++ programmers are in the imperative mindset.
<span id="more-334"></span>
When we want to perform a transaction, we’ll need to execute something like the following pseudocode:</p>

<pre><code>for (;;) {
  tx = new Transaction();
  try {
    // ... fill in body of the transaction ...
    tx.Validate();
    tx.Commit();
    break; // committed successfully, so stop retrying
  }
  catch (InvalidTxException){
    tx.Rollback();
  }
}
</code></pre>

<p>And several STM systems actually require the user to write the above whenever a transaction is to be executed.</p>

<p>Some implementers have realized that this is verbose, messy and error-prone.
So they try to hide it behind a macro.</p>

<pre><code>#define BEGIN_ATOMIC \
  for (;;) { \
    tx = new Transaction(); \
    try {

#define END_ATOMIC \
      tx.Validate(); \
      tx.Commit(); \
      break; \
    } \
    catch (InvalidTxException){ \
      tx.Rollback(); \
    } \
  }
</code></pre>

<p>and now a transaction looks like this instead, from the user’s point of view:</p>

<pre><code>BEGIN_ATOMIC
// ... fill in body of the transaction ...
END_ATOMIC
</code></pre>

<p>Which is obviously shorter and requires less duplicate code. On the other hand, it relies on macros. Eeeew, yuck!</p>

<p>In Java, most implementers have tried to extend the language, to add a language-level <code>atomic</code> statement, allowing code like this:</p>

<pre><code>atomic {
  // ... fill in body of the transaction ...
}
</code></pre>

<p>And yes, this obviously works too, and has the potential to integrate much more closely with the compiler and type-checker. But  it’s also a <em>very</em> imperative kind of thinking. “add another type of scope inside whichever function wants to perform the transaction”.</p>

<p>What astonishes me is that no one (apart from Peyton Jones, who implemented Haskell STM) has realized that what we really want is nothing more than a higher-order function. A function which implements the transaction infrastructure, and then calls a user-defined function.</p>

<p>What we’d like to do, ideally, is to define a function <code>atomic</code>, as in the following pseudocode:</p>

<pre><code>T atomic(f : Tx -&gt; T){
  for (;;) {
    tx = new Transaction();
    try {
      T ret = f(tx);
      tx.Validate();
      tx.Commit();
      return ret;
    }
    catch (InvalidTxException){
      tx.Rollback();
    }
  }
}
</code></pre>

<p>Of course, if we wanted to be <em>really</em> functional, we could use recursion instead of the loop, but let’s not get carried away. Ultimately, what we want is just to hide all the messy infrastructure of a transaction from the user.</p>

<p>The above declares a function <code>atomic</code> which takes as its parameter a function which, given a transaction (type <code>Tx</code>), returns type <code>T</code>.</p>

<p><code>atomic</code> implements the messy transaction loop, and simply calls the user-supplied function inside it.</p>

<p>And presto, the user can now create a transaction like this:</p>

<pre><code>int myTx(Tx tx) {
  // do whatever
}
int result = atomic(myTx);
</code></pre>

<p>Or, in a language which supports lambda expressions,</p>

<pre><code>int result = atomic((tx) =&gt; {/* do whatever */ });
</code></pre>

<p>Bit nicer than messing around with user-implemented loops <em>or</em> macros, isn’t it?
Of course, I’ve so far sought refuge within the peaceful realms of pseudocode, where anything is possible. We obviously can’t do this in Java or C++, can we?</p>

<p>No, and yes, in that order. In Java, I agree, we’re just screwed. Of course, we could just pass in an object instead of a function, and if it implements some ICallableTransactionBody interface, <code>atomic</code> could call its <code>RunTx()</code> method. That would work, but it’s not quite as elegant.</p>

<p>In C++, though, the situation is different. A large part of the standard library (all the STL algorithms) relies on higher-order functions.</p>

<p>In C++, we can define <code>atomic</code> as follows:</p>

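<pre><code>template &lt;typename Func&gt;
void atomic(Func f){
  for (;;) {
    Transaction tx; // a fresh transaction for each attempt
    try {
      f(tx);
      tx.Validate();
      tx.Commit();
      return; // committed successfully, so stop retrying
    }
    catch (const InvalidTxException&amp;){
      tx.Rollback(); // conflict detected: undo and go around again
    }
  }
}
</code></pre>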

<p>First, I should mention that I cheated. I removed the return value, so <code>atomic</code> now returns <code>void</code>. The reason is that it’s not quite trivial to infer the return type of a function in C++. <a href="http://en.wikipedia.org/wiki/C++_Technical_Report_1">TR1</a> does introduce a template, <code>std::tr1::result_of</code>, but it doesn’t work in every case. C++0x will fix this once and for all, by providing the necessary language features to implement it. Until then, we could hack around it, either by requiring the user to explicitly specify the return type (<code>atomic&lt;int&gt;(myTx)</code>), or by using <code>result_of</code> and simply telling the user to avoid the cases where it doesn’t work. To keep the above code example simple, I just dropped the return type instead, but the problem <em>can</em> be solved, and my prototype does it.</p>
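<p>As a rough illustration of the C++0x route, here is a minimal sketch that deduces the return type with <code>decltype</code> and a trailing return type. It is not the prototype’s code: <code>Transaction</code> and <code>InvalidTxException</code> are toy stand-ins invented for this example, the toy <code>Validate</code> deliberately fails on the first attempt so that the retry loop is exercised, and the sketch only handles callables that return a non-<code>void</code> value.</p>

<pre><code>#include &lt;stdexcept&gt;
#include &lt;utility&gt;

// Toy stand-ins, invented for this sketch only.
struct InvalidTxException : std::runtime_error {
    InvalidTxException() : std::runtime_error("invalid transaction") {}
};

class Transaction {
public:
    explicit Transaction(int attempt) : attempt_(attempt) {}
    // Pretend the first attempt always conflicts, to exercise the retry loop.
    void Validate() { if (attempt_ == 0) throw InvalidTxException(); }
    void Commit() {}
    void Rollback() {}
private:
    int attempt_;
};

// The return type is whatever f returns when it is handed a Transaction&amp;.
template &lt;typename Func&gt;
auto atomic(Func f) -&gt; decltype(f(std::declval&lt;Transaction&amp;&gt;())) {
    for (int attempt = 0; ; ++attempt) {
        Transaction tx(attempt);
        try {
            auto ret = f(tx);
            tx.Validate();
            tx.Commit();
            return ret; // committed: hand the result back to the caller
        } catch (const InvalidTxException&amp;) {
            tx.Rollback(); // conflict: undo and retry
        }
    }
}

// A transaction body returning int; no template argument is needed at the call site.
int answer(Transaction&amp;) { return 42; }
</code></pre>

<p>With this in place, <code>int result = atomic(answer);</code> compiles without spelling out <code>atomic&lt;int&gt;</code>.</p>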

<p>Now, on with the show. We have now defined our <code>atomic</code> function. How do we call it?</p>

<p>In one of three ways, in decreasing order of familiarity:</p>

<p>First, with a function pointer. Hopefully every C++ programmer knows about these:</p>

<pre><code>void myTx(Tx tx) {... }
atomic(myTx);
</code></pre>

<p>Fairly clean and simple. One downside is that because the <code>atomic</code> function is just passed a function pointer, the compiler may not be able to inline the call to <code>myTx</code>. It is doubtful that it’d make much difference here, but it is nevertheless a valid concern.</p>

<p>The second approach fixes this, as well as enabling some more flexibility: Functors, or function objects, which every C++ programmer <em>should</em> know about, but unfortunately, not everyone does:</p>

<pre><code>struct myTx {
  void operator()(Tx tx) {...}
};
atomic(myTx());
</code></pre>

<p>Now we’re passing in an object of type <code>myTx</code>, which means that the compiler knows exactly which function to call: <code>myTx::operator()(Tx)</code>. So it can inline this trivially, eliminating the above performance concern.</p>

<p>Perhaps more importantly, our function can now hold state. It is an object, so it can be given a constructor and destructor. It can be given other member functions to allow the user to interact with it. It has become a lot more versatile.</p>

<p>Finally, we can wait for C++0x, which introduces lambda expressions:</p>

<pre><code>atomic([](Tx tx) { ...});
</code></pre>

<p>And what’s noteworthy is that the <em>same</em> <code>atomic</code> definition works in all three cases! So the user is actually given the choice of which method to use. Our STM library supports them all.</p>
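<p>To see the three styles side by side, here is a small self-contained sketch. The <code>Tx</code> class is an empty placeholder made up for the example (it never actually conflicts), the transaction is passed by reference rather than by value, and <code>counter</code>, <code>bump</code>, <code>add_n</code> and <code>run_all_three</code> are hypothetical names.</p>

<pre><code>// Placeholder types, invented for this sketch only.
struct InvalidTxException {};

struct Tx {
    void Validate() {}
    void Commit() {}
    void Rollback() {}
};

// One definition of atomic, accepting anything callable with a Tx&amp;.
template &lt;typename Func&gt;
void atomic(Func f) {
    for (;;) {
        Tx tx;
        try {
            f(tx);
            tx.Validate();
            tx.Commit();
            return;
        } catch (const InvalidTxException&amp;) {
            tx.Rollback();
        }
    }
}

int counter = 0;

// 1. A plain function, passed as a function pointer.
void bump(Tx&amp;) { ++counter; }

// 2. A function object that carries its own state (the amount to add).
struct add_n {
    explicit add_n(int amount) : amount(amount) {}
    void operator()(Tx&amp;) const { counter += amount; }
    int amount;
};

// Runs one transaction in each style and returns the resulting counter.
int run_all_three() {
    counter = 0;
    atomic(bump);                          // function pointer
    atomic(add_n(10));                     // function object
    atomic([](Tx&amp;) { counter += 100; });   // 3. lambda (C++0x)
    return counter;
}
</code></pre>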

<p>Isn’t this cleaner than the original approach, of requiring the user to implement the whole loop himself?
So why has no one used it before?</p>

<p>Ultimately, I can think of two possible reasons:</p>

<ul>
<li>the writers, usually very clever researchers and academics, were not familiar with higher-order functions, or with functional programming in general, or</li>
<li>the writers did not know that this was possible in C++</li>
</ul>

<p>I suppose the second option is the most forgivable. C++ is a big messy language, and not everyone is familiar with every corner of it. That’s fair enough, but even then, most of this could have been done in C too, by using function pointers. It’s not <em>that</em> esoteric, is it? But I sincerely hope it is not the first case. I could understand if Joe Coder at Insignificant Software Inc. had never heard of functional programming, but most of these are computer science professors!</p>

<p>Incidentally, this should also provide a nice example for when an imperative programmer asks “how would it benefit me to learn a functional language?”</p>

<p>Here we have a clear example of how using a very simple functional technique can significantly clean up the code and make it much more robust<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup>, even in an imperative language like C++.</p>

<div class="footnotes">
<hr />
<ol>

<li id="fn:1">
<p>Imagine what would have happened in the original loop- or macro-based versions, if the user had dared to put a <code>break</code> or <code>return</code> in their transaction… But both are safe in the higher-order function version. <a href="#fnref:1" rev="footnote">↩</a></p>
</li>

</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2009/10/being-functional-in-an-imperative-language/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>A .NET Developers Guide to C++ (part III)</title>
		<link>http://jalf.dk/blog/2009/10/a-net-developers-guide-to-c-part-iii/</link>
		<comments>http://jalf.dk/blog/2009/10/a-net-developers-guide-to-c-part-iii/#comments</comments>
		<pubDate>Thu, 01 Oct 2009 15:45:28 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[.net]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[teaching]]></category>
		<category><![CDATA[win32]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=247</guid>
		<description><![CDATA[We’re nearing the end! Part I focused on the very fundamentals of C and C++, making sure that you understand the build system and the very basics of the syntax. Part II expanded on this to teach you all the C++ you’ll need to do basic work in the language, including a few useful parts [...]]]></description>
			<content:encoded><![CDATA[<p>We’re nearing the end!</p>

<p><a href="http://jalf.dk/blog/2009/08/a-net-developers-guide-to-c/">Part I</a> focused on the very fundamentals of C and C++, making sure that you understand the build system and the very basics of the syntax.</p>

<p><a href="http://jalf.dk/blog/2009/09/a-net-developers-guide-to-c-part-ii/">Part II</a> expanded on this to teach you all the C++ you’ll need to do basic work in the language, including a few useful parts of the standard library, such as vectors and strings.</p>

<p>You now know all the basics we need, and the actual Win32 API should now be very simple to deal with. Not elegant or consistent, but comprehensible as long as you keep a close eye on the documentation and take nothing for granted.</p>

<p><span id="more-247"></span></p>

<p>First of all, the documentation can be found <a href="http://msdn.microsoft.com/en-us/library/aa383749.aspx">here</a>. As you probably already know, Microsoft’s own search capabilities are nonexistent, and to find the function you need, you’ll typically want to use Google. But sometimes, the complete reference is useful, so here it is.</p>

<p>To teach you how to use the Win32 API, I will run you through a pair of functions with some very basic functionality: retrieving the last error message.</p>

<p>Should be simple, right? You’d think so, if you’re new to Win32.</p>

<p>The operation consists of two steps. First we have to retrieve the last error code, and then we have to ask Windows for the associated message as a string.</p>

<p>And the first step is indeed easy. We just have to call the <a href="http://msdn.microsoft.com/en-us/library/ms679360.aspx"><code>GetLastError</code></a> function. Let’s start with the complete code:</p>

<pre><code>#include &lt;windows.h&gt;

int main() {
  DWORD error = GetLastError();
}
</code></pre>

<p>Feel free to run this in the debugger and see which code the function returns. (Most likely you’re going to get a <code>0</code>, since no error has actually occurred at this point).</p>

<p>Now let’s look at the actual documentation. They describe the function as having this signature:</p>

<pre><code>DWORD WINAPI GetLastError(void);
</code></pre>

<p>which looks like nothing we’ve seen so far. Let’s take the easy parts first. The function’s name is <code>GetLastError</code>. It takes a parameter of type <code>void</code>, or so it seems. This is actually a throwback to C, and means that the function takes no parameters. Both <code>GetLastError()</code> and <code>GetLastError(void)</code> are legal in C++. In C, the two used to have subtly different meanings. <code>(void)</code> properly declared a function which took no parameters, while <code>()</code> declared a function whose parameters were left unspecified, so it could be called with <em>any number of arguments</em>. But that was C. In C++, the two are identical, and we usually use <code>()</code> to indicate a function that takes no parameters.</p>

<p>Next, we have the return type at the far left. <code>DWORD</code> is short for Double Word. (A word is the “natural” data size on the CPU, which, back in the old days, was a 16-bit integer. Hence, a double word is 32 bits wide.) Microsoft has defined <code>DWORD</code> as a type alias for portability. Under the hood it is a typedef for <code>unsigned long</code>, which on Windows is a 32-bit unsigned integer, the same size as <code>unsigned int</code>. If you use <code>DWORD</code> when they tell you to, your code doesn’t depend on which built-in type sits underneath. It is easiest to just nod and accept this. It doesn’t make a big difference for us, but if and when we need to, we know that we can cast a <code>DWORD</code> to an <code>unsigned int</code>.</p>

<p>That leaves the last name, <code>WINAPI</code>, which exists for pretty much the same purpose. It is a macro, and stands for the <em>calling convention</em>. The calling convention for a function specifies how parameters should be passed to it, and where it should place its return value. If we don’t know the calling convention of a function, we cannot call it. Normally, we’re happy to use the default calling convention, but the Windows API has to be specific, so they add the <code>WINAPI</code> macro. They use a macro so that if they one day decide to change the underlying calling convention, they can simply redefine it, and everyone’s code should still compile with no problems.</p>

<p>Following this, they describe the parameters and return value in detail. This is always worth reading closely, because often, some parameters may or <em>must</em> be NULL. Likewise, the return value may have several meanings, and there is no single consistent convention. Some functions return zero on success, others return non-zero, or a positive value on success. Some don’t return a success code at all. And some functions return NULL on error, and <em>actual data</em> on success. Always, <em>always</em> read this section carefully.</p>

<p>In this case, we’re lucky. It simply returns the currently active error code, although it does ramble on about all the inconsistencies caused by other functions.</p>

<p>The <strong>remarks</strong> section tells us other information that doesn’t fit under one specific parameter or under the return value. Again, this should never be skipped. This is where all the inconsistencies and special cases are often listed.</p>

<p>Some functions then have a link to an example usage.</p>

<p>Finally, the documentation shows us <em>where</em> and <em>when</em> the function is defined. In this case, we need at least Windows 2000, and the function is defined in <code>WinBase.h</code> (but we should just include <code>windows.h</code>).</p>

<p>And it is defined in the <code>Kernel32.Lib</code> library. This library is included by default, so we don’t have to worry about this.</p>

<p>So far, it hasn’t been <em>too</em> bad, has it? It should be clear already that it’s not a pretty API, but as long as we stick to the documentation it’s pretty straightforward.</p>

<p>So let’s move on to the <a href="http://msdn.microsoft.com/en-us/library/ms679351.aspx"><code>FormatMessage</code></a> function. Follow that link, and take a look… I’ll be here waiting…</p>

<p>Done?
Good. Now <strong>this</strong> looks scary. And no, this time I can’t give you a simple explanation. This function truly <em>is</em> scary. Of course, this is one of the reasons why I picked it for this example. This is about as bad as the Win32 API gets.</p>

<p>The page lists the following function prototype:</p>

<pre><code>DWORD WINAPI FormatMessage(
  __in      DWORD dwFlags,
  __in_opt  LPCVOID lpSource,
  __in      DWORD dwMessageId,
  __in      DWORD dwLanguageId,
  __out     LPTSTR lpBuffer,
  __in      DWORD nSize,
  __in_opt  va_list *Arguments
);
</code></pre>

<p><code>__in</code>, <code>__in_opt</code> and <code>__out</code> are Microsoft-specific extensions, and are mainly used for documentation and for static code verification. They tell us which parameters are used for input, and which ones are for output, as well as which ones are optional.</p>

<p><code>LPCVOID</code> is another Microsoft type alias. Microsoft spent a decade or two promoting Hungarian Notation before they had to admit what an astonishingly bad idea it actually was. But of course Win32 is stuck with it.
The <code>LP</code> prefix stands for “Long Pointer”, and you can pretty much ignore the “Long” part. That dates back to 16-bit computers, where you actually had different types of pointers (far and near pointers). All we need to know is that it is a pointer. The <code>C</code> is for constant. In other words, this is a pointer to constant void, or <code>const void*</code>. (Of course, <code>void</code> isn’t a very meaningful thing to point to. A void pointer is essentially used as a pointer to an unknown type.)</p>

<p><code>LPTSTR</code> is another adventure in Hungarian Notation. You already know <code>LP</code>. <code>STR</code> is probably obvious too. It’s a string. (Of course, since this is a C API, we’re talking about a C string, or a char pointer, which also explains the presence of the <code>LP</code> part.) That leaves the <code>T</code>. What can that mean? I’m not sure. It might be “Template” or similar. It was introduced when Microsoft realized that they’d have to support Unicode. As I mentioned previously, Windows uses <code>wchar_t</code> for Unicode text, and so their API had to accept <code>wchar_t</code> pointers when working with Unicode strings. But they also had to stay backwards compatible, and be able to handle plain char pointers.</p>

<p>So they invented a new set of macros. The <code>T</code> essentially stands for “whichever character type is currently active”.
If you enter your project’s properties, you’ll see the option to enable or disable Unicode on the General tab. It should be enabled by default.</p>

<p>As long as Unicode is enabled, any macro including this T will be mapped to the equivalent macro using a <code>W</code> (for Wide). If Unicode is disabled, the macro will instead point to a similarly named macro <em>without</em> this character.</p>

<p>In other words:</p>

<ul>
<li>LPTSTR -&gt; LPWSTR or LPSTR</li>
<li>LPCTSTR -&gt; LPCWSTR or LPCSTR</li>
<li>TCHAR -&gt; WCHAR or CHAR</li>
</ul>

<p>And all of these again point to the types you would probably now expect. <code>LPWSTR</code> is a pointer to a wide string (<code>wchar_t*</code>). And <code>LPCSTR</code> is a pointer to a <em>const</em> string, or <code>const char*</code>. And <code>WCHAR</code> is a <code>wchar_t</code>.</p>

<p>As if this wasn’t complicated enough, the function itself is <em>also</em> a macro. Two versions of the function actually exist:</p>

<ul>
<li><code>FormatMessageA</code> is the old ANSI version (hence the <code>A</code>), using plain <code>char</code> strings.</li>
<li><code>FormatMessageW</code> is the “new” Unicode version, using <code>wchar_t</code> strings.</li>
</ul>

<p>FormatMessage is not itself a function, but simply a macro, which is resolved by the preprocessor to one of these two. (C doesn’t allow overloaded functions, so they had to settle for this ugly hack to allow multiple definitions of the same function).</p>

<p>This also means that we can actually <em>call</em> these two names directly. If we call <code>FormatMessageW</code>, we’ll get the Unicode version regardless of whether Unicode is enabled in project settings. This makes it safe for us to use <code>wchar_t</code> strings directly, rather than mess around with <code>TCHAR</code> strings which might be one or the other.</p>
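<p>The mechanism is easy to imitate in portable code. The following is only a sketch of the idea, not what <code>windows.h</code> literally contains: <code>MY_UNICODE</code>, <code>my_tchar</code>, <code>my_tstring</code>, <code>describe</code>, <code>describeA</code> and <code>describeW</code> are all hypothetical names standing in for the project’s Unicode setting, <code>TCHAR</code>, and the <code>FormatMessage</code>/<code>FormatMessageA</code>/<code>FormatMessageW</code> trio.</p>

<pre><code>#include &lt;string&gt;

// Plays the role of the "Unicode enabled" project setting in this sketch.
#define MY_UNICODE

// Two real functions with distinct names, one per character type.
std::string  describeA() { return "narrow"; }
std::wstring describeW() { return L"wide"; }

// The unsuffixed names are aliases that the preprocessor resolves.
#ifdef MY_UNICODE
typedef wchar_t my_tchar;
#define describe describeW
#else
typedef char my_tchar;
#define describe describeA
#endif

// A string type that follows whichever character type is active.
typedef std::basic_string&lt;my_tchar&gt; my_tstring;

// Code written against the unsuffixed names compiles under either setting.
my_tstring current_description() { return describe(); }
</code></pre>

<p>Calling <code>describeW()</code> directly sidesteps the switch entirely, which is the same trick as calling <code>FormatMessageW</code> by name.</p>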

<p>Going back to the function declaration, the type of the last parameter, <code>va_list</code>, looks a bit out of place. It’s not capitalized, and it doesn’t have these ugly prefixes. It comes from the C standard library, where it is used to walk through <em>a variable number of arguments</em>, commonly known as <code>varargs</code>. As I mentioned in part I, printf uses <code>varargs</code> as well, and this throws away all hope of type safety, or even of knowing how many parameters are passed to the function. Hopefully we won’t need to mess with this. (It’s marked as <code>__in_opt</code>, so it should be optional. Let’s hope we won’t have to use it, then.)</p>
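<p>If you have never seen varargs in action, here is a minimal sketch using the standard <code>&lt;cstdarg&gt;</code> facilities. <code>sum_ints</code> is a made-up function for illustration. Notice that it simply has to trust the caller about both the number and the type of the arguments.</p>

<pre><code>#include &lt;cstdarg&gt;

// Sums `count` ints passed as variable arguments. Nothing checks that the
// caller really passed `count` values, or that they are ints.
int sum_ints(int count, ...) {
    va_list args;
    va_start(args, count);           // start reading after the last named parameter
    int total = 0;
    for (int i = 0; i &lt; count; ++i) {
        total += va_arg(args, int);  // we must name the type ourselves
    }
    va_end(args);
    return total;
}
</code></pre>

<p>A call such as <code>sum_ints(3, 1, 2, 3)</code> works as intended, but <code>sum_ints(3, 1, 2)</code> compiles just as happily and reads an argument that was never passed.</p>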

<p>Anyway, there’s nothing for it. Let’s dive in. First parameter:</p>

<p>OK, so this is a <code>DWORD</code> flag, and seems to store a combination of two values. The second table is shorter, so let’s take that first. There are three options here. A zero just means to preserve whatever line breaks exist in the message by default. The constant <code>FORMAT_MESSAGE_MAX_WIDTH_MASK</code> seems to preserve hardcoded line breaks, but removes “regular” ones. I have no clue what the difference is.</p>

<p>The last option (mentioned just under the table) is to store any other number into the value. This then specifies the maximum line width. We’re happy to just use the default line breaks though, so we’ll settle for the zero value. That leaves the first table.</p>

<p>Looking through the options there, it seems that we need <code>FORMAT_MESSAGE_FROM_SYSTEM</code>. <code>FORMAT_MESSAGE_ALLOCATE_BUFFER</code> seems potentially interesting as well, but this table doesn’t really explain what happens if this flag is <em>not</em> enabled. If the system doesn’t allocate a buffer for us, who does? Looking down further, at the input parameter <code>nSize</code> we see that:</p>

<blockquote>
  <p>If the FORMAT_MESSAGE_ALLOCATE_BUFFER flag is not set, this parameter specifies the size of the output buffer, in TCHARs.</p>
</blockquote>

<p>In other words, if we don’t use this flag, we have to provide a buffer. But we don’t know the length of the message we’re trying to retrieve, so this seems a bad idea. (Of course we could just provide a buffer of 64KB, which the documentation mentions is the maximum size, but this seems silly).</p>

<p>Finally, if we skip down to the “Security Remarks”, it says to add <code>FORMAT_MESSAGE_IGNORE_INSERTS</code> if we’re going to pass “arbitrary system error codes”, which we are. Most APIs try to ensure that the simplest action is the correct one. Win32 seems to be designed for the opposite case, ensuring that correct usage should only be possible if you have <em>already</em> read the entire documentation page, very carefully, and at least three times. But that won’t stop us.</p>

<p>So the <code>dwFlags</code> parameter should then be the combination of these flags: <code>FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_IGNORE_INSERTS | 0</code>, although of course the 0 can be omitted.</p>
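<p>If packing several options into one integer is new to you, this sketch shows the general technique. The constants are invented for illustration (the real <code>FORMAT_MESSAGE_*</code> values come from <code>windows.h</code>). The assumption is the usual one: each option occupies its own bit, and a small numeric field, here the line width, lives in the low byte.</p>

<pre><code>// Illustrative flag values, made up for this sketch.
const unsigned long FLAG_FROM_SYSTEM     = 0x1000;
const unsigned long FLAG_ALLOCATE_BUFFER = 0x0100;
const unsigned long FLAG_IGNORE_INSERTS  = 0x0200;
const unsigned long WIDTH_MASK           = 0x00FF; // low byte holds the line width

// Combine the options with bitwise OR; OR-ing in 0 changes nothing.
unsigned long make_flags() {
    return FLAG_FROM_SYSTEM | FLAG_ALLOCATE_BUFFER | FLAG_IGNORE_INSERTS | 0;
}

// The callee tests for an option with bitwise AND.
bool has_flag(unsigned long flags, unsigned long flag) {
    return (flags &amp; flag) != 0;
}

// The width field is extracted by masking off everything but the low byte.
unsigned long line_width(unsigned long flags) {
    return flags &amp; WIDTH_MASK;
}
</code></pre>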

<p>Next, we have <code>lpSource</code>. Luckily, this is marked optional, and it is stated that this is ignored unless one of the two listed <code>dwFlags</code> values is set, which they’re not in our case. So we ignore it and simply pass NULL.</p>

<p>Then we have the message ID. This must be the value we got from <code>GetLastError</code>. Then we have the language ID. Rather than going searching for possible values to pass here, we can see that if we just pass a zero, it’ll try to pick a sensible default. So let’s do that.</p>

<p>Now comes the pointer to the output buffer. Read what it says here carefully:</p>

<blockquote>
  <p>If dwFlags includes FORMAT_MESSAGE_ALLOCATE_BUFFER, the function allocates a buffer using the  LocalAlloc  function, <em>and places the pointer to the buffer at the address specified in lpBuffer</em>.</p>
</blockquote>

<p>So the parameter <code>lpBuffer</code> is a pointer to <em>the pointer to the buffer</em>. That is, we must pass it a pointer to the pointer it should set to point to the allocated buffer.</p>
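<p>This double duty is easier to follow in a stripped-down imitation. The sketch below is portable and does not call Win32 at all: <code>get_message</code> and <code>fetch_allocated</code> are hypothetical functions, and <code>malloc</code>/<code>free</code> stand in for <code>LocalAlloc</code>/<code>LocalFree</code>. The point is only to show one parameter being used either as a buffer or as the address of the caller’s pointer.</p>

<pre><code>#include &lt;cstddef&gt;
#include &lt;cstdlib&gt;
#include &lt;cwchar&gt;

// Hypothetical stand-in for a FormatMessage-style function. The parameter is
// declared as wchar_t*, but in "allocate" mode it is really the address of the
// caller's pointer. Returns the number of characters written, or 0 on failure.
std::size_t get_message(bool allocate, wchar_t* buffer, std::size_t size) {
    static const wchar_t text[] = L"hello";
    const std::size_t length = sizeof(text) / sizeof(text[0]) - 1;
    if (allocate) {
        wchar_t* fresh = static_cast&lt;wchar_t*&gt;(std::malloc(sizeof(text)));
        if (!fresh) return 0;
        std::wmemcpy(fresh, text, length + 1);
        // Reinterpret the argument as a pointer to the caller's pointer, and set it.
        *reinterpret_cast&lt;wchar_t**&gt;(buffer) = fresh;
        return length;
    }
    if (size &lt;= length) return 0; // the caller's buffer is too small
    std::wmemcpy(buffer, text, length + 1);
    return length;
}

// Caller side: pass the address of our pointer, cast to the declared type.
std::size_t fetch_allocated(wchar_t*&amp; out) {
    out = 0;
    return get_message(true, reinterpret_cast&lt;wchar_t*&gt;(&amp;out), 0);
}
</code></pre>

<p>After a successful <code>fetch_allocated(p)</code>, the caller owns the buffer that <code>p</code> points to and has to release it with the matching function, <code>std::free</code> in this sketch.</p>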

<p>It also mentions that the buffer is allocated with <code>LocalAlloc</code>, and must be freed by us with <code>LocalFree</code>. Better remember this, or we’ll leak memory. Note that Windows defines several different memory allocation functions. This time they chose to use <code>LocalAlloc</code>. C++’s <code>new</code> and <code>delete</code> are implemented in terms of <em>some</em> of these, but who knows which?</p>
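<p>One way to avoid forgetting the matching free call is to hand the pointer to an object whose destructor makes it for us. Here is a minimal sketch of that idea using C++0x’s <code>std::unique_ptr</code> with a custom deleter. To keep it portable, <code>fake_local_alloc</code> and <code>fake_local_free</code> are invented stand-ins for <code>LocalAlloc</code> and <code>LocalFree</code>, and <code>live_buffers</code> exists only so the sketch can show that nothing leaks.</p>

<pre><code>#include &lt;cstddef&gt;
#include &lt;cstdlib&gt;
#include &lt;cwchar&gt;
#include &lt;memory&gt;

// Stand-ins for LocalAlloc/LocalFree, invented so the sketch compiles anywhere.
int live_buffers = 0;

void* fake_local_alloc(std::size_t bytes) {
    void* p = std::malloc(bytes);
    if (p) ++live_buffers;
    return p;
}

void fake_local_free(void* p) {
    if (p) { --live_buffers; std::free(p); }
}

// Deleter that releases the buffer with the matching free function.
struct local_deleter {
    void operator()(wchar_t* p) const { fake_local_free(p); }
};
typedef std::unique_ptr&lt;wchar_t, local_deleter&gt; local_string;

// Allocates a buffer, uses it, and relies on scope exit to release it.
std::size_t use_and_release() {
    local_string msg(static_cast&lt;wchar_t*&gt;(fake_local_alloc(8 * sizeof(wchar_t))));
    if (!msg) return 0;
    std::wcscpy(msg.get(), L"done");
    return std::wcslen(msg.get());
}   // msg goes out of scope here, and the deleter frees the buffer
</code></pre>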

<p>Now comes <code>nSize</code>. It allows us to specify the minimum number of characters to allocate? Why would we care about that? Let’s just pass zero and hope for the best. It’s just a minimum after all.</p>

<p>Finally, we have <code>Arguments</code>. We already specified that the system should ignore inserts, so it seems like it shouldn’t actually care about these arguments. They’re also specified as optional, so let’s pass a big fat <code>NULL</code> here.</p>

<p>And that should be it! Now we just have to handle the return value:</p>

<ul>
<li>zero on failure, or</li>
<li>the number of TCHARs stored in the output buffer, not counting the terminating null character</li>
</ul>

<p>And… we’re through. Now let’s try putting the pieces together and see what happens:</p>

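<pre><code>#include &lt;windows.h&gt;
#include &lt;iostream&gt;

int main() {
  DWORD error = GetLastError();

  wchar_t* buffer = NULL; // set by FormatMessageW on success

  DWORD length = FormatMessageW(
    FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_IGNORE_INSERTS,
    NULL,
    error,
    0,
    (wchar_t*)&amp;buffer, // really the address of our pointer
    0,
    NULL);

  if (length == 0) { // zero means the lookup failed and no buffer was allocated
    std::wcerr &lt;&lt; L"FormatMessageW failed" &lt;&lt; std::endl;
    return 1;
  }

  std::wcout &lt;&lt; buffer &lt;&lt; std::endl;

  LocalFree(buffer); // the buffer came from LocalAlloc, so free it with LocalFree
}
</code></pre>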

<p>Note the ugly cast we need on the buffer. This is necessary because the argument may be either a pointer to a pre-allocated buffer, or (as in our case), a pointer to a pointer we’d like to be set to point to the system-allocated buffer. But the function expects a pointer to a string buffer, not a pointer to a pointer to a string buffer, so if we want to pass it the latter, we have to cast it to the former type.</p>

<p>Note that I’m calling the <code>W</code> version of the function specifically, and using <code>wchar_t</code> instead of <code>TCHAR</code>. The reason is simple. I want the Unicode version, regardless of Unicode setting in the project. Part of the reason is that it’s a lot easier to print out the string when we know what type it is. In particular, the standard library requires us to use <code>cout</code> for regular character strings, and <code>wcout</code> for wide strings. If we’re given a string of TCHARs, do we call <code>cout</code> or <code>wcout</code> to print it? Easier to just be specific and make sure we have wide characters.</p>

<p>Well, that’s it. Try running it. It should print out that “the operation completed successfully”. Gee, thanks. That really makes it all feel worthwhile, doesn’t it? Make sure you understand what our code means (in particular, why the cast is necessary, and how <code>wcout</code> is able to print out the string and know where it ends, when all it has is a pointer to a character).</p>

<p>Anyway, you’ve now seen some of the worst the Win32 API has to offer. And you’re still alive. Many of the functions you might want to call are far simpler than this.</p>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2009/10/a-net-developers-guide-to-c-part-iii/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A .NET Developers Guide to C++ (part II)</title>
		<link>http://jalf.dk/blog/2009/09/a-net-developers-guide-to-c-part-ii/</link>
		<comments>http://jalf.dk/blog/2009/09/a-net-developers-guide-to-c-part-ii/#comments</comments>
		<pubDate>Tue, 08 Sep 2009 13:19:56 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[.net]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[teaching]]></category>
		<category><![CDATA[win32]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=316</guid>
		<description><![CDATA[Welcome to the second installment in my guide of “what you need to know if you’re a .NET programmer who wants to be able to write C++ code and call native APIs”. It took me much longer to get this posted than I’d hoped. My work on my thesis has kept me more busy than [...]]]></description>
			<content:encoded><![CDATA[<p>Welcome to the second installment in my guide of “what you need to know if you’re a .NET programmer who wants to be able to write C++ code and call native APIs”. It took me much longer to get this posted than I’d hoped. My work on my thesis has kept me more busy than I’d originally expected. Sorry for the delay!</p>

<p>In <a href="http://jalf.dk/blog/2009/08/a-net-developers-guide-to-c/">part I</a>, I went through a minimal “Hello World” program in some detail, and attempted to explain the arcane workings of the C/C++ compilation model. Some may argue that this had no relevance to my target audience, but I think it is a necessary evil. Almost all C++ programmers get tripped up at some point by the difference between compiler and linker errors, and what exactly the <code>#include</code> directive actually <em>does</em>. Hopefully, by reading part I, you’ll be able to avoid this.</p>

<p>With that out of the way, we can get started on the interesting part, though. Part II will focus on actual C++ code. We won’t consider managed interop or even the Win32 API yet, though. This part will still take place in native C++-land only. In short, the purpose of this part is to enable you to write simple C++ programs, and more importantly, to <em>understand</em> the C++ sample code you probably run into from time to time.</p>

<p><span id="more-316"></span></p>

<p>I will <em>not</em> cover all the idioms and techniques that “real” C++ programmers use. We’ll settle for the bare minimum required to get by in a .NET-to-Win32 interop scenario where you really just want to write enough C++ code to call some native API function. This means that we won’t get the most robust, reusable, elegant or concise C++ code. But we <em>will</em> be able to get the job done.</p>

<p>I’d love to write a more detailed series of posts about “modern C++”<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup> some other time, but it is beyond the scope of this series of posts.</p>

<h1>Using C++</h1>

<p>Before we get into the Win32 API, let’s run through some slightly bigger C++ examples than the Hello World from part I. At the very least you’re going to need to know how to define and use classes, and a few useful components in the standard library.</p>

<p>You already know that it is possible to define class member functions outside classes, but you haven’t yet seen a nontrivial class definition. Let us try creating one. For the purposes of demonstration, I’ll implement the simplest class I can think of: a counter. It’ll simply contain an integer, and callers will be able to increment the value, and get the current value.</p>

<pre><code>class counter {
public:
  counter() : i(0) {}
  int current() {return i; }
  void update() { ++i; }
private:
  int i;
};
</code></pre>

<p>There, we now have a basic class. We can call it from this function. (I use <code>assert</code> to indicate expected values of variables, much like you would in a unit-test. Note that the asserts are pseudocode: among other things, I will access private class members with them, which obviously won’t work in reality.)</p>

<pre><code>// assume that we either placed the class definition here, or have a #include for the header in which the class is defined.
int main(){
  counter c;
  int i = c.current();
  assert(i == 0);
  c.update();
  assert(c.i == 1);
  assert(c.current() == 1);  
}
</code></pre>

<p>A ridiculously simple program, of course. But there are several things worth noting. In no particular order:</p>

<ul>
<li>Our counter object <code>c</code> is created without using <code>new</code>, and without explicitly calling a constructor. All C++ types are fundamentally similar to .NET’s value types, so <code>c</code> is not a reference to a counter, but instead a default-constructed instance of one, placed on the stack. If nothing else is specified, the default constructor is called when the variable is declared. (To call another constructor, we could have done something like <code>counter c(1, "hello", 2.0f);</code>, assuming the class defined a constructor taking those arguments.)</li>
<li>the class definition is terminated by a semicolon. This is important to remember, as forgetting it can lead to very misleading compiler errors. I won’t get into <em>why</em> this semicolon is necessary here though. It is a long story, and it is caused by the need for C compatibility.</li>
<li>access specifiers are not applied per-member, but rather used to divide the class into sections. In a class, the default specifier is <code>private</code>. I tend to put my public members at the top of the class, to make it easier for readers to find the public interface. Further down, we have a <code>private</code> specifier, for hiding our int member. The valid access specifiers are <code>public</code>, <code>private</code> and <code>protected</code>, which each behave just like in C#. <code>internal</code> does not exist however, since there is no notion of assemblies, and the only way to share types between files is with <code>#include</code>’s as mentioned in part I.</li>
<li>there is no clear, common naming convention in C++. The standard library uses lower-case, and separates words by underscores, as in <code>class_name</code>. Many programmers, however, use a convention similar to .NET’s, naming types <code>ClassName</code>, and variables <code>className</code>. There are no fixed rules, so as long as you are consistent it’s fine with me.</li>
<li>.NET has both classes and structs, and the two have very different meanings. In C++, both classes and structs exist as well, but their meaning is <em>almost</em> the same. The only difference between a class and a struct in C++ is that a struct defaults to <em>public</em> accessibility for members, where a class defaults to private. In other words, if I had defined the above as a struct, I could have omitted the <code>public:</code> line. (For this reason, I often find myself using structs. As I said, I tend to put the public interface at the top of the class, and add a <code>private:</code> section further down. However, it is not a big deal. A common rule of thumb is much the same as is used in C#: Structs are simple containers of data, where classes have behavior. My own style tends to be a compromise between the two. Classes with complex behavior are made classes for this reason, but in simpler borderline cases, I tend to prefer struct, even if it has <em>some</em> behavior. It makes no difference to the compiler, and it saves me a line of code, because it defaults to <code>public</code>, which is what I want at the top of my class anyway.)</li>
<li>the observant reader will probably have noticed a difference here compared to the example shown in part I. Back then we declared the member method without defining its body inside the class. Now we define the body inside the class. Both approaches are legal, and each has its pros and cons. In particular, defining the body outside the class leads to shorter class definitions, which may aid readability. On the other hand, defining functions inside the class leads to better locality: you only have to look in <em>one</em> place to learn all about the class. Further, the compiler is generally better able to optimize code if member methods are defined “inline”. For these reasons, people often put short functions of 2–3 lines or so inside the class, and define larger ones separately. There is one caveat, however. Functions defined <em>inline</em> (either by placing the full definition inside the class, or by marking the definition with the <code>inline</code> keyword) may have a definition in each translation unit. In other words, they may be placed in headers (where they’re seen by multiple translation units). Non-inline functions must only be defined <em>once</em>, and so generally have to be defined in a <code>.cpp</code> file, similar to what we did in part I.</li>
<li>the constructor looks a bit different than you may be used to. The <code>i</code> member is initialized via the <em>initializer list</em>, specified after the colon. This is similar to how you would call a base class constructor in C#, although instead of <code>base</code>, the member name is used. Also note that we have to explicitly initialize <code>i</code> because as a primitive type, it would otherwise not be initialized at all. The initializer list syntax is only legal in constructors, and should be used as much as possible. I’ll explain why in a moment.</li>
</ul>
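<p>To make the point about <code>class</code> versus <code>struct</code> a bit more concrete, here is a small sketch of the same counter written both ways. The names <code>counter_class</code> and <code>counter_struct</code> are made up for this illustration. The only difference is which access level applies before the first access specifier:</p>

<pre><code>#include &lt;iostream&gt;

// Hypothetical names, used only for this illustration.
class counter_class {
  int i; // no access specifier yet, so in a class this member is private
public:
  counter_class() : i(0) {}
  int current() const { return i; }
  void update() { ++i; }
};

struct counter_struct {
  counter_struct() : i(0) {} // no access specifier yet, so in a struct this is public
  int current() const { return i; }
  void update() { ++i; }
private:
  int i;
};

int main() {
  counter_class a;
  counter_struct b;
  a.update();
  b.update();
  b.update();
  std::cout &lt;&lt; a.current() &lt;&lt; " " &lt;&lt; b.current() &lt;&lt; std::endl; // prints "1 2"
}
</code></pre>

<p>Both types behave identically from the caller’s point of view. Which keyword to use is purely a matter of taste.</p>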

<h1>“Special” member functions</h1>

<p>We <em>could</em> have defined the constructor in a more familiar way:</p>

<pre><code>counter() {
  i = 0;
}
</code></pre>

<p>and in this simple case, it would have made no difference. In more complex classes, however, there is an important distinction: anything that happens in the constructor’s body happens <em>after members are initialized</em>. If a member is not specified in the initializer list, it is <em>default initialized</em>. For primitive types, this means that <em>nothing</em> happens, and they just contain random garbage values. For class types with a default constructor, that constructor gets called <em>before</em> the constructor’s body is evaluated, and only in the body do we assign the <em>actual</em> value we want the member to contain.</p>

<p>So yes, for our simple <code>int</code> case, we might as well have written the constructor without using the initializer list. But consider what would have happened if the member had been some complex user-defined class. Instead of simply constructing the object with the right value to begin with, we would have default-constructed it, and <em>then</em> executed an assignment. This would obviously have been less efficient than simply constructing the object correctly in the first place.</p>

<p>But just as importantly, some types <em>can not</em> be assigned to once they are initialized. Likewise, some types may not have a default constructor, in which case failure to use the initializer list to explicitly call another constructor will result in a compiler error! So in general, the initializer list should be preferred both from performance and correctness concerns. A side effect of the initializer list is that the actual body of constructors can often be left empty.</p>
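<p>As a rough sketch of the two cases where the initializer list is the only option, consider the following. The classes <code>limit</code> and <code>bounded_counter</code> are hypothetical and exist only for this example: one member is <code>const</code>, so it can not be assigned to, and the other has no default constructor.</p>

<pre><code>#include &lt;iostream&gt;

// A type without a default constructor: it can only be built from an int.
class limit {
public:
  explicit limit(int max) : max_value(max) {}
  int max() const { return max_value; }
private:
  int max_value;
};

class bounded_counter {
public:
  // 'step' is const, so it could not be assigned to in the constructor body.
  // 'upper' has no default constructor, so it could not be default-initialized.
  // Both therefore have to be set up in the initializer list.
  bounded_counter(int step_size, int max) : i(0), step(step_size), upper(max) {}

  void update() {
    if (i + step &lt;= upper.max()) { i += step; }
  }
  int current() const { return i; }
private:
  int i;
  const int step;
  limit upper;
};

int main() {
  bounded_counter c(2, 5);
  c.update(); // 0 becomes 2
  c.update(); // 2 becomes 4
  c.update(); // 4 + 2 would exceed 5, so the value stays at 4
  std::cout &lt;&lt; c.current() &lt;&lt; std::endl; // prints 4
}
</code></pre>

<p>Moving either <code>step</code> or <code>upper</code> out of the initializer list and into the constructor body should make the class fail to compile.</p>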

<p>In .NET, there is a distinction between <em>value</em> and <em>reference</em> types, and the behavior of the assignment operator is completely different for each of the two cases. <code>x = y</code> for values of a reference type simply stores a <em>reference</em> to <code>y</code> into <code>x</code>. But if the two variables are of a value type, a complete copy is created instead.</p>

<p>In C++, <em>all</em> variables obey value semantics. <code>x = y</code> will always copy the <em>value</em> <code>y</code> into <code>x</code>. This is why I said when discussing the constructor’s initializer list that an extra assignment may be expensive.</p>

<p>Since the plain value semantics as used by C# would be both inflexible and inefficient, C++ provides a number of tools for controlling the behavior of your class. In particular, you can define a <em>copy constructor</em> and an <em>assignment operator</em> to override exactly how assignment should be performed. The following demonstrates what they may look like.</p>

<pre><code>class counter {
public:
  counter(const counter&amp; other) : i(other.i) {} // copy constructor
  counter&amp; operator= (const counter&amp; other) { // assignment operator
    if (this == &amp;other) {return *this; }
    i = other.i;
    return *this; // return a reference to ourselves, so that assignments can be chained
  }
  ....
};
</code></pre>

<p>Perhaps the first thing we should mention is the meaning of the <code>&amp;</code> character. It is used to denote a <em>reference</em>, essentially an alias for a variable. It is related to pointers (see <a href="http://jalf.dk/blog/2009/07/the-great-pointer-conspiracy/">this post</a> for a more detailed explanation of pointers), but is simpler and more limited. In particular, it can not be reseated. Once it is initialized, it is an alias for the variable it points to <em>forever</em>. Also, unlike pointers, there is no special syntax for <em>using</em> a reference:</p>

<pre><code>int i; // create an integer.
int&amp; r = i; // create a reference as an alias of i. Note that we simply assign i, unlike with pointers where we would have had to take the address of i first with the `&amp;` operator.
r = 42; // assign 42 to whatever the reference points to. Again, no special syntax. There is nothing here to tell us that r is a reference.
int j = 13; // create another integer
r = j; // assign it to our reference. The effect of this is *not* to make r point to j (as would have happened had it been a pointer), but simply to assign the value of j to i. In other words, i will now equal 13, and r will still point to i.
</code></pre>

<p>Because a reference can not be reseated, it is also a nice example of a case where the constructor’s initializer list <em>must</em> be used. Imagine a class which has a reference member. References <em>must</em> point to something, so they have no default constructor. And once they are initialized to point to an object, they <em>always</em> point to that object. In other words, it must be constructed <em>before</em> the constructor’s body is executed, which means in the initializer list. Failure to do so simply won’t compile.</p>
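<p>A minimal sketch of such a class might look like this. The class name <code>remote_counter</code> is invented for the example. It increments an <code>int</code> owned by someone else, and the reference member is bound in the initializer list:</p>

<pre><code>#include &lt;iostream&gt;

// Hypothetical class: increments an int that lives outside the object.
class remote_counter {
public:
  explicit remote_counter(int&amp; target) : value(target) {} // the reference is bound here, and only here
  void update() { ++value; } // modifies the referenced int, not a copy of it
private:
  int&amp; value;
};

int main() {
  int total = 10;
  remote_counter c(total);
  c.update();
  c.update();
  std::cout &lt;&lt; total &lt;&lt; std::endl; // prints 12
}
</code></pre>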

<p>Now to explain the copy constructor, which is fairly simple. It is simply a constructor which takes one argument, a <em>const reference</em> to the type itself. Copy constructors are commonly used to initialize class members with copies of the arguments passed to the “outer” class’ constructor. In the copy constructor above, we also copy-construct <code>i</code>, for example. (The value of <code>other.i</code> is copied into our own <code>i</code>)</p>

<p>The assignment operator is a bit trickier.
The first line inside it tests for assignment to itself. (As would happen in <code>x = x</code>). This may not have been a problem in this simple class, but in more complicated ones, self-assignment can cause problems, as you will be reading data from the same object you’re writing to.  We also note that instead of simply comparing <code>this</code> to <code>other</code> in the test, we use <code>&amp;other</code>. We wish to check that <code>this</code> and <code>other</code> refer to the same object instance, not just that they contain the same value. To achieve this, we need to compare pointers. <code>this</code> is already a pointer<sup id="fnref:2"><a href="#fn:2" rel="footnote">2</a></sup>, but <code>other</code> is a reference, so we have to take the address of it first. Because a reference is essentially an alias for the referenced value, the address-of operator returns the address of the referenced value, not of the reference itself.</p>

<p>Next, note that the assignment operator does not have an initializer list, but instead performs the copying in the function body. The reason for this is obvious: It is not a constructor, so all its members are already initialized. An initializer list would not make sense, and is not allowed by the language. This also means that here, <code>i</code>’s assignment operator is invoked, rather than its copy constructor, as was used in the previous example. (Technically, built-in types have neither assignment operator nor copy constructor. However, the same syntax is allowed, it simply uses the obvious built-in operations.)</p>

<p>A final note about assignment operators and copy constructors is that if <code>=</code> is used to <em>declare</em> a variable, the copy constructor, and not the assignment operator, is called. As I said before, these functions are special and known to the compiler, and so, can be invoked in special cases. That is, if you are given variables <code>c</code> and <code>d</code> of type <code>counter</code>, then <code>c = d</code> calls the assignment operator on <code>c</code>, because <code>c</code> is <em>already</em> initialized. But if it had instead been <code>counter c = d</code>, then <code>c</code> would have been <em>initialized</em> as a copy of <code>d</code>, and so its <em>copy constructor</em> would have been used. The compiler ensures this, even if you use assignment syntax in the initialization of a variable.</p>
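<p>One way to convince yourself of this is a small class which records which special member function last gave it its value. The class <code>traced</code> and its <code>origin</code> member are hypothetical, written only for this sketch:</p>

<pre><code>#include &lt;iostream&gt;
#include &lt;string&gt;

// Hypothetical class that remembers how it last received its value.
class traced {
public:
  traced() : origin("default constructor") {}
  traced(const traced&amp;) : origin("copy constructor") {}
  traced&amp; operator=(const traced&amp; other) {
    if (this == &amp;other) { return *this; }
    origin = "assignment operator";
    return *this;
  }
  std::string origin;
};

int main() {
  traced a;
  traced b = a; // looks like assignment, but b is being created: copy constructor
  traced c;
  c = a;        // c already exists: assignment operator
  std::cout &lt;&lt; a.origin &lt;&lt; std::endl; // prints "default constructor"
  std::cout &lt;&lt; b.origin &lt;&lt; std::endl; // prints "copy constructor"
  std::cout &lt;&lt; c.origin &lt;&lt; std::endl; // prints "assignment operator"
}
</code></pre>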

<p>Finally we get to another dreaded C++ construct: the <em>destructor</em>.
This is automatically called when the object is destroyed, and can be defined thusly:</p>

<pre><code>class counter {
public:
  ~counter(){
    std::cout &lt;&lt; i &lt;&lt; std::endl;
  }
  ....
};
</code></pre>

<p>The syntax is similar to finalizers in C#, but the effect is somewhat different. The destructor is invoked <em>instantly</em> when an object is deleted, and it is <em>guaranteed</em> to be called. In our case, we simply use it to print out the counter value.</p>

<p>Let’s try using these new functions and operators:</p>

<pre><code>int main(){
  counter c; // use the default constructor to create a counter
  c.update(); // increment its value
  assert(c.i == 1);
  counter d(c); // use the copy constructor to create a new copy of our existing counter.
  assert(d.i == 1);
  d.update();
  assert(c.i == 1); // our copy constructor made sure to create a *new* counter variable, so c is not affected by changes to d, and vice versa
  assert(d.i == 2); 
  c = d; // since c has already been initialized, the assignment operator is used to copy d into c.
  assert(c.i == 2);
} // at this point, both c and d go out of scope, and so their destructors are called. Destructors are always called in opposite order of construction, so d's destructor will be invoked first.
</code></pre>

<p>All three functions are auto-generated by the compiler, if not declared explicitly. (The one exception is the assignment operator, which it may not be possible to auto-generate. If a class contains a member with no assignment operator, or a reference (which can not be reseated), the compiler will fail to generate an assignment operator, and all attempts to perform assignment will fail if one is not explicitly defined by the user.)</p>

<p>The trio of copy constructor, assignment operator and destructor are sometimes called “the big three”, or we may speak of “the rule of three”. This is a rule of thumb that if you find yourself implementing one of these three special functions, you almost certainly should also implement the other two. The reasoning is pretty simple: The assignment operator and copy constructor are related — both are used to copy an object. If special care has to be taken when copying, then it should probably be defined for both these functions.</p>

<p>Further, if copying requires nontrivial handling, then it is a good bet that the class manages some kind of resource or contains data which requires special care in the destructor as well. Perhaps a pointer pointing to dynamically allocated memory, which must be deleted, or perhaps it should decrement a global counter used to count the number of live instances of the class. Or perhaps it is a file handle which must be closed. The fact that we had to implement special handling when copying is a strong hint that there will probably also be special handling required when cleaning up in the destructor.</p>

<p>And the converse is also true. If the destructor has to do something special, it must be because the class owns some kind of resource that must be released. And if it owns a resource, then we should ensure that the resource gets copied when the class itself does. So we should probably define copy constructor and assignment operator as well.</p>
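<p>As an illustration of the rule of three, here is a sketch of a class which owns a dynamically allocated array. The class <code>int_buffer</code> is hypothetical, and a real program would normally use <code>std::vector</code> instead, but it shows why all three functions end up being needed together:</p>

<pre><code>#include &lt;cstddef&gt;
#include &lt;iostream&gt;

// Hypothetical class owning a dynamically allocated array of ints.
class int_buffer {
public:
  explicit int_buffer(std::size_t n) : size(n), data(new int[n]) {
    for (std::size_t k = 0; k &lt; size; ++k) { data[k] = 0; }
  }
  // Copy constructor: allocate our own array and copy the elements across.
  int_buffer(const int_buffer&amp; other) : size(other.size), data(new int[other.size]) {
    for (std::size_t k = 0; k &lt; size; ++k) { data[k] = other.data[k]; }
  }
  // Assignment operator: build the new array first, then release the old one.
  int_buffer&amp; operator=(const int_buffer&amp; other) {
    if (this == &amp;other) { return *this; }
    int* fresh = new int[other.size];
    for (std::size_t k = 0; k &lt; other.size; ++k) { fresh[k] = other.data[k]; }
    delete[] data;
    data = fresh;
    size = other.size;
    return *this;
  }
  // Destructor: release the array we own.
  ~int_buffer() { delete[] data; }

  int&amp; at(std::size_t k) { return data[k]; }
private:
  std::size_t size;
  int* data;
};

int main() {
  int_buffer a(3);
  a.at(0) = 7;
  int_buffer b(a); // deep copy through the copy constructor
  b.at(0) = 8;     // does not affect a
  int_buffer c(1);
  c = b;           // deep copy through the assignment operator
  std::cout &lt;&lt; a.at(0) &lt;&lt; " " &lt;&lt; b.at(0) &lt;&lt; " " &lt;&lt; c.at(0) &lt;&lt; std::endl; // prints "7 8 8"
}
</code></pre>

<p>Had we only written the destructor and relied on the compiler-generated copy functions, <code>a</code> and <code>b</code> would have shared the same array, and it would have been deleted twice.</p>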

<h1>POD types</h1>

<p>A final note about classes may be worth mentioning. C had no classes, only simple structs containing values, but no member functions, and without allowing inheritance or access specifiers. Since C++ was designed to be (mostly) backwards-compatible, such types have a special status in C++. In the above, I mentioned “primitive types” a few times. While an <code>int</code> is technically a primitive type (all built-in types are considered primitive types), the behavior I described is actually common to all <em>POD</em> (Plain Old Data) types. A POD type is essentially a type that would have been legal in C — in other words, it is either a built-in (primitive) type, or a class or struct where</p>

<ul>
<li>all members are public</li>
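<li>no virtual member methods exist (plain member methods are technically allowed, although they would not have been legal in C)</li>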
<li>no constructor, copy constructor, assignment operator or destructor is defined</li>
<li>no base classes exist</li>
<li>All members are POD types as well</li>
</ul>

<p>Such POD types are given special treatment in many ways. For example, they may be treated as “raw memory”. The standard-library C function <code>memcpy</code>, which simply copies a number of bytes from one location to another, may be used to copy POD types, but not non-POD classes. The reason for this is that non-POD types may have extra behavior that would break if this was done. As an obvious example, if we created a copy in this way, we would bypass the assignment operator/copy constructor, but we would end up with two objects, both of which would have their destructors called when deleted — so we would end up with a mismatch where the destructor is called more often than the constructors, a clear error if the class implements reference-counting, for example.</p>
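<p>For a POD struct, the “raw memory” treatment can be sketched like this. The struct <code>point</code> is just an example type:</p>

<pre><code>#include &lt;cstring&gt;
#include &lt;iostream&gt;

// A POD struct: public data only, no constructors, destructor or base classes.
struct point {
  int x;
  int y;
};

int main() {
  point a;
  a.x = 3;
  a.y = 4;

  point b;
  std::memcpy(&amp;b, &amp;a, sizeof(point)); // copy the raw bytes of a into b

  std::cout &lt;&lt; b.x &lt;&lt; ", " &lt;&lt; b.y &lt;&lt; std::endl; // prints "3, 4"
}
</code></pre>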

<p>Another peculiarity of POD types is that they are not initialized unless a constructor is explicitly called. This is why we had to initialize <code>i</code> in our constructor above. As a POD type, <code>i</code> would otherwise contain whatever garbage value was found in memory. The same is true for POD structs. They too contain garbage if not explicitly initialized by calling a constructor:</p>

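<pre><code>int i; // no initialization occurs (for a local variable), so i holds a garbage value
int j = int(); // explicitly require default initialization -- for POD types, this is done by setting all members to zero. (Note that "int j();" would not work: it declares a function named j.)
</code></pre>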

<p>In other words, had our <code>counter</code> class stored a non-POD member, the initializer list would not have been necessary. Its member would automatically be default-constructed if nothing else was specified. But POD types do not have that extra behavior, so if nothing else is specified, they simply don’t get initialized.</p>
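<p>The same explicit initialization works for POD structs. As a small sketch, again using a made-up <code>point</code> struct:</p>

<pre><code>#include &lt;iostream&gt;

struct point { // a POD struct
  int x;
  int y;
};

int main() {
  int a = int();     // explicitly initialized: a is 0
  point p = point(); // explicitly initialized: p.x and p.y are both 0

  // Writing "int b;" or "point q;" here instead would leave the values
  // indeterminate, so reading them before assigning would be a bug.
  std::cout &lt;&lt; a &lt;&lt; " " &lt;&lt; p.x &lt;&lt; " " &lt;&lt; p.y &lt;&lt; std::endl; // prints "0 0 0"
}
</code></pre>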

<h1>Enough about classes</h1>

<p>There are a few other nitty-gritty details about the language we should discuss. You may have already wondered about one or two of them. So without further ado,</p>

<ul>
<li>variable declaration is usually done <em>without</em> using <code>new</code>. The <code>new</code> operator allocates memory on the heap, and returns a pointer to the newly allocated object. Since there is no garbage collector, we have to manually call <code>delete</code> on this pointer to free the memory. This is the source of C++’s reputation as a playground for memory leaks. Of course the astute reader will have noticed that so far, I haven’t used <code>new</code> and <code>delete</code> even once. The truth is that these can often be avoided or hidden, which removes most opportunities for memory leaks. Any variable declared <em>without</em> using <code>new</code> is declared “locally”: if it is declared in a function, it becomes a local variable, and is destroyed when we leave the scope in which it is declared. If it is a class member, it is destroyed when the owning object is destroyed. If it is declared inside a loop, it is destroyed when we leave the loop. In other words, local variables declared without <code>new</code> have “automatic storage duration”, and in fact, <code>int i = 42</code> could also be written as <code>auto int i = 42</code>. The auto keyword indicates exactly this, that the lifetime of the variable is <em>automatic</em>. Since this is the default, the keyword is never actually used, but it exists, and this is what it means. (This is the C++98/03 meaning of the keyword. C++11 later repurposed <code>auto</code> for type deduction, so the second form is rejected by compilers targeting newer standards.) And just to clear up any doubts, variables with automatic storage duration are destroyed when we leave the scope they were declared in, <em>no matter how</em> we leave it. It doesn’t matter if we return from the function, or if an exception is thrown. In both cases, the local variable’s destructor is called.</li>
<li>Just to avoid confusion, we’d better look at a quick example of using <code>new</code>: Consider this line of code: <code>counter* p = new counter()</code>. Here, we allocate an object of our <code>counter</code> class on the heap, with dynamic storage duration, but we <em>also</em> declare a local variable — the pointer <code>p</code>.  The pointer is a local variable with automatic storage duration. In other words, the pointer itself will be freed just fine when we leave the function — but the dynamically allocated <code>counter</code> to which it points will <em>not</em>. This is how memory leaks occur. Once <code>p</code> gets destroyed, we no longer have a pointer to the dynamically allocated memory, so we can never free it.</li>
<li>Avoiding cyclic dependencies can take a bit of work, since C++ code is read by the compiler from top to bottom. It won’t let a function or class refer to another which hasn’t been declared yet. Sometimes, this can be solved through refactoring, by splitting the code we need to refer to out into a separate class which can be declared first. But another trick is to use <em>forward declarations</em>. You have already seen it used for the class member method in part I. We can declare a function without specifying its body. This tells the compiler that the function <em>exists</em>, which means we can call it safely. So if we put such a declaration at the top of a file, we can provide the actual definition including the body at the end of the file, after whatever classes or functions we need to refer to. For classes, we can do a similar trick, and simply declare <code>class counter;</code>. As with the function case, this tells the compiler that <code>counter</code> is a class, and that it <em>does</em> exist. The definition just isn’t shown yet. This won’t let you access class members yet (since the compiler still doesn’t know which members it has), and you can’t declare variables of that type yet (because the compiler doesn’t know which, if any, constructor to call, and it doesn’t know the size of the class). But you <em>can</em> create references and pointers to the class.</li>
<li>C# uses function overloading to allow for functions where some parameters may have sensible default values. If we have a function taking parameters <code>a</code> and <code>b</code>, we can create an overload which takes only <code>a</code>, and provides a default value for <code>b</code>. The same can be done in C++, but you <em>also</em> have the option of providing default values. The function <code>void foo(int i = 0) {std::cout &lt;&lt; i &lt;&lt; std::endl; }</code> can be called just with <code>foo()</code>, and will print out <code>0</code>. If you are more comfortable with overloading, you may not need to use default parameters, but you may still encounter third-party code which uses them, so you should be familiar with the syntax.</li>
</ul>
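<p>The difference between automatic and dynamic storage duration can be observed with a class which counts how many instances currently exist. This is only a sketch. The global <code>live_objects</code>, the class <code>tracked</code> and the function <code>leave_by_exception</code> are all invented for the example:</p>

<pre><code>#include &lt;iostream&gt;
#include &lt;stdexcept&gt;

int live_objects = 0; // hypothetical global, counts objects that currently exist

class tracked {
public:
  tracked() { ++live_objects; }
  tracked(const tracked&amp;) { ++live_objects; }
  ~tracked() { --live_objects; }
};

void leave_by_exception() {
  tracked t; // automatic storage duration
  throw std::runtime_error("leaving early");
} // t's destructor still runs while the exception propagates to the caller

int main() {
  {
    tracked a; // destroyed at the closing brace of this block
    std::cout &lt;&lt; live_objects &lt;&lt; std::endl; // prints 1
  }
  std::cout &lt;&lt; live_objects &lt;&lt; std::endl; // prints 0

  try {
    leave_by_exception();
  } catch (const std::runtime_error&amp;) {
    // nothing to do, we only care that t was destroyed
  }
  std::cout &lt;&lt; live_objects &lt;&lt; std::endl; // prints 0

  tracked* p = new tracked(); // dynamic storage duration
  std::cout &lt;&lt; live_objects &lt;&lt; std::endl; // prints 1
  delete p; // without this line, the object would never be destroyed
  std::cout &lt;&lt; live_objects &lt;&lt; std::endl; // prints 0
}
</code></pre>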

<h1>The standard library</h1>

<p>We’re nearing the end. The last thing you should know about C++ before I let you run loose is a few standard library classes.
The C++ standard library is very small compared to .NET or Java’s class libraries, but it is also widely considered C++’s main saving grace. Most people consider the language an overcomplicated mess in many ways, but the standard library stands out, both as an example of C++ done <em>right</em>, and as a redeeming feature which transforms C++ into a powerful and elegant language<sup id="fnref:3"><a href="#fn:3" rel="footnote">3</a></sup>. Or more precisely, <em>part of</em> the standard library possesses these qualities.</p>

<p>In the following I’ll briefly sketch out the main parts of the standard library, and explain a few useful classes. For more general information, Microsoft has some excellent documentation for all parts of the standard library <a href="http://msdn.microsoft.com/en-us/library/cscc687y.aspx">here</a>.</p>

<p>The standard library has been assembled piecemeal over the years, and as such, represents several different styles and paradigms. The oldest parts of it are simple functions carried over from C’s standard library. I have already mentioned two of these, <code>printf</code> and <code>memcpy</code>, but of course many others exist.</p>

<p>After these came the first C++-specific additions, in the form of the <code>iostreams</code> library. You have also encountered a few members of this, in <code>cout</code>, <code>cin</code> and <code>endl</code>, as well as the <code>operator&lt;&lt;</code> used for streaming. This library is, honestly, not very nice. It does the job for simple Hello World-like applications, but it is inflexible, inefficient, overcomplicated and hard to extend. In fact, many C++ programmers stick to <code>printf</code> over <code>cout</code> despite all the disadvantages I listed in part I. Of course, <code>iostreams</code> also contains file streams as well as some other basic stream functionality. A related addition is the <code>string</code> class, and the locale facilities.</p>

<p>These all have one thing in common: they are very old-fashioned and are, today, considered far from ideal. The <code>string</code> class got some last-minute surgery when it was added to make it a bit more modern, and a few additions were made to the stream classes as well, but overall, these are relics from the era of “C with classes”.</p>

<p>Finally, the star of the show is the Standard Template Library, or the STL for short. This remarkable library completely changed how the language was used, and is definitely worth exploring. I won’t ramble on about it here, but I will mention that one of its characteristics is that it almost completely abandons traditional Object-Oriented programming (which <code>iostreams</code> used heavily), in favor of the less known and almost C++-specific paradigm <em>Generic Programming</em>.</p>

<p>The STL consists of three distinct “pillars”:</p>

<ul>
<li>Container classes are the equivalents of .NET’s System.Collections.Generic classes. They store sequences of data, and little else.</li>
<li>Iterator classes are superficially similar to .NET’s IEnumerator. They allow traversal over a container, but where .NET only allows traversal from the beginning to the end, C++ iterators also allow reversed iteration (from end to beginning), as well as traversal over subsets of the container (from the 6th to the 12th element, for example). Pairs of iterators are often used to mark sequences for further processing. Individual iterators are often used as “markers” into a sequence.</li>
<li>Algorithm functions work on iterators, or a pair of iterators, and perform almost all sequence processing. Sorting, searching, copying, <code>for_each</code>, accumulating values or any other algorithm involving sequences of data is implemented as an algorithm working on iterators.</li>
</ul>

<p>The clever part about this setup is that algorithms and containers know nothing of each other. An algorithm works on iterators, <em>wherever they come from</em>. It works whether the iterators are pointers into an array, into a linked list, or perhaps even into a stream or a database. As long as the iterator implements the appropriate functionality, it can be used by the algorithms. This allows for a degree of reusability that would have been impossible in .NET. The same <code>find</code> function, for example, works on all of the standard container classes, <em>in addition</em> to working on any iterators you define yourself. As long as they fulfill a few basic requirements, you get <code>find</code>, <code>sort</code> and many other common operations for free.</p>

<p>And again unlike .NET, there is no interface you have to implement to create a new iterator type, or, for that matter, a new container class. The STL relies on a form of <a href="http://en.wikipedia.org/wiki/Duck_typing">Duck Typing</a> (if it looks like a duck, and walks like a duck, and quacks like a duck, it must be a duck). This means that an iterator is not “a class which implements <code>IIterator&lt;T&gt;</code>” or anything like that, but simply “a type <code>T</code> for which the following expressions are defined, given an object <code>x</code> of type <code>T</code>: <code>++x</code>, <code>*x</code>, <code>T()</code> and a few others”. In other words, if a type defines a default constructor and a few operators, <em>then it is an iterator</em>, and it’ll work seamlessly with the rest of the STL. In fact, raw pointers are valid iterators as well.</p>

<p>In .NET, every collection class has to define its own search function, and there is no elegant way to decouple it completely. (We could define the function in a static helper class, but it would still be working on something specific like an <code>IList</code>, rather than just <em>any</em> sequence). In C++, the function <code>std::find</code> works on <em>any</em> pair of iterators.</p>
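
<p>As a rough sketch of what that buys us, the following uses the same <code>std::find</code> on a raw array, a <code>vector</code> and a <code>list</code>. The helper <code>index_of</code> is a name made up for this illustration, not part of the standard library:</p>

<pre><code>#include &lt;algorithm&gt;
#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;iterator&gt;
#include &lt;list&gt;
#include &lt;vector&gt;

// Hypothetical helper: position of value within [first, last),
// or the length of the range if the value is not found.
template &lt;typename Iterator, typename T&gt;
std::ptrdiff_t index_of(Iterator first, Iterator last, const T&amp; value) {
  return std::distance(first, std::find(first, last, value));
}

int main() {
  int raw[] = {4, 8, 15, 16, 23, 42};
  std::vector&lt;int&gt; vec(raw, raw + 6);
  std::list&lt;int&gt; lst(raw, raw + 6);

  // The same algorithm works on pointers, vector iterators and list iterators.
  assert(index_of(raw, raw + 6, 15) == 2);
  assert(index_of(vec.begin(), vec.end(), 15) == 2);
  assert(index_of(lst.begin(), lst.end(), 15) == 2);

  // A sub-range is just another pair of iterators. 42 is not among the
  // 2nd to 4th elements, so the search runs to the end of that range.
  assert(index_of(vec.begin() + 1, vec.begin() + 4, 42) == 3);
}
</code></pre>

<p>Note that <code>index_of</code> never mentions a container type. It only assumes that whatever it is given behaves like an iterator.</p>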

<p>While iterators and algorithms are key to “modern C++”, I will focus on the containers here, as they can be used with little explanation, and are almost indispensable (just like you wouldn’t want to program in C# without the <code>List&lt;T&gt;</code> class).</p>

<p>The equivalent of .NET’s <code>List&lt;T&gt;</code> class is the <code>vector</code>:</p>

<pre><code>#include &lt;cassert&gt;
#include &lt;vector&gt;

int main() {
  std::vector&lt;int&gt; v;
  v.push_back(1);
  v.push_back(2);
  v.push_back(3);
  v.push_back(42);
  v.pop_back();
  // v now contains the values [1, 2, 3]
  v.resize(5); // resize to contain 5 elements
  // v now contains [1, 2, 3, 0, 0]
  assert(v[1] == 2);
  v[3] = 42;
  // v now contains [1, 2, 3, 42, 0]
  int&amp; r = v[0]; // create a reference to the first element
  int* p = &amp;v[0]; // create a pointer to the first element
}
</code></pre>

<p>Pretty straightforward. And again, note that we’ve managed to create an arbitrary number of objects in our application, without even once having to call <code>new</code>. Which also means that there is <em>no</em> possible way in which this application can leak memory (short of bugs in the compiler or standard library).</p>

<p>There are a couple of caveats to be aware of though:</p>

<ul>
<li>There is typically no bounds-checking on the <code>[ ]</code> operator. This doesn’t mean it is legal to do <code>v[999]</code> above, it just means that there is no guarantee of what will happen if you do it. It is <em>undefined behavior</em>.</li>
<li>Pointers and references to individual elements within a vector may be <em>invalidated</em> when we add elements to the vector. Like with C#’s <code>List&lt;T&gt;</code>, it is a dynamic array, and resizes as necessary. Each such resizing operation consists of allocating a new array, copying the contents into that, and then freeing the old array. A pointer to data in the old array is therefore no longer valid. The same applies for iterators. Any iterator pointing into a vector is invalidated if the vector is resized.</li>
</ul>
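
<p>Both caveats have reasonably simple workarounds. The sketch below uses the <code>at()</code> member, which does check bounds and throws <code>std::out_of_range</code>, and <code>reserve()</code>, which allocates room up front so that later insertions don’t move the data. The helper <code>is_out_of_range</code> is a made-up name used only for this illustration:</p>

<pre><code>#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;stdexcept&gt;
#include &lt;vector&gt;

// Hypothetical helper: true if v.at(index) throws std::out_of_range.
bool is_out_of_range(const std::vector&lt;int&gt;&amp; v, std::size_t index) {
  try {
    (void)v.at(index);  // like v[index], but bounds-checked
    return false;
  } catch (const std::out_of_range&amp;) {
    return true;
  }
}

int main() {
  std::vector&lt;int&gt; v;
  v.push_back(1);
  v.push_back(2);
  v.push_back(3);
  assert(!is_out_of_range(v, 2));
  assert(is_out_of_range(v, 999));

  // After reserve(10), no reallocation happens until the size exceeds 10,
  // so a pointer taken now stays valid across the next push_back.
  v.reserve(10);
  int* p = &amp;v[0];
  v.push_back(4);
  assert(p == &amp;v[0]);
  assert(*p == 1);

  // An index, unlike a pointer, survives any number of reallocations.
  std::size_t second = 1;
  for (int i = 0; i &lt; 1000; ++i) {
    v.push_back(i);
  }
  assert(v[second] == 2);
}
</code></pre>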

<p>Because a vector guarantees that its data is stored contiguously, essentially as an array, we can use this class instead of an array when interfacing with old C code (which only has pointers and arrays, but no vectors). In the above, the variable <code>p</code> could be passed to a C function as a pointer to the beginning of an array of <code>int</code>’s. We still have to be careful of course. The function must not be allowed to write past the end of the array.</p>
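
<p>A minimal sketch of that pattern follows. The function <code>fill_squares</code> is a hypothetical stand-in for a C API function that writes into a caller-supplied buffer. It is not a real library call:</p>

<pre><code>#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;vector&gt;

// Hypothetical C-style function: writes `capacity` ints into `buffer`
// and returns the number of elements written.
std::size_t fill_squares(int* buffer, std::size_t capacity) {
  for (std::size_t i = 0; i &lt; capacity; ++i) {
    buffer[i] = static_cast&lt;int&gt;(i * i);
  }
  return capacity;
}

int main() {
  // Size the vector first, so the memory the function writes to exists.
  std::vector&lt;int&gt; v(5);
  std::size_t written = fill_squares(&amp;v[0], v.size());
  assert(written == 5);
  assert(v[3] == 9);
}
</code></pre>

<p>Passing <code>v.size()</code> along with the pointer is what keeps the function from writing past the end. Note also that <code>&amp;v[0]</code> is only meaningful when the vector is not empty.</p>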

<p>Other container classes are the map and the set. The map is the equivalent of .NET’s <code>Dictionary&lt;Key, Value&gt;</code>: <code>std::map&lt;Key, Value&gt;</code>, found in the <code>map</code> header. The set has no equivalent in .NET 2.0, although <code>HashSet&lt;T&gt;</code> in 3.5 is similar. It works much like a <code>map</code> without the <code>Value</code> parameter: <code>std::set&lt;T&gt;</code>, found in the <code>set</code> header. Their use is pretty much as you would expect.</p>
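
<p>For completeness, here is a small sketch of both in use. The names <code>ages</code> and <code>count_distinct</code> are invented for the example:</p>

<pre><code>#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;map&gt;
#include &lt;set&gt;
#include &lt;string&gt;

// Hypothetical helper: number of distinct values in an array of ints.
std::size_t count_distinct(const int* values, std::size_t count) {
  std::set&lt;int&gt; seen(values, values + count);  // duplicates are dropped
  return seen.size();
}

int main() {
  std::map&lt;std::string, int&gt; ages;
  ages["alice"] = 31;  // operator[] inserts the key if it is missing
  ages["bob"] = 27;
  assert(ages.size() == 2);

  // find() looks a key up without inserting it.
  assert(ages.find("carol") == ages.end());
  assert(ages.find("bob")-&gt;second == 27);

  int values[] = {3, 1, 3, 2, 1};
  assert(count_distinct(values, 5) == 3);
}
</code></pre>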

<p>In general, I would discourage you from using arrays. Prefer vectors instead, and if an API expects a pointer to an array, pass it a pointer to the first element of the vector instead, as shown in the previous example. Vectors are safer and simpler to work with.</p>

<h1>Strings</h1>

<p>A final pair of classes worth mentioning are <code>std::string</code> and <code>std::wstring</code>. C++ has no built-in string type, and so to work with strings, you have to include the <code>string</code> header, and use these classes. A <code>string</code> is simply a string of <code>char</code>’s, single-byte characters. A <code>wstring</code> is a string of <code>wchar_t</code>’s, or <em>wide characters</em>. On Windows, these are 16 bits wide, and use the UTF-16 encoding, allowing them to be used for Unicode strings.</p>

<p>These classes behave much as you would expect, so I won’t discuss them further. Instead I’ll skip to a related point of confusion: C has no string type <em>at all</em>. Instead, <code>char</code> pointers (or <code>wchar_t</code> pointers) are used as primitive strings.</p>

<p>A C-string is simply a sequence of characters, terminated by a null character (<code>'\0'</code>). If the null character is left out, all C string functions will just assume that the string continues <em>until</em> a null character happens to be found. This is obviously extremely fragile and a common source of bugs, but it’s an unavoidable fact of life when interfacing with C code.</p>

<p>This also rears its head when working with string literals. <code>"hello world"</code> does <em>not</em> have type <code>std::string</code> in C++. It has type <code>const char[12]</code>, that is, an array of 12 const characters. (Note that the string is only 11 characters long. The compiler automatically generates the terminating null, and sets aside space for this as well).</p>
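
<p>The difference between the size of the array and the length of the string can be checked directly. A small sketch, using only <code>sizeof</code> and the C function <code>strlen</code>:</p>

<pre><code>#include &lt;cassert&gt;
#include &lt;cstring&gt;

int main() {
  // sizeof sees the array type const char[12], terminating null included.
  assert(sizeof("hello world") == 12);
  // strlen counts characters up to, but not including, the null.
  assert(std::strlen("hello world") == 11);

  // The same happens when a char array is initialized from a literal.
  char with_null[] = "abc";
  assert(sizeof(with_null) == 4);
  assert(with_null[3] == '\0');
}
</code></pre>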

<p>Arrays in C and C++ are very primitive and fragile things, and implicitly <em>decay</em> into pointers when needed. Whenever you have an array, you can assign it to a pointer, and the pointer will automatically point to the beginning of the array. Because arrays are so limited (a function can not return an array or take an array as argument either), arrays are often passed around <em>as</em> pointers, and in fact, pointers can be treated much like arrays as well. Given a pointer <code>p</code>, <code>p[2]</code> is legal, and is equivalent to <code>*(p+2)</code>. But because it is just a pointer, the <em>size</em> of the array isn’t known. It is up to the programmer to keep track of that.</p>
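
<p>A short sketch of what this looks like in practice. The function <code>sum</code> is written for this example. Its first parameter is only a pointer, so the element count has to travel alongside it:</p>

<pre><code>#include &lt;cassert&gt;
#include &lt;cstddef&gt;

// The function only receives a pointer, so the caller must supply the length.
int sum(const int* values, std::size_t count) {
  int total = 0;
  for (std::size_t i = 0; i &lt; count; ++i) {
    total += values[i];  // same as *(values + i)
  }
  return total;
}

int main() {
  int numbers[] = {1, 2, 3, 4};
  const int* p = numbers;  // the array decays into a pointer to its first element
  assert(p[2] == *(p + 2));

  // sizeof still knows the size of the array itself, but not through p.
  std::size_t count = sizeof(numbers) / sizeof(numbers[0]);
  assert(count == 4);
  assert(sum(numbers, count) == 10);
}
</code></pre>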

<p>Getting back to strings, the way arrays can decay into pointers means that this is legal: <code>const char* str = "hello world"</code>. The pointer <code>str</code> now points to the <em>statically allocated</em> array of characters “hello world”, and for all practical purposes, <code>str</code> is now a C-string.</p>

<p>To create a wide string literal, the string is prefixed with an ‘L’, as in <code>const wchar_t* wstr = L"hello world"</code>.</p>

<p>Because C-style strings are used in most API’s, you often need to convert between this and the C++ string class. This can be done as in the following:</p>

<pre><code>const char* str = "hello world";
// An implicit conversion exists from char pointer to string, so
// 'std::string str2 = "hello world";' would also have worked.
std::string str2 = str;
const char* str3 = str2.c_str(); // the c_str() member method on the string class returns a C-style string.
</code></pre>

<p>Because string literals are C-style strings, there are a few pitfalls to be aware of when using them:</p>

<pre><code>const char* str = "hello worl";
const char* str2 = str + str; // #1
str += 'd'; // #2
</code></pre>

<p>In line <code>#1</code>, we get a compile error. Because <code>str</code> is just a pointer, adding two of them is not defined, and so the compiler chokes.
A related example is in <code>#2</code>, where we try to add a character to the string. This compiles, perhaps surprisingly, but it won’t do what you expect. Instead, the <code>char</code> gets converted to an <code>int</code>, and <em>added to the value of the pointer</em>. So the result is a pointer to the position <code>'d'</code> characters (100, in ASCII) past the beginning of the string, which is far beyond the end of the array.</p>
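
<p>Adding an integer to a character pointer is not useless, it just isn’t concatenation. A small sketch, assuming an ASCII character set for the last check:</p>

<pre><code>#include &lt;cassert&gt;
#include &lt;cstring&gt;

int main() {
  const char* str = "hello world";

  // Adding 6 moves the pointer past "hello ", staying inside the array.
  const char* tail = str + 6;
  assert(std::strcmp(tail, "world") == 0);

  // A char is just a small integer. In ASCII, 'd' has the value 100,
  // which is why adding 'd' to a pointer moves it 100 characters ahead.
  assert(static_cast&lt;int&gt;('d') == 100);
}
</code></pre>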

<p>For these operations to work, we must have a proper C++ string:</p>

<pre><code>std::string str = "hello ";
str += "worl";
std::string str2 = str + 'd';
</code></pre>

<p>will work as expected, and result in the string “hello world”.</p>

<p>You now know all you need to know about C++ to use it without shooting yourself in the foot <em>too much.</em> You also know enough to read a lot of the code snippets you’re likely to find online. And you’ve got a starting point for searching out more information should you wish to.</p>

<p>In the next installment, we will finally get to interfacing with the Win32 API. You may want to play around a bit with the compiler to make sure you understand pointers and C-style strings in particular, as we’re going to need those quite a bit. As I mentioned in part I, the Windows API is a C API, and an ugly, inconsistent one at that. It’s not a bad idea to make sure you’re somewhat comfortable with the basics of the language before trying to grapple with it.</p>

<div class="footnotes">
<hr />
<ol>

<li id="fn:1">
<p>“Modern C++” is not just a random name. It is a style of C++ programming named after Alexandrescu’s book, <a href="http://www.amazon.com/Modern-Design-Programming-Patterns-Depth/dp/0201704315">Modern C++ Design</a> — there are fundamentally two ways to program in C++. One style is often, and somewhat derisively, called “C with classes” — implying that it is used in much the same way one would program in C, but with the addition of classes, member methods and public/private access specifiers. The other, superior, approach is “Modern C++”. C with classes is often what beginners encounter, and perhaps surprisingly, what Java and C# are based upon — meaning that programmers coming from these languages tend to settle on an obsolete and sub-optimal style. I often make a point of teaching newcomers “proper” modern C++, but this is not the place. The goal of <em>this</em> series of posts is not to teach <em>good</em> C++ practices, but simply to enable .NET programmers to talk to native API’s. <a href="#fnref:1" rev="footnote">↩</a></p>
</li>

<li id="fn:2">
<p>Unfortunately, there is no particularly good reason for this. <code>this</code> <em>should</em> have been a reference. That would have made much more sense. However, when <code>this</code> was added to the language, references did not yet exist, so it had to be a pointer. And later, when references were added, changing <code>this</code> to a reference would have broken backwards compatibility. <a href="#fnref:2" rev="footnote">↩</a></p>
</li>

<li id="fn:3">
<p>Bjarne Stroustrup, the designer of C++, once said that “Within C++, there is a much smaller and cleaner language struggling to get out”. <a href="#fnref:3" rev="footnote">↩</a></p>
</li>

</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2009/09/a-net-developers-guide-to-c-part-ii/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>A .NET Developers Guide to C++</title>
		<link>http://jalf.dk/blog/2009/08/a-net-developers-guide-to-c/</link>
		<comments>http://jalf.dk/blog/2009/08/a-net-developers-guide-to-c/#comments</comments>
		<pubDate>Mon, 17 Aug 2009 17:49:10 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[.net]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[teaching]]></category>
		<category><![CDATA[win32]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=184</guid>
		<description><![CDATA[One of my coworkers is essentially a self-taught programmer, but he is interested in, and wants to learn, absolutely everything. A year or two back, he asked me to give him a crash course in C++, because he felt it was a problem that whenever he needed to do something that required functionality not exposed [...]]]></description>
			<content:encoded><![CDATA[<p>One of my coworkers is essentially a self-taught programmer, but he is interested in, and wants to learn, absolutely everything. A year or two back, he asked me to give him a crash course in C++, because he felt it was a problem that whenever he needed to do something that required functionality not exposed by the .NET framework, he essentially hit a wall.</p>

<p>So we took an afternoon out to run through some basic C++ code, and while we had fun doing it, and I’m pretty sure he found it interesting, it didn’t really achieve the goal of making him comfortable with writing small C++ programs to communicate with native APIs such as the Windows one.</p>

<p><span id="more-184"></span></p>

<p>Afterwards, I realized that the reason for our failure was that we hadn’t really made it clear what we were trying to achieve. He might have been interested in C++ in general, but what he actually <em>needed</em> was something a bit simpler: Being able to call native (primarily Win32) APIs.</p>

<p>Of course, the difference between these two is not obvious. In the .NET world, the two would basically have been the same thing. In learning C# (or another .NET language), you also learn to interface with .NET APIs, and if you need to interface with these APIs, you have to learn a .NET language.</p>

<p>In the case of C++ and native APIs, the situation is a bit different. Learning the language does not guarantee proficiency with using native APIs, and native APIs can be used without knowing C++.</p>

<p>So this series of posts is going to be my second attempt at teaching a .NET developer to at least be able to set up a basic native application, and more importantly, to call a function in the Win32 API.</p>

<p>The following is <em>not</em> a completely general introduction to C++. If you actually intend to learn and use the C++ language, there are many better texts to follow. I might even write my own attempt one day.</p>

<p>In this series of posts, I will</p>

<ul>
<li>assume familiarity with programming in .NET or another managed platform (such as Java). You’ll probably be able to get by as well if you’re coming from another high-level language such as Python or Ruby, as long as you can understand the basic syntax of the C family of languages.</li>
<li>leave out a lot of things a “dedicated” C++ programmer should know. The goal is not to turn the reader into a professional C++ developer, but simply to break down the wall and enable you to make occasional forays into native-land to call an API function or two before heading back to your favorite language.</li>
</ul>

<h1>Before we begin</h1>

<p>Before we get into the actual code, there are a few peculiarities of native languages to be aware of.</p>

<p>Almost all native APIs are actually written in C, not C++. Both languages have some responsibility for this. Part of the reason is that C is the <em>lingua franca</em> of programming languages. When your Python code has to talk to your Java code, they use a C interface. Virtually every language has C wrappers available to allow it to communicate with C code. So by writing your API in C, you ensure that <em>every</em> language can use it without too much trouble. And of course C is a very simple language, so almost any language can cope with a C API. There are no classes, no higher-order functions or exceptions or other peculiarities of more modern programming paradigms. So part of the reason is that C is simply a good intermediate language.</p>

<p>The other part of the reason is found in C++: C++ has no fixed ABI<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup>. C++ functions compiled by one compiler can not be called from code compiled by another. And when C++ compilers can’t cooperate, entirely different languages don’t stand <em>any</em> chance of being able to talk to C++ code. COM objects provide a partial solution to this, but require a lot of plumbing to implement correctly. For widely used API’s, it is often simpler to restrict your interface to C code.</p>

<p>So the code we need to interface with is actually C, not C++. Our own code is going to be a limited subset of C++. If you intend to write actual applications in C++, you really owe it to yourself to learn the language properly, but for our purposes, sticking with a smaller subset is simpler.</p>

<p>So what does it mean in practice that the API is written in C?
Primarily two things:</p>

<ul>
<li>No exceptions — errors have to be reported through error codes.</li>
<li>No classes — C allows structs, containing data, but no member functions, and no access specifiers. All members are public.</li>
</ul>
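
<p>To give a feel for what such an interface looks like, here is a minimal sketch in that style. Everything in it (<code>Rect</code>, <code>rect_area</code> and the <code>RECT_*</code> codes) is invented for illustration and does not belong to any real API. Don’t worry about the pointer syntax yet, we’ll get to that:</p>

<pre><code>#include &lt;cassert&gt;
#include &lt;cstddef&gt;

// A plain struct: data only, all members public.
struct Rect {
  int width;
  int height;
};

// Error codes instead of exceptions.
enum { RECT_OK = 0, RECT_ERR_NULL = 1, RECT_ERR_NEGATIVE = 2 };

// The return value reports success or failure. The actual result is
// written through the out_area pointer supplied by the caller.
int rect_area(const Rect* rect, int* out_area) {
  if (rect == NULL || out_area == NULL) return RECT_ERR_NULL;
  if (rect-&gt;width &lt; 0 || rect-&gt;height &lt; 0) return RECT_ERR_NEGATIVE;
  *out_area = rect-&gt;width * rect-&gt;height;
  return RECT_OK;
}

int main() {
  Rect r = {3, 4};
  int area = 0;
  assert(rect_area(&amp;r, &amp;area) == RECT_OK);
  assert(area == 12);

  r.width = -1;
  assert(rect_area(&amp;r, &amp;area) == RECT_ERR_NEGATIVE);
  assert(rect_area(NULL, &amp;area) == RECT_ERR_NULL);
}
</code></pre>

<p>The caller is expected to check the returned code after every call, since nothing forces it to.</p>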

<p>Now, on to how we’re going to tackle our task:</p>

<p>The first three installments in this series of posts will deal exclusively with native code. This first one will demonstrate a simple <code>Hello world</code> program, and discuss some fundamentals of organizing and compiling C++ code. This isn’t exactly exciting stuff, but it is useful to understand, as it commonly trips up beginners (and even some reasonably experienced programmers).</p>

<p>The second part will teach all the missing pieces of C++ (the ones that we’re going to need, anyway), so that you’re comfortable with reading and writing simple C++ programs.</p>

<p>In the third part we’ll get into the Win32 API, calling a few functions (of varying complexity) and not least, learning to read the arcane specification on MSDN.</p>

<h1>Hello World</h1>

<p>Load up Visual Studio, and create a new project. The project type should be <code>Win32 Console Application</code>. This brings you to the C++ Project Wizard. If it looks like something that belonged in Windows 95, that’s because that is when it was last updated. It is written in JavaScript and HTML, of all things.</p>

<p>This wizard gives you access to a couple of application settings. For now, set <code>Application Type</code> to <code>Console Application</code>, and select <code>Empty Project</code> under <code>Additional Options</code>. In particular, we do <em>not</em> want a precompiled header. It is a hack that can speed up compilation time in large C++ projects, but it is nothing more than a source of confusion in simple, small projects. Neither ATL nor MFC headers should be added under <code>Add common header files for</code>.</p>

<p>Click <code>finish</code>, and we’re given an empty project, just like we asked for. It contains three “folders”, named <code>Header Files</code>, <code>Resource Files</code> and <code>Source Files</code>. I put “folders” in quotes because they aren’t. Visual Studio calls them filters, and they basically just group files by file type, rather than actually enforcing any particular location on the file system. They’re also not particularly important to us, so you can delete them if you like. If you add a <code>.cpp</code> file to the project, it is automatically listed under the <code>Source Files</code> filter, while <code>.h</code> files get listed under <code>Header Files</code>.</p>

<p>Now, let’s see some actual code. To begin with, let’s try a Hello World:</p>

<p>Create a new .cpp file in the project.</p>

<p>Now type the following into it: (We’ll get into what it means in a moment)</p>

<pre><code>#include &lt;iostream&gt;

int main() {
   std::cout &lt;&lt; "Hello world" &lt;&lt; std::endl;
}
</code></pre>

<p>Now compile and run it. No big surprises here, it does exactly what we’d expect a “hello world” program to do.
As for what the code means, let’s start with the <code>main</code> function itself. It’s not a member of any class — in C++, nonmember functions are allowed (and commonly used), and <code>main</code> in particular <em>must</em> be a nonmember function. The observant reader may have noticed another curious thing about it: we declare <code>int</code> as its return type, but don’t actually have a return statement. This is allowed as a special case for <code>main</code>. Other functions still have to return normally, but if control reaches the end of the <code>main</code> function, it implicitly returns 0<sup id="fnref:2"><a href="#fn:2" rel="footnote">2</a></sup>.</p>

<p>Inside the main function, you might wonder about the <code>&lt;&lt;</code>’s. The operators exist in C# as well, and their built-in meaning is the same. Formally, they are used for bit-shifting in both languages, but C++ allows them to be overloaded, and in particular, streams define overloaded versions.</p>

<p>So the <code>&lt;&lt;</code> operator “streams” data into <code>std::cout</code>. <code>std::endl</code> is a stream manipulator which, when it is fed into a stream, produces a line break, and flushes the stream. In this example, we could just have written <code>std::cout &lt;&lt; "hello world\n"</code> to get the newline without flushing the stream, and in some ways, that would actually have been preferable. But I wanted to introduce <code>endl</code>.</p>

<p>A final note is the <code>std::</code> prefix. Where C# uses a simple dot for all scope resolution operators, C++ defines a few different ones:</p>

<ul>
<li>For specifying members of a namespace, <em>or</em> specifying static members of a class, <code>::</code> is used.</li>
<li>For nonstatic class members, <code>.</code> is used. Given an object <code>o</code>, we can access a member <code>m</code> with the syntax <code>o.m</code>, exactly like in C#.</li>
<li>For nonstatic class members <em>accessed through a pointer to the class</em>, <code>-&gt;</code> is used. If we have a pointer <code>p</code> to an object, accessing its member m looks like this instead: <code>p-&gt;m</code>.</li>
</ul>
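
<p>Here are the three operators side by side in a small sketch. The namespace <code>bakery</code> and the struct <code>Oven</code> are made up for the example:</p>

<pre><code>#include &lt;cassert&gt;

namespace bakery {
  struct Oven {
    static int count;  // static member, shared by all Oven objects
    int temperature;   // nonstatic member, one per object
  };
  int Oven::count = 0;
}

int main() {
  // :: for namespace members and for static class members
  bakery::Oven::count = 1;

  // . for members of an object
  bakery::Oven oven;
  oven.temperature = 180;

  // -&gt; for members accessed through a pointer to the object
  bakery::Oven* p = &amp;oven;
  p-&gt;temperature = 200;

  assert(oven.temperature == 200);
  assert(bakery::Oven::count == 1);
}
</code></pre>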

<p>So in our Hello World program, we reference the object <code>cout</code> in the <code>std</code> namespace.
We could simply add a <code>using namespace std;</code> at the top of the program, much like we would in C#, but in C++, it is not customary to do so. You’ll note that the namespace actually has a very short name, unlike .NET’s long names and nested namespaces. Rather than <code>System.Collections.Generic.List</code>, for example, C++ defines <code>std::vector</code>. Almost the entire C++ standard library exists in the <code>std</code> namespace. One of the main reasons for this structure is to make it easy and convenient to access namespace members without having to do <code>using namespace X</code>.</p>

<p><code>cout</code> stands for <em>character output</em>, and is the stream used for standard output, much like the output-related members of .NET’s <code>Console</code> class. There is also a <code>cin</code> stream object responsible for input.</p>

<p><code>cout</code> and <code>cin</code> are actually nothing more than global variables of the type <code>std::ostream</code> and <code>std::istream</code> respectively.
Another output mechanism you’re likely to see is the C function <code>printf</code>, which is syntactically closer to what you’re used to from .NET.</p>

<p>Given an integer <code>i</code> we want to print out along with a message, <code>cout</code> and <code>printf</code> would be used like this:</p>

<pre><code>std::cout &lt;&lt; "You have " &lt;&lt; i &lt;&lt; " pancakes.\n";
printf("You have %d pancakes.\n", i);
</code></pre>

<p>Each of these has its advantages and disadvantages, as you can probably see. The nice thing about <code>cout</code> is that it is type-safe, and allows us to compose our output string without having to worry about the type of <code>i</code> or the number of parameters. We just stream whatever we like into <code>cout</code> one parameter at a time, and it all just works. It also works with user-defined types. They just have to define an appropriate <code>operator &lt;&lt;</code>.</p>
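
<p>As a sketch of what “defining an appropriate <code>operator &lt;&lt;</code>” means, here is one for a made-up <code>Point</code> type. The helper <code>to_text</code> is also invented here. It streams into a <code>std::ostringstream</code> rather than <code>cout</code>, simply so the result can be checked:</p>

<pre><code>#include &lt;cassert&gt;
#include &lt;ostream&gt;
#include &lt;sstream&gt;
#include &lt;string&gt;

// Hypothetical user-defined type.
struct Point {
  int x;
  int y;
};

// Teaches every output stream, including std::cout, how to print a Point.
std::ostream&amp; operator&lt;&lt;(std::ostream&amp; out, const Point&amp; p) {
  return out &lt;&lt; "(" &lt;&lt; p.x &lt;&lt; ", " &lt;&lt; p.y &lt;&lt; ")";
}

// Hypothetical helper: formats a Point into a string via a string stream.
std::string to_text(const Point&amp; p) {
  std::ostringstream buffer;
  buffer &lt;&lt; "Point " &lt;&lt; p;
  return buffer.str();
}

int main() {
  Point p = {3, 4};
  assert(to_text(p) == "Point (3, 4)");
}
</code></pre>

<p>Returning the stream from the operator is what allows several <code>&lt;&lt;</code>’s to be chained in one expression.</p>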

<p>The nice thing about <code>printf</code> on the other hand, is that the actual format of the string is much more readable, and parameters are specified separately at the end. As you know from .NET’s <code>string.Format</code> function, it is very convenient to be able to write the entire format string in one go, and only specify parameters afterwards. It is a bit awkward that <code>cout</code> requires you to break up the string with <code>&lt;&lt;</code>’s all over the place. But there are some serious limitations to <code>printf</code> as well:</p>

<ul>
<li>It can not be extended. It works for the basic built-in types, and nothing else.</li>
<li>It requires the programmer to specify the type of the parameter as part of the format string. (<code>%d</code> specifies that the parameter at this position is expected to be an integer (I assume the <code>d</code> stands for decimal). But there is no type-checking to verify that this is actually the case. I can pass a float to <code>printf</code>, and print it out with <code>%d</code>, and I get garbage.)</li>
<li>The number of parameters to the function is unknown to the compiler. C (and C++) only have very rudimentary support for functions with variable arguments. Once you make use of this feature, you lose all type information <em>and</em> information about the number of parameters passed to the function.</li>
</ul>

<p>I tend to prefer <code>cout</code> for these reasons; it is safer, and it can be extended. But you’re likely to encounter <code>printf</code> in code samples and should at the very least be familiar with it.</p>

<p>Finally, let’s deal with the very first line. There are four things to note about it. In order of appearance, they are:</p>

<ul>
<li>The <code>#</code> at the very start of the line indicates that this is a preprocessor directive. In other words, this is evaluated in a separate pass <em>before</em> the compiler starts working. Modern compilers don’t maintain a strict separation between preprocessing and compilation, but as the language is specified, the preprocessor basically runs over the source code performing a number of simple modifications <em>before</em> the compiler is invoked.</li>
<li><code>include</code> is the actual preprocessor directive. It specifies that we would like to include a file.</li>
<li>The file name is surrounded by angle brackets (<code>&lt;&gt;</code>). When these are used, the preprocessor searches for the file to include in system directories. If we had used double quotes (<code>""</code>), the preprocessor would have searched for the file locally first. So slightly simplified, use <code>&lt;&gt;</code> to include system headers, and <code>""</code> to include files from the same project or solution.</li>
<li>Finally, inside the angle brackets, we have the name of the header file we’d like to include. In general, your own files should use a <code>.h</code> or <code>.hpp</code> suffix. Headers belonging to the C standard library also use <code>.h</code>, but C++ standard library headers have no extension. (So you have <code>iostream</code> instead of <code>iostream.h</code>).</li>
</ul>

<p>Finally, what does it mean for a file to be <code>#include</code>’d? It’s not quite the same thing as the <code>using</code> statements you put at the top of a file in C#. Those <code>using</code> statements are functionally similar to the <code>using namespace</code> statement mentioned earlier — they allow us to reference types defined in other namespaces as if they were members of the current namespace. If we do not have the <code>using</code> statement, we have to specify the full namespace prefix when using the type (<code>System.Collections.Generic.List&lt;T&gt;</code> instead of simply <code>List&lt;T&gt;</code>), but the types are still <em>available</em>. I can reference <code>System.Collections.Generic.List&lt;T&gt;</code> in C# without any <code>using</code> statements. Likewise, I can reference <code>std::cout</code> as I did in the previous example without having a <code>using namespace std</code>.</p>

<p>But without the <code>#include</code>, the compiler would not have been aware of <code>cout</code> at all.</p>

<p>An <code>#include</code> is in a sense very simple. All that actually happens is a copy/paste operation. The preprocessor locates the file <code>iostream</code>, and copies its contents into our file at the location of the <code>#include</code>. The effect of this is to give us access to anything defined in the file. In .NET this is all taken care of by magic. Anything in the current assembly is automatically visible, and anything that isn’t declared <code>internal</code> in other assemblies is visible as soon as we add a reference to it.</p>

<p>In C++, no such mechanism exists. What the compiler sees is <em>just</em> the current file. Other files, even in the same project, are not visible when the current file is being compiled. The compilation model is notoriously quirky, and probably deserves some explanation.</p>

<h1>The preprocessor and the C/C++ compilation model</h1>

<p>C++ code is compiled in a couple of stages. I already mentioned the preprocessor. In the old days, this was a separate program, which was run on the source code first, performing simple text manipulation (search/replace, and conditionally removing chunks of code). The output of this was then fed to the compiler. Finally, the output of the compiler is fed to a linker, which we’ll get to later. Today, the preprocessor is built into the compiler, but it is still a separate pass made over the code before the actual compilation begins.</p>

<p>Let’s wrap up the preprocessor quickly though. It can do a few other things that we’ll probably run into soon enough. In particular, <code>#define</code> has a few uses. It creates a macro — whenever the name of this macro is encountered, it is replaced with the macro definition.</p>

<p>So in the following:</p>

<pre><code>#define waffles pancakes
std::cout &lt;&lt; "I like " &lt;&lt; waffles();
</code></pre>

<p>we create a macro named <code>waffles</code>, and from that point onwards, any occurrence of <code>waffles</code> is swapped for <code>pancakes</code>. Which means that the function that actually gets called in line two is <code>pancakes()</code>, rather than <code>waffles()</code>. This highlights another important aspect of the preprocessor. Because it is run <em>before</em> compilation, it has no notion of actual language syntax. It doesn’t care about the context of the text it is replacing. It doesn’t care that this is a function call, just like it wouldn’t care if the name had been found in a different namespace than the one the macro was defined in. It doesn’t respect scoping rules or anything else. It won’t swap out the middle of words, or the contents of string literals (so <code>ilikewaffles()</code> would go untouched, as would <code>"waffles"</code>), but that’s about it. Anything else gets brutally replaced by the preprocessor.</p>

<p>Another common example of its simplicity is the following:</p>

<pre><code>#define four 2+2
int i = four * four;
</code></pre>

<p>The result of this? It is <code>8</code>. The preprocessor just performs simple text substitution, resulting in this code: <code>int i = 2+2 * 2+2</code>, which of course gets evaluated as <code>int i = 2 + (2*2) + 2</code>.</p>
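<p>The usual way around this is to wrap the macro body in parentheses, so that they travel along with every expansion. A small sketch, with macro and function names invented for the illustration:</p>

```cpp
#include <cassert>

// Hypothetical macro names, used only for this illustration.
#define FOUR_BARE 2+2       // expands to the raw tokens 2+2
#define FOUR_WRAPPED (2+2)  // the parentheses are part of every expansion

// Expands to 2+2 * 2+2, which is parsed as 2 + (2*2) + 2.
int bare_product() { return FOUR_BARE * FOUR_BARE; }

// Expands to (2+2) * (2+2).
int wrapped_product() { return FOUR_WRAPPED * FOUR_WRAPPED; }
```

<p>Here <code>bare_product()</code> yields 8, as described above, while <code>wrapped_product()</code> yields 16.</p>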

<p>We can also use the preprocessor to perform conditional compilation, removing sections of code at compile-time:</p>

<pre><code>#define waffles
#ifdef waffles // #if defined(waffles) would also have been legal
// this will get compiled
#else
// this will get removed by the preprocessor
#endif
</code></pre>

<p>A variation on this is used in almost every header file, but we’ll get to that soon enough.</p>
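<p>To see conditional compilation change what actually gets built, here is a small sketch. The macro <code>USE_WAFFLES</code> and the function <code>breakfast</code> are assumptions made up for this example.</p>

```cpp
#include <cassert>
#include <cstring>

#define USE_WAFFLES // hypothetical switch; remove this line to build the other branch

const char* breakfast() {
#ifdef USE_WAFFLES
    return "waffles";   // kept, because USE_WAFFLES is defined above
#else
    return "pancakes";  // removed by the preprocessor before compilation
#endif
}
```

<p>Only one of the two <code>return</code> statements ever reaches the compiler, so there is no runtime check involved.</p>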

<p>The compiler processes what are technically known as <em>translation units</em>. A translation unit is a single source file (typically <code>.cpp</code> or <code>.cc</code> for C++, or <code>.c</code> for C code), after preprocessing. So in our Hello World program, we have one translation unit, consisting of the contents of the header file <code>iostream</code>, followed by our main function. The result of compilation is not a program, but rather an <em>object file</em> (Visual Studio uses the extension <code>.obj</code> for these, while GCC uses <code>.o</code>). An object file contains all the compiled code for this file, but with certain placeholder “gaps”. This is necessary as code files will typically depend on functions or variables defined in other translation units. We are able to tell the compiler that a function defined in another translation unit <em>exists</em>, but it won’t be able to see the actual definition of the function, so it has to generate a kind of placeholder, saying “call the function with this name, as soon as we find out where that function <em>is</em>”. That is essentially the role of object files: they store the compiled code, along with the necessary information about which symbols <em>this</em> file defines, and which symbols it depends upon and which must therefore be found in other files for the program to be complete.</p>

<p>When all the object files are created, they are passed to the linker, which performs the final steps — reading all the object files, locating all these placeholders, and filling them in. If some code in object file A calls a function <code>f</code> defined in another file B, the linker must read both files A and B, determine the address of the function <code>f</code>, and insert it into the function call inside A.</p>

<p>If the linker finds multiple conflicting definitions of <code>f</code> (perhaps object file C also defined a function with the same signature), it is of course an error. Likewise, if it is unable to locate the full definition of a symbol referenced from a file, we get an error. Because the linker does not have access to the actual source code, but only the object files, linker errors are notoriously hard to understand, but it can be done. The following simple code causes a linker error: (we’re going to run with this example for a while, so feel free to add it to a new project, or overwrite the previous file. This code should be the only contents of the project)</p>

<pre><code>class myclass {
public:
    int f(float fl);
};

int main(){
    myclass c;
    c.f(1.0f);
}
</code></pre>

<p>The code should be straightforward enough. We declare a class with a member function <code>f</code>. In the main function we create an instance of our class, and call the <code>f</code> function. There is just one problem: the function is <em>declared</em>, but it has not been <em>defined</em>. In other words, the compiler knows it exists (so we don’t get a compiler error when we try to call it, as we would if we called a completely unknown function), but because it does not have the function body, it has to assume that the full definition is… elsewhere. So the compiler lets this pass, hoping that the linker can sort things out.</p>

<p>But the linker is given only this one translation unit, so it is unable to find a definition for the function <code>f</code>, and it spits the following error at us:</p>

<blockquote>
  <p>error LNK2019: unresolved external symbol <code>"public: int __thiscall myclass::f(float)" (?f@myclass@@QAEHM@Z)</code> referenced in function <code>_main</code></p>
</blockquote>

<p>Ouch. Again, the linker doesn’t have access to the source code, so this is about the best it can do. It tells us that the problem is an “unresolved external symbol”, or in other words, it was unable to resolve a symbol that one of our translation units expected to be “external” (defined in another translation unit). As for the symbol itself? All it actually sees is the mangled string near the end: <code>?f@myclass@@QAEHM@Z</code>. This is the name for the function generated by the compiler and stored in the object file, and I have no clue what the @’s or the letters following them mean. They <em>somehow</em> encode information about parameters and return type, but that’s about all I can say. Luckily, the linker is able to decode this name, which it also does for us. It tells us that the function has <em>public</em> visibility, and its return type is <code>int</code>. <code>__thiscall</code> is the <em>calling convention</em> used for member functions. (It is essentially a calling convention that allows for a <code>this</code> parameter, hence the name.) The calling convention isn’t usually important here though. Next, we can see that the unresolved symbol is a member of the class <code>myclass</code>, the function is named <code>f</code>, and it takes a <code>float</code> as its parameter. Finally, it tells us that the symbol was referenced from the <code>_main</code> function (again, we can’t always trust the compiler to preserve the precise names, but it’s probably a safe bet to assume that when it says <code>_main</code>, it means <code>main</code>).</p>

<p>So the error is actually pretty straightforward once you filter out the noise. A lot of C++ programmers don’t realize this, and go into a panic whenever they encounter a linker error, which is why I wanted to demonstrate this one. They typically contain a lot of noise (especially in more complicated cases), but they can be deciphered if you eliminate all the <code>@@</code> nonsense and read the remaining text slowly and carefully.</p>

<p>The other reason why I wanted to demonstrate this is that it is key to why header files are used. Based on the above example, we now know that the compiler can be tricked into accepting a call to a function it has no knowledge of, as long as it can see a valid declaration. (A function declaration is essentially just the signature, including return type, followed by a semicolon, much like an interface method in C#.)</p>
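<p>The same idea works for free functions. The following sketch uses made-up names (<code>answer</code> and <code>ask</code>) and keeps the definition in the same file purely so that it links on its own; the definition could just as well live in another translation unit.</p>

```cpp
#include <cassert>

int answer();            // declaration: signature plus a semicolon, no body

// The compiler accepts this call having seen only the declaration above.
int ask() { return answer(); }

// The definition. In this sketch it sits in the same file so the example
// links by itself; moving it to another .cpp file would work the same way.
int answer() { return 42; }
```

<p>If the last line were deleted, the file would still compile, and the linker would report an unresolved external symbol much like the one shown earlier.</p>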

<p>So perhaps we should get creative and see if we can make the linker happy too. First, we create a second <code>.cpp</code> file with the following contents:</p>

<pre><code>class myclass {
public:
    int f(float fl);
};
</code></pre>

<p>There’s still no definition of <code>f</code>, but we’re taking it a step at a time. Now, though, we have <em>two</em> files containing the same definition of <code>myclass</code>. Of course, the compiler only sees one file at a time, so it won’t notice this, but what will the linker say? Won’t it complain about multiple definitions of the same symbol? Try compiling the project and find out.</p>

<p>As it turns out, we get <em>exactly</em> the same error as before. But we don’t get any complaints about the multiple definitions of the same class. This is actually allowed for class definitions. We may repeat the definition of the same class as many times as we like, as long as there is only one in each translation unit (the compiler will choke on it if you try to define a class you’ve already defined), <strong>and</strong> all the definitions are <em>exactly</em> identical. (The linker will typically not enforce the last requirement though. If the definitions are not identical, it typically manifests as weird crashes at runtime.)</p>

<p>This is called the One Definition Rule (ODR). Only one definition may exist. That definition may occur in multiple places, but it must be identical, it must be the <em>same</em> definition, every time it is encountered.</p>

<p>So it seems like we have a problem, doesn’t it? We’re allowed to duplicate the class definition, but we’re not allowed to modify it! So how are we supposed to add the definition of <code>f</code>?</p>

<p>Try changing your second file (the one without the <code>main</code> function) to the following:</p>

<pre><code>class myclass {
public:
    int f(float fl);
};

int myclass::f(float fl){
    return 42;
}
</code></pre>

<p>and compile it. Voila! It works. We didn’t modify the actual class definition, so we obeyed the ODR. Instead, we added the function definition <em>afterwards</em>, outside the actual class definition. And both the compiler and linker are happy. The program now contains two identical definitions of the class <code>myclass</code>, but that’s allowed under the ODR. The linker sees a call to the function <code>myclass::f</code>, <em>and</em> a single definition of the same function, so it is able to glue everything together into one single program.</p>

<p>Of course, having to copy/paste and maintain duplicate code in every <code>.cpp</code> file is hardly ideal. Sooner or later, we’re going to modify <code>myclass</code> in one file, and forget to make the same modifications in all the other files. That will break the ODR, and everything will crash horribly.</p>

<p>That is where header files come in. We could put the shared code in a separate file, and use the <code>#include</code> directive mentioned earlier to <em>automatically</em> copy/paste the contents in! Let’s try that now. Create a new file (with the <code>.h</code> or <code>.hpp</code> extension), and place the class definition in that. Now remove the class definition from the two <code>.cpp</code> files we already had, and replace it with a <code>#include</code> referencing the header.</p>

<p>That is, your project should contain the following three files (I’m going to name the <code>.cpp</code> files <code>main.cpp</code> and <code>myclass.cpp</code> for convenience):</p>

<pre><code>// myclass.h
class myclass {
public:
  int f(float fl);
};

// myclass.cpp 
#include "myclass.h" // note we use quotes, not angle brackets here
int myclass::f(float fl){
  return 42;
}

// main.cpp
#include "myclass.h"
int main(){
  myclass c;
  c.f(1.0f);
}
</code></pre>

<p>And it seems to work. Clever.
There is one little problem though. What happens if we include our header multiple times? We probably won’t intentionally do this, but perhaps we’re going to include it, and then include another header, which also includes it. We can easily get into a situation where some headers get included many times. Think of standard headers like <code>iostream</code>. We’re going to end up including it fairly often. Sooner or later, we’ll end up including some of our headers twice, which breaks the ODR! We’re not allowed to have multiple definitions <em>in the same translation unit</em>. To test the problem, feel free to duplicate the <code>#include</code> statement and verify that the compiler chokes on it.</p>

<p>So to solve this problem, include guards are used. Modify your header as follows:</p>

<pre><code>#ifndef MYCLASS_H
#define MYCLASS_H

class myclass {
public:
  int f(float fl);
};

#endif
</code></pre>

<p>There should be nothing new in this, but the consequence might be surprising. First, we ask the preprocessor to check if the macro <code>MYCLASS_H</code> is defined, and only evaluate the following if it is <strong>not</strong> defined (the directive is named <code>ifndef</code>, or <em>if <strong>not</strong> defined</em>).</p>

<p>If we enter the if statement, the first thing we do is define the symbol <code>MYCLASS_H</code>, and then we evaluate the original contents of the header. Finally, we end the if-statement with an <code>#endif</code>. So what happens if the file gets included twice now?</p>

<p>For simplicity, assume the following <code>.cpp</code> file, containing nothing except two includes:</p>

<pre><code>#include "myclass.h"
#include "myclass.h"
</code></pre>

<p>As the preprocessor parses this, it’ll expand both <code>#include</code>’s, resulting in this:</p>

<pre><code>#ifndef MYCLASS_H // At this point, the macro MYCLASS_H is not defined, so we enter the following block:
#define MYCLASS_H // define the macro MYCLASS_H

class myclass { // allow this code to stay in the translation unit
public:
  int f(float fl);
};

#endif // end the if statement
#ifndef MYCLASS_H // now MYCLASS_H *is* defined, the condition is not true, and so we *skip* the if statement.
//#define MYCLASS_H // of course the preprocessor doesn't actually comment out the code, it simply removes it from the translation unit. I'm commenting it to illustrate what happens
//
//class myclass { // this time, the preprocessor *removes* all this code, because it is inside a #if statement we're skipping
//public:
//  int f(float fl);
//};
//
#endif
</code></pre>

<p>so after the preprocessor has run, only this code actually gets inserted in our translation unit:</p>

<pre><code>class myclass {
public:
  int f(float fl);
};
</code></pre>

<p>So it seems we’re able to handle multiple inclusions of the same header now.</p>

<p>So to review, we’re now able to split our code across multiple source files, and do it <em>correctly</em>. We don’t need to duplicate any code — all the shared code can be placed in header files, and include guards protect against accidentally including the same file twice in the same compilation unit.</p>

<p>And now you should finally understand what it meant when we included <code>iostream</code> in the original Hello World example. We’re simply pasting in a lot of system code, containing declarations that get linked together with the standard library containing the full definitions.</p>

<p>This turned out a lot longer than I’d originally intended (I had originally, and naïvely, planned to write the entire series in one post), so let’s call it a day here. Part two will be posted very soon, and cover some actual C++, now that we’ve got the fundamentals out of the way. You needed to understand how C++ code is compiled before you’re able to write anything useful in the language.</p>

<div class="footnotes">
<hr />
<ol>

<li id="fn:1">
<p>Application Binary Interface: a common ABI is required for two functions to be able to call each other. The ABI defines the memory layout of structs or classes, as well as calling conventions and basically everything you need to be able to call a function. Where should the return value go, where should parameters be placed, and so on. The C language itself doesn’t specify an ABI, but on any given platform the C ABI is fixed and well documented, which makes C easy to interface with. <a href="#fnref:1" rev="footnote">↩</a></p>
</li>

<li id="fn:2">
<p>The story goes that Bjarne Stroustrup, the language designer, didn’t want to create a language where a simple <code>hello world</code> required multiple lines of code in the main function. Hence the special rule that <code>main</code> doesn’t have to have an explicit <code>return</code> statement. <a href="#fnref:2" rev="footnote">↩</a></p>
</li>

</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2009/08/a-net-developers-guide-to-c/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>The Great Pointer Conspiracy</title>
		<link>http://jalf.dk/blog/2009/07/the-great-pointer-conspiracy/</link>
		<comments>http://jalf.dk/blog/2009/07/the-great-pointer-conspiracy/#comments</comments>
		<pubDate>Thu, 30 Jul 2009 18:25:55 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[pointers]]></category>
		<category><![CDATA[teaching]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=168</guid>
		<description><![CDATA[One of the great tragedies of C and C++ is that they are taught wrong — that a number of perfectly straightforward features are taught and described as if they were mythical and supernatural entities that no mortal can truly understand. Memory management in C++ is one such feature (it is actually very simple, once [...]]]></description>
			<content:encoded><![CDATA[<p>One of the great tragedies of C and C++ is that they are taught wrong — that a number of perfectly straightforward features are taught and described as if they were mythical and supernatural entities that no mortal can truly understand. Memory management in C++ is one such feature (it is actually very simple, once you know the trick), but the biggest of all is probably pointers.</p>

<p><em>Everyone</em> who learns C++ fears pointers. <em>Everyone</em> who is new to the language, or who has merely <em>heard</em> of the language, considers pointers to be some kind of magic: arcane constructs that give the programmer access to <em>Real Ultimate Power</em>, a feature that both marks C/C++ as <em>superior</em> and <em>more powerful</em> than other languages, and is feared as <em>dangerous</em> or <em>unsafe</em>.</p>

<p>None of this is true.</p>

<p><em>Pointers are simple.</em></p>

<p><em>Pointers are not magical.</em></p>

<p><em>Pointers are safe (as long as you use them only as allowed by the language)</em></p>

<p><span id="more-168"></span>
 It is very well-defined what you may, and may not, do with a pointer. The only problem is that the compiler is unable to enforce most of this, so it relies on your own discipline, and knowledge of the rules. But the rules exist. And if you stay within the rules, if your C++ program is legal, then pointers are perfectly safe.</p>

<p>This post is my little attempt to debunk The Great Pointer Conspiracy. It seems there is some hidden rule that whenever we teach others C or C++, we must describe pointers</p>

<ul>
<li>as more complicated than they are, and, </li>
<li><strong>as something they are not</strong>. It sometimes makes sense to lie to your pupil in order to teach them the truth a bit at a time (similar to how most of what you learned in elementary school turns out to be wrong when you get to university. They didn’t mislead you, they just taught you simplified versions of the truth to get you on the right track). But in the case of pointers, the model taught is not merely wrong, it is also more complex and harder to understand!</li>
</ul>

<h1>So what is a pointer then?</h1>

<p>Let’s start with a crash course in syntax, just to get that out of the way.</p>

<ul>
<li>A pointer to type T is denoted <code>T*</code> (pronounced <em>pointer to T</em>)</li>
<li>A pointer is created with the <code>&amp;</code> operator. Assuming an <code>int i</code>, we can create a pointer to it: <code>int* p = &amp;i;</code> (<code>&amp;i</code> is typically pronounced as <em>take the address of i</em>)</li>
<li>A pointer can be <em>dereferenced</em> with the <code>*</code> operator, yielding the value it points to: <code>int j = *p;</code></li>
</ul>

<p>That’s easy, right? The only point of confusion is the dual role of <code>*</code>, as both part of the type, and as the dereferencing operator. There’s a bit of symmetry here, because  <code>&amp;</code> can be used in both places as well. As above, it can be used to take the address of an object, but it can also be used as part of the type, to create a <em>reference</em>: <code>int&amp; k = i</code> creates a reference to the previously defined integer i. But references aren’t the subject of this post. I only mention it because of the related syntax.</p>

<p>So, on to what pointers are, and what they can do:</p>

<p><strong>Pointers are references</strong></p>

<p>A pointer is little more than a reference (in the conceptual sense, not the specific C++ references mentioned in the previous section) to a variable. If we have multiple references to the same variable, they will all see changes made by each other. Here’s an example:</p>

<pre><code>#include &lt;cassert&gt; // for assert

void Foo(int* ptr){ // Because we're passed a pointer, we have a reference to the original variable, and can modify it so the changes are visible outside the function
  *ptr = 2; // set whatever ptr points to, to 2
}

int main(){
  // create a local variable i. This isn't a pointer, but it can be referenced by one.
  int i;
  int* p = &amp;i; // create a pointer to i by taking the address (see below) of i, and store that as a pointer p
  i = 1;
  assert(*p == 1); // the value referenced by p is now equal to 1
  Foo(p);
  assert(i == 2 &amp;&amp; *p == 2);
}
</code></pre>

<p><em>important note</em>: Yes, I used the word “address” in the comment above. It is important to realize what I mean by this. I do <em>not</em> mean “the memory address at which the data is physically stored”, but simply an abstract “whatever we need in order to locate the value”. The address of <code>i</code> might be anything, but once we have it, we can always find and modify <code>i</code>. If you want a real-world analogy, what is an address in the real world? My email address has nothing to do with my house address. My phone number could be considered a third address. Even my social security number, or my full name could be considered addresses in this sense. All of these allow you to locate or contact me, which is all we require.</p>

<p>So far, so good. Pointers are simply references to other variables, with slightly quirky syntax in that we have to use <code>*p</code> to get the value that the pointer  <code>p</code> points to, and we have to use <code>&amp;i</code> to create a pointer to <code>i</code>.</p>

<p>Of course pointers can do a bit more than this though. They’re not as complex as people often try to convince beginners, but they’re not <em>that</em> simple either.</p>

<p><strong>Pointers can be reseated</strong></p>

<p>Once a pointer exists, we can change what it points to. For example:</p>

<pre><code>#include &lt;cassert&gt; // for assert

int main() {
  int i = 1;
  int j = 2;
  int* p = &amp;i; // make the pointer p point to i
  assert(*p == 1);
  p = &amp;j; // and now make it point to j
  assert(*p == 2);
  *p = 3; // modify the variable p points to
  assert(j == 3);  // j is now 3
  assert(i == 1);  // but i is untouched, because p no longer points to it.
}
</code></pre>

<p>See, that’s not rocket science either, is it? Whatever the pointer points to, we can look at and modify. And when it no longer points to that, they have no connection any more.</p>

<p><strong>Pointers can be null</strong></p>

<p>Next up, pointers don’t have to point to something. They can be <em>null pointers</em>. And just like with addresses in the previous example, it is important to be clear on what we mean by this. A null pointer is exactly what I said: <em>a pointer which does not point to any object</em>.</p>

<p>In particular, it is <em>not</em> a pointer to the address zero. Of course, here is where it becomes tricky, because the following <em>does</em> create a null pointer:</p>

<pre><code>int* ptr = 0;
</code></pre>

<p>The trick here is that the C++ language standard makes a special rule for this case. Assigning the constant zero to a pointer creates a null pointer, and <em>not</em> a pointer to address zero. The “constant” part is important too. Here is the precise wording in the standard (Section 4.10 [conv.ptr], paragraph 1):</p>

<blockquote>
  <p>A <em>null pointer constant</em> is an integral constant expression (5.19) rvalue of integer type that evaluates to zero. A null pointer constant can be converted to a pointer type; the result is the <em>null pointer value</em> of that type…</p>
</blockquote>

<p>A “constant expression” is essentially an integral value which can be evaluated at compile-time. So <code>42</code> and <code>2+2</code> are constant expressions, and so is <code>i</code> after the declaration <code>const int i = 99</code>.</p>

<pre><code>// This follows the wording quoted above. Since C++11, only a literal 0 (or nullptr)
// counts as a null pointer constant, so newer compilers reject p1, p2 and p5 too.
int* p0 = 0; // null pointer
const int zero1 = 0; // constant expression
int* p1 = zero1; // null pointer
const int zero2 = 2 - 2; // constant expression
int* p2 = zero2; // null pointer
int zero3 = 0; // not a constant expression
int* p3 = zero3; // not a null pointer constant, so this line is a compile error
int a = 2;
int b = 2;
int zero4 = a - b; // not a constant expression
int* p4 = zero4; // not a null pointer constant either: compile error
const int c = 2;
const int d = 2;
const int zero5 = c - d; // constant expression
int* p5 = zero5; // null pointer
</code></pre>

<p>In practice, a compiler will refuse the non-constant cases above rather than quietly produce some other pointer, but the rule behind it is what matters. According to “the rules”, a null pointer is neither a pointer pointing to address zero, nor a pointer to which the value zero has been assigned. It is <em>a pointer to which the constant expression zero has been assigned</em>.</p>

<p>As for what you’re allowed to do with a null pointer? Basically nothing. You may compare it to other pointers, and… that’s basically it.</p>
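<p>In practice, that one permitted comparison is what makes null pointers useful: code checks for null before it dereferences. A minimal sketch, with the helper <code>value_or</code> and the two demo functions invented for this illustration:</p>

```cpp
#include <cassert>

// Hypothetical helper: returns the pointed-to value, or fallback if p is null.
int value_or(const int* p, int fallback) {
    if (p == 0) {          // comparing against the null pointer is allowed
        return fallback;   // dereferencing it would not be
    }
    return *p;
}

// Calls the helper with a pointer to a real object.
int lookup_with_object() {
    int i = 7;
    return value_or(&i, -1);
}

// Calls the helper with a null pointer.
int lookup_with_null() {
    const int* n = 0;
    return value_or(n, -1);
}
```

<p>The first demo function returns 7, the second returns the fallback value -1.</p>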

<p>With me so far? You might have noticed that what I have described so far is almost exactly what references in C# or Java (or many other languages) are. A variable of a reference type behaves pretty much exactly like this. We can set it to point to another <em>valid</em> object (but we are <em>not</em> allowed to ever set it to an <em>invalid</em> object), or we can set it to <code>null</code>.</p>

<p>Pointers are much like reference types in most other languages. This is an important point. Like I said to begin with, pointers are <em>not</em> difficult. They are a very simple concept, as the above shows. Where the confusion arises is in the <em>one</em> extra thing they can do, which I will describe next. Note that while this <em>does</em> make them somewhat more flexible than C# references, it is still a far cry from the “raw memory address” concept that people often think pointers are.</p>

<p><strong>Pointers can traverse arrays</strong></p>

<p>Now comes the (slightly) tricky part, the one that usually gets people confused, or gives them the wrong idea. If we have a pointer to an element within an array, we are allowed to move the pointer around within the array:</p>

<pre><code>char arr[] = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j' };
</code></pre>

<pre><code>char* ptr = arr; // arrays are not pointers, but can *decay* into a pointer to the first element. So now we have a pointer to arr[0]
assert(*ptr == 'a');
++ptr; // move ptr to the next element
assert(*ptr == 'b');
ptr += 5;
assert(*ptr == 'g');
assert(*(ptr + 2) == 'i');
assert(*(--ptr) == 'f');
ptr -= 3;
assert(*ptr == 'c');
</code></pre>

<p>So far, so good. At this point it is probably a good idea to mention that when you increment a pointer, it always moves <em>to the next element</em>, and <em>not to the next byte</em>. Once again, being careful with the idea of “addresses” pays off. The pointer stores the address of an object. By adding one to that address, we get the address of the <em>next</em> object, no matter how big the object is. Think of your house address. It doesn’t matter how big your house is, the <em>next</em> address is always the neighboring house. It is not your garage door or your kitchen window.</p>

<p>If, for the sake of argument, pointers had merely been memory addresses, then adding one to a pointer would have produced an address that was one byte higher, which means the pointer would no longer have pointed to a valid object. Good thing we don’t live in <em>that</em> messy kind of world, eh? In C++ land, a pointer points to an object, and incrementing it gives us a pointer to the <em>next</em> object.</p>
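<p>The <code>char</code> array above can’t really show the difference, since a <code>char</code> is one byte anyway. Here is a small sketch using <code>int</code> instead; the function names are made up for the illustration, and the second function peeks at the byte distance only to show how far a single step moved.</p>

```cpp
#include <cassert>
#include <cstddef>

// Value found after stepping an int pointer forward by one.
int value_after_increment() {
    int arr[2] = {10, 20};
    int* p = arr;   // points to arr[0]
    ++p;            // now points to arr[1], the next element
    return *p;
}

// How many bytes that single step covered, measured by viewing both
// positions as char pointers.
std::ptrdiff_t bytes_per_step() {
    int arr[2] = {10, 20};
    int* p = arr;
    int* next = p + 1;
    return reinterpret_cast<char*>(next) - reinterpret_cast<char*>(p);
}
```

<p>The first function returns 20, and the second returns <code>sizeof(int)</code>, whatever that happens to be on your platform.</p>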

<p>Now comes the next little surprise:
You are allowed to move the pointer one step <em>past</em> the end of the array. Assuming the same array as above:</p>

<pre><code>char* p = arr + 9; 
assert(*p == 'j'); // no surprises here, just verifying that we're at the end of the array.
char* q = arr + 10; // this is legal
++p; // so is this
</code></pre>

<p>But once again, we have to be careful. The language has only given us permission  to go <em>one</em> step past the end. A pointer to <code>arr + 11</code> is downright illegal, <em>even if we don’t dereference it. The mere existence of the pointer is illegal</em>. The compiler probably won’t complain, and your code may even <em>appear</em> to work, but it is no longer a legal C++ program.</p>

<p>We have also not been given permission to dereference the one-past-the-end-pointer. <code>*(arr + 10)</code> is not legal. Again, it may seem to work, on your computer, with your compiler, on this particular day. But it may not work tomorrow. Or on my compiler. Or when I run your program.</p>

<p>So the language allows us to create, and move pointers around freely, from the start of the array, and up to one past the end of the array. And it allows us to dereference pointers that point to any element in the array, but not one past the end.</p>
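<p>One place where the one-past-the-end pointer earns its keep is as the stopping point of a loop. The following is a sketch under my own assumptions; the helper <code>sum</code> and the demo function are not from any library.</p>

```cpp
#include <cassert>

// Hypothetical helper: adds up the elements in the range [first, last).
// last is only ever compared against, never dereferenced.
int sum(const int* first, const int* last) {
    int total = 0;
    for (const int* p = first; p != last; ++p) {
        total += *p;
    }
    return total;
}

int sum_of_four() {
    int arr[4] = {1, 2, 3, 4};
    return sum(arr, arr + 4); // arr + 4 is the one-past-the-end pointer
}
```

<p>The loop walks every element and stops as soon as <code>p</code> equals the one-past-the-end pointer, so nothing outside the array is ever read.</p>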

<p>And that’s basically it. This is the dreaded pointer arithmetic that usually has beginners running scared. Not all that scary, is it?</p>

<p>Of course, for the sake of completeness, there is one other arithmetic operation that is legal under much the same circumstances:</p>

<p>Two pointers <em>pointing to the same array</em> may be subtracted, yielding the distance between them, expressed as a number of elements.
And for the purposes of pointer arithmetic, single elements are considered arrays of size one, meaning that all the above is true for single variables too: they’re just treated as arrays with only a single element.</p>
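<p>Both statements can be sketched in a few lines. The function names are invented for the example; the array is the same one used earlier.</p>

```cpp
#include <cassert>
#include <cstddef>

// Distance between two pointers into the same array, in elements.
std::ptrdiff_t distance_in_array() {
    char arr[] = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'};
    char* c = arr + 2;  // points to 'c'
    char* h = arr + 7;  // points to 'h'
    return h - c;
}

// A single variable behaves like an array with exactly one element.
std::ptrdiff_t distance_for_single_variable() {
    int i = 0;
    int* p = &i;
    int* end = p + 1;   // one past the end of that one-element "array"
    return end - p;
}
```

<p>The first function returns 5 and the second returns 1.</p>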

<p><strong>And one final detail</strong></p>

<p>Now let’s get self-referential. There is nothing new in this — it follows as a logical conclusion of the above, but it often comes as a surprise, so let’s mention it:</p>

<p>Pointers may point to pointers. Again, there is no magic, no special cases. A pointer is simply a reference to an object, remember? And a pointer is an object too, so obviously we can point to <em>that</em> as well!</p>

<p>We don’t often need to do that, but there is one case where it is used. Typically when you call a library function, and want it to give you a pointer to some resource it has created, you do this:</p>

<pre><code>Resource* ptr = 0; // this is going to be our pointer to the resource. For now, make it a null pointer to avoid confusion
bool success = CreateResource(&amp;ptr); // pass the address of our pointer to the function
</code></pre>

<p>Note that the function wishes to return a status code to let us know if the operation succeeded, so it can’t simply return the pointer we want. So it has to resort to pointer-pointer trickery instead.</p>

<p>The insides of <code>CreateResource</code> might look something like this:</p>

<pre><code>bool CreateResource(Resource** res){
  Resource* actualResource = new Resource(); // create the resource, and temporarily store a pointer to it
  // now we need to pass this pointer to the caller. If res had been a regular "single" pointer, it would simply have been a null pointer. 
  // And sure, we could have made it point to our resource instead, but the caller wouldn't know, because we only received a *copy* of the original null pointer. So even if we change what it points to, we can't change what the *original* points to.
  // Instead, we use a pointer to a pointer. We know that 'res' now points to the caller's Resource pointer. So if we manipulate the value pointed to by 'res', we're actually manipulating the caller's pointer.
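  *res = actualResource; // so take our newly allocated resource pointer, and store that into the caller's pointer, which we get by dereferencing res.
  return true; // report success; a bool function must not fall off the end without returning a value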
}
</code></pre>

<p>It may help to remember that function arguments in C++ are copied unless the parameter is explicitly declared as a reference. If you pass an <code>int</code> to a function, it receives a <em>copy</em> of that <code>int</code>. And if you pass a pointer, then the function receives a <em>copy</em> of that pointer. A copy which points to the same address, so anything we do to the pointed-at address will be visible outside the function as well. But if we change the pointer itself, no one else will see it, because the function has been given its own copy.</p>

<p>So if we pass a function a pointer <code>p0</code>, which itself points to another pointer <code>p1</code>, then <code>p0</code> is again copied. The function receives a copy of <code>p0</code>, let’s call it <code>p2</code>, which also points to <code>p1</code>. So if we change what <code>p2</code> points to, the calling function won’t see it, but if we change what <code>p1</code> points to, it will be visible to the caller, because <code>p0</code> still points to <code>p1</code>.</p>
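<p>The same thing in code, as a rough sketch. The function name <code>retarget</code> and its <code>replacement</code> parameter are invented for this example; the parameter <code>p2</code> plays the role of the callee's copy described above:</p>

```cpp
#include <cassert>

// Hypothetical function: 'p2' is the callee's own copy of the caller's pointer-to-pointer.
void retarget(int** p2, int* replacement) {
    *p2 = replacement; // changes the pointer that p2 points to, so the caller sees this
    p2 = 0;            // changes only the local copy; the caller's pointer-to-pointer is unaffected
}
```

<p>Given <code>int a, b; int* p1 = &amp;a; int** p0 = &amp;p1;</code>, calling <code>retarget(p0, &amp;b)</code> leaves <code>p0</code> pointing to <code>p1</code>, but <code>p1</code> now points to <code>b</code>.</p>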

<p>Yes, this added level of indirection may take some getting used to, but the important part is that there’s nothing fundamentally special about it. It is simply the logical conclusion of the rules I described previously, so even if you don’t get it now, you will when you’ve got a bit more experience with pointers. It’s similar to how, when you first learned to read “See Spot Run”, you had all the rules necessary to read longer words, like “stewardesses” or “programmatically”. After that, you pretty much just needed practice.</p>

<p>So that’s it. That’s all pointers are. If you hadn’t previously encountered pointers, you can stop reading here. But if you were already taught about pointers, we probably have to undo some of the damage.</p>

<p>So the following will discuss what pointers are <em>not</em> — that is, the misconceptions that typically exist about pointers, and which beginners are almost invariably taught. I’ll try to explain <em>why</em> these limitations exist as well, partly so you can take the rule seriously as “something with real-world relevance”.</p>

<h1>The Pointer Abuse Rehab and Correction Center</h1>

<p>In the following, assume that <code>i, j</code> are integer variables (<code>int</code>), and <code>p, q</code> are pointers to integers (<code>int*</code>) and <code>n</code> is a null pointer:</p>

<ul>
<li><p>A pointer is not just a number. For example, <code>i + j</code> is legal, but <code>p + q</code> is not. Try it. Your compiler will give you an error. Likewise, <code>i * j</code> is valid, but <code>i * p</code> is not. Integers may be added to or subtracted from pointers, and pointers may be subtracted from pointers (as long as they both point into the same array). And on some computers, a pointer isn’t implemented as an integer either. Some machines have a segmented memory space, so an address is a tuple consisting of a segment identifier plus an offset. Sure, you <em>can</em> combine those two in a single number, in the same way that you can combine the country code with my phone number to create a single integer. But the address is still, fundamentally, a tuple of two numbers on that machine.</p></li>
<li><p>A pointer is not a memory address! I mentioned this above, but let’s say it again. Pointers are typically <em>implemented</em> by the compiler simply as memory addresses, yes, but they don’t have to be. A pointer may not point to just any address. (Again, some computers, which have separate address and integer registers, are actually able to enforce this at runtime, generating a hardware fault if you try to create a pointer to an address that is not allocated to your process.) The same goes for moving past the end of an array. You’re allowed to point one element past the end, but pointing two past the end is not allowed, and again, some computers are able to <em>enforce</em> this, at least in some cases. (Imagine that the array is located at the very top of the address space, so moving two elements past the end produces an overflow. On a CPU with dedicated address registers, overflows probably won’t be allowed. They’ll be caught and they’ll generate an exception.)</p></li>
<li><p>Not all pointers are born equal. A pointer to T may not be convertible to a valid pointer to U. Some machines require datatypes to be aligned. Typically, a 4-byte integer will have to be aligned so it starts on an address that is divisible by 4. But a single-byte datatype such as a <code>char</code> can be placed anywhere. So that means three out of four <code>char</code> pointers will not be valid integer pointers! We also can’t rely on casting as much as we’d typically expect. <code>reinterpret_cast</code> in particular often trips people up. (For non-C++ programmers, you can assume that we had used the “traditional” casting syntax, as in <code>(float*)i</code>. The difference is not important.)</p></li>
</ul>

<pre><code>int value = 0;
int* i = &amp;value; // we have a pointer i, and it points to a valid integer
float* f = reinterpret_cast&lt;float*&gt;(i); // #1
int* j = reinterpret_cast&lt;int*&gt;(f); // #2
assert(i == j);
</code></pre>

<p>In the above, we know <em>nothing</em> about the value of <code>f</code> after the cast on line <code>#1</code>. We know that it contains an “implementation-defined mapping” of the original <code>i</code>. But we are <em>not</em> guaranteed that it points to the same address, <em>or even that it contains the same bit pattern</em>!</p>

<p>True, the standard says that the mapping is “intended to be unsurprising to those who know the addressing structure of the underlying machine”, but in general, we can’t rely on that. All we are guaranteed is that once we cast <em>back</em> to the original type, we’re given the original value (and strictly speaking, even that only holds when the alignment requirements of the intermediate type are no stricter than those of the original type). So the standard guarantees that <code>i</code> and <code>j</code> in the above will point to the same address. But we know nothing about <code>f</code>, other than that the compiler is able to convert the value stored in it back to the original pointer <code>i</code>.</p>
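<p>To return to the first point in the list above, here is a small sketch of which arithmetic on pointers is defined. The helper names <code>step_forward</code> and <code>elements_between</code> are hypothetical, and both assume their arguments point into the same array. Note that the unit is always whole objects, never bytes:</p>

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: pointer + integer moves the pointer by n *objects*, not n bytes.
const int* step_forward(const int* p, std::ptrdiff_t n) {
    return p + n;
}

// Hypothetical helper: pointer - pointer yields the number of objects between the two,
// and is only meaningful when both point into the same array.
std::ptrdiff_t elements_between(const int* first, const int* last) {
    return last - first;
    // return first + last; // would not compile: pointer + pointer is not defined
}
```

<p>For an array <code>int a[4]</code>, <code>step_forward(a, 1)</code> is the address of <code>a[1]</code> regardless of how many bytes an <code>int</code> occupies, and <code>elements_between(a, a + 3)</code> is 3.</p>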

<h1>Conclusion</h1>

<p>By now, I hope it’s clear that pointers actually become a lot simpler when we treat them as what they are: reseatable references to objects. If we start pretending that they are memory addresses, we get a whole host of complications: we start thinking that they should be allowed to point to <em>any</em> memory address, or even worse, that they are just numbers, and that all the usual arithmetic works on them. (Remember, adding or subtracting integers is legal, but it adjusts the pointer by that number of <em>objects</em>, not <em>bytes</em>, as we would have expected if pointers were just memory addresses. And <code>pointer + pointer</code>, <code>pointer * pointer</code> or <code>pointer / pointer</code> are simply not defined at all.)</p>

<p>As if that wasn’t bad enough, we also require the student to understand the underlying hardware, in particular the concept of a memory space, and of physical (or virtual) hardware addresses.</p>

<p>But if we treat pointers as what they are, that is no longer necessary. A pointer points to a C++ object, not a memory address, so to understand pointers you merely have to understand C++ objects, not memory addresses.</p>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2009/07/the-great-pointer-conspiracy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
