<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>jalf.dk &#187; .net</title>
	<atom:link href="http://jalf.dk/blog/tag/net/feed/" rel="self" type="application/rss+xml" />
	<link>http://jalf.dk/blog</link>
	<description>Musings and thoughts on programming and other geeky stuff</description>
	<lastBuildDate>Mon, 12 Jul 2010 15:21:00 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>The meaning of RAII — or why you never need to worry about resource management again</title>
		<link>http://jalf.dk/blog/2010/01/the-meaning-of-raii-or-why-you-never-need-to-worry-about-resource-management-again/</link>
		<comments>http://jalf.dk/blog/2010/01/the-meaning-of-raii-or-why-you-never-need-to-worry-about-resource-management-again/#comments</comments>
		<pubDate>Sat, 02 Jan 2010 05:00:52 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[.net]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[raii]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=340</guid>
		<description><![CDATA[I tried really hard to come up with some witty title or pun to weave into the title of this post. I couldn’t. RAII is just a terrible name, and it isn’t really clever or funny. Unfortunately, it is also the single most important key to C++. It is not just an idiom but a [...]]]></description>
			<content:encoded><![CDATA[<p>I tried <em>really</em> hard to come up with some witty title or pun to weave into the title of this post. I couldn’t. RAII is just a terrible name, and it isn’t really clever or funny. Unfortunately, it is also <em>the</em> single most important key to C++. It is not just an idiom but a fundamental philosophy used to solve almost any problem in the language. So we can’t really avoid it.</p>

<p>If I had to pinpoint one thing that marked the difference between a skilled and an unskilled C++ programmer, it would be “do they understand RAII”. Many people don’t, hence this post.<span id="more-340"></span></p>

<p>RAII is, apart from being badly named, one of those deceptively simple concepts that you <em>think</em> you understand when you first hear of it, think “well duh, that’s obvious”, and then proceed to write code as usual, because you just don’t see how widely applicable it is.</p>

<p>But let’s get the name out of the way first. <a href="http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization">RAII</a> stands for “Resource Acquisition Is Initialization”. And if you’re not already familiar with the idiom, then this has told you <em>nothing at all</em>. If you did know about RAII in advance, then you can, when you stop and think about it, kind of see how the name relates to it… vaguely… sort of.</p>

<p>What it actually <em>means</em> is simple: Resources should be managed by classes. When the class is initialized, the resource is acquired (hence the name). When the class is destroyed, the resource is released. And the lifetime of the object should exactly match the desired lifetime of the resource. That sounds obvious, and many programmers will (assuming they’re working in a language that <em>has</em> classes), say that this is what they always do.</p>

<p>Often, C++ developers think this just means “smart pointers. Wrap your memory allocation in a <code>boost::shared_ptr</code> and you’re done”. I see that as one not-very-often used border case though, rather than a typical example of RAII. So let’s take a step back instead.</p>

<p>The key idea is that any kind of resource, not just memory, but file handles, sockets, database connections, or even more abstract resources like loggers or profiling timers or textures, really <em>any</em> concept or process which has a lifetime, should be mapped to an object.</p>

<p>Unlike the typical object-oriented line of thought which goes that “everything must be an object, because then… well, everything will be an object, and your code will be better”, here we actually have a concrete <em>reason</em>: We want to use the object to manage the lifetime of the resource.</p>

<p>When I allocate memory with <code>new</code>, I have to deallocate it again sooner or later, with <code>delete</code>. (Or in C, with <code>malloc()</code> and <code>free()</code> respectively). And I have to make sure that this is done. And I have to make sure that it is not done twice. And that the object is not accessed after this is done. There are a lot of constraints we have to obey, all related to the lifetime of the resource. And this is why unmanaged programs have a reputation of leaking memory left and right. If we allocate memory, and it is to be used by a dynamic number of objects or functions all referencing the same allocations, which of the users is responsible for deleting it? And how do we know when it is safe to delete, when no users remain?</p>

<p>Ironically, most managed languages have <em>not</em> solved the problem. They have added a garbage collector (which yes, is very useful for a wide number of reasons), but that only solves one specific instance of the problem. It takes care of avoiding memory leaks, but it doesn’t avoid resource leaks <em>in general</em>.</p>

<p>The garbage collector ensures that this code won’t leak memory:</p>

<pre><code>void foo() {
  SomeObject* obj = new SomeObject();
  bar(obj);
}
</code></pre>

<p>where without a garbage collector, we’d (at least without RAII) have to write code such as</p>

<pre><code>void foo() {
  SomeObject* obj = new SomeObject();
  try {
    bar(obj);
  }
  catch (...) {
    delete obj; // clean up, then let the exception continue to the caller
    throw;
  }
  delete obj;
}
</code></pre>

<p>In the garbage collected case, we don’t know what <code>bar</code> does, and we don’t <em>need</em> to know. It doesn’t have to delete the object. And neither does the <code>foo</code> function. So we have successfully dodged the problem of managing the lifetime of memory allocations. We haven’t really <em>solved</em> the problem though. We still don’t have any good tools to <em>manage</em> the lifetime. We’re just guaranteed by the system that it’ll last <em>long enough</em>.</p>

<p>In C++, this effect can be approximated using some kind of smart pointer<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup>.</p>

<p>Smart pointers allow us to write code like this:</p>

<pre><code>void foo() {
  boost::shared_ptr&lt;SomeObject&gt; ptr(new SomeObject());
  bar(ptr);
}
</code></pre>

<p>and be sure we won’t leak memory. Of course, this solution isn’t perfect — reference counting is much more expensive than a good garbage collector, and if we create cyclic references, the objects will never be deleted, as the reference counts never reach zero. It is a decent approximation, but nowhere near as good and reliable as the garbage collector in managed languages.</p>
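<p>To make the cyclic-reference caveat concrete, here is a minimal sketch. It assumes C++11’s <code>std::shared_ptr</code> and <code>std::weak_ptr</code>, which behave like their Boost counterparts for this purpose. The <code>Node</code> type and the <code>live_nodes</code> counter are made up for illustration.</p>

```cpp
#include <memory>
#include <utility>

// Hypothetical counter of Node objects that have not been destroyed yet.
static int live_nodes = 0;

// Hypothetical node type with one owning and one non-owning link.
struct Node {
    std::shared_ptr<Node> strong_peer; // owning link: keeps the peer alive
    std::weak_ptr<Node> weak_peer;     // non-owning link: does not affect the count
    Node() { ++live_nodes; }
    ~Node() { --live_nodes; }
};

// Two nodes that own each other: the reference counts never reach zero.
int leaked_after_strong_cycle() {
    const int before = live_nodes;
    Node* first = nullptr;
    {
        std::shared_ptr<Node> a = std::make_shared<Node>();
        std::shared_ptr<Node> b = std::make_shared<Node>();
        a->strong_peer = b;
        b->strong_peer = a;
        first = a.get();
    } // a and b are gone, yet each node is still held by the other
    const int leaked = live_nodes - before;
    // Break the cycle by hand so that this demonstration does not itself leak.
    std::shared_ptr<Node> keep = std::move(first->strong_peer);
    return leaked;
}

// A weak_ptr on one side of the link lets both nodes be destroyed.
int leaked_after_weak_link() {
    const int before = live_nodes;
    {
        std::shared_ptr<Node> a = std::make_shared<Node>();
        std::shared_ptr<Node> b = std::make_shared<Node>();
        a->strong_peer = b;
        b->weak_peer = a;
    }
    return live_nodes - before;
}
```

<p>With two strong links, both nodes survive the end of the scope. With one weak link, destroying <code>a</code> releases <code>b</code> as well, and nothing is left behind.</p>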

<p>But the problem shows up again if we use another type of resource. What if we’d opened a database connection instead?
We’d have to write code such as this:
(The following Java-like pseudocode is copied almost verbatim from <a href="http://stackoverflow.com/questions/161177/does-c-support-finally-blocks-and-whats-this-raii-i-keep-hearing-about/161247#161247">this StackOverflow.com answer</a>, courtesy of <a href="http://stackoverflow.com/users/14065/martin-york">Martin York</a>.)</p>

<pre><code>void writeToDb()
{
  Db db = new Db("DbDescriptionString");
  try
  {
    // Use the db object.
  }
  finally
  {
    db.close();
  }
}
</code></pre>

<p>(And of course it gets even worse if <code>db.close()</code> can throw exceptions. Then we have to catch <em>that</em> exception, just to avoid it propagating out from the <code>finally</code> clause if we reached <code>finally</code> because of an exception being thrown in the <code>try</code> clause.)</p>

<p>The resource management problem still exists. We still have to wrap the code in exception handling just to make sure that the connection is closed as soon as we’re done with it. And we have to do this at <em>every</em> use. And it gets complicated fast.</p>

<p>Of course, .NET makes this a bit simpler:</p>

<pre><code>using (Db db = new Db("DbDescriptionString"))
{
  // use the database object.
}
</code></pre>

<p>But the onus is still on the user of the class to ensure it is closed correctly. There is no obvious way to encode into the <code>Db</code> class that “once we’re done with an object of this type, the connection must be closed immediately”.</p>

<p>And in C++, smart pointers are no longer suitable solutions, since the resource to be managed is no longer a pointer allocated with <code>new</code>.</p>

<p>Instead, a more basic flavor of RAII comes to the fore:</p>

<pre><code>void someFunc()
{
    Db db("DbDescriptionString");
    // Use the db object.
} 
</code></pre>

<p>Yes, that’s all. When the <code>db</code> object goes out of scope, at the end of the function, its destructor is called. The destructor internally calls <code>this-&gt;Close()</code> for us, so we don’t need to do it! We just have to trust the scoping rules of C++, which guarantee that destructors are called on local variables when they go out of scope, and on class members when the class is destroyed.</p>
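<p>As a rough idea of what such a class could look like on the inside, here is a sketch. Everything in it is hypothetical: the real connection is replaced by a simple counter (<code>open_connections</code>), and a real <code>Db</code> class would talk to a database driver instead.</p>

```cpp
#include <string>
#include <utility>

// Stand-in for a real database: counts connections that are currently open.
static int open_connections = 0;

// Hypothetical Db class, for illustration only.
class Db {
public:
    explicit Db(std::string description)
        : description_(std::move(description)), open_(true) {
        ++open_connections; // acquire the resource in the constructor
    }
    ~Db() { Close(); } // release it in the destructor

    // Copying a connection has no obvious meaning here, so it is disabled.
    Db(const Db&) = delete;
    Db& operator=(const Db&) = delete;

    void Close() {
        if (open_) {
            --open_connections;
            open_ = false; // makes Close() safe to call more than once
        }
    }

    const std::string& description() const { return description_; }

private:
    std::string description_;
    bool open_;
};

// Returns the number of connections open while db is in scope.
int connections_inside_scope() {
    Db db("DbDescriptionString");
    return open_connections;
} // db's destructor runs here and closes the connection
```

<p>The caller of <code>connections_inside_scope</code> never mentions closing anything. The count is one inside the function and back to zero once it has returned.</p>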

<p>So in a sense, the key idea in RAII is simply that “resources should behave sensibly”. They should get copied safely if an assignment is made (or otherwise, assignments should be prevented), they should be available if their owning object is successfully created (if it can’t create the resource, it should throw an exception, aborting the creation of the object), and when they are no longer used, they should be cleaned up.</p>
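<p>The three rules can be sketched with a small wrapper around a C <code>FILE*</code>. The <code>File</code> class and the <code>open_fails</code> helper are hypothetical names chosen for this example. The sketch takes the “assignments should be prevented” route rather than implementing copying.</p>

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical wrapper around a C FILE*; it owns exactly one resource.
class File {
public:
    File(const std::string& path, const char* mode)
        : handle_(std::fopen(path.c_str(), mode)) {
        if (!handle_) {
            // No resource was acquired, so no object comes into existence.
            throw std::runtime_error("could not open " + path);
        }
    }
    ~File() { std::fclose(handle_); } // handle_ is never null here

    // Prevent copies: two objects closing the same FILE* would be a bug.
    File(const File&) = delete;
    File& operator=(const File&) = delete;

    std::FILE* get() const { return handle_; }

private:
    std::FILE* handle_;
};

// Returns true if constructing a File for the given path throws.
bool open_fails(const std::string& path) {
    try {
        File f(path, "r");
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}
```

<p>Because the constructor throws on failure, any <code>File</code> object that exists is guaranteed to hold an open file, and its destructor never has to check.</p>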

<p>The C++ standard library class template <code>std::vector</code> is a wonderful example of RAII in action. The resources being managed by a <code>vector</code> are memory (the array allocated internally to hold the objects being contained in the vector), as well as the objects themselves. When the <code>vector</code> is destroyed, every object it holds must be destroyed too, and the array in which they were placed must be deallocated.</p>

<p>In the following examples, assume that a function <code>foo</code> is passed a vector of <code>MyClass</code> objects by value. We don’t know how many, if any, objects are stored in it, but since we are passed a copy of the original <code>vector</code>, we take ownership of it. It exists only in the function <code>foo</code>, and must be destroyed afterwards.</p>

<pre><code>void foo(std::vector&lt;MyClass&gt; vec) {
  ...
 //  when we get to the end of the function, all local variables, including vec, 
 // are automatically destroyed by having their destructors invoked.
 // So no matter how many MyClass objects were stored in the vector, it ensures that they too have their destructors called.
 // And the vector also deallocates its internal array, leaving neither of its resources alive at the end of the function
}

void foo(std::vector&lt;MyClass&gt; vec) {
  throw std::runtime_error("Oops"); // requires &lt;stdexcept&gt;
  // as above, vec is automatically destroyed when we leave the function,
  // regardless of *how* we leave it. Even if we leave it because an exception was thrown and not caught.
} 

void foo(std::vector&lt;MyClass&gt; vec) {
  // other is constructed as a copy of vec. std::vector ensures that both of vec's resources are copied as well
  std::vector&lt;MyClass&gt; other = vec;
  // we now have two vectors, each owning a dynamically allocated array and a number of MyClass objects
  // and again, at the end of the function, both are deallocated cleanly
} 

void foo(std::vector&lt;MyClass&gt; vec) {
  std::vector&lt;MyClass&gt; other; // a second, empty, vector

  // perform an assignment, setting vec to be an empty vector
  // std::vector makes sure that if you do this, the resources previously held by vec are cleanly released
  // before copies are made of the resources held by other
  vec = other;

  // and so when the function ends, the MyClass objects originally held by vec
  // have already been destroyed, so their destructors are *not* invoked now
} 
</code></pre>

<p>As the above shows, <code>vec</code> owns its resources, and manages them tightly. Whenever a change happens to <code>vec</code>, it reflects this by updating its owned resources. If it is destroyed, it destroys its owned resources. If it is copied, it copies the resources it owns. If it is assigned to hold something else, it first destroys its existing resources. And so on. Nothing you do can bring it “out of balance”. It just works. <em>That</em> is RAII. Smart pointers are just convenient adapters turning raw pointers into RAII objects. But RAII is much more than smart pointers.</p>
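<p>These claims can be checked with an element type that counts its own live instances. The sketch below uses a hypothetical <code>Counted</code> struct in place of <code>MyClass</code>; the function names are likewise made up for this example.</p>

```cpp
#include <vector>

// Hypothetical element type that counts how many instances are currently alive.
struct Counted {
    static int alive;
    Counted() { ++alive; }
    Counted(const Counted&) { ++alive; }
    Counted& operator=(const Counted&) = default;
    ~Counted() { --alive; }
};
int Counted::alive = 0;

// Takes the vector by value, so this function owns its own copy.
int alive_inside(std::vector<Counted> vec) {
    std::vector<Counted> other = vec; // copies every element once more
    return Counted::alive;            // caller's elements + vec's + other's
} // vec and other are destroyed here, together with their elements

// Assigning an empty vector destroys the elements vec held before.
int alive_after_assigning_empty(std::vector<Counted> vec) {
    std::vector<Counted> empty;
    vec = empty;
    return Counted::alive; // only the caller's elements remain
}
```

<p>Starting from a vector of three elements, the first function sees nine live objects and the second sees three. After either call the count is back to three, without a single explicit cleanup call.</p>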

<p>It is the broad and general idea that <em>resources should be mapped to objects</em>, so that the object can not be created unless it succeeded in acquiring its resource, and it can not be destroyed without also releasing its resource. This effectively saves C++ programmers from having to worry about resource management.</p>

<p>Take an example that’s guaranteed to cause pain without the use of RAII: Handling exceptions being thrown halfway through constructors. Say you have a class with multiple members which are initialized in its constructor. After the first member has been initialized, but before all of them have been initialized, an exception is thrown. Let’s use the following contrived example:</p>

<pre><code>class Foobar {
  Foo a;
  Bar b;
  MyClass c;

public:
  Foobar() : a(42), b("hello world"), c('a') {}
};
</code></pre>

<p>Unfortunately, <code>b</code>’s constructor throws an exception. How to handle this? We know that in C++, partially constructed objects do not automatically have their destructors called when the construction is aborted.</p>

<p>And since we want to avoid any resource leaks, we require that the following must happen:</p>

<ul>
<li><code>a</code> must have its destructor called (because <code>a</code> was successfully initialized before the error occurred)</li>
<li><code>b</code> must release any resources it acquired in its constructor before it threw the exception</li>
<li><code>c</code> must do nothing. Its construction had not yet begun when the error occurred, so it would be an error to attempt any kind of cleanup of <code>c</code>.</li>
<li>The <code>Foobar</code> object (the object pointed to by the <code>this</code> pointer) must ensure that the above, and nothing else, happens, and it must do so without relying on its own destructor (which won’t be called, as construction did not successfully complete).</li>
</ul>

<p>And of course, pretending that only <code>b</code> can throw an exception may be a simplification over the real world. Perhaps every member could throw one from its constructor. Care to write a <code>Foobar</code> constructor which takes all this into account, providing enough <code>try</code>/<code>catch</code> blocks to correctly catch every exception that might be thrown, and release exactly the resources that have been allocated until then, and <em>nothing</em> else? A tall order, and an open invitation for bugs. And of course, it’d lead to a huge, bloated and error-prone constructor. It’d also prevent us from using the <em>initializer list</em>. We’d have to perform some kind of “safe” non-throwing default construction of all of <code>a</code>, <code>b</code> and <code>c</code> before entering the constructor body, where exception handling is possible, and from there, attempt to perform assignments to bring the three members into the desired state.</p>

<p>In pseudocode, the constructor might look something like this:</p>

<pre><code>Foobar() {
  a = new Foo(42);
  try {
    b = new Bar("hello world");
  }
  catch {
    destroy a;
    throw;
  }
  try {
    c = new MyClass('a');
  }
  catch {
    destroy b;
    destroy a;
    throw;
  }
}
</code></pre>

<p>Note that all this complexity is only necessary because we want to handle several different resources. <code>a</code>, <code>b</code> and <code>c</code> all contain resources that we must attempt to acquire, and properly release if this fails. If there’d been only one resource, the job would have been much simpler. There wouldn’t be any point at which <em>some</em> resources have been acquired, and others have not. If we succeeded in acquiring that one resource, there’d be no risk of errors occurring afterwards, so we wouldn’t need complex conditional cleanup code. And if we failed to acquire the one resource, there’d be nothing to clean up. After all, the resource was never acquired!</p>

<p>So to keep down the complexity, the only safe way to define a class is to make it own <em>at most one</em> resource. And this one-to-one mapping of resources to classes is exactly what RAII is all about. If <code>a</code>, <code>b</code> and <code>c</code> had all been RAII objects, then the above code <em>would work</em>. Regardless of which members could or couldn’t throw exceptions. According to the rules of C++, we know that in the above case,</p>

<ul>
<li>the <code>Foobar</code> destructor (<code>this-&gt;Foobar::~Foobar()</code>) will not be called, as <code>*this</code> was not successfully constructed.</li>
<li>the <code>a</code> destructor will be called, as this member was fully constructed at the time of the error.</li>
<li>the <code>b</code> and <code>c</code> destructors will not be called, as these members were not fully constructed at the time of the error.</li>
</ul>

<p>So assuming that <code>b</code>’s constructor takes care of releasing any resources successfully allocated when the error occurred (the number of which, as pointed out above, should ideally be zero), we’re actually home free! What happens is exactly what we listed earlier as our goal. <code>a</code> has its destructor called, <code>c</code>’s constructor was never run in the first place, so it doesn’t have to do anything, and <code>*this</code> doesn’t have to do <em>anything</em> special in its constructor. All of its members take care of their own resources, so the number of resources managed by <code>*this</code> is zero!</p>

<p>We don’t even need to write a destructor for <code>Foobar</code> now, if all its members are RAII objects. Whether the <code>Foobar</code> object is partially or fully constructed, its members take care of themselves. That is the power of RAII. Once a resource has been mapped to a class, we can use it as much as we like, and even in very complex situations, and never have to worry about the resource being leaked. It is managed by its wrapping RAII object, and the C++ lifetime and scope rules ensure that this wrapper object gets destroyed when it goes out of scope.</p>
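<p>The order of events during a failed construction can be observed directly. In the sketch below, the bodies of <code>Foo</code>, <code>Bar</code> and <code>MyClass</code> are invented for the demonstration: each one only appends to a hypothetical <code>event_log</code> string, and <code>Bar</code>’s constructor always throws.</p>

```cpp
#include <stdexcept>
#include <string>

// Records construction and destruction events in the order they happen.
static std::string event_log;

// Hypothetical member types: each logs its own lifetime events.
struct Foo {
    explicit Foo(int) { event_log += "+a "; }
    ~Foo() { event_log += "-a "; }
};
struct Bar {
    explicit Bar(const char*) {
        event_log += "b-throws ";
        throw std::runtime_error("Bar failed"); // simulate a failed acquisition
    }
    ~Bar() { event_log += "-b "; }
};
struct MyClass {
    explicit MyClass(char) { event_log += "+c "; }
    ~MyClass() { event_log += "-c "; }
};

class Foobar {
    Foo a;
    Bar b;
    MyClass c;
public:
    Foobar() : a(42), b("hello world"), c('a') {}
    ~Foobar() { event_log += "-Foobar "; }
};

// Attempts the construction and returns the recorded events.
std::string try_construct() {
    event_log.clear();
    try {
        Foobar fb;
    } catch (const std::runtime_error&) {
        event_log += "caught";
    }
    return event_log;
}
```

<p>The log shows <code>a</code> being constructed and destroyed again, while the destructors of <code>b</code>, <code>c</code> and <code>Foobar</code> never run, even though the <code>Foobar</code> constructor contains no exception handling at all.</p>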

<div class="footnotes">
<hr />
<ol>

<li id="fn:1">
<p>A smart pointer is an object which behaves as a pointer (meaning that it overloads the <code>*</code> and <code>-&gt;</code> operators, so it can be dereferenced to yield the pointed-to value), but also enforces some kind of ownership semantics on the value. A plain pointer does nothing when it goes out of scope. If it pointed to some dynamically allocated memory, nothing happens to that memory. And if no one else has a pointer to it, then that memory is lost, and can not be reclaimed.
A smart pointer does <em>something</em> when it is destroyed. Some variants simply free the memory they point to (<code>boost::scoped_ptr</code>, <code>std::auto_ptr</code> or <code>std::unique_ptr</code> all fall into this category, although with some important differences), while others implement reference counting, so that the memory is only destroyed when <em>all</em> smart pointers pointing to it have been destroyed. <code>boost::shared_ptr</code> is by far the best known implementation of this concept. <a href="#fnref:1" rev="footnote">↩</a></p>
</li>

</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2010/01/the-meaning-of-raii-or-why-you-never-need-to-worry-about-resource-management-again/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>A .NET Developers Guide to C++ (part III)</title>
		<link>http://jalf.dk/blog/2009/10/a-net-developers-guide-to-c-part-iii/</link>
		<comments>http://jalf.dk/blog/2009/10/a-net-developers-guide-to-c-part-iii/#comments</comments>
		<pubDate>Thu, 01 Oct 2009 15:45:28 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[.net]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[teaching]]></category>
		<category><![CDATA[win32]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=247</guid>
		<description><![CDATA[We’re nearing the end! Part I focused on the very fundamentals of C and C++, making sure that you understand the build system and the very basics of the syntax. Part II expanded on this to teach you all the C++ you’ll need to do basic work in the language, including a few useful parts [...]]]></description>
			<content:encoded><![CDATA[<p>We’re nearing the end!</p>

<p><a href="http://jalf.dk/blog/2009/08/a-net-developers-guide-to-c/">Part I</a> focused on the very fundamentals of C and C++, making sure that you understand the build system and the very basics of the syntax.</p>

<p><a href="http://jalf.dk/blog/2009/09/a-net-developers-guide-to-c-part-ii/">Part II</a> expanded on this to teach you all the C++ you’ll need to do basic work in the language, including a few useful parts of the standard library, such as vectors and strings.</p>

<p>You now know all the basics we need, and the actual Win32 API should now be very simple to deal with. Not elegant or consistent, but comprehensible as long as you keep a close eye on the documentation and take nothing for granted.</p>

<p><span id="more-247"></span></p>

<p>First of all, the documentation can be found <a href="http://msdn.microsoft.com/en-us/library/aa383749.aspx">here</a>. As you probably already know, Microsoft’s own search capabilities are nonexistent, and to find the function you need, you’ll typically want to use Google. But sometimes, the complete reference is useful, so here it is.</p>

<p>To teach you how to use the Win32 API, I will run you through a pair of functions with some very basic functionality: retrieving the last error message.</p>

<p>Should be simple, right? You’d think so, if you’re new to Win32.</p>

<p>The operation consists of two steps. First we have to retrieve the last error code, and then we have to ask Windows for the associated message as a string.</p>

<p>And the first step is indeed easy. We just have to call the <a href="http://msdn.microsoft.com/en-us/library/ms679360.aspx"><code>GetLastError</code></a> function. Let’s start with the complete code:</p>

<pre><code>#include &lt;windows.h&gt;

int main() {
  DWORD error = GetLastError();
}
</code></pre>

<p>Feel free to run this in the debugger and see which code the function returns. (Most likely you’re going to get a <code>0</code>, since no error has actually occurred at this point).</p>

<p>Now let’s look at the actual documentation. They describe the function as having this signature:</p>

<pre><code>DWORD WINAPI GetLastError(void);
</code></pre>

<p>which looks like nothing we’ve seen so far. Let’s take the easy parts first. The function’s name is <code>GetLastError</code>. It takes a parameter of type <code>void</code>, or so it seems. This is actually a throwback to C, and means that the function takes no parameters. Both <code>GetLastError()</code> and <code>GetLastError(void)</code> are legal in C++. In C, the two used to have subtly different meanings. <code>(void)</code> properly declared a function which took no parameters, while <code>()</code> declared a function which took <em>any number of parameters</em>, but simply didn’t access them. But that was C. In C++, the two are identical, and we usually use <code>()</code> to indicate a function that takes no parameters.</p>

<p>Next, we have the return type at the far left. <code>DWORD</code> is short for Double Word (a word is the “natural” data size on the CPU, which, back in the old days, was a 16-bit integer). Hence, a double word is 32 bits wide. Microsoft has defined <code>DWORD</code> as a type alias for portability. Under the hood it is simply an <code>unsigned long</code> (which is 32 bits wide on Windows), but that may change some day. If it does, they will redefine <code>DWORD</code> to stand for some other type. So if you use <code>DWORD</code> when they tell you to, your code will still compile when it happens. It is easiest to just nod and accept this. It doesn’t make a big difference for us, but if and when we need to, we know that we can cast a <code>DWORD</code> to an <code>unsigned int</code>.</p>

<p>That leaves the last name, <code>WINAPI</code>, which exists for pretty much the same purpose. It is another macro, and stands for the <em>calling convention</em>. The calling convention for a function specifies how parameters should be passed to it, and where it should place its return value. If we don’t know the calling convention of a function, we can not call it. Normally, we’re happy to use the default calling convention, but the Windows API has to be specific, so they add the <code>WINAPI</code> macro. And again, they use a macro so that if they one day decide to change the underlying calling convention, they can simply redefine this macro, and everyone’s code should still compile with no problems.</p>

<p>Following this, they describe the parameters and return value in detail. This is always worth reading in detail, because often, some parameters may or <em>must</em> be NULL. Likewise, the return value may have several meanings, and there is no single consistent convention. Some functions return zero on success, others return non-zero, or a positive value on success. Some don’t return a success code at all. And some functions return NULL on error, and <em>actual data</em> on success. Always, <em>always</em> read this section carefully.</p>

<p>In this case, we’re lucky. It simply returns the currently active error code, although it does ramble on about all the inconsistencies caused by other functions.</p>

<p>The <strong>remarks</strong> section tells us other information that doesn’t fit under one specific parameter or under the return value. Again, this should never be skipped. This is where all the inconsistencies and special cases are often listed.</p>

<p>Some functions then have a link to an example usage.</p>

<p>Finally, the documentation shows us <em>where</em> and <em>when</em> the function is defined. In this case, we need at least Windows 2000, and the function is defined in <code>WinBase.h</code> (but we should just include <code>windows.h</code>).</p>

<p>And it is defined in the <code>Kernel32.Lib</code> library. This library is included by default, so we don’t have to worry about this.</p>

<p>So far, it hasn’t been <em>too</em> bad, has it? It should be clear already that it’s not a pretty API, but as long as we stick to the documentation it’s pretty straightforward.</p>

<p>So let’s move on to the <a href="http://msdn.microsoft.com/en-us/library/ms679351.aspx"><code>FormatMessage</code></a> function. Follow that link, and take a look… I’ll be here waiting…</p>

<p>Done?
Good. Now <strong>this</strong> looks scary. And no, this time I can’t give you a simple explanation. This function truly <em>is</em> scary. Of course, this is one of the reasons why I picked it for this example. This is about as bad as the Win32 API gets.</p>

<p>The page lists the following function prototype:</p>

<pre><code>DWORD WINAPI FormatMessage(
  __in      DWORD dwFlags,
  __in_opt  LPCVOID lpSource,
  __in      DWORD dwMessageId,
  __in      DWORD dwLanguageId,
  __out     LPTSTR lpBuffer,
  __in      DWORD nSize,
  __in_opt  va_list *Arguments
);
</code></pre>

<p><code>__in</code>, <code>__in_opt</code> and <code>__out</code> are Microsoft-specific extensions, and are mainly used for documentation and for static code verification. It tells us which parameters are used for input, and which ones are for output, as well as which ones are optional.</p>

<p><code>LPCVOID</code> is another Microsoft-defined name. Microsoft spent a decade or two promoting Hungarian Notation before they had to admit what an astonishingly bad idea it actually was. But of course Win32 is stuck with it.
The <code>LP</code> prefix stands for “Long Pointer”, and you can pretty much ignore the “Long” part. That dates back to 16-bit computers, where you actually had different types of pointers (far and near pointers). All we need to know is that it is a pointer. The <code>C</code> is for constant. In other words, this is a pointer to constant void, or <code>const void*</code>. (Of course, <code>void</code> isn’t a very meaningful thing to point to. A void pointer is essentially used as a pointer to an unknown type.)</p>

<p><code>LPTSTR</code> is another adventure in Hungarian Notation. You already know <code>LP</code>. <code>STR</code> is probably obvious too. It’s a string. (Of course, since this is a C API, we’re talking about a C string, or a char pointer, which also explains the presence of the <code>LP</code> part.) That leaves the <code>T</code>. What can that mean? I’m not sure. It might be “Template” or similar. It was introduced when Microsoft realized that they’d have to support Unicode. As I mentioned previously, Windows uses <code>wchar_t</code> for Unicode text, and so their API had to accept <code>wchar_t</code> pointers when working with Unicode strings. But they also had to stay backwards compatible and be able to handle plain char pointers.</p>

<p>So they invented a new set of macros. The <code>T</code> essentially stands for “whichever character type is currently active”.
If you enter your project’s properties, you’ll see the option to enable or disable Unicode on the General tab. It should be enabled by default.</p>

<p>As long as Unicode is enabled, any macro including this T will be mapped to the equivalent macro using a <code>W</code> (for Wide). If Unicode is disabled, the macro will instead point to a similarly named macro <em>without</em> this character.</p>

<p>In other words:</p>

<ul>
<li>LPTSTR -&gt; LPWSTR or LPSTR</li>
<li>LPCTSTR -&gt; LPCWSTR or LPCSTR</li>
<li>TCHAR -&gt; WCHAR or CHAR</li>
</ul>

<p>And all of these again point to the types you would probably now expect. <code>LPWSTR</code> is a pointer to a wide string (<code>wchar_t*</code>). And <code>LPCSTR</code> is a <em>const</em> pointer to a string, or <code>const char*</code>. And <code>WCHAR</code> is a <code>wchar_t</code>.</p>
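<p>The mechanism behind these mappings can be imitated in a few lines of portable code. The sketch below does not use the Windows headers. All the <code>Sketch</code>-prefixed names, and the <code>SKETCH_UNICODE</code> switch, are hypothetical stand-ins for the real names and the project-wide Unicode setting.</p>

```cpp
#include <cstddef>

// Hypothetical switch standing in for the project-wide Unicode setting.
#define SKETCH_UNICODE 1

#if SKETCH_UNICODE
typedef wchar_t SketchTChar;   // plays the role of TCHAR mapping to WCHAR
#define SKETCH_TEXT(s) L##s    // string literals become wide literals
#else
typedef char SketchTChar;      // plays the role of TCHAR mapping to CHAR
#define SKETCH_TEXT(s) s
#endif

typedef SketchTChar* SketchTStr;             // plays the role of LPTSTR
typedef const SketchTChar* SketchConstTStr;  // the const variant

// Counts characters up to the terminating zero, whichever character type is active.
std::size_t sketch_length(SketchConstTStr text) {
    std::size_t n = 0;
    while (text[n] != 0) {
        ++n;
    }
    return n;
}
```

<p>Code written against the <code>T</code> names compiles in either mode, for example <code>sketch_length(SKETCH_TEXT("hello"))</code>. Flipping the switch changes the character type underneath without touching the calling code.</p>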

<p>As if this wasn’t complicated enough, the function itself is <em>also</em> a macro. Two versions of the function actually exist:</p>

<ul>
<li><code>FormatMessageA</code> is the old ASCII version, using plain <code>char</code> strings.</li>
<li><code>FormatMessageW</code> is the “new” Unicode version, using <code>wchar_t</code> strings.</li>
</ul>

<p>FormatMessage is not itself a function, but simply a macro, which is resolved by the preprocessor to one of these two. (C doesn’t allow overloaded functions, so they had to settle for this ugly hack to allow multiple definitions of the same function).</p>

<p>This also means that we can actually <em>call</em> these two names directly. If we call <code>FormatMessageW</code>, we’ll get the Unicode version regardless of whether Unicode is enabled in project settings. This makes it safe for us to use <code>wchar_t</code> strings directly, rather than mess around with <code>TCHAR</code> strings which might be one or the other.</p>
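<p>The same trick applied to function names might look like the following sketch. <code>DescribeA</code>, <code>DescribeW</code> and <code>Describe</code> are invented for this example and only mimic the naming scheme. Nothing here comes from the Windows headers.</p>

```cpp
#include <string>

// Hypothetical pair of functions mirroring the A/W naming scheme.
std::string DescribeA(const char* text) {
    return std::string("narrow:") + text;
}
std::string DescribeW(const wchar_t* text) {
    // Only reports the length, to keep the sketch free of character conversion.
    return "wide:" + std::to_string(std::wstring(text).size());
}

// Hypothetical switch standing in for the project-wide Unicode setting.
#define SKETCH_UNICODE 1

#if SKETCH_UNICODE
#define Describe DescribeW // the "generic" name is just a macro
#else
#define Describe DescribeA
#endif

// Calls the generic name; the preprocessor rewrites it before compilation.
std::string call_generic() {
#if SKETCH_UNICODE
    return Describe(L"hi");
#else
    return Describe("hi");
#endif
}
```

<p>Both suffixed names stay callable whatever the switch says, which is exactly why calling the <code>W</code> version directly is a safe choice.</p>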

<p>Going back to the function declaration, the last parameter, <code>va_list</code>, looks a bit out of place. It’s not capitalized, and it doesn’t have these ugly prefixes. It is used in C to indicate <em>a variable number of arguments</em>, commonly known as <code>varargs</code>. As I mentioned in part I, printf uses <code>varargs</code> as well, and this throws away all hope of type safety, or even knowing how many parameters are passed to the function. Hopefully we won’t need to mess with this. (It’s marked as <code>__in_opt</code>, so it should be optional. Let’s hope we won’t have to use it then.)</p>

<p>Anyway, there’s nothing for it. Let’s dive in. First parameter:</p>

<p>Ok, so this is a <code>DWORD</code> flag, and seems to store a combination of two values. The second table is shorter, so let’s take that first. There are three options here. A zero just means to preserve whatever line breaks exist in the message by default. The constant <code>FORMAT_MESSAGE_MAX_WIDTH_MASK</code> seems to preserve hardcoded linebreaks, but removes “regular” ones. I have no clue what the difference is.</p>

<p>The last option (mentioned just under the table) is to store any other number into the value. This then specifies the maximum line width. We’re happy to just use the default line breaks though, so we’ll settle for the zero value. That leaves the first table.</p>

<p>Looking through the options there, it seems that we need <code>FORMAT_MESSAGE_FROM_SYSTEM</code>. <code>FORMAT_MESSAGE_ALLOCATE_BUFFER</code> seems potentially interesting as well, but this table doesn’t really explain what happens if this flag is <em>not</em> enabled. If the system doesn’t allocate a buffer for us, who does? Looking down further, at the input parameter <code>nSize</code> we see that:</p>

<blockquote>
  <p>If the FORMAT_MESSAGE_ALLOCATE_BUFFER flag is not set, this parameter specifies the size of the output buffer, in TCHARs.</p>
</blockquote>

<p>In other words, if we don’t use this flag, we have to provide a buffer. But we don’t know the length of the message we’re trying to retrieve, so this seems a bad idea. (Of course we could just provide a buffer of 64KB, which the documentation mentions is the maximum size, but this seems silly).</p>

<p>Finally, if we skip down to the “Security Remarks”, it says to add <code>FORMAT_MESSAGE_IGNORE_INSERTS</code> if we’re going to pass “arbitrary system error codes”, which we are. Most APIs try to ensure that the simplest action is the correct one. Win32 seems to be designed for the opposite case, ensuring that correct usage is only possible if you have <em>already</em> read the entire documentation page, very carefully, and at least three times. But that won’t stop us.</p>

<p>So the <code>dwFlags</code> parameter should then be the combination of these flags: <code>FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_IGNORE_INSERTS | 0</code>, although of course the 0 can be omitted.</p>

<p>Next, we have <code>lpSource</code>. Luckily, this is marked optional, and it is stated that this is ignored unless one of the two listed <code>dwFlags</code> values is set, which is not the case for us. So we ignore it and simply pass <code>NULL</code>.</p>

<p>Then we have the message ID. This must be the value we got from <code>GetLastError</code>. Then we have the language ID. Rather than going searching for possible values to pass here, we can see that if we just pass a zero, it’ll try to pick a sensible default. So let’s do that.</p>

<p>Now comes the pointer to the output buffer. Read what it says here carefully:</p>

<blockquote>
  <p>If dwFlags includes FORMAT_MESSAGE_ALLOCATE_BUFFER, the function allocates a buffer using the  LocalAlloc  function, <em>and places the pointer to the buffer at the address specified in lpBuffer</em>.</p>
</blockquote>

<p>So the parameter <code>lpBuffer</code> is a pointer to <em>the pointer to the buffer</em>. That is, we must pass it a pointer to the pointer it should set to point to the allocated buffer.</p>
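<p>If the double-duty parameter is hard to picture, the following portable sketch imitates it. It is <em>not</em> the real API: <code>sketch_get_message</code> is a hypothetical function, its message text is hardcoded, and it uses plain <code>malloc</code> and <code>free</code> where Windows would use <code>LocalAlloc</code> and <code>LocalFree</code>.</p>

```cpp
#include <cstddef>
#include <cstdlib>
#include <cwchar>

// Hypothetical function whose `buffer` parameter plays two roles:
// a caller-provided buffer, or (if `allocate` is true) a disguised wchar_t**.
std::size_t sketch_get_message(bool allocate, wchar_t* buffer, std::size_t size) {
  const wchar_t text[] = L"The operation completed successfully.";
  const std::size_t length = std::wcslen(text);
  if (allocate) {
    wchar_t* fresh = static_cast<wchar_t*>(std::malloc((length + 1) * sizeof(wchar_t)));
    if (!fresh) { return 0; }
    std::wcscpy(fresh, text);
    // Undo the caller's cast and store the new pointer in the caller's variable.
    *reinterpret_cast<wchar_t**>(buffer) = fresh;
    return length;
  }
  if (size <= length) { return 0; }  // the caller's buffer is too small
  std::wcscpy(buffer, text);
  return length;
}

// Calls the function in "allocate for me" mode and checks the result.
bool sketch_demo() {
  wchar_t* message = 0;
  // We pass the address of our pointer, cast to the type the function expects.
  std::size_t length = sketch_get_message(true, reinterpret_cast<wchar_t*>(&message), 0);
  bool ok = length != 0 && message != 0 && std::wcslen(message) == length;
  std::free(message);  // the caller owns the buffer and must release it
  return ok;
}
```

<p>The cast on the calling side and the cast inside the function cancel each other out. The type system never sees that a pointer to a pointer was passed, which is exactly why nothing stops us from getting it wrong.</p>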

<p>It also mentions that the buffer is allocated with <code>LocalAlloc</code>, and must be freed by us with <code>LocalFree</code>. Better remember this, or we’ll leak memory. Note that Windows defines several different memory allocation functions. This time they chose to use <code>LocalAlloc</code>. C++’s <code>new</code> and <code>delete</code> are implemented in terms of <em>some</em> of these, but who knows which?</p>

<p>Now comes <code>nSize</code>. It allows us to specify the minimum number of characters to allocate? Why would we care about that? Let’s just pass zero and hope for the best. It’s just a minimum after all.</p>

<p>Finally, we have <code>Arguments</code>. We already specified that the system should ignore inserts, so it seems like it shouldn’t actually care about these arguments. They’re also specified as optional, so let’s pass a big fat <code>NULL</code> here.</p>

<p>And that should be it! Now we just have to handle the return value:</p>

<ul>
<li>zero on failure, or</li>
<li>the number of TCHARs stored in the output buffer, not counting the terminating null character</li>
</ul>

<p>And… we’re through. Now let’s try putting the pieces together and see what happens:</p>

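<pre><code>#include &lt;windows.h&gt;
#include &lt;iostream&gt;

int main() {
  DWORD error = GetLastError();

  // Start at NULL so that we never print or free garbage if the call fails.
  wchar_t* buffer = NULL;

  DWORD length = FormatMessageW(
    FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_IGNORE_INSERTS,
    NULL,   // lpSource: ignored with the flags we use
    error,  // the message ID we got from GetLastError
    0,      // language ID: let the system pick a default
    (wchar_t*)&amp;buffer, // really a pointer to our pointer, hence the cast
    0,      // nSize: minimum number of characters to allocate
    NULL);  // Arguments: unused because we ignore inserts

  if (length == 0) {
    // A return value of zero means that FormatMessageW itself failed.
    std::wcerr &lt;&lt; L"FormatMessageW failed" &lt;&lt; std::endl;
    return 1;
  }

  std::wcout &lt;&lt; buffer &lt;&lt; std::endl;

  // The buffer was allocated with LocalAlloc, so it is released with LocalFree.
  LocalFree(buffer);
}
</code></pre>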

<p>Note the ugly cast we need on the buffer. This is necessary because the argument may be either a pointer to a pre-allocated buffer, or (as in our case), a pointer to a pointer we’d like to be set to point to the system-allocated buffer. But the function expects a pointer to a string buffer, not a pointer to a pointer to a string buffer, so if we want to pass it the latter, we have to cast it to the former type.</p>

<p>Note that I’m calling the <code>W</code> version of the function specifically, and using <code>wchar_t</code> instead of <code>TCHAR</code>. The reason is simple. I want the Unicode version, regardless of the Unicode setting in the project. Part of the reason is that it’s a lot easier to print out the string when we know what type it is. In particular, the standard library requires us to use <code>cout</code> for regular character strings, and <code>wcout</code> for wide strings. If we’re given a string of TCHARs, do we call <code>cout</code> or <code>wcout</code> to print it? Easier to just be specific and make sure we have wide characters.</p>

<p>Well, that’s it. Try running it. It should print out that “the operation completed successfully”. Gee, thanks. That really makes it all feel worthwhile, doesn’t it? Make sure you understand what our code means (in particular, why the cast is necessary, and how <code>wcout</code> is able to print out the string and know where it ends, when all it has is a pointer to a character).</p>
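<p>As a hint for that last question, here is a small sketch of how any code can find the end of a C string when all it has is a pointer to the first character. <code>sketch_length</code> is a hypothetical helper written for illustration, and I am not claiming that <code>wcout</code> is implemented this way internally. The standard library offers the same thing as <code>wcslen</code>.</p>

```cpp
#include <cstddef>

// Walks forward from the first character until it meets the terminating
// null character, which marks the end of every C string.
std::size_t sketch_length(const wchar_t* text) {
  std::size_t count = 0;
  while (text[count] != L'\0') {
    ++count;
  }
  return count;
}
```

<p>This is also why the return value of <code>FormatMessage</code> does not count the terminator: the terminator is not part of the text, it only marks where the text stops.</p>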

<p>Anyway, you’ve now seen some of the worst the Win32 API has to offer. And you’re still alive. Many of the functions you might want to call are far simpler than this.</p>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2009/10/a-net-developers-guide-to-c-part-iii/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A .NET Developers Guide to C++ (part II)</title>
		<link>http://jalf.dk/blog/2009/09/a-net-developers-guide-to-c-part-ii/</link>
		<comments>http://jalf.dk/blog/2009/09/a-net-developers-guide-to-c-part-ii/#comments</comments>
		<pubDate>Tue, 08 Sep 2009 13:19:56 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[.net]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[teaching]]></category>
		<category><![CDATA[win32]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=316</guid>
		<description><![CDATA[Welcome to the second installment in my guide of “what you need to know if you’re a .NET programmer who wants to be able to write C++ code and call native APIs”. It took me much longer to get this posted than I’d hoped. My work on my thesis has kept me more busy than [...]]]></description>
			<content:encoded><![CDATA[<p>Welcome to the second installment in my guide of “what you need to know if you’re a .NET programmer who wants to be able to write C++ code and call native APIs”. It took me much longer to get this posted than I’d hoped. My work on my thesis has kept me more busy than I’d originally expected. Sorry for the delay!</p>

<p>In <a href="http://jalf.dk/blog/2009/08/a-net-developers-guide-to-c/">part I</a>, I went through a minimal “Hello World” program in some detail, and attempted to explain the arcane workings of the C/C++ compilation model. Some may argue that this had no relevance to my target audience, but I think it is a necessary evil. Almost all C++ programmers get tripped up at some point by the difference between compiler and linker errors, and what exactly the <code>#include</code> directive actually <em>does</em>. Hopefully, by reading part I, you’ll be able to avoid this.</p>

<p>With that out of the way, we can get started on the interesting part, though. Part II will focus on actual C++ code. We won’t consider managed interop or even the Win32 API yet, though. This part will still take place in native C++-land only. In short, the purpose of this part is to enable you to write simple C++ programs, and more importantly, to <em>understand</em> the C++ sample code you probably run into from time to time.</p>

<p><span id="more-316"></span></p>

<p>I will <em>not</em> cover all the idioms and techniques that “real” C++ programmers use. We’ll settle for the bare minimum required to get by in a .NET-to-Win32 interop scenario where you really just want to write enough C++ code to call some native API function. This means that we won’t get the most robust, reusable, elegant or concise C++ code. But we <em>will</em> be able to get the job done.</p>

<p>I’d love to write a more detailed series of posts about “modern C++“<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup> some other time, but it is beyond the scope of this series of posts.</p>

<h1>Using C++</h1>

<p>Before we get into the Win32 API, let’s run through some slightly bigger C++ examples than the Hello World from part I. At the very least you’re going to need to know how to define and use classes, and a few useful components in the standard library.</p>

<p>You already know that it is possible to define class member functions outside classes, but you haven’t yet seen a nontrivial class definition. Let us try creating one. For the purposes of demonstration, I’ll implement the simplest class I can think of; a counter. It’ll simply contain an integer, and callers will be able to increment the value, and get the current value.</p>

<pre><code>class counter {
public:
  counter() : i(0) {}
  int current() {return i; }
  void update() { ++i; }
private:
  int i;
};
</code></pre>

<p>There, we now have a basic class. We can call it from this function. (I use <code>assert</code> to indicate expected values of variables, much like you would in a unit-test. Note that the asserts are pseudocode: among other things, I will access private class members with them, which obviously won’t work in reality.)</p>

<pre><code>// assume that we either placed the class definition here, or have a #include for the header in which the class is defined.
int main(){
  counter c;
  int i = c.current();
  assert(i == 0);
  c.update();
  assert(c.i == 1);
  assert(c.current() == 1);  
}
</code></pre>

<p>A ridiculously simple program, of course. But there are several things worth noting. In no particular order:</p>

<ul>
<li>Our counter object <code>c</code> is created without using <code>new</code>, and without explicitly calling a constructor. All C++ types are fundamentally similar to .NET’s value types, so <code>c</code> is not a reference to a counter, but instead a default-constructed instance of one, placed on the stack. If nothing else is specified, the default constructor is called when the variable is declared. (To call another constructor, had our class defined one, we could have done something like <code>counter c(1, "hello", 2.0f);</code>.)</li>
<li>the class definition is terminated by a semicolon. This is important to remember, as forgetting it can lead to very misleading compiler errors. I won’t get into <em>why</em> this semicolon is necessary here though. It is a long story, and it is caused by the need for C compatibility.</li>
<li>access specifiers are not applied per-member, but rather used to divide the class into sections. In a class, the default specifier is <code>private</code>. I tend to put my public members at the top of the class, to make it easier for readers to find the public interface. Further down, we have a <code>private</code> specifier, for hiding our int member. The valid access specifiers are <code>public</code>, <code>private</code> and <code>protected</code>, which each behave just like in C#. <code>internal</code> does not exist however, since there is no notion of assemblies, and the only way to share types between files is with <code>#include</code>s as mentioned in part I.</li>
<li>there is no clear, common naming convention in C++. The standard library uses lower-case, and separates words by underscores, as in <code>class_name</code>. Many programmers, however, use a convention similar to .NET’s, naming types <code>ClassName</code>, and variables <code>className</code>. There are no fixed rules, so as long as you are consistent it’s fine with me.</li>
<li>.NET has both classes and structs, and the two have very different meanings. In C++, both classes and structs exist as well, but their meaning is <em>almost</em> the same. The only difference between a class and a struct in C++ is that a struct defaults to <em>public</em> accessibility for members, where a class defaults to private. In other words, if I had defined the above as a struct, I could have omitted the <code>public:</code> line. (For this reason, I often find myself using structs. As I said, I tend to put the public interface at the top of the class, and add a <code>private:</code> section further down. However, it is not a big deal. A common rule of thumb is much the same as is used in C#: Structs are simple containers of data, where classes have behavior. My own style tends to be a compromise between the two. Classes with complex behavior are made classes for this reason, but in simpler borderline cases, I tend to prefer struct, even if it has <em>some</em> behavior. It makes no difference to the compiler, and it saves me a line of code, because it defaults to <code>public</code>, which is what I want at the top of my class anyway.)</li>
<li>the observant reader will probably have noticed a difference here compared to the example shown in part I. Back then we declared the member method without defining its body inside the class. Now we define the body inside the class. Both approaches are legal, and each has its pros and cons. In particular, defining the body outside the class leads to shorter class definitions, which may aid readability. On the other hand, defining functions inside the class leads to better locality: you only have to look in <em>one</em> place to learn all about the class. Further, the compiler is generally better able to optimize code if member methods are defined “inline”. For these reasons, people often put short functions of 2–3 lines or so inside the class, and define larger ones separately. There is one caveat, however. Functions defined <em>inline</em> (either by placing the full definition inside the class, or by marking the definition with the <code>inline</code> keyword) may have a definition in each translation unit. In other words, they may be placed in headers (where they’re seen by multiple translation units). Non-inline functions must only be defined <em>once</em>, and so generally have to be defined in a <code>.cpp</code> file, similar to what we did in part I.</li>
<li>the constructor looks a bit different than you may be used to. The <code>i</code> member is initialized via the <em>initializer list</em>, specified after the colon. This is similar to how you would call a base class constructor in C#, although instead of <code>base</code>, the member name is used. Also note that we have to explicitly initialize <code>i</code> because as a primitive type, it would otherwise not be initialized at all. The initializer list syntax is only legal in constructors, and should be used as much as possible. I’ll explain why in a moment.</li>
</ul>
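<p>A small sketch may help with the struct/class point from the list above. The names <code>sketch_point</code>, <code>sketch_celsius</code> and <code>sketch_sum</code> are made up for this illustration. The only thing separating the two kinds of type is which access level applies before any specifier is written.</p>

```cpp
// A struct: members are public until an access specifier says otherwise.
struct sketch_point {
  int x;
  int y;
};

// A class: members are private until an access specifier says otherwise.
class sketch_celsius {
  double degrees;  // private by default
public:
  explicit sketch_celsius(double d) : degrees(d) {}
  double value() const { return degrees; }
};

double sketch_sum() {
  sketch_point p = {1, 2};       // public members can be accessed directly
  sketch_celsius t(3.5);         // no `new`: the object itself lives on the stack
  return p.x + p.y + t.value();  // writing t.degrees here would not compile
}
```

<p>Swapping the <code>struct</code> and <code>class</code> keywords and adding explicit <code>public:</code> and <code>private:</code> lines would give exactly the same two types.</p>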

<h1>“Special” member functions</h1>

<p>We <em>could</em> have defined the constructor in a more familiar way:</p>

<pre><code>counter() {
  i = 0;
}
</code></pre>

<p>and in this simple case, it would have made no difference. In more complex classes, however, there is an important distinction: anything that happens in the constructor’s body happens <em>after members are initialized</em>. If the member is not specified in the initializer list, it is <em>default initialized</em>, which means that for primitive types, <em>nothing</em> happens, they just contain random garbage values, and for classes defining a default constructor, it gets called <em>before</em> the constructor’s body is evaluated, in which we assign the <em>actual</em> value we want the member to contain.</p>

<p>So yes, for our simple <code>int</code> case, we might as well have written the constructor without using the initializer list. But consider what would have happened if the member had been some complex user-defined class. Instead of simply constructing the object with the right value to begin with, we would have default-constructed it, and <em>then</em> executed an assignment. This would obviously have been less efficient than simply constructing the object correctly in the first place.</p>

<p>But just as importantly, some types <em>can not</em> be assigned to once they are initialized. Likewise, some types may not have a default constructor, in which case failure to use the initializer list to explicitly call another constructor will result in a compiler error! So in general, the initializer list should be preferred both from performance and correctness concerns. A side effect of the initializer list is that the actual body of constructors can often be left empty.</p>
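<p>Here is a minimal sketch of both cases, using made-up classes. <code>sketch_label</code> has no default constructor, and <code>limit_</code> is <code>const</code>, so neither member could be given its value in the constructor’s body. Moving either initialization from the initializer list into the body would stop the class from compiling.</p>

```cpp
class sketch_label {
public:
  explicit sketch_label(int id) : id_(id) {}  // the only constructor: no default one
  int id() const { return id_; }
private:
  int id_;
};

class sketch_widget {
public:
  // Both members have to be initialized here, before the body runs.
  sketch_widget(int id, int limit) : label_(id), limit_(limit) {}
  int label_id() const { return label_.id(); }
  int limit() const { return limit_; }
private:
  sketch_label label_;  // cannot be default-constructed
  const int limit_;     // cannot be assigned to once initialized
};

int sketch_widget_demo() {
  sketch_widget w(7, 3);
  return w.label_id() * 10 + w.limit();
}
```

<p>Note that the constructor body of <code>sketch_widget</code> is empty, as predicted.</p>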

<p>In .NET, there is a distinction between <em>value</em> and <em>reference</em> types, and the behavior of the assignment operator is completely different for each of the two cases. <code>x = y</code> for values of a reference type simply stores a <em>reference</em> to <code>y</code> into <code>x</code>. But if the two types are value types, a complete copy is created instead.</p>

<p>In C++, <em>all</em> variables obey value semantics. <code>x = y</code> will always copy the <em>value</em> <code>y</code> into <code>x</code>. This is why I said when discussing the constructor’s initializer list that an extra assignment may be expensive.</p>
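<p>A tiny sketch of what that means in practice, with a made-up <code>sketch_tally</code> type. After the assignment, the two variables hold equal values but remain two separate objects:</p>

```cpp
struct sketch_tally {
  int count;
};

int sketch_copy_demo() {
  sketch_tally x = {1};
  sketch_tally y = {5};
  x = y;         // copies the value of y into x
  y.count = 9;   // changing y afterwards has no effect on x
  return x.count;
}
```

<p>Had <code>sketch_tally</code> been a C# class, the last write would have been visible through <code>x</code> as well. Here, <code>x.count</code> is still 5.</p>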

<p>Since the plain value semantics as used by C# would be both inflexible and inefficient, C++ provides a number of tools for controlling the behavior of your class. In particular, you can define a <em>copy constructor</em> and an <em>assignment operator</em> to override exactly how assignment should be performed. The following demonstrates what they may look like.</p>

<pre><code>class counter {
public:
  counter(const counter&amp; other) : i(other.i) {} // copy constructor
  counter&amp; operator= (const counter&amp; other) { // assignment operator
    if (this == &amp;other) {return *this; }
    i = other.i;
    return *this; // every path has to return, not just the self-assignment case
  }
  ....
};
</code></pre>

<p>Perhaps the first thing we should mention is the meaning of the <code>&amp;</code> character. It is used to denote a <em>reference</em>, essentially an alias for a variable. It is related to pointers (see <a href="http://jalf.dk/blog/2009/07/the-great-pointer-conspiracy/">this post</a> for a more detailed explanation of pointers), but is simpler and more limited. In particular, it can not be reseated. Once it is initialized, it is an alias for the variable it points to <em>forever</em>. Also, unlike pointers, there is no special syntax for <em>using</em> a reference:</p>

<pre><code>int i; // create an integer.
int&amp; r = i; // create a reference as an alias of i. Note that we simply assign i, unlike with pointers where we would have had to take the address of i first with the `&amp;` operator.
r = 42; // assign 42 to whatever the reference points to. Again, no special syntax. There is nothing here to tell us that r is a reference.
int j = 13; // create another integer
r = j; // assign it to our reference. The effect of this is *not* to make r point to j (as would have happened had it been a pointer), but simply to assign the value of j to i. In other words, i will now equal 13, and r will still point to i.
</code></pre>

<p>Because a reference can not be reseated, it is also a nice example of a case where the constructor’s initializer list <em>must</em> be used. Imagine a class which has a reference member. References <em>must</em> point to something, so they have no default constructor. And once they are initialized to point to an object, they <em>always</em> point to that object. In other words, it must be constructed <em>before</em> the constructor’s body is executed, which means in the initializer list. Failure to do so simply won’t compile.</p>
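<p>As a sketch, here is a made-up class with a reference member. The initializer list is the only place where <code>target_</code> can be bound to something:</p>

```cpp
class sketch_observer {
public:
  // The reference is bound here, once, and stays bound for the object's lifetime.
  explicit sketch_observer(int& target) : target_(target) {}
  void bump() { ++target_; }  // modifies the variable the reference is an alias for
private:
  int& target_;
};

int sketch_observer_demo() {
  int value = 1;
  sketch_observer watcher(value);
  watcher.bump();
  watcher.bump();
  return value;  // changed through the reference member
}
```

<p>Writing <code>target_ = target;</code> in the constructor body instead would be rejected by the compiler, since the reference would already have to exist, uninitialized, by the time the body runs.</p>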

<p>Now to explain the copy constructor, which is fairly simple. It is simply a constructor which takes one argument, a <em>const reference</em> to the type itself. Copy constructors are commonly used to initialize class members with copies of the arguments passed to the “outer” class’ constructor. In the copy constructor above, we also copy-construct <code>i</code>, for example. (The value of <code>other.i</code> is copied into our own <code>i</code>)</p>

<p>The assignment operator is a bit trickier.
The first line inside it tests for assignment to itself. (As would happen in <code>x = x</code>). This may not have been a problem in this simple class, but in more complicated ones, self-assignment can cause problems, as you will be reading data from the same object you’re writing to.  We also note that instead of simply comparing <code>this</code> to <code>other</code> in the test, we use <code>&amp;other</code>. We wish to check that <code>this</code> and <code>other</code> refer to the same object instance, not just that they contain the same value. To achieve this, we need to compare pointers. <code>this</code> is already a pointer<sup id="fnref:2"><a href="#fn:2" rel="footnote">2</a></sup>, but <code>other</code> is a reference, so we have to take the address of it first. Because a reference is essentially an alias for the referenced value, the address-of operator returns the address of the referenced value, not of the reference itself.</p>

<p>Next, note that the assignment operator does not have an initializer list, but instead performs the copying in the function body. The reason for this is obvious: It is not a constructor, so all its members are already initialized. An initializer list would not make sense, and is not allowed by the language. This also means that here, <code>i</code>’s assignment operator is invoked, rather than its copy constructor, as was used in the previous example. (Technically, built-in types have neither assignment operator nor copy constructor. However, the same syntax is allowed, it simply uses the obvious built-in operations.)</p>

<p>A final note about assignment operators and copy constructors is that if <code>=</code> is used to <em>declare</em> a variable, the copy constructor, and not the assignment operator, is called. As I said before, these functions are special and known to the compiler, and so, can be invoked in special cases. That is, if you are given variables <code>c</code> and <code>d</code> of type <code>counter</code>, then <code>c = d</code> calls the assignment operator on <code>c</code>, because <code>c</code> is <em>already</em> initialized. But if it had instead been <code>counter c = d</code>, then <code>c</code> would have been <em>initialized</em> as a copy of <code>d</code>, and so its <em>copy constructor</em> would have been used. The compiler ensures this, even if you use assignment syntax in the initialization of a variable.</p>
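<p>If you would rather see this than take my word for it, a made-up <code>sketch_tracer</code> type can record which of the two functions gave it its value:</p>

```cpp
#include <string>

struct sketch_tracer {
  std::string how;  // records how this object last received its value
  sketch_tracer() : how("default constructor") {}
  sketch_tracer(const sketch_tracer&) : how("copy constructor") {}
  sketch_tracer& operator=(const sketch_tracer& other) {
    if (this != &other) { how = "assignment operator"; }
    return *this;
  }
};

std::string sketch_declare_with_equals() {
  sketch_tracer d;
  sketch_tracer c = d;  // a declaration: the copy constructor runs, despite the '='
  return c.how;
}

std::string sketch_assign_existing() {
  sketch_tracer d;
  sketch_tracer c;      // c is initialized first...
  c = d;                // ...so this is a true assignment
  return c.how;
}
```

<p>The first function reports the copy constructor, the second the assignment operator, even though both contain <code>c = d</code> in some form.</p>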

<p>Finally we get to another dreaded C++ construct: the <em>destructor</em>.
This is automatically called when the object is destroyed, and can be defined thusly:</p>

<pre><code>class counter {
public:
  ~counter(){
    std::cout &lt;&lt; i &lt;&lt; std::endl;
  }
  ....
};
</code></pre>

<p>The syntax is similar to finalizers in C#, but the effect is somewhat different. The destructor is invoked <em>instantly</em> when an object is deleted, and it is <em>guaranteed</em> to be called. In our case, we simply use it to print out the counter value.</p>

<p>Let’s try using these new functions and operators:</p>

<pre><code>int main(){
  counter c; // use the default constructor to create a counter
  c.update(); // increment its value
  assert(c.i == 1);
  counter d(c); // use the copy constructor to create a new copy of our existing counter.
  assert(d.i == 1);
  d.update();
  assert(c.i == 1); // our copy constructor made sure to create a *new* counter variable, so c is not affected by changes to d, and vice versa
  assert(d.i == 2); 
  c = d; // since c has already been initialized, the assignment operator is used to copy d into c.
  assert(c.i == 2);
} // at this point, both c and d go out of scope, and so their destructors are called. Destructors are always called in the opposite order of construction, so d's destructor will be invoked first.
</code></pre>
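<p>The order in which local objects are destroyed can be observed directly. In this sketch, a made-up <code>sketch_noisy</code> class appends its name to a shared log from its destructor:</p>

```cpp
#include <string>

class sketch_noisy {
public:
  sketch_noisy(char name, std::string& log) : name_(name), log_(log) {}
  ~sketch_noisy() { log_ += name_; }  // leave a trace when destroyed
private:
  char name_;
  std::string& log_;
};

std::string sketch_destruction_order() {
  std::string log;
  {
    sketch_noisy c('c', log);
    sketch_noisy d('d', log);
  }  // leaving the scope destroys d first, then c
  return log;
}
```

<p>The log ends up as <code>"dc"</code>: the object constructed last is destroyed first.</p>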

<p>All three functions are auto-generated by the compiler, if not declared explicitly. (The one exception is the assignment operator, which it may not be possible to auto-generate. If a class contains a member with no assignment operator, or a reference (which can not be reseated), the compiler will fail to generate an assignment operator, and all attempts to perform assignment will fail if one is not explicitly defined by the user.)</p>

<p>The trio of copy constructor, assignment operator and destructor are sometimes called “the big three”, or we may speak of “the rule of three”. This is a rule of thumb that if you find yourself implementing one of these three special functions, you almost certainly should also implement the other two. The reasoning is pretty simple: The assignment operator and copy constructor are related — both are used to copy an object. If special care has to be taken when copying, then it should probably be defined for both these functions.</p>

<p>Further, if copying requires nontrivial handling, then it is a good bet that the class manages some kind of resource or contains data which requires special care in the destructor as well. Perhaps a pointer pointing to dynamically allocated memory, which must be deleted, or perhaps it should decrement a global counter used to count the number of live instances of the class. Or perhaps it is a file handle which must be closed. The fact that we had to implement special handling when copying is a strong hint that there will probably also be special handling required when cleaning up in the destructor.</p>

<p>And the converse is also true. If the destructor has to do something special, it must be because the class owns some kind of resource that must be released. And if it owns a resource, then we should ensure that the resource gets copied when the class itself does. So we should probably define copy constructor and assignment operator as well.</p>
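<p>To tie the three together, here is a sketch of a class that owns a dynamically allocated buffer. <code>sketch_buffer</code> is made up for illustration and is not how you should store strings in real code (that is what <code>std::string</code> is for), but it shows why all three functions are needed at once.</p>

```cpp
#include <cstddef>
#include <cstring>

class sketch_buffer {
public:
  explicit sketch_buffer(const char* text)
    : size_(std::strlen(text) + 1), data_(new char[size_]) {
    std::memcpy(data_, text, size_);
  }
  // Copy constructor: allocate our own buffer rather than sharing other's.
  sketch_buffer(const sketch_buffer& other)
    : size_(other.size_), data_(new char[other.size_]) {
    std::memcpy(data_, other.data_, size_);
  }
  // Assignment operator: copy first, then release what we previously owned.
  sketch_buffer& operator=(const sketch_buffer& other) {
    if (this == &other) { return *this; }
    char* fresh = new char[other.size_];
    std::memcpy(fresh, other.data_, other.size_);
    delete[] data_;
    data_ = fresh;
    size_ = other.size_;
    return *this;
  }
  // Destructor: release the buffer we own.
  ~sketch_buffer() { delete[] data_; }

  void set_first(char c) { data_[0] = c; }
  const char* c_str() const { return data_; }
private:
  std::size_t size_;  // declared first, so it is initialized before data_
  char* data_;
};

bool sketch_copies_are_independent() {
  sketch_buffer a("hello");
  sketch_buffer b(a);   // copy constructor
  b.set_first('j');
  sketch_buffer c("x");
  c = b;                // assignment operator
  return std::strcmp(a.c_str(), "hello") == 0
      && std::strcmp(b.c_str(), "jello") == 0
      && std::strcmp(c.c_str(), "jello") == 0;
}
```

<p>With the compiler-generated versions, <code>a</code> and <code>b</code> would share one buffer, and both destructors would try to delete it.</p>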

<h1>POD types</h1>

<p>A final note about classes may be worth mentioning. C had no classes, only simple structs containing values, but no member functions, and without allowing inheritance or access specifiers. Since C++ was designed to be (mostly) backwards-compatible, such types have a special status in C++. In the above, I mentioned “primitive types” a few times. While an <code>int</code> is technically a primitive type (all built-in types are considered primitive types), the behavior I described is actually common to all <em>POD</em> (Plain Old Data) types. A POD type is essentially a type that would have been legal in C — in other words, it is either a built-in (primitive) type, or a class or struct where</p>

<ul>
<li>all members are public</li>
<li>no member methods exist</li>
<li>no constructor, copy constructor, assignment operator or destructor is defined</li>
<li>no base classes exist</li>
<li>All members are POD types as well</li>
</ul>

<p>Such POD types are given special treatment in many ways. For example, they may be treated as “raw memory”. The standard-library C function <code>memcpy</code>, which simply copies a number of bytes from one location to another, may be used to copy POD types, but not non-POD classes. The reason for this is that non-POD types may have extra behavior that would break if this was done. As an obvious example, if we created a copy in this way, we would bypass the assignment operator/copy constructor, but we would end up with two objects, both of which would have their destructors called when deleted — so we would end up with a mismatch where the destructor is called more often than the constructors, a clear error if the class implements reference-counting, for example.</p>
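<p>For a POD type, such a raw copy looks like this. <code>sketch_pod</code> and the two functions are made-up names for the purpose of the sketch:</p>

```cpp
#include <cstring>

// Only public data members of built-in types: a POD struct.
struct sketch_pod {
  int id;
  double weight;
};

// Copies the object byte by byte, without involving any member function.
sketch_pod sketch_raw_copy(const sketch_pod& source) {
  sketch_pod target;
  std::memcpy(&target, &source, sizeof(sketch_pod));
  return target;
}

bool sketch_pod_demo() {
  sketch_pod a = {3, 1.5};
  sketch_pod b = sketch_raw_copy(a);
  return b.id == 3 && b.weight == 1.5;
}
```

<p>Doing the same to a class that owns a resource would duplicate the pointer without duplicating what it points to, which is exactly the mismatch described above.</p>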

<p>Another peculiarity of POD types is that they are not initialized unless a constructor is explicitly called. This is why we had to initialize <code>i</code> in our constructor above. As a POD type, <code>i</code> would otherwise contain whatever garbage value was found in memory. The same is true for POD structs. They too contain garbage if not explicitly initialized by calling a constructor:</p>

<pre><code>int i; // no initialization occurs
int j = int(); // explicitly request default initialization -- for POD types, this is done by setting all members to zero. (Beware that `int j();` would instead declare a function named j.)
</code></pre>

<p>In other words, had our <code>counter</code> class stored a non-POD member, the initializer list would not have been necessary. Its member would automatically be default-constructed if nothing else was specified. But POD types do not have that extra behavior, so if nothing else is specified, they simply don’t get initialized.</p>

<h1>Enough about classes</h1>

<p>There are a few other nitty-gritty details about the language we should discuss. You may have already wondered about one or two of them. So without further ado,</p>

<ul>
<li>variable declaration is usually done <em>without</em> using <code>new</code>. The <code>new</code> operator allocates memory on the heap, and returns a pointer to the newly created object. Since there is no garbage collector, we have to manually call <code>delete</code> on this pointer to free the memory. This is the source of C++’s reputation as a playground for memory leaks. Of course the astute reader will have noticed that so far, I haven’t used <code>new</code> and <code>delete</code> even once. The truth is that these can often be avoided or hidden, thus removing all possibility of memory leaks. Any variable declared <em>without</em> using <code>new</code> is declared “locally”: if it is declared in a function, it becomes a local variable, and is destroyed when we leave the scope in which it is declared. If it is a class member, it is destroyed when the owning object is destroyed. If it is declared inside a loop, it is destroyed when we leave the loop body. In other words, variables declared without <code>new</code> have “automatic storage duration”, and in fact, <code>int i = 42</code> could also be written as <code>auto int i = 42</code>. The auto keyword indicates exactly this, that the lifetime of the variable is <em>automatic</em>. Since this is the default, the keyword is never actually used, but it exists, and this is what it means. (A note for readers on newer compilers: since C++11, <code>auto</code> requests type deduction instead, and <code>auto int i</code> no longer compiles.) And just to clear up any doubts, variables with automatic storage duration are destroyed when we leave the scope they were declared in, <em>no matter how</em> we leave it. It doesn’t matter if we return from the function, or if an exception is thrown. In both cases, the local variable’s destructor is called.</li>
<li>Just to avoid confusion, we’d better look at a quick example of using <code>new</code>: Consider this line of code: <code>counter* p = new counter()</code>. Here, we allocate an object of our <code>counter</code> class on the heap, with dynamic storage duration, but we <em>also</em> declare a local variable — the pointer <code>p</code>.  The pointer is a local variable with automatic storage duration. In other words, the pointer itself will be freed just fine when we leave the function — but the dynamically allocated <code>counter</code> to which it points will <em>not</em>. This is how memory leaks occur. Once <code>p</code> gets destroyed, we no longer have a pointer to the dynamically allocated memory, so we can never free it.</li>
<li>Avoiding cyclic dependencies can take a bit of work, since C++ code is read by the compiler from top to bottom. It won’t let a function or class refer to another which hasn’t been declared yet. Sometimes, this can be solved through refactoring, by splitting the code we need to refer to out into a separate class which can be declared first. But another trick is to use <em>forward declarations</em>. You have already seen it used for the class member method in part I. We can declare a function without specifying its body. This tells the compiler that the function <em>exists</em>, which means we can call it safely. So if we put such a declaration at the top of a file, we can provide the actual definition including the body at the end of the file, after whatever classes or functions we need to refer to. For classes, we can do a similar trick, and simply declare <code>class counter;</code>. As with the function case, this tells the compiler that <code>counter</code> is a class, and that it <em>does</em> exist. The definition just isn’t shown yet. This won’t let you access class members yet (since the compiler still doesn’t know which members it has), and you can’t declare variables of that type yet (because the compiler doesn’t know which, if any, constructor to call, and it doesn’t know the size of the class). But you <em>can</em> create references and pointers to the class.</li>
<li>C# uses function overloading to allow for functions where some parameters may have sensible default values. If we have a function taking parameters <code>a</code> and <code>b</code>, we can create an overload which takes only <code>a</code>, and provides a default value for <code>b</code>. The same can be done in C++, but you <em>also</em> have the option of providing default values. The function <code>void foo(int i = 0) {std::cout &lt;&lt; i &lt;&lt; std::endl; }</code> can be called just with <code>foo()</code>, and will print out <code>0</code>. If you are more comfortable with overloading, you may not need to use default parameters, but you may still encounter third-party code which uses them, so you should be familiar with the syntax.</li>
</ul>

<h1>The standard library</h1>

<p>We’re nearing the end. The last thing you should know about C++ before I let you run loose is a few standard library classes.
The C++ standard library is very small compared to .NET or Java’s class libraries, but it is also widely considered C++’s main saving grace: most people consider the language an overcomplicated mess in many ways, but the standard library stands out, both as an example of C++ done <em>right</em>, and as a redeeming feature which transforms C++ into a powerful and elegant language<sup id="fnref:3"><a href="#fn:3" rel="footnote">3</a></sup>. Or more precisely, <em>part of</em> the standard library possesses these qualities.</p>

<p>In the following I’ll briefly sketch out the main parts of the standard library, and explain a few useful classes. For more general information, Microsoft has some excellent documentation for all parts of the standard library <a href="http://msdn.microsoft.com/en-us/library/cscc687y.aspx">here</a>.</p>

<p>The standard library has been assembled piecemeal over the years, and as such, represents several different styles and paradigms. The oldest parts of it are simple functions carried over from C’s standard library. I have already mentioned two of these, <code>printf</code> and <code>memcpy</code>, but of course many others exist.</p>

<p>After these came the first C++-specific additions, in the form of the <code>iostreams</code> library. You have also encountered a few members of this, in <code>cout</code>, <code>cin</code> and <code>endl</code>, as well as the <code>operator&lt;&lt;</code> used for streaming. This library is, honestly, not very nice. It does the job for simple Hello World-like applications, but it is inflexible, inefficient, overcomplicated and hard to extend. In fact, many C++ programmers stick to <code>printf</code> over <code>cout</code> despite all the disadvantages I listed in part I. Of course, <code>iostreams</code> also contains file streams as well as some other basic stream functionality. A related addition is the <code>string</code> class, and the locale facilities.</p>

<p>These all have one thing in common: they are very old-fashioned and are, today, considered far from ideal. The <code>string</code> class got some last-minute surgery when it was added to make it a bit more modern, and a few additions were made to the stream classes as well, but overall, these are relics from the era of “C with classes”.</p>

<p>Finally, the star of the show is the Standard Template Library, or the STL for short. This remarkable library completely changed how the language was used, and is definitely worth exploring. I won’t ramble on about it here, but I will mention that one of its characteristics is that it almost completely abandons traditional Object-Oriented programming (which <code>iostreams</code> used heavily), in favor of the less known and almost C++-specific paradigm <em>Generic Programming</em>.</p>

<p>The STL consists of three distinct “pillars”:</p>

<ul>
<li>Container classes are the equivalents of .NET’s System.Collections.Generic classes. They store sequences of data, and little else.</li>
<li>Iterator classes are superficially similar to .NET’s IEnumerator. They allow traversal over a container, but where .NET only allows traversal from the beginning to the end, C++ iterators also allow reversed iteration (from end to beginning), as well as traversal over subsets of the container (from the 6th to the 12th element, for example). Pairs of iterators are often used to mark sequences for further processing. Individual iterators are often used as “markers” into a sequence.</li>
<li>Algorithm functions work on iterators, or a pair of iterators, and perform almost all sequence processing. Sorting, searching, copying, <code>for_each</code>, accumulating values or any other algorithm involving sequences of data is implemented as an algorithm working on iterators.</li>
</ul>

<p>The clever part about this setup is that algorithms and containers know nothing of each other. An algorithm works on iterators, <em>wherever they come from</em>. It works whether the iterators are pointers into an array, into a linked list, or perhaps even into a stream or a database. As long as the iterator implements the appropriate functionality, it can be used by the algorithms. This allows for a degree of reusability that would have been impossible in .NET. The same <code>find</code> function for example, works on all of the standard container classes, <em>in addition</em> to working on any iterators you define yourself. As long as they fulfill a few basic requirements, you get <code>find</code>, <code>sort</code> and many other common operations for free.</p>

<p>And again unlike .NET, there is no interface you have to implement to create a new iterator type, or, for that matter, a new container class. The STL relies on a form of <a href="http://en.wikipedia.org/wiki/Duck_typing">Duck Typing</a> (if it looks like a duck, and walks like a duck, and quacks like a duck, it must be a duck). This means that an iterator is not “a class which implements <code>IIterator&lt;T&gt;</code>” or anything like that, but simply “a type <code>T</code> for which the following expressions are defined, given an object <code>x</code> of type <code>T</code>: <code>++x</code>, <code>*x</code>, <code>T()</code> and a few others”. In other words, if a type defines a default constructor and a few operators, <em>then it is an iterator</em>, and it’ll work seamlessly with the rest of the STL. In fact, raw pointers are valid iterators as well.</p>

<p>In .NET, every collection class has to define its own search function, and there is no elegant way to decouple it completely. (We could define the function in a static helper class, but it would still be working on something specific like an <code>IList</code>, rather than just <em>any</em> sequence). In C++, the function <code>std::find</code> works on <em>any</em> pair of iterators.</p>

<p>While iterators and algorithms are key to “modern C++”, I will focus on the containers here, as they can be used with little explanation, and are almost indispensable (just like you wouldn’t want to program in C# without the <code>List&lt;T&gt;</code> class).</p>

<p>The equivalent of .NET’s <code>List&lt;T&gt;</code> class is the <code>vector</code>:</p>

<pre><code>#include &lt;cassert&gt;
#include &lt;vector&gt;

int main() {
  std::vector&lt;int&gt; v;
  v.push_back(1);
  v.push_back(2);
  v.push_back(3);
  v.push_back(42);
  v.pop_back();
  // v now contains the values [1, 2, 3]
  v.resize(5); // resize to contain 5 elements
  // v now contains [1, 2, 3, 0, 0]
  assert(v[1] == 2);
  v[3] = 42;
  // v now contains [1, 2, 3, 42, 0]
  int&amp; r = v[0]; // create a reference to the first element
  int* p = &amp;v[0]; // create a pointer to the first element
}
</code></pre>

<p>Pretty straightforward. And again, note that we’ve managed to create an arbitrary number of objects in our application, without even once having to call <code>new</code>. This also means that there is <em>no</em> way in which this application can leak memory (short of bugs in the compiler or standard library).</p>

<p>There are a couple of caveats to be aware of though:</p>

<ul>
<li>There is typically no bounds-checking on the <code>[ ]</code> operator. This doesn’t mean it is legal to do <code>v[999]</code> above, it just means that there is no guarantee of what will happen if you do it. It is <em>undefined behavior</em>.</li>
<li>Pointers and references to individual elements within a vector may be <em>invalidated</em> when we add elements to the vector. Like C#’s <code>List&lt;T&gt;</code>, it is a dynamic array, and grows as necessary. Each time it grows beyond its current capacity, it allocates a new array, copies the contents into that, and then frees the old array. A pointer to data in the old array is therefore no longer valid. The same applies to iterators. Any iterator pointing into a vector is invalidated if the vector reallocates.</li>
</ul>

<p>Because a vector guarantees that its data is stored contiguously, essentially as an array, we can use this class instead of an array when interfacing with old C code (which only has pointers and arrays, but no vectors). In the above, the variable <code>p</code> could be passed to a C function as a pointer to the beginning of an array of <code>int</code>s. We still have to be careful of course. The function must not be allowed to write past the end of the array.</p>

<p>Other container classes are the map (roughly equivalent to .NET’s <code>Dictionary&lt;Key, Value&gt;</code>: <code>std::map&lt;Key, Value&gt;</code> in the <code>map</code> header), and the set (no equivalent in .NET 2.0, although <code>HashSet&lt;T&gt;</code> in 3.5 is similar; it works much like a <code>map</code> without the <code>Value</code> parameter: <code>std::set&lt;T&gt;</code> in the <code>set</code> header). Their use is pretty much as you would expect.</p>

<p>In general, I would discourage you from using arrays. Prefer vectors instead, and if an API expects a pointer to an array, pass it a pointer to the first element of the vector instead, as shown in the previous example. Vectors are safer and simpler to work with.</p>

<h1>Strings</h1>

<p>A final pair of classes worth mentioning are <code>std::string</code> and <code>std::wstring</code>. C++ has no built-in string type, and so to work with strings, you have to include the <code>string</code> header, and use these classes. A <code>string</code> is simply a string of <code>char</code>s, single-byte characters. A <code>wstring</code> is a string of <code>wchar_t</code>s, or <em>wide characters</em>. On Windows, these are 16 bits wide, and use the UTF-16 encoding, allowing them to be used for Unicode strings.</p>

<p>These classes behave much as you would expect, so I won’t discuss them further. Instead I’ll skip to a related point of confusion: C has no string type <em>at all</em>. Instead, <code>char</code> pointers (or <code>wchar_t</code> pointers) are used as primitive strings.</p>

<p>A C-string is simply a sequence of characters, terminated by a null character (<code>'\0'</code>). If the null character is left out, all C string functions will just assume that the string continues <em>until</em> a null character happens to be found. This is obviously extremely fragile and a common source of bugs, but it’s an unavoidable fact of life when interfacing with C code.</p>

<p>This also rears its head when working with string literals. <code>"hello world"</code> does <em>not</em> have type <code>std::string</code> in C++. It has type <code>const char[12]</code>, that is, an array of 12 const characters. (Note that the string is only 11 characters long. The compiler automatically generates the terminating null, and sets aside space for this as well).</p>

<p>Arrays in C and C++ are very primitive and fragile things, and implicitly <em>decay</em> into pointers when needed. Whenever you have an array, you can assign it to a pointer, and the pointer will automatically point to the beginning of the array. Because arrays are so limited (a function can not return an array or take an array as argument either), arrays are often passed around <em>as</em> pointers, and in fact, pointers can be treated much like arrays as well. Given a pointer <code>p</code>, <code>p[2]</code> is legal, and is equivalent to <code>*(p+2)</code>. But because it is just a pointer, the <em>size</em> of the array isn’t known. It is up to the programmer to keep track of that.</p>

<p>Getting back to strings, the way arrays can decay into pointers means that this is legal: <code>const char* str = "hello world"</code>. The pointer <code>str</code> now points to the <em>statically allocated</em> array of characters “hello world”, and for all practical purposes, <code>str</code> is now a C-string.</p>

<p>To create a wide string literal, the string is prefixed with an ‘L’, as in <code>const wchar_t* wstr = L"hello world"</code>.</p>

<p>Because C-style strings are used in most API’s, you often need to convert between this and the C++ string class. This can be done as in the following:</p>

<pre><code>const char* str = "hello world";
std::string str2 = str; // an implicit conversion exists from char pointer to string, so 'std::string str2 = "hello world";' would also have worked.
const char* str3 = str2.c_str(); // the c_str() member method on the string class returns a C-style string.
</code></pre>

<p>Because string literals are C-style strings, there are a few pitfalls to be aware of when using them:</p>

<pre><code>const char* str = "hello worl";
const char* str2 = str + str; // #1
str += 'd'; // #2
</code></pre>

<p>In line <code>#1</code>, we get a compile error. Because <code>str</code> is just a pointer, adding two of them is not defined, and so the compiler chokes.
A related example is in <code>#2</code> where we try to add a character to the string. This compiles, perhaps surprisingly, but it won’t do what you expect. Instead, the <code>char</code> gets converted to an <code>int</code>, and <em>added to the value of the pointer</em>. So the result is a pointer that has been moved <code>'d'</code> characters (100, in ASCII) past the beginning of the string, far beyond the end of the 11-byte array.</p>

<p>For these operations to work, we must have a proper C++ string:</p>

<pre><code>std::string str = "hello ";
str += "worl";
std::string str2 = str + 'd';
</code></pre>

<p>will work as expected, and result in the string “hello world”.</p>

<p>You now know all you need to know about C++ to use it without shooting yourself in the foot <em>too much.</em> You also know enough to read a lot of the code snippets you’re likely to find online. And you’ve got a starting point for searching out more information should you wish to.</p>

<p>In the next installment, we will finally get to interfacing with the Win32 API. You may want to play around a bit with the compiler to make sure you understand pointers and C-style strings in particular, as we’re going to need those quite a bit. As I mentioned in part I, the Windows API is a C API, and an ugly, inconsistent one at that. It’s not a bad idea to make sure you’re somewhat comfortable with the basics of the language before trying to grapple with it.</p>

<div class="footnotes">
<hr />
<ol>

<li id="fn:1">
<p>“Modern C++” is not just a random name. It is a style of C++ programming named after Alexandrescu’s book, <a href="http://www.amazon.com/Modern-Design-Programming-Patterns-Depth/dp/0201704315">Modern C++ Design</a> — there are fundamentally two ways to program in C++. One style is often, and somewhat derisively, called “C with classes” — implying that it is used in much the same way one would program in C, but with the addition of classes, member methods and public/private access specifiers. The other, superior, approach is “Modern C++”. C with classes is often what beginners encounter, and perhaps surprisingly, what Java and C# are based upon — meaning that programmers coming from these languages tend to settle on an obsolete and sub-optimal style. I often make a point of teaching newcomers “proper” modern C++, but this is not the place. The goal of <em>this</em> series of posts is not to teach <em>good</em> C++ practices, but simply to enable .NET programmers to talk to native API’s. <a href="#fnref:1" rev="footnote">↩</a></p>
</li>

<li id="fn:2">
<p>Unfortunately, there is no particularly good reason for this. <code>this</code> <em>should</em> have been a reference. That would have made much more sense. However, when <code>this</code> was added to the language, references did not yet exist, so it had to be a pointer. And later, when references were added, changing <code>this</code> to a reference would have broken backwards compatibility. <a href="#fnref:2" rev="footnote">↩</a></p>
</li>

<li id="fn:3">
<p>Bjarne Stroustrup, the designer of C++, once said that “Within C++, there is a much smaller and cleaner language struggling to get out.” <a href="#fnref:3" rev="footnote">↩</a></p>
</li>

</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2009/09/a-net-developers-guide-to-c-part-ii/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>A .NET Developers Guide to C++</title>
		<link>http://jalf.dk/blog/2009/08/a-net-developers-guide-to-c/</link>
		<comments>http://jalf.dk/blog/2009/08/a-net-developers-guide-to-c/#comments</comments>
		<pubDate>Mon, 17 Aug 2009 17:49:10 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[.net]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[teaching]]></category>
		<category><![CDATA[win32]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=184</guid>
		<description><![CDATA[One of my coworkers is essentially a self-taught programmer, but he is interested in, and wants to learn, absolutely everything. A year or two back, he asked me to give him a crash course in C++, because he felt it was a problem that whenever he needed to do something that required functionality not exposed [...]]]></description>
			<content:encoded><![CDATA[<p>One of my coworkers is essentially a self-taught programmer, but he is interested in, and wants to learn, absolutely everything. A year or two back, he asked me to give him a crash course in C++, because he felt it was a problem that whenever he needed to do something that required functionality not exposed by the .NET framework, he essentially hit a wall.</p>

<p>So we took an afternoon out to run through some basic C++ code, and while we had fun doing it, and I’m pretty sure he found it interesting, it didn’t really achieve the goal of making him comfortable with writing small C++ programs to communicate with native APIs such as the Windows one.</p>

<p><span id="more-184"></span></p>

<p>Afterwards, I realized that the reason for our failure was that we hadn’t really made it clear what we were trying to achieve. He might have been interested in C++ in general, but what he actually <em>needed</em> was something a bit simpler: Being able to call native (primarily Win32) APIs.</p>

<p>Of course, the difference between these two is not obvious. In the .NET world, the two would basically have been the same thing. In learning C# (or another .NET language), you also learn to interface with .NET APIs, and if you need to interface with these APIs, you have to learn a .NET language.</p>

<p>In the case of C++ and native APIs, the situation is a bit different. Learning the language does not guarantee proficiency with using native APIs, and native APIs can be used without knowing C++.</p>

<p>So this series of posts is going to be my second attempt at teaching a .NET developer to at least be able to set up a basic native application, and more importantly, to call a function in the Win32 API.</p>

<p>The following is <em>not</em> a completely general introduction to C++. If you actually intend to learn and use the C++ language, there are many better texts to follow. I might even write my own attempt one day.</p>

<p>In this series of posts, I will</p>

<ul>
<li>assume familiarity with programming in .NET or another managed platform (such as Java). You’ll probably be able to get by as well if you’re coming from another high-level language such as Python or Ruby, as long as you can understand the basic syntax of the C family of languages.</li>
<li>leave out a lot of things a “dedicated” C++ programmer should know. The goal is not to turn the reader into a professional C++ developer, but simply to break down the wall and enable you to make occasional forays into native-land to call an API function or two before heading back to your favorite language.</li>
</ul>

<h1>Before we begin</h1>

<p>Before we get into the actual code, there are a few peculiarities of native languages to be aware of.</p>

<p>Almost all native APIs are actually written in C, not C++. Both languages have some responsibility for this. Part of the reason is that C is the <em>lingua franca</em> of programming languages. When your Python code has to talk to your Java code, they use a C interface. Virtually every language has C wrappers available to allow it to communicate with C code. So by writing your API in C, you ensure that <em>every</em> language can use it without too much trouble. And of course C is a very simple language, so almost any language can cope with a C API. There are no classes, no higher-order functions or exceptions or other peculiarities of more modern programming paradigms. So part of the reason is that C is simply a good intermediate language.</p>

<p>The other part of the reason is found in C++: C++ has no fixed ABI<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup>. C++ functions compiled by one compiler can not, in general, be called from code compiled by another. And when C++ compilers can’t cooperate, entirely different languages don’t stand <em>any</em> chance of being able to talk to C++ code. COM objects provide a partial solution to this, but require a lot of plumbing to implement correctly. For widely used APIs, it is often simpler to restrict your interface to C code.</p>

<p>So the code we need to interface with is actually C, not C++. Our own code is going to be a limited subset of C++. If you intend to write actual applications in C++, you really owe it to yourself to learn the language properly, but for our purposes, sticking with a smaller subset is simpler.</p>

<p>So what does it mean in practice that the API is written in C?
Primarily two things:</p>

<ul>
<li>No exceptions — errors have to be reported through error codes.</li>
<li>No classes — C allows structs, containing data, but no member functions, and no access specifiers. All members are public.</li>
</ul>

<p>Now, on to how we’re going to tackle our task:</p>

<p>The first three installments in this series of posts will deal exclusively with native code. This first one will demonstrate a simple <code>Hello world</code> program, and discuss some fundamentals of organizing and compiling C++ code. This isn’t exactly exciting stuff, but it is useful to understand, as it commonly trips up beginners (and even some reasonably experienced programmers).</p>

<p>The second part will teach all the missing pieces of C++ (the ones that we’re going to need, anyway), so that you’re comfortable with reading and writing simple C++ programs.</p>

<p>In the third part we’ll get into the Win32 API, calling a few functions (of varying complexity) and not least, learning to read the arcane specification on MSDN.</p>

<h1>Hello World</h1>

<p>Load up Visual Studio, and create a new project. The project type should be <code>Win32 Console Application</code>. This brings you to the C++ Project Wizard. If it looks like something that belonged in Windows 95, that’s because that is when it was last updated. It is written in JavaScript and HTML, of all things.</p>

<p>This wizard gives you access to a couple of application settings. For now, set <code>Application Type</code> to <code>Console Application</code>, and select <code>Empty Project</code> under <code>Additional Options</code>. In particular, we do <em>not</em> want a precompiled header. It is a hack that can speed up compilation time in large C++ projects, but it is nothing more than a source of confusion in simple, small projects. Neither ATL nor MFC headers should be added under <code>Add common header files for</code>.</p>

<p>Click <code>finish</code>, and we’re given an empty project, just like we asked for. It contains three “folders”, named <code>Header Files</code>, <code>Resource Files</code> and <code>Source Files</code>. I put “folders” in quotes because they aren’t. Visual Studio calls them filters, and they basically just group files by file type, rather than actually enforcing any particular location on the file system. They’re also not particularly important to us so you can delete them if you like. If you add a <code>.cpp</code> file to the project, it is automatically listed under the <code>Source Files</code> filter, while <code>.h</code> files get listed under <code>Header Files</code>.</p>

<p>Now, let’s see some actual code. To begin with, let’s try a Hello World:</p>

<p>Create a new .cpp file in the project.</p>

<p>Now type the following into it: (We’ll get into what it means in a moment)</p>

<pre><code>#include &lt;iostream&gt;

int main() {
   std::cout &lt;&lt; "Hello world" &lt;&lt; std::endl;
}
</code></pre>

<p>Now compile and run it. No big surprises here, it does exactly what we’d expect a “hello world” program to do.
As for what the code means, let’s start with the <code>main</code> function itself. It’s not a member of any class — in C++, nonmember functions are allowed (and commonly used), and <code>main</code> in particular <em>must</em> be a nonmember function. The observant reader may have noticed another curious thing about it: we declare <code>int</code> as its return type, but don’t actually have a return statement. This is allowed as a special case for <code>main</code>. Other functions still have to return normally, but if control reaches the end of the <code>main</code> function, it implicitly returns 0<sup id="fnref:2"><a href="#fn:2" rel="footnote">2</a></sup>.</p>

<p>Inside the main function, you might wonder about the <code>&lt;&lt;</code>’s. The operators exist in C# as well, and their built-in meaning is the same. Formally, they are used for bit-shifting in both languages, but C++ allows them to be overloaded, and in particular, streams define overloaded versions.</p>

<p>So the <code>&lt;&lt;</code> operator “streams” data into <code>std::cout</code>. <code>std::endl</code> is a stream manipulator which, when it is fed into a stream, produces a line break, and flushes the stream. In this example, we could just have written <code>std::cout &lt;&lt; "hello world\n"</code> to get the newline without flushing the stream, and in some ways, that would actually have been preferable. But I wanted to introduce <code>endl</code>.</p>

<p>A final note is the <code>std::</code> prefix. Where C# uses a simple dot for all scope resolution operators, C++ defines a few different ones:</p>

<ul>
<li>For specifying members of a namespace, <em>or</em> specifying static members of a class, <code>::</code> is used.</li>
<li>For nonstatic class members, <code>.</code> is used. Given an object <code>o</code>, we can access a member <code>m</code> with the syntax <code>o.m</code>, exactly like in C#.</li>
<li>For nonstatic class members <em>accessed through a pointer to the class</em>, <code>-&gt;</code> is used. If we have a pointer <code>p</code> to an object, accessing its member m looks like this instead: <code>p-&gt;m</code>.</li>
</ul>

<p>So in our Hello World program, we reference the object <code>cout</code> in the <code>std</code> namespace.
We could simply add a <code>using namespace std;</code> at the top of the program, much like we would in C#, but in C++, it is not customary to do so. You’ll note that the namespace actually has a very short name, unlike .NET’s long names and nested namespaces. Rather than <code>System.Collections.Generic.List</code>, for example, C++ defines <code>std::vector</code>. Almost the entire C++ standard library exists in the <code>std</code> namespace. One of the main reasons for this structure is to make it easy and convenient to access namespace members without having to do <code>using namespace X</code>.</p>

<p><code>cout</code> stands for <em>character output</em>, and is the stream used for standard output, much like the output-related members of .NET’s <code>Console</code> class. There is also a <code>cin</code> stream object responsible for input.</p>

<p><code>cout</code> and <code>cin</code> are actually nothing more than global variables of the type <code>std::ostream</code> and <code>std::istream</code> respectively.
Another output mechanism you’re likely to see is the C function <code>printf</code>, which is syntactically closer to what you’re used to from .NET.</p>

<p>Given an integer <code>i</code> we want to print out along with a message, <code>cout</code> and <code>printf</code> would be used like this:</p>

<pre><code>std::cout &lt;&lt; "You have " &lt;&lt; i &lt;&lt; " pancakes.\n";
printf("You have %d pancakes.\n", i); 
</code></pre>

<p>Each of these has its advantages and disadvantages, as you can probably see. The nice thing about <code>cout</code> is that it is type-safe, and allows us to compose our output string without having to worry about the type of <code>i</code> or the number of parameters. We just stream whatever we like into <code>cout</code> one parameter at a time, and it all just works. It also works with user-defined types. They just have to define an appropriate <code>operator &lt;&lt;</code>.</p>

<p>The nice thing about <code>printf</code> on the other hand, is that the actual format of the string is much more readable, and parameters are specified separately at the end. As you know from .NET’s <code>string.Format</code> function, it is very convenient to be able to write the entire format string in one go, and only specify parameters afterwards. It is a bit awkward that <code>cout</code> requires you to break up the string with <code>&lt;&lt;</code>’s all over the place. But there are some serious limitations to <code>printf</code> as well:</p>

<ul>
<li>It cannot be extended. It works for the basic built-in types, and nothing else.</li>
<li>It requires the programmer to specify the type of the parameter as part of the format string. (<code>%d</code> specifies that the parameter at this position is expected to be an integer; the <code>d</code> stands for decimal. But there is no type-checking to verify that this is actually the case. I can pass a float to <code>printf</code>, and print it out with <code>%d</code>, and I get garbage.)</li>
<li>The function has no way of knowing how many parameters it was actually passed. C (and C++) only have very rudimentary support for functions with variable arguments. Once you make use of this feature, you lose all type information <em>and</em> information about the number of parameters passed to the function.</li>
</ul>

<p>I tend to prefer <code>cout</code> for these reasons; it is safer, and it can be extended. But you’re likely to encounter <code>printf</code> in code samples and should at the very least be familiar with it.</p>

<p>Finally, let’s deal with the very first line. There are four things to note about it. In order of appearance, they are:</p>

<ul>
<li>The <code>#</code> at the very start of the line indicates that this is a preprocessor directive. In other words, this is evaluated in a separate pass <em>before</em> the compiler starts working. Modern compilers don’t maintain a strict separation between preprocessing and compilation, but as the language is specified, the preprocessor basically runs over the source code performing a number of simple modifications <em>before</em> the compiler is invoked.</li>
<li><code>include</code> is the actual preprocessor directive. It specifies that we would like to include a file.</li>
<li>The file name is surrounded by angle brackets (<code>&lt;&gt;</code>). When these are used, the preprocessor searches for the file to include in system directories. If we had used double quotes (<code>""</code>), the preprocessor would have searched for the file locally first. So slightly simplified, use <code>&lt;&gt;</code> to include system headers, and <code>""</code> to include files from the same project or solution.</li>
<li>Finally, inside the angle brackets, we have the name of the header file we’d like to include. In general, your own files should use a <code>.h</code> or <code>.hpp</code> suffix. Headers belonging to the C standard library also use <code>.h</code>, but C++ standard library headers have no extension. (So you have <code>iostream</code> instead of <code>iostream.h</code>).</li>
</ul>

<p>So what does it mean for a file to be <code>#include</code>’d? It’s not quite the same thing as the <code>using</code> statements you put at the top of a file in C#. Those <code>using</code> statements are functionally similar to the <code>using namespace</code> statement mentioned earlier: they allow us to reference types defined in other namespaces as if they were members of the current namespace. If we do not have the <code>using</code> statement, we have to specify the full namespace prefix when using the type (<code>System.Collections.Generic.List&lt;T&gt;</code> instead of simply <code>List&lt;T&gt;</code>), but the types are still <em>available</em>. I can reference <code>System.Collections.Generic.List&lt;T&gt;</code> in C# without any <code>using</code> statements. Likewise, I can reference <code>std::cout</code> as I did in the previous example without having a <code>using namespace std</code>.</p>

<p>But without the <code>#include</code>, the compiler would not have been aware of <code>cout</code> at all.</p>

<p>An <code>#include</code> is in a sense very simple. All that actually happens is a copy/paste operation. The preprocessor locates the file <code>iostream</code>, and copies its contents into our file at the location of the <code>#include</code>. The effect of this is to give us access to anything defined in the file. In .NET this is all taken care of by magic. Anything in the current assembly is automatically visible, and anything that isn’t declared <code>internal</code> in other assemblies is visible as soon as we add a reference to it.</p>

<p>In C++, no such mechanism exists. What the compiler sees is <em>just</em> the current file. Other files, even in the same project, are not visible when the current file is being compiled. The compilation model is notoriously quirky, and probably deserves some explanation.</p>

<h1>The preprocessor and the C/C++ compilation model</h1>

<p>C++ code is compiled in a couple of stages. I already mentioned the preprocessor. In the old days, this was a separate program, which was run on the source code first, performing simple text manipulation (search/replace, and conditionally removing chunks of code). The output of this was then fed to the compiler. Finally, the output of the compiler is fed to a linker, which we’ll get to later. Today, the preprocessor is built into the compiler, but it is still a separate pass made over the code before the actual compilation begins.</p>

<p>Let’s wrap up the preprocessor quickly though. It can do a few other things that we’ll probably run into soon enough. In particular, <code>#define</code> has a few uses. It creates a macro — whenever the name of this macro is encountered, it is replaced with the macro definition.</p>

<p>So in the following:</p>

<pre><code>#define waffles pancakes
std::cout &lt;&lt; "I like " &lt;&lt; waffles();
</code></pre>

<p>we create a macro named <code>waffles</code>, and from that point onwards, any occurrence of <code>waffles</code> is swapped for <code>pancakes</code>. Which means that the function that actually gets called in line two is <code>pancakes()</code>, rather than <code>waffles()</code>, which highlights another important aspect of the preprocessor. Because it is run <em>before</em> compilation, it has no notion of actual language syntax. It doesn’t care about the context of the text it is replacing. It doesn’t care that this is a function call, just like it wouldn’t care if the name had been found in a different namespace than the one the macro was defined in. It doesn’t respect scoping rules or anything else. It won’t swap out the middle of words, or the contents of string literals (so <code>ilikewaffles()</code> would go untouched, as would <code>"waffles"</code>), but that’s about it. Anything else gets brutally replaced by the preprocessor.</p>

<p>Another common example of its simplicity is the following:</p>

<pre><code>#define four 2+2
int i = four * four;
</code></pre>

<p>The result of this? It is <code>8</code>. The preprocessor just performs simple text substitution, resulting in this code: <code>int i = 2+2 * 2+2</code>, which of course gets evaluated as <code>int i = 2 + (2*2) + 2</code>.</p>

<p>We can also use the preprocessor to perform conditional compilation, removing sections of code at compile-time:</p>

<pre><code>#define waffles
#ifdef waffles // #if defined(waffles) would also have been legal
// this will get compiled
#else
// this will get removed by the preprocessor
#endif
</code></pre>

<p>A variation on this is used in almost every header file, but we’ll get to that soon enough.</p>

<p>The compiler processes what is technically known as <em>translation units</em>. A translation unit is a single source file (typically <code>.cpp</code> or <code>.cc</code> for C++, or <code>.c</code> for C code), after preprocessing. So in our Hello World program, we have one translation unit, consisting of the contents of the header file <code>iostream</code>, followed by our main function. The result of compilation is not a program, but rather an <em>object file</em> (Visual Studio uses the extension <code>.obj</code> for these, while GCC uses <code>.o</code>). An object file contains all the compiled code for this file, but with certain placeholder “gaps”. This is necessary as code files will typically depend on functions or variables defined in other translation units. We are able to tell the compiler that a function defined in another translation unit <em>exists</em>, but it won’t be able to see the actual definition of the function, so it has to generate a kind of placeholder, saying “call the function with this name, as soon as we find out where that function <em>is</em>”. That is essentially the role of object files: they store the compiled code, along with the necessary information about which symbols <em>this</em> file defines, and which symbols it depends upon and must be found in other files for the program to be complete.</p>

<p>When all the object files are created, they are passed to the linker, which performs the final steps — reading all the object files, locating all these placeholders, and filling them in. If some code in object file A calls a function <code>f</code> defined in another file B, the linker must read both files A and B, determine the address of the function <code>f</code>, and insert it into the function call inside A.</p>

<p>If the linker finds multiple conflicting definitions of <code>f</code> (perhaps object file C also defined a function with the same signature), it is of course an error. Likewise, if it is unable to locate the full definition of a symbol referenced from a file, we get an error. Because the linker does not have access to the actual source code, but only the object files, linker errors are notoriously hard to understand, but it can be done. The following simple code causes a linker error: (we’re going to run with this example for a while, so feel free to add it to a new project, or overwrite the previous file. This code should be the only contents of the project)</p>

<pre><code>class myclass {
public:
    int f(float fl);
};

int main(){
    myclass c;
    c.f(1.0f);
}
</code></pre>

<p>The code should be straightforward enough. We declare a class with a member function <code>f</code>. In the main function we create an instance of our class, and call the <code>f</code> function. There is just one problem: the function is <em>declared</em>, but it has not been <em>defined</em>. In other words, the compiler knows it exists (so we don’t get a compiler error when we try to call it, as we would if we called a completely unknown function), but because it does not have the function body, it has to assume that the full definition is… elsewhere. So the compiler lets this pass, hoping that the linker can sort things out.</p>

<p>But the linker is given only this one translation unit. It is unable to find a definition for the function <code>f</code>, so it spits the following error at us:</p>

<blockquote>
  <p>error LNK2019: unresolved external symbol <code>"public: int __thiscall myclass::f(float)" (?f@myclass@@QAEHM@Z)</code> referenced in function <code>_main</code></p>
</blockquote>

<p>Ouch. Again, the linker doesn’t have access to the source code, so this is about the best it can do. It tells us that the problem is an “unresolved external symbol”, or in other words, it was unable to resolve a symbol that one of our translation units expected to be “external” (defined in another translation unit). As for the symbol itself? All it actually sees is the mangled string near the end: <code>?f@myclass@@QAEHM@Z</code>. This is the name for the function generated by the compiler and stored in the object file, and I have no clue what the @’s or the letters following them mean. They <em>somehow</em> encode information about parameters and return type, but that’s about all I can say. Luckily, the linker is able to decode this name, which it also does for us. It tells us that the function has <em>public</em> visibility, and its return type is int. <code>__thiscall</code> is the <em>calling convention</em> used for member functions. (It is essentially a calling convention that allows for a <code>this</code> parameter, hence the name.) The calling convention isn’t usually important here though. Next, we can see that the unresolved symbol is a member of the class <code>myclass</code>, the function is named <code>f</code>, and it takes a <code>float</code> as its parameter. Finally, it tells us that the symbol was referenced from the <code>_main</code> function (again, we can’t always trust the compiler to preserve the precise names, but it’s probably a safe bet to assume that when it says <code>_main</code>, it means <code>main</code>).</p>

<p>So the error is actually pretty straightforward once you filter out the noise. A lot of C++ programmers don’t realize this, and go into a panic whenever they encounter a linker error, which is why I wanted to demonstrate this one. They typically contain a lot of noise (especially in more complicated cases), but they can be deciphered if you eliminate all the <code>@@</code> nonsense and read the remaining text slowly and carefully.</p>

<p>The other reason why I wanted to demonstrate this is that it is key to why header files are used. Based on the above example, we now know that the compiler can be tricked into accepting a call to a function it has no knowledge of, as long as it can see a valid declaration. (A function declaration is essentially just the signature, including return type, followed by a semicolon, much like an interface method in C#.)</p>

<p>So perhaps we should get creative and see if we can make the linker happy too. First, we create a second <code>.cpp</code> file with the following contents:</p>

<pre><code>class myclass {
public:
    int f(float fl);
};
</code></pre>

<p>There are still no definitions of <code>f</code>, but we’re taking it a step at a time. Now, though, we have <em>two</em> files containing the same definition of <code>myclass</code>. Of course, the compiler only sees one file at a time, so it won’t notice this, but what will the linker say? Won’t it complain about multiple definitions of the same symbol? Try compiling the project and find out.</p>

<p>As it turns out, we get <em>exactly</em> the same error as before. But we don’t get any complaints about the multiple definitions of the same class. This is actually allowed. We are allowed to create as many definitions of the same symbol as we like, as long as there is only one in each translation unit (the compiler will choke on it if you try to define a class you’ve already defined), <strong>and</strong> all the definitions are <em>exactly</em> identical. (The linker will typically not enforce the last requirement though. If the definitions are not identical, it typically manifests as weird crashes at runtime)</p>

<p>This is called the One Definition Rule (ODR). Only one definition may exist. That definition may occur in multiple places, but it must be identical, it must be the <em>same</em> definition, every time it is encountered.</p>

<p>So it seems like we have a problem, doesn’t it? We’re allowed to duplicate the class definition, but we’re not allowed to modify it! So how are we supposed to add the definition of <code>f</code>?</p>

<p>Try changing your second file (the one without the <code>main</code> function) to the following:</p>

<pre><code>class myclass {
public:
    int f(float fl);
};

int myclass::f(float fl){
    return 42;
}
</code></pre>

<p>and compile it. Voila! It works. We didn’t modify the actual class definition, so we obeyed the ODR rule. Instead, we added the function definition <em>afterwards</em>, outside the actual class definition. And both the compiler and linker are happy. The linker now sees two identical definitions of the class <code>myclass</code>, but that’s allowed under the ODR rule. It also sees a call to the function <code>myclass::f</code>, <em>and</em> a single definition of the same function, so it is able to glue everything together into one single program.</p>

<p>Of course, having to copy/paste, and maintain duplicate code in every <code>.cpp</code> file is hardly ideal. Sooner or later, we’re going to modify <code>myclass</code> in one file, and forget to do the same modifications in all the other files. That will break the ODR rule, and everything will crash horribly.</p>

<p>That is where header files come in. We could put the shared code in a separate file, and use the <code>#include</code> directive mentioned earlier to <em>automatically</em> copy/paste the contents in! Let’s try that now. Create a new file (with the <code>.h</code> or <code>.hpp</code> extension), and place the class definition in that. Now remove the class definition from the two <code>.cpp</code> files we already had, and replace it with a <code>#include</code> referencing the header.</p>

<p>That is, your project should contain the following three files (I’m going to name the <code>.cpp</code> files <code>main.cpp</code> and <code>myclass.cpp</code> for convenience):</p>

<pre><code>// myclass.h
class myclass {
public:
  int f(float fl);
};

// myclass.cpp 
#include "myclass.h" // note we use quotes, not angle brackets here
int myclass::f(float fl){
  return 42;
}

// main.cpp
#include "myclass.h"
int main(){
  myclass c;
  c.f(1.0f);
}
</code></pre>

<p>And it seems to work. Clever.
There is one little problem though. What happens if we include our header multiple times? We probably won’t intentionally do this, but perhaps we’re going to include it, and then include another header, which also includes it. We can easily get into a situation where some headers get included many times. Think of standard headers like <code>iostream</code>. We’re going to end up including it fairly often. Sooner or later, we’ll end up including some of our headers twice, which breaks the ODR rule! We’re not allowed to have multiple definitions <em>in the same translation unit</em>. To test the problem, feel free to duplicate the <code>#include</code> statement and verify that the compiler chokes on it.</p>

<p>To solve this problem, include guards are used. Modify your header as follows:</p>

<pre><code>#ifndef MYCLASS_H
#define MYCLASS_H

class myclass {
public:
  int f(float fl);
};

#endif
</code></pre>

<p>There should be nothing new in this, but the consequence might be surprising. First, we ask the preprocessor to check if the macro <code>MYCLASS_H</code> is defined, and only evaluate the following if it is <strong>not</strong> defined (the directive is named <code>ifndef</code>, or <em>if <strong>not</strong> defined</em>).</p>

<p>If we enter the if statement, the first thing we do is define the symbol <code>MYCLASS_H</code>, and then we evaluate the original contents of the header. Finally, we end the if-statement with an <code>#endif</code>. So what happens if the file gets included twice now?</p>

<p>For simplicity, assume the following <code>.cpp</code> file, containing nothing except two includes:</p>

<pre><code>#include "myclass.h"
#include "myclass.h"
</code></pre>

<p>As the preprocessor parses this, it’ll expand both <code>#include</code>’s, resulting in this:</p>

<pre><code>#ifndef MYCLASS_H // At this point, the macro MYCLASS_H is not defined, so we enter the following block:
#define MYCLASS_H // define the macro MYCLASS_H

class myclass { // allow this code to stay in the translation unit
public:
  int f(float fl);
};

#endif // end the if statement
#ifndef MYCLASS_H // now MYCLASS_H *is* defined, the condition is not true, and so we *skip* the if statement.
//#define MYCLASS_H // of course the preprocessor doesn't actually comment out the code, it simply removes it from the translation unit. I'm commenting it to illustrate what happens
//
//class myclass { // this time, the preprocessor *removes* all this code, because it is inside a #if statement we're skipping
//public:
//  int f(float fl);
//};
//
#endif
</code></pre>

<p>so after the preprocessor has run, only this code actually gets inserted in our translation unit:</p>

<pre><code>class myclass {
public:
  int f(float fl);
};
</code></pre>

<p>So it seems we’re able to handle multiple inclusions of the same header now.</p>

<p>So to review, we’re now able to split our code across multiple source files, and do it <em>correctly</em>. We don’t need to duplicate any code: all the shared code can be placed in header files, and include guards protect against accidentally including the same file twice in the same translation unit.</p>

<p>And now you should finally understand what it meant when we included <code>iostream</code> in the original Hello World example. We’re simply pasting in a lot of system code, containing declarations that get linked together with the standard library containing the full definitions.</p>

<p>This turned out a lot longer than I’d originally intended (I had originally, and naïvely, planned to write the entire series in one post), so let’s call it a day here. Part two will be posted very soon, and cover some actual C++, now that we’ve got the fundamentals out of the way. You needed to understand how C++ code is compiled before you’re able to write anything useful in the language.</p>

<div class="footnotes">
<hr />
<ol>

<li id="fn:1">
<p>Application Binary Interface: a common ABI is required for two functions to be able to call each other. The ABI defines the memory layout of structs or classes, as well as calling conventions and basically everything you need to be able to call a function. Where should the return value go, where should parameters be placed, and so on. On any given platform, C has a fixed, well-established ABI, which makes it easy to interface with. <a href="#fnref:1" rev="footnote">↩</a></p>
</li>

<li id="fn:2">
<p>The story goes that Bjarne Stroustrup, the language designer, didn’t want to create a language where a simple <code>hello world</code> required multiple lines of code in the main function. Hence the special rule that <code>main</code> doesn’t have to have an explicit <code>return</code> statement. <a href="#fnref:2" rev="footnote">↩</a></p>
</li>

</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2009/08/a-net-developers-guide-to-c/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
