<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>jalf.dk &#187; Programming</title>
	<atom:link href="http://jalf.dk/blog/category/programming/feed/" rel="self" type="application/rss+xml" />
	<link>http://jalf.dk/blog</link>
	<description>Musings and thoughts on programming and other geeky stuff</description>
	<lastBuildDate>Mon, 12 Jul 2010 15:21:00 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Dogfooding redux</title>
		<link>http://jalf.dk/blog/2010/04/dogfooding-redux-2/</link>
		<comments>http://jalf.dk/blog/2010/04/dogfooding-redux-2/#comments</comments>
		<pubDate>Fri, 09 Apr 2010 16:36:08 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=560</guid>
		<description><![CDATA[A while ago I wrote a bit about Microsoft’s practice of “dogfooding” their software. That sparked a fair amount of discussion on Reddit. Of course, a few people assumed I was talking in absolutes, that because the practice of dogfooding is not perfect, it must be evil. That’s a bit of an exaggeration. I never [...]]]></description>
			<content:encoded><![CDATA[<p>A while ago I wrote a bit about Microsoft’s practice of <a href="http://jalf.dk/blog/2010/01/the-downside-to-dogfooding/">“dogfooding”</a> their software. That sparked a fair amount of discussion on Reddit.</p>

<p>Of course, a few people assumed I was talking in absolutes, that because the practice of dogfooding is not <em>perfect</em>, it must be <em>evil</em>. That’s a bit of an exaggeration.<span id="more-560"></span></p>

<p>I never said that “dogfooding is bad”. But dogfooding generally means using your product <em>on a daily basis</em>. Not just poking around with it, or checking it out, but actually using it as your primary tool for… whatever the product is supposed to do for you.</p>

<p>This implies that you can’t really use a lot of <em>other</em> products as intensively while you’re dogfooding your own. A developer on the Visual Studio team can’t <em>both</em> use VS10 for all development, <em>and at the same time</em> use Eclipse, Vim and Emacs on a daily basis. If he were to use all those, there would no longer be enough time left to use VS10, and so he wouldn’t be dogfooding it.</p>

<p>And that is the problem. Dogfooding is a valuable way to incrementally improve and polish your product. But it also takes up time that could have been spent using <em>someone else’s</em> product. And when you use a product in your daily life, you become blind to a number of its weaknesses.</p>

<p>One of the Reddit commenters mentioned a lovely example: Java developers, people who write Java and nothing else, don’t really see the need for closures. They just don’t really consider it a valuable feature. But pretty much everyone else does. Does this mean that in Java, alone of all languages, closures are not relevant? C# programmers got closures and are ecstatic about them. C++ programmers are about to get closures, and everyone’s just itching for it to happen. Functional languages always had closures, and programmers in those languages just can’t live without them. Are we to believe that specifically in Java, closures just <em>do not matter</em>?</p>

<p>Or does it just mean that the Java developer <em>doesn’t yet realize how much easier closures could make his job</em>? He doesn’t realize this because he’s never had the chance to use them. He’s stuck in the world of Java as it looks today, and if you were to ask him what he’d like changed about the language, he’s not going to say “closures” or “higher order functions”, or even “templates” or “type inference”. He’s going to come up with some small incremental improvement. Wouldn’t it be nice if class X was added to the class library? Wouldn’t it be nice if Y was named differently? Wouldn’t some shorthand syntax for things you can do already be nice?</p>

<p>And the same is true for dogfooding. Visual Studio is slow. Many common operations cause multiple seconds of wait time. Adding an empty file to a project in VS2008 is painful sometimes. But oddly enough, I only really notice it when I’ve gotten used to Vim or Emacs or some other alternative IDE or code editor, which isn’t so slow. Its project structure is fundamentally broken, but I only really notice this when I’ve just spent a few weeks playing around with makefiles<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup>. And the same goes for the Windows Mobile phone I had a few years ago. While it worked, I tolerated its quirks. Apart from a few instances when it went <em>really</em> haywire, it wasn’t so bad. Sure, it was sluggish and needed to be rebooted regularly, and sure, the interface wasn’t exactly pretty or convenient to use. But it worked, most of the time, and I got used to it. I didn’t love it, and I didn’t think it was a particularly good phone, or platform or OS, but it was usable. Then the phone broke, and it took me a few weeks with my replacement phone to realize just how much better <em>everything else</em> was. My Windows Mobile phone sucked. I just didn’t realize it until I got to use a different one on a daily basis.</p>

<p>But let me reiterate, dogfooding is a good thing. There’s no doubt of that. Your product can only get better by having your developers actually use it in the same ways that end users are expected to. But if dogfooding is <em>all</em> you do, then the world will pass you by.</p>

<p>So why did I dig this old post up again?</p>

<p>Because Visual Studio 2010 is about to be released, and as we may have expected, this means Microsoft’s developers have to beat their drum a bit about how awesome it is <em>because it’s been dogfooded</em>.</p>

<p><a href="http://blogs.msdn.com/somasegar/archive/2010/04/08/dogfooding-vs-2010-and-net-4.aspx">Soma</a> just seems to forget that the glaring performance issues in both beta 1 and 2 came as a huge surprise to Microsoft <em>until the betas had been released in the wild</em>. As much as they dogfooded it, it didn’t give them the information they needed: <em>that VS10 is painfully slow compared to everything else, including VS9</em>.</p>

<p>Both betas were released pretty much with the message “Don’t worry. It might use a lot of managed code and use a completely new WPF-based editor. But it runs really well and is just as fast as previous versions of Visual Studio”.</p>

<p>There were two reasons for this. One seems to be (according to another Microsoft developer’s blog post which I can’t seem to find at the moment, unfortunately) that they simply didn’t collect the right metrics from all their dogfooders. A lot of their developers <em>did</em> think it was painfully slow, but were never asked how they felt about the current performance level.</p>

<p>And the other is obviously that all their dogfooders <em>were using VS10 in their day-to-day work</em>. They had nothing else to compare it with. They weren’t using Eclipse or Vim or Emacs or even VS9 or VS6 in their daily work. So they became used to the downright painful performance.</p>

<p>But once they released the beta into the wild, it was obvious to <em>everyone else</em>, all those people who had <em>not</em> been using VS10 on a daily basis for months, that it was a huge step backwards.</p>

<p>So yes, Microsoft, you’ve heavily dogfooded VS10. No doubt about that. But let’s not pretend that it solved all your problems, or that it gave you the best possible product. In reality, it led to you scrambling for the last 6 months to bring the performance back up to where it should have been all along, and where you’d probably have kept it if you’d used <em>other</em> IDEs occasionally, so that you’d have had a basis for comparing performance.</p>

<p>It’s not that dogfooding is <em>bad</em>. It’s just not enough. And it is not, in itself, a selling point or a proof of quality.</p>

<div class="footnotes">
<hr />
<ol>

<li id="fn:1">
<p>No, I’m not saying makefiles are “better” or even “good”. They have plenty of problems on their own. But they do allow a remarkable degree of flexibility over VS solutions, as <a href="http://gamesfromwithin.com/from-full-to-lite-in-under-an-hour">Noel Llopis realized here</a>. The point is not that VS should switch to plain old makefiles, but just that perhaps the VS project system could be improved to incorporate some of the strengths of these. <a href="#fnref:1" rev="footnote">↩</a></p>
</li>

</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2010/04/dogfooding-redux-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thesis defense!</title>
		<link>http://jalf.dk/blog/2010/04/thesis-defense/</link>
		<comments>http://jalf.dk/blog/2010/04/thesis-defense/#comments</comments>
		<pubDate>Wed, 07 Apr 2010 15:56:46 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Meanwhile]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[thesis]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=556</guid>
		<description><![CDATA[The end is nigh. On monday the 12th of April, I’m going to defend my master’s thesis. If you’re in the area, and are geeky enough to find it interesting, feel free to drop by. The precise place and time is: 15:00, April 12, 2010 Room S125 / 3–1-25 DIKU (Datalogisk Institut) Universitetsparken 1 København [...]]]></description>
			<content:encoded><![CDATA[<p>The end is nigh.</p>

<p>On Monday the 12th of April, I’m going to defend my <a href="http://jalf.dk/thesis/">master’s thesis</a>. If you’re in the area, and are geeky enough to find it interesting, feel free to drop by.
<span id="more-556"></span></p>

<p>The precise place and time is:
15:00, April 12, 2010</p>

<p>Room S125 / 3–1-25
DIKU (Datalogisk Institut)
Universitetsparken 1
København Ø</p>

<p>Looks like I’m going to be busy the next couple of days preparing my presentation.</p>

<p>That is all.</p>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2010/04/thesis-defense/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Post-thesis, post-aprils-fools update</title>
		<link>http://jalf.dk/blog/2010/04/post-thesis-post-aprils-fools-update/</link>
		<comments>http://jalf.dk/blog/2010/04/post-thesis-post-aprils-fools-update/#comments</comments>
		<pubDate>Sat, 03 Apr 2010 14:03:54 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Games]]></category>
		<category><![CDATA[Meanwhile]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[thesis]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=551</guid>
		<description><![CDATA[Just over a month ago, I handed in my Masters Thesis. All that’s left now is an oral defense of it one of the next weeks. So what happens then? I suppose I should find a job. A few people have asked if I am going to do a PhD, but I don’t think so. [...]]]></description>
			<content:encoded><![CDATA[<p>Just over a month ago, I handed in my Masters Thesis. All that’s left now is an oral defense of it one of the next weeks. So what happens then? I suppose I should find a job. A few people have asked if I am going to do a PhD, but I don’t think so. I think I’ve had enough of academia for now. It was fun while it lasted, but I think it’s time to try something different.
<span id="more-551"></span>
 But beyond that, I don’t really know what I’m going to do. For now, I’ve just enjoyed my free time, catching up on all the things I haven’t really had time for while writing the thesis (such as playing Mass Effect 2, which I heartily recommend, and yes, some coding on various hobby projects).</p>

<p><a href="http://jalf.dk/blog/tag/thesis/">Here</a> is what I’ve previously written about my thesis on the blog, <a href="http://en.wikipedia.org/wiki/Software_transactional_memory">here</a> is what Wikipedia has to say on the subject, and <a href="http://jalf.dk/thesis">here</a> is the thesis itself, including source code.</p>

<p>I’ve been meaning to write this post pretty much for the last month. The reason I’m finally doing it is that I also wanted to drop a quick line on a cute April Fools’ joke that should be of interest to a lot of gamers:</p>

<p><a href="http://www.rockpapershotgun.com/">Rock, Paper, Shotgun</a> dedicated the entire day to perfectly ordinary PC game reporting/blogging <a href="http://www.rockpapershotgun.com/2010/04/02/back-to-the-pre-working-for-future-1993"><em>as if it’d been April 1st, 1993</em></a>. It was cute and intelligent, and served as a fun trip down memory lane. Nice idea, and a nice change from the usual fare of everyone trying to pull off outrageous or absurd stories ad nauseam. Especially as there seemed to be practically no worthwhile pranks to be found anywhere this year (even Google had some pretty tame ones), their up-to-the-minute coverage of PC gaming news as of 17 years ago really made the day.</p>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2010/04/post-thesis-post-aprils-fools-update/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Singletons: Solving problems you didn’t know you never had since 1995</title>
		<link>http://jalf.dk/blog/2010/03/singletons-solving-problems-you-didnt-know-you-never-had-since-1995/</link>
		<comments>http://jalf.dk/blog/2010/03/singletons-solving-problems-you-didnt-know-you-never-had-since-1995/#comments</comments>
		<pubDate>Fri, 12 Mar 2010 04:40:59 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[design patterns]]></category>
		<category><![CDATA[singleton]]></category>
		<category><![CDATA[stackoverflow]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=532</guid>
		<description><![CDATA[Funny how it goes. Some subjects are just flat out impossible to write catchy titles for. Others seem to attract them like flies. A lot of very clever people have written volumes about “The Simpleton Pattern”, and “Singletonitis”. Many people are in love with the Singleton pattern. Others — a small minority, I suspect — [...]]]></description>
			<content:encoded><![CDATA[<p>Funny how it goes. <a href="http://jalf.dk/blog/2010/01/the-meaning-of-raii-or-why-you-never-need-to-worry-about-resource-management-again/">Some subjects</a> are just flat out impossible to write catchy titles for. Others seem to attract them like flies. A lot of very clever people have written volumes about <a href="http://steve.yegge.googlepages.com/singleton-considered-stupid">“The Simpleton Pattern”</a>, and <a href="http://www.gamedev.net/community/forums/mod/journal/journal.asp?jn=259115">“Singletonitis”</a>.</p>

<p>Many people are in love with the <a href="http://en.wikipedia.org/wiki/Singleton_pattern">Singleton pattern</a>. Others — a small minority, I suspect — consider it a mistake, an anti-pattern, or something that was only ever included in <em>the</em> Design Patterns book as a lifeline to procedural programmers who couldn’t really figure out this OOP thing.
<span id="more-532"></span></p>

<p>I won’t pretend to be half as clever as all the people who wrote about the problems with singletons years ago, and I don’t think I have anything <em>new</em> to bring to the table. But it is a pattern I learned to loathe very soon after I first saw it in use. (Singletons do sound attractive when you first hear of them. But they pale a bit when you end up having to tear up and rewrite half your code just because all your singleton classes start revealing their shortcomings.) And for a long time now, I’ve tried to convince other programmers that Singletons have some serious problems. Recently, it seems like I’ve even gotten noticed for it on StackOverflow.</p>

<p>First, <a href="http://stackoverflow.com/users/87234/gman">GMan</a> posts an answer to <a href="http://stackoverflow.com/questions/2080233/is-it-good-programming-to-have-lots-of-singleton-classes-in-project/2080242#2080242">one question</a>, and I comment with a mild disagreement, and the discussion goes on for a few more comments. As singleton rants go, this one is pretty mild, and I don’t really think about it any further. Then, a few weeks later, I discover his blog and <a href="http://blackninjagames.com/?p=24">this post</a>. Wow! A convert. A person I know to be extremely bright and a knowledgeable programmer has changed his mind in response to something <em>I</em> said… I’m flattered.</p>

<p>And today, I noticed another question being posted, which had both Boost and Singletons in the title — how could I resist? Two subjects I enjoy talking about, even if the things I say about them are very different. Surprisingly, the comments there already mentioned me, and some of my earlier answers regarding singletons. Should I be flattered that people have started bringing my name up when discussing Singletons?</p>

<p>Anyway, one of the comments also suggested I write a blog post describing my argument in detail. So I will.</p>

<h1>Two wrongs don’t make a right</h1>

<p>There are a lot of problems with singletons. In fact, it’s surprising that so many people still consider the pattern useful, when it is afflicted with so many weaknesses and flaws. However, for now I will single out the two that I feel are the most fundamental: problems not just with how a singleton works, but with what it is trying to achieve.</p>

<p>A singleton, as defined by the Gang of Four, combines two properties:</p>

<ul>
<li>it guarantees that exactly one instance of an object exists. While that one instance is typically created lazily, so it doesn’t technically exist throughout the entire application’s lifetime, it always seems to the programmer as if precisely one instance exists, and</li>
<li>it guarantees global access to this one instance.</li>
</ul>

<p>Let’s pick those apart a bit. The last one is easiest: it is, by now, fairly common knowledge that <em>global state is bad</em>. We don’t like global variables, we don’t like static class members, we don’t like anything that makes it harder to isolate bits of our code. Dependence on global state causes a lot of problems. It hurts parallelism, as access to global mutable state generally has to be serialized through the use of locks. It makes dependencies harder to detect and control: any function might silently decide to access our singleton. The function signature says nothing about this, so we have to read the source code of the function to determine if this is the case. And because it is so convenient to always just add a reference to a singleton, we tend to do it a lot. When you have a singleton, you quickly end up in a situation where three out of four classes depend on it. How did that happen? Why, logically speaking, do so many classes need direct access to the database? Or the renderer? Is that good design? Not only is this messy, it’s also painfully hard to fix after the fact. Once we have these dependencies on global objects everywhere, that’s a lot of code we need to change to eliminate the global. Almost every class will be impacted by the change, and a huge number of functions have to have their signatures modified to take that extra parameter replacing the global. Or even worse, the function has to be completely rewritten to eliminate the need for whatever service the singleton provided. The more globals you have in your project, the more your dependency graph starts resembling spaghetti. And the harder it gets to clean it up.</p>
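
<p>To make the point about hidden dependencies concrete, here is a minimal sketch. The <code>Database</code> class and the two <code>save_user</code> functions are hypothetical names invented for this illustration, not taken from any real library. Both functions do the same work, but only one of them admits it in its signature.</p>

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a real database; it only records the queries it is given.
class Database {
public:
    void execute(const std::string& query) { queries_.push_back(query); }
    const std::vector<std::string>& queries() const { return queries_; }
private:
    std::vector<std::string> queries_;
};

// Singleton-style global access point (the pattern under discussion).
Database& global_database() {
    static Database instance;
    return instance;
}

// Hidden dependency: nothing in the signature says this touches the database.
void save_user_hidden(const std::string& name) {
    global_database().execute("INSERT " + name);
}

// Explicit dependency: the caller can see, and choose, which database is used.
void save_user(Database& db, const std::string& name) {
    db.execute("INSERT " + name);
}
```

<p>With the second version, a caller can pass in any <code>Database</code> it likes, and a reader can tell from the declaration alone that a database is involved.</p>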

<p>It hurts reusability, as code taken from one project and inserted into another may break because it depended on globals not present in the new project. It hurts testability partly for the same reason: a unit test for a class must suddenly provide a number of globals as well, just for the code under test to compile. But it also hurts because global state makes tests less deterministic: one test might change the state of a global, affecting the outcome of the next test to run.</p>

<p>Globals are bad for a lot of reasons. They have their uses, no doubt about that, but we should be suspicious whenever the solution to a problem involves global data. It might be the best solution, but often, it is more trouble than it’s worth.</p>

<p>The other point is more subtle. Why do I object to a class enforcing that “only one instance may exist”? It’s really just common sense. As the Agile movement tells us, we don’t really know what our code is going to look like tomorrow. Over the course of development, we <em>have</em> to adapt to changes, modify our code, revise decisions already made. Why put roadblocks in front of us? Why make it harder to adapt to unforeseen changes or requirements?</p>

<p>Today, I might think that I need only one logger instance. But what if I realize tomorrow that I need two? That’s not so far-fetched. We may have one log to which we write ad-hoc messages intended for debugging purposes, solely to be read by developers, and another formalized log, where structured messages are written when predetermined events occur, so that the application can be monitored in production. Sure, we <em>could</em> define the two as completely separate classes, and then we’d only need one instance of each (but then we’d start duplicating code). Or we could use the same log instance to write to both logs (but then the logging code would become more complex, having to interleave two separate and non-overlapping logs).</p>
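
<p>A rough sketch of the alternative, assuming a hypothetical <code>Logger</code> class (the name, the constructor and the prefixes are all made up for illustration): one ordinary class, instantiated twice, with no duplicated code and no interleaving logic.</p>

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical logger: an ordinary class that writes prefixed lines
// to whatever stream it is given.
class Logger {
public:
    Logger(std::ostream& out, std::string prefix)
        : out_(out), prefix_(std::move(prefix)) {}
    void write(const std::string& message) { out_ << prefix_ << message << '\n'; }
private:
    std::ostream& out_;
    std::string prefix_;
};

// Writes one message to each log and returns both outputs joined with '|'.
std::string demo_two_logs() {
    std::ostringstream debug_sink, audit_sink;
    Logger debug_log(debug_sink, "[debug] ");   // ad-hoc messages for developers
    Logger audit_log(audit_sink, "[audit] ");   // structured production events
    debug_log.write("cache miss for key 42");
    audit_log.write("user logged in");
    return debug_sink.str() + "|" + audit_sink.str();
}
```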

<p>Once we’ve accepted that an application may need more than one logger, shouldn’t we do ourselves the favor of ensuring that our loggers <em>can</em> be instantiated more than once, just in case it turns out to be the right thing to do? We’re not even adding any complexity, there’s no cost associated with this. On the contrary, we’re <em>removing</em> significant complexity. Thread-safe singletons are surprisingly hard to get right. Dependencies between singletons are tricky and circular ones can cause them to blow up in all sorts of fun ways. And let’s not even get into how to handle anything our singletons might do while the application is shutting down. What if the database singleton tries to write a simple “goodbye” log message to the log singleton? What if the log singleton got destroyed before the database one? Ouch.</p>

<p>Singletons are hard to write and hard to use. Removing them only simplifies our code, so if it also enables us to better adapt to unforeseen requirements, why <em>shouldn’t</em> we remove them?</p>

<p>Not convinced? Let’s think of some other examples then:</p>

<ul>
<li><em>the application configuration should be a singleton, right? We <strong>obviously</strong> can’t have more than one of those!</em> Wrong. We can. We often do. Think about what happens when the user opens the “Options” screen and modifies the settings. During that time, two sets of settings exist: the “applied” settings that are currently in effect, and the “speculative” ones, currently being picked out by the user. Once he clicks OK, the speculative changes should be applied, replacing the ones that were previously in effect. But until then, we have two sets of settings to maintain.</li>
<li><em>a database connection pool then! If we have more than one pool of connections, we can’t efficiently share them!</em> Correct, but perhaps we don’t <em>want</em> to share them. Perhaps I want to ensure that library A has one pool of 10 connections available to it, component B has a smaller pool of 3 connections, and components C, D and E use the global pool with however many connections it supplies. That would ensure that no matter the number of threads running in component B, it’ll never starve out other components trying to access the database. It can never hold more than three connections, leaving room for other components. Of course, in the common case, we do want all connections to be shared in one single pool. But perhaps not <em>always</em>. So yes, there should probably be a globally accessible default pool available. But why shouldn’t it also be possible to define new <em>local</em> pools if the user deems it necessary? Why limit ourselves to one instance?</li>
</ul>
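
<p>The settings case might look something like the sketch below. The <code>Settings</code> fields and the <code>OptionsDialog</code> class are hypothetical, invented only to show two instances of the same configuration type living side by side while the dialog is open.</p>

```cpp
#include <string>

// Hypothetical settings type; the fields are invented for illustration.
struct Settings {
    int volume = 5;
    std::string language = "en";
};

// Holds a reference to the settings in effect plus a working copy
// that the options screen is free to modify.
class OptionsDialog {
public:
    explicit OptionsDialog(Settings& applied)
        : applied_(applied), pending_(applied) {}
    Settings& pending() { return pending_; }  // the "speculative" settings
    void ok() { applied_ = pending_; }        // commit the speculative copy
    void cancel() { pending_ = applied_; }    // discard it
private:
    Settings& applied_;
    Settings pending_;
};
```

<p>None of this is possible if <code>Settings</code> insists that only one instance may ever exist.</p>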

<p>And even if you do come up with some case where we absolutely <em>must</em> never have more than one instance, where it would make the sky come crashing down on us, consider testing. Consider that each of your unit tests should set up the environment it needs, and run within that environment, in isolation from other tests. That means that every test should create its own logger instance, or database pool instance, or whatever else our singletons are doing, just so it can avoid being polluted by stateful changes made by earlier tests. Each unit test for the Direct3D renderer <em>should</em> set up its own renderer object. Each physics simulation test <em>should</em> initialize the physics engine first, and shut it down again after use. Singletons don’t easily allow that. Sure, we can extend them with explicit <code>Create()</code> and <code>Destroy()</code> methods, but then our abstraction is starting to get leaky. We can no longer assume that precisely one instance exists, because we might have just destroyed the one that existed.</p>
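
<p>As a sketch of what that isolation looks like when the class is not a singleton (the <code>PhysicsWorld</code> class and both test functions are hypothetical stand-ins, not a real engine or test framework):</p>

```cpp
#include <vector>

// Hypothetical physics engine stand-in: stateful, but an ordinary class.
class PhysicsWorld {
public:
    void add_body(double mass) { masses_.push_back(mass); }
    double total_mass() const {
        double sum = 0.0;
        for (double m : masses_) sum += m;
        return sum;
    }
private:
    std::vector<double> masses_;
};

// Each test builds its own world, so neither sees state left behind by the other.
bool test_empty_world_has_no_mass() {
    PhysicsWorld world;
    return world.total_mass() == 0.0;
}

bool test_masses_add_up() {
    PhysicsWorld world;
    world.add_body(2.0);
    world.add_body(3.0);
    return world.total_mass() == 5.0;
}
```

<p>The two tests can run in either order and still pass. With a singleton world, the first test would only pass if it happened to run before the second.</p>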

<p>The “exactly one instance” guarantee removes flexibility from our code that we may need, in order to enforce a constraint that we <em>definitely</em> don’t need. Where’s the harm in allowing the user to create more than one instance <em>if he decides to?</em></p>

<p>C++ programmers are familiar with <code>std::cout</code>, the standard output stream. Funny thing about this, it is a simple global object. We can <em>obviously</em> never have more than one standard output stream. But we <em>can</em> have more than one stream. The standard library just initializes one of them to point to the standard output, and saves it as a global variable. We don’t need it to be a singleton, we don’t even need it to be a static class specially defined for the purpose. We just need a stream, defined somewhere where it’s globally accessible.</p>

<p>True, a sufficiently stupid programmer <em>could</em> create a new stream when he intended to write to <code>std::cout</code>, and true, a singleton implementation would have prevented that. But is it worth it? When was the last time you saw someone <em>accidentally</em> invoke <code>std::ostream() &lt;&lt; "Hello world";</code>, when they intended to write <code>std::cout &lt;&lt; "Hello world";</code>? It’s not the most common typo I’ve seen.</p>
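
<p>A small sketch of why this design is convenient: because <code>std::cout</code> is just one <code>std::ostream</code> among many, code written against the stream type works with any of them. The <code>greet</code> and <code>greet_to_string</code> functions are hypothetical examples.</p>

```cpp
#include <iostream>
#include <ostream>
#include <sstream>
#include <string>

// Writes a greeting to any output stream; std::cout is just one possible argument.
void greet(std::ostream& out, const std::string& name) {
    out << "Hello, " << name << "!\n";
}

// Captures the same output in a string by passing a different stream instance.
std::string greet_to_string(const std::string& name) {
    std::ostringstream buffer;
    greet(buffer, name);
    return buffer.str();
}
```

<p>Calling <code>greet(std::cout, "world")</code> writes to standard output, while <code>greet_to_string("world")</code> sends the same text to a second, local stream.</p>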

<p>We don’t <em>need</em> to prevent multiple instantiations. If we want only one instance, we just instantiate the class once, and refer to that instance whenever we need it, end of story. We don’t need the compiler to slap us on the wrist if we do create multiple instances, because we never do so by mistake. If we do it, it’s because we have a reason. It’s because our initial assumption that only one instance was needed turned out to be wrong!</p>

<p>So there you have it. A singleton combines two <em>negative</em> qualities. It takes the “you can never create a second instance of this class” constraint, which hardly ever makes sense, and even when it does, does not typically need to be enforced by the compiler, <em>and combines it with a global object</em>, giving us all the downsides of both!</p>

<p>Two wrongs don’t make a right. Not even if they were described as a good idea by some guys 15 years ago. They’re still no greater than the sum of their parts: two wrongs. One bad thing combined with another bad thing, creating a <em>very</em> bad thing.</p>

<p>Too many programmers rely heavily on singletons to solve a problem they never had. They never <em>needed</em> a compile-time guarantee that multiple instances of a class can never be created. They just needed one instance to be created.</p>

<p>Sometimes, we do need globals, yes. In those cases, make old-fashioned globals. Use static class members, or if the language allows it, global (non-member) objects. Or use the Monostate pattern, or whatever you feel is the cleanest solution. But remember that the problem you’re trying to solve is “enabling global access to this data”. No more, no less. You do <em>not</em> want a solution which sneaks completely unrelated constraints and limitations in through the back door.</p>
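
<p>For instance, a globally accessible default object with no restriction on creating more could be sketched like this. The <code>ConnectionPool</code> class is a hypothetical toy that only counts connections, and the capacities are the ones from the pool example earlier.</p>

```cpp
#include <cstddef>

// Hypothetical connection pool that only tracks how many connections are handed out.
class ConnectionPool {
public:
    explicit ConnectionPool(std::size_t capacity) : capacity_(capacity) {}
    // Returns true if a connection was available.
    bool acquire() {
        if (in_use_ == capacity_) return false;
        ++in_use_;
        return true;
    }
    void release() { if (in_use_ > 0) --in_use_; }
    std::size_t in_use() const { return in_use_; }
private:
    std::size_t capacity_;
    std::size_t in_use_ = 0;
};

// An old-fashioned global: one shared default pool,
// with nothing stopping anyone from creating local pools as well.
ConnectionPool default_pool(10);
```

<p>A component that must not starve the others can then simply declare its own <code>ConnectionPool component_b_pool(3);</code> and leave <code>default_pool</code> alone.</p>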

<p>And while I can’t personally think of many cases where this is true, you <em>might</em> also run into situations where it is truly <em>necessary</em> to prevent more than one instance of a class from ever existing. Again, I can’t think of what situation this might be, but I won’t rule out that it can occur. If it does, then enforce <em>that</em> constraint alone. But don’t go around providing <em>global access</em> to the object as well. Whatever specialized purpose your “one instance only” class serves, it’s highly unlikely that <em>everyone</em> should be allowed access to it. So don’t make it a global.</p>
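
<p>One possible way to enforce that constraint alone is sketched below. Note the assumptions: <code>ExclusiveDevice</code> is a hypothetical class, the check happens at run time rather than at compile time, and the flag is not thread-safe. The point is only that the object has no global access point; whoever creates it decides who gets to see it.</p>

```cpp
#include <stdexcept>

// Hypothetical device wrapper that refuses to exist twice at the same time.
// Run-time check only, and not thread-safe: a sketch, not a finished design.
class ExclusiveDevice {
public:
    ExclusiveDevice() {
        if (alive_) throw std::logic_error("ExclusiveDevice already exists");
        alive_ = true;
    }
    ~ExclusiveDevice() { alive_ = false; }
    ExclusiveDevice(const ExclusiveDevice&) = delete;
    ExclusiveDevice& operator=(const ExclusiveDevice&) = delete;
private:
    static bool alive_;
};

bool ExclusiveDevice::alive_ = false;

// Returns true if constructing a second instance is rejected while the first is alive.
bool second_instance_is_rejected() {
    ExclusiveDevice first;
    try {
        ExclusiveDevice second;  // throws; its destructor never runs
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}
```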

<p>Most of the time, your classes should have neither of these attributes. Sometimes, rarely, they may need <em>one</em> of them. But the singleton pattern imbues the class with <em>both</em> properties, and <em>that</em> is just a plain bad idea.</p>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2010/03/singletons-solving-problems-you-didnt-know-you-never-had-since-1995/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>The downside to “dogfooding”</title>
		<link>http://jalf.dk/blog/2010/01/the-downside-to-dogfooding/</link>
		<comments>http://jalf.dk/blog/2010/01/the-downside-to-dogfooding/#comments</comments>
		<pubDate>Wed, 13 Jan 2010 17:00:05 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[dogfooding]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[visual-studio]]></category>
		<category><![CDATA[Windows Mobile]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=433</guid>
		<description><![CDATA[A term that’s become very popular, and which especially Microsoft’s developers seem to champion, is “dogfooding”. The idea that as a developer, you should use your own products on a daily basis, even during development. This exposes you to all the weaknesses and flaws of the product, and makes you much better equipped to deliver [...]]]></description>
			<content:encoded><![CDATA[<p>A term that’s become very popular, and which especially Microsoft’s developers seem to champion, is “dogfooding”. The idea that as a developer, you should use your own products on a daily basis, even during development. This exposes you to all the weaknesses and flaws of the product, and makes you much better equipped to deliver a product that’s actually <em>worth using</em>.
<span id="more-433"></span></p>

<p>But perhaps there’s a counter-argument that people seem to miss. If you use a lousy piece of software on a daily basis, <em>you get used to it</em>. You stop thinking about how it <em>should</em> be, and only consider <em>how it is</em>.</p>

<p>I think the first place I heard of the term “dogfooding” was on the <a href="https://blogs.msdn.com/windowsmobile/archive/2007/05/04/dogfood-doesn-t-always-taste-good.aspx">Windows Mobile team blog</a>. And let’s be honest, is Windows Mobile really a competitive product? Is it worth using? Perhaps in a vacuum. If all you know is Windows Mobile, then, well, it’s not <em>too</em> bad. But there’s an obvious reason why the product is struggling in the marketplace. Compared to <em>everything else</em>, it feels horrible to use.</p>

<p>Perhaps the recipe for fixing Windows Mobile would be <em>less</em> dogfooding. Windows Mobile developers shouldn’t be forced to use their own buggy, slow, in-development OS all the time on their phones. Perhaps they should be given iPhones and Blackberries. Perhaps some of them should even be given simple old-school non-smartphones. The ones that didn’t need to be rebooted, and didn’t “feature” load times for opening your contacts list, or to write a new SMS (text message). Perhaps they need to be shaken up a bit, and see what <em>else</em> a phone can feel like when you use it. Windows Mobile 6.5 might be better than WM6.0. But that’s not the competition they need to beat. They need to beat the iPhone, they need to beat Android, Blackberry and Symbian. So those are the products they should use at least as much as they use Windows Mobile.</p>

<p>The same may be true for Visual Studio. It’s great that the team <a href="https://blogs.msdn.com/ricom/archive/2009/10/19/my-history-of-visual-studio-part-10-final.aspx">uses Visual Studio 2010 internally</a> as much as possible during development. But that also means that they get used to its performance issues. And it means they get used to the assumption that “this is what an IDE is like”.</p>

<p>Perhaps Visual Studio would be a better product if the team was forced to use Emacs, Vim and Eclipse. Or perhaps even Notepad and makefiles.</p>

<p>And how much better would TFS be, if the developers had used Git or Bazaar instead of <a href="http://blogs.msdn.com/somasegar/archive/2007/06/18/so-what-does-microsoft-use-for-software-development.aspx">dogfooding</a> <a href="http://blogs.msdn.com/granth/archive/2009/08/27/vsts-pioneer-tfs2010-dogfood-server.aspx">TFS</a> during development?</p>

<p>Dogfooding has its advantages, certainly, but I don’t think it <em>alone</em> is a recipe for a good, competitive product. It leads to an incremental improvement over the previous version of your product, but it doesn’t take into account what <em>else</em> is happening in the world. It doesn’t give you the opportunity to question your basic assumptions<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup>. Sometimes, incremental improvement is not what your product <em>needs</em>.</p>

<p>Just a thought.</p>

<div class="footnotes">
<hr />
<ol>

<li id="fn:1">
<p>Of course I’m not claiming that Microsoft’s developers <em>never</em> use or examine competing products. And likewise, there are obvious benefits to dogfooding, and I’m certainly not claiming that the practice should be eliminated. But I think it is telling that their blog posts frequently mention how heavily they dogfood their products, yet never mention “for this release of Visual Studio, we actually went back and looked at why many people still prefer Vim”, or “In developing Windows Mobile 7, the entire team was issued phones running various other OSes, and this taught us what we need to do to finally ship an OS that will take over the world”. <a href="#fnref:1" rev="footnote">↩</a></p>
</li>

</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2010/01/the-downside-to-dogfooding/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>The meaning of RAII — or why you never need to worry about resource management again</title>
		<link>http://jalf.dk/blog/2010/01/the-meaning-of-raii-or-why-you-never-need-to-worry-about-resource-management-again/</link>
		<comments>http://jalf.dk/blog/2010/01/the-meaning-of-raii-or-why-you-never-need-to-worry-about-resource-management-again/#comments</comments>
		<pubDate>Sat, 02 Jan 2010 05:00:52 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[.net]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[raii]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=340</guid>
		<description><![CDATA[I tried really hard to come up with some witty title or pun to weave into the title of this post. I couldn’t. RAII is just a terrible name, and it isn’t really clever or funny. Unfortunately, it is also the single most important key to C++. It is not just an idiom but a [...]]]></description>
			<content:encoded><![CDATA[<p>I tried <em>really</em> hard to come up with some witty title or pun to weave into the title of this post. I couldn’t. RAII is just a terrible name, and it isn’t really clever or funny. Unfortunately, it is also <em>the</em> single most important key to C++. It is not just an idiom but a fundamental philosophy used to solve almost any problem in the language. So we can’t really avoid it.</p>

<p>If I had to pinpoint one thing that marked the difference between a skilled and an unskilled C++ programmer, it would be “do they understand RAII”. Many people don’t, hence this post.<span id="more-340"></span></p>

<p>RAII is, apart from being badly named, one of those deceptively simple concepts that you <em>think</em> you understand when you first hear of it, think “well duh, that’s obvious”, and then proceed to write code as usual, because you just don’t see how widely applicable it is.</p>

<p>But let’s get the name out of the way first. <a href="http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization">RAII</a> stands for “Resource Acquisition Is Initialization”. And if you’re not already familiar with the idiom, then this has told you <em>nothing at all</em>. If you did know about RAII in advance, then you can, when you stop and think about it, kind of see how the name relates to it… vaguely… sort of.</p>

<p>What it actually <em>means</em> is simple: Resources should be managed by classes. When the class is initialized, the resource is acquired (hence the name). When the class is destroyed, the resource is released. And the lifetime of the object should exactly match the desired lifetime of the resource. That sounds obvious, and many programmers will (assuming they’re working in a language that <em>has</em> classes), say that this is what they always do.</p>

<p>Often, C++ developers think this just means “smart pointers. Wrap your memory allocation in a <code>boost::shared_ptr</code> and you’re done”. I see that as one not-very-often used border case though, rather than a typical example of RAII. So let’s take a step back instead.</p>

<p>The key idea is that any kind of resource, not just memory, but file handles, sockets, database connections, or even more abstract resources like loggers or profiling timers or textures, really <em>any</em> concept or process which has a lifetime, should be mapped to an object.</p>

<p>Unlike the typical object-oriented line of thought which goes that “everything must be an object, because then… well, everything will be an object, and your code will be better”, here we actually have a concrete <em>reason</em>: We want to use the object to manage the lifetime of the resource.</p>
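<p>To make the idea concrete, here is a minimal sketch of a file handle mapped to an object. The <code>File</code> class and the <code>append_line</code> function are hypothetical names invented for this illustration; only <code>std::fopen</code>, <code>std::fputs</code> and <code>std::fclose</code> are real library calls.</p>

```cpp
#include <cassert>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical RAII wrapper around a C FILE* handle (illustration only).
class File {
public:
    File(const std::string& path, const char* mode)
        : handle_(std::fopen(path.c_str(), mode)) {
        if (!handle_) {
            // Acquisition failed: no File object comes into existence,
            // so there is nothing to clean up later.
            throw std::runtime_error("could not open " + path);
        }
    }
    ~File() { std::fclose(handle_); }  // released exactly once

    // One owner per handle: copying is disabled in this sketch.
    File(const File&) = delete;
    File& operator=(const File&) = delete;

    void write(const std::string& text) { std::fputs(text.c_str(), handle_); }

private:
    std::FILE* handle_;
};

// Appends a line; the file is closed on every exit path, including
// an exception thrown from write().
void append_line(const std::string& path, const std::string& line) {
    File f(path, "a");
    f.write(line + "\n");
}
```

<p>The lifetime of the handle is now exactly the lifetime of the <code>File</code> object, which is the whole point.</p>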

<p>When I allocate memory with <code>new</code>, I have to deallocate it again sooner or later, with <code>delete</code>. (Or in C, with <code>malloc()</code> and <code>free()</code> respectively). And I have to make sure that this is done. And I have to make sure that it is not done twice. And that the object is not accessed after this is done. There are a lot of constraints we have to obey, all related to the lifetime of the resource. And this is why unmanaged programs have a reputation of leaking memory left and right. If we allocate memory, and it is to be used by a dynamic number of objects or functions all referencing the same allocations, which of the users is responsible for deleting it? And how do we know when it is safe to delete, when no users remain?</p>

<p>Ironically, most managed languages have <em>not</em> solved the problem. They have added a garbage collector (which yes, is very useful for a wide number of reasons), but that only solves one specific instance of the problem. It takes care of avoiding memory leaks, but it doesn’t avoid resource leaks <em>in general</em>.</p>

<p>The garbage collector ensures that this code won’t leak memory:</p>

<pre><code>void foo() {
  SomeObject* obj = new SomeObject();
  bar(obj);
}
</code></pre>

<p>where without a garbage collector, we’d (at least without RAII) have to write code such as</p>

<pre><code>void foo() {
  SomeObject* obj = new SomeObject();
  try {
    bar(obj);
  }
  catch(...) {
    delete obj; // clean up, then let the exception continue to the caller
    throw;
  }
  delete obj;
}
</code></pre>

<p>In the garbage collected case, we don’t know what <code>bar</code> does, and we don’t <em>need</em> to know. It doesn’t have to delete the object. And neither does the <code>foo</code> function. So we have successfully dodged the problem of managing the lifetime of memory allocations. We haven’t really <em>solved</em> the problem though. We still don’t have any good tools to <em>manage</em> the lifetime. We’re just guaranteed by the system that it’ll last <em>long enough</em>.</p>

<p>In C++, this effect can be approximated using some kind of smart pointer<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup>.</p>

<p>Smart pointers allow us to write code like this:</p>

<pre><code>void foo() {
  boost::shared_ptr&lt;SomeObject&gt; ptr(new SomeObject()); // the constructor is explicit, so "= new ..." would not compile
  bar(ptr);
}
</code></pre>

<p>and be sure we won’t leak memory. Of course, this solution isn’t perfect — reference counting is much more expensive than a good garbage collector, and if we create cyclic references, the objects will never be deleted, as the reference counts never reach zero. It is a decent approximation, but nowhere near as good and reliable as the garbage collector in managed languages.</p>
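<p>The cyclic reference problem is easy to reproduce. The sketch below uses <code>std::shared_ptr</code> and <code>std::weak_ptr</code> from C++11 rather than the Boost versions (an assumption made to keep the example dependency-free; the Boost classes behave the same way here). The <code>Node</code> type and the counter are hypothetical.</p>

```cpp
#include <cassert>
#include <memory>

static int live_nodes = 0;  // counts Node objects currently alive

struct Node {
    std::shared_ptr<Node> strong_next;  // owning link
    std::weak_ptr<Node> weak_next;      // non-owning link
    Node() { ++live_nodes; }
    ~Node() { --live_nodes; }
};

// Two nodes that own each other: the reference counts never reach zero.
int nodes_left_after_strong_cycle() {
    int before = live_nodes;
    {
        std::shared_ptr<Node> a = std::make_shared<Node>();
        std::shared_ptr<Node> b = std::make_shared<Node>();
        a->strong_next = b;
        b->strong_next = a;
    }
    return live_nodes - before;  // both nodes are still alive (leaked)
}

// Making one of the links weak breaks the cycle, so both nodes die.
int nodes_left_after_weak_cycle() {
    int before = live_nodes;
    {
        std::shared_ptr<Node> a = std::make_shared<Node>();
        std::shared_ptr<Node> b = std::make_shared<Node>();
        a->strong_next = b;
        b->weak_next = a;
    }
    return live_nodes - before;
}
```

<p>The first function leaves two nodes behind, the second leaves none. A garbage collector would have reclaimed both cycles without any help from the programmer.</p>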

<p>But the problem shows up again if we use another type of resource. What if we’d opened a database connection instead?
We’d have to write code such as this:
(The following Java-like pseudocode is copied almost verbatim from <a href="http://stackoverflow.com/questions/161177/does-c-support-finally-blocks-and-whats-this-raii-i-keep-hearing-about/161247#161247">this StackOverflow.com answer</a>, courtesy of <a href="http://stackoverflow.com/users/14065/martin-york">Martin York</a>.)</p>

<pre><code>void writeToDb()
{
  Db db = new Db("DBDescriptionString");
  try
  {
    // Use the db object.
  }
  finally
  {
    db.close();
  }
}
</code></pre>

<p>(And of course it gets even worse if <code>db.close()</code> can throw exceptions. Then we have to catch <em>that</em> exception, just to avoid it propagating out from the <code>finally</code> clause if we reached <code>finally</code> because of an exception being thrown in the <code>try</code> clause.)</p>

<p>The resource management problem still exists. We still have to wrap the code in exception handling just to make sure that the connection is closed as soon as we’re done with it. And we have to do this at <em>every</em> use. And it gets complicated fast.</p>

<p>Of course, .NET makes this a bit simpler:</p>

<pre><code>using (Db db = new Db("DbDescriptionString"))
{
  // use the database object.
}
</code></pre>

<p>But the onus is still on the user of the class to ensure it is closed correctly. There is no obvious way to encode into the <code>Db</code> class that “once we’re done with an object of this type, the connection must be closed immediately”.</p>

<p>And in C++, smart pointers are no longer suitable solutions, since the resource to be managed is no longer a pointer allocated with <code>new</code>.</p>

<p>Instead, a more basic flavor of RAII comes to the fore:</p>

<pre><code>void someFunc()
{
    Db db("DBDescriptionString");
    // Use the db object.
} 
</code></pre>

<p>Yes, that’s all. When the <code>db</code> object goes out of scope, at the end of the function, its destructor is called. The destructor internally calls <code>this-&gt;Close()</code> for us, so we don’t need to do it! We just have to trust the scoping rules of C++, which guarantee that destructors are called on local variables when they go out of scope, and on class members when the class is destroyed.</p>
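<p>For completeness, here is roughly what such a <code>Db</code> class could look like. This is only a sketch: there is no real database behind it, and <code>db_log</code> is a hypothetical stand-in that records what happened so the order of events can be observed.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a real database API; it only records events.
static std::vector<std::string> db_log;

class Db {
public:
    explicit Db(const std::string& description) {
        db_log.push_back("open " + description);
    }
    // Runs on normal return and during exception unwinding alike.
    ~Db() { Close(); }

    // A connection has a single owner in this sketch.
    Db(const Db&) = delete;
    Db& operator=(const Db&) = delete;

    void Close() {
        if (open_) {  // closing twice is harmless
            db_log.push_back("close");
            open_ = false;
        }
    }

private:
    bool open_ = true;
};

void someFunc() {
    Db db("DBDescriptionString");
    db_log.push_back("use");
}  // db's destructor runs here and closes the connection
```

<p>Note that the rule “the connection must be closed when we’re done” is now encoded in the class itself, not repeated at every call site.</p>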

<p>So in a sense, the key idea in RAII is simply that “resources should behave sensibly”. They should get copied safely if an assignment is made (or otherwise, assignments should be prevented), they should be available if their owning object is successfully created (if it can’t create the resource, it should throw an exception, aborting the creation of the object), and when they are no longer used, they should be cleaned up.</p>

<p>The C++ standard library class template <code>std::vector</code> is a wonderful example of RAII in action. The resources being managed by a <code>vector</code> are memory (the array allocated internally to hold the objects being contained in the vector) as well as the objects themselves. When the <code>vector</code> is destroyed, every object it holds must be destroyed too, and the array in which they were placed must be deallocated.</p>

<p>In the following examples, assume that a function <code>foo</code> is passed a vector of <code>MyClass</code> objects by value. We don’t know how many, if any, objects are stored in it, but since we are passed a copy of the original <code>vector</code>, we take ownership of it. It exists only in the function <code>foo</code>, and must be destroyed afterwards.</p>

<pre><code>void foo(std::vector&lt;MyClass&gt; vec) {
  ...
 //  when we get to the end of the function, all local variables, including vec, 
 // are automatically destroyed by having their destructors invoked.
 // So no matter how many MyClass objects were stored in the vector, it ensures that they too have their destructors called.
 // And the vector also deallocates its internal array, leaving neither of its resources alive at the end of the function
}

void foo(std::vector&lt;MyClass&gt; vec) {
  throw std::runtime_error("Oops"); // from &lt;stdexcept&gt;; std::exception has no portable string constructor
  // as above, vec is automatically destroyed when we leave the function,
  // regardless of *how* we leave it. Even if we leave it because an exception was thrown and not caught.
} 

void foo(std::vector&lt;MyClass&gt; vec) {
  // other is constructed as a copy of vec. std::vector ensures that both of vec's resources are copied as well
  std::vector&lt;MyClass&gt; other = vec;
  // we now have two vectors, each owning a dynamically allocated array and a number of MyClass objects
  // and again, at the end of the function, both are deallocated cleanly
} 

void foo(std::vector&lt;MyClass&gt; vec) {
  std::vector&lt;MyClass&gt; other; // a second, empty, vector

  // perform an assignment, setting vec to be an empty vector
  // std::vector makes sure that if you do this, the resources previously held by vec are cleanly released
  // before copies are made of the resources held by other
  vec = other;

  // and so when the function ends, the MyClass objects originally held by vec
  // have already been destroyed, so their destructors are *not* invoked now
} 
</code></pre>

<p>As the above shows, <code>vec</code> owns its resources, and manages them tightly. Whenever a change happens to <code>vec</code>, it reflects this by updating its owned resources. If it is destroyed, it destroys its owned resources. If it is copied, it copies the resources it owns. If it is assigned to hold something else, it first destroys its existing resources. And so on. Nothing you do can bring it “out of balance”. It just works. <em>That</em> is RAII. Smart pointers are just convenient adapters turning raw pointers into RAII objects. But RAII is much more than smart pointers.</p>
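<p>These claims can be checked with an element type that counts how many instances are alive. The <code>Tracked</code> type and both functions below are hypothetical names for this sketch; <code>std::vector</code> does all the actual work.</p>

```cpp
#include <cassert>
#include <vector>

static int live_objects = 0;  // number of Tracked objects currently alive

// Hypothetical element type that counts constructions and destructions.
struct Tracked {
    Tracked() { ++live_objects; }
    Tracked(const Tracked&) { ++live_objects; }
    Tracked& operator=(const Tracked&) { return *this; }
    ~Tracked() { --live_objects; }
};

// Returns how many Tracked objects exist while a copy of vec is alive.
int live_while_copied(const std::vector<Tracked>& vec) {
    std::vector<Tracked> other = vec;  // copies every element
    return live_objects;
}  // other is destroyed here, along with all of its elements

// Assigning an empty vector destroys the elements vec held before.
int live_after_assigning_empty(std::vector<Tracked> vec) {
    std::vector<Tracked> empty;
    vec = empty;
    return live_objects;  // only the caller's originals remain
}
```

<p>With a caller holding a vector of three elements, the first function reports six live objects and the second reports three; once the caller’s vector is gone, the count is back to zero.</p>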

<p>It is the broad and general idea that <em>resources should be mapped to objects</em>, so that the object can not be created unless it succeeded in acquiring its resource, and it can not be destroyed without also releasing its resource. This effectively saves C++ programmers from having to worry about resource management.</p>

<p>Take an example that’s guaranteed to cause pain without the use of RAII: Handling exceptions being thrown halfway through constructors. Say you have a class with multiple members which are initialized in its constructor. After the first member has been initialized, but before all of them have been initialized, an exception is thrown. Let’s use the following contrived example:</p>

<pre><code>class Foobar {
  Foo a;
  Bar b;
  MyClass c;

public:
  Foobar() : a(42), b("hello world"), c('a') {}
};
</code></pre>

<p>Unfortunately, <code>b</code>’s constructor throws an exception. How do we handle this? We know that in C++, partially constructed objects do not automatically have their destructors called when the construction is aborted.</p>

<p>And since we want to avoid any resource leaks, we require that the following must happen:</p>

<ul>
<li><code>a</code> must have its destructor called (because <code>a</code> was successfully initialized before the error occurred)</li>
<li><code>b</code> must release any resources it acquired in its constructor before it threw the exception</li>
<li><code>c</code> must do nothing. Its construction had not yet begun when the error occurred, so it would be an error to attempt any kind of cleanup of <code>c</code>.</li>
<li>The <code>Foobar</code> object (the object pointed to by the <code>this</code> pointer) must ensure that the above, and nothing else, happens, and it must do so without relying on its own destructor (which won’t be called, as construction did not successfully complete).</li>
</ul>

<p>And of course, pretending that only <code>b</code> can throw an exception may be a simplification over the real world. Perhaps every member could throw one from its constructor. Care to write a <code>Foobar</code> constructor which takes all this into account, providing enough <code>try</code>/<code>catch</code> blocks to correctly catch every exception that might be thrown, and release exactly the resources that have been allocated until then, and <em>nothing</em> else? A tall order, and an open invitation for bugs. And of course, it’d lead to a huge, bloated and error-prone constructor. It’d also prevent us from using the <em>initializer list</em>. We’d have to perform some kind of “safe” non-throwing default construction of all of <code>a</code>, <code>b</code> and <code>c</code> before entering the constructor body, where exception handling is possible, and from there, attempt to perform assignments to bring the three members into the desired state.</p>

<p>In pseudocode, the constructor might look something like this:</p>

<pre><code>Foobar() {
  a = new Foo(42);
  try {
    b = new Bar("hello world");
  }
  catch {
    destroy a;
    throw;
  }
  try {
    c = new MyClass('a');
  }
  catch {
    destroy b;
    destroy a;
    throw;
  }
}
</code></pre>

<p>Note that all this complexity is only necessary because we want to handle several different resources. <code>a</code>, <code>b</code> and <code>c</code> all contain resources that we must attempt to acquire, and properly release if this fails. If there’d been only one resource, the job would have been much simpler. There wouldn’t be any point at which <em>some</em> resources have been acquired, and others have not. If we succeeded in acquiring that one resource, there’d be no risk of errors occurring afterwards, so we wouldn’t need complex conditional cleanup code. And if we failed to acquire the one resource, there’d be nothing to clean up. After all, the resource was never acquired!</p>

<p>So to keep down the complexity, the only safe way to define a class is to make it own <em>at most one</em> resource. And this one-to-one mapping of resources to classes is exactly what RAII is all about. If <code>a</code>, <code>b</code> and <code>c</code> had all been RAII objects, then the original constructor, the one that simply used the initializer list, <em>would work</em>. Regardless of which members could or couldn’t throw exceptions. According to the rules of C++, we know that in that case,</p>

<ul>
<li>the <code>Foobar</code> destructor (<code>this-&gt;Foobar::~Foobar()</code>) will not be called, as <code>*this</code> was not successfully constructed.</li>
<li>the <code>a</code> destructor will be called, as this member was fully constructed at the time of the error.</li>
<li>the <code>b</code> and <code>c</code> destructors will not be called, as these members were not fully constructed at the time of the error.</li>
</ul>

<p>So assuming that <code>b</code>’s constructor takes care of releasing any resources successfully allocated when the error occurred (the number of which, as pointed out above, should ideally be zero), we’re actually home free! What happens is exactly what we listed earlier as our goal. <code>a</code> has its destructor called, <code>c</code>’s constructor was never run in the first place, so it doesn’t have to do anything, and <code>*this</code> doesn’t have to do <em>anything</em> special in its constructor. All of its members take care of their own resources, so the number of resources managed by <code>*this</code> is zero!</p>
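<p>This behaviour is easy to observe. In the sketch below, <code>Foo</code>, <code>Bar</code> and <code>MyClass</code> are hypothetical stand-ins that log their construction and destruction, and <code>Bar</code>’s constructor always throws. The <code>events</code> log and the <code>try_construct</code> helper are likewise made up for the illustration.</p>

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

static std::vector<std::string> events;  // records what actually ran

// Hypothetical members; each logs its construction and destruction.
struct Foo {
    explicit Foo(int) { events.push_back("Foo()"); }
    ~Foo() { events.push_back("~Foo()"); }
};
struct Bar {
    explicit Bar(const char*) { throw std::runtime_error("Bar failed"); }
    ~Bar() { events.push_back("~Bar()"); }
};
struct MyClass {
    explicit MyClass(char) { events.push_back("MyClass()"); }
    ~MyClass() { events.push_back("~MyClass()"); }
};

class Foobar {
    Foo a;
    Bar b;
    MyClass c;

public:
    Foobar() : a(42), b("hello world"), c('a') {}
    ~Foobar() { events.push_back("~Foobar()"); }
};

// Attempts the construction and reports whether it threw.
bool try_construct() {
    try {
        Foobar f;
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}
```

<p>After a call to <code>try_construct()</code>, the log contains only the construction and destruction of <code>a</code>. Nothing was logged for <code>b</code>, <code>c</code> or the <code>Foobar</code> destructor, and the constructor contains no exception handling at all.</p>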

<p>We don’t even need to write a destructor for <code>Foobar</code> now, if all its members are RAII objects. Whether the <code>Foobar</code> object is partially or fully constructed, its members take care of themselves. That is the power of RAII. Once a resource has been mapped to a class, we can use it as much as we like, and even in very complex situations, and never have to worry about the resource being leaked. It is managed by its wrapping RAII object, and the C++ lifetime and scope rules ensure that this wrapper object gets destroyed when it goes out of scope.</p>

<div class="footnotes">
<hr />
<ol>

<li id="fn:1">
<p>A smart pointer is an object which behaves as a pointer (meaning that it overloads the <code>*</code> and <code>-&gt;</code> operators, so it can be dereferenced to yield the pointed-to value), but also enforces some kind of ownership semantics on the value. A plain pointer does nothing when it goes out of scope. If it pointed to some dynamically allocated memory, nothing happens to that memory. And if no one else has a pointer to it, then that memory is lost, and cannot be reclaimed.
A smart pointer does <em>something</em> when it is destroyed. Some variants simply free the memory they point to (<code>boost::scoped_ptr</code>, <code>std::auto_ptr</code> or <code>std::unique_ptr</code> all fall into this category, although with some important differences), while others implement reference counting, so that the memory is only destroyed when <em>all</em> smart pointers pointing to it have been destroyed. <code>boost::shared_ptr</code> is by far the best known implementation of this concept. <a href="#fnref:1" rev="footnote">↩</a></p>
</li>

</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2010/01/the-meaning-of-raii-or-why-you-never-need-to-worry-about-resource-management-again/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Hopes for 2010: Microsoft Visual C++</title>
		<link>http://jalf.dk/blog/2009/12/hopes-for-2010-microsoft-visual-c/</link>
		<comments>http://jalf.dk/blog/2009/12/hopes-for-2010-microsoft-visual-c/#comments</comments>
		<pubDate>Wed, 30 Dec 2009 17:00:24 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[ide]]></category>
		<category><![CDATA[intellisense]]></category>
		<category><![CDATA[msvc]]></category>
		<category><![CDATA[new-year]]></category>
		<category><![CDATA[visual-studio]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=387</guid>
		<description><![CDATA[As I mentioned earlier, I’d like to celebrate the new year by calling out a few products I’d like to see improved in the new year. First in line is Microsoft’s C++ compiler and IDE. From you, what I’d like to see in 2010 is actually fairly simple (at least conceptually): rethink your IDE. The [...]]]></description>
			<content:encoded><![CDATA[<p>As I mentioned <a href="http://jalf.dk/blog/?p=352">earlier</a>, I’d like to celebrate the new year by calling out a few products I’d like to see improved in the new year.</p>

<p>First in line is Microsoft’s C++ compiler and IDE.<span id="more-387"></span></p>

<p>From you, what I’d like to see in 2010 is actually fairly simple (at least conceptually): rethink your IDE. The Visual Studio team as a whole is already <a href="http://blogs.msdn.com/ricom/archive/2009/10/19/my-history-of-visual-studio-part-10-final.aspx">doing this in a big way</a> with VS10. I’m hoping you can find the time to reinvent the C++ IDE specifically as well.</p>

<p>For the past decade (except for the 5 years you wasted trying to eliminate native C++), you’ve been trying very hard to write the ultimate IDE for the wrong language. You’ve got something that works pretty well for C++ code anno 1993 or so, and which completely falls apart when used for more modern C++. Why do you persist in putting so many resources into making Intellisense better, when it <em>still</em> has no way to deal with a simple template function? Isn’t that a hint that you should rethink your approach?</p>

<p>Modern C++ has quite a bit in common with dynamic languages. The type of a function parameter may not be known just by looking at the function definition. Often a full-fledged “interactive mode”, as is common in dynamic languages, is desired, for example if we want to see what the actual type of some template parameter is. Or when instantiating template metaprograms, we may wish to step through them line by line at compile-time, rather than being limited to looking at the compiler errors, which is the metaprogramming equivalent of <code>printf</code> debugging. What I’d like to see from the MSVC IDE in the coming years is an acceptance and support of modern C++ paradigms: generic programming, template metaprogramming and all the difficulties that implies. Don’t give me an IDE that tries to provide Intellisense for a program with a completely static structure. If you’re going to do Intellisense, make it able to handle the very flexible type system enabled by templates. Take a leaf from the JavaScript support in your IDE, which supports Intellisense even though the language is dynamic and types generally aren’t known until runtime. To display type information in the IDE while the code is being written, the JavaScript tools have to be clever. But they can do it.</p>

<p>In C++, the types aren’t known until compile-time, so from the IDE’s point of view, the problem is similar. To display type information while the code is being written (and before it is compiled), the IDE has to be clever. And at the moment, it isn’t. At the moment, Intellisense just gives up.</p>

<p>Let’s take a simple example. What should the IDE do about this function:</p>

<pre><code>template &lt;typename T&gt;
typename T::return_type foo(T arg){
  bar(arg);
  return arg.baz();
}
</code></pre>

<p>If we play it by the book, and demand a “perfect” solution, there is nothing the IDE can do. It doesn’t know what <code>T</code> is, so it can’t help us with autocompletion, suggesting members after the dot, or anything else.
But if we’re willing to think outside the box, and accept a success rate lower than 100%, there are several strategies the IDE <em>could</em> use to provide meaningful Intellisense information:</p>

<ul>
<li>we could look at the call sites. They must logically provide types that are valid in this context. We could find one call site, and provide Intellisense on the assumption that the type <code>T</code> is whatever was passed at <em>that</em> call site. That wouldn’t be 100% accurate in all cases, of course, but it would give us a type that works with the function, so it would be useful. It could even look at several call sites, and compute the union of the types used. If they all provide a <code>frobnicate()</code> method, then the IDE could assume that <code>T</code> inside the function <code>foo</code> always contains such a member.</li>
<li>We could look at how the type is used in the function. It must have a copy constructor (because it is passed by value), it must have some nested type <code>return_type</code>, and it must have a no-arg <code>baz</code> member function, which returns something convertible to that type. And it must be convertible to whatever arguments <code>bar</code> expects. This probably isn’t a complete description of the type, but it would be enough to give us some limited Intellisense information at least. The compiler might be able to deduce some information about the type. We could even generate some kind of ad-hoc “concepts” implementation — perhaps not as extensive as that which was proposed in C++0x (and subsequently dropped), but a kind of helper datastructure that the IDE can attempt to map onto unknown template types.</li>
<li>Or we could allow the user to specify an example of a valid parameter type, and then use that to generate Intellisense information from.</li>
</ul>
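<p>The second strategy, deriving requirements from how the type is used, can be approximated by hand today. The following is only a sketch of the kind of ad-hoc “concept” an IDE might build internally: it checks two of the requirements listed above (a nested <code>return_type</code> and a no-arg <code>baz()</code> convertible to it) using C++11 expression SFINAE. The trait name <code>satisfies_foo</code> and the three example types are hypothetical.</p>

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Detects two requirements foo() places on T: a nested return_type and a
// no-argument baz() whose result converts to that type.
template <typename T>
class satisfies_foo {
    // Chosen when both U::return_type and u.baz() are well-formed.
    template <typename U>
    static auto test(int) -> typename std::is_convertible<
        decltype(std::declval<U&>().baz()),
        typename U::return_type>::type;

    // Fallback when substitution into the overload above fails.
    template <typename>
    static std::false_type test(...);

public:
    static const bool value = decltype(test<T>(0))::value;
};

// Hypothetical example types.
struct Good {
    typedef int return_type;
    int baz() { return 1; }
};
struct NoBaz {
    typedef int return_type;
};
struct NoReturnType {
    int baz() { return 1; }
};
```

<p>Here <code>satisfies_foo&lt;Good&gt;::value</code> is true, while the other two types are rejected. An IDE that computed something like this automatically from the function body would know enough to offer <code>baz</code> after the dot.</p>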

<p>But an alternative approach (and these aren’t mutually exclusive) might be to reduce the reliance on Intellisense, which is essentially a static analysis tool. Perhaps a better approach would be to bring the “Immediate” pane up to date, and make it useful, not just during debugging, but while programming as well. Why can’t I use the Immediate window to ask for the class <code>std::vector&lt;bool&gt;</code> to be instantiated, for example, so that I can inspect its structure? Perhaps I’m curious what its <code>iterator</code> type will resolve to, or perhaps I want to know the size of the class or other static information. Why can’t I just ask the IDE for this information? Again, modern C++ has a lot in common with dynamic languages. Very little information can be reliably extracted without compiling the code. So give me the tools for optionally and temporarily compiling bits and pieces.</p>
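<p>For comparison, this is roughly how such questions have to be asked today: by writing a scratch file and compiling it. The sketch uses <code>&lt;type_traits&gt;</code> from C++11 (an assumption; it was not available in every compiler when this was written), and the constant names are made up.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Questions put to the compiler, answered at compile time.

// Is vector<bool>::reference a real bool&? (It is a proxy class instead.)
const bool bool_ref_is_plain =
    std::is_same<std::vector<bool>::reference, bool&>::value;

// For an ordinary element type, reference is a plain T&.
const bool int_ref_is_plain =
    std::is_same<std::vector<int>::reference, int&>::value;

// Is vector<bool>::iterator a raw pointer? (It has to address single bits.)
const bool bool_iter_is_pointer =
    std::is_pointer<std::vector<bool>::iterator>::value;

// Size of the container object itself, in bytes (implementation-defined).
const std::size_t vector_bool_size = sizeof(std::vector<bool>);
```

<p>Every one of these answers is known to the compiler the moment the template is instantiated. The complaint is that the IDE offers no way to get at them without this detour.</p>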

<p>When I write silly template metaprograms to compute the N’th prime number, and the result is wrong, why doesn’t MSVC provide a compile-time debugger? One which lets me step through the instantiation of this maze of templates, inspect the members of each, and find out where it went wrong, where it instantiated the wrong template, or where I forgot to write the specialization I intended.</p>
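<p>To show what kind of code this is about, here is a small sketch of such a metaprogram. The names <code>is_prime</code> and <code>nth_prime</code> are hypothetical, primes are counted from 1 (so the first prime is 2), and <code>N</code> is assumed to be at least 1.</p>

```cpp
#include <cassert>

// is_prime<N>: trial division by D = 2, 3, ... for as long as D*D <= N.
template <unsigned N, unsigned D = 2, bool Done = (D * D > N)>
struct is_prime {
    static const bool value = (N % D != 0) && is_prime<N, D + 1>::value;
};
// No divisor found up to sqrt(N): N is prime unless it is 0 or 1.
template <unsigned N, unsigned D>
struct is_prime<N, D, true> {
    static const bool value = (N >= 2);
};

// nth_prime<N>: walk the candidates upwards, counting down N at each prime.
template <unsigned N, unsigned Candidate = 2,
          bool Prime = is_prime<Candidate>::value>
struct nth_prime;

// Candidate is prime, but more primes are still needed.
template <unsigned N, unsigned Candidate>
struct nth_prime<N, Candidate, true> {
    static const unsigned value = nth_prime<N - 1, Candidate + 1>::value;
};
// Candidate is composite: skip it.
template <unsigned N, unsigned Candidate>
struct nth_prime<N, Candidate, false> {
    static const unsigned value = nth_prime<N, Candidate + 1>::value;
};
// Candidate is prime and it is the one we were looking for.
template <unsigned Candidate>
struct nth_prime<1, Candidate, true> {
    static const unsigned value = Candidate;
};
```

<p>Forget the last specialization, or get one of the conditions wrong, and all you get is a wrong number or a page of instantiation errors. Being able to step through the chain of instantiations from <code>nth_prime&lt;5&gt;</code> down to the final candidate is exactly what a compile-time debugger would offer.</p>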

<p>In far too many ways, the C++ IDE really feels like a C IDE. Most of it doesn’t seem to know that there’s this new-fangled thing called “templates”, or that they change how people write code. Neither the Immediate window nor the debugger recognizes template parameter names. If I am debugging a function <code>template &lt;int I&gt; void foo()</code>, why can’t I get the debugger to tell me the value of <code>I</code>? It should be absolutely trivial to do. But the debugger can’t seem to do it. Intellisense can’t seem to do it. The Immediate pane can’t seem to do it. There’s a clear mismatch between the compiler, which is clearly a C++ compiler, and pretty much hasn’t bothered about the C side for close to a decade, and the IDE which still seems to be trying to be the perfect C IDE, completely disregarding every feature unique to C++.</p>

<p>I know you’re used to being told that you have one of the best IDEs in existence. I beg to differ. You may have got one of the best C IDEs, and your C# and VB IDEs kick some serious butt. But your C++ IDE is essentially nonexistent. Your IDE does not support C++. It supports a marginally and conservatively extended C.</p>

<p>So far, I’ve dealt exclusively with the IDE issues, and that’s not a coincidence. On the whole, I’m quite happy with the MSVC compiler. The <a href="http://blogs.msdn.com/vcblog/archive/2009/11/02/visual-c-code-generation-in-visual-studio-2010.aspx">performance</a> of generated code is good, you’re making great progress on <a href="http://blogs.msdn.com/vcblog/archive/2009/04/22/decltype-c-0x-features-in-vc10-part-3.aspx">C++0x support</a>, and overall, you’ve got a compiler I’m happy with. Of course there are still a couple of areas where the lack of standards-conformance is embarrassing (never mind the <code>export</code> keyword, I’m more bothered about two-phase name lookup and other <em>relevant</em> features), there are some features I wish you’d borrow from GCC, and I wish you’d tighten up your warning messages a bit (some of them are nothing more than noise, or are impossible to avoid in “good” healthy code). But on the whole, I have relatively few <em>serious</em> complaints about the compiler. I do, however, have a few suggestions.</p>

<p>It seems to me that the source/header compilation mechanism could use a makeover. We can’t change the actual semantics (yet — hopefully the proposal for a module system for C++ gains traction), but the compiler <em>can</em> change how it actually processes the code. And yet, major compilers still process the source files in the exact same manner they did 20 years ago. Even though this is, on today’s machines, and with today’s huge codebases, ridiculously inefficient.</p>

<p>Ages ago, precompiled headers were invented, but I’m not really a fan of them. It’s a hackish solution which sometimes helps, but may also hurt, due to the tendency towards including everything in one single “blob” header. Even if that header is precompiled, it still means everything that includes it has to deal with these bloated monolithic symbol tables and other data structures. More importantly, it is a fragile solution, as <a href="http://blogs.msdn.com/vcblog/archive/2009/11/12/visual-c-precompiled-header-errors-on-windows-7.aspx">the VC Team’s own blog shows</a>.</p>

<p>But why can’t this mechanism be generalized?
Why can’t the compiler process every header in isolation, build a complete parse tree of each one, and store those on disk? And then, when the header is included, rather than reading and parsing the header again, simply load this parse tree and merge it into the rest of the compilation unit. Of course, it is easy to come up with cases where the file may have to be parsed differently depending on where it is included, but in 99.9% of all cases, the inclusion mechanism is straightforward and simple: The header is typically not included in the middle of a class definition or from inside a namespace. It usually only reacts to a few fixed macros that may be defined before the header’s inclusion. So <em>most</em> of the time, the header could be precompiled in isolation and reused. And for the few cases where the changed state actually matters, where the header is included in the middle of a function definition or with no include guards or where a macro (say <code>CreateWindow</code>, or a similarly common name, <em>cough cough</em>) mangles the contents of the header, in <em>those</em> cases, the compiler can simply fall back to the traditional source code inclusion and subsequent compilation of the translation unit. Even if these precompilation passes aren’t stored to disk in the manner of precompiled headers, they could still be kept in memory, and reused between translation units during a build. If N <code>.cpp</code> files all include a certain header, it would allow that header to be compiled once, rather than N times.</p>
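
<p>The in-memory variant of this idea can be sketched as a cache keyed on the header’s path plus whatever macro state it reacts to. Everything here is hypothetical: <code>ParsedHeader</code>, <code>HeaderCache</code> and the <code>macro_state</code> string are illustrative names, and a real compiler would store a parse tree rather than a path.</p>

```cpp
#include <map>
#include <string>
#include <utility>

// Stand-in for whatever a compiler would keep for a parsed header
// (hypothetical; a real parse tree is far richer).
struct ParsedHeader {
    std::string path;
};

class HeaderCache {
    typedef std::pair<std::string, std::string> Key;

public:
    // Returns the parsed form of `path` for the given macro state.
    // `macro_state` is a hypothetical summary of the macros the header reacts to.
    const ParsedHeader& get(const std::string& path, const std::string& macro_state) {
        const Key key(path, macro_state);
        std::map<Key, ParsedHeader>::iterator it = cache_.find(key);
        if (it == cache_.end()) {
            ++parse_count_; // the expensive parse happens only on a miss
            ParsedHeader parsed;
            parsed.path = path;
            it = cache_.insert(std::make_pair(key, parsed)).first;
        }
        return it->second;
    }

    // Number of times a header actually had to be parsed.
    int parse_count() const { return parse_count_; }

private:
    std::map<Key, ParsedHeader> cache_;
    int parse_count_ = 0;
};
```

<p>With a cache like this, N translation units asking for the same header under the same macro state would trigger one parse instead of N. A header included in an unusual context would simply bypass the cache and take the traditional path.</p>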

<p>Once again, we have something that feels like a leftover from C. In C, headers were mostly forward declarations and little actual <em>code</em>, so naive processing of headers worked fairly efficiently. In C++, it is getting more and more common to put huge amounts of code in headers, which means that the naive compilation strategy traditionally used for C becomes ridiculously slow and inefficient. Creating a truly <em>general</em> replacement strategy is nearly impossible, true, but it seems like it’d be possible to create a heuristic that’d enable more efficient processing of header files 99% of the time, and which could then fall back to the traditional method of copy/pasting headers into the translation unit for the last percent of cases.</p>

<p>And why does every translation unit have to read every file every time? Can’t their contents be kept in memory, at least for a short time? Those hundreds or thousands of file accesses are painfully slow. Windows already exposes APIs for monitoring file changes, so it should be fairly simple to determine when a source file has been modified, and only then flush it from memory.</p>

<p>And of course, everyone’s favorite nitpick: Why is <code>windows.h</code> so absolutely horrible? Why does it have to be one monolithic header which gives us <em>everything</em> Windows has to offer? Why doesn’t it compile as standard C++? Why does it include so many other headers (as above, slowing down compilation)? Why does it pollute the global namespace with macros for ridiculously common names?</p>

<p>Well, it does, and it’d be silly to expect this to change, due to backwards compatibility concerns.
But why then, is there not a <code>windows.hpp</code> or similar? Why isn’t there a separate cleaned-up, C++-compatible header? One which uses function overloading instead of macros, for example? Or which just defines simple forwarding functions instead of macros? One which compiles even with the non-standard language extensions disabled? Or why isn’t there a <em>set of</em> these headers, allowing us to access the bits of the Windows API we’re interested in, without having to include <em>everything</em>?</p>
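
<p>To make the “overloading instead of macros” idea concrete, here is a minimal sketch. The two <code>create_window_*</code> functions are hypothetical stand-ins for a narrow/wide pair of C API functions (the real ones take many more parameters), and the <code>win</code> namespace is likewise an assumption, not an existing header.</p>

```cpp
// Stand-ins for the narrow and wide variants of a C API function.
// Hypothetical: the real CreateWindowA/CreateWindowW take many more parameters.
inline char create_window_a(const char*) { return 'A'; }
inline char create_window_w(const wchar_t*) { return 'W'; }

namespace win {
    // Overloads replace a `#define CreateWindow CreateWindowA` style macro:
    // the compiler picks the variant from the argument type, and the name
    // is scoped to a namespace instead of rewriting every matching token.
    inline char create_window(const char* title) { return create_window_a(title); }
    inline char create_window(const wchar_t* title) { return create_window_w(title); }
}
```

<p>Calling <code>win::create_window("x")</code> forwards to the narrow variant and <code>win::create_window(L"x")</code> to the wide one, and user code is free to have its own function named <code>CreateWindow</code> without the preprocessor mangling it.</p>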

<p>In short, I think the MSVC IDE (and in some cases the compiler too) is in desperate need of a rethink. Out with those 12-year-old project wizards, which create complex predefined project structures accumulating every bad practice and unexpected project setting in one place. I’ve lost count of how many beginners I’ve seen choke because their tiny little projects automatically get precompiled headers thrown in for absolutely no reason. Out with the idea that C++ can best be presented like C#, as a static language where every piece of code can be understood in isolation. Instead, give us an IDE that treats C++ as a more dynamic language, where many types of information are just not available until the program has been compiled, and which embraces the unique features of C++: one which supports and encourages use of templates, and one which accepts that in modern C++ most code ends up in header files, that these header files become expensive to compile, and that they are therefore an area ripe for optimization. An IDE which treats C++ compilation as an interactive process, where template instantiation can be stepped through and inspected at each stage, and where interactive queries can be made statically or during debugging to inspect not just data, but also types.</p>

<p>Another addition that would really boost the usefulness of MSVC would be facilities for unit-testing template metaprograms. For example, it is common to use metaprograms to force compilation failures if a template is instantiated with a specific type. But how do we test that this works as intended? Give us the hooks and language extensions necessary to specify that “this function is expected to fail to compile, and if it does, that’s not an error, just remove the function and compile the remainder of the file”. Again, consider the compilation process a part of the language: it is something that must be inspected and debugged, and for which we may wish to write tests.</p>

<pre><code>void my_testcase() __declspec(wontcompile){
  // perhaps we want to ensure that the template can not be instantiated with a reference,
  // so we expect this test to fail
  frobnicator&lt;int&amp;&gt; f; 
}

// if the above test fails to compile (as it should), the compiler should simply ignore it, and proceed to compile the other tests, rather than aborting
void next_testcase() {... } 
</code></pre>
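
<p>Short of such a language extension, part of this can be approximated today when the rejection happens in the template’s signature rather than in its body. The following is a sketch under that assumption: <code>frobnicator</code> here is a hypothetical template constrained with <code>std::enable_if</code>, and <code>can_frobnicate</code> and <code>make_void</code> are illustrative helper names.</p>

```cpp
#include <type_traits>

// Hypothetical template that rejects reference types as part of its
// signature, so the rejection is visible to SFINAE.
template <typename T,
          typename = typename std::enable_if<!std::is_reference<T>::value>::type>
struct frobnicator {};

// Maps any well-formed list of types to void.
template <typename...>
struct make_void { typedef void type; };

// Primary template: assume frobnicator<T> cannot be named.
template <typename T, typename = void>
struct can_frobnicate : std::false_type {};

// Chosen only when frobnicator<T> is a valid type.
template <typename T>
struct can_frobnicate<T, typename make_void<frobnicator<T> >::type> : std::true_type {};

// The "expected to fail" case becomes an ordinary compile-time check.
static_assert(can_frobnicate<int>::value, "int should be accepted");
static_assert(!can_frobnicate<int&>::value, "int& should be rejected");
```

<p>This does not cover the general case: if the template rejects a type through a <code>static_assert</code> or an error deep inside its body, naming it is a hard error and the trick above cannot observe it. That gap is exactly where compiler support would be needed.</p>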

<p>Target your IDE at Modern C++, rather than C with classes. Impress the world by being the first IDE to even think about this. Embrace, and provide support for, the changes that have happened in the C++ language, in best practices and in the mindset of the C++ community. Accept that yes, headers are ridiculously heavy these days, and blindly recompiling them for every translation unit doesn’t scale. Accept that the C++ language needs IDE support to inspect what happens in our compile-time metaprograms as well as at runtime. And face up to the fact that traditional intellisense is a lost cause. There is no way to statically produce all the information we expect from intellisense. Some can be improvised by various heuristics, or as in the template function case, by assuming some suitable dummy value for the function’s template parameters, but others may be nearly impossible to provide until the program has been compiled. So perhaps you need to think beyond Intellisense to provide this information to the programmer. Perhaps an “interactive mode” would be more suitable. If the IDE can’t provide the information I need automatically, it could at least allow me to query for the information. Perhaps it can’t tell me anything about the template type <code>T</code>, but why can’t I tell it to assume that <code>T</code> is a <code>std::wstring</code>, and provide information based on this assumption? Or something else entirely. You already have a pretty good C++ compiler. It’s time to start working on a C++ IDE, and call it a day on the C IDE you’ve been polishing until now.</p>

<p>So dear MSVC team, in case you can’t think of anything useful to do with your time in the year 2010 (as if… I know you’ve got C++0x support to work on, and that’s infinitely more important to me than IDE improvements), here’s a new year’s resolution for you: Amaze the world, by showing what a C++ IDE <strong>should</strong> work like. Reinvent the role of the C++ IDE, instead of trying to force the C# or C IDE to work for C++ as well.</p>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2009/12/hopes-for-2010-microsoft-visual-c/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Adventures in Microoptimizations</title>
		<link>http://jalf.dk/blog/2009/12/adventures-in-microoptimizations/</link>
		<comments>http://jalf.dk/blog/2009/12/adventures-in-microoptimizations/#comments</comments>
		<pubDate>Sun, 20 Dec 2009 07:10:49 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[assembly]]></category>
		<category><![CDATA[cpu]]></category>
		<category><![CDATA[low-level]]></category>
		<category><![CDATA[optimization]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=425</guid>
		<description><![CDATA[A friend recently asked me for “the simplest optimization problem I could think of”. This led to a fun discussion of low-level optimization and how the CPU executes your code. And so I decided to share it here. Let’s make it clear though, that the following will have very little practical use. We’re not just [...]]]></description>
			<content:encoded><![CDATA[<p>A friend recently asked me for “the simplest optimization problem I could think of”. This led to a fun discussion of low-level optimization and how the CPU executes your code. And so I decided to share it here.<span id="more-425"></span></p>

<p>Let’s make it clear though, that the following will have very little practical use. We’re not just into “it doesn’t make a measurable difference” territory, but also deep into “the compiler will do this for you”. So please, don’t try to apply these “optimizations” to your real-world code to save a clock cycle.</p>

<p>This is merely intended as a thought experiment, illustrating some of the factors that make performance so difficult to predict. And now, with that disclaimer in place, let’s get on with it:</p>

<h1>The problem</h1>

<p>The “problem” I came up with was the evaluation of <code>x+x+x+x</code>. This was the simplest snippet of code I could think of for which optimization is possible. For the sake of this discussion, let us assume that <code>x</code> is an integer.</p>

<p>A naive compiler will evaluate this code as <code>((x+x)+x)+x</code>. In other words, it will evaluate one addition, feed that result to the second addition, and then finally feed the result of that to the third addition.</p>

<h1>Optimization #1</h1>

<p>The optimization I suggested was to evaluate it as <code>(x+x)+(x+x)</code> instead. And why is this faster?
A modern CPU is superscalar: that is, it is able to execute multiple instructions every clock cycle. Depending on the CPU model, it can probably execute three or four instructions from the same thread every cycle.</p>

<p>So where the original version would take three times the duration of an <code>add</code> instruction to execute, my optimization can be done in two times the duration: Both the initial subexpressions can be evaluated <em>in parallel</em>. And so, after only the duration of <em>one</em> <code>add</code> instruction, we’ve got the result of two of the additions, and can perform the third and final one. So in this very simple case, we actually reduced the run time by 33%. Not bad, eh?</p>
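
<p>Written out as code, the two groupings look like this. The function names are made up for illustration, and as noted earlier, a real compiler is free to turn both into the same instruction.</p>

```cpp
// Left-to-right evaluation: each addition needs the previous result,
// so the dependency chain is three additions long.
inline int sum_chain(int x) {
    return ((x + x) + x) + x;
}

// Regrouped: `left` and `right` do not depend on each other, so a
// superscalar CPU can compute them in the same cycle. The dependency
// chain is only two additions long.
inline int sum_tree(int x) {
    int left = x + x;
    int right = x + x;
    return left + right;
}
```

<p>Both functions return the same value for any <code>x</code> that does not overflow; only the shape of the dependency chain differs.</p>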

<h1>Optimization #2</h1>

<p>My friend then asked if <code>x*4</code> would be an optimization as well. And now it gets a bit more interesting.</p>

<p>First, of course, <code>x*4</code> is just a single multiplication. Is that faster than three additions? Is it faster than two additions (which is the time it’d take for my “optimized” version to run)?</p>

<p>That depends on the speed of a multiplication instruction. A moment’s research tells us that on common CPUs, <code>add</code> has a latency of 1 cycle and <code>mul</code> has a latency of 3 cycles<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup>, so the multiplication takes as long as the original unoptimized version.</p>
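
<p>For reference, here is the multiplication variant next to the shift mentioned in the footnote. This is only a sketch with made-up function names; it uses <code>unsigned</code> so that the shift is well defined for every input, and whether the compiler actually emits a multiply is up to the compiler.</p>

```cpp
// One multiply instruction, if the compiler emits it literally.
inline unsigned times_four_mul(unsigned x) {
    return x * 4u;
}

// The same value via a left shift by two bits.
inline unsigned times_four_shift(unsigned x) {
    return x << 2;
}
```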

<h1>Evaluation</h1>

<p>So what does this mean? That at a glance, optimization #1 is faster than #2, certainly. #1 yields a result after two clock cycles, where #2 takes a whopping <em>three</em> cycles.</p>

<p>But there are other factors at play. Sometimes the multiplication version may be more efficient. The CPU has a limited number of execution units. It can also only decode a limited number of instructions at a time.</p>

<p>The version using addition requires three instructions to be decoded, and uses two execution units during the first cycle, and one unit in the second. All in all, we’re occupying three “execution-unit cycles”. The version using multiplication does take three cycles, but because modern CPUs are pipelined, it only occupies the execution unit during the first cycle. In the second cycle, the execution unit is able to begin on a new instruction, while continuing to process the <code>mul</code> instruction. So this version only requires one “execution-unit cycle”. In other words, we’ll free up other execution units so they can execute other instructions. We’re also freeing the front-end from having to decode three instructions.</p>

<p>So we now know that:</p>

<ul>
<li>If we need the result as soon as possible, the optimized <code>add</code> version will be more efficient because it finishes sooner.</li>
<li>But if we need to execute a lot of other instructions as well, the <code>mul</code> version will be more efficient because it uses fewer hardware resources on the CPU.</li>
</ul>

<p>What if we have a lot of instructions we want to execute <em>and</em> we need the result soon? Or if we only have these instructions to execute, and we don’t care about when we’ll get the result (perhaps the next operation is to add the result to that of an ongoing division, which is <em>very</em> slow, so it won’t matter if we take 2, 3 or 15 cycles to get ready)? Hard to say. Either one may be preferable.</p>

<p>Of course on x86 CPUs we also have to take the variable instruction length into account. How many bytes does a <code>mul</code> instruction take? What about three <code>add</code>s? That affects both how much data has to be read from memory and how much space will be taken up in CPU cache, and so that should be taken into account as well.</p>

<p>So what can we learn from this? Mainly that performance is nontrivial. Never assume that you can tell whether some code is “fast” or “slow”. And be especially careful with assumptions about how it can be improved. It is very possible that your “optimization” will actually run slower.</p>

<p>Whenever you optimize code, do as the <a href="http://blogs.msdn.com/ricom/archive/2003/12/02/40779.aspx">pros</a>: <a href="http://blogs.msdn.com/ricom/archive/2007/06/13/partly-sunny-chance-of-showers-bring-an-umbrella.aspx"><em>measure, measure and measure</em></a>. Measure the speed of the original code. Measure the result of the optimized code. Be careful with the many ways in which your measurement can be invalidated (by the compiler optimizing away the code you wanted to test, or by the CPU cache changing the result in your test case from what you’d expect in the real world by caching — or not caching — the data you’re operating on).</p>
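
<p>A measurement harness for the two variants could look roughly like the following. It is a minimal sketch, not a rigorous benchmark: <code>time_ns</code>, <code>add_version</code> and <code>mul_version</code> are illustrative names, and the <code>volatile</code> sink is one common way to keep the compiler from discarding the work entirely.</p>

```cpp
#include <chrono>
#include <cstdint>

// Calls f(i) for i in [0, iterations) and returns the elapsed time in nanoseconds.
template <typename F>
std::int64_t time_ns(F f, int iterations) {
    volatile unsigned sink = 0; // volatile stores cannot be optimized away
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        sink = f(static_cast<unsigned>(i));
    }
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// The two candidates being compared.
inline unsigned add_version(unsigned x) { return (x + x) + (x + x); }
inline unsigned mul_version(unsigned x) { return x * 4u; }
```

<p>Calling <code>time_ns(add_version, 100000000)</code> and <code>time_ns(mul_version, 100000000)</code> and comparing the results will, on most compilers, show no difference at all, because both functions are likely to compile to the same instruction. That is the point of the disclaimer at the top: check the generated assembly before trusting any number.</p>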

<p>And when performing low-level optimizations, another vital piece of advice is to <em>understand the hardware</em>. Know which instructions are being executed, know the cost of instructions on the relevant hardware, and know what other tricks the hardware uses. Your CPU is probably superscalar and pipelined, and processes instructions out of order. It probably also has a cache of a certain size, with a specific cache line size, and a certain associativity. It has a fixed number of execution units, a known pipeline length and so on. And while we’re at it, the memory subsystem matters too. How long does it take to access RAM? How can the CPU reorder reads and writes? What is its policy for writes? When are they pushed from cache to RAM? If you want to optimize your code on the instruction level, you <em>need</em> to know your CPU. Even the simplest code is affected by dozens of such factors, any of which might make a difference.</p>

<div class="footnotes">
<hr />
<ol>

<li id="fn:1">
<p>For the sake of this example, let us assume that simple multiplication and addition instructions are used. Some CPUs may have more complex instructions that, for example, can perform the multiplication faster if the second operand is a power of two. And of course we could implement the multiplication as <code>x &lt;&lt; 2</code> too. <a href="#fnref:1" rev="footnote">↩</a></p>
</li>

</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2009/12/adventures-in-microoptimizations/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Houston, we have a (performance) problem</title>
		<link>http://jalf.dk/blog/2009/12/houston-we-have-a-performance-problem/</link>
		<comments>http://jalf.dk/blog/2009/12/houston-we-have-a-performance-problem/#comments</comments>
		<pubDate>Tue, 15 Dec 2009 13:49:43 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[stm]]></category>
		<category><![CDATA[thesis]]></category>
		<category><![CDATA[transactional-memory]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=403</guid>
		<description><![CDATA[Ouch. These last few days, I’ve been fixing a few lingering bugs in my STM system, and last night, I finally nailed them. Specifically, it is now possible to open variables within a transaction as read-only. An obvious optimization, right? At least that’s the idea. Less work is required by the STM system if we [...]]]></description>
			<content:encoded><![CDATA[<p>Ouch. These last few days, I’ve been fixing a few lingering bugs in my STM system, and last night, I finally nailed them. Specifically, it is now possible to open variables within a transaction as <em>read-only</em>. An obvious optimization, right? At least that’s the idea. Less work is required by the STM system if we can trust that the variable isn’t modified by this transaction.
<span id="more-403"></span></p>

<p>Well, my test case for this feature now takes <em>ages</em> to run. As I mentioned previously, a simple transaction modifying two integer variables under heavy contention can pull off almost two million transactions per second on my laptop.</p>

<p>My new test, in which each thread takes four variables and alternates between modifying two of them and reading the other two, runs perhaps ten thousand (!) times slower.</p>

<p>Of course I have several leads on how to fix this. The problem is largely all the performance-related “extras” I’ve been leaving out. For example, if a transaction fails to acquire a variable it needs, it simply aborts and immediately retries. In many cases, a better approach would be to block the thread, waiting for that variable to actually become available.</p>

<p>There are several other cases where I have a similar problem: I have to choose between delaying the thread for a moment with <code>sleep()</code> before attempting to continue, blocking it until some condition is true, or aborting the transaction entirely and starting over from scratch. At the moment, I generally just pick the easiest solution: typically abort, and <em>occasionally</em> call <code>sleep()</code> a few times before resorting to that. Again, implementing some actual meaningful policies here would make a big difference. And tweaking these policies should help still more.</p>
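
<p>One way to make that choice tweakable is to pull it out into a small policy object. The following is a sketch only, not code from the STM library: <code>retry_policy</code> and <code>acquire_or_give_up</code> are hypothetical names, and <code>try_acquire</code> stands for whatever operation attempts to take ownership of a variable.</p>

```cpp
#include <chrono>
#include <thread>

// Hypothetical knobs for "how hard do we try before aborting".
struct retry_policy {
    int max_attempts;                // give up after this many failed attempts
    std::chrono::microseconds delay; // pause between attempts; zero means "just yield"
};

// Repeatedly calls try_acquire() until it succeeds or the policy is exhausted.
// Returns false when the caller should roll back and restart the transaction.
template <typename TryAcquire>
bool acquire_or_give_up(TryAcquire try_acquire, const retry_policy& policy) {
    for (int attempt = 0; attempt < policy.max_attempts; ++attempt) {
        if (try_acquire()) {
            return true;
        }
        if (policy.delay.count() > 0) {
            std::this_thread::sleep_for(policy.delay);
        } else {
            std::this_thread::yield(); // let another thread run without a timed sleep
        }
    }
    return false;
}
```

<p>Keeping the delay as a parameter means the same call site can be tested with a timed sleep, a plain yield, or no retries at all (<code>max_attempts</code> of 1), which makes it easier to measure which policy actually helps.</p>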

<p>Another problem is that currently, I do not enforce a consistent global order when acquiring objects during a commit. This means I risk livelocks, again causing excessive rollbacks when multiple threads are competing over access to the same variables.</p>
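
<p>A common way to get a consistent global order is to sort the things being acquired by address before taking them. The sketch below assumes, purely for illustration, that each variable is guarded by a <code>std::mutex</code>; the STM system itself may well use a different ownership mechanism, and both function names are made up.</p>

```cpp
#include <algorithm>
#include <functional>
#include <mutex>
#include <vector>

// Locks every mutex in `locks` in ascending address order, so that any two
// threads competing for the same set always acquire them in the same sequence.
inline void lock_in_global_order(std::vector<std::mutex*>& locks) {
    // std::less gives a total order over pointers, even unrelated ones.
    std::sort(locks.begin(), locks.end(), std::less<std::mutex*>());
    // The same variable may have been listed twice; lock it only once.
    locks.erase(std::unique(locks.begin(), locks.end()), locks.end());
    for (std::mutex* m : locks) {
        m->lock();
    }
}

// Releases the mutexes in reverse order of acquisition.
inline void unlock_all(std::vector<std::mutex*>& locks) {
    for (auto it = locks.rbegin(); it != locks.rend(); ++it) {
        (*it)->unlock();
    }
}
```

<p>With every committer following the same order, two transactions can no longer each hold one variable while waiting for the other’s, which removes one source of repeated mutual rollbacks.</p>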

<p>So I’m still optimistic. It should be possible to get performance back on track. But man, it’s depressing watching performance plummet like this.</p>

<p><strong>Edit</strong><br />
And an update. After poking around a bit, it turned out that most of the time was being spent sleeping. When a transaction attempts to commit, if it cannot acquire all the variables it needs, it retries a few times with a short delay (a couple of milliseconds) in between. If it doesn’t succeed after a few tries, it rolls back the entire transaction and starts over.</p>

<p>It turned out that these few, short <code>sleep()</code> calls brought CPU utilization down to something like 0.01%, and totally destroyed performance. Simply turning the <code>sleep()</code> call into a <em>no-op</em> brought me back to something more or less reasonable. I still need to improve on the above shortcomings, but now at least I can run my tests in less than an hour.</p>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2009/12/houston-we-have-a-performance-problem/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using My STM Library</title>
		<link>http://jalf.dk/blog/2009/11/using-my-stm-library/</link>
		<comments>http://jalf.dk/blog/2009/11/using-my-stm-library/#comments</comments>
		<pubDate>Mon, 30 Nov 2009 12:58:22 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[stm]]></category>
		<category><![CDATA[thesis]]></category>
		<category><![CDATA[transactional-memory]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=362</guid>
		<description><![CDATA[As promised yesterday, I’d like to show off a few bits of my STM library. Of course it’s far from done, and is still missing several key features, but the core library is in pretty good shape. So as they say on the internets, “my STM library, let me show you it” In the following, [...]]]></description>
			<content:encoded><![CDATA[<p>As promised yesterday, I’d like to show off a few bits of my STM library. Of course it’s far from done, and is still missing several key features, but the core library is in pretty good shape. So as they say on the internets, <em>“my STM library, let me show you it”</em><span id="more-362"></span></p>

<p>In the following, I’ll show a slightly modified version of one of my test cases. It shows how to call and use my library, from the user’s point of view. The test spawns a number of threads, which each wait on a barrier (because if one thread was allowed to run while another was being constructed, I’d get fewer concurrent transactions, and my test would be less likely to uncover race conditions), and then perform a fixed number of transactions. In each transaction, two transactional variables are opened for writing, one is decremented and the other incremented. If the sum of these variables is nonzero in any iteration, the thread registers a failure.</p>

<p>And if, after all the threads have terminated, the value of each of these variables is not what was expected, a failure is registered as well.</p>

<p>So from a testing point of view, there should be plenty of opportunity for things to go wrong. Just a single small race condition somewhere, and <em>one</em> of the many millions of reads would be inconsistent and the test would fail.</p>

<pre><code>#include &lt;stm.hpp&gt; // my STM library

#include &lt;boost/test/unit_test.hpp&gt; // Boost.Test is used to supply a unit-testing framework
#include &lt;boost/thread.hpp&gt; // Boost.Thread is used as a threading API
#include &lt;algorithm&gt; // std::fill

// define the number of threads to run, and the number of iterations for each
enum { thread_count = 8, iterations = 200000 }; 

// The following are transactional variables. The shared template ensures that the contained value
// can only be accessed as part of a transaction, and provides the necessary metadata
// for checking validity and consistency
// Two such integers are created, both initialized to zero
stm::shared&lt;int&gt; val1(0);
stm::shared&lt;int&gt; val2(0); 

BOOST_AUTO_TEST_SUITE( threads ) // define a test suite named "threads"

// this function defines the body of our transaction. 
// We're passed a transaction, which can be used to open any "shared" variables
bool tx_func(stm::transaction&amp; tx){
    // open both variables for writing
    int&amp; a = val1.open_rw(tx);
    int&amp; b = val2.open_rw(tx);

    // modify the variables freely
    --a;
    ++b;

    return a + b == 0; // Our transaction returns a bool. Other return types (or void) are also supported
}

// this class defines a thread. operator() is called as the thread's entry point
struct thread_functor{
    // In the constructor, the thread object is given a barrier it can synchronize on,
    // and a reference where it can write its result (success/failure)
    thread_functor(boost::barrier&amp; bar, int&amp; res) : bar(bar), res(res) {}

    void operator()(){
        // when the thread is first created, we wait for the barrier
        // This ensures that no transactions are running until all threads have been constructed
        bar.wait(); 
        for (int i = 0; i &lt; iterations; ++i){
            // for each iteration, pass our transaction function to the "atomic" function, which executes it atomically.
            // To get a non-void return type, we have to specify the template parameter explicitly 
            // (this can be avoided in C++0x using the return_of template to deduce the return type implicitly)

            // depending on the return value of the transaction, write success or failure back
            res = (res != 0) &amp;&amp; stm::atomic&lt;bool&gt;(tx_func) ? 1 : 0;
        }
    }

    boost::barrier&amp; bar;
    int&amp; res;
};

// another transaction, to be executed after our helper threads terminate, 
// to verify that the right number of modifications have occurred
// note that here variables are opened for reading only
void verify(stm::transaction&amp; tx) {
    const int&amp; a = val1.open_r(tx);
    const int&amp; b = val2.open_r(tx);

    BOOST_CHECK_EQUAL(-a, thread_count * iterations);
    BOOST_CHECK_EQUAL(b, thread_count * iterations);
}

// finally, we get to our test case itself
BOOST_AUTO_TEST_CASE ( short_concurrent_transactions )
{
    boost::barrier bar(thread_count);
    boost::thread_group gr;
    int res[thread_count]; // array of results
    // set all the results to an initial true/1 value (since each iteration "and"s it together with the current result)
    std::fill(res, res+thread_count, 1); 

    for (int i = 0; i &lt; thread_count; ++i){
        gr.create_thread(thread_functor(bar, res[i])); // create the threads, passing the necessary parameters to each
    }

    gr.join_all(); // wait for all threads to terminate

    // verify that each thread returned success
    for (int i = 0; i &lt; thread_count; ++i){
        BOOST_CHECK_EQUAL(res[i], 1);
    }

    // run a final transaction to access both variables and check their final values
    stm::atomic(verify);
}

BOOST_AUTO_TEST_SUITE_END()
</code></pre>

<p>In the above, I used a function object to represent threads, and a regular function to represent transactions. Of course in both cases, either would work — a function object would potentially be more efficient as it is easier for the compiler to inline, but I used a function for brevity.</p>

<p>In C++0x, of course, lambdas could also have been used in both cases.</p>

<p>One of my design goals has been to make basic usage as simple and intuitive as possible, and I think I’ve succeeded so far. Any C++ programmer who is familiar with the STL algorithms or the Boost libraries, should find my library’s interface very straightforward. Note especially that all the transaction “magic”, of verifying validity and retrying transactions as needed, is completely invisible to the user. You simply define a function expressing what your transaction should do, and pass it to the <code>atomic</code> function.</p>

<p>In its current version, this test is able to execute around 1,800,000 transactions per second on my Core Duo 2GHz laptop. (Of course, with transactions as small as these, opening only two variables each, performance is a lot better than it would be in real-world transactions.)</p>

<p>So that’s it for now. Of course I’ve got a few more user-facing features in the pipeline<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup>, and a <em>lot</em> of backend changes, but the basic functionality is there, and I’m pretty happy with it so far.</p>

<div class="footnotes">
<hr />
<ol>

<li id="fn:1">
<p>It should probably be possible to specify that the transaction should <em>not</em> automatically retry if it fails to commit, and instead abort with an exception. It should also be possible to define nested transactions, and to use operations such as the <em>OrElse</em> and <em>Retry</em> primitives introduced in <a href="http://www.haskell.org/haskellwiki/Software_transactional_memory">Haskell STM</a>. <a href="#fnref:1" rev="footnote">↩</a></p>
</li>

</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2009/11/using-my-stm-library/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
