<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments for jalf.dk</title>
	<atom:link href="http://jalf.dk/blog/comments/feed/" rel="self" type="application/rss+xml" />
	<link>http://jalf.dk/blog</link>
	<description>Musings and thoughts on programming and other geeky stuff</description>
	<lastBuildDate>Sun, 14 Feb 2010 02:35:02 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>Comment on Privacy: Or why I don’t trust Google with my personal information by Craig</title>
		<link>http://jalf.dk/blog/2010/02/privacy-or-why-i-dont-trust-google-with-my-personal-information/comment-page-1/#comment-557</link>
		<dc:creator>Craig</dc:creator>
		<pubDate>Sun, 14 Feb 2010 02:35:02 +0000</pubDate>
		<guid isPermaLink="false">http://jalf.dk/blog/?p=506#comment-557</guid>
		<description>&lt;p&gt;So what&#039;s a good Gmail alternative then?&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>So what’s a good Gmail alternative then?</p>]]></content:encoded>
	</item>
	<item>
		<title>Comment on A .NET Developers Guide to C++ (part II) by Boris</title>
		<link>http://jalf.dk/blog/2009/09/a-net-developers-guide-to-c-part-ii/comment-page-1/#comment-430</link>
		<dc:creator>Boris</dc:creator>
		<pubDate>Sat, 30 Jan 2010 17:28:33 +0000</pubDate>
		<guid isPermaLink="false">http://jalf.dk/blog/?p=316#comment-430</guid>
		<description>&lt;p&gt;This is a great series. Your writing style is very direct, and reading this article feels like having a friend who knows me very well giving me a crash course in C++.&lt;/p&gt;

&lt;p&gt;My profession has made me use C# a whole lot, and I&#039;ve forgotten most of what I had learned in C++, so this article makes for a great refresher for me.&lt;/p&gt;

&lt;p&gt;Thanks.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>This is a great series. Your writing style is very direct, and reading this article feels like having a friend who knows me very well giving me a crash course in C++.</p>

<p>My profession has made me use C# a whole lot, and I’ve forgotten most of what I had learned in C++, so this article makes for a great refresher for me.</p>

<p>Thanks.</p>]]></content:encoded>
	</item>
	<item>
		<title>Comment on The downside to “dogfooding” by Kosi2801</title>
		<link>http://jalf.dk/blog/2010/01/the-downside-to-dogfooding/comment-page-1/#comment-406</link>
		<dc:creator>Kosi2801</dc:creator>
		<pubDate>Mon, 25 Jan 2010 17:14:54 +0000</pubDate>
		<guid isPermaLink="false">http://jalf.dk/blog/?p=433#comment-406</guid>
		<description>&lt;p&gt;Eating your own dogfood already has quite a long history. I remember reading about it years ago on Joel Spolsky&#039;s blog (http://www.joelonsoftware.com/articles/fog0000000012.html) and also on Jeff Atwood&#039;s blog (http://www.codinghorror.com/blog/archives/000287.html).&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Eating your own dogfood already has quite a long history. I remember reading about it years ago on Joel Spolsky’s blog (<a href="http://www.joelonsoftware.com/articles/fog0000000012.html" rel="nofollow">http://www.joelonsoftware.com/articles/fog0000000012.html</a>) and also on Jeff Atwood’s blog (<a href="http://www.codinghorror.com/blog/archives/000287.html" rel="nofollow">http://www.codinghorror.com/blog/archives/000287.html</a>).</p>]]></content:encoded>
	</item>
	<item>
		<title>Comment on The downside to “dogfooding” by micah.cowan.name &#187; Downsides to Dogfooding</title>
		<link>http://jalf.dk/blog/2010/01/the-downside-to-dogfooding/comment-page-1/#comment-396</link>
		<dc:creator>micah.cowan.name &#187; Downsides to Dogfooding</dc:creator>
		<pubDate>Thu, 21 Jan 2010 23:54:27 +0000</pubDate>
		<guid isPermaLink="false">http://jalf.dk/blog/?p=433#comment-396</guid>
		<description>&lt;p&gt;[...] had some interesting things to say against the development paradigm of dogfooding, using this related article at jalf.dk as a springboard. I have to say, there are some excellent [...]&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>[…] had some interesting things to say against the development paradigm of dogfooding, using this related article at jalf.dk as a springboard. I have to say, there are some excellent […]</p>]]></content:encoded>
	</item>
	<item>
		<title>Comment on The meaning of RAII — or why you never need to worry about resource management again by jalf</title>
		<link>http://jalf.dk/blog/2010/01/the-meaning-of-raii-or-why-you-never-need-to-worry-about-resource-management-again/comment-page-1/#comment-382</link>
		<dc:creator>jalf</dc:creator>
		<pubDate>Tue, 19 Jan 2010 02:57:58 +0000</pubDate>
		<guid isPermaLink="false">http://jalf.dk/blog/?p=340#comment-382</guid>
		<description>&lt;p&gt;They do roughly the same thing, yes, but in very different ways.&lt;/p&gt;

&lt;p&gt;Consider that a reference counter has to be updated &lt;em&gt;every&lt;/em&gt; time a reference is created or deleted. If a smart pointer points to object &lt;code&gt;a&lt;/code&gt;, and you set it to point to &lt;code&gt;b&lt;/code&gt; instead, you have to update the reference counters for &lt;em&gt;both&lt;/em&gt; objects. And every update has to be done atomically to ensure thread safety as well. That makes it still more costly. When you add it up like that, it&#039;s actually quite a few CPU cycles that are thrown away updating reference counters.&lt;/p&gt;

&lt;p&gt;By comparison, a garbage collector doesn&#039;t have to do &lt;em&gt;anything&lt;/em&gt; when references are created, modified or destroyed. It only has to step in when the heap has been filled so much that a memory allocation fails. Once that happens, it has to traverse the graph of live objects, marking each as in use. Dead (unreachable) objects are never even touched by the GC, so they&#039;re effectively free. Traversing this graph does take a bit of time, as does the heap compaction that typically follows. But because it happens so rarely compared to ref counting, it&#039;s still vastly cheaper overall.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>They do roughly the same thing, yes, but in very different ways.</p>

<p>Consider that a reference counter has to be updated <em>every</em> time a reference is created or deleted. If a smart pointer points to object <code>a</code>, and you set it to point to <code>b</code> instead, you have to update the reference counters for <em>both</em> objects. And every update has to be done atomically to ensure thread safety as well. That makes it still more costly. When you add it up like that, it’s actually quite a few CPU cycles that are thrown away updating reference counters.</p>

<p>By comparison, a garbage collector doesn’t have to do <em>anything</em> when references are created, modified or destroyed. It only has to step in when the heap has been filled so much that a memory allocation fails. Once that happens, it has to traverse the graph of live objects, marking each as in use. Dead (unreachable) objects are never even touched by the GC, so they’re effectively free. Traversing this graph does take a bit of time, as does the heap compaction that typically follows. But because it happens so rarely compared to ref counting, it’s still vastly cheaper overall.</p>]]></content:encoded>
	</item>
	<item>
		<title>Comment on The meaning of RAII — or why you never need to worry about resource management again by sheepsimulator</title>
		<link>http://jalf.dk/blog/2010/01/the-meaning-of-raii-or-why-you-never-need-to-worry-about-resource-management-again/comment-page-1/#comment-381</link>
		<dc:creator>sheepsimulator</dc:creator>
		<pubDate>Mon, 18 Jan 2010 18:59:44 +0000</pubDate>
		<guid isPermaLink="false">http://jalf.dk/blog/?p=340#comment-381</guid>
		<description>&lt;p&gt;GREAT ARTICLE!  I felt like it was directed &lt;em&gt;exactly&lt;/em&gt; where I am at in my journey to learn more about OOP.  I think I know how to talk about RAII better: resources are mapped to objects.&lt;/p&gt;

&lt;p&gt;Didn&#039;t know that garbage collectors were less resource-intensive than shared_ptrs; I thought they were about the same, since they did about the same thing.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>GREAT ARTICLE!  I felt like it was directed <em>exactly</em> where I am at in my journey to learn more about OOP.  I think I know how to talk about RAII better: resources are mapped to objects.</p>

<p>Didn’t know that garbage collectors were less resource-intensive than shared_ptrs; I thought they were about the same, since they did about the same thing.</p>]]></content:encoded>
	</item>
	<item>
		<title>Comment on The meaning of RAII — or why you never need to worry about resource management again by RAII is not Clear? &#171; C++ Soup!</title>
		<link>http://jalf.dk/blog/2010/01/the-meaning-of-raii-or-why-you-never-need-to-worry-about-resource-management-again/comment-page-1/#comment-350</link>
		<dc:creator>RAII is not Clear? &#171; C++ Soup!</dc:creator>
		<pubDate>Sun, 03 Jan 2010 16:44:48 +0000</pubDate>
		<guid isPermaLink="false">http://jalf.dk/blog/?p=340#comment-350</guid>
		<description>&lt;p&gt;[...] still be confused by this idiom. One thing I read while keeping tabs on the web for C++ articles is this one from [...]&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>[…] still be confused by this idiom. One thing I read while keeping tabs on the web for C++ articles is this one from […]</p>]]></content:encoded>
	</item>
	<item>
		<title>Comment on Adventures in Microoptimizations by jalf</title>
		<link>http://jalf.dk/blog/2009/12/adventures-in-microoptimizations/comment-page-1/#comment-331</link>
		<dc:creator>jalf</dc:creator>
		<pubDate>Sun, 27 Dec 2009 02:52:35 +0000</pubDate>
		<guid isPermaLink="false">http://jalf.dk/blog/?p=425#comment-331</guid>
		<description>&lt;p&gt;You&#039;re right, cache layout is generally easier to reason about, and &lt;em&gt;very&lt;/em&gt; important performance-wise. I did a project at university a couple of years ago where we saw a 2x speedup simply from changing from column- to row-major traversal of a 2D array. It&#039;s definitely the first thing to check if you need to squeeze better performance out of your code. (partly because much of the ASM hackery can be done by the compiler, but it can&#039;t do much about the cache layout)&lt;/p&gt;

&lt;p&gt;I didn&#039;t really intend this post to be a guide to &quot;useful&quot; optimizations though. It&#039;s just intended to highlight some of the quirks and complexities of low-level optimization. If anything, it should be taken as a warning that trying to optimize by fiddling with individual ASM instructions is going to cause a lot of headaches, and you&#039;ll see some unexpected results in many cases.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>You’re right, cache layout is generally easier to reason about, and <em>very</em> important performance-wise. I did a project at university a couple of years ago where we saw a 2x speedup simply from changing from column- to row-major traversal of a 2D array. It’s definitely the first thing to check if you need to squeeze better performance out of your code. (partly because much of the ASM hackery can be done by the compiler, but it can’t do much about the cache layout)</p>

<p>I didn’t really intend this post to be a guide to “useful” optimizations though. It’s just intended to highlight some of the quirks and complexities of low-level optimization. If anything, it should be taken as a warning that trying to optimize by fiddling with individual ASM instructions is going to cause a lot of headaches, and you’ll see some unexpected results in many cases.</p>]]></content:encoded>
	</item>
	<item>
		<title>Comment on Adventures in Microoptimizations by Ben Karel</title>
		<link>http://jalf.dk/blog/2009/12/adventures-in-microoptimizations/comment-page-1/#comment-313</link>
		<dc:creator>Ben Karel</dc:creator>
		<pubDate>Sun, 20 Dec 2009 23:28:36 +0000</pubDate>
		<guid isPermaLink="false">http://jalf.dk/blog/?p=425#comment-313</guid>
		<description>&lt;p&gt;But I thought the point was not to make assumptions in the first place? ;-)&lt;/p&gt;

&lt;p&gt;I agree that remembering what the combination of OoO and superscalar can do is important, and easy to forget. Witness Mike Pall and LuaJIT2 for the payoff...&lt;/p&gt;

&lt;p&gt;All things considered, cache layout is probably the most practical performance-related &quot;thing&quot; to keep in mind for most programmers. It&#039;s much easier to reason about memory access patterns than asm dependency chains. But of course I wish I had measurements to back my intuition up.&lt;/p&gt;

&lt;p&gt;My favorite example of the effect of a (relatively) obscure hardware structure on code is that inline caches make more sense than C++ style vtbls on a processor without a BTB, but the BTB helps the vtbl more than the inline cache. So modern processors are, in effect, optimized for C++-style virtual dispatch!&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>But I thought the point was not to make assumptions in the first place? ;-)</p>

<p>I agree that remembering what the combination of OoO and superscalar can do is important, and easy to forget. Witness Mike Pall and LuaJIT2 for the payoff…</p>

<p>All things considered, cache layout is probably the most practical performance-related “thing” to keep in mind for most programmers. It’s much easier to reason about memory access patterns than asm dependency chains. But of course I wish I had measurements to back my intuition up.</p>

<p>My favorite example of the effect of a (relatively) obscure hardware structure on code is that inline caches make more sense than C++ style vtbls on a processor without a BTB, but the BTB helps the vtbl more than the inline cache. So modern processors are, in effect, optimized for C++-style virtual dispatch!</p>]]></content:encoded>
	</item>
	<item>
		<title>Comment on Adventures in Microoptimizations by jalf</title>
		<link>http://jalf.dk/blog/2009/12/adventures-in-microoptimizations/comment-page-1/#comment-312</link>
		<dc:creator>jalf</dc:creator>
		<pubDate>Sun, 20 Dec 2009 21:51:12 +0000</pubDate>
		<guid isPermaLink="false">http://jalf.dk/blog/?p=425#comment-312</guid>
		<description>&lt;p&gt;Thanks for the comment! :)&lt;/p&gt;

&lt;p&gt;True, but there are still areas where such micro-optimizations may be useful. First, if you know what the CPU actually does to your code, you can make some more general assumptions about what is fast and what isn&#039;t. Taking the example in my post above, the exact latency might vary, but I doubt you&#039;ll find a CPU where integer multiplication has less latency than addition. So we can still reason about what is preferable inside a tight loop where the combined latency is what&#039;s holding us back. Or knowing that almost every modern CPU is superscalar, we can determine that a higher instruction count isn&#039;t necessarily worse for performance. We can try to split up our dependencies so that subexpressions can be evaluated in parallel.&lt;/p&gt;

&lt;p&gt;But of course you&#039;re right, at a certain point we get down to the really CPU-specific stuff, and that&#039;s probably dangerous to rely on, unless you know the exact hardware configuration on which the program is going to run. (You might be targeting a PlayStation 3 specifically, for example, and then you don&#039;t care that your optimizations wouldn&#039;t work on other CPUs.)&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Thanks for the comment! :)</p>

<p>True, but there are still areas where such micro-optimizations may be useful. First, if you know what the CPU actually does to your code, you can make some more general assumptions about what is fast and what isn’t. Taking the example in my post above, the exact latency might vary, but I doubt you’ll find a CPU where integer multiplication has less latency than addition. So we can still reason about what is preferable inside a tight loop where the combined latency is what’s holding us back. Or knowing that almost every modern CPU is superscalar, we can determine that a higher instruction count isn’t necessarily worse for performance. We can try to split up our dependencies so that subexpressions can be evaluated in parallel.</p>

<p>But of course you’re right, at a certain point we get down to the really CPU-specific stuff, and that’s probably dangerous to rely on, unless you know the exact hardware configuration on which the program is going to run. (You might be targeting a PlayStation 3 specifically, for example, and then you don’t care that your optimizations wouldn’t work on other CPUs.)</p>]]></content:encoded>
	</item>
</channel>
</rss>
