<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>jalf.dk &#187; pointers</title>
	<atom:link href="http://jalf.dk/blog/tag/pointers/feed/" rel="self" type="application/rss+xml" />
	<link>http://jalf.dk/blog</link>
	<description>Musings and thoughts on programming and other geeky stuff</description>
	<lastBuildDate>Sat, 07 Jan 2012 15:42:18 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>The Great Pointer Conspiracy</title>
		<link>http://jalf.dk/blog/2009/07/the-great-pointer-conspiracy/</link>
		<comments>http://jalf.dk/blog/2009/07/the-great-pointer-conspiracy/#comments</comments>
		<pubDate>Thu, 30 Jul 2009 18:25:55 +0000</pubDate>
		<dc:creator>jalf</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[pointers]]></category>
		<category><![CDATA[teaching]]></category>

		<guid isPermaLink="false">http://jalf.dk/blog/?p=168</guid>
		<description><![CDATA[One of the great tragedies of C and C++ is that they are taught wrong — that a number of perfectly straightforward features are taught and described as if they were mythical and supernatural entities that no mortal can truly understand. Memory management in C++ is one such feature (it is actually very simple, once [...]]]></description>
			<content:encoded><![CDATA[<p>One of the great tragedies of C and C++ is that they are taught wrong — that a number of perfectly straightforward features are taught and described as if they were mythical and supernatural entities that no mortal can truly understand. Memory management in C++ is one such feature (it is actually very simple, once you know the trick), but the biggest of all is probably pointers.</p>

<p><em>Everyone</em> who learns C++ fears pointers. <em>Everyone</em> who is new to the language, or who has merely <em>heard</em> of the language, considers pointers to be some kind of magic: arcane constructs that give the programmer access to <em>Real Ultimate Power</em>, a feature that marks C/C++ as <em>superior</em> to and <em>more powerful</em> than other languages, but is also feared as <em>dangerous</em> or unsafe.</p>

<p>None of this is true.</p>

<p><em>Pointers are simple.</em></p>

<p><em>Pointers are not magical.</em></p>

<p><em>Pointers are safe (as long as you use them only as allowed by the language).</em></p>

<p><span id="more-168"></span>
 It is very well-defined what you may, and may not, do with a pointer. The only problem is that the compiler is unable to enforce most of this, so it relies on your own discipline, and knowledge of the rules. But the rules exist. And if you stay within the rules, if your C++ program is legal, then pointers are perfectly safe.</p>

<p>This post is my little attempt to debunk The Great Pointer Conspiracy. It seems there is some hidden rule that whenever we teach others C or C++, we must describe pointers</p>

<ul>
<li>as more complicated than they are, and, </li>
<li><strong>as something they are not</strong>. It sometimes makes sense to lie to your pupil in order to teach them the truth a bit at a time (similar to how most of what you learned in elementary school turns out to be wrong when you get to university. They didn’t mislead you, they just taught you simplified versions of the truth to get you on the right track). But in the case of pointers, the model taught is not merely wrong, it is also more complex and harder to understand!</li>
</ul>

<h1>So what is a pointer then?</h1>

<p>Let’s start with a crash course in syntax, just to get that out of the way.</p>

<ul>
<li>A pointer to type T is denoted <code>T*</code> (pronounced <em>pointer to T</em>)</li>
<li>A pointer is created with the <code>&amp;</code> operator. Assuming an <code>int i</code>, we can create a pointer to it: <code>int* p = &amp;i;</code> (<code>&amp;i</code> is typically pronounced as <em>take the address of i</em>)</li>
<li>A pointer can be <em>dereferenced</em> with the <code>*</code> operator, yielding the value it points to: <code>int j = *p;</code></li>
</ul>

<p>That’s easy, right? The only point of confusion is the dual role of <code>*</code>, as both part of the type, and as the dereferencing operator. There’s a bit of symmetry here, because  <code>&amp;</code> can be used in both places as well. As above, it can be used to take the address of an object, but it can also be used as part of the type, to create a <em>reference</em>: <code>int&amp; k = i</code> creates a reference to the previously defined integer i. But references aren’t the subject of this post. I only mention it because of the related syntax.</p>
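
<p>To see the two roles of <code>*</code> and <code>&amp;</code> side by side, here is a small sketch. The function name <code>syntax_demo</code> is just something I made up for the illustration:</p>

<pre><code>#include &lt;cassert&gt;

// Hypothetical helper, named only for this illustration.
void syntax_demo() {
  int i = 5;
  int* p = &amp;i;  // '*' in the declaration makes p a pointer to int; '&amp;' in the expression takes the address of i
  int&amp; k = i;   // '&amp;' in the declaration makes k a reference to i
  int j = *p;   // '*' in the expression dereferences p
  assert(j == 5);
  k = 6;            // write to i through the reference
  assert(*p == 6);  // p sees the change, because p and k both refer to i
}
</code></pre>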

<p>So, on to what pointers are, and what they can do:</p>

<p><strong>Pointers are references</strong></p>

<p>A pointer is little more than a reference (in the conceptual sense, not the specific C++ references mentioned in the previous section) to a variable. If we have multiple references to the same variable, they will all see changes made by each other. Here’s an example:</p>

<pre><code>#include &lt;cassert&gt;

// Because we're passed a pointer, we have a reference to the original variable,
// and can modify it so the changes are visible outside the function
void Foo(int* ptr){
  *ptr = 2; // set whatever ptr points to, to 2
}

int main(){
  // create a local variable i. This isn't a pointer, but it can be referenced by one.
  int i;
  int* p = &amp;i; // create a pointer to i by taking the address (see below) of i, and store that as a pointer p
  i = 1;
  assert(*p == 1); // the value referenced by p is now equal to 1
  Foo(p);
  assert(i == 2 &amp;&amp; *p == 2);
}
</code></pre>

<p><em>Important note</em>: Yes, I used the word “address” in the comment above. It is important to realize what I mean by this. I do <em>not</em> mean “the memory address at which the data is physically stored”, but simply an abstract “whatever we need in order to locate the value”. The address of <code>i</code> might be anything, but once we have it, we can always find and modify <code>i</code>. If you want a real-world analogy, what is an address in the real world? My email address has nothing to do with my house address. My phone number could be considered a third address. Even my social security number or my full name could be considered addresses in this sense. All of these allow you to locate or contact me, which is all we require.</p>

<p>So far, so good. Pointers are simply references to other variables, with slightly quirky syntax in that we have to use <code>*p</code> to get the value that the pointer  <code>p</code> points to, and we have to use <code>&amp;i</code> to create a pointer to <code>i</code>.</p>

<p>Of course pointers can do a bit more than this though. They’re not as complex as people often try to convince beginners, but they’re not <em>that</em> simple either.</p>

<p><strong>Pointers can be reseated</strong></p>

<p>Once a pointer exists, we can change what it points to. For example:</p>

<pre><code>int main() {
  int i = 1;
  int j = 2;
  int* p = &amp;i; // make the pointer p point to i
  assert(*p == 1);
  p = &amp;j; // and now make it point to j
  assert(*p == 2);
  *p = 3; // modify the variable p points to
  assert(j == 3);  // j is now 3
  assert(i == 1);  // but i is untouched, because p no longer points to it.
}
</code></pre>

<p>See, that’s not rocket science either, is it? Whatever the pointer points to, we can look at and modify. And when it no longer points to that, they have no connection any more.</p>

<p><strong>Pointers can be null</strong></p>

<p>Next up, pointers don’t have to point to something. They can be <em>null pointers</em>. And just like with addresses in the previous example, it is important to be clear on what we mean by this. A null pointer is exactly what I said: <em>a pointer which does not point to any object</em>.</p>

<p>In particular, it is <em>not</em> a pointer to the address zero. Of course, here is where it becomes tricky, because the following <em>does</em> create a null pointer:</p>

<pre><code>int* ptr = 0;
</code></pre>

<p>The trick here is that the C++ language standard makes a special rule for this case. Assigning the constant zero to a pointer creates a null pointer, and <em>not</em> a pointer to address zero. The “constant” part is important too. Here is the precise wording in the C++03 standard (Section 4.10 [conv.ptr], paragraph 1):</p>

<blockquote>
  <p>A <em>null pointer constant</em> is an integral constant expression (5.19) rvalue of integer type that evaluates to zero. A null pointer constant can be converted to a pointer type; the result is the <em>null pointer value</em> of that type…</p>
</blockquote>

<p>A “constant expression” is essentially an integral value which can be evaluated at compile-time. So <code>42</code>, <code>2+2</code> and the name <code>i</code> of a variable declared as <code>const int i = 99</code> are constant expressions.</p>

<pre><code>int* p0 = 0; // null pointer
const int zero1 = 0; // constant expression
int* p1 = zero1; // null pointer under the wording quoted above (C++14 and later accept only a literal 0 or nullptr here)
const int zero2 = 2 - 2; // constant expression
int* p2 = zero2; // null pointer, same caveat as p1
int zero3 = 0; // not a constant expression
int* p3 = (int*)zero3; // needs a cast to compile at all, and is not guaranteed to be a null pointer
int a = 2;
int b = 2;
int zero4 = a - b; // not a constant expression
int* p4 = (int*)zero4; // again needs a cast, and is not guaranteed to be a null pointer
const int c = 2;
const int d = 2;
const int zero5 = c - d; // constant expression
int* p5 = zero5; // null pointer, same caveat as p1
</code></pre>

<p>A compiler will reject an implicit conversion from a non-constant integer to a pointer, and if you force the conversion with a cast, nothing guarantees that the result is a null pointer. According to “the rules”, a null pointer is neither a pointer pointing to address zero, nor a pointer to which just any zero value has been assigned. It is <em>a pointer to which the constant expression zero has been assigned</em>.</p>

<p>As for what you’re allowed to do with a null pointer? Basically nothing. You may compare it to other pointers, and… that’s basically it.</p>
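
<p>In practice, that comparison is exactly what you use a null pointer for. Here is a minimal sketch; the helper <code>value_or</code> is a name I invented for the example, not a standard function:</p>

<pre><code>#include &lt;cassert&gt;

// Hypothetical helper: returns the pointed-to value, or a fallback if ptr is null.
int value_or(const int* ptr, int fallback) {
  if (ptr == 0) {     // comparing against a null pointer is allowed
    return fallback;  // dereferencing it is not, so we never get that far
  }
  return *ptr;        // ptr points to an actual int here
}

void null_demo() {
  int i = 7;
  int* n = 0;  // null pointer
  assert(value_or(&amp;i, -1) == 7);
  assert(value_or(n, -1) == -1);
}
</code></pre>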

<p>With me so far? You might have noticed that what I have described so far is almost exactly what references in C# or Java (or many other languages) are. A variable of a reference type behaves pretty much exactly like this. We can set it to point to another <em>valid</em> object (but we are <em>not</em> allowed to ever set it to an <em>invalid</em> object), or we can set it to <code>null</code>.</p>

<p>Pointers are much like reference types in most other languages. This is an important point. Like I said to begin with, pointers are <em>not</em> difficult. They are a very simple concept, as the above shows. Where the confusion arises is in the <em>one</em> extra thing they can do, which I will describe next. Note that while this <em>does</em> make them somewhat more flexible than C# references, it is still a far cry from the “raw memory address” concept that people often think pointers are.</p>

<p><strong>Pointers can traverse arrays</strong></p>

<p>Now comes the (slightly) tricky part, the one that usually gets people confused, or gives them the wrong idea. If we have a pointer to an element within an array, we are allowed to move the pointer around within the array:</p>

<pre><code>char arr[] = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j' };
</code></pre>

<pre><code>char* ptr = arr; // arrays are not pointers, but can *decay* into a pointer to the first element. So now we have a pointer to arr[0]
assert(*ptr == 'a');
++ptr; // move ptr to the next element
assert(*ptr == 'b');
ptr += 5;
assert(*ptr == 'g');
assert(*(ptr + 2) == 'i');
assert(*(--ptr) == 'f');
ptr -= 3;
assert(*ptr == 'c');
</code></pre>

<p>So far, so good. At this point it is probably a good idea to mention that when you increment a pointer, it always moves <em>to the next element</em>, and <em>not to the next byte</em>. Once again, being careful with the idea of “addresses” pays off. The pointer stores the address of an object. By adding one to that address, we get the address of the <em>next</em> object, no matter how big the object is. Think of your house address. It doesn’t matter how big your house is, the <em>next</em> address is always the neighboring house. It is not your garage door or your kitchen window.</p>

<p>If, for the sake of argument, pointers had merely been memory addresses, then adding one to a pointer would have produced an address that was one byte higher, which means the pointer would no longer have pointed to a valid object. Good thing we don’t live in <em>that</em> messy kind of world, eh? In C++ land, a pointer points to an object, and incrementing it gives us a pointer to the <em>next</em> object.</p>
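
<p>If you want to convince yourself of this, the following sketch peeks at the underlying bytes, purely as an illustration. It assumes nothing beyond the standard library, and <code>stride_demo</code> is a made-up name:</p>

<pre><code>#include &lt;cassert&gt;
#include &lt;cstddef&gt;

void stride_demo() {
  int numbers[3] = {10, 20, 30};
  int* p = numbers;   // points to numbers[0]
  int* next = p + 1;  // points to numbers[1], the next *element*
  assert(*next == 20);

  // View both positions as byte pointers: they are sizeof(int) bytes apart, not one byte.
  const char* first_byte = reinterpret_cast&lt;const char*&gt;(p);
  const char* next_byte = reinterpret_cast&lt;const char*&gt;(next);
  assert(static_cast&lt;std::size_t&gt;(next_byte - first_byte) == sizeof(int));
}
</code></pre>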

<p>Now comes the next little surprise:
You are allowed to move the pointer one step <em>past</em> the end of the array. Assuming the same array as above:</p>

<pre><code>char* p = arr + 9; 
assert(*p == 'j'); // no surprises here, just verifying that we're at the end of the array.
char* q = arr + 10; // this is legal
++p; // so is this
</code></pre>

<p>But once again, we have to be careful. The language has only given us permission  to go <em>one</em> step past the end. A pointer to <code>arr + 11</code> is downright illegal, <em>even if we don’t dereference it. The mere existence of the pointer is illegal</em>. The compiler probably won’t complain, and your code may even <em>appear</em> to work, but it is no longer a legal C++ program.</p>

<p>We have also not been given permission to dereference the one-past-the-end pointer. <code>*(arr + 10)</code> is not legal. Again, it may seem to work, on your computer, with your compiler, on this particular day. But it may not work tomorrow. Or on my compiler. Or when I run your program.</p>

<p>So the language allows us to create, and move pointers around freely, from the start of the array, and up to one past the end of the array. And it allows us to dereference pointers that point to any element in the array, but not one past the end.</p>
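
<p>The one-past-the-end pointer is what makes the usual loop over an array work: we compare against it, but never dereference it. A small sketch, where <code>sum</code> and <code>sum_demo</code> are names of my own choosing:</p>

<pre><code>#include &lt;cassert&gt;

// Hypothetical helper: adds up the ints in the half-open range [first, last).
int sum(const int* first, const int* last) {
  int total = 0;
  for (const int* it = first; it != last; ++it) {  // 'last' is only compared, never dereferenced
    total += *it;
  }
  return total;
}

void sum_demo() {
  int values[4] = {1, 2, 3, 4};
  assert(sum(values, values + 4) == 10);  // values + 4 is the one-past-the-end pointer
}
</code></pre>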

<p>And that’s basically it. This is the dreaded pointer arithmetic that usually has beginners running scared. Not all that scary, is it?</p>

<p>Of course, for the sake of completeness, there is one other arithmetic operation that is legal under much the same circumstances:</p>

<p>Two pointers <em>pointing to the same array</em> may be subtracted, yielding the distance between them, expressed as a number of elements.
And for the purposes of pointer arithmetic, single elements are considered arrays of size one, meaning that all the above is true for single variables too. They’re just treated as arrays with only a single element.</p>
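
<p>Here is a short sketch of both points, subtraction and the “array of one” rule. The array contents and the name <code>distance_demo</code> are arbitrary choices for the example:</p>

<pre><code>#include &lt;cassert&gt;
#include &lt;cstddef&gt;

void distance_demo() {
  double samples[5] = {0.5, 1.5, 2.5, 3.5, 4.5};
  double* first = samples;       // points to samples[0]
  double* fourth = samples + 3;  // points to samples[3]
  std::ptrdiff_t distance = fourth - first;
  assert(distance == 3);               // three elements, not 3 * sizeof(double) bytes
  assert(samples + 5 - samples == 5);  // the one-past-the-end pointer may take part as well

  int single = 42;
  int* p = &amp;single;
  assert((p + 1) - p == 1);  // a single object behaves like an array of one element
}
</code></pre>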

<p><strong>And one final detail</strong></p>

<p>Now let’s get self-referential. There is nothing new in this — it follows as a logical conclusion of the above, but it often comes as a surprise, so let’s mention it:</p>

<p>Pointers may point to pointers. Again, there is no magic, no special cases. A pointer is simply a reference to an object, remember? And a pointer is an object too, so obviously we can point to <em>that</em> as well!</p>

<p>We don’t often need to do that, but there is one case where it is used. Typically when you call a library function, and want it to give you a pointer to some resource it has created, you do this:</p>

<pre><code>Resource* ptr = 0; // this is going to be our pointer to the resource. For now, make it a null pointer to avoid confusion
bool success = CreateResource(&amp;ptr); // pass the address of our pointer to the function
</code></pre>

<p>Note that the function wishes to return a status code to let us know if the operation succeeded, so it can’t simply return the pointer we want. So it has to resort to pointer-pointer trickery instead.</p>

<p>The insides of <code>CreateResource</code> might look something like this:</p>

<pre><code>bool CreateResource(Resource** res){
  Resource* actualResource = new Resource(); // create the resource, and temporarily store a pointer to it
  // now we need to pass this pointer to the caller. If res had been a regular "single" pointer, it would simply have been a null pointer. 
  // And sure, we could have made it point to our resource instead, but the caller wouldn't know, because we only received a *copy* of the original null pointer. So even if we change what it points to, we can't change what the *original* points to.
  // Instead, we use a pointer to a pointer. We know that 'res' now points to the caller's Resource pointer. So if we manipulate the value pointed to by 'res', we're actually manipulating the caller's pointer.
  *res = actualResource; // so take our newly allocated resource pointer, and store that into the caller's pointer, which we get by dereferencing res.
  return true; // report success to the caller, as the bool return type promises
}
</code></pre>

<p>It may help to remember that function arguments in C++ are <em>always</em> copied (unless the parameter is declared as a C++ reference, which none of the functions here use). If you pass an <code>int</code> to a function, it receives a <em>copy</em> of that <code>int</code>. And if you pass a pointer, then the function receives a <em>copy</em> of that pointer. A copy which points to the same address, so anything we do to the pointed-at address will be visible outside the function as well. But if we change the pointer itself, no one else will see it, because the function has been given its own copy.</p>

<p>So if we pass a pointer <code>p0</code>, which itself points to another pointer <code>p1</code>, then <code>p0</code> is again copied. The function receives a copy of <code>p0</code>, let’s call it <code>p2</code>, which also points to <code>p1</code>. So if we change what <code>p2</code> points to, the calling function won’t see it, but if we change what <code>p1</code> points to, it will be visible to the caller, because <code>p0</code> still points to <code>p1</code>.</p>
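
<p>The difference is perhaps easiest to see with the two cases next to each other. This is only a sketch, and the function names <code>reseat_copy</code>, <code>reseat_original</code> and <code>reseat_demo</code> are invented for the occasion:</p>

<pre><code>#include &lt;cassert&gt;

// Receives a *copy* of the caller's pointer; reseating the copy has no effect outside.
void reseat_copy(int* p, int* target) {
  p = target;
}

// Receives a pointer to the caller's pointer, so it can reseat the original.
void reseat_original(int** pp, int* target) {
  *pp = target;
}

void reseat_demo() {
  int a = 1;
  int b = 2;
  int* p1 = &amp;a;
  reseat_copy(p1, &amp;b);
  assert(p1 == &amp;a);  // the caller's pointer is unchanged
  reseat_original(&amp;p1, &amp;b);
  assert(p1 == &amp;b);  // now the caller's pointer has been reseated
  assert(*p1 == 2);
}
</code></pre>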

<p>Yes, this added level of indirection may take some getting used to, but the important part is that there’s nothing fundamentally special. It is simply the logical conclusion of the rules I described previously, so even if you don’t get it now, you will when you’ve got a bit more experience with pointers. It’s similar to how, when you first learned to read “See Spot Run”, you had all the rules necessary to read longer words, like “stewardesses” or “programmatically”. After that, you pretty much just needed practice.</p>

<p>So that’s it. That’s all pointers are. If you hadn’t previously encountered pointers, you can stop reading here. But if you were already taught about pointers, we probably have to undo some of the damage.</p>

<p>So the following will discuss what pointers are <em>not</em> — that is, the misconceptions that typically exist about pointers, and which beginners are almost invariably taught. I’ll try to explain <em>why</em> these limitations exist as well, partly so you can take the rule seriously as “something with real-world relevance”.</p>

<h1>The Pointer Abuse Rehab and Correction Center</h1>

<p>In the following, assume that <code>i, j</code> are integer variables (<code>int</code>), and <code>p, q</code> are pointers to integers (<code>int*</code>) and <code>n</code> is a null pointer:</p>

<ul>
<li><p>A pointer is not just a number. For example, <code>i + j</code> is legal, but <code>p + q</code> is not. Try it. Your compiler will give you an error. Likewise, <code>i*j</code> is valid, but <code>i * p</code> is not. Integers may be added to or subtracted from pointers, and pointers may be subtracted from pointers (as long as they both point to the same array).   And on some computers, a pointer isn’t implemented as an integer either. Some machines have segmented memory space, so an address is a tuple consisting of a segment identifier plus an offset. Sure, you <em>can</em> combine those two in a single number, in the same way that you can combine the country code with my phone number to create a single integer. But the address is still, fundamentally, a tuple of two numbers on that machine.</p></li>
<li><p>A pointer is not a memory address! I mentioned this above, but let’s say it again. Pointers are typically <em>implemented</em> by the compiler simply as memory addresses, yes, but they don’t have to be. A pointer may not point to just any address (and again, some computers, which have separate address and integer registers, are actually able to enforce this at runtime, generating a hardware fault if you try to create a pointer to an address that is not allocated to your process.) The same goes for moving past the end of an array. You’re allowed to go one element past, but pointing two past the end is not allowed, and again, some computers are able to <em>enforce</em> this, at least in some cases. (imagine that the array is located at the very top of the address space, so moving two elements past the end produces an overflow. On a CPU with dedicated address registers, overflows probably won’t be allowed. They’ll be caught and they’ll generate an exception).</p></li>
<li><p>All pointers are not born equal. A pointer to T may not be convertible to a valid pointer to U. Some machines require datatypes to be aligned. Typically, a 4-byte integer will have to be aligned so it starts on an address  that is divisible by 4. But a single byte datatype such as a char can be placed anywhere. So that means three out of four char pointers will not be valid integer pointers! We also can’t rely on casting as much as we’d typically expect. <code>reinterpret_cast</code> in particular often trips people up. (For non-C++ programmers, you can assume that we had used the “traditional” casting syntax, as in <code>(float*)i</code>. The difference is not important.)</p></li>
</ul>

<pre><code>int value = 42;
int* i = &amp;value; // a pointer i that points to a valid integer
float* f = reinterpret_cast&lt;float*&gt;(i); // #1
int* j = reinterpret_cast&lt;int*&gt;(f); // #2
assert(i == j);
</code></pre>

<p>In the above, we know <em>nothing</em> about the value of  <code>f</code> after the cast on line <code>#1</code>. We know that it contains an “implementation-defined mapping” of the original <code>i</code>. But we are <em>not</em> guaranteed that it points to the same address, <em>or even that it contains the same bit pattern</em>!</p>

<p>True, the standard says that the mapping is “intended to be unsurprising to those who know the addressing structure of the underlying machine”, but in general, we can’t rely on that. All we are guaranteed is that once we cast <em>back</em> to the original type, we’re given the original value (provided <code>float</code> has no stricter alignment requirement than <code>int</code>). So the standard guarantees that <code>i</code> and <code>j</code> in the above will point to the same address. But we know nothing about <code>f</code>, other than that the compiler is able to convert the value stored in it back to the original pointer <code>i</code>.</p>

<h1>Conclusion</h1>

<p>By now, I hope it’s clear that pointers actually become a lot simpler when we treat them as what they are: reseatable references to objects. If we start pretending that they are memory addresses, we get a whole host of complications: we start thinking that they should be allowed to point to <em>any</em> memory address, or even worse, that they are just numbers, and that all the usual arithmetic works on them. (Remember, adding or subtracting integers is legal, but it adjusts the pointer by that number of <em>objects</em>, not <em>bytes</em>, as we would have expected if pointers were just memory addresses. And <code>pointer + pointer</code>, <code>pointer * pointer</code> or <code>pointer / pointer</code> are simply not defined at all.)</p>

<p>As if that wasn’t bad enough, we also require the student to understand the underlying hardware, in particular the concept of a memory space, and of physical (or virtual) hardware addresses.</p>

<p>But if we treat pointers as what they are, that is no longer necessary. A pointer points to a C++ object, not a memory address, so to understand pointers you merely have to understand C++ objects, not memory addresses.</p>
]]></content:encoded>
			<wfw:commentRss>http://jalf.dk/blog/2009/07/the-great-pointer-conspiracy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

