By now, I’ve read an awful lot of papers about STM systems, and certain trends are really starting to stand out, not so much in terms of the algorithms used or the clever schemes invented to make transactions appear atomic, but in how they interface with the actual language.
It has really underlined to me just how deeply entrenched most Java, C and C++ programmers are in the imperative mindset. When we want to perform a transaction, we’ll need to execute something like the following pseudocode:
for (;;) {
tx = new Transaction();
try {
// ... fill in body of the transaction ...
tx.Validate();
tx.Commit()
}
catch (InvalidTxException){
tx.Rollback();
}
}
And several STM systems actually require the user to write the above whenever a transaction is to be executed.
Some implementers have realized that this is both verbose, messy and error-prone. So they try to hide it behind a macro.
#define BEGIN_ATOMIC \
for (;;) { \
tx = new Transaction();\
try { \
#define END_ATOMIC \
tx.Validate();
tx.Commit()
}
catch (InvalidTxException){
tx.Rollback();
}
}
and now a transaction looks like this instead, from the user’s point of view:
BEGIN_ATOMIC
// ... fill in body of the transaction ...
END_ATOMIC
Which is obviously shorter and requires less duplicate code. On the other hand, it relies on macros. Eeeew, yuck!
In Java, most implementers have tried to extend the language, to add a language-level atomic statement, allowing code like this:
atomic {
// ... fill in body of the transaction ...
}
And yes, this obviously works too, and has the potential to integrate much more closely with the compiler and type-checker. But it’s also a very imperative kind of thinking. “add another type of scope inside whichever function wants to perform the transaction”.
What astonishes me is that no one (apart from Peyton Jones, who implemented Haskell STM) have realized that what we really want is nothing more than a higher-order function. A function which implements the transaction infrastructure, and then calls a user-defined function.
What we’d like to do, ideally, is to define a function atomic, as in the following pseudocode:
T atomic(f : Tx -> T){
for (;;) {
tx = new Transaction();
try {
T ret = f(tx);
tx.Validate();
tx.Commit();
return ret;
}
catch (InvalidTxException){
tx.Rollback();
}
}
}
Of course, if we wanted to be really functional, we could use recursion instead of the loop, but let’s not get carried away. Ultimately, what we want is just to hide all the messy infrastructure of a transaction from the user.
The above declares a function atomic which takes as its parameter a function which, given a transaction (type Tx), returns type T.
atomic implements the messy transaction loop, and simply calls the user-supplied function inside it.
And presto, the user can now create a transaction like this:
int myTx(Tx tx) {
// do whatever
}
int result = atomic(myTx);
Or, in a language which supports lambda expressions,
int result = atomic((tx) => {/* do whatever */ });
Bit nicer than messing around with user-implemented loops or macros, isn’t it? Of course, I’ve so far sought refuge within the peaceful realms of pseudocode, where anything is possible. We obviously can’t do this in Java or C++, can we?
No, and yes, in that order. In Java, I agree, we’re just screwed. Of course, we could just pass in an object instead of a function, and if it implements some ICallableTransactionBody interface, atomic could call its RunTx() method. That would work, but it’s not quite as elegant.
In C++, though, the situation is different. A large part of the standard library (all the STL algorithms) relies on higher-order functions. .
In C++, we can define atomic as follows:
template <typename Func>
void atomic(Func f){
for (;;) {
tx = new Transaction();
try {
f(tx);
tx.Validate();
tx.Commit();
}
catch (InvalidTxException){
tx.Rollback();
}
}
}
First, I should mention that I cheated. I removed the return value. atomic now returns void. The reason is that it’s not quite trivial to infer the return type of a function in C++. TR1 does introduce a function std::result_of but it doesn’t work in every case. C++0x will fix this once and for all, by providing the necessary language features to implement it. But until then, we could hack around it, either by requiring the user to explicitly specify the return type (atomic<int>(myTx)), or by using result_of and simply telling the user to avoid the cases where it doesn’t work. But to keep the above code example simple, I simply removed the return type instead. But it can be solved, and my prototype does it.
Now, on with the show. We have now defined our atomic function. How do we call it?
In one of three ways, in decreasing order of familiarity:
First, with a function pointer. Hopefully every C++ programmer knows about these:
void myTx(Tx tx) {... }
atomic(myTx);
Fairly clean and simple. One downside is that because the atomic function is just passed a function pointer, the compiler may not be able to inline the call to myTx. It is doubtful that it’d make much difference here, but it is nevertheless a valid concern.
The second approach fixes this, as well as enabling some more flexibility: Functors, or function objects, which every C++ programmer should know about, but unfortunately, not everyone does:
struct myTx {
void operator()(Tx tx) {...}
};
atomic(myTx());
Now we’re passing in an object of type myTx, which means that the compiler knows exactly which function to call: myTx::operator()(Tx). So it can inline this trivially, eliminating the above performance concern.
Perhaps more importantly, our function can now hold state. It is an object, so it can be given a constructor and destructor. It can be given other member functions to allow the user to interact with it. It has become a lot more versatile.
Finally, we can wait for C++0x, which introduces lambda expressions:
atomic([](Tx tx) { ...});
And what’s noteworthy is that the same atomic definition works in all three cases! So the user is actually given the choice of which method to use. Our STM library supports them all.
Isn’t this cleaner than the original approach, of requiring the user to implement the whole loop himself? So why have no one used it before?
Ultimately, I can think of two possible reasons:
- the writers, usually very clever researchers and academics, were not familiar with higher-order functions, or with functional programming in general, or
- the writers did not know that this was possible in C++
I suppose the second option is the most forgiveable. C++ is a big messy language, and not everyone are familiar with every corner of it. That’s fair enough, but even then, most of this could have been done in C too, by using function pointers. It’s not that esoteric, is it? But I sincerely hope it is not the first case. I could understand if Joe Coder at Insignificant Software Inc. had never heard of functional programming, but most of these are computer science professors!
Incidentally, this should also provide a nice example for when an imperative programmer asks “how would it benefit me to learn a functional language?”
Here we have a clear example of how using a very simple functional technique can significantly clean up the code and make it much more robust1, even in an imperative language like C++.
-
Imagine what would have happened in the original loop– or macro-based versions, if the user had dared to put a
breakorreturnin their transaction… But both are safe in the higher-order function version. ↩
Tags: c++, functional, imperative, stm, thesis, transactional-memory





You’re such a nerd :p
Hopefully that doesn’t come as too much of a surprise… :)