Hopes for 2010: Microsoft Visual C++

As I men­tioned ear­lier, I’d like to cel­e­brate the new year by call­ing out a few prod­ucts I’d like to see improved in the new year.

First in line is Microsoft’s C++ com­piler and IDE.

From you, what I’d like to see in 2010 is actu­ally fairly sim­ple (at least con­cep­tu­ally): rethink your IDE. The Visual Stu­dio team as a whole is already doing this in a big way with VS10. I’m hop­ing you can find the time to rein­vent the C++ IDE specif­i­cally as well.

For the past decade (except for the 5 years you wasted try­ing to elim­i­nate native C++), you’ve been try­ing very hard to write the ulti­mate IDE for the wrong lan­guage. You’ve got some­thing that works pretty well for C++ code anno 1993 or so, but which falls com­pletely apart when used for more mod­ern C++.

Why do you per­sist in putting so many resources into mak­ing Intel­lisense bet­ter, when it still has no way to deal with a sim­ple tem­plate func­tion? Isn’t that a hint that you should rethink your approach? Mod­ern C++ has quite a bit in com­mon with dynamic lan­guages. The type of a func­tion para­me­ter may not be known just by look­ing at the func­tion def­i­n­i­tion. Per­haps what we need is actu­ally a kind of compile-time REPL loop, an “inter­ac­tive mode”, sim­i­lar to what is com­monly found in dynamic languages.

For exam­ple, I could use this to imme­di­ately and inter­ac­tively instan­ti­ate a tem­plate (or a com­plex tem­plate metapro­gram), in order to inspect which types are used as para­me­ters for each tem­plate that is instan­ti­ated as a con­se­quence. I could use it to query the type sys­tem, for exam­ple ask­ing whether or not the type of std::vector<int>::iterator is the same as int*, or what sizeof(std::string) is. Per­haps it could allow me to “step through” the chain of tem­plate instan­ti­a­tions, like we do with the debug­ger at run­time, instead of being lim­ited to com­pil­ing, and then look­ing at the com­piler errors — the metapro­gram­ming equiv­a­lent of printf–debug­ging.
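Until such an interactive mode exists, the closest approximation is to write the question down as code and let the compiler answer it. A minimal sketch of the two queries above, assuming nothing beyond the standard library (the constant names are mine, chosen for illustration):

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Query 1: is std::vector<int>::iterator literally int*?
// The answer depends on the standard library implementation,
// so it is stored in a constant rather than asserted.
constexpr bool vector_iterator_is_raw_pointer =
    std::is_same<std::vector<int>::iterator, int*>::value;

// Query 2: how large is std::string on this implementation?
constexpr std::size_t string_size = sizeof(std::string);

// Facts that hold everywhere can be pinned down with static_assert,
// which turns a wrong assumption into a compile error on this line.
static_assert(std::is_same<std::vector<int>::value_type, int>::value,
              "vector<int> stores int");
```

Both answers vary between implementations and build settings, which is exactly why I would rather ask the compiler than guess. Today that means a full compile just to read two constants.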

What I’d like to see from the MSVC IDE in the coming years is an acceptance and support of modern C++ paradigms: generic programming, template metaprogramming and all the difficulties that implies. Don’t give me an IDE that tries to provide Intellisense for a program with a completely static structure, because that’s not what a modern C++ program looks like. If you’re going to do Intellisense, make it able to handle the very flexible type system enabled by templates. Take a leaf from the JavaScript support in your IDE, which supports Intellisense even though the language is dynamic and types generally aren’t known until runtime. To display type information in the IDE while the code is being written, they have to be clever, and cheat and bluff and pretend. But they do a reasonable job of it.

In C++, the types aren’t known until compile-time, so from the IDE’s point of view, the prob­lem is sim­i­lar. To dis­play type infor­ma­tion while the code is being writ­ten (and before it is com­piled), the IDE has to be clever. And at the moment, it isn’t. At the moment, Intel­lisense just gives up.

Let’s take a sim­ple exam­ple. What should the IDE do about this function:

template <typename T>
typename T::return_type foo(T arg){
  bar(arg);
  return arg.baz();
}

If we play it by the book, and demand a “per­fect” solu­tion, there is noth­ing the IDE can do. It doesn’t know what T is, so it can’t help us with auto­com­ple­tion, sug­gest­ing mem­bers after the dot, or any­thing else. But if we’re will­ing to think out­side the box, and accept a suc­cess rate lower than 100%, there are sev­eral strate­gies the IDE could use to pro­vide mean­ing­ful Intel­lisense information:

  • We could look at the call sites. They must logically provide types that are valid in this context. We could find one call site, and provide Intellisense on the assumption that the type T is whatever was passed at that call site. That wouldn’t be 100% accurate in all cases, of course, but it would give us a type that works with the function, so it would be useful. It could even look at several call sites, and compute the intersection of the members those types provide. If they all provide a frobnicate() method, then the IDE could assume that T inside the function foo always contains such a member.
  • We could look at how the type is used in the function. It must have a copy constructor (because it is passed by value), it must have some nested type return_type, and it must have a no-arg baz member function, which returns something convertible to that type. And it must be convertible to whatever parameter type bar expects. This probably isn’t a complete description of the type, but it would be enough to give us some limited Intellisense information at least. The compiler might be able to deduce some information about the type. We could even generate some kind of ad-hoc “concepts” implementation, perhaps not as extensive as that which was proposed in C++0x (and subsequently dropped), but a kind of helper data structure that the IDE can attempt to map onto unknown template types.
  • Or we could allow the user to spec­ify an exam­ple of a valid para­me­ter type, and then use that to gen­er­ate Intel­lisense infor­ma­tion from.
  • Going back to the previously suggested “compile-time” REPL loop, the user could query the template, asking “if called with T=int, would the function compile? And what would the return type be? And what would the result of std::is_const<T>::value be?”
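To make the usage-based strategy and the “would it compile with T=int?” query a little more concrete, here is a rough sketch of how the requirements foo places on T can be written down as a compile-time check with today’s library facilities. The names fits_foo, make_void and Widget are hypothetical, the requirement imposed by bar(arg) is left out because bar is unknown here, and this is only one way such a check could look, not something the IDE actually does:

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Helper that maps any list of well-formed types to void.
template <typename...> struct make_void { typedef void type; };

// Primary template: assume T does not meet the requirements of foo.
template <typename T, typename = void>
struct fits_foo : std::false_type {};

// Selected only when T has a nested return_type and a no-argument baz().
// The base class then checks copy-constructibility (T is passed by value)
// and that the result of baz() converts to return_type.
template <typename T>
struct fits_foo<T, typename make_void<
                       typename T::return_type,
                       decltype(std::declval<T&>().baz())>::type>
    : std::integral_constant<
          bool,
          std::is_copy_constructible<T>::value &&
              std::is_convertible<decltype(std::declval<T&>().baz()),
                                  typename T::return_type>::value> {};

// Hypothetical type that satisfies every requirement checked above.
struct Widget {
    typedef int return_type;
    int baz() const { return 42; }
};

static_assert(fits_foo<Widget>::value, "Widget could be passed to foo");
static_assert(!fits_foo<int>::value, "int has no return_type or baz()");
static_assert(!fits_foo<std::string>::value, "std::string has no baz()");
```

The answer to “if called with T=int, would the function compile?” is therefore available in principle. It just takes a page of boilerplate and a full build to get it, where an interactive query would take one line.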

Mod­ern C++ has a lot in com­mon with dynamic lan­guages. Very lit­tle infor­ma­tion can be reli­ably extracted with­out com­pil­ing the code, so give me the tools for option­ally and tem­porar­ily com­pil­ing bits and pieces.

When I write silly tem­plate metapro­grams to com­pute the N’th prime num­ber, and the result is wrong, why doesn’t MSVC pro­vide a compile-time debug­ger? When I get a com­pile error inside tem­plate code, the error mes­sage con­tains what is effec­tively a compile-time stack trace. It shows the stack of tem­plate instan­ti­a­tions, but as hard-to-read ver­bose text. Why isn’t it ren­dered as a stack trace? One which lets me step through the instan­ti­a­tion of this maze of tem­plates, inspect the mem­bers of each, and find out where it went wrong, where it instan­ti­ated the wrong tem­plate, or where I for­got to write the spe­cial­iza­tion I intended.
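For the record, this is the kind of silly metaprogram I mean, sketched in plain C++ templates. The names is_prime, next_prime and nth_prime are my own, and the only “debugger” available is the static_assert at the end: if the arithmetic were wrong, all I would get is a failed assertion and a wall of instantiation text.

```cpp
// is_prime<N>: trial division by D = 2, 3, ... until D * D exceeds N.
template <unsigned N, unsigned D = 2, bool Done = (D * D > N)>
struct is_prime {
    static const bool value = (N % D != 0) && is_prime<N, D + 1>::value;
};

// No divisor found up to sqrt(N): N is prime unless it is 0 or 1.
template <unsigned N, unsigned D>
struct is_prime<N, D, true> {
    static const bool value = (N >= 2);
};

// next_prime<C>: the smallest prime greater than or equal to C.
template <unsigned C, bool Found = is_prime<C>::value>
struct next_prime {
    static const unsigned value = next_prime<C + 1>::value;
};

template <unsigned C>
struct next_prime<C, true> {
    static const unsigned value = C;
};

// nth_prime<N>: the N-th prime for N >= 1, so nth_prime<1> is 2.
template <unsigned N>
struct nth_prime {
    static const unsigned value =
        next_prime<nth_prime<N - 1>::value + 1>::value;
};

template <>
struct nth_prime<1> {
    static const unsigned value = 2;
};

// The compile-time equivalent of printf-debugging.
static_assert(nth_prime<10>::value == 29, "the tenth prime is 29");
```

Every one of those intermediate instantiations is a “stack frame” I would like to be able to step into and inspect.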

In far too many ways, the C++ IDE really feels like a C IDE. Most of it doesn’t seem to know that there’s this new-fangled thing called “tem­plates”, or that they change how peo­ple write code. The Imme­di­ate win­dow and the debug­ger fail to rec­og­nize tem­plate para­me­ter names. If I am debug­ging a func­tion template <int I> void foo(), why can’t I get the debug­ger to tell me the value of I? It should be absolutely triv­ial to do. But the debug­ger can’t seem to do it. Intel­lisense can’t seem to do it. The Imme­di­ate pane can’t seem to do it. There’s a clear mis­match between the com­piler, which is clearly a C++ com­piler, and pretty much hasn’t both­ered about the C side for close to a decade, and the IDE which still seems to be try­ing to be the per­fect C IDE, com­pletely dis­re­gard­ing every fea­ture unique to C++.

I know you’re used to being told that you have one of the best IDEs in existence. I beg to differ. You may have got one of the best C IDEs, and your C# and VB IDEs kick some serious butt. But your C++ IDE is essentially nonexistent. Your IDE does not support C++. It supports a marginally and conservatively extended C, and tries to make it look similar to your C# IDE.

So far, I’ve dealt exclusively with the IDE issues, and that’s not a coincidence. On the whole, I’m quite happy with the MSVC compiler. The performance of generated code is good, you’re making great progress on C++0x support, and overall, you’ve got a compiler I’m happy with. Of course, there are still a couple of areas where the lack of standards conformance is embarrassing (never mind the export keyword, I’m more bothered about two-phase name lookup and other such relevant features), there are some features I wish you’d borrow from GCC, and I wish you’d tighten up your warning messages a bit (some of them are nothing more than noise, or are impossible to avoid in “good” healthy code). And please, please, please give us a sane alternative to windows.h. Beyond that, I have few serious complaints about the compiler. I do, however, have a few suggestions.

It seems to me that the source/header com­pi­la­tion mech­a­nism could use a makeover. We can’t change the actual seman­tics (yet — hope­fully the pro­posal for a mod­ule sys­tem for C++ gains trac­tion), but the com­piler can change how it actu­ally processes the code. And yet, major com­pil­ers still process the source files in the exact same man­ner they did 20 years ago. Even though this is, on today’s machines, and with today’s huge code­bases, ridicu­lously inefficient.

Ages ago, pre­com­piled head­ers were invented, but I’m not really a fan of them. It’s a hack­ish solu­tion which some­times helps, but may also hurt, due to the ten­dency towards includ­ing every­thing in one sin­gle “blob” header. Even if that header is pre­com­piled, it still means every­thing that includes it has to deal with these bloated mono­lithic sym­bol tables and other data struc­tures. More impor­tantly, it is a frag­ile solu­tion, as the VC Team’s own blog shows.

But why can’t this mech­a­nism be gen­er­al­ized? Why can’t the com­piler process every header in iso­la­tion, build a com­plete parse tree of each one, and store those on disk? And then, when the header is included, rather than read­ing and pars­ing the header again, sim­ply load this parse tree and merge it into the rest of the com­pi­la­tion unit. Of course, it is easy to come up with cases where the file may have to be parsed dif­fer­ently depend­ing on where it is included, but in 99.9% of all cases, the inclu­sion mech­a­nism is straight­for­ward and sim­ple: The header is typ­i­cally not included in the mid­dle of a class def­i­n­i­tion or from inside a name­space. It usu­ally only reacts to a few fixed macros that may be defined before the header’s inclu­sion. So most of the time, the header could be pre­com­piled in iso­la­tion and reused. And for the few cases where the changed state actu­ally mat­ters, where the header is included in the mid­dle of a func­tion def­i­n­i­tion or with no include guards or where a macro (say CreateWindow, or a sim­i­larly com­mon name, cough cough) man­gles the con­tents of the header, in those cases, the com­piler can sim­ply fall back to the tra­di­tional source code inclu­sion and sub­se­quent com­pi­la­tion of the trans­la­tion unit. Even if these pre­com­pi­la­tion passes aren’t stored to disk in the man­ner of pre­com­piled head­ers, they could still be kept in mem­ory, and reused between trans­la­tion units dur­ing a build. If N dif­fer­ent .cpp files all include a cer­tain header, it would allow that header to be com­piled once, rather than N times.
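To show what I mean by a macro mangling the contents of a header, here is a small self-contained sketch. The #define is a simplified imitation of what a platform header does, not the real windows.h, and the Gui struct stands in for a hypothetical, unrelated header that happens to use the same name:

```cpp
// Imitation of a platform header mapping a generic name onto a
// character-set-specific one. Simplified and hypothetical.
#define CreateWindow CreateWindowA

// Imagine this struct lives in an unrelated header of its own.
// Because the macro is active, the member below is really named
// CreateWindowA once the preprocessor has run.
struct Gui {
    constexpr int CreateWindow() const { return 1; }
};

#undef CreateWindow

// With the macro gone, the name the author wrote no longer exists,
// but the rewritten one does.
static_assert(Gui().CreateWindowA() == 1, "member was renamed by the macro");
```

A parse tree of that second header built in isolation would declare Gui::CreateWindow, while the same header included after the macro declares Gui::CreateWindowA. This is the kind of case the compiler would have to detect, for instance by checking whether any identifier used in the header is currently defined as a macro, before falling back to plain textual inclusion.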

Once again, we have something that feels like a leftover from C. In C, headers were mostly forward declarations and little actual code, so naive processing of headers worked fairly efficiently. In C++, it is getting more and more common to put huge amounts of code in headers, which means that the naive compilation strategy traditionally used for C becomes ridiculously slow and inefficient. Creating a truly general replacement strategy is nearly impossible, true, but it seems like it’d be possible to create a heuristic that’d enable more efficient processing of header files 99% of the time, and which could then fall back to the traditional method of copy/pasting headers into the translation unit for the last percent of cases.

And why does every trans­la­tion unit have to read every file every time? Can’t their con­tents be kept in mem­ory, at least for a short time? Those hun­dreds or thou­sands of file accesses are painfully slow. Win­dows already exposes APIs for mon­i­tor­ing file changes, so it should be fairly sim­ple to deter­mine when a source file has been mod­i­fied, and only then flush it from memory.

Yes, some of these things would be chal­leng­ing to imple­ment, but like I said, I have few real com­plaints about the com­piler, so why not imag­ine how it could be taken to the next level?

And of course, everyone’s favorite nit­pick: Why is windows.h so absolutely hor­ri­ble? Why does it have to be one mono­lithic header which gives us every­thing Win­dows has to offer? Why doesn’t it com­pile as stan­dard C++? Why does it include so many other head­ers (as above, slow­ing down com­pi­la­tion)? Why does it pol­lute the global name­space with macros for ridicu­lously com­mon names?

Well, it does, and it’d be silly to expect this to change, due to backwards compatibility concerns. But why, then, is there not a windows.hpp or similar? Why isn’t there a separate cleaned-up, C++-compatible header? One which uses function overloading instead of macros, for example? Or which just defines simple forwarding functions instead of macros? One which compiles even with the non-standard language extensions disabled? Or why isn’t there a set of these headers, allowing us to access the bits of the Windows API we’re interested in, without having to include everything?
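A rough sketch of what I have in mind with overloads and forwarding functions. DemoLengthA and DemoLengthW are made-up stand-ins for a narrow/wide pair of C API functions, and the winpp namespace is equally hypothetical; none of this is a real Windows interface:

```cpp
#include <cstddef>

// Made-up stand-ins for a narrow/wide pair of C API functions.
// They merely count characters up to the terminating zero.
constexpr std::size_t DemoLengthA(const char* s) {
    return *s ? 1 + DemoLengthA(s + 1) : 0;
}
constexpr std::size_t DemoLengthW(const wchar_t* s) {
    return *s ? 1 + DemoLengthW(s + 1) : 0;
}

// The macro approach would be
//     #define DemoLength DemoLengthA
// which rewrites every identifier spelled DemoLength, in every scope.

// The overload approach keeps the name inside a namespace and lets the
// argument type pick the narrow or wide version.
namespace winpp {
constexpr std::size_t DemoLength(const char* s) { return DemoLengthA(s); }
constexpr std::size_t DemoLength(const wchar_t* s) { return DemoLengthW(s); }
}  // namespace winpp

static_assert(winpp::DemoLength("abc") == 3, "narrow overload selected");
static_assert(winpp::DemoLength(L"abcd") == 4, "wide overload selected");
```

Nothing outside winpp is affected, so my own classes remain free to have a member called DemoLength without being silently renamed.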

In short, I think the MSVC IDE could do with a makeover. Out with those 12-year-old project wizards, which create complex predefined project structures accumulating every bad practice and unexpected project setting in one place. I’ve lost count of how many beginners I’ve seen choke because their tiny little projects automatically get precompiled headers thrown in for absolutely no reason, which makes their code so fragile it breaks whenever they try to change anything.

Another addition that would really boost the usefulness of MSVC would be to provide facilities for template metaprogramming in unit tests. For example, it is common to use metaprograms to force compilation failures if a template is instantiated with a specific type. But how do we test that this works as intended? Give us the hooks and language extensions (ideally something that can be hooked into from third-party unit testing frameworks) necessary to specify that “this function is expected to fail to compile, and if it does, that’s not an error, just ignore the function and compile the remainder of the file”. Consider the compilation process a part of the language: it is something that must be inspected and debugged (and ideally, something which should be possible to do piecewise, without compiling the entire project), and for which we may wish to write tests.
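The closest thing available today is to phrase “this call must not compile” as a type trait and assert on it. A minimal sketch, assuming a hypothetical function twice that is meant to reject floating-point arguments (the names twice, can_call_twice and make_void are mine):

```cpp
#include <type_traits>
#include <utility>

// Hypothetical function that should refuse floating-point arguments
// at compile time.
template <typename T>
typename std::enable_if<!std::is_floating_point<T>::value, T>::type
twice(T value) {
    return value + value;
}

// Helper that maps any list of well-formed types to void.
template <typename...> struct make_void { typedef void type; };

// can_call_twice<T>: true when the expression twice(T) is well-formed.
template <typename T, typename = void>
struct can_call_twice : std::false_type {};

template <typename T>
struct can_call_twice<
    T, typename make_void<decltype(twice(std::declval<T>()))>::type>
    : std::true_type {};

// The "unit tests": one call that must compile, one that must not.
static_assert(can_call_twice<int>::value, "int must be accepted");
static_assert(!can_call_twice<double>::value, "double must be rejected");
```

This only works when the rejection happens during overload resolution. If twice instead used a static_assert inside its body, the trait would report that the call is fine and the failure would surface as a hard error, which is precisely the case where compiler support is needed.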

Target your IDE at Modern C++, rather than C with classes. Impress the world by being the first IDE to even think about this. Embrace, and provide support for, the changes that have happened in the C++ language, in best practices and in the mindset of the C++ community. Out with the idea that C++ can best be presented like C#, as a static language where every piece of code can be understood in isolation. Instead, give us an IDE that treats C++ as a more dynamic language, where many types of information are just not available until the code has been compiled. Support and encourage use of templates, and accept that yes, headers are ridiculously heavy these days, and blindly recompiling them for every translation unit just doesn’t scale the way it used to. Treat compilation as an interactive process where template instantiation can be stepped through and inspected at each stage, and where interactive queries can be made statically or during debugging to inspect not just data, but also types. And face up to the fact that traditional Intellisense is a lost cause. There is no way to statically produce all the information we expect from Intellisense. Some can be improvised by various heuristics, or perhaps by assuming some suitable dummy values for the various template parameters, but others may be nearly impossible to provide useful information on until at least part of the program has been compiled. If the IDE can’t provide the information I need automatically, it could at least allow me to query for the information. Perhaps it can’t tell me anything about the template type T, but why can’t I tell it to assume that T is a std::wstring, and provide information based on this assumption? You already have a pretty good C++ compiler. It’s time to start working on a C++ IDE, and call it a day on the C IDE you’ve been polishing until now.

So dear MSVC team, in case you can’t think of anything useful to do with your time in the year 2010 (as if… I know you’ve got C++0x support to work on, and that’s infinitely more important to me than IDE improvements), here’s a new year’s resolution for you: Amaze the world by showing how a C++ IDE should work. Reinvent the role of the C++ IDE, instead of trying to force your current C-C# hybrid IDE to work for C++ as well.
