C++ Style choices (notes)
This post is aimed at programmers-in-general, and covers some general stylistic perspectives. The vehicle is C++, but the stylistic issues at hand are experienced by programmers in basically every language of which I'm aware. Except Whitespace. Those guys have no issues with style. :-)
I would be the last person to jump down anyone's throat for off-center style practices. In fact, I consider style quirks (my own and others) to be one of my most useful touchstones when learning new code. Moreover, I try to be a stylistic chameleon when working in new code. So please don't construe this post as recommendations or an invitation to a holy war, but rather as a report of what I've tried, liked, and collected after programming computers for the past 20 years.
80-column limits
Unless I would be wrecking the style integrity of already-existing code, I have no qualms about writing an if() conditional that is 300 characters long. Obviously, this makes it harder to translate code into this sort of medium, and it occasionally drives (even me) bananas when I need to debug such a conditional, but the thing it saves is indentation confusion, split-line slashes, and a false sense of simplicity.
Soft versus hard tabs
I go back-and-forth on this. And that is evident from the trail of tabs I've left behind me on github. These days, I'm preferring soft tabs with a width of 2 spaces. I have no reasons for preferring this, which is why I'm back-and-forth. Typically, I'll use whatever tabbing mode is already extant in the source file.
Superfluous parentheticals
If you passed algebra, you almost certainly don't need to be reminded of the order of operations. And those same algebra rules apply inside the compiler. IE....
int x = 6 + 1 * 3;
...is always equal to 9, and there is no ambiguity. But what if you see this?
int n = x * x + y * y + z * z;
I don't know about you, but I distrust my ability to read that accurately all the time. So I would make my meaning explicit so it is immediately clear that I intend to add the squares of those terms...
int n = (x * x) + (y * y) + (z * z);
It is worth noting that when thinking about order-of-operations with Boolean algebra, many people use a translation-layer in their knowledge that looks like this....
Ok... so... "a & b | c" means "a * b + c", therefore I need
to "a & (b | c)".
This is common because most of us learned order-of-ops WAY earlier than Boolean algebra. And Boolean algebra was probably presented to us with our decimal algebraic knowledge as a base.
Even if you innately know that there is no difference between AND and multiplication, it still doesn't address this....
int n = p | r ^ s & q;
Too much parenthetical delineation is better (IMO) than chasing down bugs resulting from a human being misunderstanding order-of-ops.
Superfluous delineation of scope
Python programmers can ignore this, as their language prevents the problem depicted here.
Lots of code out there looks like this...
if (x) do_something();
Or worse...
if (x)
do_something();
In all but very narrow circumstances, I will always rewrite this to be...
if (x) {
do_something();
}
...because on more occasions than I can count, I have encountered (or myself written) bugs like this...
if (x)
do_something();
do_something_else_afterward();
The rare occasion that I let myself be THAT lazy is this one...
if (verbosity > 1) local_log.concat("Unexpected return case.\n");
The intention here is to only print that log line. Nothing else should depend on that conditional. In that case, the whole line could be sensibly replaced by a macro without risking the bug.
If you haven't caught onto the theme of my style choices yet, notice that they are typically geared toward reducing human error.
Artful ommission of break;
This is related to "stacking case labels", and underscores a programmer that values tight-code over clarity.
I'm not always disciplined about it, but if I have code like this that not only stacks case labels, but also has more than a few lines of intervening code, I will try to notate it to indicate that...
- it was not an oversight, and
- care should be exercised nearby
Avoidance of assignment while testing for equality
From StringBuilder....
/*
* Thank you N. Gortari for the most excellent tip:
* // Null-checking is a common thing to do...
* Intention: if (obj == NULL)
*
* // Compiler allows this. Assignment always evals to 'true'.
* The risk: if (obj = NULL)
*
* // Equality is commutitive.
* Mitigation: if (NULL == obj)
*
* "(NULL == obj)" and "(obj == NULL)" mean exactly the
* same thing, and are equally valid.
*
* Leverage assignment operator's non-commutivity,
* so if you derp and do this...
* if (NULL = obj)
* ...the mechanics of the language will prevent compilation,
* and thus, not allow you to overlook it on accident.
*/
void StringBuilder::prependHandoff(StringBuilder *nu) {
if (NULL != nu) {
this->root = promote_collapsed_into_ll();
StrLL *current = nu->promote_collapsed_into_ll();
if (NULL != nu->root) {
// Scan to the end of the donated LL...
while (NULL != current->next) { current = current->next; }
current->next = this->root;
this->root = nu->root;
nu->root = NULL;
}
}
}
Subverting linguistic rules with type-casting
To achieve certain kinds of OO design patterns in C (not C++), programmers will frequently cast and type-pun to avoid re-writing code. See this.
I think that this reeks of duress, but sometimes it is the best thing to do. Here are some reasons people do it...
- When the reason for casting is to avoid a harmless linguistic granularity (Casting to-and-from volatile or [sometimes] const)
- C has no function overrides. So without doing this, we would also need to write a second void some_fxn(SomeOtherDataType)*, thus repeating ourselves. Not only do we repeat ourselves in our source code, but we also repeat the same machine code in the binary.
- It is expedient. And it is often a graceful segue into a proper type strategy.
Type-casting might also be used in arithmetic to ensure proper integer alignment, and so forth. But if I find myself type-casting under any circumstance not listed above, I start questioning my own judgement.
Short-circuit evaluation
There are non-trivial efficiency gains from using this....
boolean marked_complete = markComplete();
if (err_state || !marked_complete) {
report_failure();
}
...versus this (note the short-circuit)...
boolean marked_complete = markComplete();
if (err_state | !marked_complete) {
report_failure();
}
That's great. But what if you carelessly code this...
if (err_state || !markComplete()) {
report_failure();
}
Your intention might have been to always call markComplete(), regardless of err_state. But that is not what will happen. If your err_state evals to 'true', the short-circuit will drop out before doing the thing you thought was unconditional. The quick solution is to take advantage of the fact that OR is commutative...
if (!markComplete() || err_state) {
report_failure();
}
I love encapsulating my fxn calls into conditionals in this way. So I am at particular risk of creating this bug. I've found no stylisticc means of catching myself (other than not using fxn calls inside of compound conditionals). So I simply stay vigilant.
Inlining accessor members
I see nothing but benefit by doing things like this.
I consider this to be FAR superior to macros because a macro happens at the pre-processor (prior to the compiler). And the pre-proccessor has no type checks. Which leads me to....
Macros
Here is a good SO post on the pitfalls of macros. Mats Petersson's answer pretty much sums up my thoughts on macros. I do use them occasionally, but this is pushing the boundaries of good taste (and is on-deck for refactor due to type ambiguity)...
#ifndef max
#define max( a, b ) ( ((a) > (b)) ? (a) : (b) )
#endif
#ifndef min
#define min( a, b ) ( ((a) < (b)) ? (a) : (b) )
#endif
I would, however, consider platform abstraction to be a generally-acceptable use of macros...
#ifndef STM32F4XX
// To maintain uniformity with ST's code in an non-ST env...
#define I2C_Direction_Transmitter ((uint8_t)0x00)
#define I2C_Direction_Receiver ((uint8_t)0x01)
#define IS_I2C_DIRECTION(DIRECTION) (((DIRECTION) == I2C_Direction_Transmitter) || \
((DIRECTION) == I2C_Direction_Receiver))
#endif
A note regarding tool-choices and graphanalysis
I refuse to use an auto-formatter. I do all of my programming in a glorified text-editor (JEdit / Atom / TextPad) with no auto-complete and no auto-formatting. The one exception to this is java, where I will use Eclipse or IntelliJ for the sake of using the other tools surrounding the Java (Android layout editor, ADB front-end, etc).
Why would I be so masochistic?
- Tools that engender dependency are very dangerous tools.
- I consider this to be perfect legible commentary. Feed it into the autoformatter of your choice, and watch it become illegible. Tools that are easy-to-use and wreck perfectly good information are very dangerous tools.
- Buried in this class's whitespace are strong clues to its phylogeny.
That last point is subtle. But here is a simple example.
Take this comment block, and contrast it with this one. Notice that they are different widths. There are a number of reasons this might be the case. But considering that the flavor of the block is the same, the most likely is that the same author is using another (already extant) class as fodder for copy-pasta. A fuzzier version of the same conclusion could be drawn from hard vs soft tabbing that is intermixed in the same file; Either several authors, or a passage of time touched this file between edits.
If I were hunting a bug, I could then use that whitespace difference to find the functional boundaries of the edits, and looks for mistakes on the interface. Either the authors were not aware of how the other thought (if there were really two authors), or the same author let too many moons pass, and then dove into a feature addition with an insufficient re-read of his own code. Any experienced programmer who is honest with vimself realizes that he has done both.
After all... whitespace is part of the code, and I can only assume that the person who put it there did it for a reason (if if that reason was "copy-paste").
I've been trying to convert all of my commentary into a format that Doxygen can parse and use. But to use an autoformatter to achieve that end on someone else's code is wrecking perfectly good information that (if you are a detail-oriented reader) would help you better understand the program, and not merely the code.