C++0x lambdas coding style

I wonder how people are using C++0x lambdas, in terms of coding style. The most interesting question is how thorough to be when writing the capture list. On one hand, the language allows listing captured variables explicitly, and by the "explicit is better than implicit" rule, it would therefore make sense to list them exhaustively to state the intention clearly. E.g.:

 int sum = 0;
 std::for_each(xs.begin(), xs.end(), [&sum](int x) { sum += x; });

Another argument for this is that the lifetime of reference-captured locals doesn't change just because they're captured, so a lambda can easily end up referencing a local whose lifetime has long ended. Making the capture explicit helps reduce such bugs and track them down.
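To make the lifetime concern concrete, here is a minimal sketch. The function name make_adder is made up for illustration. The lambda is returned from the function, so it outlives the local it captures; capturing by copy keeps it valid, while the reference capture described in the comment would compile but dangle.

```cpp
#include <functional>

// Hypothetical factory: the returned closure outlives the call to make_adder.
std::function<int(int)> make_adder(int n) {
    // Capturing n by copy is safe: the closure carries its own copy of n.
    return [n](int x) { return x + n; };
    // Writing [&n] (or [&]) here instead would still compile, but the closure
    // would hold a reference to a parameter that is destroyed when make_adder
    // returns, so calling the closure later would be undefined behaviour.
}
```

With an explicit [&n] the dangling reference is at least visible at the point where the lambda is written; with [&] you have to read the body to find it.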

On the other hand, the language also deliberately provides a shortcut for auto-capturing all referenced locals, so clearly it's intended to be used. And one could claim that for an example such as the one above, it is very clear what happens even with auto-capture, and the lambda's lifetime is such that it won't outlive the surrounding scope, so there's no reason not to use it:

 int sum = 0;
 std::for_each(xs.begin(), xs.end(), [&](int x) { sum += x; });

Obviously this doesn't have to be all-or-nothing, but there has to be some rationale for deciding when to auto-capture and when to capture explicitly. Any thoughts?

Another question in the same vein is when to use capture-by-copy ([=]) and when to use capture-by-reference ([&]). Capture-by-copy is obviously safer because there are no lifetime issues, so one could argue that it should be used by default whenever there's no need to mutate the captured value (or see the changes done to it from elsewhere), and capture-by-reference should be treated as a (potentially premature) optimization in such cases, to be applied only where it clearly makes a difference.

On the other hand, capture-by-reference is almost always faster (especially as it can often be optimized down to a copy, if the latter is actually faster, for small types and inlineable template functions such as most STL algorithms), and is safe if the lambda never outlives its scope (which is also the case for all STL algorithms), so defaulting to capture-by-reference in this case is a trivial and harmless optimization.
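Apart from speed and lifetime, the two capture modes also differ in what the lambda observes. The following is a rough sketch of that difference; the function name snapshot_vs_live and the variable threshold are invented for the example. A copy capture freezes the value at the point where the lambda is created, while a reference capture sees later changes.

```cpp
#include <utility>

// Returns {value seen through the copy capture, value seen through the reference capture}.
std::pair<int, int> snapshot_vs_live() {
    int threshold = 10;
    auto by_copy = [threshold]() { return threshold; };  // snapshot taken here
    auto by_ref = [&threshold]() { return threshold; };  // reads the live variable
    threshold = 20;  // modified after both lambdas were created
    return std::make_pair(by_copy(), by_ref());
}
```

So "need to see the changes done to it from elsewhere" is a semantic reason to pick [&], independent of any performance argument.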

What are your thoughts?

I can see a new coding standard rule here! ;)

This is a bit contrived but just to highlight an "advantage" to being explicit, consider the following:

void foo (std::vector<int> v, int x1)
{
  int sum = 0;
  std::for_each (v.begin ()
    , v.end ()
    , [&](int xl) { sum += x1; }); // typo: parameter 'x1' used where element 'xl' was meant
}

Now, I've purposely chosen poor names etc. for this, but it's just to illustrate the point. If we used an explicit capture list such as [&sum], the above code wouldn't compile, because x1 is not captured; with [&] it compiles without complaint.
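For comparison, a sketch of the explicit version (the function name sum_elements and the return value are my additions, so the result can be checked). Only sum is captured, so the x1/xl mix-up is rejected at compile time rather than silently summing the wrong thing.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical variant of foo with an explicit capture list.
int sum_elements(const std::vector<int>& v, int x1) {
    int sum = 0;
    // Only sum is captured. Typing 'x1' instead of 'xl' in the body would now
    // be a compile error, because x1 is not in the capture list.
    std::for_each(v.begin(), v.end(), [&sum](int xl) { sum += xl; });
    (void)x1;  // deliberately unused; kept only to mirror the original signature
    return sum;
}
```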

In a very strict environment (safety critical) I can see a rule like this being part of the coding standard.

I've never heard of the "explicit is better than implicit" rule, and I don't agree with it. There are cases where it's true, of course, but also plenty of cases where it isn't. That's why C++0x is adding type inference with the auto keyword, after all (and why function template parameters are already inferred when possible). There are plenty of cases where implicit is preferable.

I haven't really used C++ lambdas yet (other than poking around with the VC10 beta), but I'd go with the latter most of the time:

std::for_each(xs.begin(), xs.end(), [&](int x) { sum += x; });

My reasoning? Why not do it? It's convenient. It works. And it's easier to maintain. I don't have to update the capture list when I modify the body of the lambda. Why should I be explicit about something the compiler knows better than me? The compiler can figure out the capture list based on what's actually used.

As for capture by reference vs value? I'd apply the same rules as I do for regular functions. If you need reference semantics, capture by reference. If you need copy semantics, do that. If either will do, prefer value for smallish types, and reference if copying is expensive.

It doesn't seem different from the choice you have to make when designing a regular function.

I should probably read up on the specs for lambdas, but isn't the main reason for explicit capture lists so that you can capture some variables by value and others by reference?
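As far as I can tell, mixing the two modes is indeed something explicit lists allow. A small sketch, with scaled_sum and factor being names I made up: sum is captured by reference because the lambda mutates it, while factor is captured by copy because it is only read.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical example mixing capture modes in one capture list.
int scaled_sum(const std::vector<int>& xs, int factor) {
    int sum = 0;
    // sum: by reference (modified by the lambda); factor: by copy (read-only).
    std::for_each(xs.begin(), xs.end(),
                  [&sum, factor](int x) { sum += x * factor; });
    return sum;
}
```

The same mix can also be written with a default plus an exception, e.g. [=, &sum], which copies everything except sum.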

My initial instinct was that capture by value offers more or less the same as Java's anonymous inner classes, which are a known quantity. But rather than using the array-of-size-1 trick when you want the enclosing scope to be mutable, you can capture by reference instead. It's then your responsibility to confine the duration of the lambda within the scope of the referent.

In practice I agree with you that capture by reference should be the default when dealing with algorithms, which I expect will be the majority of uses. A common use for anonymous inner classes in Java is listeners. There are fewer listener-style interfaces to be seen in C++ to start with, so it's a lesser need, but still there. It might be best to strictly stick to capture by value in that kind of case, to avoid the opportunity for error. Will capture-by-value of a shared_ptr be a big idiom, maybe?
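Here is roughly what I imagine the shared_ptr idiom would look like for a listener. This is only a sketch: Button, listeners, click and attach_counter are all hypothetical names, not an existing API. The lambda captures the shared_ptr by value, so the state it mutates stays alive for as long as the listener is registered, even after the registering function has returned.

```cpp
#include <functional>
#include <memory>
#include <vector>

// Hypothetical event source holding listener callbacks.
struct Button {
    std::vector<std::function<void()>> listeners;
    void click() {
        for (auto& listener : listeners) listener();
    }
};

// Registers a listener that counts clicks. The counter is shared between the
// caller and the closure, so neither side can leave the other dangling.
std::shared_ptr<int> attach_counter(Button& button) {
    auto count = std::make_shared<int>(0);
    // Capture the shared_ptr by value: the closure co-owns the counter.
    button.listeners.push_back([count]() { ++*count; });
    return count;
}
```

The pointer itself is copied, but the pointee is shared, so you get mutable state without a reference capture and without having to reason about the scope of a local.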

However, I haven't used lambdas yet, so I may well have missed something huge.