18

I have to use IAR compiller in embedded application (it does not have namespaces, exceptions, multiple/virtual inheritance, templates are bit limited and only C++03 is supported). I can't use parameter pack so I tried to create member function with variadic parameter. I know variadic parameters are generally unsafe. But is safe to use this pointer in va_start macro?

If I use ordinary variadic function it would need a dummy parameter before ... to be able to access remaining parameters. I know variadic macro would not need parameter before ... but I would prefer not to use it. If I use member function it has hidden this parameter before ... so I tried it.:

struct VariadicTestBase{
  virtual void DO(...)=0;
};

struct VariadicTest: public VariadicTestBase{
  virtual void DO(...){
    va_list args;
    va_start(args, this);
    vprintf("%d%d%d\n",args);
    va_end(args);
  }
};

//Now I can do
VariadicTestBase *tst = new VariadicTest;
tst->DO(1,2,3);

tst->DO(1,2,3); prints 123 as expected. But I am not sure if it is not just some random/undefined behavior. I know tst->DO(1,2); would crash just like normal prinf would. I do not mind it.

  • 3
    The standard requires the argument to va_start to be the rightmost parameter in the parameter list ([cstdarg.syn]). Since this is not in the parameter list, I'd say it's undefined behavior. – cpplearner Apr 17 at 14:33
  • From a conceptual standpoint this is strange, since how is the callee supposed to know the number of arguments? The only sane place it can get it from is via this, but that means the number of arguments passed isn't a property of the invocation anymore, but of the object itself. That's pretty weird and it's not clear to me that it's a good design... and I'm not sure I've seen that in any language before. – Mehrdad Apr 17 at 16:15
  • @Mehrdad: Consider a function that accepts an arbitrary number of strings and output their total length followed by their content. If one could get a va_list that started with the first string, one could use va_copy on that, use one loop that reads all the arguments from the original va_list and adds up the lengths, and then a loop that uses the copied va_list to output all the strings, without having to "special-case" the first string. – supercat Apr 17 at 16:32
  • 1
    @supercat: But in that case you should be making the first string explicit. Varargs would imply that it could be absent, which is clearly not the case, unlike with the arguments that follow, so it really is a special case. And practically speaking it helps the compiler raise an error if you forget to specify it. – Mehrdad Apr 17 at 16:36
  • @Mehrdad: Using a separate explicit argument for the first string pointer require that code make an extra copy of that pointer as well as the va_list. Further, any functions that are passed a va_list would need to be passed a first-string pointer in addition to the va_list. Not a terribly huge problem, but the code could be cleaner and more efficient if that weren't required. – supercat Apr 17 at 16:46
20

Nothing specifies that behaviour in the standard, so this construct just invokes formal Undefined Behaviour. That means that it can work fine in your implementation and cause compilation error or unexpected results in a different implementation.

The fact that non static methods have to pass the hidden this pointer cannot guarantee that va_start can use it. It probably works that way because in the early times, C++ compilers were just pre-processors that converted C++ source to C source and the hidden this parameter was added by the pre-processor to be available to the C compiler. And it has probably be maintained for compatibility reasons. But I would try hard to avoid that in mission critical code...

  • 2
    If it's undefined, then I'd go even further and say that not only can it cause different results on different implementations, but also with different runs on the same implementation. – cadaniluk Apr 17 at 14:19
  • 2
    @cadaniluk Real world implementations are much more consistent than what is required by standard, whether their extensions are documented or not. Simply relying on that gives non portable code. – Serge Ballesta Apr 17 at 14:26
  • 2
    @SergeBallesta: Some real-world compiler-writers try to be more consistent than what is required by the Standard. Others are more prone to view consistent behaviors as "missed optimization opportunities" regardless of whether the consistent behaviors might have been useful, or the "optimizations" are likely to offer any real benefit. – supercat Apr 17 at 16:34
  • @SergeBallesta More consistent than is required, yes, but that does not mean that they are consistent enough to be relied upon. Even if an implementation always compiles the UB-containing code the same way (which is not guaranteed, but isn't unusual for some types of UB), and even if it compiles it the same way across all different levels of optimization (which is less likely), there's still no guarantee that the next release of the same compiler won't change a few things that result in different behavior in certain undefined cases. And trying to get such code to work again is a colossal pain. – Ray Apr 18 at 1:12
  • @supercat: That's why I had said that it was UB and would give non portable code. My remark was just about different runs on the same implementation. It is certainly allowed per standard for UB. But in this specific use case, I still think that the risk is low. Just my opinion of course, only enforced by programming for forty years. – Serge Ballesta Apr 18 at 7:52
3

Seems to be undefined behavior. If you look at what va_start(ap, pN) does in many implementations (check your header file), it takes the address of pN, increments the pointer by the size of pN and stores the result in ap. Can we legally look at &this?

I found a nice reference here: https://megam.info/a/9115110/10316011

Quoting the 2003 C++ standard:

5.1 [expr.prim] The keyword this names a pointer to the object for which a nonstatic member function (9.3.2) is invoked. ... The type of the expression is a pointer to the function’s class (9.3.2), ... The expression is an rvalue.

5.3.1 [expr.unary.op] The result of the unary & operator is a pointer to its operand. The operand shall be an lvalue or a qualified_id.

So even if this works for you, it is not guaranteed to and you should not rely on it.

  • You say that va_start(ap, pN) takes the address of pN. Are you sure that's what it does? What if pN is a parameter passed in a register, and therefore has no address? – John Zwinck Apr 17 at 14:58
  • @JohnZwinck putting an ellipsis in the arguments should force the function/method into cdecl calling convention – jakub_d Apr 17 at 14:59
  • 2
    IAR defines: va_start(ap, A) (ap._Ap = _CSTD __va_start1()). So it ignores 2nd parameter. va_start(args, 0); works too. But it is ugly and not portable. – user11373693 Apr 17 at 16:46
  • 1
    @user11373693 I think I remember borland c++ 3.1 ignoring the second arg as well. So you have the option of putting va_start(ap, if this breaks for you - fix it), if someone uses a different compiler, at least they will find out :) – jakub_d Apr 17 at 16:52
  • 1
    @jakub_d On intel + linux the ellipsis still passes first arguments by registers. As evidence, play with printf("%d %lf", 1.2, 10) and see how it prints 10, 1.2, since it looks at the registers not the stack. – Michael Veksler Apr 17 at 19:08
2

I think it should be OK, though I doubt you will find a specific citation from the C++ standard which says so.

The rationale is this: va_start() must be passed the last argument to the function. A member function taking no explicit parameters has only a single parameter (this), which therefore must be its last parameter.

It will be easy to add a unit test to alert you if you ever compile on a platform where this doesn't work (which seems unlikely, but then again you are already compiling on a somewhat atypical platform).

  • 4
    Is it possible for an ABI to assign a specific register for this which is never used in regular functions (hence breaking va_start)? Is it possible for this to be subject to special optimizations which break va_start? – Michael Veksler Apr 17 at 13:48
  • 2
    you are talking about implementation details, not the language of the standard. The language does not say thatthis is a parameter. At least not in a way compatible with va_start language. Of course it's implemented as a parameter, but without explicit language guarantee the compiler might do something bad with it – Michael Veksler Apr 17 at 14:12
  • 1
    @JohnZwinck The behavior of bind is specified to use the INVOKE operation described here, which has special handling for member functions. This is just an abstraction layer, so it doesn't mean the ABI is compatible. – interjay Apr 17 at 14:27
  • 1
    "It will be easy to add a unit test", how, in any reasonably reliable way? – Passer By Apr 17 at 14:43
  • 1
    @PasserBy: Here are some ideas: (1) Call the function, check the results (via passing the args to snprintf(), vsnprintf(), etc). (2) Do the same thing but running inside UBSan, valgrind, etc. (3) Compile but do not assemble, check in the generated assembly code for a simple example of both callee and caller, and diff (in case your platforms ever change). – John Zwinck Apr 17 at 14:48
0

This is undefined behavior. Since the language does not require this to be passed as a parameter, it might not be passed at all.

For example, if a compiler can figure out that an object is a singleton, it may avoid passing this as a parameter and use a global symbol when the address of this is explicitly required (like in the case of va_start). In theory, the compiler might generate code to compensate that in va_start (after all, the compiler knows this is a singleton), but it is not required to do so by the standard.

Think of something like:

class single {
public:
   single(const single& )= delete;
   single &operator=(const single& )= delete;
   static single & get() {
       // this is the only place that can construct the object.
       // this address is know not later than load time:
       static single x;
       return x;
   }
  void print(...) {
      va_list args;
      va_start (args, this);
      vprintf ("%d\n", args);
      va_end (args);
}

private:
  single() = default;
};

Some compilers, like clang 8.0.0, emit a warning for the above code:

prog.cc:15:23: warning: second argument to 'va_start' is not the last named parameter [-Wvarargs] va_start (args, this);

Despite the warning, it runs ok. In general, this proves nothing, but it is a bad idea to have a warning .

Note: I don't know of any compiler that detects singletons and treats them specially, but the language does not forbid this kind of optimization. If it is not done today by your compiler, it might be done tomorrow by another compiler.

Note 2: despite all that, it might work in practice to pass this to va_start. Even if it works, it is not a good idea to do something that isn't guaranteed by the standard.

Note 3: The same singleton optimization can't be applied to parameters such as:

void foo(singleton * x, ...)

It can't be optimized away since it may have one of two values, point to the singleton or be nullptr. This means that this optimization concern does not apply here.

  • Isn't the hypothetical optimization you've presented also possible if the parameter is named? E.g. a single* which is ever dereferenced can be shown to come from static single x in get(), therefore it does not need to be passed to a regular free function void f(single* s, ...) { *s; va_list vl; va_start(vl, s); va_end(vl); }. Do you think this function f has undefined behavior too? – John Zwinck Apr 17 at 15:02
  • @JohnZwinck no, it doesn't, since that would break va_start for an explicit parameter. The compiler is not allowed to perform optimizations which break legal code – Michael Veksler Apr 17 at 15:02
  • 2
    You've basically made a circular argument now. – John Zwinck Apr 17 at 15:04
  • You can't optimize away regular single* parameter, since it could either point to the singleton or be null. On the other hand this must point to the singleton, always. – Michael Veksler Apr 17 at 18:27
  • You can optimize away a regular single* if it is dereferenced (because that means if it is null you have UB, so it didn't matter what you did). That's why I wrote the unused *s. – John Zwinck Apr 18 at 4:54

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy

Not the answer you're looking for? Browse other questions tagged or ask your own question.