The View from the C++ Standard meeting September 2013 Part 2 of 2.

By Archive User posted Mon December 02, 2013 03:54 PM

  

Originally posted by: Michael_Wong


In Part 1 of this C++ Standard September 2013 meeting trip report, I mostly went over the core and library issues that affect C++14 and are urgent for the new Standard to emerge. What some people forget is that while this drive for C++14 is happening, parts of the Committee are still working on large and small features beyond C++14. This part describes many of those future feature proposals. Many of them may only get full air time during the plenary session, and these plenary sessions are getting longer and longer because there are so many subgroups reporting.

Last time I checked, there were ten subgroups. Now, there are thirteen.

http://isocpp.org/std/the-committee

http://isocpp.org/files/img/wg21-structure.png

The new additions are:

SG11, Databases: Bill Seymour

SG12, Undefined and Unspecified Behavior: Gabriel Dos Reis

SG13, Graphics: Herb Sutter

The SGs act as mini-evolution working groups tasked with adding significant new features to C++. Most of these features will require approval through the Evolution Working Group (EWG), the Library Evolution Working Group (LEWG), or both. After features are approved by EWG or LEWG, they move to the Core WG or Library WG, respectively, for wording refinement. Doing it this way enables massive parallelism and still allows each of the Evolution groups to keep handling small features that do not fit into any particular category.

For instance, I also had a proposal to add restrict-like aliasing semantics to C++, and this was reviewed in EWG.

N3635

Towards restrict-like semantics for C++

Up till now, when you want to make sure that two pointer arguments of a function do not alias, you have to borrow C99's restrict facility. restrict is standard in C but not in C++; various C++ compilers have added it as an extension to meet the demand for such a facility. But C++ has found a number of issues that restrict does not address. For example, there is no way to use it for overlapping array elements or for member aliasing; it really only works well for function arguments. There are other issues, and the result is that we do not want something exactly like C99 restrict. So our paper proposes a facility called alias grouping, where the user marks, using C++11 attributes, which pointers may alias one another, say the green pointers as separate from the blue pointers, even though they have the same pointer type. This has the advantage that it is easy for the user to define, is non-intrusive and backwards compatible, and can be ignored if the compiler does not understand the attribute.
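To make the aliasing problem concrete, here is a minimal sketch of how the restrict-style promise is usually spelled in C++ today. The RESTRICT macro and the scale_into function are hypothetical names chosen for illustration; __restrict__ (GCC, Clang) and __restrict (MSVC) are compiler extensions rather than standard C++, and this is not the alias-grouping attribute syntax proposed in N3635.

```cpp
#include <cstddef>

// Map a hypothetical RESTRICT macro onto whatever extension the compiler has.
#if defined(_MSC_VER)
#  define RESTRICT __restrict
#elif defined(__GNUC__) || defined(__clang__)
#  define RESTRICT __restrict__
#else
#  define RESTRICT  // unknown compiler: no aliasing promise is made
#endif

// The caller promises that dst and src never overlap, so the compiler is
// free to keep src values in registers or to vectorize the loop.
void scale_into(float* RESTRICT dst, const float* RESTRICT src,
                std::size_t n, float factor) {
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = src[i] * factor;
    }
}
```

Note that the promise only attaches to the two function arguments, which is the limitation described above: nothing here can express that two members of a class, or two regions of one array, never alias.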

Most members of EWG loved the idea and urged us to develop it fully for the entire C++ language. This is an example of a future C++ feature that is addressed at the EWG level, and not by a study group. There are many others. For the most part, EWG tasks itself with addressing small-ish features and annoyances that do not necessarily fit in any subgroup. This list is maintained by Ville in N3811.

Starting with Concurrency, SG1 had ten or so National Body comments to address.

One of the issues was a discussion on N3630. This paper has three proposals:

  1. Require that return-from-main and exit join with outstanding async operations.
  2. Remove the requirement that releasing an async operation’s shared state shall block.
  3. Require that ~thread and thread::operator= implicitly join.

Working backwards: on the issue of thread destructor behavior, there was simply not enough consensus for a change, because some argued that the current behavior was a deliberate design.
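For readers who have not run into it: in C++11, destroying a std::thread that is still joinable calls std::terminate. The sketch below shows the implicit-join behavior that the third proposal asked for, written as a user-level wrapper. The joining_thread class and run_and_join function are hypothetical names for illustration and are not taken from N3630.

```cpp
#include <thread>
#include <utility>

// Hypothetical wrapper: joins in its destructor instead of terminating.
class joining_thread {
public:
    explicit joining_thread(std::thread t) : t_(std::move(t)) {}
    joining_thread(const joining_thread&) = delete;
    joining_thread& operator=(const joining_thread&) = delete;
    ~joining_thread() {
        if (t_.joinable()) {
            t_.join();  // a plain std::thread would call std::terminate here
        }
    }

private:
    std::thread t_;
};

int run_and_join() {
    int result = 0;
    {
        joining_thread worker(std::thread([&result] { result = 42; }));
    }  // the destructor joins, so the write to result is visible below
    return result;
}
```

The argument for keeping the C++11 behavior is visible here as well: the closing brace silently waits for the worker, however long that takes.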

On the issue that async destructors should not block, we devoted a great deal of discussion. There are currently three facilities in C++11 that can return a future: packaged tasks, promises, and async. Of these, only the future returned from async blocks on destruction. We considered at least six possible positions; the positions and the subsequent straw poll votes (strongly for, for, neutral, against, strongly against) were:

  1. ~future will not block unless returned from async: 20-1-1-0-0
  2. Add detach() to future to prevent blocking: 0-1-1-8-12
  3. Deprecate async without replacement: 15-6-1-1-1 (threads? exceptions?). Deprecate it now so as not to establish wrong usage: 12-4-2-0-4
  4. Split off task responsibility from future: NOT VOTED
  5. Add launch mode nonblocking async: NOT VOTED
  6. Is_evil flag could block destructor, or a special return_from_async_launch_async: 3-8-2-4-4

As you can see, the only position that received considerable support was the first: giving an advisory that future destructors will not block unless the future was returned from async, making async the notable exception.

One of the design issues discussed was that std::async conflates two concerns: it serves as both a value return mechanism and a task control mechanism. When there is a value, callers block naturally while retrieving it, and the destructor never has to wait. When there is no value, blocking becomes a potential problem if you look at async as a task control mechanism, because other asyncs can block behind it.

But there are programming models where blocking is useful and the task should not be left to run on, especially when you wish to use async as a task control mechanism that also returns a value; then you might want it to block. This is because futures returned from std::async are not intended to be passed across library/API boundaries without first calling .get() or .wait(), so that thereafter ~future will not block.
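The difference between the three sources of futures can be observed directly. The following is a minimal sketch, with hypothetical function names, that relies only on C++11 library calls: a future obtained from std::async with std::launch::async waits in its destructor, while a future obtained from a promise does not.

```cpp
#include <atomic>
#include <chrono>
#include <future>
#include <thread>

// Returns true if the task had finished by the time the future was destroyed.
bool async_future_waits_on_destruction() {
    std::atomic<bool> done{false};
    {
        std::future<void> f = std::async(std::launch::async, [&done] {
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
            done = true;
        });
    }  // ~future blocks here until the task has completed
    return done.load();
}

// Returns true if the future could be destroyed without any value being set.
bool promise_future_does_not_wait() {
    std::promise<int> p;
    {
        std::future<int> f = p.get_future();
    }  // ~future returns immediately; nobody ever calls p.set_value()
    return true;  // reaching this line shows the destructor did not block
}
```

Calling .get() or .wait() before the future goes out of scope moves the waiting to a point the caller chooses, which is why that is the recommended pattern before handing such a future to other code.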

As a comparison with another popular programming model, OpenMP parallel regions and worksharing constructs also have an implicit barrier at the end. On a worksharing construct, you have to specify the nowait clause to make it not block and wait. But OpenMP tasks do not have an implicit barrier. You have to specify the taskwait directive for them to block and wait.

I think that in the future, the proposed Concurrency TS may allow executors to help separate these concerns.

After significant discussion, the only part that we tried to carry was N3776, an attempt to clarify the position that ~future and ~shared_future don’t block except possibly in the presence of async.

There was an attempt to issue a deprecation along the lines of position 3, deprecating async without replacement. This motion was very nearly put forward. But before it even went to the mock plenary, Nicolai Josuttis circulated a petition arguing that the lack of a replacement would seriously jeopardize existing usage patterns, in effect invalidating all the C++11 courses and material taught so far. So much concern was raised by this point alone, along with the near certainty that the motion would be defeated by the NBs (as there were many who supported the petition), that it was decided the motion should not be brought forward at all. It died before it even reached the operating table.

Other papers that were discussed include how atomics work with signal handlers. While we felt there was sufficient resolution, this was delayed at Core and was not moved at this meeting. Another paper, which was moved, was the prohibition on out-of-thin-air (OOTA) results. The issue here is that the current wording on OOTA prohibits too much, including behavior that PowerPC specifically exhibits under the relaxed memory model. In a code example using Dekker's algorithm, there is no way to tell whether a reordered result was manufactured out of thin air or was a deliberately generated specific value. From N3710, which describes this problem well:

Consider the following example, where x and y are atomic variables initialized to zero, and ri are local variables:

Thread 1:
  r1 = x.load(memory_order_relaxed);
  y.store(r1, memory_order_relaxed);

Thread 2:
  r2 = y.load(memory_order_relaxed);
  x.store(r2, memory_order_relaxed);	

Effectively Thread 1 copies x to y, and thread 2 copies y to x. The section 1.10 specification allows each load to see either the initializing store of zero, or the store in the other thread.

This famously allows both r1 and r2 to have final values of 42, or any other "out of thin air" value. This occurs if each load sees the store in the other thread. It effectively models an execution in which the compiler speculates that both atomic variables will have a value of 42, speculatively stores the resulting values, and then performs the loads to confirm that the speculation was correct and nothing needs to be undone.

No known implementations actually produce such results. However, it is extraordinarily hard to write specifications that prevent them without also preventing legitimate compiler and hardware optimizations. As a first indication of the complexity, note that the following variation of the preceding example should ideally allow x = y = 42, and some existing implementations can produce such a result:

Thread 1:
  r1 = x.load(memory_order_relaxed);
  y.store(r1, memory_order_relaxed);

Thread 2:
  r2 = y.load(memory_order_relaxed);
  x.store(42, memory_order_relaxed);	

In this case, the load in each thread actually can see the store in the other thread, without problems. The two operations in thread 2 are independent and unordered, so either the compiler or hardware can reorder them.

Essentially, this has been an open issue in the Java specification for about 10 years. The major advantage that we have in C++ is that the problem is confined to non-synchronizing atomics, i.e. memory_order_relaxed, and some memory_order_consume uses (or read-modify-write operations that effectively weaken the ordering on either the read or write to one of those). Many of us expect those to be rarely used.

In general, it is difficult to describe OOTA results formally, and the current description in the C++11 Standard was simply wrong. So it was deemed best to remove that description from the Standard and replace it with normative encouragement that implementers not generate OOTA results.
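The two N3710 examples quoted above can be written as compilable C++ along the following lines. This is only a sketch: the function names and the Outcome struct are hypothetical, and the comments describe what known implementations do rather than what a formal model can prove.

```cpp
#include <atomic>
#include <thread>
#include <utility>

// First example: each thread copies one atomic into the other.
// Known implementations only ever produce r1 == r2 == 0 here.
std::pair<int, int> copy_each_other() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;
    std::thread t1([&] {
        r1 = x.load(std::memory_order_relaxed);
        y.store(r1, std::memory_order_relaxed);
    });
    std::thread t2([&] {
        r2 = y.load(std::memory_order_relaxed);
        x.store(r2, std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
    return {r1, r2};
}

// Hypothetical result record for the second example.
struct Outcome { int r1; int r2; int x; int y; };

// Second example: thread 2 stores the constant 42, so 42 is a real value
// in the program and r1 == r2 == 42 is a legitimate outcome.
Outcome store_constant() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;
    std::thread t1([&] {
        r1 = x.load(std::memory_order_relaxed);
        y.store(r1, std::memory_order_relaxed);
    });
    std::thread t2([&] {
        r2 = y.load(std::memory_order_relaxed);
        x.store(42, std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
    return Outcome{r1, r2, x.load(), y.load()};
}
```

In the second function each of r1 and r2 may end up as 0 or 42 depending on scheduling and reordering, which is exactly why a rule that forbids 42 in the first example must not also forbid it in the second.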

Further discussions were held on resumable functions and coroutines, and all seem encouraging. There were continued discussions on vectorization, task groups, concurrent containers, and counters.

SG3 on File System still has some work left to do for the TS, but it is largely complete. They are starting to think about a second TS.

SG4 on Networking expects a PDTS at the February meeting, but that depends on when the Library Fundamentals TS ships, because there is a dependency on string_view. This is a new kind of string that, unlike the original class string, does not own its characters but is a reference to an actual string.
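As an illustration of what "a reference to an actual string" means in practice, here is a small sketch. It uses std::string_view as it was later standardized in C++17, which is an assumption relative to this report: at the time of the meeting the type was only proposed for the Library Fundamentals TS. The first_word function is a hypothetical example.

```cpp
#include <string>
#include <string_view>

// Returns a view of the text up to the first space, without copying.
// If there is no space, find() returns npos and the whole text is returned.
std::string_view first_word(std::string_view text) {
    return text.substr(0, text.find(' '));
}
```

Given std::string sentence = "hello world", first_word(sentence) compares equal to "hello" and its data() pointer is sentence.data(): the view points into the original buffer, so it must not outlive the string it refers to.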

Within SG5, Transactional Memory (TM), we have put forward a specification proposal in N3718, and it was presented to the full Evolution group for the first time. We obtained fantastic feedback that offered guidance on how TM can work well within C++. There was general approval of the design, but the guidance means that we will need to simplify the proposal further and integrate it more closely with C++. The most interesting guidance is not to conflate invariance and synchronization. Herb Sutter in particular gave specific feedback indicating that what is desired is a simple way of offering composable synchronization over current locks. We also gave an evening session to acquaint and educate members on the design.
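To see why composability over locks is the sticking point, consider the classic two-account transfer written with C++11 locks only. This is a sketch of the problem, not of the N3718 design; the Account struct and transfer function are hypothetical.

```cpp
#include <mutex>

// Hypothetical account type: each object is protected by its own mutex.
struct Account {
    std::mutex m;
    long balance = 0;
};

// Moves amount from one account to another; returns false if it cannot.
bool transfer(Account& from, Account& to, long amount) {
    if (&from == &to) {
        return false;  // locking the same mutex twice would be undefined
    }
    // std::lock acquires both mutexes without risking deadlock, but the
    // caller has to know about both locks: the two operations do not compose.
    std::lock(from.m, to.m);
    std::lock_guard<std::mutex> hold_from(from.m, std::adopt_lock);
    std::lock_guard<std::mutex> hold_to(to.m, std::adopt_lock);
    if (from.balance < amount) {
        return false;
    }
    from.balance -= amount;
    to.balance += amount;
    return true;
}
```

Each account is safe on its own, yet combining a withdrawal and a deposit into one atomic step requires reaching into both objects' locks. A simpler way to compose such operations is the kind of facility the feedback asked for.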

SG8 on Concepts Lite has a proposed paper that will be turned into a TS in the future, but there is still some work to do.

SG10 on Feature Test has N3745, which was passed in EWG. But it is really a non-normative living document that will be updated as the Standard changes.
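In practice, feature testing of this kind looks like the following sketch. The macro __cpp_binary_literals follows the naming style of the SG10 feature-test recommendations; the constants and the low_nibble function are hypothetical names for illustration.

```cpp
// Use a binary literal only when the compiler advertises support for it.
#if defined(__cpp_binary_literals)
constexpr unsigned kNibbleMask = 0b1111;
constexpr bool kUsedBinaryLiteral = true;
#else
constexpr unsigned kNibbleMask = 0xF;  // portable fallback spelling
constexpr bool kUsedBinaryLiteral = false;
#endif

constexpr unsigned low_nibble(unsigned value) {
    return value & kNibbleMask;
}
```

Either branch yields the same mask value, so the rest of the program does not need to know which spelling the compiler accepted.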

SG11 has formally started, to support Databases. SG12 is another new group that will discuss Undefined and Unspecified Behaviour and educate the user community about it. There will be a paper that lists where something is undefined or unspecified.

SG13 is a group that will cover Graphics. It is also new and just starting to meet, but Microsoft has great interest in leading it.

The next meeting will be in February 2014, where we will continue triage of the remaining defects and NB comments. If it becomes possible to complete the work and issue a C++14 Draft International Standard (DIS), then we will ballot through the summer, even through the June C++ meeting, as there is no problem in having a meeting while a ballot is ongoing. If we still need more time to complete the work, then we have until November, the next meeting after June, giving us the necessary months to complete the C++14 ballot. Either way, I would say we are in very good shape to ship C++14 in 2014.
