The View from the C++ Standard meeting April 2013 Part 3

By Archive User posted Mon April 29, 2013 08:04 PM

  

Originally posted by: Michael_Wong


In this series on C++14 content, we have looked at features from Language and Library. Now we turn to Concurrency, the other group that contributed features for C++14. In reality, some of the Language and Library features also came from Concurrency.

Concurrency is the other major group that convened, along with many small groups, to process all the smallish items for C++14. It came forward with three papers.

  • N3659 is based on N3568 and adds support for more complex forms of mutexes, including shared mutex and upgradable mutex. This paper has been around since before C++11 but was postponed due to workload. It allows clients to easily code the well-known multiple-readers/single-writer locking pattern. However, only the shared mutex part was passed. It adds the following shared_mutex:
namespace std {

class shared_mutex
{
public:

    shared_mutex();
    ~shared_mutex();

    shared_mutex(const shared_mutex&) = delete;
    shared_mutex& operator=(const shared_mutex&) = delete;

    // Exclusive ownership

    void lock();  // blocking
    bool try_lock();
    template <class Rep, class Period>
        bool try_lock_for(const chrono::duration<Rep, Period>& rel_time);
    template <class Clock, class Duration>
        bool
        try_lock_until(const chrono::time_point<Clock, Duration>& abs_time);
    void unlock();

    // Shared ownership

    void lock_shared();  // blocking
    bool try_lock_shared();
    template <class Rep, class Period>
        bool
        try_lock_shared_for(const chrono::duration<Rep, Period>& rel_time);
    template <class Clock, class Duration>
        bool
        try_lock_shared_until(
                      const chrono::time_point<Clock, Duration>& abs_time);
    void unlock_shared();
};

}  // std

and shared_lock:

namespace std {

template <class Mutex>
class shared_lock
{
public:
    typedef Mutex mutex_type;

    // Shared locking

    shared_lock() noexcept;
    explicit shared_lock(mutex_type& m);  // blocking
    shared_lock(mutex_type& m, defer_lock_t) noexcept;
    shared_lock(mutex_type& m, try_to_lock_t);
    shared_lock(mutex_type& m, adopt_lock_t);
    template <class Clock, class Duration>
        shared_lock(mutex_type& m,
                    const chrono::time_point<Clock, Duration>& abs_time);
    template <class Rep, class Period>
        shared_lock(mutex_type& m,
                    const chrono::duration<Rep, Period>& rel_time);
    ~shared_lock();

    shared_lock(shared_lock const&) = delete;
    shared_lock& operator=(shared_lock const&) = delete;

    shared_lock(shared_lock&& u) noexcept;
    shared_lock& operator=(shared_lock&& u) noexcept;

    void lock();  // blocking
    bool try_lock();
    template <class Rep, class Period>
        bool try_lock_for(const chrono::duration<Rep, Period>& rel_time);
    template <class Clock, class Duration>
        bool
        try_lock_until(const chrono::time_point<Clock, Duration>& abs_time);
    void unlock();

    // Setters

    void swap(shared_lock& u) noexcept;
    mutex_type* release() noexcept;

    // Getters

    bool owns_lock() const noexcept;
    explicit operator bool () const noexcept;
    mutex_type* mutex() const noexcept;

private:
    mutex_type* pm; // exposition only
    bool owns;      // exposition only
};

template <class Mutex>
  void swap(shared_lock<Mutex>& x, shared_lock<Mutex>& y) noexcept;

}  // std

The upgradable mutex proved contentious because of concerns about contention in its implementation, so it was deferred.
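
The multiple-readers/single-writer pattern that these two classes target can be sketched as follows. This is a minimal illustration rather than anything taken from N3659: the PhoneBook class and the run_readers_and_writer function are hypothetical names. It also assumes a compiler implementing the published C++14 standard, where the class listed above ended up with the name std::shared_timed_mutex (a std::shared_mutex without the timed operations followed in C++17).

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical example type (not from N3659): a small lookup table
// protected by a shared mutex.
class PhoneBook {
public:
    // Writer: exclusive ownership, acquired through lock()/unlock().
    void set(const std::string& name, int number) {
        std::unique_lock<std::shared_timed_mutex> guard(mutex_);
        entries_[name] = number;
    }

    // Reader: shared ownership, acquired through
    // lock_shared()/unlock_shared(), so several readers may hold it at once.
    int get(const std::string& name) const {
        std::shared_lock<std::shared_timed_mutex> guard(mutex_);
        auto it = entries_.find(name);
        return it == entries_.end() ? -1 : it->second;
    }

private:
    mutable std::shared_timed_mutex mutex_;
    std::map<std::string, int> entries_;
};

// Runs one writer and four readers concurrently, then returns the value
// that is visible once every thread has been joined.
int run_readers_and_writer() {
    PhoneBook book;
    book.set("alice", 1);

    std::vector<std::thread> threads;
    threads.emplace_back([&book] {
        for (int i = 2; i <= 100; ++i) book.set("alice", i);
    });
    for (int r = 0; r < 4; ++r) {
        threads.emplace_back([&book] {
            for (int i = 0; i < 100; ++i) {
                int seen = book.get("alice");
                // A reader only ever sees a value the writer stored.
                assert(seen >= 1 && seen <= 100);
            }
        });
    }
    for (auto& t : threads) t.join();
    return book.get("alice");
}
```

Calling run_readers_and_writer() returns 100, the last value stored by the writer. The readers use shared_lock and the writer uses unique_lock on the same mutex, which is the division of labour the two interfaces above are designed for.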

  • N3636 is based on N3630, an update of part of N3451. It was NOT PASSED. It is an outcome of the Kona Compromise and tries to fix a potential problem in the std::future semantic model. Currently, it is not clear whether the destructor of a future blocks or not when it throws. This makes future hard to use in generic code because the destructor will sometimes block. You could say that throwing from a destructor is not recommended anyway, so this probably happens rarely. But there was a drive to define clearly whether future destructors block, do not block, or stay as they are now. For fear of breaking code, the proposal worked out in committee supported all three solutions, and in particular invented a waiting_future returned by async() for the blocking case. Code that uses async() with an auto return type would continue to work, while other code would break noisily. There is no implementation concern; the counterargument is that some objected to inventing this at the last minute. In general, the Kona Compromise was a devil’s bargain to get some support for asynchronous features in advance, given that there was not enough time in C++11 to enable more advanced abstractions. The result was async and future, which have been full of problematic corner cases. Since then, we have been fixing them, and there is strong consensus in the group that we should stop doing that and spend more time on more complete proposals such as Google's executors and Microsoft's future.then continuations.
  • N3637 is based on N3630, an update of part of N3451. It was NOT PASSED. It makes the thread destructor join instead of calling terminate as it does now. This has the potential of making thread a proper RAII type and avoids inconsistency with async, which also joins. This change may cause some dubious code to hang. It was defeated because there were some concerns that the current behavior of the thread destructor was a deliberate design decision. This too will likely be revisited.
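
The two destructor behaviours discussed in the last two bullets can be illustrated with standard C++11 facilities. This is a sketch only: the wrapper name joining_thread and both function names are assumptions made for illustration and do not come from N3636 or N3637. The wrapper shows what a joining destructor looks like from the outside, and the second function shows the existing case in which a future's destructor blocks, namely a future obtained from std::async with the std::launch::async policy.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <future>
#include <thread>
#include <utility>

// Hypothetical wrapper (the name joining_thread is an assumption): its
// destructor joins, whereas destroying a joinable std::thread directly
// calls std::terminate.
class joining_thread {
public:
    explicit joining_thread(std::thread t) : thread_(std::move(t)) {}
    joining_thread(const joining_thread&) = delete;
    joining_thread& operator=(const joining_thread&) = delete;
    ~joining_thread() {
        if (thread_.joinable()) thread_.join();
    }

private:
    std::thread thread_;
};

// Returns true if the work had finished by the time the scope was left.
bool thread_wrapper_joins_on_scope_exit() {
    std::atomic<bool> done{false};
    {
        joining_thread worker(std::thread([&done] {
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
            done = true;
        }));
    }  // ~joining_thread() blocks here until the lambda returns
    return done.load();
}

// A future returned by std::async(std::launch::async, ...) already waits
// in its destructor for the task to complete.
bool async_future_blocks_on_scope_exit() {
    std::atomic<bool> done{false};
    {
        std::future<void> f = std::async(std::launch::async, [&done] {
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
            done = true;
        });
    }  // ~future() blocks here because f came from std::async
    return done.load();
}
```

Both functions return true, because in each case leaving the inner scope waits for the background work. The same blocking is what can make dubious code hang, which is the risk mentioned for N3637.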

After the C++14 features were triaged, we studied a number of future concurrency developments. Most of these are aiming for a Concurrency TS, which will likely be put forward at the next meeting in Chicago. The current set of proposals can be roughly divided into several groups.

There are the full-control tasking proposals, which allow you to launch tasks asynchronously but give you all the controls a programmer would expect. These are the MS future.then continuations (N3558) with resumable functions (N3564), and Google executors (N3562). Apple GCD also falls in this category. There are the groups that parallelize the C++ std library by adding parallel algorithms (N3554) and possibly concurrent containers, but also some form of task management as a library facility. This is represented by a marriage of TBB, PPL and Nvidia's Thrust. AMD's Bolt also falls in this category. They enable throughput. There are the high-level task language constructs that keep it simple and allow non-computer scientists to launch a parallel loop or task. These are the Cilk (N3557) and OpenMP (N3530) proposals. The vectorized loop proposal (N3561) from Intel also roughly falls along this line, enabling a uniform way of accessing different vector units.

I helped mostly with marrying OpenMP with Cilk and hope that having both semantics will enable closer collaboration between the two groups. Even so, there is a real question as to whether the language approach is superior to a pure library approach, and this needs to be studied with actual usage examples. These two proposals were also passed to the C Committee and a Study Group has been formed. This will hopefully enable further compatibility with C. Even though OpenMP is about to release 4.0 with vast new support for accelerators and improved support for tasks, reductions and affinity, much of that will not be in scope yet. For merging purposes, we would mainly aim for loops, reductions and tasks as a first stage for a merged syntax.
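
To give a feel for the "loops and reductions" starting point, here is a small sketch that uses OpenMP's existing pragma syntax. It is not the syntax proposed in N3530 or N3557, and the function name parallel_sum is hypothetical. A compiler without OpenMP enabled ignores the pragma and runs the loop serially with the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums a vector with a parallel loop and a reduction. Each thread
// accumulates into a private copy of sum, and the copies are combined
// with + when the loop ends.
long parallel_sum(const std::vector<int>& values) {
    long sum = 0;
    const long n = static_cast<long>(values.size());
    #pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < n; ++i) {
        sum += values[static_cast<std::size_t>(i)];
    }
    return sum;
}
```

The appeal of this style is that the serial loop stays readable and the parallelism is a one-line annotation; whether that should be a language construct or a library call is the open question noted above.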

Three Technical Specifications were approved to move forward. They are Concepts-lite, Networking and Filesystems (N3505). I did not attend the Networking subgroup, but they plan to issue a new TS every year. A document will be available in the post-meeting mailing. Filesystem did not meet, but I have analyzed it extensively for IBM and feel it is a good document that has been in service as Boost.Filesystem for some time. It is designed around the POSIX filesystem notation, with specific support for Linux and Windows-style filesystems. Having spoken to the author, I understand he is looking at improving it in the future to support large enterprise-scale filesystems, potentially GFS or even 390 MVS.

I will talk a little about Concepts. N3580 is Concepts-lite, which is Concepts without separate checking and without the concept map feature. This significantly reduces the burden of implementation. It came out of what is called the Palo Alto Technical Report, N3351, where Andrew Lumsdaine called a meeting with Bjarne Stroustrup, Alex Stepanov, Sean Parent and many from Indiana University to consolidate a way of moving forward after the removal of Concepts from C++11. Alex used a structure similar to his Elements of Programming book to design concepts for the STL.

This resulted in far fewer concepts than the original removed Concepts proposal. It has been implemented in GCC and is shown to be useful for template prototype checking as expected, and it replaces the similar facility in D's static_if (N3613). Bjarne further showed a terse syntax for constrained lambdas, which Herb had shown, where there is no <> bracket in sight. But more importantly, to take over the world, it has to be fast. Early test results from GCC indicate that code compiles faster with Concepts-lite because it stops all the needless instantiation that would normally occur.

Two other possible Technical Specifications, for Concurrency and Library Extension 2, were withdrawn without a vote because they did not have papers, although both have significant planned content and will probably be put forward in Chicago. The other possible TS for Chicago will be Transactional Memory.

 

I chaired the Transactional Memory SG5 group after giving a talk at ACCU on Transactional Memory progress to a room of over one hundred people. There was a lot of interest in Transactional Memory, including a genuine interest in participating in the design. Our group has been meeting regularly every 2 weeks since 2008, with members from Intel, IBM, Oracle, HP and Red Hat and many academic participants. We have developed a specification, which is now used as a base document and is implemented in the Intel C++ compiler and the GNU 4.7 C++ compiler. But this specification will likely be reduced for our first approach to standardization.

We continued to refine the semantics at this meeting. We had a 5-hour session which gathered feedback on design directions. In particular, Victor from Oracle presented N3589, a review of the specification as it stands today, the progress since Portland and the simple exception proposal. Torvald of Red Hat presented his paper on an advanced data escape mechanism, N3592. A third paper, N3591, on explicit cancellation was not discussed.

We had a large number of polls which helped us with design directions, and we obtained feedback that we should move quickly to submit standard wording for a simple specification for the first-iteration TS. We aim to do that for Chicago. The poll results were as follows:

On relaxed TX name change: no consensus

On explicit cancellation: no

Cancel on escaping exception: even, mixed

Commit on escaping exception: weak yes, what does no mean here?

Simple data escape: yes, but is tied to cancel on escape

Advanced data escape: clear no

Atomic and relaxed: both yes

The current syntax supports the following two types of transaction statements.

  • Atomic transaction: The body of an atomic transaction appears to take effect atomically: no other thread sees any intermediate states of an atomic transaction, nor does the thread executing an atomic transaction see the effects of any operations of other threads interleaved between the steps within the transaction.
     

__transaction_atomic [[noexcept]] { <body> }
__transaction_atomic [[commit_on_escape]] { <body> }
__transaction_atomic [[cancel_on_escape]] { <body> }

  • Relaxed transaction: It is semantically equivalent to having a special mutex lock (one for the entire system) that is acquired before executing the body and released after the body is executed (unless the relaxed transaction statement is nested within another relaxed transaction, in which case the lock is not released until the end of the outermost relaxed transaction), and no atomic transaction appears to take effect while this special lock is held by any other thread.

__transaction_relaxed { <body> }

One of the key distinctions between an atomic and a relaxed transaction is that any code may be executed within a relaxed transaction, but only transaction-safe code may be executed within an atomic transaction.
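
The locking description of a relaxed transaction can be modelled in ordinary C++11 with a single program-wide recursive mutex. This is only a model of the stated equivalence, not an implementation of transactional memory: the names global_transaction_lock, relaxed_block and run_transfers are hypothetical, and the sketch says nothing about atomic transactions, cancellation, or how a real implementation might run transactions concurrently.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Model only: one lock for the whole program stands in for the "special
// mutex" in the description of relaxed transactions. It is recursive, so
// a nested block does not deadlock and the lock is only given up when the
// outermost block ends.
std::recursive_mutex& global_transaction_lock() {
    static std::recursive_mutex m;
    return m;
}

// Hypothetical stand-in for: __transaction_relaxed { <body> }
template <class Body>
void relaxed_block(Body body) {
    std::lock_guard<std::recursive_mutex> guard(global_transaction_lock());
    body();  // any code may run here, including another relaxed_block
}

// Several threads move one unit at a time between two counters. The
// invariant a + b == total is checked inside a nested block.
long run_transfers(int threads, int transfers_per_thread) {
    long a = 1000, b = 0;
    const long total = a + b;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&, transfers_per_thread] {
            for (int i = 0; i < transfers_per_thread; ++i) {
                relaxed_block([&] {
                    a -= 1;
                    b += 1;
                    // Nested block: the lock is still held afterwards.
                    relaxed_block([&] { assert(a + b == total); });
                });
            }
        });
    }
    for (auto& th : pool) th.join();
    return b;
}
```

With 4 threads doing 100 transfers each, run_transfers(4, 100) returns 400, and no thread ever observes the two counters out of balance. The model also shows why unrestricted code is acceptable in a relaxed transaction: under a global lock nothing has to be undone, whereas an atomic transaction restricts its body to transaction-safe code.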

I will say more on the design in the future, but this completes this series of trip reports on the recent C++ Standard meeting.

 
