- correctly handle non commutative operators
- correctly handle non zero lower bound ddt
- correctly handle ddt with size > extent
- revamp NBC_Sched_op so it takes two buffers and matches ompi_op_reduce semantic
- various fix for inter communicators
Thanks Yuki Matsumoto for the report