From 50765ae5a26f718c8840e65cb0cb813f4a65004b Mon Sep 17 00:00:00 2001
From: Brian Barrett
Date: Fri, 5 Jun 2020 11:14:34 -0700
Subject: [PATCH] dist: Update NEWS from release branches

We have been bad about updating the NEWS file in master with
all the changes that have gone into the release branches.
Patch up NEWS with the changes from v3.0, v3.1, and v4.0
branches.

Signed-off-by: Brian Barrett
---
 NEWS | 474 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 474 insertions(+)

diff --git a/NEWS b/NEWS
index 5822373d94..f04f643007 100644
--- a/NEWS
+++ b/NEWS
@@ -80,6 +80,352 @@ Master (not on release branches yet)
   Currently, this means the Open SHMEM layer will only build if
   a MXM or UCX library is found.
 
+4.0.4 -- May, 2020
+-----------------------
+- Add checks to avoid conflicts with a libevent library shipped with LSF.
+- Switch to linking against libevent_core rather than libevent, if present.
+- Add improved support for UCX 1.9 and later.
+- Fix an ABI compatibility issue with the Fortran 2008 bindings.
+  Thanks to Alastair McKinstry for reporting.
+- Fix an issue with rpath of /usr/lib64 when building OMPI on
+  systems with Lustre. Thanks to David Shrader for reporting.
+- Fix a memory leak occurring with certain MPI RMA operations.
+- Fix an issue with ORTE's mapping of MPI processes to resources.
+  Thanks to Alex Margolin for reporting and providing a fix.
+- Correct a problem with incorrect error codes being returned
+  by OMPI MPI_T functions.
+- Fix an issue with debugger tools not being able to attach
+  to mpirun more than once. Thanks to Gregory Lee for reporting.
+- Fix an issue with the Fortran compiler wrappers when using
+  NAG compilers. Thanks to Peter Brady for reporting.
+- Fix an issue with the ORTE ssh based process launcher at scale.
+  Thanks to Benjamín Hernández for reporting.
+- Address an issue when using shared MPI I/O operations. OMPIO will
+  now successfully return from the file open statement but will
+  raise an error if the file system does not support shared I/O
+  operations. Thanks to Romain Hild for reporting.
+- Fix an issue with MPI_WIN_DETACH. Thanks to Thomas Naughton for reporting.
+
+4.0.3 -- March, 2020
+-----------------------
+- Update embedded PMIx to 3.1.5
+- Add support for Mellanox ConnectX-6.
+- Fix an issue in OpenMPI IO when using shared file pointers.
+  Thanks to Romain Hild for reporting.
+- Fix a problem with Open MPI using a previously installed
+  Fortran mpi module during compilation. Thanks to Marcin
+  Mielniczuk for reporting.
+- Fix a problem with Fortran compiler wrappers ignoring use of
+  disable-wrapper-runpath configure option. Thanks to David
+  Shrader for reporting.
+- Fixed an issue with trying to use mpirun on systems where neither
+  ssh nor rsh is installed.
+- Address some problems found when using XPMEM for intra-node message
+  transport.
+- Improve dimensions returned by MPI_Dims_create for certain
+  cases. Thanks to @aw32 for reporting.
+- Fix an issue when sending messages larger than 4GB. Thanks to
+  Philip Salzmann for reporting this issue.
+- Add ability to specify alternative module file path using
+  Open MPI's RPM spec file. Thanks to @jschwartz-cray for reporting.
+- Clarify use of --with-hwloc configuration option in the README.
+  Thanks to Marcin Mielniczuk for raising this documentation issue.
+- Fix an issue with shmem_atomic_set. Thanks to Sameh Sharkawi for reporting.
+- Fix a problem with MPI_Neighbor_alltoall(v,w) for cartesian communicators
+  with cyclic boundary conditions. Thanks to Ralph Rabenseifner and
+  Tony Skjellum for reporting.
+- Fix an issue using Open MPIO on 32 bit systems. Thanks to
+  Orion Poplawski for reporting.
+- Fix an issue with NetCDF test deadlocking when using the vulcan
+  Open MPIO component. Thanks to Orion Poplawski for reporting.
+- Fix an issue with the mpi_yield_when_idle parameter being ignored
+  when set in the Open MPI MCA parameter configuration file.
+  Thanks to @iassiour for reporting.
+- Address an issue with Open MPIO when writing/reading more than 2GB
+  in an operation. Thanks to Richard Warren for reporting.
+
+4.0.2 -- September, 2019
+------------------------
+- Update embedded PMIx to 3.1.4
+- Enhance Open MPI to detect when processes are running in
+  different name spaces on the same node, in which case the
+  vader CMA single copy mechanism is disabled. Thanks
+  to Adrian Reber for reporting and providing a fix.
+- Fix an issue with ORTE job tree launch mechanism. Thanks
+  to @lanyangyang for reporting.
+- Fix an issue with env processing when running as root.
+  Thanks to Simon Byrne for reporting and providing a fix.
+- Fix Fortran MPI_FILE_GET_POSITION return code bug.
+  Thanks to Wei-Keng Liao for reporting.
+- Fix user defined datatypes/ops leak in nonblocking base collective
+  component. Thanks to Andrey Maslennikov for verifying fix.
+- Fixed shared memory not working with spawned processes.
+  Thanks to @rodarima for reporting.
+- Fix data corruption of overlapping datatypes on sends.
+  Thanks to DKRZ for reporting.
+- Fix segfault in oob_tcp component on close with active listeners.
+  Thanks to Orivej Desh for reporting and providing a fix.
+- Fix divide by zero segfault in ompio.
+  Thanks to @haraldkl for reporting and providing a fix.
+- Fix finalize of flux components.
+  Thanks to Stephen Herbein and Jim Garlick for providing a fix.
+- Fix osc_rdma_acc_single_intrinsic regression.
+  Thanks to Joseph Schuchart for reporting and providing a fix.
+- Fix hostnames with large integers.
+  Thanks to @perrynzhou for reporting and providing a fix.
+- Fix Deadlock in MPI_Fetch_and_op when using UCX.
+  Thanks to Joseph Schuchart for reporting.
+- Fix the SLURM plm for mpirun-based launching.
+  Thanks to Jordon Hayes for reporting and providing a fix.
+- Prevent grep failure in rpmbuild from aborting.
+  Thanks to Daniel Letai for reporting.
+- Fix btl/vader finalize sequence.
+  Thanks to Daniel Vollmer for reporting.
+- Fix pml/ob1 local handle sent during PUT control message.
+  Thanks to @EmmanuelBRELLE for reporting and providing a fix.
+- Fix Memory leak with persistent MPI sends and the ob1 "get" protocol.
+  Thanks to @s-kuberski for reporting.
+- v4.0.x: mpi: mark MPI_COMBINER_{HVECTOR,HINDEXED,STRUCT}_INTEGER
+  removed unless configured with --enable-mpi1-compatibility
+- Fix make-authors.pl when run in a git submodule.
+  Thanks to Michael Heinz for reporting and providing a fix.
+- Fix deadlock with mpi_assert_allow_overtaking in MPI_Issend.
+  Thanks to Joseph Schuchart and George Bosilca for reporting.
+- Add compilation flag to allow unwinding through files that are
+  present in the stack when attaching with MPIR.
+  Thanks to James A Clark for reporting and providing a fix.
+
+Known issues:
+
+- There is a known issue with the OFI libfabric and PSM2 MTLs when trying to send
+  very long (> 4 GBytes) messages. In this release, these MTLs will catch
+  this case and abort the transfer. A future release will provide a
+  better solution to this issue.
+
+4.0.1 -- March, 2019
+--------------------
+
+- Update embedded PMIx to 3.1.2.
+- Fix an issue with Vader (shared-memory) transport on OS-X. Thanks
+  to Daniel Vollmer for reporting.
+- Fix a problem with the usNIC BTL Makefile. Thanks to George Marselis
+  for reporting.
+- Fix an issue when using --enable-visibility configure option
+  and older versions of hwloc. Thanks to Ben Menadue for reporting
+  and providing a fix.
+- Fix an issue with MPI_WIN_CREATE_DYNAMIC and MPI_GET from self.
+  Thanks to Bart Janssens for reporting.
+- Fix an issue of excessive compiler warning messages from mpi.h
+  when using newer C++ compilers. Thanks to @Shadow-fax for
+  reporting.
+- Fix a problem when building Open MPI using clang 5.0.
+- Fix a problem with MPI_WIN_CREATE when using UCX. Thanks
+  to Adam Simpson for reporting.
+- Fix a memory leak encountered for certain MPI datatype
+  destructor operations. Thanks to Axel Huebl for reporting.
+- Fix several problems with MPI RMA accumulate operations.
+  Thanks to Jeff Hammond for reporting.
+- Fix possible race condition in closing some file descriptors
+  during job launch using mpirun. Thanks to Jason Williams
+  for reporting and providing a fix.
+- Fix a problem in OMPIO for large individual write operations.
+  Thanks to Axel Huebl for reporting.
+- Fix a problem with parsing of map-by ppr options to mpirun.
+  Thanks to David Rich for reporting.
+- Fix a problem observed when using the mpool hugepage component. Thanks
+  to Hunter Easterday for reporting and fixing.
+- Fix valgrind warning generated when invoking certain MPI Fortran
+  data type creation functions. Thanks to @rtoijala for reporting.
+- Fix a problem when trying to build with a PMIX 3.1 or newer
+  release. Thanks to Alastair McKinstry for reporting.
+- Fix a problem encountered with building MPI F08 module files.
+  Thanks to Igor Andriyash and Axel Huebl for reporting.
+- Fix two memory leaks encountered for certain MPI-RMA usage patterns.
+  Thanks to Joseph Schuchart for reporting and fixing.
+- Fix a problem with the ORTE rmaps_base_oversubscribe MCA parameter.
+  Thanks to @iassiour for reporting.
+- Fix a problem with UCX PML default error handler for MPI communicators.
+  Thanks to Marcin Krotkiewski for reporting.
+- Fix various issues with OMPIO uncovered by the testmpio test suite.
+
+4.0.0 -- September, 2018
+------------------------
+
+- OSHMEM updated to the OpenSHMEM 1.4 API.
+- Do not build OpenSHMEM layer when there are no SPMLs available.
+  Currently, this means the OpenSHMEM layer will only build if
+  a MXM or UCX library is found.
+- A UCX BTL was added for enhanced MPI RMA support using UCX
+- With this release, OpenIB BTL now only supports iWarp and RoCE by default.
+- Updated internal HWLOC to 2.0.2
+- Updated internal PMIx to 3.0.2
+- Change the priority for selecting external versus internal HWLOC
+  and PMIx packages to build. Starting with this release, configure
+  by default selects available external HWLOC and PMIx packages over
+  the internal ones.
+- Updated internal ROMIO to 3.2.1.
+- Removed support for the MXM MTL.
+- Removed support for SCIF.
+- Improved CUDA support when using UCX.
+- Enable use of CUDA allocated buffers for OMPIO.
+- Improved support for two phase MPI I/O operations when using OMPIO.
+- Added support for Software-based Performance Counters, see
+  https://github.com/davideberius/ompi/wiki/How-to-Use-Software-Based-Performance-Counters-(SPCs)-in-Open-MPI
+- Change MTL OFI from opting-IN on "psm,psm2,gni" to opting-OUT on
+  "shm,sockets,tcp,udp,rstream"
+- Various improvements to MPI RMA performance when using RDMA
+  capable interconnects.
+- Update memkind component to use the memkind 1.6 public API.
+- Fix a problem with javadoc builds using OpenJDK 11. Thanks to
+  Siegmar Gross for reporting.
+- Fix a memory leak using UCX. Thanks to Charles Taylor for reporting.
+- Fix hangs in MPI_FINALIZE when using UCX.
+- Fix a problem with building Open MPI using an external PMIx 2.1.2
+  library. Thanks to Marcin Krotkiewski for reporting.
+- Fix race conditions in Vader (shared memory) transport.
+- Fix problems with use of newer map-by mpirun options. Thanks to
+  Tony Reina for reporting.
+- Fix rank-by algorithms to properly rank by object and span
+- Allow for running as root if two environment variables are set.
+  Requested by Axel Huebl.
+- Fix a problem with building the Java bindings when using Java 10.
+  Thanks to Bryce Glover for reporting.
+- Fix a problem with ORTE not reporting error messages if an application
+  terminated normally but exited with non-zero error code. Thanks to
+  Emre Brookes for reporting.
+
+3.1.6 -- March, 2020
+--------------------
+
+- Fix one-sided shared memory window configuration bug.
+- Fix support for PGI'18 compiler.
+- Fix issue with zero-length blockLength in MPI_TYPE_INDEXED.
+- Fix run-time linker issues with OMPIO on newer Linux distros.
+- Fix PMIX dstore locking compilation issue. Thanks to Marco Atzeri
+  for reporting the issue.
+- Allow the user to override modulefile_path in the Open MPI SRPM,
+  even if install_in_opt is set to 1.
+- Properly detect ConnectX-6 HCAs in the openib BTL.
+- Fix segfault in the MTL/OFI initialization for large jobs.
+- Fix issue to guarantee to properly release MPI one-sided lock when
+  using UCX transports to avoid a deadlock.
+- Fix potential deadlock when processing outstanding transfers with
+  uGNI transports.
+- Fix various portals4 control flow bugs.
+- Fix communications ordering for alltoall and Cartesian neighborhood
+  collectives.
+- Fix an infinite recursion crash in the memory patcher on systems
+  with glibc v2.26 or later (e.g., Ubuntu 18.04) when using certain
+  OS-bypass interconnects.
+
+3.1.5 -- November, 2019
+-----------------------
+
+- Fix OMPIO issue limiting file reads/writes to 2GB. Thanks to
+  Richard Warren for reporting the issue.
+- At run time, automatically disable Linux cross-memory attach (CMA)
+  for vader BTL (shared memory) copies when running in user namespaces
+  (i.e., containers). Many thanks to Adrian Reber for raising the
+  issue and providing the fix.
+- Sending very large MPI messages using the ofi MTL will fail with
+  some of the underlying Libfabric transports (e.g., PSM2 with
+  messages >=4GB, verbs with messages >=2GB). Prior versions of Open
+  MPI failed silently; this version of Open MPI invokes the
+  appropriate MPI error handler upon failure. See
+  https://github.com/open-mpi/ompi/issues/7058 for more details.
+  Thanks to Emmanuel Thomé for raising the issue.
+- Fix case where 0-extent datatypes might be eliminated during
+  optimization. Thanks to Github user @tjahns for raising the issue.
+- Ensure that the MPIR_Breakpoint symbol is not optimized out on
+  problematic platforms.
+- Fix MPI one-sided 32 bit atomic support.
+- Fix OMPIO offset calculations with SEEK_END and SEEK_CUR in
+  MPI_FILE_GET_POSITION. Thanks to Wei-keng Liao for raising the
+  issue.
+- Add "naive" regx component that will never fail, no matter how
+  esoteric the hostnames are.
+- Fix corner case for datatype extent computations. Thanks to David
+  Dickenson for raising the issue.
+- Allow individual jobs to set their map/rank/bind policies when
+  running LSF. Thanks to Nick R. Papior for assistance in solving the
+  issue.
+- Fix MPI buffered sends with the "cm" PML.
+- Properly propagate errors to avoid deadlocks in MPI one-sided operations.
+- Update to PMIx v2.2.3.
+- Fix data corruption in non-contiguous MPI accumulates over UCX.
+- Fix ssh-based tree-based spawning at scale. Many thanks to Github
+  user @zrss for the report and diagnosis.
+- Fix the Open MPI RPM spec file to not abort when grep fails. Thanks
+  to Daniel Letai for bringing this to our attention.
+- Handle new SLURM CLI options (SLURM 19 deprecated some options that
+  Open MPI was using). Thanks to Jordan Hayes for the report and the
+  initial fix.
+- OMPI: fix division by zero with an empty file view.
+- Also handle shmat()/shmdt() memory patching with OS-bypass networks.
+- Add support for unwinding info to all files that are present in the
+  stack starting from MPI_Init, which is helpful with parallel
+  debuggers. Thanks to James Clark for the report and initial fix.
+- Fixed inadvertent use of bitwise operators in the MPI C++ bindings
+  header files. Thanks to Bert Wesarg for the report and the fix.
+
+3.1.4 -- April, 2019
+--------------------
+
+- Fix compile error when configured with --enable-mpi-java and
+  --with-devel-headers. Thanks to @g-raffy for reporting the issue
+  (** also appeared: v3.0.4).
+- Only use hugepages with appropriate permissions. Thanks to Hunter
+  Easterday for the fix.
+- Fix possible floating point rounding and division issues in OMPIO
+  which led to crashes and/or data corruption with very large data.
+  Thanks to Axel Huebl and René Widera for identifying the issue,
+  supplying and testing the fix (** also appeared: v3.0.4).
+- Use static_cast<> in mpi.h where appropriate. Thanks to @shadow-fx
+  for identifying the issue (** also appeared: v3.0.4).
+- Fix RMA accumulate of non-predefined datatypes with predefined
+  operators. Thanks to Jeff Hammond for raising the issue (** also
+  appeared: v3.0.4).
+- Fix race condition when closing open file descriptors when launching
+  MPI processes. Thanks to Jason Williams for identifying the issue and
+  supplying the fix (** also appeared: v3.0.4).
+- Fix support for external PMIx v3.1.x.
+- Fix Valgrind warnings for some MPI_TYPE_CREATE_* functions. Thanks
+  to Risto Toijala for identifying the issue and supplying the fix (**
+  also appeared: v3.0.4).
+- Fix MPI_TYPE_CREATE_F90_{REAL,COMPLEX} for r=38 and r=308 (** also
+  appeared: v3.0.4).
+- Fix assembly issues with old versions of gcc (<6.0.0) that affected
+  the stability of shared memory communications (e.g., with the vader
+  BTL) (** also appeared: v3.0.4).
+- Fix MPI_Allreduce crashes with some cases in the coll/spacc module.
+- Fix the OFI MTL handling of MPI_ANY_SOURCE (** also appeared:
+  v3.0.4).
+- Fix noisy errors in the openib BTL with regards to
+  ibv_exp_query_device(). Thanks to Angel Beltre and others who
+  reported the issue (** also appeared: v3.0.4).
+- Fix zero-size MPI one-sided windows with UCX.
+
+3.1.3 -- October, 2018
+----------------------
+
+- Fix race condition in MPI_THREAD_MULTIPLE support of non-blocking
+  send/receive path.
+- Fix error handling SIGCHLD forwarding.
+- Add support for CHARACTER and LOGICAL Fortran datatypes for MPI_SIZEOF.
+- Fix compile error when using OpenJDK 11 to compile the Java bindings.
+- Fix crash when using a hostfile with a 'user@host' line.
+- Numerous Fortran '08 interface fixes.
+- TCP BTL error message fixes.
+- OFI MTL now will use any provider other than shm, sockets, tcp, udp, or
+  rstream, rather than only supporting gni, psm, and psm2.
+- Disable async receive of CUDA buffers by default, fixing a hang
+  on large transfers.
+- Support the BCM57XXX and BCM58XXX Broadcom adapters.
+- Fix minmax datatype support in ROMIO.
+- Bug fixes in vader shared memory transport.
+- Support very large buffers with MPI_TYPE_VECTOR.
+- Fix hang when launching with mpirun on Cray systems.
+
 3.1.2 -- August, 2018
 ------------------------
 
@@ -186,6 +532,134 @@ Master (not on release branches yet)
 - Remove support for XL compilers older than v13.1.
 - Remove support for atomic operations using MacOS atomics library.
 
+3.0.6 -- March, 2020
+--------------------
+
+- Fix one-sided shared memory window configuration bug.
+- Fix support for PGI'18 compiler.
+- Fix run-time linker issues with OMPIO on newer Linux distros.
+- Allow the user to override modulefile_path in the Open MPI SRPM,
+  even if install_in_opt is set to 1.
+- Properly detect ConnectX-6 HCAs in the openib BTL.
+- Fix segfault in the MTL/OFI initialization for large jobs.
+- Fix various portals4 control flow bugs.
+- Fix communications ordering for alltoall and Cartesian neighborhood
+  collectives.
+- Fix an infinite recursion crash in the memory patcher on systems
+  with glibc v2.26 or later (e.g., Ubuntu 18.04) when using certain
+  OS-bypass interconnects.
+
+3.0.5 -- November, 2019
+-----------------------
+
+- Fix OMPIO issue limiting file reads/writes to 2GB. Thanks to
+  Richard Warren for reporting the issue.
+- At run time, automatically disable Linux cross-memory attach (CMA)
+  for vader BTL (shared memory) copies when running in user namespaces
+  (i.e., containers). Many thanks to Adrian Reber for raising the
+  issue and providing the fix.
+- Sending very large MPI messages using the ofi MTL will fail with
+  some of the underlying Libfabric transports (e.g., PSM2 with
+  messages >=4GB, verbs with messages >=2GB). Prior versions of Open
+  MPI failed silently; this version of Open MPI invokes the
+  appropriate MPI error handler upon failure. See
+  https://github.com/open-mpi/ompi/issues/7058 for more details.
+  Thanks to Emmanuel Thomé for raising the issue.
+- Fix case where 0-extent datatypes might be eliminated during
+  optimization. Thanks to Github user @tjahns for raising the issue.
+- Ensure that the MPIR_Breakpoint symbol is not optimized out on
+  problematic platforms.
+- Fix OMPIO offset calculations with SEEK_END and SEEK_CUR in
+  MPI_FILE_GET_POSITION. Thanks to Wei-keng Liao for raising the
+  issue.
+- Fix corner case for datatype extent computations. Thanks to David
+  Dickenson for raising the issue.
+- Fix MPI buffered sends with the "cm" PML.
+- Update to PMIx v2.2.3.
+- Fix ssh-based tree-based spawning at scale. Many thanks to Github
+  user @zrss for the report and diagnosis.
+- Fix the Open MPI RPM spec file to not abort when grep fails. Thanks
+  to Daniel Letai for bringing this to our attention.
+- Handle new SLURM CLI options (SLURM 19 deprecated some options that
+  Open MPI was using). Thanks to Jordan Hayes for the report and the
+  initial fix.
+- OMPI: fix division by zero with an empty file view.
+- Also handle shmat()/shmdt() memory patching with OS-bypass networks.
+- Add support for unwinding info to all files that are present in the
+  stack starting from MPI_Init, which is helpful with parallel
+  debuggers. Thanks to James Clark for the report and initial fix.
+- Fixed inadvertent use of bitwise operators in the MPI C++ bindings
+  header files. Thanks to Bert Wesarg for the report and the fix.
+- Added configure option --disable-wrappers-runpath (alongside the
+  already-existing --disable-wrappers-rpath option) to prevent Open
+  MPI's configure script from automatically adding runpath CLI options
+  to the wrapper compilers.
+
+3.0.4 -- April, 2019
+--------------------
+
+- Fix compile error when configured with --enable-mpi-java and
+  --with-devel-headers. Thanks to @g-raffy for reporting the issue.
+- Fix possible floating point rounding and division issues in OMPIO
+  which led to crashes and/or data corruption with very large data.
+  Thanks to Axel Huebl and René Widera for identifying the issue,
+  supplying and testing the fix (** also appeared: v3.0.4).
+- Use static_cast<> in mpi.h where appropriate. Thanks to @shadow-fx
+  for identifying the issue.
+- Fix datatype issue with RMA accumulate. Thanks to Jeff Hammond for
+  raising the issue.
+- Fix RMA accumulate of non-predefined datatypes with predefined
+  operators. Thanks to Jeff Hammond for raising the issue.
+- Fix race condition when closing open file descriptors when launching
+  MPI processes. Thanks to Jason Williams for identifying the issue and
+  supplying the fix.
+- Fix Valgrind warnings for some MPI_TYPE_CREATE_* functions. Thanks
+  to Risto Toijala for identifying the issue and supplying the fix.
+- Fix MPI_TYPE_CREATE_F90_{REAL,COMPLEX} for r=38 and r=308.
+- Fix assembly issues with old versions of gcc (<6.0.0) that affected
+  the stability of shared memory communications (e.g., with the vader
+  BTL).
+- Fix the OFI MTL handling of MPI_ANY_SOURCE.
+- Fix noisy errors in the openib BTL with regards to
+  ibv_exp_query_device(). Thanks to Angel Beltre and others who
+  reported the issue.
+
+3.0.3 -- October, 2018
+----------------------
+
+- Fix race condition in MPI_THREAD_MULTIPLE support of non-blocking
+  send/receive path.
+- Fix error handling SIGCHLD forwarding.
+- Add support for CHARACTER and LOGICAL Fortran datatypes for MPI_SIZEOF.
+- Fix compile error when using OpenJDK 11 to compile the Java bindings.
+- Fix crash when using a hostfile with a 'user@host' line.
+- Numerous Fortran '08 interface fixes.
+- TCP BTL error message fixes.
+- OFI MTL now will use any provider other than shm, sockets, tcp, udp, or
+  rstream, rather than only supporting gni, psm, and psm2.
+- Disable async receive of CUDA buffers by default, fixing a hang
+  on large transfers.
+- Support the BCM57XXX and BCM58XXX Broadcom adapters.
+- Fix minmax datatype support in ROMIO.
+- Bug fixes in vader shared memory transport.
+- Support very large buffers with MPI_TYPE_VECTOR.
+- Fix hang when launching with mpirun on Cray systems.
+- Bug fixes in OFI MTL.
+- Assorted Portals 4.0 bug fixes.
+- Fix for possible data corruption in MPI_BSEND.
+- Move shared memory file for vader btl into /dev/shm on Linux.
+- Fix for MPI_ISCATTER/MPI_ISCATTERV Fortran interfaces with MPI_IN_PLACE.
+- Upgrade PMIx to v2.1.4.
+- Fix for Power9 built-in atomics.
+- Numerous One-sided bug fixes.
+- Fix for race condition in uGNI BTL.
+- Improve handling of large number of interfaces with TCP BTL.
+- Numerous UCX bug fixes.
+- Add support for QLogic and Broadcom Cumulus RoCE HCAs to Open IB BTL.
+- Add patcher support for aarch64.
+- Fix hang on Power and ARM when Open MPI was built with low compiler
+  optimization settings.
+
 3.0.2 -- June, 2018
 -------------------