1
1
Ralph Castain 335f8c5100 Update to PMIx 3.1.2
Update the OPAL glue configure code to correctly link the opal/pmix3
component to the hwloc used by OMPI instead of defaulting to the
system-level hwloc. Required a corresponding update to the PMIx hwloc
configure code so we treat hwloc the same way we handle libevent in
embedded scenarios. Roll to PMIx v3.1.2 for plugging of memory leaks and
addition of faster PMIx_Get response

Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-01-25 15:58:53 +09:00

481 строка
19 KiB
Plaintext

Copyright (c) 2015-2019 Intel, Inc. All rights reserved.
Copyright (c) 2017 IBM Corporation. All rights reserved.
$COPYRIGHT$
Additional copyrights may follow
$HEADER$
===========================================================================
This file contains the main features as well as overviews of specific
bug fixes (and other actions) for each version of PMIx since
version 1.0.
As more fully described in the "Software Version Number" section in
the README file, PMIx typically maintains two separate version
series simultaneously - the current release and one that is locked
to only bug fixes. Since these series are semi-independent of each
other, a single NEWS-worthy item might apply to different series. For
example, a bug might be fixed in the master, and then moved to
multiple release branches.
3.1.2 -- 24 Jan 2019
----------------------
- Fix a bug in macro identifying system events
- Restore some non-standard macros to the pmix_extend.h
header - these are considered "deprecated" and will be
removed from public-facing headers in future releases
3.1.1 -- 18 Jan 2019
----------------------
- Fix a bug in registration of default event handlers
that somehow slipped thru testing
3.1.0 -- 17 Jan 2019
----------------------
**** THIS RELEASE MARKS THE STARTING POINT FOR FULL COMPLIANCE
**** WITH THE PMIX v3 STANDARD. ALL API BEHAVIORS AND ATTRIBUTE
**** DEFINITIONS MEET THE v3 STANDARD SPECIFICATIONS.
- Add a new, faster dstore GDS component 'ds21'
- Performance optimizations for the dstore GDS components.
- Plug miscellaneous memory leaks
- Silence an unnecessary warning message when checking connection
to a non-supporting server
- Ensure lost-connection events get delivered to default event
handlers
- Correctly handle cache refresh for queries
- Protect against race conditions between host and internal library
when dealing with async requests
- Cleanup tool operations and add support for connections to
remote servers. Initial support for debugger direct/indirect
launch verified with PRRTE. Cleanup setting of tmpdir options.
Drop rendezvous files when acting as a launcher
- Automatically store the server URI for easy access by client
- Provide MCA parameter to control TCP connect retry/timeout
- Update event notification system to properly evict oldest events
when more space is needed
- Fix a number of error paths
- Update IOF cache code to properly drop oldest message. Provide
MCA parameter for setting cache size.
- Handle setsockopt(SO_RCVTIMEO) not being supported
- Ensure that epilogs get run even when connections unexpectedly
terminate. Properly split epilog strings to process multiple
paths
- Pass the tool's command line to the server so it can be returned
in queries
- Add support for C11 atomics
- Support collection and forwarding of fabric-specific envars
- Improve handling of hwloc configure option
- Fix PMIx_server_generate_regex to preserve node ordering
- Fix a bug when registering default event handlers
3.0.2 -- 18 Sept 2018
----------------------
- Ensure we cleanup any active sensors when a peer departs. Allow the
heartbeat monitor to "reset" if a process stops beating and subsequently
returns
- Fix a few bugs in the event notification system and provide some
missing implementation (support for specifying target procs to
receive the event).
- Add PMIX_PROC_TERMINATED constant
- Properly deal with EOPNOTSUPP from getsockopt() on ARM
3.0.1 -- 23 Aug 2018
----------------------
**** DEPRECATION WARNING: The pmix_info_array_t struct was
**** initially marked for deprecation in the v2.x series.
**** We failed to provide clear warning at that time. This
**** therefore serves as warning of intended removal of
**** pmix_info_array_t in the future v4 release series.
- Fixed memory corruption bug in event notification
system due to uninitialized variable
- Add numeric version field to pmix_version.h
- Transfer all cached data to client dstore upon first connect
- Implement missing job control and sensor APIs
3.0.0 -- 6 July 2018
------------------------------------
**** NOTE: This release implements the complete PMIX v3.0 Standard
**** and therefore includes a number of new APIs and features. These
**** can be tracked by their RFC's on the community website:
**** https://pmix.org/pmix-standard.
- Added blocking forms of several existing APIs:
- PMIx_Log
- PMIx_Allocation_request
- PMIx_Job_control
- PMIx_Process_monitor
- Added support for getting/validating security credentials
- PMIx_Get_credential, PMIx_Validate_credential
- Extended support for debuggers/tools
- Added IO forwarding support allowing tools to request
forwarding of output from specific application procs,
and to forward their input to specified target procs
- Extended tool attributes to support synchronization
during startup of applications. This includes the
ability to modify an application's environment
(including support for LD_PRELOAD) and define an
alternate fork/exec agent
- Added ability for a tool to switch server connections
so it can first connect to a system-level server to
launch a starter program, and then reconnect to that
starter for debugging purposes
- Extended network support to collect network inventory by
either rolling it up from individual nodes or by direct
query of fabric managers. Added an API by which the
host can inject any rolled up inventory into the local
PMIx server. Applications and/or the host RM can access
the inventory via the PMIx_Query function.
- Added the ability for applications and/or tools to register
files and directories for cleanup upon their termination
- Added support for inter-library coordination within a process
- Extended PMIx_Log support by adding plugin support for new
channels, including local/remote syslog and email. Added
attributes to query available channels and to tag and
format output.
- Fix several memory and file descriptor leaks
2.2.2 -- 24 Jan 2019
----------------------
- Fix a bug in macro identifying system events
2.2.1 -- 18 Jan 2019
----------------------
- Fix a bug in registration of default event handlers
that somehow slipped thru testing
2.2.0 -- 17 Jan 2019
----------------------
**** THIS RELEASE MARKS THE STARTING POINT FOR FULL COMPLIANCE
**** WITH THE PMIX v2.2 STANDARD. ALL API BEHAVIORS AND ATTRIBUTE
**** DEFINITIONS MEET THE v2.2 STANDARD SPECIFICATIONS.
- Add a new, faster dstore GDS component 'ds21'
- Performance optimizations for the dstore GDS components.
- Plug miscellaneous memory leaks
- Silence an unnecessary warning message when checking connection
to a non-supporting server
- Ensure lost-connection events get delivered to default event
handlers
- Correctly handle cache refresh for queries
- Protect against race conditions between host and internal library
when dealing with async requests
- Cleanup tool operations and add support for connections to
remote servers.
- Automatically store the server URI for easy access by client
- Provide MCA parameter to control TCP connect retry/timeout
- Update event notification system to properly evict oldest events
when more space is needed
- Fix a number of error paths
- Handle setsockopt(SO_RCVTIMEO) not being supported
- Pass the tool's command line to the server so it can be returned
in queries
- Add support for C11 atomics
- Fix a bug when registering default event handlers
2.1.4 -- 18 Sep 2018
----------------------
- Updated configury to silence warnings on older compilers
- Implement job control and sensor APIs
- Update sensor support
- Fix a few bugs in the event notification system and provide some
missing implementation (support for specifying target procs to
receive the event).
- Add PMIX_PROC_TERMINATED constant
- Properly deal with EOPNOTSUPP from getsockopt() on ARM
2.1.3 -- 23 Aug 2018
----------------------
- Fixed memory corruption bug in event notification
system due to uninitialized variable
- Add numeric version definition
- Transfer all cached data to client dstore upon first connect
2.1.2 -- 6 July 2018
----------------------
- Added PMIX_VERSION_RELEASE string to pmix_version.h
- Added PMIX_SPAWNED and PMIX_PARENT_ID keys to all procs
started via PMIx_Spawn
- Fixed faulty compares in PMI/PMI2 tests
- Fixed bug in direct modex for data on remote node
- Correctly transfer all cached job info to the client's
shared memory region upon first connection
- Fix potential deadlock in PMIx_server_init in an error case
- Fix uninitialized variable
- Fix several memory and file descriptor leaks
2.1.1 -- 23 Feb 2018
----------------------
- Fix direct modex when receiving new nspace
- Resolve direct modex of job-level info
- Fix a bug in attribute configuration checks
- Fix a couple of bugs in unpacking of direct modex job-level data
- Correcly handle application setup data during "instant on" launch
- add a PMIX_BYTE_OBJECT_LOAD convenience macro
- Fix two early "free" bugs
- Add an example PMI-1 client program
2.1.0 -- 1 Feb 2018
----------------------
**** NOTE: This release contains the first implementation of cross-version
**** support. Servers using v2.1.0 are capable of supporting clients using
**** PMIx versions v1.2 and above. Clients using v2.1.0 are able to interact
**** with servers based on v1.2 and above.
- Added cross-version communication support
- Enable reporting of contact URI to stdout, stderr, or file (PR #538)
- Enable support for remote tool connections (PR #540, #542)
- Cleanup libevent configure logi to support default install paths (PR #541)
- Debounce "unreachable" notifications for tools when they disconnect (PR #544)
- Enable the regex generator to support node names that include multiple
sets of numbers
2.0.3 -- 1 Feb 2018
----------------------
- Fix event notification so all sides of multi-library get notified
of other library's existence
- Update syslog protection to support Mac High Sierra OS
- Remove usock component - unable to support v1.x clients due
to datatype differences
- Cleanup security handshake
- Cleanup separation of PMI-1/2 libraries and PMIx symbols
- Protect against overly-large messages
- Update data buffer APIs to support cross-version operations
- Protect receive callbacks from NULL and/or empty buffers as this
can occur when the peer on a connection disappears.
- Fix tool connection search so it properly descends into the directory
tree while searching for the server's contact file.
- Fix store_local so it doesn't reject a new nspace as that can happen
when working with tools
- Ensure we always complete PMIx_Finalize - don't return if something
goes wrong in the middle of the procedure
- Fix several tool connection issues
2.0.2 -- 19 Oct 2017
----------------------
- Update RPM spec file (rpmbuild -ta, and --rebuild fixes) (PR #523)
- Support singletons in PMI-1/PMI-2 (PR #537)
- Provide missing implementation support for arrays of pmix_value_t's (PR #531)
- Remove unsupported assembly code for MIPS and ARM processors
prior to v6 (PR #547)
- Fix path separator for PMIx configuration files (PR #547)
- Add configure option to enable/disable the default value for the
show-load-errors MCA param (PR #547)
2.0.1 -- 24 Aug. 2017
----------------------
- Protect PMIX_INFO_FREE macro from NULL data arrays
- Added attributes to support HWLOC shared memory regions
- Fixed several syntax errors in configure code
- Fixed several visibility errors
- Correctly return status from PMIx_Fence operation
- Restore tool connection support and implement search
operations to discover rendezvous files
2.0.0 -- 22 Jun 2017
----------------------
**** NOTE: This release implements the complete PMIX v2.0 Standard
**** and therefore includes a number of new APIs and features. These
**** can be tracked by their RFC's in the RFC repository at:
**** https://github.com/pmix/RFCs. A formal standards document will
**** be included in a later v2.x release. Some of the changes are
**** identified below.
- Added the Modular Component Architecture (MCA) plugin manager and
converted a number of operations to plugins, thereby allowing easy
customization and extension (including proprietary offerings)
- Added support for TCP sockets instead of Unix domain sockets for
client-server communications
- Added support for on-the-fly Allocation requests, including requests
for additional resources, extension of time for currently allocated
resources, and return of identified allocated resources to the scheduler
(RFC 0005 - https://github.com/pmix/RFCs/blob/master/RFC0005.md)
- Tightened rules on the processing of PMIx_Get requests, including
reservation of the "pmix" prefix for attribute keys and specifying
behaviors associated with the PMIX_RANK_WILDCARD value
(RFC 0009 - https://github.com/pmix/RFCs/blob/master/RFC0009.md)
- Extended support for tool interactions with a PMIx server aimed at
meeting the needs of debuggers and other tools. Includes support
for rendezvousing with a system-level PMIx server for interacting
with the system management stack (SMS) outside of an allocated
session, and adds two new APIs:
- PMIx_Query: request general information such as the process
table for a specified job, and available SMS capabilities
- PMIx_Log: log messages (e.g., application progress) to a
system-hosted persistent store
(RFC 0010 - https://github.com/pmix/RFCs/blob/master/RFC0010.md)
- Added support for fabric/network interactions associated with
"instant on" application startup
(RFC 0012 - https://github.com/pmix/RFCs/blob/master/RFC0012.md)
- Added an attribute to support getting the time remaining in an
allocation via the PMIx_Query interface
(RFC 0013 - https://github.com/pmix/RFCs/blob/master/RFC0013.md)
- Added interfaces to support job control and monitoring requests,
including heartbeat and file monitors to detect stalled applications.
Job control interface supports standard signal-related operations
(pause, kill, resume, etc.) as well as checkpoint/restart requests.
The interface can also be used by an application to indicate it is
willing to be pre-empted, with the host RM providing an event
notification when the preemption is desired.
(RFC 0015 - https://github.com/pmix/RFCs/blob/master/RFC0015.md)
- Extended the event notification system to support notifications
across threads in the same process, and the ability to direct
ordering of notifications when registering event handlers.
(RFC 0018 - https://github.com/pmix/RFCs/blob/master/RFC0018.md)
- Expose the buffer manipulation functions via a new set of APIs
to support heterogeneous data transfers within the host RM
environment
(RFC 0020 - https://github.com/pmix/RFCs/blob/master/RFC0020.md)
- Fix a number of race condition issues that arose at scale
- Enable PMIx servers to generate notifications to the host RM
and to themselves
1.2.5 -- TBD
----------------------
- Fix cross-version issue when v1.2 client interacts with v2.1 server (PR #564)
- Update client connection for cross-version support (PR #591)
- Fix write memory barrier ASM for PowerPC (PR #606)
- Add protection from overly-large messages
1.2.4 -- 13 Oct. 2017
----------------------
- Silence some unnecessary warning messages (PR #487)
- Coverity fix - TOCTOU (PR #465)
- automake 1.13 configure fix (PR #486)
- Update RPM spec file (rpmbuild -ta, and --rebuild fixes) (PR #523)
- Support singletons in PMI-1/PMI-2 (PR #537)
1.2.3 -- 24 Aug. 2017
----------------------
- Resolve visibility issues for public APIs (PR #451)
- Atomics update - remove custom ASM atomics (PR #458)
- Fix job-fence test (PR #423)
- Replace stale PMIX_DECLSPEC with PMIX_EXPORT (PR #448)
- Memory barrier fixes for thread shifting (PR #387)
- Fix race condition in dmodex (PR #346)
- Allow disable backward compatability for PMI-1/2 (PR #350)
- Fix segv in PMIx_server_deregister_nspace (PR #343)
- Fix possible hang in PMIx_Abort (PR #339)
1.2.2 -- 21 March 2017
----------------------
- Compiler fix for Sun/Oracle CC (PR #322)
- Fix missing include (PR #326)
- Improve error checking around posix_fallocate (PR #329)
- Fix possible memory corruption (PR #331)
1.2.1 -- 21 Feb. 2017
----------------------
- dstore: Fix data corruption bug in key overwrite cases
- dstore: Performance and scalability fixes
- sm: Use posix_fallocate() before mmap
- pmi1/pmi2: Restore support
- dstore: Fix extension slot size allocation (Issue #280)
1.2.0 -- 14 Dec. 2016
----------------------
- Add shared memory data storage (dstore) option. Default: enabled
Configure option: --disable-dstore
- PMIx_Commit performance improvements
- Disable errhandler support
- Keep job info in the shared memory dstore
- PMIx_Get performance and memory improvements
1.1.5
-----
- Add pmix_version.h to support direct detection of PMIx library version
- Fix support for Solaris 10 by using abstract version of strnlen
- Fix native security module for Solaris by using getpeerucred in
that environment
- Ensure man pages don't get installed in embedded builds
- Pass temporary directory locations in info keys instead of
the environment
1.1.4
-----
- Properly increment the reference count for PMIx_Init
- Fix examples so all run properly
- Fix/complete PMI2 backward compatibility support to handle
keys that are not associated with a specific rank
- Do a better job of hiding non-API symbols
- Correct handling of semi-colon terminations on macros.
Thanks to Ashley Pittman for the patch
- Add more man pages
- Improve error checking and messages for connection
attempts from client to server
- If the tmpdir name is too long, provide an appropriate
help message to the user (particularly relevant on
Mac OSX). Thanks to Rainer Keller for the patch.
- Fix some C++ compatibility issues
- Fix/complete PMI-1 backward compatibility support
- Do not install internal headers unless specifically
requested to do so
- Add support for multiple calls to Put/Commit
1.1.3
-----
- Update the symbol hiding file to cover all symbols
- Fix examples and test directory Makefile.am's so
the Makefiles are automatically built and the
code compiled, but not installed
- Do not install the pmix library in embedded use-cases
1.1.2
-----
- Provide a check for hwloc support - if not found, then
don't pass any topology info down to the client as it
won't know how to unpack it anyway.
- Fix a few places where thread safety wasn't provided
- Fix several issues identified by Paul Hargrove:
* PMIx_Init(NULL) is supported
* Incomplete PMIx_constants man page had some lingering cruft
* Missing prototype for pmix_value_load
- Fix race condition in PMIx_Get/PMIx_Get_nb
- Fix double-free error in pmix_server_commit.
- Fix PMIX_LOAD_BUFFER to be safe.
1.1.1
-----
- Fix an issue where the example and test programs
were incorrectly being installed. Thanks to Orion
Poplawski for reporting it
1.1.0
-----
- major update of APIs to reflect comments received from 1.0.0
non-production release
- fixed thread-safety issues
- fixed a range of pack/unpack issues
- added unit tests for all APIs
1.0.0
------
Initial public release of draft APIs for comment - not production
intended