openmpi

Author	SHA1	Message	Date
Brian Barrett	bffcc3bca0	util: move graph solver from usnic to util Cisco wrote a bipartite graph solver to properly solve interface pair selection for usNIC. Using the reachable framework, the TCP BTL (and possibly the runtime network code) can use the graph solver to make more optimal pair selection. Jeff was happy to have the code more broadly used, but didn't have time to do the move, hence this commit. There are a couple of minor changes to the code compared to the usNIC version. Obviously, the functions have been renamed to match naming convention for their new home. Since it's easier to write unit tests for util/ code, the unit tests have been made first class tests run at "make check" time. This last bit required moving some of the definitions into a new header, bipartite_graph_internal.h, so that they could be included in both the library code and the test code. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2017-09-15 15:08:47 -07:00
Thananon Patinyasakdikul	68658e4bab	btl/usnic: assign the number of send credits correctly. usnic endpoints were always created with the default send credit value of 8. This commit assigns the correct number from the hardware instead. Signed-off-by: Thananon Patinyasakdikul <apatinya@cisco.com>	2017-08-08 17:01:16 -07:00
Jeff Squyres	6f5e377fe0	btl/usnic: update for libfabric v1.4 With libfabric v1.4, the usnic provider changed the values of its fabric and domain name strings (compared to libfabric <v1.4). Update the Open MPI usNIC BTL to handle both pre-v1.4 and v1.4 fabric/domain names. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-08-25 03:53:17 -07:00
Jeff Squyres	df5043bc3f	usnic: adjust for libfabric API return value change	2015-04-27 10:18:49 -07:00
Jeff Squyres	f7b4b23383	usnic: ensure to NULL-terminate the string/not overflow This was CID 1269921.	2015-02-12 13:41:30 -08:00
Jeff Squyres	8febd41a39	usnic: fix minor memory leak This was CID 1269859.	2015-02-12 13:41:30 -08:00
Jeff Squyres	bfa54d5d7b	usnic: update to match new libfabric	2015-02-03 13:46:06 -08:00
Jeff Squyres	984982790a	usnic: convert from verbs to libfabric (yay!) This commit represents the conversion of the usnic BTL from verbs to libfabric. For the moment, libfabric is embedded in Open MPI (currently in the usnic BTL). This is because the libfabric API is still changing, and also has not yet been released. Ultimately, this embedded copy of libfabric will likely disappear and the usnic BTL will rely on an external installation of libfabric. New configure options: * --with-libfabric: will cause configure to fail if libfabric support cannot be built * --without-libfabric: will prevent libfabric support from being built * --with-libfabric=DIR: use an external libfabric installation * --with-libfabric-libdir=LIBDIR: when paired with --with-libfabric=DIR, use LIBDIR for the libfabric installation library dir The --with-libnl3[-libdir] arguments are now gone.	2014-12-08 11:37:37 -08:00
Ralph Castain	780c93ee57	Per the PR and discussion on today's telecon, extend the process name definition as a two-field struct of uint32_t's down to the OPAL layer. This resolves issues created by prior commits that impacted both heterogeneous and SPARC support. This also simplifies the OMPI code base by removing the need for frequent memcpy's when transitioning between the OMPI/ORTE layers and OPAL. We recognize that this means other users of OPAL will need to "wrap" the opal_process_name_t if they desire to abstract it in some fashion. This is regrettable, and we are looking at possible alternatives that might mitigate that requirement. Meantime, however, we have to put the needs of the OMPI community first, and are taking this step to restore hetero and SPARC support.	2014-11-11 17:00:42 -08:00
Jeff Squyres	c22e1ae33b	configury: new OPAL_SET_LIB_PREFIX/ORTE_SET_LIB_PREFIX macros These two macros set the prefix for the OPAL and ORTE libraries, respectively. Specifically, the OPAL library will be named libPREFIXopen-pal.la and the ORTE library will be named libPREFIXopen-rte.la. These macros must be called, even if the prefix argument is empty. The intent is that Open MPI will call these macros with an empty prefix, but other projects (such as ORCM) will call these macros with a non-empty prefix. For example, ORCM libraries can be named liborcm-open-pal.la and liborcm-open-rte.la. This scheme is necessary to allow running Open MPI applications under systems that use their own versions of ORTE and OPAL. For example, when running MPI applications under ORTE, if the ORTE and OPAL libraries between OMPI and ORCM are not identical (which, because they are released at different times, are likely to be different), we need to ensure that the OMPI applications link against their ORTE and OPAL libraries, but the ORCM executables link against their ORTE and OPAL libraries.	2014-10-22 10:32:19 -07:00
Ralph Castain	fd6a044b7f	Cleanup some cruft resulting from the move of the btl's to opal. We had created the ability to delay modex operations, which included a need to delay retrieving hostname info for remote procs. This allowed us to not retrieve the modex info until first message unless required - the hostname is generally only required for debug and error messages. Properly setup the opal_process_info structure early in the initialization procedure. Define the local hostname right at the beginning of opal_init so all parts of opal can use it. Overlay that during orte_init as the user may choose to remove fqdn and strip prefixes during that time. Setup the job_session_dir and other such info immediately when it becomes available during orte_init.	2014-10-03 16:02:57 -06:00
Ralph Castain	aec5cd08bd	Per the PMIx RFC: WHAT: Merge the PMIx branch into the devel repo, creating a new OPAL “pmix” framework to abstract PMI support for all RTEs. Replace the ORTE daemon-level collectives with a new PMIx server and update the ORTE grpcomm framework to support server-to-server collectives WHY: We’ve had problems dealing with variations in PMI implementations, and need to extend the existing PMI definitions to meet exascale requirements. WHEN: Mon, Aug 25 WHERE: https://github.com/rhc54/ompi-svn-mirror.git Several community members have been working on a refactoring of the current PMI support within OMPI. Although the APIs are common, Slurm and Cray implement a different range of capabilities, and package them differently. For example, Cray provides an integrated PMI-1/2 library, while Slurm separates the two and requires the user to specify the one to be used at runtime. In addition, several bugs in the Slurm implementations have caused problems requiring extra coding. All this has led to a slew of #if’s in the PMI code and bugs when the corner-case logic for one implementation accidentally traps the other. Extending this support to other implementations would have increased this complexity to an unacceptable level. Accordingly, we have: * created a new OPAL “pmix” framework to abstract the PMI support, with separate components for Cray, Slurm PMI-1, and Slurm PMI-2 implementations. * Replaced the current ORTE grpcomm daemon-based collective operation with an integrated PMIx server, and updated the grpcomm APIs to provide more flexible, multi-algorithm support for collective operations. At this time, only the xcast and allgather operations are supported. * Replaced the current global collective id with a signature based on the names of the participating procs. This allows an unlimited number of collectives to be executed by any group of processes, subject to the requirement that only one collective can be active at a time for a unique combination of procs. Note that a proc can be involved in any number of simultaneous collectives - it is the specific combination of procs that is subject to the constraint * removed the prior OMPI/OPAL modex code * added new macros for executing modex send/recv to simplify use of the new APIs. The send macros allow the caller to specify whether or not the BTL supports async modex operations - if so, then the non-blocking “fence” operation is used, if the active PMIx component supports it. Otherwise, the default is a full blocking modex exchange as we currently perform. * retained the current flag that directs us to use a blocking fence operation, but only to retrieve data upon demand This commit was SVN r32570.	2014-08-21 18:56:47 +00:00
Jeff Squyres	34897cee9f	usnic: unify teardown between trunk and v1.8 branches Make the del_procs, module finalize, and endpoint destructors be the same between trunk and v1.8, with one exception: the very beginning of v1.8 module_finalize calls del_procs for each proc to simulate/pretend the trunk/v1.9 PML behavior of calling del_procs before module_finalize. This commit was SVN r32437.	2014-08-05 22:31:55 +00:00
Jeff Squyres	ff4717b727	usnic: cagent now checks that incoming pings are expected Previously, the connectivity agent was pretty dumb: it took whatever pings it got and ACKed them. Then we added an agent check to ensure that the ping actually came from the source interface that it said it came from. Now we add another check such that when a ping is received on interface X that corresponds to usnic module Y, we ensure that the source interface of the ping is on the all_endpoints list for module Y (i.e., module Y expects to be able to talk to that peer interface). This detects cases where peers have come to different conclusions about which interfaces should be used to communicate (which is bad!). This usually reflects a network misconfiguration. Fixes CSCuq05389. This commit was SVN r32383.	2014-07-31 22:30:20 +00:00
Jeff Squyres	f1fb4970a5	usnic: remove all trailing whitespace Style cleanup only; no code changes. This commit was SVN r32366.	2014-07-30 20:56:15 +00:00
George Bosilca	3d5578d6df	No vestiges of the RTE or OMPI left. This commit was SVN r32327.	2014-07-26 22:19:49 +00:00
George Bosilca	f217661ee0	Use opal_process_info whenever possible. Some other minor cleanups. This commit was SVN r32325.	2014-07-26 21:48:23 +00:00
Ralph Castain	552c9ca5a0	George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-) WHAT: Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have expressed interest in having a more generic communication infrastructure, without all the OMPI layer dependencies. This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP. Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose. UTK, with support from Sandia, developed a version of Open MPI where the entire communication infrastructure has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to complete this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs. A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic. This commit was SVN r32317.	2014-07-26 00:47:28 +00:00

18 Commits