Ralph Castain
f94f3cda21
Fix the handling of cpusets so we get the correct cpuset for each local peer. Add the ability to indicate that a modex request is "optional" so we don't call the server if we don't find the value. Take advantage of that to allow the MPI layer to decide that the lack of locality info indicates non-local
2015-09-10 10:25:30 -07:00
Jeff Squyres
f7d90abf42
usnic: update for new add_procs() downcall behavior
2015-09-10 08:55:55 -06:00
Jeff Squyres
2f2d5ff855
btl.h: update comment for new add_procs behavior
2015-09-10 08:55:55 -06:00
Nathan Hjelm
2041aac4e4
btl/openib: add support for dynamic add_procs
...
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-10 08:55:55 -06:00
Nathan Hjelm
40067f7ec4
btl/tcp: add support for dynamic add_procs
...
This commit makes two changes to the tcp btl:
- If a tcp proc does not exist when handling a new connection, create
a new proc and use it. The current implementation uses the
opal_proc_by_name() function to get the opal_proc_t then calls
add_procs on all btl modules. It may be sufficient to just call
add_procs until an endpoint is created so this may change somewhat.
- In add_procs add a check for an existing endpoint before creating
one.
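The second change is a find-before-create pattern. A minimal sketch follows, assuming a toy fixed-size endpoint table. Every name here (`toy_module`, `toy_endpoint`, `toy_get_or_create_endpoint`) is hypothetical and is not the tcp btl's actual data structure or API.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ENDPOINTS 16

/* Hypothetical stand-ins for a btl module and its per-peer endpoints. */
struct toy_endpoint {
    int peer_id;
    int in_use;
};

struct toy_module {
    struct toy_endpoint endpoints[TOY_MAX_ENDPOINTS];
};

/* Return the endpoint already associated with peer_id, or NULL. */
static struct toy_endpoint *toy_find_endpoint(struct toy_module *m, int peer_id)
{
    for (size_t i = 0; i < TOY_MAX_ENDPOINTS; ++i) {
        if (m->endpoints[i].in_use && m->endpoints[i].peer_id == peer_id) {
            return &m->endpoints[i];
        }
    }
    return NULL;
}

/* Check for an existing endpoint first; create one only if none exists.
 * Returns NULL when the toy table is full. */
static struct toy_endpoint *toy_get_or_create_endpoint(struct toy_module *m,
                                                       int peer_id)
{
    assert(m != NULL);

    struct toy_endpoint *ep = toy_find_endpoint(m, peer_id);
    if (ep != NULL) {
        return ep; /* already created, e.g. by an incoming connection */
    }
    for (size_t i = 0; i < TOY_MAX_ENDPOINTS; ++i) {
        if (!m->endpoints[i].in_use) {
            m->endpoints[i].in_use = 1;
            m->endpoints[i].peer_id = peer_id;
            return &m->endpoints[i];
        }
    }
    return NULL;
}
```

Calling the helper twice for the same peer returns the same endpoint, so a peer that connected first does not get a duplicate when add_procs runs later.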
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-10 08:55:55 -06:00
Nathan Hjelm
536aba1172
btl/portals4: add send flag to btl_flags
2015-09-10 08:55:55 -06:00
Nathan Hjelm
408da16d50
ompi/proc: add proc hash table for ompi_proc_t objects
...
This commit adds an opal hash table to keep track of the mapping between
process identifiers and ompi_proc_t's. This hash table is used by the
ompi_proc_by_name() function to look up (in O(1) time) a given
process. This can be used by a BTL or other component to get an
ompi_proc_t when handling an incoming message from an as yet unknown
peer.
Additionally, this commit adds a new MCA variable to control the new
add_procs behavior: mpi_add_procs_cutoff. If the number of ranks in
the job falls below the threshold, an ompi_proc_t is created for
every process. If the number of ranks is above the threshold, then an
ompi_proc_t is only created for the local rank. The code needed to
generate additional ompi_proc_t's for a communicator is not yet
complete.
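The cutoff rule can be sketched as below. This is only an illustration: `toy_procs_to_create` is a hypothetical helper, not Open MPI code. The message does not say what happens when the rank count equals the threshold, so the sketch assumes that case behaves like "above".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how many proc structures to create at startup.
 * job_size is the number of ranks in the job; cutoff plays the role of
 * the mpi_add_procs_cutoff value. */
static size_t toy_procs_to_create(size_t job_size, size_t cutoff)
{
    assert(job_size > 0);

    if (job_size < cutoff) {
        return job_size; /* small job: one structure per rank */
    }
    return 1;            /* large job: only the local rank (assumed for ==) */
}
```

With a cutoff of 1024, an 8-rank job would create 8 structures up front and a 4096-rank job would create 1, leaving the rest to be looked up on demand.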
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-10 08:55:54 -06:00
Nathan Hjelm
6f8f2325ed
btl: btls are now required to set the send flag if supported
...
This commit updates each non-compliant btl to set the
MCA_BTL_FLAGS_SEND flag in the btl_flags field if send is
supported. This fixes a problem identified after the latest bml/r2
update, which explicitly checks for the send flag.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-10 08:55:54 -06:00
Ralph Castain
4c47c498ac
Sync to latest PMIx master
...
Allow the blocking send and recv to keep trying
2015-09-09 11:48:47 -07:00
Matias Cabral
f360eebfeb
Merge pull request #855 from matcabral/btl_openib_mtu
...
Fix for openib btl mca command line parameter btl_openib_mtu being ignored
2015-09-09 11:22:00 -07:00
Gilles Gouaillardet
7f0ed74d24
pmix1xx: fix CPPFLAGS when DSO are not built
2015-09-09 14:20:12 +09:00
rhc54
f6b6b9a9ca
Merge pull request #877 from rhc54/topic/s1s2
...
Cleanup s1 and s2 components
2015-09-08 19:20:59 -07:00
Ralph Castain
1cdb86b8c7
Cleanup s1 and s2 components, and ensure that mpirun and orteds only use non-direct-launch pmix components.
2015-09-08 18:37:09 -07:00
Gilles Gouaillardet
6e6a3e965c
pml: do not cast away the const modifier when this is not necessary
...
update the pml framework and mpi c bindings
2015-09-09 09:18:57 +09:00
rhc54
3a446c9797
Merge pull request #876 from rhc54/topic/hnp
...
Fix segfault upon job error
2015-09-08 15:10:51 -07:00
rhc54
47f437608d
Merge pull request #875 from rhc54/topic/dynamics
...
Stop a segfault in the test by correctly passing all the argv during spawn
2015-09-08 14:35:42 -07:00
Ralph Castain
459f169e06
Fix segfault upon job error
...
Silence some unnecessary error-logs
2015-09-08 14:03:06 -07:00
Ralph Castain
ae7156cabb
Stop a segfault in the test by correctly passing all the argv during spawn
2015-09-08 13:42:46 -07:00
Rolf vandeVaart
188c30a01a
Merge pull request #867 from rolfv/pr/openib-hwloc-verbosity
...
Add some verbosity to help debug hwloc issues
2015-09-08 14:43:35 -04:00
rhc54
8053357fcc
Merge pull request #873 from rhc54/topic/static
...
Add the libs required for PMIx to support static builds (and trim all excess whitespace)
2015-09-08 11:28:47 -07:00
Rolf vandeVaart
2e64a69fa9
Add some verbosity to help debug hwloc issues
2015-09-08 10:50:22 -07:00
Ralph Castain
291afe502f
Add the libs required for PMIx to support static builds
...
Remove unneeded CPPFLAGS
2015-09-08 10:21:06 -07:00
Jeff Squyres
bc9e5652ff
whitespace: purge whitespace at end of lines
...
Generated by running "./contrib/whitespace-purge.sh".
2015-09-08 09:47:17 -07:00
Ralph Castain
e6add86e4f
Deal with connect/accept between two jobs from different mpirun's. Somewhat optimize connect/accept by using MPI bcast to distribute the participants instead of another PMIx lookup. Cleanup some Coverity issues.
2015-09-07 09:19:24 -07:00
Ralph Castain
37c3ed68e7
Cleanup connect/disconnect and bring comm_spawn back online!
2015-09-06 10:27:39 -07:00
Jeff Squyres
f782a7640e
usnic: minor re-order of Makefile.am sources
...
Put the hwloc.c file alphabetically in the list.
2015-09-05 05:02:00 -07:00
rhc54
665b30376a
Merge pull request #868 from rhc54/topic/hwloc
...
Remove OPAL_HAVE_HWLOC qualifier and error out if --without-hwloc is given
2015-09-04 17:58:07 -07:00
Ralph Castain
2ecbbc84e7
Hide a symbol that is only used in one file and is not properly prefixed
2015-09-04 17:08:24 -07:00
Ralph Castain
d97bc29102
Remove OPAL_HAVE_HWLOC qualifier and error out if --without-hwloc is given
2015-09-04 16:54:40 -07:00
rhc54
d45ccda813
Merge pull request #866 from rhc54/topic/updatepmix
...
Update PMIx support
2015-09-04 11:09:36 -07:00
Ralph Castain
f6948c2bb4
Sync with PMIx master 43e45c3. Get multi-node publish/lookup/unpublish working
2015-09-04 10:07:17 -07:00
Rolf vandeVaart
ebfd00b66e
While debugging user problems, these extra verbosity statements would be helpful
2015-09-03 17:15:39 -04:00
Howard Pritchard
0557beee22
Merge pull request #864 from hppritcha/topic/pmix_cray_more_funcs
...
pmix/cray: more stubs plus a get_version method
2015-09-03 14:52:46 -06:00
Howard Pritchard
6e7345c790
pmix/cray: more stubs plus a get_version method
...
Add more stubs to reduce likelihood of future
mysterious segfaults if some of the newer pmix
funcs start to get used within ompi.
Add a get_version to return the version of the
Cray PMI library being used, since the Cray PMI
library actually has a function to get that info.
Be more accurate about which functions have a hope
of being implemented using Cray PMI and those which
never will.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-09-03 12:51:50 -07:00
Ralph Castain
a772b46c15
Bring the MPI_Publish and friends online
2015-09-02 12:04:07 -07:00
matcabral
1f9218a0bc
Fix for openib btl mca command line parameter btl_openib_mtu being
...
ignored.
2015-09-02 02:22:30 -07:00
Ralph Castain
95dbd70f44
Sync to PMIx 1.1, sha- 51479b0
2015-09-01 14:09:25 -07:00
Rolf vandeVaart
30b1a6e003
Merge pull request #836 from rolfv/pr/fix-cuda-war
...
Add config code to check for need of workaround. Add runtime way to turn it off just in case.
2015-09-01 15:05:29 -04:00
Nathan Hjelm
f926796e57
Merge pull request #828 from hjelmn/openib_thread_fix
...
openib thread fixes
2015-09-01 09:12:50 -06:00
rhc54
d8cb3fe705
Merge pull request #852 from rhc54/topic/pmix
...
Sync to PMIx tarball - includes:
2015-09-01 06:54:34 -07:00
Gilles Gouaillardet
6dfa996760
configury: fix a typo in opal/mca/pmix/pmix1xx/configure.m4
2015-09-01 14:59:07 +09:00
Ralph Castain
c1bbd7bc78
Sync to PMIx tarball - includes:
...
* update to configury to silence ident messages (thanks Gilles!)
* fix for warnings Jeff saw when get didn't find the requested data
* fix for Mac OSX operations
2015-08-31 21:51:02 -07:00
rhc54
2d3c6af8ad
Merge pull request #851 from rhc54/topic/copyfix
...
Only copy the value across if the "get" operation succeeded
2015-08-31 19:51:13 -07:00
Ralph Castain
ef69958e01
Only copy the value across if the "get" operation succeeded
2015-08-31 17:11:26 -07:00
Jeff Squyres
8558458bb9
usnic: adjust for new PMIX argument type
2015-08-31 14:55:58 -07:00
Rolf vandeVaart
54ab0d1a51
Add config code to check for need of workaround. Add runtime way to turn it off just in case
2015-08-31 17:18:47 -04:00
Nathan Hjelm
3c34f6f25c
Merge pull request #517 from hjelmn/class_fix
...
opal/class: enable use of opal classes after opal_class_finalize
2015-08-31 12:13:58 -07:00
Nathan Hjelm
faf06edb5b
Merge pull request #824 from hjelmn/opal_mutex_mod
...
opal/mutex: remove unnecessary ()s from OPAL_SCOPED_LOCK macro
2015-08-31 12:08:25 -07:00
rhc54
6e78e2c89b
Merge pull request #846 from rhc54/topic/pmix
...
Sync to PMIx tarball
2015-08-31 08:53:07 -07:00
Nathan Hjelm
2aab6ad90f
Merge pull request #827 from hjelmn/recursive_locks
...
Add support for recursive locks (revisited)
2015-08-31 07:52:23 -07:00