Ralph Castain
3ad5a40ba8
Sync to PMIx master
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-03 10:56:30 -07:00
Ralph Castain
57c14cbfed
Sync to PMIx master to pick up a little bug fix
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-27 07:54:16 -07:00
Ralph Castain
d5db4ee965
Update to track PMIx master (v2.1.0)
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-25 10:24:13 -07:00
Ralph Castain
5fed7330e7
Update the configure logic to separate the emitting of a libpmix library from --with-devel-headers. Instead, we create a new --enable-install-libpmix option expressly for that purpose.
...
Continue to link the new library back to libopen-pal to resolve the renamed symbols.
Update opal configure logic to set disable_dlopen when disable_mca_dso is given. Fix typos in disable_dlopen when setting variables (incorrect inclusion of quotes)
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-22 16:02:57 -07:00
Ralph Castain
3493c43468
Sync to PMIx master
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-22 10:48:00 -07:00
Ralph Castain
fe9b584c05
Fully support OMPI spawn options. Fix a bug in the round-robin mappers where we weren't adding nodes to the job map node array, and so resources were not released
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 285d8cfef74ffc899e9c51e1d9c597b7fb2ceb89)
2017-09-21 10:29:27 -07:00
Ralph Castain
e575c4d6f9
Fix tool connection logic so we properly search for default session server, perform specified number of retries, etc.
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 7c755e01004f8b86c71f1729662979ea45ab1adb)
2017-09-19 13:35:46 -07:00
Ralph Castain
3b3ce243bb
Merge pull request #4214 from karasevb/pmix1_hang_fix
...
pmix: fixed immediate request for PMIx v1.2
2017-09-19 06:51:25 -07:00
Ralph Castain
5708872112
Implement support for "local" range when publishing data
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 2d54f7e0dd3a47260b0b2634aae3361316005933)
2017-09-18 19:34:08 -07:00
Boris Karasev
2929f52ffc
pmix1: fixed immediate request
...
This fixes a hang on immediate PMIx requests. PMIx v1.2 does not support
the info key `PMIX_IMMEDIATE`, which leads to the hang. For such requests,
the fix uses the key `PMIX_OPTIONAL` so that the request does not go to the server.
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2017-09-18 09:17:44 +03:00
Ralph Castain
3c914a7a97
Complete the fix of the ORTE DVM. We will now use "prun" instead of "orterun -hnp foo" to execute jobs. This provides automatic discovery of the orte-dvm, so you don't need to manually enter URIs or contact file locations. All IO is forwarded to prun.
...
Still in the "needs to be done" category:
* mapping/ranking/binding options aren't correctly supported
* if the DVM encounters some errors (e.g., not enough resources for the job), the resulting error is globally set and impacts any subsequent job submission
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-16 13:13:07 -07:00
Ralph Castain
7c7d8a69a0
Backport changes from PMIx reference server
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-14 11:48:56 -07:00
Ralph Castain
691237801b
Update to track PMIx master
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-13 10:21:44 -07:00
Ralph Castain
bbd83fd4c0
Add a new launcher "prun" for starting applications against the ORTE DVM.
...
Unlike "orterun", "prun" is a PMIx-only program that discovers the DVM connection instead of requiring that we explicitly provide it. Only build "prun" if PMIx v2.x is available.
This gets the DVM working again, but it still shows problems with multiple executions. I'll detail those in a separate issue. Thus, the DVM should still be considered "broken".
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-12 21:40:41 -07:00
Ralph Castain
88eac797fb
Silence coverity warnings
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-12 09:14:36 -07:00
Ralph Castain
3477079804
Repair the ORTE DVM
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-11 17:38:21 -07:00
Ralph Castain
cbc114e923
Update to track PMIx master
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-06 13:15:24 -07:00
Ralph Castain
2c723f4338
Roll to track PMIx master
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-01 12:30:34 -07:00
Gilles Gouaillardet
c9cca771cc
pmix/ext2x: automatically generate ext2x component from pmix2x sources
...
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-08-30 09:41:31 +09:00
Gilles Gouaillardet
fd08b923d5
pmix: do not invoke PMIX_INFO_CREATE() with a zero size
...
Thanks Lisandro Dalcin for the report
Fixes open-mpi/ompi#3854
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-08-28 11:25:58 +09:00
Josh Hursey
ad87aa2674
Merge pull request #4121 from jjhursey/explore/dlopen-local
...
mca: Dynamic components link against project lib
2017-08-25 13:15:51 -05:00
Joshua Hursey
e1d079544b
mca: Dynamic components link against project lib
...
* Resolves #3705
* Components should link against the project level library to better
support `dlopen` with `RTLD_LOCAL`.
* Extend the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am`
with the appropriate project level library:
```
MCA components in ompi/
$(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la
MCA components in orte/
$(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la
MCA components in opal/
$(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la
MCA components in oshmem/
$(top_builddir)/oshmem/liboshmem.la
```
Note: The changes in this commit were automated by the
`libadd_mca_comp_update.py` script in the commit that precedes it.
Some components were not included in this change because
they are statically built only.
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-08-24 11:56:16 -04:00
Ralph Castain
68029b27e4
Fix the orte-dvm operations so that orterun can connect and execute an application. There is a lingering problem, though. The first invocation of orterun succeeds every time. However, subsequent invocations have a high probability of hanging in the OOB connection handshake.
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-23 17:31:08 -07:00
Ralph Castain
0561d64748
Continue tracking PMIx v2.1.0
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-23 09:38:27 -07:00
Ralph Castain
d80b0c7990
If the HWLOC shared memory system is unable to connect, then fall back to providing the topology via XML. Do not automatically provide the XML to every process, as that defeats the purpose of the shared memory system. Instead, use PMIx_Query_info_nb to get the info from the server when required.
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-22 18:12:26 -07:00
Ralph Castain
e3213386ec
Fix the internal PMIx installation - matching changes have been upstreamed
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-22 13:49:07 -07:00
Ralph Castain
a1b15c5666
Roll in update to PMIx master. Transfer updates from pmix2x component to ext2x
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-22 13:06:47 -07:00
Ralph Castain
d515f48885
The local PMIx server is notifying its clients of all events, but for some reason I don't recall, the broadcast notification was marked for delivery only to non-default event handlers. This creates a discrepancy between the two behaviors, so don't restrict the broadcast notifications.
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-18 17:26:11 -07:00
Ralph Castain
088b6cdeee
Silence coverity warnings
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-17 09:49:35 -07:00
Ralph Castain
c4d5dbfcdc
Change test per recommendation of @jsquyres
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-16 11:19:15 -07:00
Ralph Castain
eb69df02ae
Update to PMIx v2.1.0rc1
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 19:59:15 -07:00
Ralph Castain
65fb6070d9
Update tool support by adding MCA params to direct orteds to drop session and/or system-level tool rendezvous files.
...
Ensure PMIx is enabled for tools.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 17:49:47 -07:00
Ralph Castain
033a0eb373
Fix the --disable-dlopen --with-devel-headers case by not having libpmix link back to libopen-pal as the latter won't exist in time during this build case
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 10:51:35 -07:00
Ralph Castain
4290247d64
Update to latest PMIx v2.1.0a
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-10 18:48:07 -07:00
Ralph Castain
53c9270af7
Silence coverity warnings
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-08 06:10:14 -07:00
Ralph Castain
9921237f99
Merge pull request #4012 from rhc54/topic/p3
...
Cover the use-cases for OPAL_PREFIX and PMIX_INSTALL_PREFIX options
2017-08-07 11:42:53 -07:00
Ralph Castain
d593e5a4ce
When we specify --with-devel-headers, we also emit a copy of libpmix. However, that library was built against the OPAL libevent component, which means all the libevent functions are prefixed with OPAL names. So ensure that the emitted libpmix is linked back against libopen-pal so those symbols will be resolved.
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-07 09:36:16 -07:00
Ralph Castain
a239b4c3c3
Per discussion on the PMIx side, do a better job of detecting mismatches between location directives for OPAL and PMIx. Provide a more helpful error message and error out if we find a mismatch. If any OPAL values are set and the PMIx equivalent is not, then transfer it.
...
Do not clear PMIX_INSTALL_PREFIX from the daemon's launch environment
Fixes #3980
Closes #4007
Refs #3985
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-04 19:36:00 -07:00
Ralph Castain
f128b4c546
Fix incorrect usage of '==' in test comparisons
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-03 21:21:26 -07:00
Artem Polyakov
500c8be888
pmix: fix PMIx envar name for the installation prefix.
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-08-02 08:03:36 +03:00
Ralph Castain
f39ce67982
Merge pull request #3951 from rhc54/topic/hwloc2
...
Update to hwloc 2.0.0a
2017-08-01 15:18:31 -06:00
Ralph Castain
8f34fa4a56
Move the detection of OPAL_PREFIX and subsequent posting of PMIX_PREFIX to the internal integration code for PMIx so we only do this when running with the embedded PMIx
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-01 08:24:27 -06:00
Boris Karasev
e20b581529
pmix: fixed immediate request
...
This commit fixes a hang when using the external PMIx v1 module
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2017-07-28 15:53:48 +06:00
Ralph Castain
7a83fdb9bb
Update to hwloc 2.0.0a with shmem support.
...
Update to support passing of HWLOC shmem topology to client procs
Update use of distance API per @bgoglin
Have the openib component look up its object in the distance matrix
Bring usnic up-to-date
Restore binding for hwloc2
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-25 20:26:22 -07:00
Ralph Castain
0042c758f1
Update the tools support so it allows tools to access PMIx
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-25 17:10:08 -07:00
Ralph Castain
058e802b11
Add missing export directives
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-25 07:19:08 -07:00
George Bosilca
1ea8fab095
Make external symbols visible.
...
All symbols that need to be accessed from an MCA component must be marked
explicitly as visible using PMIX_EXPORT. This patch allows current trunk
to almost work on OS X. More on the devel mailing list.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-07-25 01:14:22 -04:00
Ralph Castain
af85e48dd7
Silence Coverity warning, silence pmix_error_log of success
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-21 15:33:16 -07:00
Ralph Castain
492f98f8a5
Update to latest PMIx v2.1.0a
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-21 12:58:09 -07:00
Ralph Castain
f7e8780a42
Remove fortran support from platform file
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-20 21:02:30 -07:00