1
1

616 Коммитов

Автор SHA1 Сообщение Дата
Ralph Castain
7c7d8a69a0 Backport changes from PMIx reference server
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-14 11:48:56 -07:00
Ralph Castain
691237801b Update to track PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-13 10:21:44 -07:00
Ralph Castain
bbd83fd4c0 Add a new launcher "prun" for starting applications against the ORTE DVM.
Unlike "orterun", "prun" is a PMIx-only program that discovers the DVM connection instead of requiring that we explicitly provide it. Only build "prun" if PMIx v2.x is available.

This gets the DVM working again, but still is showing problems for multiple executions. I'll detail those in a separate issue. Thus, the DVM should still be considered "broken".

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-12 21:40:41 -07:00
Ralph Castain
88eac797fb Silence coverity warnings
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-12 09:14:36 -07:00
Ralph Castain
3477079804 Repair the ORTE DVM
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-11 17:38:21 -07:00
Ralph Castain
cbc114e923 Update to track PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-06 13:15:24 -07:00
Ralph Castain
2c723f4338 Roll to track PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-01 12:30:34 -07:00
Gilles Gouaillardet
c9cca771cc pmix/ext2x: automatically generate ext2x component from pmix2x sources
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-08-30 09:41:31 +09:00
Gilles Gouaillardet
fd08b923d5 pmix: do not invoke PMIX_INFO_CREATE() with a zero size
Thanks Lisandro Dalcin for the report

Fixes open-mpi/ompi#3854

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-08-28 11:25:58 +09:00
Josh Hursey
ad87aa2674 Merge pull request #4121 from jjhursey/explore/dlopen-local
mca: Dynamic components link against project lib
2017-08-25 13:15:51 -05:00
Joshua Hursey
e1d079544b mca: Dynamic components link against project lib
* Resolves #3705
 * Components should link against the project level library to better
   support `dlopen` with `RTLD_LOCAL`.
 * Extend the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am`
   with the appropriate project level library:
```
MCA components in ompi/
       $(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la
MCA components in orte/
       $(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la
MCA components in opal/
       $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la
MCA components in oshmem/
       $(top_builddir)/oshmem/liboshmem.la"
```

Note: The changes in this commit were automated by the script in
the commit that proceeds it with the `libadd_mca_comp_update.py`
script. Some components were not included in this change because
they are statically built only.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-08-24 11:56:16 -04:00
Ralph Castain
68029b27e4 Fix the orte-dvm operations so that orterun can connect and execute an application. There is a lingering problem, though. The first invocation of orterun succeeds every time. However, subsequent invocations have a high probability of hanging in the OOB connection handshake.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-23 17:31:08 -07:00
Ralph Castain
0561d64748 Continue tracking PMIx v2.1.0
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-23 09:38:27 -07:00
Ralph Castain
d80b0c7990 If the HWLOC shared memory system is unable to connect, then fallback to providing the topology via XML. Do not automatically provide the XML to every process as that defeats the purpose of the shared memory system. Instead, use PMIx_Query_info_nb to get the info from the server when required.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-22 18:12:26 -07:00
Ralph Castain
e3213386ec Fix the internal PMIx installation - matching changes have been upstreamed
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-22 13:49:07 -07:00
Ralph Castain
a1b15c5666 Roll in update to PMIx master. Transfer updates from pmix2x component to ext2x
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-22 13:06:47 -07:00
Ralph Castain
d515f48885 The local PMIx server is notifying its clients of all events, but for some reason I don't recall, the broadcast notification was marked for delivery only to non-default event handlers. This creates a discrepancy between the two behaviors, so don't restrict the broadcast notifications.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-18 17:26:11 -07:00
Ralph Castain
088b6cdeee Silence coverity warnings
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-17 09:49:35 -07:00
Ralph Castain
c4d5dbfcdc Change test per recommendation of @jsquyres
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-16 11:19:15 -07:00
Ralph Castain
eb69df02ae Update to PMIx v2.1.0rc1
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 19:59:15 -07:00
Ralph Castain
65fb6070d9 Update tool support by adding MCA params to direct orted's to drop
session and/or system-level tool rendezous files. Ensure PMIx is
enabled for tools

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 17:49:47 -07:00
Ralph Castain
033a0eb373 Fix the --disable-dlopen --with-devel-headers case by not having libpmix link back to libopen-pal as the latter won't exist in time during this build case
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 10:51:35 -07:00
Ralph Castain
4290247d64 Update to latest PMIx v2.1.0a
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-10 18:48:07 -07:00
Ralph Castain
53c9270af7 Silence coverity warnings
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-08 06:10:14 -07:00
Ralph Castain
9921237f99 Merge pull request #4012 from rhc54/topic/p3
Cover the use-cases for OPAL_PREFIX and PMIX_INSTALL_PREFIX options
2017-08-07 11:42:53 -07:00
Ralph Castain
d593e5a4ce When we specify --with-devel-headers, we also emit a copy of libpmix. However, that library was built against the OPAL libevent component, which means all the libevent functions are prefixed with OPAL names. So ensure that the emitted libpmix is linked back against libopen-pal so those symbols will be resolved.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-07 09:36:16 -07:00
Ralph Castain
a239b4c3c3 Per discussion on the PMIx side, do a better job of detecting mismatches between location directives for OPAL and PMIx. Provide a more helpful error message and error out if we find a mismatch. If any OPAL values are set and the PMIx equivalent is not, then transfer it.
Do not clear PMIX_INSTALL_PREFIX from the daemon's launch environment

Fixes #3980
Closes #4007
Refs #3985

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-04 19:36:00 -07:00
Ralph Castain
f128b4c546 Fix incorrect usage of '==' in test comparisons
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-03 21:21:26 -07:00
Artem Polyakov
500c8be888 pmix: fix PMIx envar name for the installation prefix.
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-08-02 08:03:36 +03:00
Ralph Castain
f39ce67982 Merge pull request #3951 from rhc54/topic/hwloc2
Update to hwloc 2.0.0a
2017-08-01 15:18:31 -06:00
Ralph Castain
8f34fa4a56 Move the detection of OPAL_PREFIX and subsequent posting of PMIX_PREFIX to the internal integration code for PMIx so we only do this when running with the embeddied PMIx
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-01 08:24:27 -06:00
Boris Karasev
e20b581529 pmix: fixed immediate request
This commit fixes a hang when using external PMIx v1 module

Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2017-07-28 15:53:48 +06:00
Ralph Castain
7a83fdb9bb Update to hwloc 2.0.0a with shmem support.
Update to support passing of HWLOC shmem topology to client procs
Update use of distance API per @bgoglin
Have the openib component lookup its object in the distance matrix
Bring usnic up-to-date
Restore binding for hwloc2

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-25 20:26:22 -07:00
Ralph Castain
0042c758f1 Update the tools support so it allows tools to access PMIx
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-25 17:10:08 -07:00
Ralph Castain
058e802b11 Add missing export directives
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-25 07:19:08 -07:00
George Bosilca
1ea8fab095
Make external symbols visible.
All symbols that need to be accessed from a MCA component must be marked
explicitly as visible using PMIX_EXPORT. This patch allows current trunk
to almost work on OsX. More on the devel mailing list.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-07-25 01:14:22 -04:00
Ralph Castain
af85e48dd7 Silence Coverity warning, silence pmix_error_log of success
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-21 15:33:16 -07:00
Ralph Castain
492f98f8a5 Update to latest PMIx v2.1.0a
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-21 12:58:09 -07:00
Ralph Castain
f7e8780a42 Remove fortran support from platform file
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-20 21:02:30 -07:00
Ralph Castain
b225366012 Bring the ofi/rml component online by completing the wireup protocol for the daemons. Cleanup the current confusion over how connection info gets created and
passed to make it all flow thru the opal/pmix "put/get" operations. Update the PMIx code to latest master to pickup some required behaviors.

Remove the no-longer-required get_contact_info and set_contact_info from the RML layer.

Add an MCA param to allow the ofi/rml component to route messages if desired. This is mainly for experimentation at this point as we aren't sure if routing wi
ll be beneficial at large scales. Leave it "off" by default.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-20 21:01:57 -07:00
Ralph Castain
8c30958879 Update to PMIx v2.1.0alpha
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-20 11:12:06 -07:00
Ralph Castain
fca68b070b Merge pull request #3934 from rhc54/topic/singleton
Fix the isolated pmix component. Cleanup the ess/singleton component …
2017-07-19 16:02:37 -05:00
Ralph Castain
543c16b28d Fix the isolated pmix component. Cleanup the ess/singleton component - we shouldn't be automatically discovering the local topology as that is now done on-demand.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-19 12:14:29 -07:00
Howard Pritchard
2fa0c4c6ec pmix/s1: fix problems with ref counting in s1
s1 pmix component wasn't doing proper ref counting

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-07-18 15:59:28 -06:00
Gilles Gouaillardet
9124afbeae pmix: do not invoke PMIX_INFO_CREATE() with a zero size
Thanks Lisandro Dalcin for the report

Fixes open-mpi/ompi#3854

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-14 15:00:05 +09:00
Artem Polyakov
4d3e22e815 Merge pull request #3870 from hppritcha/topic/repair_s2_launch
pmix/s2: fix srun native launch for pmi2
2017-07-13 12:45:22 -05:00
Howard Pritchard
eeb91bc82b pmix/s2: fix srun native launch for pmi2
recent changes that broke native launch on cray
using srun or aprun was also broke native launch
using pmi2.

This commit fixes this problem.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-07-12 17:45:52 -06:00
Jeff Squyres
ccf17808b6 Merge pull request #3258 from markalle/pr/symbol_name_pollution
symbol name pollution
2017-07-12 16:19:25 -05:00
Howard Pritchard
550e8c4afe Merge pull request #3842 from hppritcha/topic/fix_cray_pmix_problem
pmix/cray: add a bit of debug output
2017-07-11 08:29:56 -06:00
Howard Pritchard
26a8142c97 pmix/cray: add a bit of debug output
add a bit of debug output to help with pmix finalize issues

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-07-11 05:45:49 -05:00