after performing the final OBJ_RELEASE on the request,
reset the user level variable to MPI_REQUEST_NULL.
Otherwise the c_2_f translation step in the fortran
interface fails.
Fixes issue #4807
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
This commit fixes a flaw in the eager limit check in pml/ob1. The
check was incorrectly checking if RDMA-only BTLs (BTLs without the
send flag) has a valid eager limit. This commit fixes the check by
adding an additional check for the send flag on the BTL module.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Since we now support the dynamic addition of hosts to the orte_node_pool, there is no longer any reason to require advanced specification of all possible nodes. Instead, use a precedence method to initially allocate only those hosts that were specified in the cmd line:
* rankfile, if given, as that will specify the nodes
* -host, aggregated across all app_contexts
* -hostfile, aggregated across all app_contexts
* default hostfile
* assign local node
Fix slots_inuse accounting so that the nodes are correctly reset upon error termination - e.g., when oversubscribed without permission.
Ensure we accurately track the user's specified desires for oversubscribe and no-use-local when dynamically spawning jobs.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit c9b3e68ce596a68a2ed2fbf73f211b3334b0a6a8)
The osc monitoring component needed to include other OSC components
header in order to be able tu access communicator through the
component specific ompi_osc_*_module_t structures. This commit remove
the dependency, and resolve the issue #4523.
Extend the common monitoring API.
* Now it's possible to translate from local rank to world rank from
both the communicator and the group.
* Remove useless hashtable as we directly use the w_group contained
in window structure.
Add automatic generation at config time.
The templates are expanded at configure time. It creates a new header
file that generates all the variables/functions needed. Adding this
during the autogen automagicaly generates for each of the available
modules the proper functions.
Only keep a generated argv-style array.
Following Jeff's advice, the configure.m4 file generate a simple array
of module variables to be iterated over to find the proper module.
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>
Fixed the desync of job-nodelists between mpirun and orted
daemons. The issue was observed when using RSH launching because user
can provide arbitrary order of nodes regarding HNP placement.
The mpirun process propagate the daemon's nodelist order to nodes.
The problem was that HNP itself is assembling the nodelist based on
user provided order. As the result ranks assignment was calculated
differently on orted and mpirun.
Consider following example:
* User launches mpirun on node cn2.
* Hostlist is cn1,cn2,cn3,cn4; ppn=1
* mpirun is passing hostlist cn[2:2,1,3-4]@0(4) to orteds
So as result mpirun will assing rank 0 on cn1 while orted will assign
rank 0 on cn2 (because orted sees cn2 as the first element in the node
list)
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
This header file was meant to be autogenerated, and for
some reasons, was never removed from the repository.
Update .gitignore as well
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
If component selection fails, then module->bases might be unallocated
when ompi_osc_sm_free() in invoked, so test it before trying to free()
module->bases[0].
Thanks Martin Binder for the report.
Refs open-mpi/ompi#4770
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Since open-mpi/ompi@47fd2313ab
the backing file is now in /dev/shm by default. As a consequence,
the backing file name has to include the jobid so more than one job
can run at a time.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
When too much data is available on stdin, it might not be
forwarded immediatly to the task (write() might fail with -EAGAIN),
so when stdin is terminated, there might be some remaining data
to be pushed to the task. In this case, delay the release of the sink
so no data is discarded.
Refs open-mpi/ompi#4744
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
this option was only used by the iof/mr_hnp (aka Map/Reduce)
component that is no more part of master nor v3 branches.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>