852af8b834
I recently found a case where ompi_mpi_abort() segfaults: {{{ $ mpirun --mca btl non_existent_btl_name ... }}} In this case, the BML init fails because we have no paths to any peers. It calls ompi_mpi_abort(), but this happens before ompi_comm_self has been set up. ompi_mpi_abort() assumes that if the comm parameter is != NULL, it can be used. But since we aborted so early in MPI_INIT, that's a false assumption. (Note that this isn't happening on v1.8 because the check for INIT/FINALIZE in ompi_mpi_abort() is a little different. Hence, this is a trunk issue, at least for now.) When fixing this problem, I noticed a few other problems in ompi_mpi_abort(): * the group access was incorrect (it didn't use accessor functions) * it wasn't clear that ORTE's ompi_rte_abort_peers() returns NOT_IMPLEMENTED and falls through to ompi_rte_abort() * the check for my proc in the communicator was a little more complicated than necessary * the logic for checking for aborts early in MPI_INIT wasn't right * some comments were stale * the hostname output in error messages would be NULL if MPI_FINALIZE had been invoked * it was possible to abort, but still exit with a 0 status. This commit fixes all of the above problems and makes the logic a little more straightforward. Thanks to Ralph Castain and George Bosilca for the assists with this patch. This commit was SVN r32125. |
||
---|---|---|
.. | ||
help-mpi-runtime.txt | ||
Makefile.am | ||
mpiruntime.h | ||
ompi_cr.c | ||
ompi_cr.h | ||
ompi_info_support.c | ||
ompi_info_support.h | ||
ompi_module_exchange.c | ||
ompi_module_exchange.h | ||
ompi_mpi_abort.c | ||
ompi_mpi_finalize.c | ||
ompi_mpi_init.c | ||
ompi_mpi_params.c | ||
ompi_mpi_preconnect.c | ||
params.h |
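
Two of the problems described in the commit message above (trusting a non-NULL comm parameter before MPI_INIT has finished, and aborting with a 0 exit status) can be illustrated with a minimal sketch. This is not the code in ompi_mpi_abort.c: every type and function name below (`sketch_comm_t`, `sketch_state_t`, `sketch_comm_usable`, `sketch_abort_status`) is hypothetical, and the three-state model of the library lifecycle is an assumption made for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real Open MPI types or globals. */
typedef struct {
    int size;     /* number of processes in the communicator */
    int my_rank;  /* rank of the calling process */
} sketch_comm_t;

/* Assumed lifecycle model: the real code tracks init/finalize differently. */
typedef enum {
    SKETCH_NOT_INITIALIZED, /* MPI_INIT has not completed (or failed early) */
    SKETCH_INITIALIZED,     /* communicators are set up and usable */
    SKETCH_FINALIZED        /* MPI_FINALIZE has been invoked */
} sketch_state_t;

/* A communicator is safe to dereference only if the pointer is non-NULL
 * AND the library is fully initialized and not yet finalized. Checking
 * the pointer alone is the false assumption the commit message describes. */
static bool sketch_comm_usable(const sketch_comm_t *comm, sketch_state_t state)
{
    return comm != NULL && state == SKETCH_INITIALIZED;
}

/* An abort must never report success: map an error code of 0 to a
 * nonzero exit status and pass any other code through unchanged. */
static int sketch_abort_status(int errcode)
{
    return errcode == 0 ? 1 : errcode;
}
```

With these helpers, an abort path would consult `sketch_comm_usable()` before touching the communicator's group, and pass its error code through `sketch_abort_status()` before exiting.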