1
1

odls_default: do not opal_output() while creating a process!

It is verbotten to use opal_output() after the fork() but before the
exec()!  It results in all manner of undefined behavior.  For example,
on some OS X systems, if you run a trivial "hello world" MPI program
with a high level of ODLS verbosity:

```sh
$ mpirun -np 3 --mca odls_base_verbose 100 ./hello_c
```

You will see a bunch of output from the mpirun ODLS base, but then it
*may* hang in odls_default_module.c:do_child() -- after the fork() but
before the exec() -- while trying to opal_output() some debugging
statements.

The solution is to remove these extraneous opal_output() statements.
Indeed, the ODLS base is already outputting the same information that
these opal_output() statements are trying to emit, anyway.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Этот коммит содержится в:
Jeff Squyres 2016-05-24 20:38:06 -04:00
родитель 461ca1203b
Коммит dd9a819a1c

Просмотреть файл

@ -11,7 +11,7 @@
* All rights reserved.
* Copyright (c) 2007-2010 Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2007 Evergrid, Inc. All rights reserved.
* Copyright (c) 2008-2015 Cisco Systems, Inc. All rights reserved.
* Copyright (c) 2008-2016 Cisco Systems, Inc. All rights reserved.
* Copyright (c) 2010 IBM Corporation. All rights reserved.
* Copyright (c) 2011-2013 Los Alamos National Security, LLC. All rights
* reserved.
@ -479,17 +479,6 @@ static int do_child(orte_app_context_t* context,
/* Exec the new executable */
if (10 < opal_output_get_verbosity(orte_odls_base_framework.framework_output)) {
int jout;
opal_output(0, "%s STARTING %s", ORTE_NAME_PRINT(ORTE_PROC_MY_NAME), context->app);
for (jout=0; NULL != context->argv[jout]; jout++) {
opal_output(0, "%s\tARGV[%d]: %s", ORTE_NAME_PRINT(ORTE_PROC_MY_NAME), jout, context->argv[jout]);
}
for (jout=0; NULL != environ_copy[jout]; jout++) {
opal_output(0, "%s\tENVIRON[%d]: %s", ORTE_NAME_PRINT(ORTE_PROC_MY_NAME), jout, environ_copy[jout]);
}
}
execve(context->app, context->argv, environ_copy);
send_error_show_help(write_fd, 1,
"help-orte-odls-default.txt", "execve error",