
26497 Commits

Author SHA1 Message Date
Ralph Castain
a7b8190fdc Per f2f meeting: if async modex is given, default to no MPI init barrier, letting the user override that if desired.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-25 10:13:53 -08:00
Jeff Squyres
230bbc597d plm base: make sure to assign "node" early enough
Make sure to assign "node" before using it in ORTE_FLAG_SET.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-25 08:02:59 -08:00
Ralph Castain
e7323fdd93 Merge pull request from rhc54/topic/oob4
Cleanup some code so it is clear that it is executing in an event. En…
2017-01-25 07:48:31 -08:00
Ralph Castain
184ccc8e91 Cleanup some code so it is clear that it is executing in an event. Ensure that peer event base is properly set on incoming connections
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-25 06:55:11 -08:00
Edgar Gabriel
4e06b96701 Merge pull request from edgargabriel/pr/sharedfp-append-fix
Pr/sharedfp append fix
2017-01-25 08:01:04 -06:00
Gilles Gouaillardet
142b95df87 pmix/ext2x: plug misc memory leaks regarding opal_pmix2x_event_chain_t handling
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-25 16:17:10 +09:00
Gilles Gouaillardet
7a3d39f079 pmix/ext2x: plug a memory leak in _reg_nspace()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-25 16:17:01 +09:00
Gilles Gouaillardet
ef10d3fd7b orte: add missing include file
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-25 16:15:20 +09:00
Ralph Castain
186059cc00 Merge pull request from rhc54/topic/host
Revamp -host and -cpu-list options per f2f meeting
2017-01-24 18:37:24 -08:00
Ralph Castain
ef86707fbe Deprecate the --slot-list parameter in favor of --cpu-list. Remove the --cpu-set param (mark it as deprecated) and use --cpu-list instead as it was confusing having the two params. The --cpu-list param defines the cpus to be used by procs of this job, and the binding policy will be overlaid on top of it.
Note: since the discovered cpus are filtered against this list, #slots will be set to the #cpus in the list if no slot values are given in a -host or -hostname specification.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-24 13:33:22 -08:00
Ralph Castain
0bfdc0057a Extend the -host:N syntax to accept "*" or "auto" to indicate "auto-detect the #cpus and set #slots to that value"
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-24 10:21:01 -08:00
Ralph Castain
d3907dec98 Make master continue the -host behavior of prior releases: use of -host <foo> specifies a single slot. Requests to run more than one process will require either specifying slots using the "-host foo:N" syntax, or adding --oversubscribe to the cmd line.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-24 10:11:56 -08:00
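The three -host commits above describe a small grammar: a bare host name means one slot, "host:N" gives an explicit slot count, and "host:*" or "host:auto" asks for the slot count to be set to the detected number of cpus. The following is a minimal Python sketch of that rule only, not the ORTE implementation. The function name `parse_host_spec` and the `detect_cpus` callback are hypothetical, and the sketch detects cpus on the local machine, whereas the real launcher would query the target node. It also assumes host names contain no colon.

```python
import os


def parse_host_spec(spec, detect_cpus=os.cpu_count):
    """Return (hostname, slots) for one -host entry.

    Hypothetical helper illustrating the rule in the commit messages:
      "foo"       -> 1 slot
      "foo:N"     -> N slots
      "foo:*"     -> slots = detected cpu count
      "foo:auto"  -> slots = detected cpu count
    """
    host, sep, count = spec.partition(":")
    if not host:
        raise ValueError("empty host name in %r" % spec)
    if not sep:
        # Bare host name: a single slot, matching prior releases.
        return host, 1
    if count in ("*", "auto"):
        # os.cpu_count() may return None; fall back to one slot.
        return host, detect_cpus() or 1
    slots = int(count)  # raises ValueError on non-numeric input
    if slots < 1:
        raise ValueError("slot count must be positive in %r" % spec)
    return host, slots
```

Under this rule, running more processes than the returned slot count would need either a larger N or --oversubscribe, as the commit message states.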
Edgar Gabriel
cbb3cb9745 fs/ufs: avoid using the exclusive flag with shared file pointer
when a file is opened a second time for shared file pointer operations,
avoid setting the create and exclusive flag.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-01-24 12:11:29 -06:00
Edgar Gabriel
f5289a1803 common/ompio: store correctly the SHAREDFP_IS_SET flag
it looks like disabling the lazy_open flag for sharedfp components
revealed a bug that led to a crash in file_close in some tests. Make
sure the SHAREDFP_IS_SET flag is correctly set (and not overwritten again),
and we use that to avoid a double-free of the communicator.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-01-24 12:09:56 -06:00
Josh Hursey
c6595c2289 Merge pull request from jjhursey/topic/libevent-conf2
libevent2022: Fix broken configure AC_LANG_PROGRAM
2017-01-24 08:31:46 -06:00
Ralph Castain
4e9364b9a4 Merge pull request from rhc54/topic/regs
Next step in reducing launch time
2017-01-24 03:19:57 -08:00
Gilles Gouaillardet
682f5116aa Merge pull request from ggouaillardet/topic/misc_fixes_and_plugs
fix misc bugs and plug misc memory leaks
2017-01-24 14:41:45 +09:00
Ralph Castain
86ab751c5e Next step in reducing launch time: begin reducing the size of the launch message itself. Start by expressing the daemon map as a set of three regular expression strings. On an 8k cluster, this reduces the nidmap contribution from over 200kBytes to 21 bytes in size.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-23 19:54:47 -08:00
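The size reduction reported above comes from describing regularly named nodes by a pattern instead of listing every name. The commit does not show the actual encoding of its three regular expression strings, so the sketch below is only an illustration of the general idea, written in Python. The function `compress_hostnames` and the "prefix[lo-hi]" output format are assumptions made for this example, not the format used by ORTE.

```python
import re


def _format_run(prefix, width, lo, hi):
    """Render one run of consecutive numbered hosts."""
    if lo == hi:
        return "%s%0*d" % (prefix, width, lo)
    return "%s[%0*d-%0*d]" % (prefix, width, lo, width, hi)


def compress_hostnames(names):
    """Collapse consecutive numbered hostnames into prefix[lo-hi] ranges.

    Hypothetical illustration: names that share a prefix and digit width
    and whose numeric suffixes increase by one are merged into one range.
    """
    out = []
    run = None  # (prefix, width, lo, hi) of the range being built
    for name in names:
        m = re.fullmatch(r"(.*?)(\d+)", name)
        if m is None:
            # No numeric suffix: flush any open run and emit verbatim.
            if run is not None:
                out.append(_format_run(*run))
                run = None
            out.append(name)
            continue
        prefix, digits = m.group(1), m.group(2)
        num, width = int(digits), len(digits)
        if run and run[0] == prefix and run[1] == width and num == run[3] + 1:
            run = (prefix, width, run[2], num)  # extend the current run
        else:
            if run is not None:
                out.append(_format_run(*run))
            run = (prefix, width, num, num)  # start a new run
    if run is not None:
        out.append(_format_run(*run))
    return ",".join(out)
```

For a cluster whose nodes are named node0001 through node8192, the comma-separated list of names is tens of kilobytes long, while the compressed form is the single string "node[0001-8192]".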
Joshua Hursey
72ac812039 libevent2022: Fix broken configure AC_LANG_PROGRAM
* Similar to commit 029964a748
   This removes an extra `int main` during configure.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-01-23 21:47:59 -06:00
Josh Hursey
b9b96f13ca Merge pull request from jjhursey/topic/libevent-conf
libevent2022: Fix broken configure AC_LANG_PROGRAM
2017-01-23 21:39:05 -06:00
Gilles Gouaillardet
d54e54538a orted/orted_submit: plug a memory leak
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:13:30 +09:00
Gilles Gouaillardet
189da7fdab pmix2x: plug a memory leak in _event_hdlr()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:13:30 +09:00
Gilles Gouaillardet
acbc32d3b2 pmix2x: plug a memory leak in opal_lkupcbfunc()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:13:29 +09:00
Gilles Gouaillardet
b5b21043c4 pmix2x: plug a memory leak in _reg_nspace()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:13:29 +09:00
Gilles Gouaillardet
0f47310a75 pmix2x/pmix2x_client: plug misc memory leaks
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:13:29 +09:00
Gilles Gouaillardet
f4dc7e4134 orted/orted_submit: plug misc memory leaks
- always invoke init_globals() before opal_cmd_line_parse(orte_cmd_line, ...)
- plug more leaks in init_globals()
- remove unused env_val and personalities fields from orte_cmd_options_t

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:13:29 +09:00
Gilles Gouaillardet
d5aa310884 mpiext/affinity: initialize all output variables of OMPI_Affinity_str()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:13:29 +09:00
Gilles Gouaillardet
501eb8dc7e ompio: plug misc memory leaks
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:13:19 +09:00
Gilles Gouaillardet
1a6c17ec7d opal/util: plug a memory leak
by using opal_setenv() instead of putenv()

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:12:47 +09:00
Gilles Gouaillardet
d0629f18c2 coll/libnbc: optimize size one communicators
simply "return" with ompi_request_empty if the communicator size is 1

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:12:47 +09:00
Gilles Gouaillardet
9d6e0482a6 orte/data_server: plug a memory leak in orte_data_server()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:12:47 +09:00
Gilles Gouaillardet
0bdc594b2e rml/base: plug a memory leak in orte_rml_API_recv_cancel()
simply return when the orte event thread has gone

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:12:47 +09:00
Gilles Gouaillardet
6f2ca5809b man: fix a typo in MPI_Win_get_name()
Thanks Nicolas Joly for the report

Fixes 

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:08:13 +09:00
Jeff Squyres
e7588f0509 Merge pull request from edgargabriel/pr/sharedfp-append-fix
common/ompio: update comment based on the previous commit.
2017-01-23 14:06:13 -08:00
Joshua Hursey
029964a748 libevent2022: Fix broken configure AC_LANG_PROGRAM
* The AC_LANG_PROGRAM macro adds the `main()` so it is erroneous
   to add it to the test program.
 * This was detected with the XL compilers which will fail to
   build the program in this situation. The GNU compiler does not
   error out or warn, but successfully compiles the program.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-01-23 13:44:12 -06:00
Edgar Gabriel
4dc09de3b8 common/ompio: update comment based on the previous commit.
No source code changed.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-01-23 13:38:05 -06:00
Ralph Castain
f3920828ed Merge pull request from rhc54/topic/pmixup
Update to latest PMIx master
2017-01-23 11:01:19 -08:00
Edgar Gabriel
2215f29849 Merge pull request from edgargabriel/pr/sharedfp-append-fix
Pr/sharedfp append fix
2017-01-23 10:38:27 -06:00
Ralph Castain
8c960bae8d Update to latest PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-23 07:07:40 -08:00
Edgar Gabriel
3eae0eecd0 io/ompio: change default for sharedfp_lazy_open parameter
Revert the logic of io_ompio_sharedfp_lazy_open. The user now has to explicitly
disable shared fp in order for the structures not to be allocated.
Otherwise, resetting the shared fp, e.g. in case the file was opened
in append mode, will not work correctly and the code could deadlock.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-01-23 08:59:22 -06:00
Edgar Gabriel
d3a8d38cc6 common/ompio: correctly position shared fp in append mode
Fixes a bug reported on the mailing list. ompio only repositioned the individual
file pointer when the file was opened in append mode. Set the shared file
pointer also to point to the end of the file, similarly to the individual
file pointer.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-01-23 08:59:05 -06:00
Ralph Castain
a61f7bdb26 Merge pull request from rhc54/topic/conn
Ensure we properly set the "shutting down" flag so connection drops by downstream peers are properly handled.
2017-01-23 06:40:28 -08:00
Ralph Castain
e7b12913b4 Ensure we properly set the "shutting down" flag so connection drops by downstream peers are properly handled.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-23 04:00:24 -08:00
Ralph Castain
0b4648b3a7 Merge pull request from hjelmn/oob_param
oob/base: fix num_threads registration type
2017-01-22 14:09:06 -08:00
Nathan Hjelm
954a4b7be3 oob/base: fix num_threads registration type
This commit fixes a bug in the registration of the num_threads MCA
variable. The variable is of type int and was being registered as
a boolean.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-01-22 14:02:34 -07:00
Ralph Castain
c549f82cdc Merge pull request from rhc54/topic/threads
Ensure that oob/base level data is always accessed in the oob/base event thread. Make debruijn the default routed component
2017-01-22 11:21:34 -08:00
Ralph Castain
ac4fcd3f97 Ensure that oob/base level data is always accessed in the oob/base event thread. Make debruijn the default routed component
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-22 10:33:32 -08:00
Ralph Castain
adbcefebf8 Merge pull request from rhc54/topic/spawn
Fix comm_spawn and orte-dvm by resetting all used "node mapped" flags after building the child list
2017-01-22 08:07:08 -08:00
Ralph Castain
6560617c04 Fix comm_spawn and orte-dvm by resetting all used "node mapped" flags after building the child list
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-22 05:55:53 -08:00
Ralph Castain
59eafebf66 Merge pull request from rhc54/topic/fix
Add missing flag set to ensure nodes do not get double-added to job map.
2017-01-21 20:54:37 -08:00