openmpi

Author	SHA1	Message	Date
Mark Allen	bdd92a7a64	-cpu-set as a constraint rather than as a binding The first category of issue I'm addressing is that recent code changes seem to only consider -cpu-set as a binding option. Eg a command like this % mpirun -np 2 --report-bindings --use-hwthread-cpus \ --bind-to cpulist:ordered --map-by hwthread --cpu-set 6,7 hostname which just round robins over the --cpu-set list. Example output which seems fine to me: > MCW rank 0: [..../..B./..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....] > MCW rank 1: [..../...B/..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....] It should also be possible though to pass a --cpu-set to most other map/bind options and have it be a constraint on that binding. Eg % mpirun -np 2 --report-bindings \ --bind-to hwthread --map-by hwthread --cpu-set 6,7 hostname % mpirun -np 2 --report-bindings \ --bind-to hwthread --map-by ppr:2:node,pe=2 --cpu-set 6,7,12,13 hostname The first command above errors that > Conflicting directives for mapping policy are causing the policy > to be redefined: > New policy: RANK_FILE > Prior policy: BYHWTHREAD The error check in orte_rmaps_rank_file_open() is likely too aggressive. The intent seems to be that any option like "--map-by whatever" will check to see if a rankfile is in use, and report that mapping via rmaps and using an explicit rankfile is a conflict. But the check has been expanded to not just check NULL != orte_rankfile but also errors out if (NULL != opal_hwloc_base_cpu_list && !OPAL_BIND_ORDERED_REQUESTED(opal_hwloc_binding_policy)) which seems to be only recognizing -cpu-set as a binding option and ignoring -cpu-set as a constraint on other binding policies. 
For now I've changed the NULL != opal_hwloc_base_cpu_list to OPAL_BIND_TO_CPUSET == OPAL_GET_BINDING_POLICY(opal_hwloc_binding_policy) so it hopefully only errors out if -cpu-set is being used as a binding policy. Whether I did that right or not it's enough to get to the next stage of testing the example commands I have above. Another place similar logic is used is hwloc_base_frame.c where it has /* did the user provide a slot list? */ if (NULL != opal_hwloc_base_cpu_list) { OPAL_SET_BINDING_POLICY(opal_hwloc_binding_policy, OPAL_BIND_TO_CPUSET); } where it used to (long ago) only do that if !OPAL_BINDING_POLICY_IS_SET(opal_hwloc_binding_policy). I think the new code is making it impossible to use --cpu-set as anything other than a binding policy. That brings us past the error detection and into the real functionality, some of which has been stripped out, probably in moving to hwloc-2: % mpirun -np 2 --report-bindings \ --bind-to hwthread --map-by hwthread --cpu-set 6,7 hostname > MCW rank 0: [B.../..../..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....] > MCW rank 1: [.B../..../..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....] The rank_by() function in rmaps_base_ranking.c makes an array out of objects returned from opal_hwloc_base_get_obj_by_type(,,,i,) which uses df_search(). That function changed quite a bit from hwloc-1 to 2 but it used to include a check for available = opal_hwloc_base_get_available_cpus(topo, start) which is where the bitmask from --cpu-set goes. And it used to skip objs that had hwloc_bitmap_iszero(available). So I restored that behavior in df_search() by adding a "constrained_cpuset" to replace start->cpuset that it was otherwise processing. 
With that change in place the first command works: % mpirun -np 2 --report-bindings \ --bind-to hwthread --map-by hwthread --cpu-set 6,7 hostname > MCW rank 0: [..../..B./..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....] > MCW rank 1: [..../...B/..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....] The other command uses a different path though that still ignored the available mask: % mpirun -np 2 --report-bindings \ --bind-to hwthread --map-by ppr:2:node:pe=2 --cpu-set 6,7,12,13 hostname > MCW rank 0: [BB../..../..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....] > MCW rank 1: [..BB/..../..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....] In bind_generic() the code used to call opal_hwloc_base_find_min_bound_target_under_obj() which used opal_hwloc_base_get_ncpus(), and that's where it would intersect objects with the available cpuset and skip over ones that weren't available. To match the old behavior I added a few lines in bind_generic() to skip over objects that don't intersect the available mask. After that we get % mpirun -np 2 --report-bindings \ --bind-to hwthread --map-by ppr:2:node:pe=2 --cpu-set 6,7,12,13 hostname > MCW rank 0: [..../..BB/..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....] > MCW rank 1: [..../..../..../BB../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....] I think the above changes are improvements, but I don't feel like they're comprehensive. I only traced through enough code to fix the two specific bugs I was dealing with. Signed-off-by: Mark Allen <markalle@us.ibm.com>	2019-04-12 15:33:56 -04:00
valentin petrov	cd5fa9706e	Merge pull request #6574 from vspetrov/master Fixes the O(N^2) loop in the mca_scoll_mpi_comm_query	2019-04-11 17:25:55 +03:00
bosilca	8cf7a7e87d	Merge pull request #6538 from bosilca/topic/issue6522 Prevent a segfault when accessing a rank outside a communicator.	2019-04-09 18:08:49 -04:00
Jeff Squyres	9bb8fd509b	Merge pull request #6566 from davideberius/name_fix Component: Changes	2019-04-09 09:03:05 -05:00
Gilles Gouaillardet	fd686d7e1f	Merge pull request #6580 from ggouaillardet/topic/pmix_refresh pmix/pmix4x: refresh to the latest PMIx	2019-04-09 15:26:45 +09:00
Gilles Gouaillardet	9ce8d7b568	pmix/pmix4x: refresh to the latest PMIx refresh pmix4x to pmix/pmix@2531c0c3d1 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2019-04-09 14:03:00 +09:00
Valentin Petrov	2fa93332ce	Fixes the O(N^2) loop in the mca_scoll_mpi_comm_query The new proc group is created from the "world_group" based on the ranks mapping which can be directly taken from proc_name->vpid. Signed-off-by: Valentin Petrov <valentinp@mellanox.com>	2019-04-08 17:49:00 +03:00
KAWASHIMA Takahiro	163bbd4f04	Merge pull request #6567 from kawashima-fj/pr/sys-timer-cleanup opal/sys: Native timer cleanup	2019-04-08 09:22:43 +09:00
Yossi Itigin	0a446b0a3f	Merge pull request #6563 from benmenadue/fix-shmem-context master: add missing #include to oshmem/shmem/c/shmem_context.c	2019-04-04 16:22:27 +03:00
KAWASHIMA Takahiro	77286a41aa	opal/sys: Introduce OPAL_HAVE_SYS_TIMER_GET_FREQ macro ... to avoid using an architecture name macro in `opal/mca/timer/linux/timer_linux_component.c`. The function name `opal_sys_timer_freq` is also changed for consistency with `opal_sys_timer_get_cycles`. Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>	2019-04-04 11:48:02 +09:00
KAWASHIMA Takahiro	f2c108bd8e	opal/sys: Correct OPAL_HAVE_SYS_TIMER_GET_CYCLES value ... in the case of `OPAL_GCC_INLINE_ASSEMBLY == 0` In this case, `OPAL_HAVE_SYS_TIMER_GET_CYCLES` should be 0 because the `opal_sys_timer_get_cycles` function is not defined. The history: 1. Before `8d4175ad89`, `OPAL_HAVE_SYS_TIMER_GET_CYCLES` was 0. 2. In `8d4175ad89`, `adf92d6237`, `adf92d6237`, and `c62ce1593a`, `OPAL_HAVE_SYS_TIMER_GET_CYCLES` was changed to 1 by introducing `opal/asm/base/.asm`. 3. In `ebce88b7ad`, `opal/asm/base/.asm` were removed. Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>	2019-04-04 11:46:37 +09:00
David Eberius	461d8bc77b	Fixed a potential name collision. Signed-off-by: David Eberius <deberius@vols.utk.edu>	2019-04-03 16:43:48 -04:00
Ben Menadue	063596b828	Add missing #include to oshmem/shmem/c/shmem_context.c. Signed-off-by: Ben Menadue <ben.menadue@nci.org.au>	2019-04-03 15:58:13 +11:00
markalle	98fdeeeb41	Merge pull request #6448 from markalle/macro_writing_input_arg in-place conversion macro writes into INPUT argument	2019-04-02 11:33:18 -05:00
Brelle Emmanuel	e630046a4b	pml/ob1: fixed local handle sent during PUT control message When a btl_put is used in ob1, the handle of the locally registered memory is sent with a PUT control message. In the current master code the sent handle is necessarily the handle in the frag, but if the handle has been successfully registered in the request, the frag structure does not have any valid handle and all fragments use the request one. I suggest checking whether the handle in the fragment is valid and, if not, sending the handle from the request. Signed-off-by: Brelle Emmanuel <emmanuel.brelle@atos.net>	2019-04-01 18:45:05 +02:00
Brelle Emmanuel	9c689f2225	pml/ob1: fixed exit from get_frag_fail when falling back on btl_put When btl_get fails, ob1 first tries to fall back on btl_put, but the return code was ignored, so the code fell back on both btl_put and btl_send. Signed-off-by: Brelle Emmanuel <emmanuel.brelle@atos.net>	2019-04-01 18:17:10 +02:00
Mark Allen	0a7f1e3cc5	in-place conversion macro writes into INPUT argument In fint_2_int.h there are some conversion macros for logicals. It has one path for OMPI_SIZEOF_FORTRAN_LOGICAL != SIZEOF_INT where a new array would be allocated and the conversions then might expand to c_array[i] = (array[i] == 0 ? 0 : 1) and another path for OMPI_SIZEOF_FORTRAN_LOGICAL == SIZEOF_INT where it does things "in place", so the same conversion there would just be array[i] = (array[i] == 0 ? 0 : 1) The problem is some of the logical arrays being converted are INPUT arguments. And it's possible for some compilers to even put the argument in read-only memory so the above "in place" conversion SEGV's. A testcase I have used call MPI_CART_SUB(oldcomm, (/.true.,.false./), newcomm, ierr) and gfortran put the second arg in read-only mem. In cart_sub_f.c you can trace the ompi_fortran_logical_t *remain_dims arg. remain_dims[] is for input only, but the file uses OMPI_LOGICAL_ARRAY_NAME_DECL(remain_dims); OMPI_ARRAY_LOGICAL_2_INT(remain_dims, ndims); PMPI_Cart_sub(..., OMPI_LOGICAL_ARRAY_NAME_CONVERT(remain_dims), ...); OMPI_ARRAY_INT_2_LOGICAL(remain_dims, ndims); to convert it to c-ints make a C call then restore it to Fortran logicals before returning. It's not always wrong to convert purely in-place, eg cart_get_f.c has a periods[] that's exclusively for OUTPUT and it would be fine with the macros as they were. But I still say the macros are invalid because they don't distinguish whether they're being used on INPUT or OUTPUT args and thus they can't be used in a way that's legal for both cases. It might be possible to fix the macros by adding more of them so that cart_create_f.c and cart_get_f.c would use different macros that give more context. But my fix here is just to turn off the first block and make all paths run as if OMPI_SIZEOF_FORTRAN_LOGICAL != SIZEOF_INT. 
The main macros that get enlarged by this change are define OMPI_ARRAY_LOGICAL_2_INT_ALLOC : mallocs now define OMPI_ARRAY_LOGICAL_2_INT : also mallocs now But these are only used in 4 places, three of which are the purpose of this checkin, to avoid the former in-place expansion of an INPUT arg: cart_create_f.c cart_map_f.c cart_sub_f.c and one of which is an OUTPUT arg that was fine and that gets unnecessarily expanded into a separate array by this checkin. cart_get_f.c So I think an unnecessary malloc in cart_get_f.c is the only downside to this change, where the logicals array argument could have been used and converted in place. Signed-off-by: Mark Allen <markalle@us.ibm.com> Update provided by Gilles Gouaillardet to keep the in-place option if OMPI_FORTRAN_VALUE_TRUE == 1 where no conversion is needed. Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2019-04-01 10:38:05 -04:00
Mark Allen	eb888118e8	shmat/shmdt additions for patcher This is mostly based off recent UCX additions to their patcher: https://github.com/openucx/ucx/pull/2703 They added triggers for * mmap when (flags & MAP_FIXED) && (addr != NULL) * shmat when (shmflg & SHM_REMAP) && (shmaddr != NULL) Beyond that I noticed they already had a trigger for * madvise when (advice == MADV_FREE) that we didn't so I added that. And the other main thing is we didn't really have shmat/shmdt active for some systems because we only had a path for syscall(SYS_shmdt, ) but we needed to also have a path for syscall(SYS_ipc, IPCOP_shmdt, ) and same for shmat. Signed-off-by: Mark Allen <markalle@us.ibm.com>	2019-03-29 14:38:46 -04:00
KAWASHIMA Takahiro	76516bc70c	Merge pull request #6542 from kawashima-fj/pr/man-typo man: Fix typo of MPI_TYPE_GET_NAME	2019-03-29 13:06:46 +09:00
KAWASHIMA Takahiro	63a1968459	man: Fix typo of MPI_TYPE_GET_NAME Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>	2019-03-29 13:01:52 +09:00
bosilca	b54fdf5dd9	Merge pull request #6541 from bwbarrett/bugfix/enotconn btl/tcp: Skip printing error message in racy cleanup path	2019-03-28 22:42:52 -04:00
Brian Barrett	d5360711fa	btl/tcp: Skip printing error message in racy cleanup path Avoid printing an error message about ENOTCONN return codes from getpeername() when handling an incoming connection request. At this point in the receive state machine, the remote process has been verified to be a valid OMPI instance. In all-to-all startup at 4k rank scale, we're seeing this error message when the remote side drops the connection because it realizes it's the "loser" in the connection race. We were already doing all the right things, other than printing a scary error message. So skip the error message and call it good. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2019-03-28 23:12:35 +00:00
Jeff Squyres	05c5e2034b	Merge pull request #6527 from James-A-Clark/master Add compilation flag to allow unwinding through files that are present in the stack when attaching with MPIR	2019-03-28 18:16:02 -04:00
George Bosilca	6ea0c4eab9	Prevent a segfault when accessing a rank outside a communicator. This is not fixing any issue, it is simply preventing a segfault if the communicator creation has not happened as expected. Thus, this code path should never really be hit in a correct MPI application with a valid communicator creation support. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2019-03-28 12:03:29 -04:00
Jeff Squyres	3c1b33c93a	Merge pull request #6140 from bertwesarg/fix-cpp-condition Fix use of bitwise operation in CPP condition	2019-03-28 10:06:20 -04:00
Nathan Hjelm	34d0790558	Merge pull request #6526 from ggouaillardet/topic/vader_fini btl/vader: fix finalize sequence	2019-03-27 12:12:00 -06:00
James Clark	20f5840cbb	Add a compilation flag that adds unwind info to all files that are present in the stack starting from MPI_Init. This is so when a debugger attaches using MPIR, it can step out of this stack back into main. This cannot be done with certain aggressive optimisations and missing debug information. Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Jeff Squyres <jsquyres@cisco.com> Co-authored-by: Jeff Squyres <jsquyres@cisco.com>	2019-03-27 14:32:15 +00:00
Gilles Gouaillardet	77060cad07	btl/vader: fix finalize sequence free the component mpool in mca_btl_vader_component_close(), after freeing some objects that depend on it such as mca_btl_vader_component.vader_frags_user Thanks Christoph Niethammer for reporting this. Refs. open-mpi/ompi#6524 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2019-03-27 11:57:40 +09:00
Ralph Castain	4e5cacc8db	Merge pull request #6523 from rhc54/topic/nid Sync nidmap to PRRTE to fix hetero topo problem	2019-03-26 09:22:58 -07:00
Ralph Castain	8174286530	Sync nidmap to PRRTE to fix hetero topo problem Signed-off-by: Ralph Castain <rhc@pmix.org>	2019-03-26 08:24:09 -07:00
Ralph Castain	dfbc14430d	Merge pull request #6440 from ggouaillardet/topic/yield_when_idle schizo/ompi: correctly handle the yield_when_idle option	2019-03-25 12:17:34 -07:00
Geoff Paulsen	44b3aa244b	Merge pull request #6510 from sam6258/int4_cswap_fix shmem/fortran: Fix invalid datatype size in call to atomic cswap	2019-03-25 11:49:00 -05:00
Gilles Gouaillardet	97b7fab872	Merge pull request #6516 from ggouaillardet/topic/pmix_refresh pmix/pmix4x: refresh to the latest PMIx	2019-03-25 14:48:45 +09:00
Gilles Gouaillardet	e844f76725	pmix/pmix4x: refresh to the latest PMIx refresh pmix4x to pmix/pmix@20cc9c041e Fixes open-mpi/ompi#6513 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2019-03-25 13:33:18 +09:00
Artem Polyakov	bfff5783f9	Merge pull request #6371 from artpol84/osc/select_dbg osc/base: Add debug output stating a selected component	2019-03-22 22:24:04 -07:00
Joshua Ladd	9ab6ecba65	Merge pull request #6492 from janjust/oshmem-multiple-contexts-master Oshmem multiple contexts	2019-03-22 17:34:46 -04:00
Xin Zhao	9c3d00b144	ompi/oshmem/spml/ucx: use lockfree array to optimize spml_ucx_progress/delete oshmem_barrier in shmem_ctx_destroy ompi/oshmem/spml/ucx: optimize spml ucx progress Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>	2019-03-21 23:01:45 +02:00
Xin Zhao	e0414006b0	ompi/oshmem/spml/ucx:delete oob path of getting rkeys in spml ucx Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>	2019-03-21 23:01:45 +02:00
Xin Zhao	e1c1ab0202	ompi/oshmem/spml/ucx: defer clean up shmem_ctx to shmem_finalize Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>	2019-03-21 23:01:37 +02:00
Scott Miller	6b294e0641	shmem/fortran: Fix invalid datatype size in call to atomic cswap Signed-off-by: Scott Miller <scott.miller1@ibm.com>	2019-03-20 21:57:08 -04:00
Josh Hursey	53cd31ed7e	Merge pull request #6504 from jjhursey/rm-hash-pmix4 Do not force 'hash' gds on direct modex in pmix4x	2019-03-19 20:35:12 -05:00
Ralph Castain	4e0905cda7	Merge pull request #6505 from rhc54/topic/pmxup Sync to latest PMIx master and silence hwloc warnings	2019-03-19 12:53:15 -07:00
Ralph Castain	0f26d8c76b	Silence warnings Signed-off-by: Ralph Castain <rhc@pmix.org>	2019-03-19 10:27:39 -07:00
Ralph Castain	c4be211741	Sync to latest PMIx master Signed-off-by: Ralph Castain <rhc@pmix.org>	2019-03-19 10:27:12 -07:00
Joshua Hursey	1314cf2640	Do not force 'hash' gds on direct modex in pmix4x * Forcing the 'hash' gds component should not be necessary any more. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2019-03-19 11:53:26 -05:00
Josh Hursey	836c80c442	Merge pull request #6498 from jjhursey/rm-hash-pmix3 Do not force 'hash' gds on direct modex	2019-03-19 10:45:11 -05:00
Nathan Hjelm	bf5fb5b589	Merge pull request #6500 from nysal/spinlock_fix opal/atomics: Add acquire semantics back for spinlocks	2019-03-19 07:54:37 -06:00
Jeff Squyres	5111dbd480	Merge pull request #6493 from rhc54/topic/order Ensure that nodes are always used in order provided	2019-03-19 09:40:21 -04:00
Nysal Jan K.A	00f27a80fc	opal/atomics: Add acquire semantics back for spinlocks This was introduced in commit `9d0b3fe9` Signed-off-by: Nysal Jan K.A <jnysal@in.ibm.com>	2019-03-19 16:27:03 +05:30
Joshua Hursey	c2581d0e33	Do not force 'hash' gds on direct modex * Forcing the 'hash' gds component should not be necessary any more. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2019-03-18 21:52:32 -05:00
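
Note on commit 0a7f1e3cc5 (in-place conversion macro writes into INPUT argument): the message argues that converting Fortran logicals to C ints in place is unsafe when the array is an INPUT argument, because the compiler may place it in read-only memory. A minimal sketch of the copy-based alternative might look like the following. The type `fortran_logical_t` and the function `logicals_to_ints_copy` are hypothetical stand-ins for illustration only; they are not the actual Open MPI types or macros in fint_2_int.h.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a Fortran LOGICAL of the same size as int. */
typedef int fortran_logical_t;

/*
 * Convert a read-only array of Fortran logicals into a newly allocated
 * array of C ints holding 0 or 1. The input is never written to, so it
 * is safe even if the caller's array lives in read-only memory.
 * Returns NULL if n <= 0 or if allocation fails; the caller frees the result.
 */
int *logicals_to_ints_copy(const fortran_logical_t *in, int n)
{
    assert(in != NULL);
    if (n <= 0) {
        return NULL;
    }
    int *out = malloc((size_t)n * sizeof *out);
    if (out == NULL) {
        return NULL;
    }
    for (int i = 0; i < n; ++i) {
        /* Same expression the commit message quotes, but written to a copy. */
        out[i] = (in[i] == 0) ? 0 : 1;
    }
    return out;
}
```

The cost is one extra allocation per call, which matches the downside the commit message itself names for the OUTPUT-only case in cart_get_f.c.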
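
Note on commit bdd92a7a64 (-cpu-set as a constraint rather than as a binding): the message describes skipping topology objects that do not intersect the available mask derived from --cpu-set. The idea can be sketched with plain 64-bit masks, as below. This is only an illustration under stated assumptions: `toy_obj_t` and `filter_by_available` are hypothetical names, and a `uint64_t` stands in for an hwloc bitmap; the real code works on hwloc objects inside df_search() and bind_generic().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a topology object: bit i set means hardware thread i. */
typedef struct {
    uint64_t cpuset;
} toy_obj_t;

/*
 * Copy into `out` every object whose cpuset intersects `available`,
 * narrowing each kept cpuset to the allowed bits and preserving order.
 * `out` must have room for n entries. Returns the number of objects kept.
 */
size_t filter_by_available(const toy_obj_t *objs, size_t n,
                           uint64_t available, toy_obj_t *out)
{
    assert(objs != NULL && out != NULL);
    size_t kept = 0;
    for (size_t i = 0; i < n; ++i) {
        uint64_t constrained = objs[i].cpuset & available;
        if (constrained == 0) {
            /* Analogous to skipping an object whose available bitmap is empty. */
            continue;
        }
        out[kept].cpuset = constrained;
        ++kept;
    }
    return kept;
}
```

With one object per hardware thread and an available mask covering threads 6 and 7, only those two objects survive, which corresponds to the corrected --report-bindings output quoted in the commit message.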


29893 commits