Nathan Hjelm
dfbe584c92
ompi/group: fix typos in add_procs changes
...
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-17 09:21:32 -06:00
rhc54
6efe91a24b
Merge pull request #904 from ggouaillardet/topic/cpuset
...
hwloc: do not count not allowed cores in df_search_cores
2015-09-17 03:55:50 -07:00
Gilles Gouaillardet
975b6fd51b
hwloc: do not count not allowed cores in df_search_cores
2015-09-17 13:10:34 +09:00
Nathan Hjelm
131681acc6
Merge pull request #901 from hjelmn/comm_fix
...
ompi/comm: fix comm_[i]dup on intracommunicators
2015-09-16 12:43:19 -06:00
Nathan Hjelm
c84c05bab7
ompi/comm: fix comm_[i]dup on intracommunicators
...
The behavior of ompi_comm_set was changed to get the remote size from
the remote group. This broke how ompi_comm_[i]dup were using
ompi_comm_set. In order to adapt to the new behavior these functions
now pass NULL for the remote group if the communicator is not an
inter-communicator.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-16 10:31:18 -06:00
rhc54
55d40910ee
Merge pull request #899 from rhc54/topic/cov
...
Silence some warnings and address Coverity issues
2015-09-16 09:23:32 -07:00
Ralph Castain
1b7930ad52
Silence some warnings and address Coverity issues
2015-09-16 07:58:22 -07:00
Ralph Castain
8b88ea9b13
Fix singletons by removing stale code
2015-09-16 00:58:05 -07:00
George Bosilca
02624bd0b6
Fix all treematch issues identified by Coverity.
2015-09-15 23:49:11 -04:00
George Bosilca
6ab5f68fc3
indentation.
2015-09-15 22:46:13 -04:00
rhc54
5597416fe0
Merge pull request #897 from rhc54/topic/oob
...
Remove the last involvement of the OOB system from the MPI layer
2015-09-15 14:40:21 -07:00
Jeff Squyres
7cb546a221
core: yow; this should absolutely not be in the repo!
2015-09-15 16:15:04 -04:00
Ralph Castain
c1bbbb5e2f
Remove the last involvement of the OOB system from the MPI layer, remove the no-longer-needed usock/oob component, and have procs no longer open the RML, OOB, ROUTED, and GRPCOMM frameworks as PMIx now provides all required app-mpirun cmds
2015-09-15 13:08:35 -07:00
Rolf vandeVaart
555f14a479
Merge pull request #893 from rolfv/pr/more-verbose-fix
...
Cleanup handle verbose messages
2015-09-15 15:45:52 -04:00
rhc54
3b4e982f86
Merge pull request #896 from hjelmn/comm_set_fix
...
ompi/comm: fix bug in ompi_comm_set
2015-09-15 12:25:55 -07:00
Nathan Hjelm
9c45c63143
ompi/dpm: fix typo in dynamic communicator detection
...
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-15 12:42:58 -06:00
Nathan Hjelm
6379178046
ompi/comm: fix bug in ompi_comm_set
...
This commit updates the behavior of ompi_comm_set to explicitly take
either local/remote group(s) OR local/remote array(s). If array(s) are
in use the sizes will be taken from the appropriate group(s).
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-15 11:37:44 -06:00
Nathan Hjelm
ca4be77ff1
Merge pull request #894 from hjelmn/osh_memheap_fix
...
oshmem/memheap: correct usage of opal_dss functions
2015-09-15 08:05:39 -06:00
George Bosilca
0e7e14449f
Typo in the modex_recv.
2015-09-14 18:00:02 -04:00
Nathan Hjelm
69b9bc2269
oshmem/memheap: correct usage of opal_dss functions
...
Any buffer given to opal_dss.load becomes the responsibility of the
opal_buffer_t object. It will be freed automatically if either the
opal_buffer_t is released or opal_dss.load is called again on the
buffer. opal_dss.unload will not prevent this unless no unpacking
takes place between the .load and .unload calls.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-14 13:54:56 -06:00
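The ownership rule described in this commit message can be illustrated with a toy model. This is a minimal sketch, not the real opal_dss code: the `demo_buffer_t` type and every `demo_*` function are hypothetical names invented for illustration. It assumes a buffer that remembers the start of its allocation and a separate read position. Once anything has been unpacked, the read position points partway into the allocation, so the toy returns the remaining bytes as a copy rather than handing out that interior pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy buffer; it only models ownership, not the opal_dss API. */
typedef struct demo_buffer {
    char  *base;        /* start of the owned allocation (what free() needs) */
    char  *unpack_ptr;  /* next unread byte inside the allocation */
    size_t bytes_left;  /* unread bytes remaining */
} demo_buffer_t;

void demo_buffer_init(demo_buffer_t *b)
{
    b->base = NULL;
    b->unpack_ptr = NULL;
    b->bytes_left = 0;
}

/* Take ownership of payload. Any previously loaded payload is freed. */
void demo_buffer_load(demo_buffer_t *b, void *payload, size_t size)
{
    free(b->base);
    b->base = (char *) payload;
    b->unpack_ptr = (char *) payload;
    b->bytes_left = size;
}

/* Copy n bytes out and advance the read position. Returns 0 on success. */
int demo_buffer_unpack(demo_buffer_t *b, void *dst, size_t n)
{
    if (n > b->bytes_left) {
        return -1;
    }
    memcpy(dst, b->unpack_ptr, n);
    b->unpack_ptr += n;
    b->bytes_left -= n;
    return 0;
}

/* Return a malloc'd copy of the unread bytes. The buffer keeps ownership
 * of its own allocation; the caller owns (and must free) the copy. */
void *demo_buffer_copy_remaining(const demo_buffer_t *b, size_t *size)
{
    void *copy;

    assert(NULL != size);
    *size = b->bytes_left;
    if (0 == b->bytes_left) {
        return NULL;
    }
    copy = malloc(b->bytes_left);
    if (NULL != copy) {
        memcpy(copy, b->unpack_ptr, b->bytes_left);
    }
    return copy;
}

/* Release the buffer, freeing whatever payload it currently owns. */
void demo_buffer_release(demo_buffer_t *b)
{
    free(b->base);
    demo_buffer_init(b);
}
```

In this model a caller never frees a pointer it passed to `demo_buffer_load`; it frees only the copies it receives back.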
Rolf vandeVaart
34fe2188cd
Cleanup handle verbose messages
2015-09-14 11:01:25 -04:00
Mike Dubman
6f82ce3fc8
Merge pull request #879 from igor-ivanov/pr/disable-oshmem-issue
...
Prevent oshmem related files inside install folder in case --disable-oshmem
2015-09-14 12:12:06 +03:00
Gilles Gouaillardet
d5af5d106c
btl/sm: mca_btl_sm_sendi: do not set *descriptor when descriptor is NULL
2015-09-14 14:04:40 +09:00
Nathan Hjelm
f29b65aa14
ompi/proc: fix typos CID 1323840
...
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-09-11 21:02:30 -06:00
rhc54
33f5e4c766
Merge pull request #892 from rhc54/topic/pmix
...
Fix the no-disconnect test by resolving a segfault on free - opal_dss…
2015-09-11 16:01:42 -07:00
Ralph Castain
fbcf819d2e
Remove unnecessary include
2015-09-11 15:53:00 -07:00
Nathan Hjelm
f798c909d1
Merge pull request #883 from hjelmn/comm_split_update
...
ompi/comm: improve comm_split_type scalability
2015-09-11 16:35:34 -06:00
Rolf vandeVaart
d78b954fd4
Merge pull request #891 from rolfv/pr/minor-cuda-verbosity-fixes
...
Fix cuda verbosity messages
2015-09-11 16:33:22 -04:00
Ralph Castain
22d7c0081a
Fix the no-disconnect test by resolving a segfault on free - opal_dss.unload will return the remaining unpacked portion of a buffer. As such, it cannot return the pointer to that info as it might be partway inside of a malloc'd region. So copy the data out of the buffer.
2015-09-11 13:01:35 -07:00
Ralph Castain
b60b03d613
It is okay not to get the hostname - we don't require that it be provided
2015-09-11 13:01:20 -07:00
Nathan Hjelm
c45789a222
ompi/comm: improve comm_split_type scalability
...
This commit includes two changes. First, the locality code has been
factored out to improve readability and maintainability. Second,
instead of looking up each proc using ompi_group_peer_lookup the code
now uses ompi_group_peer_lookup_existing. The code falls back on modex
if a proc doesn't exist. This will prevent MPI_Comm_split_type from
allocating ompi_proc_t's for every process in the job.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-11 13:53:48 -06:00
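The lookup-then-fall-back pattern in this commit message can be sketched as follows. This is an illustrative toy under stated assumptions, not the real Open MPI code: `demo_group_t`, `demo_group_lookup_existing`, `demo_fake_modex`, and `demo_peer_locality` are hypothetical names, and the fake modex simply returns 0 for every rank. The point it shows is that querying a peer's locality does not have to allocate a proc structure for that peer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not ompi_proc_t or ompi_group_t. */
typedef struct demo_proc {
    uint32_t rank;
    uint16_t locality;
} demo_proc_t;

typedef struct demo_group {
    demo_proc_t **procs;  /* a NULL entry means no proc was allocated */
    int           size;
} demo_group_t;

/* Counts how often the (fake) modex path was taken. */
int demo_modex_calls = 0;

/* Return the proc only if it already exists; never allocates. */
demo_proc_t *demo_group_lookup_existing(const demo_group_t *g, int rank)
{
    if (rank < 0 || rank >= g->size) {
        return NULL;
    }
    return g->procs[rank];
}

/* Fake modex query: assumption for this sketch only, reports locality 0. */
int demo_fake_modex(int rank, uint16_t *locality)
{
    (void) rank;
    demo_modex_calls++;
    *locality = 0;
    return 0;
}

/* Prefer an existing proc; otherwise ask the modex without creating one. */
int demo_peer_locality(const demo_group_t *g, int rank, uint16_t *locality)
{
    demo_proc_t *proc;

    assert(NULL != g && NULL != locality);
    proc = demo_group_lookup_existing(g, rank);
    if (NULL != proc) {
        *locality = proc->locality;
        return 0;
    }
    return demo_fake_modex(rank, locality);
}
```

After a query for a rank with no proc, the group's slot for that rank is still NULL, which is the scalability property the commit describes.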
Rolf vandeVaart
90dd1d264b
Fix cuda verbosity messages
2015-09-11 15:44:36 -04:00
Nathan Hjelm
1868b5937c
Merge pull request #889 from hjelmn/sentinel_update
...
Use the low instead of the high bit to indicate a proc is a sentinel
2015-09-11 12:30:27 -06:00
rhc54
c31093ff19
Merge pull request #890 from rhc54/topic/fixpmi
...
Revert "Revert "Fix the handling of cpusets so we get the correct cpu…
2015-09-11 09:25:24 -07:00
Nathan Hjelm
898a0a038c
bml/r2: fix coverity CID 1323765
...
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-09-11 09:39:10 -06:00
Nathan Hjelm
64c8f124fc
Use the low instead of the high bit to indicate a proc is a sentinel
...
The assumption that the high bit is not in use in pointers on any of our
supported platforms was incorrect. A better assumption is that all
ompi_proc_t pointers will be at least 2-byte aligned. This allows us
to use the low bit. To do this we drop the highest bit of the
opal_process_name_t jobid (hope this is ok) and use the low bit to
indicate the proc is really a sentinel.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-09-11 09:32:02 -06:00
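The low-bit tagging idea in this commit message can be sketched in a few lines of C. This is a minimal illustration, not the actual Open MPI implementation: the `demo_*` names and the bit layout (bit 0 as the tag, bits 1 to 31 for a 31-bit jobid, bits 32 to 63 for the vpid) are assumptions made for the example. It relies only on the property stated in the commit: a pointer that is at least 2-byte aligned always has its low bit clear.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a process structure; not the real ompi_proc_t. */
typedef struct demo_proc {
    uint32_t jobid;
    uint32_t vpid;
} demo_proc_t;

/* A slot holds either a real pointer (low bit 0) or a packed name (low bit 1). */
typedef uint64_t demo_slot_t;

int demo_slot_is_sentinel(demo_slot_t slot)
{
    return (slot & 0x1) ? 1 : 0;
}

/* Store a real pointer. Alignment guarantees the low bit is already 0. */
demo_slot_t demo_slot_from_proc(demo_proc_t *proc)
{
    assert(((uintptr_t) proc & 0x1) == 0);
    return (demo_slot_t) (uintptr_t) proc;
}

demo_proc_t *demo_slot_to_proc(demo_slot_t slot)
{
    assert(!demo_slot_is_sentinel(slot));
    return (demo_proc_t *) (uintptr_t) slot;
}

/* Pack a name into a sentinel. The jobid gives up its highest bit so the
 * whole name still fits in 64 bits next to the tag (assumed layout). */
demo_slot_t demo_slot_from_name(uint32_t jobid, uint32_t vpid)
{
    assert((jobid >> 31) == 0);
    return ((demo_slot_t) vpid << 32) | ((demo_slot_t) jobid << 1) | 0x1;
}

uint32_t demo_slot_jobid(demo_slot_t slot)
{
    assert(demo_slot_is_sentinel(slot));
    return (uint32_t) ((slot >> 1) & 0x7fffffffu);
}

uint32_t demo_slot_vpid(demo_slot_t slot)
{
    assert(demo_slot_is_sentinel(slot));
    return (uint32_t) (slot >> 32);
}
```

Testing the low bit is enough to tell the two cases apart, and no assumption about unused high pointer bits is needed.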
Ralph Castain
dc5796b8a1
Revert "Revert "Fix the handling of cpusets so we get the correct cpuset for each local peer. Add the ability to indicate that a modex request is "optional" so we don't call the server if we don't find the value. Take advantage of that to allow the MPI layer to decide that the lack of locality info indicates non-local""
...
Fix the locality computation by correctly computing the vpid of the local peer
This reverts commit open-mpi/ompi@6a8fad49e5.
2015-09-11 08:29:51 -07:00
Ralph Castain
6a8fad49e5
Revert "Fix the handling of cpusets so we get the correct cpuset for each local peer. Add the ability to indicate that a modex request is "optional" so we don't call the server if we don't find the value. Take advantage of that to allow the MPI layer to decide that the lack of locality info indicates non-local"
...
This reverts commit f94f3cda21.
2015-09-11 02:01:25 -07:00
rhc54
d4017d5ed4
Merge pull request #888 from rhc54/topic/pmix
...
Sync to PMIx master
2015-09-11 01:10:13 -07:00
Gilles Gouaillardet
a1627feaf7
coll/ml, bcol: fix prototypes (e.g. use the const modifier)
2015-09-11 13:20:44 +09:00
Ralph Castain
e0a52354d4
Sync to PMIx master at open-mpi/pmix@89680d6663
...
Includes changes to support BigEndian machines
2015-09-10 20:47:40 -07:00
Gilles Gouaillardet
8f2d3aeb65
oshmem: do not include pml/ob1 headers
...
this is an abstraction violation that can cause linker failures
2015-09-11 09:34:10 +09:00
Gilles Gouaillardet
638a59adf3
fix compilation in heterogeneous mode
...
use OPAL_PMIX_GLOBAL instead of PMIX_GLOBAL
2015-09-11 09:23:21 +09:00
rhc54
a4a20a39df
Merge pull request #887 from rhc54/topic/s1
...
Fix the s1 component so direct launch is supported for SLURM
2015-09-10 17:04:08 -07:00
Ralph Castain
a2a15cea8a
Fix the s1 component so direct launch is supported for SLURM
2015-09-10 16:07:37 -07:00
rhc54
3430f154fc
Merge pull request #885 from hppritcha/topic/pmix_not_pmix1xx_u16_prob
...
pmix/~pmix1xx: use u32 for OPAL_PMIX_LOCAL_SIZE
2015-09-10 15:38:54 -07:00
Nathan Hjelm
ad3a2ef6cc
silence warnings introduced by add_procs merge
...
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-10 16:33:52 -06:00
Nathan Hjelm
2a269b52ee
Merge pull request #886 from hjelmn/hwloc_numa_socket
...
opal/hwloc: fix topology detection when socket is above numa
2015-09-10 15:57:20 -06:00
Nathan Hjelm
899bf548a2
opal/hwloc: fix topology detection when socket is above numa
...
The OPAL_PROC_ON_* definitions have been changed from values to
flags. This should not cause any problems as these values were already
used as flags throughout the code base. Note, there will be a
difference between localities produced by the new code and the
old. For example, if a machine does not have a level-3 cache but two
cores share a level-1 or level-2 cache, the level-3 bit will not be set
in the locality and OPAL_PROC_ON_LOCAL_L3CACHE will return 0. Before
this change it would have returned 1.
In addition the OPAL_PROC_ON_LOCAL_* macros have been simplified.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-10 14:17:45 -06:00
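The difference between ordered values and independent flags can be sketched as follows. This is a toy model only: the `DEMO_ON_*` constants, their bit positions, and the helper functions are hypothetical and do not reproduce the real OPAL_PROC_ON_* definitions. It assumes one bit per hardware level, set only when that level exists and is actually shared, so the result does not depend on how the levels are nested in the topology.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, one bit per level (not the real OPAL values). */
enum {
    DEMO_ON_NODE    = 1u << 0,
    DEMO_ON_NUMA    = 1u << 1,
    DEMO_ON_SOCKET  = 1u << 2,
    DEMO_ON_L3CACHE = 1u << 3,
    DEMO_ON_L2CACHE = 1u << 4,
    DEMO_ON_L1CACHE = 1u << 5,
    DEMO_ON_CORE    = 1u << 6
};

typedef uint16_t demo_locality_t;

/* OR together the levels two procs really share. The node bit is always
 * set because this sketch assumes both procs are on the same node. */
demo_locality_t demo_locality_combine(const demo_locality_t *shared_levels,
                                      size_t count)
{
    demo_locality_t loc = DEMO_ON_NODE;
    size_t i;

    assert(NULL != shared_levels || 0 == count);
    for (i = 0; i < count; i++) {
        loc |= shared_levels[i];
    }
    return loc;
}

/* Returns 1 if the given level's bit is set in the locality, else 0. */
int demo_on_local(demo_locality_t loc, demo_locality_t flag)
{
    return (loc & flag) ? 1 : 0;
}
```

With this representation, two cores that share a socket and a level-2 cache on a machine with no level-3 cache report 0 for the level-3 check, matching the behavior change described in the commit message.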
Jeff Squyres
0fd073b69e
Merge pull request #882 from yburette/master
...
Update AUTHORS list.
2015-09-10 15:14:42 -04:00