28326 Commits

Author SHA1 Message Date
Howard Pritchard
30eed9f035 btl/openib: add conditional around an assert
A user trying to build Open MPI with explicit use
of CFLAGS on the make command line hit problems.

This fixes one of the problems.

https://www.mail-archive.com/users@lists.open-mpi.org//msg32241.html

Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
2018-05-04 14:17:07 -06:00
Edgar Gabriel
4092138ad9
Merge pull request #4987 from raafatfeki/master
fcoll/dynamic_gen2: use hindexed constructor on the sender side
2018-03-29 08:03:32 -05:00
raafatfeki
100677721d fcoll/dynamic_gen2: use hindexed constructor on the sender side
Instead of copying data into a temporary buffer before sending, use a derived datatype to describe the data that needs to be sent during a cycle in the collective I/O operation.

Signed-off-by: raafatfeki <fekiraafat@gmail.com>
2018-03-28 14:37:30 -05:00
Nathan Hjelm
e79debc320 osc/rdma: fix overflow in offset calculation
This commit fixes a bug in osc/rdma that can occur if the total size
of the shared memory segment gets larger than 4 GiB. The bug was
caused by a typo. The type of my_base_offset should have been size_t,
not int.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-03-27 09:33:44 -06:00
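
The class of bug described above can be illustrated with a minimal sketch. The helper names `region_offset` and `fits_in_int` are hypothetical and are not taken from the osc/rdma source; only the idea (computing a byte offset in `size_t` rather than `int`) comes from the commit message.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of a rank's region inside one
 * shared segment. Doing the arithmetic in size_t keeps the result
 * exact on 64-bit builds even when the segment exceeds 4 GiB. */
size_t region_offset(size_t rank, size_t region_size)
{
    return rank * region_size;
}

/* Hypothetical helper: returns 1 if the offset could be stored in an
 * int without truncation, 0 otherwise. */
int fits_in_int(size_t offset)
{
    return offset <= (size_t) INT_MAX;
}

/* Stores an offset only after checking that the narrowing is safe. */
int checked_narrow(size_t offset)
{
    assert(fits_in_int(offset));
    return (int) offset;
}
```

On a 64-bit build, an offset such as `region_offset(5, (size_t) 1 << 30)` does not fit in an `int`, which is why the variable holding it needs the wider type.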
Nathan Hjelm
f7faacca4e osc/rdma: fix 32-bit builds
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-03-27 09:16:04 -06:00
Jeff Squyres
06af6f1c4c
Merge pull request #4962 from jsquyres/pr/cid-fixes
A bunch of CID fixes
2018-03-26 22:30:31 -04:00
Ralph Castain
f92acd735b
Merge pull request #4965 from rhc54/topic/rank
Fix breakage in ranking system and silence OSC/RDMA warnings
2018-03-26 19:10:36 -05:00
Ralph Castain
d644f7ee26 Correctly fix the ranking policy
Shorten the loops as much as possible - if someone wants to further optimize, they are welcome to do so.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-03-26 16:06:46 -07:00
Jeff Squyres
5360035995 topo/treematch: fix CID 1416327
Free things in the right order so that we don't access
memory after it is freed.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:26:17 -07:00
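
A minimal sketch of the ordering issue, assuming a hypothetical `tree_t` type that owns a heap-allocated array (this is not the treematch data structure): the owned member has to be released while the owner is still valid.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container that owns a heap-allocated array. */
typedef struct {
    int *values;
    size_t count;
} tree_t;

tree_t *tree_new(size_t count)
{
    tree_t *t = malloc(sizeof(*t));
    if (NULL == t) {
        return NULL;
    }
    t->values = calloc(count, sizeof(*t->values));
    if (NULL == t->values) {
        free(t);
        return NULL;
    }
    t->count = count;
    return t;
}

void tree_free(tree_t *t)
{
    if (NULL == t) {
        return;
    }
    /* Release the member first: reading t->values after free(t)
     * would be an access to freed memory. */
    free(t->values);
    free(t);
}
```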
Jeff Squyres
08ceb66a19 osc/pt2pt: fix (effectively false positive) CID 1402113
This will almost certainly never happen, but be defensive and
guarantee that we never return an uninitialized variable.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:26:17 -07:00
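
The defensive pattern mentioned above (never returning an uninitialized variable) might look like the following sketch. The function `find_first_negative` and the `EX_*` codes are hypothetical illustrations, not code from osc/pt2pt.

```c
#include <assert.h>
#include <stddef.h>

#define EX_SUCCESS 0
#define EX_ERROR   (-1)

/* Hypothetical search routine. ret is given a value at its
 * declaration, so every path to the return statement, including the
 * "loop never runs" path, yields a defined result. */
int find_first_negative(const int *values, size_t count, size_t *index_out)
{
    int ret = EX_ERROR;

    assert(index_out != NULL);
    for (size_t i = 0; NULL != values && i < count; ++i) {
        if (values[i] < 0) {
            *index_out = i;
            ret = EX_SUCCESS;
            break;
        }
    }
    return ret;
}
```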
Jeff Squyres
8c419294a8 btl/tcp: fix CID 710596
sizeof(addrs[0].addr_inet)==16 (so that it can handle IPv6 addresses),
but the memory that we are copying from (my_ss->sin_addr) is only 4
bytes long.  Don't copy beyond the end of that source buffer.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:21:22 -07:00
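
As a sketch of the fix described above: the copy length should be bounded by the source size (4 bytes for IPv4), not by the 16-byte destination. The `addr_slot_t` type and `copy_addr` function are hypothetical stand-ins, not the btl/tcp definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical address slot, sized so it can also hold IPv6. */
typedef struct {
    uint8_t addr_inet[16];
} addr_slot_t;

/* Copies at most src_len bytes, so a 4-byte IPv4 source is never read
 * past its end; the rest of the slot is zeroed. Returns the number of
 * bytes copied. */
size_t copy_addr(addr_slot_t *slot, const void *src, size_t src_len)
{
    size_t n;

    assert(slot != NULL && src != NULL);
    n = src_len < sizeof(slot->addr_inet) ? src_len : sizeof(slot->addr_inet);
    memset(slot->addr_inet, 0, sizeof(slot->addr_inet));
    memcpy(slot->addr_inet, src, n);
    return n;
}
```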
Jeff Squyres
9de750a280 io/ompio: fix CID 1269889
Free some memory upon error conditions.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:21:21 -07:00
Jeff Squyres
dca66b9775 comm_join: fix CID 1323170
Ensure that the port name is always NULL-terminated.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:21:21 -07:00
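
One common way to guarantee termination, shown here as a sketch: `strncpy()` does not write a terminator when the source fills the buffer, so the last byte is set explicitly. `PORT_NAME_MAX` and `copy_port_name` are hypothetical names; the real code would presumably size the buffer with `MPI_MAX_PORT_NAME`.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical buffer size for illustration only. */
#define PORT_NAME_MAX 16

/* Copies src into dst, truncating if needed, and always leaves dst
 * terminated. */
void copy_port_name(char dst[PORT_NAME_MAX], const char *src)
{
    assert(dst != NULL && src != NULL);
    strncpy(dst, src, PORT_NAME_MAX - 1);
    dst[PORT_NAME_MAX - 1] = '\0';
}
```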
Jeff Squyres
6319292170 fcoll/static: fix CID 1413066
local_iov_array is unconditionally allocated, so unconditionally
de-allocate it, too.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:21:21 -07:00
Jeff Squyres
2968ffa296 fcoll/static: remove useless/dead code
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:21:21 -07:00
Jeff Squyres
3003be14f3 btl/sm: fix CID 1415105
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:21:21 -07:00
Jeff Squyres
a17f4afdc7 btl/tcp: fix CID 1416634
Fix resource leak in the TCP BTL.  Also add a little defensive programming.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:21:21 -07:00
Jeff Squyres
8e925b4f17 fbtl/posix: fix CID 1419954
Ensure that ret_code is initialized.  This problem will likely never occur
in practice, but we might as well be defensive about it.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:21:21 -07:00
Jeff Squyres
124208198c osc/rdma: fix CID 1424327
Fix minor memory leak.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:21:21 -07:00
Nathan Hjelm
1c75aa82fc use-mpi-f08: fix rma function signatures
The various RMA functions need to have the asynchronous property on
all buffers. This property was missing and some buffers were
incorrectly marked as intent(in). This commit fixes the function
signatures.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-03-26 15:11:07 -06:00
Nathan Hjelm
7f761d8434 opal_free_list: use lifo atomic functions in opal_free_list_wait_mt
This commit fixes a multi-threading bug when using the thread-safe
free list functions. opal_free_list_wait_mt() was using the
conditional version of opal_lifo_pop() and not the thread-safe call.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-03-26 10:16:42 -06:00
Ralph Castain
19e85a3298
Merge pull request #4966 from rhc54/topic/platform
Update default MCA params in platform file
2018-03-25 20:24:41 -05:00
Ralph Castain
538fd18fad Update default MCA params in platform file
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-03-25 17:14:01 -07:00
Ralph Castain
fd704d8708 Add NEWS item
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-03-25 16:20:38 -07:00
Ralph Castain
3a93b535ec Silence the flood of OSC/RDMA warnings
Fixes #4950

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-03-25 16:12:41 -07:00
Ralph Castain
322f6c5056 Fix a breakage in the ranking system
While it may be faster to reverse the order of the assignment loops, it also results in the wrong answer

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-03-25 15:55:56 -07:00
Ralph Castain
c1c0c02f06
Merge pull request #4964 from rhc54/ompi/reset
Reset OMPI master to PMIx master
2018-03-25 12:40:56 -05:00
Ralph Castain
8454fc8a65 Allow oversubscription on managed allocations
Fixes https://github.com/pmix/pmix-reference-server/issues/42

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-03-25 08:37:51 -07:00
Ralph Castain
e443adc7a1 Reset OMPI master to PMIx master
Track PMIx master instead of the reference server - fixes problem of external PMIx master builds.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-03-25 08:36:46 -07:00
Artem Polyakov
77ff99e9ee
Merge pull request #4933 from karasevb/timings_update
timings: added new timing points
2018-03-25 00:10:49 -07:00
Jeff Squyres
65525434e3
Merge pull request #4961 from jsquyres/pr/cid-1430413-fix
util/fd: fix CID 1430413
2018-03-24 08:45:08 -05:00
Jeff Squyres
06ec93a61a util/fd: fix CID 1430413
Take multiple defensive steps to fix CID 1430413 and ensure that ret
is always initialized upon return.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-24 04:25:26 -07:00
Jeff Squyres
871e5c76bc
Merge pull request #4960 from jsquyres/pr/warnings-fixes
Coverity fix + compiler warning fixes
2018-03-23 14:47:56 -05:00
Jeff Squyres
c3adcb05eb Miscellaneous compiler warnings fixes
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-23 11:45:30 -07:00
Artem Polyakov
714c8c7381
Merge pull request #4957 from open-mpi/host_filtering
plm/base: fixed the hosts filtering
2018-03-23 10:27:04 -07:00
Jeff Squyres
f66ac43fbc opal/util: fix CID 1430381
Fix minor resource leak.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-23 08:48:11 -07:00
Nathan Hjelm
5f7ff5307e fcoll/two_phase: do not use removed function (MPI_Address)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-03-23 08:43:24 -06:00
Boris Karasev
6afc7099a0 plm/base: fixed the hosts filtering
Reset the `ORTE_NODE_FLAG_MAPPED` flag after hosts filtering; this
flag is used subsequently and can affect the node mapping logic.

Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2018-03-23 09:41:16 +03:00
Jeff Squyres
1e56023ea4
Merge pull request #4951 from jsquyres/pr/contribution-consolidation
CONTRIBUTING: Consolidate the 2 files
2018-03-22 10:53:08 -05:00
Jeff Squyres
8ada4e48a5 CONTRIBUTING: Consolidate the 2 files
We accidentally had 2 CONTRIBUTING.md files.  Consolidate the content
of both of them.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-22 08:56:35 -05:00
Jeff Squyres
023a4a82d3
Merge pull request #4942 from jsquyres/pr/tcp-btl-help-message-updates
TCP help message updates
2018-03-22 08:53:04 -05:00
Jeff Squyres
0f8077ace6 oob/tcp: add show_help message about version mismatch
Be more explicit about version mismatch between ORTE processes.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-21 20:18:28 -07:00
Jeff Squyres
a15d8233c9
Merge pull request #3434 from dsharma283/pr-3431
ompi/opal: add support for HDR link speeds
2018-03-21 21:57:20 -05:00
Jeff Squyres
40afd525f8 btl/tcp: make error messages more specific
Convert some verbose messages to opal_show_help() messages.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-21 19:34:03 -07:00
Jeff Squyres
e0d86b1c72 opal/util/fd: add opal_fd_get_peer_name()
Returns a string name (either a resolved name, or the IPv4/IPv6 address
as a string if unresolvable).  The caller is responsible for freeing the
string.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-21 19:34:03 -07:00
Devesh Sharma
90e9b22196 ompi/opal: add support for HDR link speeds
This patch enables the use of adapters with HDR speeds.
issue id 3431

Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
2018-03-21 19:15:41 -07:00
Edgar Gabriel
c23dff24bc
Merge pull request #4940 from edgargabriel/topic/ompi-cleanup-march-2018
Topic/ompio cleanup march 2018
2018-03-21 13:47:41 -05:00
Edgar Gabriel
36747cca67 io/ompio: disable the fcoll timing by default
Somehow the flag that enables gathering performance data
on collective I/O operations was accidentally changed to 1.
It should be 0 (false) by default.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-03-21 11:34:35 -05:00
Edgar Gabriel
aae8c6c6ad remove addproc sharedfp component
Never got around to turning this sharedfp component into anything
usable. It can easily be restored if necessary.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-03-21 11:27:01 -05:00
Edgar Gabriel
e703ac2da8 remove plfs components
plfs components are at this point not utilized by anybody as far as I know.
Easy to bring back if we want to.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-03-21 11:27:01 -05:00