Commit graph

23599 commits

Author SHA1 Message Date
Nathan Hjelm
f6920aa916 osc/rdma: check for usable btls during query
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-22 17:08:28 -06:00
Nathan Hjelm
903762e194 osc/sm: fix pscw synchronization
The osc/sm component was using a simple counter to determine if all
expected posts had arrived to start a PSCW access epoch. This is
incorrect as a post may arrive from a peer that isn't part of the
current start group. There are many ways this could have been fixed.
This commit adds an n^2 bitmap. When a process posts it sets a bit in
the bitmap associated with the access rank to indicate the post is
complete. The access rank checks for and clears the bits associated
with all the processes in the start group.

The bitmap requires comm_size ^ 2 bits of space. This should be
manageable as most nodes have relatively small numbers of processes. If
this changes another algorithm can be implemented.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-22 16:00:27 -06:00
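
The bitmap scheme described in the commit message above can be sketched as follows. This is a minimal illustration, not the osc/sm code: the type `post_bitmap_t` and the functions `post_bitmap_init`, `post_bitmap_post`, and `post_bitmap_try_start` are hypothetical names, and the sketch assumes the bitmap lives in ordinary process memory with C11 atomics rather than in a shared-memory segment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical layout: one row per access rank, one bit per posting rank,
 * so the whole bitmap holds roughly comm_size * comm_size bits. */
typedef struct {
    int comm_size;
    size_t words_per_row;      /* 64-bit words needed for one row */
    _Atomic uint64_t *bits;    /* comm_size * words_per_row words */
} post_bitmap_t;

static int post_bitmap_init(post_bitmap_t *bm, int comm_size)
{
    assert(comm_size > 0);
    bm->comm_size = comm_size;
    bm->words_per_row = ((size_t) comm_size + 63) / 64;
    /* calloc leaves every bit cleared: no posts have arrived yet */
    bm->bits = calloc((size_t) comm_size * bm->words_per_row, sizeof(*bm->bits));
    return (NULL == bm->bits) ? -1 : 0;
}

/* Called on behalf of `poster`: mark its post as complete for `access_rank`. */
static void post_bitmap_post(post_bitmap_t *bm, int access_rank, int poster)
{
    assert(access_rank >= 0 && access_rank < bm->comm_size);
    assert(poster >= 0 && poster < bm->comm_size);
    _Atomic uint64_t *row = bm->bits + (size_t) access_rank * bm->words_per_row;
    atomic_fetch_or(&row[poster / 64], UINT64_C(1) << (poster % 64));
}

/* Called by `access_rank`: succeed only if every rank in the start group has
 * posted, then clear exactly those bits. Posts from ranks outside the group
 * are left in place for a later epoch. */
static bool post_bitmap_try_start(post_bitmap_t *bm, int access_rank,
                                  const int *group, int group_size)
{
    _Atomic uint64_t *row = bm->bits + (size_t) access_rank * bm->words_per_row;

    for (int i = 0; i < group_size; ++i) {
        uint64_t mask = UINT64_C(1) << (group[i] % 64);
        if (0 == (atomic_load(&row[group[i] / 64]) & mask)) {
            return false;   /* at least one expected post is still missing */
        }
    }
    for (int i = 0; i < group_size; ++i) {
        uint64_t mask = UINT64_C(1) << (group[i] % 64);
        atomic_fetch_and(&row[group[i] / 64], ~mask);
    }
    return true;
}
```

Unlike a plain counter, a post from a rank outside the start group sets a bit that `post_bitmap_try_start` never inspects, so it cannot be mistaken for one of the expected posts.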
Nathan Hjelm
5553dba0c4 Merge pull request #919 from hjelmn/accumulate_ops
ompi/win: save value of accumulate_ops info key on window
2015-09-22 10:50:50 -06:00
Nathan Hjelm
036395dc0f osc/pt2pt: fix typos
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-22 10:30:01 -06:00
Nathan Hjelm
974061c38f osc: fixed issues identified by coverity
Fix CID 1324733: Null pointer dereferences  (FORWARD_NULL)
Fix CID 1324734: Null pointer dereferences  (FORWARD_NULL)
Fix CID 1324735: Null pointer dereferences  (FORWARD_NULL)
Fix CID 1324736: Null pointer dereferences  (FORWARD_NULL)
Fix CID 1324737: Null pointer dereferences  (FORWARD_NULL)
Fix CID 1324751: Memory - illegal accesses  (USE_AFTER_FREE)
Fix CID 1324750: (USE_AFTER_FREE)
Fix CID 1324749: Memory - corruptions  (USE_AFTER_FREE)
Fix CID 1324748: Memory - illegal accesses  (USE_AFTER_FREE)
Fix CID 1324747: (USE_AFTER_FREE)
Fix CID 1324746: Memory - corruptions  (USE_AFTER_FREE)

Add missing return on an error path.

Fix CID 1324745: Code maintainability issues  (UNUSED_VALUE)

Ignore return code from barrier. It was not being used anyway.

Fix CID 1324738: Null pointer dereferences  (FORWARD_NULL)
Fix CID 1324741: Null pointer dereferences  (REVERSE_INULL)

module->selected_btl cannot be NULL in osc/rdma during normal
operation. Removed the unnecessary NULL check.

Fix CID 1324752: Memory - illegal accesses  (USE_AFTER_FREE)

Move ompi_osc_pt2pt_module_lock_remove to before the lock is freed.

Fix CID 1324744: Uninitialized variables  (UNINIT)
Fix CID 1324743: Uninitialized variables  (UNINIT)

This array is not used uninitialized but there is no reason not to use
calloc here to silence the warning.

The following CID is a false positive: 1324742. I will mark it as such in
Coverity.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-22 09:23:39 -06:00
rhc54
1197455999 Merge pull request #924 from rhc54/topic/odls
Eliminate malloc by utilizing /proc/self/fd - optimization
2015-09-22 07:35:30 -07:00
bosilca
733328aa4d Merge pull request #916 from rolfv/pr/fix-coll-cuda-const-warnings
Fix warnings due to missing const
2015-09-22 16:34:40 +02:00
Ralph Castain
f28448702a Eliminate malloc by utilizing /proc/self/fd - optimization 2015-09-22 07:24:54 -07:00
igor-ivanov
a9fc53cf20 Merge pull request #923 from igor-ivanov/pr/mpisync
ompi/tools: Add O(logN) algorithm for data collection
2015-09-22 16:11:25 +03:00
rhc54
2b45969d16 Merge pull request #921 from rhc54/topic/pmix
Sync to PMIx master
2015-09-22 05:52:47 -07:00
Igor Ivanov
53be890c03 ompi/tools: Add O(logN) algorithm for data collection
Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>
2015-09-22 15:21:37 +03:00
Nathan Hjelm
6b2307c88a Merge pull request #918 from hjelmn/missing_man_pages
ompi: add missing man pages
2015-09-22 00:27:23 -06:00
Ralph Castain
4c654ffd94 Sync to PMIx master 2015-09-21 21:27:06 -07:00
rhc54
4899e7fe25 Merge pull request #920 from rhc54/topic/dvm
Fix orte-submit so it allows application procs to select the correct …
2015-09-21 21:13:30 -07:00
Ralph Castain
f872e99315 Fix orte-submit so it allows application procs to select the correct ess component. Protect orte_data_server from multiple calls to finalize. 2015-09-21 20:31:57 -07:00
Nathan Hjelm
6751409c32 ompi/win: save value of accumulate_ops info key on window
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-21 16:37:29 -06:00
Nathan Hjelm
d6724f2828 ompi: add missing man pages
This commit adds man pages for the MPI_Win_allocate and MPI_Win_allocate_shared
MPI-3 functions. The man page for MPI_Win_create has also been updated to
indicate support for the same_size and same_disp_unit info keys.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-21 16:21:28 -06:00
Howard Pritchard
ef6cf50687 Merge pull request #917 from hppritcha/topic/alps_warning_swat
oob/alps: swat compiler warning
2015-09-21 16:17:30 -06:00
Howard Pritchard
8d7e759b85 oob/alps: swat compiler warning
swat some alps related compiler warnings when using --enable-picky

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-09-21 14:24:26 -07:00
Rolf vandeVaart
2c51faa58d Fix warnings due to missing const 2015-09-21 14:18:44 -04:00
Mike Dubman
23c41a0320 Merge pull request #908 from igor-ivanov/pr/oshmem-check
Recovering oshmem functionality
2015-09-21 19:50:24 +03:00
Nathan Hjelm
60c2b0df48 Merge pull request #903 from hjelmn/new_osc_rdma
osc/rdma: add true RDMA one-sided component
2015-09-21 10:29:11 -06:00
Nathan Hjelm
88100ad670 Merge pull request #902 from hjelmn/new_osc
osc/pt2pt: reduce memory footprint of windows
2015-09-21 10:28:41 -06:00
rhc54
ee17d73bd3 Merge pull request #915 from rhc54/topic/fixfd
As Jeff proposed, change the check to looking for the filename's firs…
2015-09-21 09:18:13 -07:00
Igor Ivanov
7de0537a1d oshmem: Add help message for fatal issues in scoll:mpi and scoll:fca
Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>
2015-09-21 18:50:20 +03:00
Igor Ivanov
ec7cd13a81 oshmem: Fix compilation warnings 2015-09-21 18:50:20 +03:00
Igor Ivanov
69c82df781 oshmem/proc: Sanity check for oshmem_proc_t size
Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>
2015-09-21 18:50:12 +03:00
Ralph Castain
92ae386a34 As Jeff proposed, change the check to looking for the filename's first character to be a digit 2015-09-21 08:22:58 -07:00
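
The file-descriptor commits in this log (close only the descriptors that are actually open, enumerate them through /proc/self/fd, and accept only entries whose name starts with a digit) can be illustrated with a short sketch. This is not the odls code: the function name `close_open_fds_from` is hypothetical, and the sketch assumes a Linux system where /proc/self/fd is available.

```c
#define _POSIX_C_SOURCE 200809L   /* for dirfd() */

#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdlib.h>
#include <unistd.h>

/* Close every open descriptor >= lowfd by walking /proc/self/fd instead of
 * looping over the whole descriptor range. Returns the number of descriptors
 * closed, or -1 if /proc/self/fd cannot be opened (the caller would then
 * fall back to some other strategy). Hypothetical helper for illustration. */
static int close_open_fds_from(int lowfd)
{
    assert(lowfd >= 0);

    DIR *dir = opendir("/proc/self/fd");
    if (NULL == dir) {
        return -1;
    }

    int dir_fd = dirfd(dir);   /* the descriptor used for the scan itself */
    int closed = 0;
    struct dirent *entry;

    while (NULL != (entry = readdir(dir))) {
        /* skip "." and ".." and anything else that is not a number */
        if (!isdigit((unsigned char) entry->d_name[0])) {
            continue;
        }
        long fd = strtol(entry->d_name, NULL, 10);
        if (fd < lowfd || fd == dir_fd) {
            continue;
        }
        if (0 == close((int) fd)) {
            ++closed;
        }
    }

    closedir(dir);             /* also releases dir_fd */
    return closed;
}
```

A child process would typically call something like `close_open_fds_from(3)` between fork and exec so that only stdin, stdout, and stderr are inherited. The directory entries are read one at a time, so no heap buffer sized to the descriptor limit is needed.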
Rolf vandeVaart
e33d44a29a Merge pull request #898 from rolfv/pr/add-empty-cache-feature
Add ability for user to empty the CUDA IPC registration cache when it is full
2015-09-21 08:50:04 -04:00
Igor Ivanov
9f12098ab8 oshmem: Remove needless code
Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>
2015-09-21 10:44:24 +03:00
Igor Ivanov
ca8c3eebea oshmem: Abort application in case single scoll:mpi is selected
scoll:mpi does not have barrier and should be selected with
any other scoll component.

Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>
2015-09-21 10:42:54 +03:00
Jeff Squyres
73b399ab78 Merge pull request #913 from rhc54/topic/config
Do not use "==" in configure "test" calls
2015-09-21 09:09:58 +02:00
Ralph Castain
0b3f4c55f8 Do not use "==" in configure "test" calls
Thanks to Kevin Buckley for pointing it out
2015-09-20 21:34:27 -07:00
rhc54
13def2a69b Merge pull request #911 from rhc54/topic/cleanup
Cleanup the odls "close file descriptor" commit to conform to OMPI co…
2015-09-20 07:01:39 -07:00
Howard Pritchard
1367a442b6 Merge pull request #910 from hppritcha/topic/odls_alps_use_907_stuff
odls/alps: do smarter close of fds in child
2015-09-20 07:37:55 -06:00
Ralph Castain
c167acc5a7 Cleanup the odls "close file descriptor" commit to conform to OMPI coding standards and remove memory leaks 2015-09-19 20:46:36 -07:00
rhc54
984418dd83 Merge pull request #907 from plesn/close-used-fds
odls: close only used file descriptors at fork/exec
2015-09-19 20:26:38 -07:00
Howard Pritchard
a31cc21bea odls/alps: do smarter close of fds in child
Use a modified variant of #907.  Thanks to plesn
for noticing this.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-09-19 14:17:05 -07:00
Jeff Squyres
7bf364ff04 usnic: create blocking libevent for progress thread emulation
In the v1.10 opal_progress_thread emulation, ensure to create a
blocking libevent timer far in the future.  Without that,
opal_event_loop() will return immediately (and therefore the progress
thread spins hard, stealing CPU cycles).
2015-09-18 11:53:03 -07:00
Jeff Squyres
7cb6c2fcc2 usnic: put the HWLOC #if's back to preserve compat with v1.10
We try to keep the source code the same between master and v1.10.  So
put the #if's back for OPAL_HAVE_HWLOC (and just hard-code it to 1 on
master) so that this code is also compilable in v1.10.
2015-09-18 11:53:03 -07:00
Piotr Lesnicki
1dd5487fae odls: close only used file descriptors at fork/exec 2015-09-18 16:44:57 +02:00
Igor Ivanov
fb5d934e2f oshmem/proc: Refactor oshmem_proc to meet new add_proc changes
ompi has a new mpi_add_procs_cutoff argument that can control
creation of ompi_proc_t, but we should be confident that all
ompi_proc_t objects exist during oshmem_group_all creation.
Probably it could be done in a more flexible way later.

Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>
2015-09-18 17:40:21 +03:00
Edgar Gabriel
01fcfb08fe do not set the contiguous flag in two_phase_file_read_all. This optimization
needs some more debugging for the two_phase component, and is disabled
for two_phase_file_write_all as well.
2015-09-18 09:30:50 -05:00
Edgar Gabriel
3734a38370 this file should have been part of the previous commit. for removing io_ompio_nbc.[ch] 2015-09-18 09:28:25 -05:00
Edgar Gabriel
cf46a6bd4d remove the io_ompio_nbc.[ch] files, they are not used anymore at this point in time. 2015-09-18 09:26:25 -05:00
Gilles Gouaillardet
a611274704 pml: fix commit open-mpi/ompi@6e6a3e965c
do not use the const modifier for allocator nor recv buffers
2015-09-18 09:54:18 +09:00
Rolf vandeVaart
7da614c75e Add ability for user to empty the CUDA IPC registration cache when it is full 2015-09-17 16:42:16 -04:00
Jeff Squyres
567c9e3a5b mtl_ofi_component.c: add missing argv.h header 2015-09-17 10:05:05 -07:00
Igor Ivanov
f437f4012e Revert "scoll/mpi: work around bug in oshmem/proc design"
This workaround is needless after oshmem/proc refactoring

This reverts commit 202c6a38e4.
2015-09-17 19:01:24 +03:00
Igor Ivanov
4b8d9b8eff oshmem/proc: Refactor proc component
Most functionality of oshmem_proc duplicates ompi_proc. In addition
to that, the current logic does not allow oshmem initialization
w/o ompi startup.
So this refactoring avoids code duplication, decreases used
memory and makes oshmem support easier.
Now oshmem_proc is a transparent ompi_proc structure that can be
extended by oshmem specific data.

Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>
2015-09-17 18:49:00 +03:00