Commit graph

8154 commits

Author SHA1 Message Date
Gilles Gouaillardet
c05b271c68 man: fix a trivial typo in MPI_Neighbor_allgather.3in 2015-05-15 16:02:01 +09:00
Gilles Gouaillardet
1488e82efd osc/pt2pt: enable heterogeneous support 2015-05-14 16:42:48 +09:00
Todd Kordenbrock
c42e277385 mtl-portals4: thread multiple updates
When activating short receive blocks on the overflow list, remove
the PTL_ME_EVENT_LINK_DISABLE flag so the event gets generated.
Without PTL_EVENT_LINK, the block status can't reach the activated
state.

Replace #ifdef with #if for Open MPI configure booleans, because
Open MPI configure booleans are always defined and the value must
be checked.
2015-05-13 17:06:18 -05:00
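
The difference between `#ifdef` and `#if` described in this commit can be illustrated with a minimal sketch. The macro name `EXAMPLE_ENABLE_FEATURE` is hypothetical and stands in for a configure boolean that is always defined, as either 0 or 1.

```c
#include <assert.h>

/* Hypothetical configure boolean: always defined, as 0 or 1.
 * Here it is 0, meaning "feature disabled". */
#define EXAMPLE_ENABLE_FEATURE 0

/* Misleading test: #ifdef only asks whether the macro exists. */
static int feature_enabled_ifdef(void)
{
#ifdef EXAMPLE_ENABLE_FEATURE
    return 1; /* taken even though the value is 0 */
#else
    return 0;
#endif
}

/* Intended test: #if evaluates the macro's value. */
static int feature_enabled_if(void)
{
#if EXAMPLE_ENABLE_FEATURE
    return 1;
#else
    return 0;
#endif
}
```

With the macro defined to 0, the `#ifdef` variant still reports the feature as enabled, while the `#if` variant reports it as disabled.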
Yohann Burette
27f1884cf8 mtl/ofi: Reworked header files. Added compat to ease maintenance. 2015-05-12 15:47:50 -07:00
rhc54
b59fa14004 Merge pull request #583 from rhc54/topic/mallocwarnings
Silence malloc(0) warnings reported by Lisandro
2015-05-12 13:37:38 -07:00
Ralph Castain
9a70765f27 Silence malloc(0) warnings reported by Lisandro 2015-05-12 12:38:58 -07:00
Nathan Hjelm
427aebbaca Fix cuda support MCA variables
This commit fixes some issues with the cuda support parameters. There
were a couple of duplicate registrations and an incorrect synonym (one
variable was made a synonym of mpi_preconnect_mpi).

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-12 09:52:51 -06:00
Ryan Grant
bbeaf41a52 Merge pull request #580 from tkordenbrock/topic/mtl.add.status.to.short.recv.blocks
mtl-portals4: add status to short recv blocks to coordinate out of or…
2015-05-11 13:44:45 -06:00
Ryan Grant
265682bdb9 Merge pull request #581 from tkordenbrock/topic/remove.overlapping.multiMD.code
portals4: use a single Memory Descriptor to cover all of memory
2015-05-11 13:20:32 -06:00
George Bosilca
78f5f0f8a9 Show the name of the collective that failed to get initialized. 2015-05-11 15:10:37 -04:00
Todd Kordenbrock
9df163f116 portals4: use a single Memory Descriptor to cover all of memory
In days past, some implementations of Portals4 could not cover all
of memory with a single Memory Descriptor so multiple large
overlapping Memory Descriptors were created.  Because none of the
current implementations have this limitation (and no future
implementations should either), this commit removes the overlapping
Memory Descriptors code.
2015-05-11 11:49:41 -05:00
Todd Kordenbrock
074583060d mtl-portals4: add status to short recv blocks to coordinate out of order events
If OMPI is initialized as thread multiple, then it is possible for
Portals events to be processed out of order by different threads.
Out of order events could lead to reactivation of the block
(PTL_EVENT_AUTO_FREE) before the block is removed from the active
list (PTL_EVENT_AUTO_UNLINK).  This commit adds a status field to
ompi_mtl_portals4_recv_short_block_t that coordinates these events.
2015-05-11 11:48:25 -05:00
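
One way a status field can coordinate two events that may arrive in either order is sketched here. This is an illustration only: the enum values, struct, and function names are hypothetical and are not taken from `ompi_mtl_portals4_recv_short_block_t`, and a real thread-multiple implementation would also need a lock or atomic update around the status change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical status values for a short receive block. */
typedef enum {
    EXAMPLE_BLOCK_ACTIVE,      /* linked, neither event processed yet */
    EXAMPLE_BLOCK_UNLINK_SEEN, /* only the unlink event processed */
    EXAMPLE_BLOCK_FREE_SEEN,   /* only the free event processed */
    EXAMPLE_BLOCK_IDLE         /* both processed, safe to reactivate */
} example_block_status_t;

typedef struct {
    example_block_status_t status;
} example_short_block_t;

/* Each handler returns true only when it is the second of the two
 * events, i.e. when the block may be reactivated. */
static bool example_on_auto_unlink(example_short_block_t *block)
{
    if (block->status == EXAMPLE_BLOCK_FREE_SEEN) {
        block->status = EXAMPLE_BLOCK_IDLE;
        return true;
    }
    block->status = EXAMPLE_BLOCK_UNLINK_SEEN;
    return false;
}

static bool example_on_auto_free(example_short_block_t *block)
{
    if (block->status == EXAMPLE_BLOCK_UNLINK_SEEN) {
        block->status = EXAMPLE_BLOCK_IDLE;
        return true;
    }
    block->status = EXAMPLE_BLOCK_FREE_SEEN;
    return false;
}
```

Whichever event is processed first only records itself; reactivation is deferred until the other event has also been seen.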
Gilles Gouaillardet
650289bc33 romio314: update one more romio->romio314 name
Also missed this in open-mpi/ompi@db257cdbc0.
2015-05-08 18:26:33 +09:00
Ralph Castain
6e95bcd583 Fix typo in oob_tcp.c when IPV6 enabled. Cleanup a few other warnings, including a type in coll_sm that prevented that component from registering its MCA params! 2015-05-07 21:05:08 -07:00
Gilles Gouaillardet
f1258c3b6c ompi/errhandler: make most ompi_err_* variables static
Thanks @hjelmn for pointing this out!
2015-05-08 10:11:59 +09:00
Nathan Hjelm
f0e650fef5 Rename internal error code variables in errcode-internal.c
The renamed variables used the same identifiers as variables in
errcode.c. To avoid confusion rename the variables to end in _intern.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-08 10:11:59 +09:00
Gilles Gouaillardet
9d56b85b55 initialize common symbols from ompi 2015-05-08 10:11:58 +09:00
Gilles Gouaillardet
dd572a0838 Fix --with-fortran=... logic 2015-05-08 09:23:55 +09:00
Gilles Gouaillardet
ab148e4e0c romio314: update one more romio->romio314 name
Also missed this in open-mpi/ompi@db257cdbc0.
2015-05-08 09:12:22 +09:00
Jeff Squyres
b3d89cf7b0 romio314: update one more romio->romio314 name
Missed this in db257cdbc0.
2015-05-07 09:40:45 -07:00
George Bosilca
3af8dfd3e2 Fix an overwrite of the args buffer identified by Lisandro Dalcin. 2015-05-07 09:50:39 -04:00
Jeff Squyres
691b4ec1e5 romio314: whitespace cleanup
No code changes
2015-05-05 06:23:59 -07:00
Jeff Squyres
db257cdbc0 romio314: adhere to the prefix rule
Rename all files and symbols from "io_romio" to "io_romio314".  This
fixes --disable-dlopen builds (because they were missing
the mca_io_romio314_component symbol).
2015-05-05 06:23:59 -07:00
Devendar Bureddy
88eb1fa936 HCOLL: refactoring hcoll_init 2015-05-04 22:03:36 +03:00
Jeff Squyres
8127c24f30 romio314/Makefile.am: whitespace cleanup
No code changes.
2015-05-04 07:20:11 -07:00
Jeff Squyres
332bca7183 romio314/Makefile.am: name the component library properly 2015-05-04 07:20:11 -07:00
Howard Pritchard
74089f5e3a Merge pull request #558 from ggouaillardet/refresh/romio314
Refresh/romio314
2015-05-01 14:23:43 -06:00
Gilles Gouaillardet
ef4b6203a4 Correctly unpack datatype.
Alignment requirements were relaxed in open-mpi/ompi@33a3ace874
so correctly handle this when unpacking a datatype.
2015-05-01 17:08:15 +09:00
Gilles Gouaillardet
34128c1cad mca_topo_base_dist_graph_neighbors: do not fail if legitimate parameters are provided.
Per the MPI 3.0 standard (chapter 7, page 310) :
"If maxindegree or maxoutdegree is smaller than the numbers returned by
MPI_DIST_GRAPH_NEIGHBOR_COUNT, then only the first part of the full list is returned."
2015-05-01 13:31:05 +09:00
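
The truncation rule quoted from the standard can be sketched as follows. The helper `example_copy_neighbors` is hypothetical and is not the code of `mca_topo_base_dist_graph_neighbors`; it only shows that a smaller caller-supplied maximum yields the first part of the list rather than an error.

```c
#include <assert.h>

/* Copy at most max_out entries of the full neighbor list into out.
 * Returns the number of entries copied. A max_out smaller than count
 * is legitimate and simply truncates the result. */
static int example_copy_neighbors(const int *all, int count,
                                  int max_out, int *out)
{
    int n = (max_out < count) ? max_out : count;
    if (n < 0) {
        n = 0;
    }
    for (int i = 0; i < n; i++) {
        out[i] = all[i];
    }
    return n;
}
```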
Gilles Gouaillardet
96e3cbe8fc Remove an incorrect assert.
Alignment requirements were relaxed in open-mpi/ompi@33a3ace874
and made a previous alignment check incorrect.
2015-05-01 12:53:12 +09:00
George Bosilca
33a3ace874 Minimize the alignments. We only do it when we need to pack
data that must be aligned (aka the displacement). All other
cases do not require special alignments, and are treated
normally.
Fix the comment regarding the alignment requirements.
2015-04-30 22:06:50 -04:00
George Bosilca
015d3f56cf Fix the INDEXED_BLOCK issue identified by IBM. 2015-04-30 14:43:19 -04:00
Gilles Gouaillardet
6b3126e69e ROMIO 3.1.4 refresh: add refresh notes 2015-04-30 19:02:20 +09:00
Gilles Gouaillardet
e1b6ab4f1d ROMIO 3.1.4 refresh: remove old romio 2015-04-30 19:01:23 +09:00
Gilles Gouaillardet
85e77079b4 ROMIO 3.1.4 refresh: use romio from mpich 3.1.4 2015-04-30 19:00:50 +09:00
Gilles Gouaillardet
92f6c7c1e2 ROMIO 3.1.4 refresh: apply post romio-3.1.4 patches 2015-04-30 18:56:53 +09:00
Gilles Gouaillardet
6400bc75ab ROMIO 3.1.4 refresh: patch romio for Open MPI 2015-04-30 18:53:55 +09:00
Gilles Gouaillardet
eacd434a02 ROMIO 3.1.4 refresh: import romio from mpich 3.1.4 tarball 2015-04-30 18:53:03 +09:00
Gilles Gouaillardet
e2e91142d5 ROMIO 3.1.4 refresh: prepare new romio directory 2015-04-30 18:52:22 +09:00
Gilles Gouaillardet
1ee58af8f5 update ROMIO .gitignore 2015-04-30 18:49:53 +09:00
Gilles Gouaillardet
697a866b6e ddt: correctly align next datatype description
This bug can be evidenced by the test/datatype/ddt_pack
test case on sparc architecture.
2015-04-30 15:04:54 +09:00
Ryan Grant
6ab91a6781 Merge pull request #561 from tkordenbrock/topic/mtl.fix.datatype.overflow
mtl-portals4: fix datatype overflow in ompi_mtl_portals4_long_isend()
2015-04-28 15:15:59 -06:00
Ryan Grant
cc3da91700 Merge pull request #562 from tkordenbrock/topic/mtl.expand.source.bits.to.24
mtl-portals4: expand the source field of the match bits to 24 bits
2015-04-28 14:31:06 -06:00
Rolf vandeVaart
91a8ec52ca Fix possible uninitialized warnings 2015-04-28 16:25:35 -04:00
Todd Kordenbrock
8a4616f724 mtl-portals4: fix datatype overflow in ompi_mtl_portals4_long_isend()
The length parameter of ompi_mtl_portals4_long_isend() was declared
as "int", which may not be big enough depending on the platform and
compiler options used.  This commit changes the type to size_t to
prevent overflow.
2015-04-28 14:40:25 -05:00
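
A small sketch of why an `int` length parameter can be too narrow: a `size_t` value may exceed `INT_MAX`, so it cannot always be passed through an `int` without loss. The helper name below is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if a byte length could be passed through an "int"
 * parameter without loss. Lengths above INT_MAX cannot, which is the
 * case a size_t parameter avoids. */
static bool example_length_fits_in_int(size_t length)
{
    return length <= (size_t)INT_MAX;
}
```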
Todd Kordenbrock
3e437f6184 mtl-portals4: expand the source field of the match bits to 24 bits
The source field was 16 bits which is not sufficient for many
current and future machines.  This commit expands the source field
to 24 bits and reduces the tag field from 32 bits to 24 bits.
2015-04-28 14:25:30 -05:00
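
The new field widths can be sketched with plain bit operations. Only the 24-bit source and 24-bit tag widths come from the commit message; the position of the fields within a 64-bit value, and all names below, are assumptions for illustration, with the upper 16 bits left for other fields.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: [16 bits other][24 bits source][24 bits tag]. */
#define EXAMPLE_TAG_BITS 24
#define EXAMPLE_SRC_BITS 24
#define EXAMPLE_TAG_MASK ((UINT64_C(1) << EXAMPLE_TAG_BITS) - 1)
#define EXAMPLE_SRC_MASK ((UINT64_C(1) << EXAMPLE_SRC_BITS) - 1)

/* Pack source and tag into the low 48 bits; values wider than
 * 24 bits are masked off. */
static uint64_t example_pack(uint32_t source, uint32_t tag)
{
    return (((uint64_t)source & EXAMPLE_SRC_MASK) << EXAMPLE_TAG_BITS)
         | ((uint64_t)tag & EXAMPLE_TAG_MASK);
}

static uint32_t example_source(uint64_t bits)
{
    return (uint32_t)((bits >> EXAMPLE_TAG_BITS) & EXAMPLE_SRC_MASK);
}

static uint32_t example_tag(uint64_t bits)
{
    return (uint32_t)(bits & EXAMPLE_TAG_MASK);
}
```

The old 16-bit source plus 32-bit tag and the new 24 plus 24 both occupy 48 bits, so the change trades tag range for source range without growing the match bits.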
Gilles Gouaillardet
18b75bd40d io/base: check the MCA version matches 2015-04-28 17:48:23 +09:00
Ralph Castain
3d46850c4d Per patch from Marco Atzeri, have the fortran wrapper links go directly to opal_wrapper to avoid breaks in the chain in some environments. 2015-04-25 17:09:06 -07:00
Yohann Burette
1be185ed87 mtl/ofi: Remove use of MR. 2015-04-24 15:55:21 -07:00
Nathan Hjelm
2716b8b1da osc/pt2pt: correct flush expected counts
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-04-24 13:34:21 -06:00
Nathan Hjelm
f1d09e55ec osc/pt2pt: silence warnings
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-23 15:35:47 -06:00
Nathan Hjelm
29b435a5a4 osc/pt2pt: fix bugs that caused incorrect fragment counting
This commit fixes a bug identified by MTT that occurred when mixing
passive and active target synchronization. The bugs fixed in this
commit are:

 - Do not update incoming fragment counts for any type of unbuffered
   control message. These messages are out-of-band and should not be
   considered towards the signal counts.

 - Complete a change from using received counts to expected counts for
   lock, unlock, and flush acks. Part of the change made it into
   master before the rest was ready. This was preventing wakeups in
   some cases.

 - Turn the passive_target_access_epoch module member into a
   counter. As long as at least one peer is locked we are in a
   passive-target epoch and not an active target one. This fix will
   ensure that fragment flags are set appropriately.

fixes #538

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-23 13:22:24 -06:00
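
The third fix, turning a flag into a counter, can be sketched as follows. The struct and helper names are hypothetical; only the member name `passive_target_access_epoch` comes from the commit message. A boolean would be cleared by the first unlock even if another peer were still locked, while a counter stays non-zero until the last one is released.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical module with the epoch member kept as a counter of
 * currently locked peers. */
typedef struct {
    int passive_target_access_epoch;
} example_module_t;

static void example_peer_locked(example_module_t *module)
{
    module->passive_target_access_epoch++;
}

static void example_peer_unlocked(example_module_t *module)
{
    assert(module->passive_target_access_epoch > 0);
    module->passive_target_access_epoch--;
}

/* In a passive-target epoch as long as at least one peer is locked. */
static bool example_in_passive_epoch(const example_module_t *module)
{
    return module->passive_target_access_epoch > 0;
}
```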
Jeff Squyres
63c7520273 use-mpi-f08/Makefile.am: also link in libmpi_mpifh.la
Per mail from Marco Atzeri, we also need to link in libmpi_mpifh.la,
lest we exhaust relative offset addressing (e.g., in 32 bit builds).

See http://www.open-mpi.org/community/lists/devel/2015/04/17304.php.
2015-04-22 14:22:36 -07:00
Ryan Grant
1436417488 Merge pull request #547 from tkordenbrock/topic/mtl.add.logical.mode
mtl-portals4: add the option to use the Portals4 logical to physical mapping
2015-04-22 15:06:13 -06:00
Nathan Hjelm
7c95ecf859 mca/base: provide functions to determine if a framework is registered/open
This commit also fixes a problem with the lazy opening of topo
components. The topo framework incorrectly: 1) checked if the topo
framework was open by checking the length of the components list, and
2) called the framework open directly instead of using
mca_base_framework_open.

fixes #544

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-21 13:54:25 -06:00
Todd Kordenbrock
8e56002ec7 mtl-portals4: add missing return to portals4_init_interface() 2015-04-21 11:30:33 -05:00
Todd Kordenbrock
34c50fa963 mtl-portals4: move MD cleanup closer to failure
PtlMDRelease() was called if read_msg() returned a failure code.
This commit moves the PtlMDRelease() inside read_msg() so that it
doesn't get called in cases where the failure happens before or at
the PtlMDBind().
2015-04-21 11:30:33 -05:00
Todd Kordenbrock
422be76770 mtl-portals4: add a debug message for thread multiple mode 2015-04-21 11:30:33 -05:00
Todd Kordenbrock
35e5ffd001 mtl-portals4: add the option to use the Portals4 logical to physical table
This commit adds an MCA variable to select Portals4 logical
addressing, populates the logical-to-physical mapping table and
initializes the NI in this mode.
2015-04-21 11:30:33 -05:00
Yohann Burette
19607d2ce7 mtl/ofi: Remove memset() from progress path. 2015-04-20 14:12:39 -07:00
Yohann Burette
d2eda04801 mtl/ofi: Use fi_tinject for small messages. 2015-04-20 14:12:39 -07:00
Nathan Hjelm
033894b493 Merge pull request #541 from hjelmn/c99_components
C99 component initialization
2015-04-20 10:45:39 -06:00
Yohann Burette
ba1bc00df1 mtl/ofi: Remove FI_CANCEL. 2015-04-20 09:40:37 -07:00
Devendar Bureddy
19f5a3eff4 HCOLL: skip hcoll if enable_mpi_threads is true
reasons:
    1) default OCOMS is not configured with --enable-ocoms-multi-threads
    2) locking overheads
2015-04-20 19:39:49 +03:00
Devendar Bureddy
dd8e9fa176 HCOLL: enable by default 2015-04-20 19:39:30 +03:00
Nathan Hjelm
d251fa1525 pml/ob1: fix heterogeneous build
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-20 09:27:00 -06:00
Howard Pritchard
3339274136 Merge pull request #542 from hppritcha/topic/coverity_714118
fcoll/two_phase: coverity fix
2015-04-20 05:42:12 -06:00
Howard Pritchard
de215addc6 fcoll/two_phase: coverity fix
fix CID 714118

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-04-18 14:34:48 -06:00
Nathan Hjelm
df75d0382f ompi: use C99 subobject naming for component initialization
This commit helps future-proof ompi components by initializing each
component member by name.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-18 10:29:58 -06:00
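
C99 designated initializers, the language feature this commit relies on, look like this in a minimal sketch. The struct `example_component_t` and its members are hypothetical and much smaller than a real MCA component structure.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical component structure. */
typedef struct {
    const char *name;
    int major_version;
    int minor_version;
    int (*open)(void);
    int (*close)(void);
} example_component_t;

static int example_open(void)  { return 0; }
static int example_close(void) { return 0; }

/* Each member is initialized by name, so adding or reordering
 * members in the struct does not silently shift the values. */
static const example_component_t example_component = {
    .name          = "example",
    .major_version = 2,
    .minor_version = 1,
    .open          = example_open,
    .close         = example_close,
};
```

With positional initialization, inserting a new member in the middle of the struct would assign later values to the wrong members; naming each member avoids that.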
Yohann Burette
9392bb5ede mtl/ofi: Implement Probe/Mprobe/Mrecv using FI_PEEK/FI_CLAIM. 2015-04-17 16:42:13 -07:00
Nadezhda Kogteva
116169c38a opal timing: added ability to choose the timer type 2015-04-17 11:15:55 +03:00
Mangala Jyothi Bhaskar
c4de46e284 Fix number of aggregators used in two phase fcoll 2015-04-16 10:39:10 -05:00
Nathan Hjelm
3436f2917d Merge pull request #449 from hjelmn/mca_base_update
mca/base update
2015-04-16 08:41:48 -06:00
Nathan Hjelm
d5b52d3141 ompi/communicator: make comm_request internal variables static 2015-04-15 10:05:21 -06:00
Ralph Castain
a4b1225892 Don't register the PSM errhandler until it is certain that the PSM component can be used.
This doesn't matter on the master, but it does matter on the 1.8 branch as the MTL select logic is different over there.
2015-04-14 07:54:53 -07:00
Nathan Hjelm
113c890ccf Merge pull request #520 from hjelmn/valgrind_cleanness
fix memory leaks and valgrind errors
2015-04-13 10:09:34 -06:00
Jeff Squyres
49f52a5356 osc_sm_passive_target.c: update the check for lock types
Based on some on-list and IM discussion with @hjelmn about
open-mpi/ompi@40b7643119, change the testing to a switch/case.  If we
fall into the default case, assert() error (because it's an OMPI
developer programming error).
2015-04-13 12:02:15 -04:00
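
The switch/case pattern described here can be sketched as follows. The enum, its values, and the return codes are hypothetical and are not those of `osc_sm_passive_target.c`; the point is that the default case asserts and still leaves `ret` defined.

```c
#include <assert.h>

/* Hypothetical lock types standing in for the MPI lock constants. */
typedef enum {
    EXAMPLE_LOCK_EXCLUSIVE = 1,
    EXAMPLE_LOCK_SHARED    = 2
} example_lock_type_t;

/* Returns a hypothetical mode code for each known lock type. */
static int example_start_lock(example_lock_type_t type)
{
    int ret;

    switch (type) {
    case EXAMPLE_LOCK_EXCLUSIVE:
        ret = 100;
        break;
    case EXAMPLE_LOCK_SHARED:
        ret = 200;
        break;
    default:
        /* Reaching this is a programming error by the caller. */
        assert(0 && "unknown lock type");
        ret = -1; /* keeps ret defined when NDEBUG disables assert */
        break;
    }

    return ret;
}
```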
Jeff Squyres
40b7643119 osc_sm_passive_target.c: ensure ret is always defined
Fixes a compiler warning
2015-04-13 11:31:43 -04:00
Nathan Hjelm
a7b0c00ab6 fix memory leaks and valgrind errors
This commit fixes several valgrind errors. Included:

 - installdirs did not correctly reinitialize all pointers to NULL
   at close. This causes valgrind errors on a subsequent call to
   opal_init_tool.

 - several opal strings were leaked by opal_deregister_params which
   was setting them to NULL instead of letting them be freed by the
   MCA variable system.

 - move opal_net_init to AFTER the variable system is initialized and
   opal's MCA variables have been registered. opal_net_init uses a
   variable registered by opal_register_params!

 - do not leak ompi_mpi_main_thread when it is allocated by
   MPI_T_init_thread.

 - do not overwrite ompi_mpi_main_thread if it is already set (by
   MPI_T_init_thread).

 - mca_base_var: read_files was overwriting mca_base_var_file_list
   even if it was non-NULL.

 - mca_base_var: set all file global variables to initial states on
   finalize.

 - btl/vader: decrement enumerator reference count to ensure that it
   is freed.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-11 09:28:35 -06:00
Ralph Castain
3e44d3c9e3 Enable singletons to run without any active OOB module until they attempt to comm_spawn 2015-04-10 14:06:42 -07:00
Nathan Hjelm
eb56117405 Merge pull request #513 from hjelmn/mca_bug_fixes
opal: fix multiple bugs in MCA and opal
2015-04-08 10:29:44 -06:00
Nathan Hjelm
9cd955badf opal: fix multiple bugs in MCA and opal
This commit fixes the following bugs:

 - opal_output_finalize did not properly set internal state. This
   caused problems when calling the sequence opal_output_init (),
   opal_output_finalize (), opal_output_init ().

 - opal_info support called mca_base_open () but never called the
   matching mca_base_close (). mca_base_open () and mca_base_close ()
   have been updated to use a open count instead of an open flag to
   allow mca_base_open to be called through multiple paths (as may be
   the case when MPI_T is in use).

 - orte_info support did not register opal variables. This can cause
   orte-info to not return opal variables.

 - opal_info, orte_info, and ompi_info support have been updated to
   use a register count.

 - When opening the dl framework the reference count was added to
   ensure the framework stuck around. The framework being closed
   prematurely was a bug in the MCA base that has since been
   corrected. The increment (and associated decrement) have been
   removed.

 - dl/dlopen did not set the value of
   mca_dl_dlopen_component.filename_suffixes_mca_storage on each call
   to register. Instead the value was set in the component
   structure. This caused the value to be lost when re-loading the
   component. Fixed by setting the default value in register.

 - Reset shmem framework state on close to avoid returning a stale
   component after reloading opal/shmem.

 - MCA base parameters were not properly deregistered when the MCA
   base was closed.

This commit may fix #374.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-07 19:13:20 -06:00
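
The change from an open flag to an open count can be sketched as follows. The function and variable names are hypothetical and do not reproduce `mca_base_open` or `mca_base_close`; the sketch only shows why a count lets several callers open and close independently.

```c
#include <assert.h>
#include <stdbool.h>

static int  example_open_count  = 0;
static bool example_initialized = false;

/* Only the first open performs initialization. */
static int example_base_open(void)
{
    if (example_open_count++ > 0) {
        return 0; /* already open through another path */
    }
    example_initialized = true;
    return 0;
}

/* Only the close matching the first open tears things down. */
static int example_base_close(void)
{
    assert(example_open_count > 0);
    if (--example_open_count > 0) {
        return 0; /* still in use by another caller */
    }
    example_initialized = false;
    return 0;
}

static bool example_base_is_open(void)
{
    return example_initialized;
}
```

With a plain flag, the first close would tear down state that a second caller still depends on.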
Howard Pritchard
5ee18f4f00 Merge pull request #514 from hppritcha/topic/mpi_win_lock_all_man
man pages: fix problem with MPI_Win_lock_all
2015-04-07 17:17:30 -06:00
Howard Pritchard
291c775e74 man pages: fix problem with MPI_Win_lock_all
thanks to Thomas Jahns for pointing this out -

http://www.open-mpi.org/community/lists/users/2015/04/26633.php

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-04-07 16:29:00 -06:00
Nathan Hjelm
2409715fc3 Merge pull request #511 from hjelmn/osc_pt2pt_fix
osc/pt2pt: fix synchronization bugs
2015-04-07 09:14:00 -06:00
Howard Pritchard
fc3a0f60c5 Merge pull request #512 from hppritcha/topic/java_better_dlopen_error
ompi/java: better error message if dlopen fails
2015-04-06 14:08:10 -06:00
Howard Pritchard
18039b34b4 ompi/java: better error message if dlopen fails
The error message emitted by ompi/java when dlopen
fails is misleading and not very informative.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-04-06 13:35:09 -06:00
Nathan Hjelm
80ed805a16 osc/pt2pt: fix synchronization bugs
The fragment flush code tries to send the active fragment before
sending any queued fragments. This could cause osc messages to arrive
out-of-order at the target (bad). Ensure ordering by always sending
the active fragment after sending queued fragments.

This commit also fixes a bug when a synchronization message (unlock,
flush, complete) can not be packed at the end of an existing active
fragment. In this case the source process will end up sending 1 more
fragment than claimed in the synchronization message. To fix the issue
a check has been added that fixes the fragment count if this situation
is detected.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-06 08:39:19 -06:00
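
The ordering fix in the first paragraph of this commit can be sketched as follows. The fragment and wire types are hypothetical stand-ins, not osc/pt2pt structures; the sketch only records send order to show queued fragments going out before the active one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fragment: only an id, enough to observe send order. */
typedef struct {
    int id;
} example_frag_t;

/* Records the order in which fragments are "sent". */
typedef struct {
    int    ids[16];
    size_t count;
} example_wire_t;

static void example_send(example_wire_t *wire, const example_frag_t *frag)
{
    assert(wire->count < sizeof(wire->ids) / sizeof(wire->ids[0]));
    wire->ids[wire->count++] = frag->id;
}

/* Flush queued fragments first, then the active fragment, so the
 * target sees them in the order they were created. */
static void example_flush(example_wire_t *wire,
                          const example_frag_t *queued, size_t nqueued,
                          const example_frag_t *active)
{
    for (size_t i = 0; i < nqueued; i++) {
        example_send(wire, &queued[i]);
    }
    if (active != NULL) {
        example_send(wire, active);
    }
}
```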
Jithin Jose
9c937d44ae Inline MTL-OFI
Signed-off-by: Jithin Jose <jithin.jose@intel.com>

Conflicts:
	ompi/mca/mtl/ofi/mtl_ofi_recv.c
2015-04-03 15:19:30 -07:00
Jithin Jose
50304dfe05 Inline mtl-datatype pack/unpack
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-04-03 15:19:21 -07:00
Jithin Jose
c09582a3ff - CM blocking send/recv optimizations
This patch tries to do as little as possible in the PML CM blocking
    send/receive routines.  Basically, avoid creating and filling in an
    entire request object.  An OMPI-level request is still needed, but we
    can create that on the stack instead of going to a free list.

Signed-off-by: Andrew Friedley <andrew.friedley@intel.com>
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-04-03 15:19:08 -07:00
Howard Pritchard
05324e32ff fcoll/static: coverity fixes
Fix CIDs 72138, 72139, 72143

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-04-02 14:51:44 -06:00
Devendar Bureddy
6ddc7ac35c HCOLL: Fix assertion
hcoll context may not be destroyed if it is cached.
2015-04-01 20:33:28 +03:00
Nathan Hjelm
b68d66bb9b MCA: Add the project/project version to the MCA base component
This commit adds support for project_framework_component_* parameter
matching. This is the first step in allowing the same framework name
in multiple projects. This change also bumps the MCA component version
to 2.1.0.

All master frameworks have been updated to use the new component
versioning macro. An mca.h has been added to each project to add a
project specific versioning macro of the form
PROJECT_MCA_VERSION_2_1_0.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-03-27 10:59:04 -06:00
Nathan Hjelm
e91084e20b Merge pull request #492 from hjelmn/subarray_fix
ompi/datatype: fix subarray datatype
2015-03-27 10:50:46 -06:00
Devendar Bureddy
71c28cea65 HCOLL: hcoll dte fixes
- hcoll currently does not support datatypes with gaps around them
  (i.e. dtsize != dtextent)
- check for user defined Ops.
2015-03-25 16:04:11 +02:00
Nathan Hjelm
88072f9b8b ompi/datatype: fix subarray datatype
The subarray datatype was not packing/unpacking correctly. This was
leading to wrong results whenever the lb of the subarray datatype was
non-zero.

I tracked the issue to the use of ompi_datatype_create_resized. This
function simply duplicates the old datatype and sets the lb and
extent. This is unfortunately insufficient for the pack/unpack
functions which use the loop end first element offset NOT the lb. This
offset is 0 in the resized datatype. Once ompi_datatype_create_resized
has been fixed this commit should be reverted.

Fixes #380.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-03-24 14:33:10 -06:00
Nathan Hjelm
f588ad7af0 Merge pull request #485 from hjelmn/info_enums
ompi/info: add support for getting info key value based on variable enumerator
2015-03-23 11:43:22 -06:00
Nathan Hjelm
9299cd5cd7 ompi/info: add support for getting info key value based on variable
enumerator

This commit adds a function that will return an integer value for an
info key based on the value returned by a variable enumerator. This
feature should greatly simplify code using the info keys (osc for
example).

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-03-23 11:20:37 -06:00
Howard Pritchard
edf9e8ba8f mtl/psm: coverity fixes
Fix CIDS 1270176 - 1270179

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-03-18 11:02:01 -06:00
Nathan Hjelm
ccba8ce856 Merge pull request #457 from hjelmn/mpit_fixes
mca/base: fix bugs in framework deregistration/re-registration
2015-03-18 08:37:49 -06:00
Ralph Castain
0cfb4f29aa Silence compiler warning 2015-03-16 09:59:21 -07:00
Todd Kordenbrock
515d9e8cc9 mtl-portals4: fix compiler warnings 2015-03-12 20:34:04 -05:00
adrianreber
714d9aa67e Merge pull request #348 from adrianreber/topic/orte_cr_continue_like_restart
Topic/orte cr continue like restart
2015-03-12 14:54:02 +01:00
Alina Sklarevich
28586caecf MTL_MXM/PML_YALLA: fix coverity issues. 2015-03-12 11:49:22 +02:00
Howard Pritchard
da85d5fc0a Merge pull request #467 from hppritcha/topic/minor_fcoll_static_coverity_fix
fcoll/static: minor fix for coverity
2015-03-11 10:28:05 -06:00
Nathan Hjelm
ce6caab2a7 Merge pull request #463 from hjelmn/cuda_async
btl/openib: cuda: fix CUDA-aware support with async copy
2015-03-11 09:52:48 -06:00
Howard Pritchard
66fee3bd18 fcoll/static: minor fix for coverity
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-03-11 09:11:49 -06:00
Adrian Reber
c08e234af7 FT: fix compilation using --with-ft (5/5)
Enabling the FT code breaks compilation (again). This series
tries to fix the compiler errors. This is again only fixing
the compiler errors without any warranty that the result
might actually support FT again.

With the changes introduced in the previous patches in this series
some goto constructs for cleanup are no longer necessary and removed.
2015-03-11 14:23:33 +01:00
Adrian Reber
9b84fe45d3 FT: fix compilation using --with-ft (3/5)
Enabling the FT code breaks compilation (again). This series
tries to fix the compiler errors. This is again only fixing
the compiler errors without any warranty that the result
might actually support FT again.

Follow-up of 552c9ca5a0. This patch
implements the necessary changes in mentioned commit in the FT code.
2015-03-11 14:23:33 +01:00
Adrian Reber
1c5a8df724 FT: fix compilation using --with-ft (2/5)
Enabling the FT code breaks compilation (again). This series
tries to fix the compiler errors. This is again only fixing
the compiler errors without any warranty that the result
might actually support FT again.

The FT code used barrier mechanisms which have been removed
with aec5cd08bd. This patch replaces
all those different barriers with opal_pmix.fence(NULL, 0);
I am not sure this is completely correct but at least a starting
point for a review.
2015-03-11 14:23:33 +01:00
Adrian Reber
f45dd069bd FT: fix compilation using --with-ft (1/5)
Enabling the FT code breaks compilation (again). This series
tries to fix the compiler errors. This is again only fixing
the compiler errors without any warranty that the result
might actually support FT again.

This first patch moves orte_cr_continue_like_restart from ORTE
to opal_cr_continue_like_restart in OPAL. This only leaves three
calls from OPAL to ORTE in the FT code. As it is not yet 100%
clear how to handle these calls the code orte_sstore.set_attr()
has been #ifdef'd out for now.
2015-03-11 14:23:33 +01:00
Alina Sklarevich
f9a9b936a1 PML_YALLA: fix compilation warnings. 2015-03-11 10:58:54 +02:00
Nathan Hjelm
3d32dbd793 btl/openib: cuda: fix CUDA-aware support with async copy
This commit should resolve an issue seen with CUDA-aware support. The
problem came in with BTL 3.0. Before 3.0 the size of the copy was
stored in the incoming segment's des_remote_count field. This field
does not exist in BTL 3.0 so I stored the value in the
des_segment_count field. This caused problems with the cuda support
code. To fix the issue the endpoint pointer is now stored in the in
fragment's endpoint pointer which frees up the segment's des_cbdata
pointer for storing the transfer size.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-03-10 14:38:12 -06:00
Nathan Hjelm
d929137768 osc/pt2pt: need to unlock self before waiting for unlock acks
This commit fixes a bug in osc/pt2pt which causes MPI_Win_unlock_all
to hang. The problem was caused by code refactoring that moved the
unlock of the local process to after the loop that waits for unlock
acks. This will cause the code to loop forever waiting on the self
ack.

Fixes #444
2015-03-10 14:10:37 -06:00
Yohann Burette
d48a8ab8f0 mtl/ofi: Use fi_allocinfo(). 2015-03-10 12:50:55 -07:00
Jeff Squyres
2e8ee003b0 ofi: endpoint type hint moved to a sub-struct, BUFFERED went away
Update to match new libfabric API/structure change.
2015-03-10 09:55:45 -07:00
Howard Pritchard
b73d566d57 Merge pull request #454 from hppritcha/topic/coverity_fixes
fcoll/dynamic: coverity fixes
2015-03-10 07:59:56 -06:00
Mike Dubman
6f91a007e1 Merge pull request #458 from yosefe/topic/pml-yalla-fix-segv
keep mxm context alive as long as pml_yalla component is open.
2015-03-10 13:38:14 +02:00
yosefe
976144dca7 keep mxm context alive as long as pml_yalla component is open.
pml_yalla_del_comm may be called after yalla module is finalized, which
leads to invalid memory access if mxm context is already destroyed in
this point.
2015-03-10 11:52:44 +02:00
Nathan Hjelm
005c6022e2 mca/base: fix bugs in framework deregistration/re-registration
There were a number of bugs in the framework/variable code that
affected deregistration:

 - Frameworks could be erroneously closed if separately registered and
   opened then subsequently closed. This was a bug in the original
   design which only reference counted opens but not
   registrations. This would cause undefined behavior if
   MPI_T_finalize actually calls ompi_info_close_components as
   intended. Now both registrations and opens are reference counted
   and frameworks/components are not torn down until the matching
   number of close calls have been made.

 - group_find_by_name did not pass the invalidok flags down
   to mca_base_var_group_get_internal correctly.

 - Group deregistration caused the group to be completely reset. This
   does not match the behavior required by MPI_T as it could reduce
   the number of variables/subgroups in a group.

This commit also updates MPI_T_finalize to call
ompi_info_close_components as originally intended.

Closes #374

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-03-09 16:52:53 -06:00
Howard Pritchard
fba88360a8 fcoll/dynamic: more coverity fixes
Okay coverity seems to get one stuck in a loop where
by fixing one set of resource allocation problems, it
starts finding more.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-03-09 15:01:05 -07:00
Howard Pritchard
2d61a652c8 fcoll/dynamic: coverity fixes
okay, hopefully really fix CIDS 72325-72328, and 72330-72332.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-03-09 13:53:52 -07:00
Nathan Hjelm
0d80bfb391 Merge pull request #443 from hjelmn/mpit_31
Add new error code introduced in MPI-3.1.
2015-03-09 13:03:15 -06:00
Jeff Squyres
a026456bef (orte|ompi|oshmem)*info tools: convert to opal_dl interface
Note that this commit removes option:lt_dladvise from the various
"info" tools output.  This technically breaks our CLI "ABI" because
we're not deprecating it / replacing it with an alias to some other
"info" tool output.

Although the dl/libltdl component contains an "have_lt_dladvise" MCA
var that contains the same information, the "option:lt_dladvise"
output from the various "info" tools is *not* an MCA var, and
therefore we can't alias it.  So it just has to die.
2015-03-09 08:18:13 -07:00
Jeff Squyres
c683500a29 debuggers: convert to opal_dl interface 2015-03-09 08:16:55 -07:00
Gilles Gouaillardet
6de973daae coll/sm: remove unused value
as reported by Coverity with CID 1269962
2015-03-09 17:31:32 +09:00
Gilles Gouaillardet
1896d4fba7 bcol/basesmuma: fix misc memory leak
as reported by Coverity with CID 715762
2015-03-09 17:22:25 +09:00
Gilles Gouaillardet
341bdd1fc3 ompi/group: refactor ompi_group_incl
and fixes CID 70478
2015-03-09 17:07:11 +09:00
Gilles Gouaillardet
9107bf5077 ompi/topo: fix misc errors
as reported by Coverity with CIDs 1041232, 1041234, 1041235
1269789 and 1269996
2015-03-09 15:22:22 +09:00
Gilles Gouaillardet
a9044945fe ompi/proc: correctly handle cutoff modex case
as reported by Coverity with CID 1196664
2015-03-09 14:34:28 +09:00
Gilles Gouaillardet
59f298a534 fs/base: securely use readlink
as reported by Coverity with CIDs 1287031 and 1287032
2015-03-09 11:20:51 +09:00
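
The commit message does not say what Coverity flagged. One common pitfall with POSIX `readlink` is that it does not NUL-terminate the buffer, so a minimal sketch of a defensive wrapper, assuming that kind of issue, might look like this. The wrapper name is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Read a symlink target into buf and always NUL-terminate it.
 * Returns the target length, or -1 on error or zero-sized buffer.
 * One byte is reserved for the terminator, so a target longer than
 * size - 1 bytes is truncated. */
static ssize_t example_readlink_terminated(const char *path,
                                           char *buf, size_t size)
{
    if (size == 0) {
        return -1;
    }
    ssize_t n = readlink(path, buf, size - 1);
    if (n < 0) {
        return -1;
    }
    buf[n] = '\0'; /* readlink() itself never writes a terminator */
    return n;
}
```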
Howard Pritchard
209f002200 fcoll/static: fix an errant free
Got too excited about coverity and ended up generating
a new coverity error.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-03-06 13:12:53 -07:00
Howard Pritchard
4f4b99bbac fcoll/dynamic,static: coverity fixes
Fix some theoretical memory leaks reported by coverity.

Fixes CIDS 72332, 72328, 72332, 72138, 72139, 72140, 72364, 72365-72370
           72372-72374, 741354, 72143, 72375-83, 1027140, 1269903

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-03-06 11:05:23 -07:00
Gilles Gouaillardet
35c64af4b1 dpm: fix misc issues
as reported by Coverity with CIDs 71126 and 1269659
2015-03-06 16:20:24 +09:00
Gilles Gouaillardet
757b40e56a coll/tuned: remove dead code
as reported by Coverity with CID 1271638
that looks like a multiple paste error ...
2015-03-06 15:02:56 +09:00
Gilles Gouaillardet
f03d7dce17 ompio: fix deallocation sequence
as reported by Coverity with CID 1287034
2015-03-06 14:59:59 +09:00
George Bosilca
f758790d7a Allow TOPO modules to register their parameters when we do lazy
initialization.
2015-03-05 13:11:06 -05:00
George Bosilca
420ae98dfe Remove all unnecessary whitespaces and make sure we close the module
correctly.
2015-03-05 13:00:13 -05:00
Gilles Gouaillardet
d6ae0a5e05 sharedfp/sm: fix misc memory leaks
as reported by Coverity with CIDs 1196785, 1196787 and 1269896
2015-03-05 16:33:32 +09:00
Gilles Gouaillardet
5b2122381b ompio: fix misc memory leaks
as reported by Coverity with CIDs 72127, 72145, 72146, 72177, 72179,
72186, 731276, 731278, 1269888, 1269890
2015-03-05 16:22:19 +09:00
Gilles Gouaillardet
ceeb0844b6 dpm: fix misc memory leaks
as reported by Coverity with CIDs 1196737 and 1269850
2015-03-05 14:20:09 +09:00
Gilles Gouaillardet
e75b1e6435 fs/base: fix misc memory leak
as reported by Coverity with CID 72202
2015-03-05 14:20:08 +09:00
Gilles Gouaillardet
9f13425980 fbtl/posix: fix misc memory leaks
as reported by Coverity with CIDs 72125, 72126, 1269899 and 1269900
2015-03-05 14:20:08 +09:00
Gilles Gouaillardet
838cd51644 pubsub: fix misc memory leak
as reported by Coverity with CID 710627
2015-03-05 14:20:08 +09:00
Gilles Gouaillardet
d0dded1e05 topo/base: fix misc memory leaks
as reported by Coverity with CIDs 1269901 and 1269902
2015-03-05 14:20:08 +09:00
Gilles Gouaillardet
d1b2f043ff fix misc memory leaks
as already reported by Coverity with CIDs
71818, 71819, 72250, 715767, 1196749 and 1274002
2015-03-05 13:58:05 +09:00
Nathan Hjelm
7d84991781 Give some headroom for adding new MPI error codes without breaking ABI 2015-03-04 10:46:41 -07:00
Nathan Hjelm
1537a50987 Add new error code introduced in MPI-3.1. 2015-03-03 17:49:42 -07:00
Howard Pritchard
53fd425a6a romio: patches from Rob Latham for issue #255
Patches supplied by Rob Latham which fix issue #255.

See
http://git.mpich.org/mpich.git/commit/4e80e1d2b9
http://git.mpich.org/mpich.git/commit/5a10283bf7fd

Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
2015-03-02 15:33:49 -08:00