1
1
Граф коммитов

8121 Коммитов

Автор SHA1 Сообщение Дата
Jithin Jose
50089977ac Inline PML-CM
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-26 21:42:41 -07:00
Jithin Jose
56869bff38 Avoid datatype pack/unpack for contiguous data on homogenous systems.
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-26 21:42:41 -07:00
Jeff Squyres
fb12572438 OFI: make v1.10 and v2.0 fit in version checking scheme
v1.10 is now in the same compatibility level as v1.7/v1.8 (there
is/will be no v1.9 series).  v2.0 now takes over for what used to be
called v1.9.
2015-05-26 18:58:28 -07:00
Nathan Hjelm
d21bd24126 ompi/crcp: fix logic issue after component selection
CID 70630 Dereference before null check

Cleaned up useless goto statements and deleted NULL check. If
mca_base_select returns success than best_module and best_component
will always be non-NULL.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-26 11:48:40 -06:00
Gilles Gouaillardet
899fb89392 MPI_Sendrecv_replace : use the right process convertor 2015-05-26 16:59:36 +09:00
Gilles Gouaillardet
0a2f60994c ompi/ddt: fix #ifdef vs #if HAVE_xxx 2015-05-26 16:59:36 +09:00
Gilles Gouaillardet
e980958ad4 pml/ob1: silence a warning 2015-05-26 15:05:44 +09:00
Ralph Castain
cfd2cc49fd Get the Java bindings to compile again - add missing header 2015-05-23 11:22:24 -07:00
Nathan Hjelm
68614a211b Merge pull request #596 from hjelmn/errorcode_fixes
Handle ompi error codes in java code and remove non-standard MPI error code from mpi.h.
2015-05-23 07:29:44 -06:00
Nathan Hjelm
163a1b4505 Remove non-standard MPI_ERR_SYSRESOURCE error code
Replaced internal usage with OMPI_ERR_OUT_OF_RESOURCE.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-05-22 19:59:37 -06:00
Nathan Hjelm
9da29c3621 java: remove debug code
Talked to @ggouaillardet about this code. It was not intended to be committed to
master. Removing to fix coverity issue.

CID 1270134 Unchecked return value

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-22 08:36:14 -06:00
Gilles Gouaillardet
85c45e2275 pml/ob1: fix mca_pml_ob1_recv_request_put_frag(...) in heterogeneous mode 2015-05-22 15:48:45 +09:00
Nathan Hjelm
c540a9e59a Merge pull request #582 from hjelmn/mca_var_file_fix
mca/base: fix source file name bug for synonyms
2015-05-21 11:34:12 -06:00
Nathan Hjelm
403b3b20d7 Handle ompi error codes in java code
This commit also adds protection against negative error codes in ompi
error code functions. There is one outstanding issue. There is a
negative MPI error code defined in mpi.h. This will need to be fixed
separetely.

This commit fixes coverity IDs 1271533 and 1270156.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-21 10:39:09 -06:00
Nathan Hjelm
757c021951 Fix coverity ID 1270164
The sargs array and its elements were malloced but not freed. Note
that strings passed to NewStringUTF are copied into Java's heap and it
is the callers responsibility to free the original string.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-21 10:15:10 -06:00
Nathan Hjelm
ce48eabd84 pml/ob1: use c99 flexible array members instead of size 1 arrays
This commit updates several ob1 structures to take advantage of C99's
flexible array member.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-20 10:31:35 -06:00
Gilles Gouaillardet
b6c67e051d io/ompio: fix misc memory leaks
as reported by Coverity with CIDs 72147-72149,72187,72188,731274,731275,741356,
1269889,1269893,1271535 and 1269872
2015-05-20 17:19:39 +09:00
Gilles Gouaillardet
c05b271c68 man: fix a trivial typo in MPI_Neighbor_allgather.3in 2015-05-15 16:02:01 +09:00
Gilles Gouaillardet
1488e82efd osc/pt2pt: enable heterogeneous support 2015-05-14 16:42:48 +09:00
Todd Kordenbrock
c42e277385 mtl-portals4: thread multiple updates
When activating short receive blocks on the overflow list, remove
the PTL_ME_EVENT_LINK_DISABLE flag so the event gets generated.
Without PTL_EVENT_LINK, the block status can't reach the activated
state.

Replace #ifdef with #if for Open MPI configure booleans, because
Open MPI configure booleans are always defined and the value must
be checked.
2015-05-13 17:06:18 -05:00
Yohann Burette
27f1884cf8 mtl/ofi: Reworked header files. Added compat to ease maintenance. 2015-05-12 15:47:50 -07:00
rhc54
b59fa14004 Merge pull request #583 from rhc54/topic/mallocwarnings
Silence malloc(0) warnings reported by Lisandro
2015-05-12 13:37:38 -07:00
Ralph Castain
9a70765f27 Silence malloc(0) warnings reported by Lisandro 2015-05-12 12:38:58 -07:00
Nathan Hjelm
427aebbaca Fix cuda support MCA variables
This commit fixes some issues with the cuda support parameters. There
were a couple of duplicate registrations and an incorrect synonym (one
variable was made a synonym of mpi_preconnect_mpi).

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-12 09:52:51 -06:00
Ryan Grant
bbeaf41a52 Merge pull request #580 from tkordenbrock/topic/mtl.add.status.to.short.recv.blocks
mtl-portals4: add status to short recv blocks to coordinate out of or…
2015-05-11 13:44:45 -06:00
Ryan Grant
265682bdb9 Merge pull request #581 from tkordenbrock/topic/remove.overlapping.multiMD.code
portals4: use a single Memory Descriptor to cover all of memory
2015-05-11 13:20:32 -06:00
George Bosilca
78f5f0f8a9 Show the name of the collective that failed to get initialized. 2015-05-11 15:10:37 -04:00
Todd Kordenbrock
9df163f116 portals4: use a single Memory Descriptor to cover all of memory
In days past, some implementations of Portals4 could not cover all
of memory with a single Memory Descriptor so multiple large
overlapping Memory Descriptors were created.  Because none of the
current implementations have this limitation (and no future
implementations should either), this commit removes the overlapping
Memory Descriptors code.
2015-05-11 11:49:41 -05:00
Todd Kordenbrock
074583060d mtl-portals4: add status to short recv blocks to coordinate out of order events
If OMPI is initialized as thread multiple, then it is possible for
Portals events to be processed out of order by different threads.
Out of order events could lead to reactivation of the block
(PTL_EVENT_AUTO_FREE) before the block is removed from the active
list (PTL_EVENT_AUTO_UNLINK).  This commit adds a status field to
ompi_mtl_portals4_recv_short_block_t that coordinates these events.
2015-05-11 11:48:25 -05:00
Gilles Gouaillardet
650289bc33 romio314: update one more romio->romio314 name
Also missed this in open-mpi/ompi@db257cdbc0.
2015-05-08 18:26:33 +09:00
Ralph Castain
6e95bcd583 Fix typo in oob_tcp.c when IPV6 enabled. Cleanup a few other warnings, including a type in coll_sm that prevented that component from registering its MCA params! 2015-05-07 21:05:08 -07:00
Gilles Gouaillardet
f1258c3b6c ompi/errhandler: make most ompi_err_* variables static
Thanks @hjelmn for pointing this !
2015-05-08 10:11:59 +09:00
Nathan Hjelm
f0e650fef5 Rename internal error code variables in errcode-internal.c
The renamed variables used the same identifiers as variables in
errcode.c. To avoid confusion rename the variables to end in _intern.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-08 10:11:59 +09:00
Gilles Gouaillardet
9d56b85b55 initialize common symbols from ompi 2015-05-08 10:11:58 +09:00
Gilles Gouaillardet
dd572a0838 Fix --with-fortran=... logic 2015-05-08 09:23:55 +09:00
Gilles Gouaillardet
ab148e4e0c romio314: update one more romio->romio314 name
Also missed this in open-mpi/ompi@db257cdbc0.
2015-05-08 09:12:22 +09:00
Jeff Squyres
b3d89cf7b0 romio314: update one more romio->romio314 name
Missed this in db257cdbc0.
2015-05-07 09:40:45 -07:00
George Bosilca
3af8dfd3e2 Fix a overwrite of the args buffer identified by Lisandro Dalcin. 2015-05-07 09:50:39 -04:00
Jeff Squyres
691b4ec1e5 romio314: whitespace cleanup
No code changes
2015-05-05 06:23:59 -07:00
Jeff Squyres
db257cdbc0 romio314: adhere to the prefix rule
Rename all files and symbols from "io_romio" to "io_romio314".  This
fixes --disable-dlopen builds (because they were missing
the mca_io_romio314_component symbol).
2015-05-05 06:23:59 -07:00
Devendar Bureddy
88eb1fa936 HCOLL: refactoring hcoll_init 2015-05-04 22:03:36 +03:00
Jeff Squyres
8127c24f30 romio314/Makefile.am: whitespace cleanup
No code changes.
2015-05-04 07:20:11 -07:00
Jeff Squyres
332bca7183 romio314/Makefile.am: name the component library properly 2015-05-04 07:20:11 -07:00
Howard Pritchard
74089f5e3a Merge pull request #558 from ggouaillardet/refresh/romio314
Refresh/romio314
2015-05-01 14:23:43 -06:00
Gilles Gouaillardet
ef4b6203a4 Correctly unpack datatype.
Alignment requirements were relaxed in open-mpi/ompi@33a3ace874
so correctly handle this when unpacking a datatype.
2015-05-01 17:08:15 +09:00
Gilles Gouaillardet
34128c1cad mca_topo_base_dist_graph_neighbors: do not fail if legitimate parameters are provided.
Per the MPI 3.0 standard (chapter 7, page 310) :
"If maxindegree or maxoutdegree is smaller than the numbers returned by
MPI_DIST_GRAPH_NEIGHBOR_COUNT, then only the first part of the full list is returned."
2015-05-01 13:31:05 +09:00
Gilles Gouaillardet
96e3cbe8fc Remove an incorrect assert.
Alignment requirements were relaxed in open-mpi/ompi@33a3ace874
and made a previous alignment check incorrect.
2015-05-01 12:53:12 +09:00
George Bosilca
33a3ace874 Minimize the alignments. We only do it when we need to pack
data that must be aligned (aka the displacement). All other
cases do not require special alignments, and are treated
normally.
Fix the comment regarding the alignment requirements.
2015-04-30 22:06:50 -04:00
George Bosilca
015d3f56cf Fix the INDEXED_BLOCK issue identified by IBM. 2015-04-30 14:43:19 -04:00
Gilles Gouaillardet
6b3126e69e ROMIO 3.1.4 refresh: add refresh notes 2015-04-30 19:02:20 +09:00
Gilles Gouaillardet
e1b6ab4f1d ROMIO 3.1.4 refresh: remove old romio 2015-04-30 19:01:23 +09:00
Gilles Gouaillardet
85e77079b4 ROMIO 3.1.4 refresh: use romio from mpich 3.1.4 2015-04-30 19:00:50 +09:00
Gilles Gouaillardet
92f6c7c1e2 ROMIO 3.1.4 refresh: apply post romio-3.1.4 patches 2015-04-30 18:56:53 +09:00
Gilles Gouaillardet
6400bc75ab ROMIO 3.1.4 refresh: patch romio for Open MPI 2015-04-30 18:53:55 +09:00
Gilles Gouaillardet
eacd434a02 ROMIO 3.1.4 refresh: import romio from mpich 3.1.4 tarball 2015-04-30 18:53:03 +09:00
Gilles Gouaillardet
e2e91142d5 ROMIO 3.1.4 refresh: prepare new romio directory 2015-04-30 18:52:22 +09:00
Gilles Gouaillardet
1ee58af8f5 update ROMIO .gitignore 2015-04-30 18:49:53 +09:00
Gilles Gouaillardet
697a866b6e ddt: correctly align next datatype description
This bug can be evidenced by the test/datatype/ddt_pack
test case on sparc architecture.
2015-04-30 15:04:54 +09:00
Ryan Grant
6ab91a6781 Merge pull request #561 from tkordenbrock/topic/mtl.fix.datatype.overflow
mtl-portals4: fix datatype overflow in ompi_mtl_portals4_long_isend()
2015-04-28 15:15:59 -06:00
Ryan Grant
cc3da91700 Merge pull request #562 from tkordenbrock/topic/mtl.expand.source.bits.to.24
mtl-portals4: expand the source field of the match bits to 24 bits
2015-04-28 14:31:06 -06:00
Rolf vandeVaart
91a8ec52ca Fix possible unintialized warnings 2015-04-28 16:25:35 -04:00
Todd Kordenbrock
8a4616f724 mtl-portals4: fix datatype overflow in ompi_mtl_portals4_long_isend()
The length parameter of ompi_mtl_portals4_long_isend() was declared
as "int", which may not be big enough depending on the platform and
compiler options used.  This commit changes the type to size_t to
prevent overflow.
2015-04-28 14:40:25 -05:00
Todd Kordenbrock
3e437f6184 mtl-portals4: expand the source field of the match bits to 24 bits
The source field was 16 bits which is not sufficient for many
current and future machines.  This commit expands the source field
to 24 bits and reduces the tag field from 32 bits to 24 bits.
2015-04-28 14:25:30 -05:00
Gilles Gouaillardet
18b75bd40d io/base: check the MCA version matches 2015-04-28 17:48:23 +09:00
Ralph Castain
3d46850c4d Per patch from Marco Atzeri, have the fortran wrapper links go directly to opal_wrapper to avoid breaks in the chain in some environments. 2015-04-25 17:09:06 -07:00
Yohann Burette
1be185ed87 mtl/ofi: Remove use of MR. 2015-04-24 15:55:21 -07:00
Nathan Hjelm
2716b8b1da osc/pt2pt: correct flush expected counts
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-04-24 13:34:21 -06:00
Nathan Hjelm
f1d09e55ec osc/pt2pt: silence warnings
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-23 15:35:47 -06:00
Nathan Hjelm
29b435a5a4 osc/pt2pt: fix bugs that caused incorrect fragment counting
This commit fixes a bug identified by MTT that occurred when mixing
passive and active target synchronization. The bugs fixed in this
commit are:

 - Do not update incoming fragment counts for any type of unbuffered
   control message. These messages are out-of-band and should not be
   considered towards the signal counts.

 - Complete a change from using received counts to expected counts for
   lock, unlock, and flush acks. Part of the change made it into
   master before the rest was ready. This was preventing wakeups in
   some cases.

 - Turn the passive_target_access_epoch module member into a
   counter. As long as at least one peer is locked we are in a
   passive-target epoch and not an active target one. This fix will
   ensure that fragment flags are set appropriately.

fixes #538

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-23 13:22:24 -06:00
Jeff Squyres
63c7520273 use-mpi-f08/Makefile.am: also link in libmpi_mpifh.la
Per mail from Macro Atzeri, we also need to link in libmpi_mpifh.la,
lest we exhaust relative offset addressing (e.g., in 32 bit builds).

See http://www.open-mpi.org/community/lists/devel/2015/04/17304.php.
2015-04-22 14:22:36 -07:00
Ryan Grant
1436417488 Merge pull request #547 from tkordenbrock/topic/mtl.add.logical.mode
mtl-portals4: add the option to use the Portals4 logical to physical mapping
2015-04-22 15:06:13 -06:00
Nathan Hjelm
7c95ecf859 mca/base: provide functions to determine if a framework is registered/open
This commit also fixes a problem with the lazy opening of topo
components. The topo framework incorrectly: 1) checked if the topo
framework was open by checking the length of the components list, and
2) called the framework open directly instead of using
mca_base_framework_open.

fixes #544

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-21 13:54:25 -06:00
Todd Kordenbrock
8e56002ec7 mtl-portals4: add missing return to portals4_init_interface() 2015-04-21 11:30:33 -05:00
Todd Kordenbrock
34c50fa963 mtl-portals4: move MD cleanup closer to failure
PtlMDRelease() was called if read_msg() returned a failure code.
This commit moves the PtlMDRelease() inside read_msg() so that it
doesn't get called in cases where the failure happens before or at
the PtlMDBind().
2015-04-21 11:30:33 -05:00
Todd Kordenbrock
422be76770 mtl-portals4: add a debug message for thread multiple mode 2015-04-21 11:30:33 -05:00
Todd Kordenbrock
35e5ffd001 mtl-portals4: add the option to use the Portals4 logical to physical table
This commit adds an MCA variable to select Portals4 logical
addressing, populates the logical-to-physical mapping table and
initializes the NI in this mode.
2015-04-21 11:30:33 -05:00
Yohann Burette
19607d2ce7 mtl/ofi: Remove memset() from progress path. 2015-04-20 14:12:39 -07:00
Yohann Burette
d2eda04801 mtl/ofi: Use fi_tinject for small messages. 2015-04-20 14:12:39 -07:00
Nathan Hjelm
033894b493 Merge pull request #541 from hjelmn/c99_components
C99 component initialization
2015-04-20 10:45:39 -06:00
Yohann Burette
ba1bc00df1 mtl/ofi: Remove FI_CANCEL. 2015-04-20 09:40:37 -07:00
Devendar Bureddy
19f5a3eff4 HCOLL: skip hcoll if enable_mpi_threads is true
reasons:
    1) default OCOMS is not configured with --enable-ocoms-multi-threads
    2) locking overheads
2015-04-20 19:39:49 +03:00
Devendar Bureddy
dd8e9fa176 HCOLL: enable by defaut 2015-04-20 19:39:30 +03:00
Nathan Hjelm
d251fa1525 pml/ob1: fix heterogenous build
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-20 09:27:00 -06:00
Howard Pritchard
3339274136 Merge pull request #542 from hppritcha/topic/coverity_714118
fcoll/two_phase: coverity fix
2015-04-20 05:42:12 -06:00
Howard Pritchard
de215addc6 fcoll/two_phase: coverity fix
fix CID 714118

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-04-18 14:34:48 -06:00
Nathan Hjelm
df75d0382f ompi: use C99 subobject naming for component initialization
This commit helps future-proof ompi components by initializing each
component member by name.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-18 10:29:58 -06:00
Yohann Burette
9392bb5ede mtl/ofi: Implement Probe/Mprobe/Mrecv using FI_PEEK/FI_CLAIM. 2015-04-17 16:42:13 -07:00
Nadezhda Kogteva
116169c38a opal timing: added ability to choose the timer type 2015-04-17 11:15:55 +03:00
Mangala Jyothi Bhaskar
c4de46e284 Fix number of aggregators used in two phase fcoll 2015-04-16 10:39:10 -05:00
Nathan Hjelm
3436f2917d Merge pull request #449 from hjelmn/mca_base_update
mca/base update
2015-04-16 08:41:48 -06:00
Nathan Hjelm
d5b52d3141 ompi/communicator: make comm_request internal variables static 2015-04-15 10:05:21 -06:00
Ralph Castain
a4b1225892 Don't register the PSM errhandler until it is certain that the PSM component can be used.
This doesn't matter on the master, but it does matter on the 1.8 branch as the MTL select logic is different over there.
2015-04-14 07:54:53 -07:00
Nathan Hjelm
113c890ccf Merge pull request #520 from hjelmn/valgrind_cleanness
fix memory leaks and valgrind errors
2015-04-13 10:09:34 -06:00
Jeff Squyres
49f52a5356 osc_sm_passive_target.c: update the check for lock types
Based on some on-list and IM discussion with @hjelmn about
open-mpi/ompi@40b7643119, change the testing to a switch/case.  If we
fall into the default case, assert() error (because it's an OMPI
developer programming error).
2015-04-13 12:02:15 -04:00
Jeff Squyres
40b7643119 osc_sm_passive_target.c: ensure ret is always defined
Fixes a compiler warning
2015-04-13 11:31:43 -04:00
Nathan Hjelm
a7b0c00ab6 fix memory leaks and valgrind errors
This commit fixes several vagrind errors. Included:

 - installdirs did not correctly reinitialize all pointers to NULL
   at close. This causes valgrind errors on a subsequent call to
   opal_init_tool.

 - several opal strings were leaked by opal_deregister_params which
   was setting them to NULL instead of letting them be freed by the
   MCA variable system.

 - move opal_net_init to AFTER the variable system is initialized and
   opal's MCA variables have been registered. opal_net_init uses a
   variable registered by opal_register_params!

 - do not leak ompi_mpi_main_thread when it is allocated by
   MPI_T_init_thread.

 - do not overwrite ompi_mpi_main_thread if it is already set (by
   MPI_T_init_thread).

 - mca_base_var: read_files was overwritting mca_base_var_file_list
   even if it was non-NULL.

 - mca_base_var: set all file global variables to initial states on
   finalize.

 - btl/vader: decrement enumerator reference count to ensure that it
   is freed.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-11 09:28:35 -06:00
Ralph Castain
3e44d3c9e3 Enable singletons to run without any active OOB module until they attempt to comm_spawn 2015-04-10 14:06:42 -07:00
Nathan Hjelm
eb56117405 Merge pull request #513 from hjelmn/mca_bug_fixes
opal: fix multiple bugs in MCA and opal
2015-04-08 10:29:44 -06:00
Nathan Hjelm
9cd955badf opal: fix multiple bugs in MCA and opal
This commit fixes the following bugs:

 - opal_output_finalize did not properly set internal state. This
   caused problems when calling the sequence opal_output_init (),
   opal_output_finalize (), opal_output_init ().

 - opal_info support called mca_base_open () but never called the
   matching mca_base_close (). mca_base_open () and mca_base_close ()
   have been updated to use a open count instead of an open flag to
   allow mca_base_open to be called through multiple paths (as may be
   the case when MPI_T is in use).

 - orte_info support did not register opal variables. This can cause
   orte-info to not return opal variables.

 - opal_info, orte_info, and ompi_info support have been updated to
   use a register count.

 - When opening the dl framework the reference count was added to
   ensure the framework stuck around. The framework being closed
   prematurely was a bug in the MCA base that has since been
   corrected. The increment (and associated decrement) have been
   removed.

 - dl/dlopen did not set the value of
   mca_dl_dlopen_component.filename_suffixes_mca_storage on each call
   to register. Instead the value was set in the component
   structure. This caused the value to be lost when re-loading the
   component. Fixed by setting the default value in register.

 - Reset shmem framework state on close to avoid returning a stale
   component after reloading opal/shmem.

 - MCA base parameters were not properly deregistered when the MCA
   base was closed.

This commit may fix #374.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-07 19:13:20 -06:00
Howard Pritchard
5ee18f4f00 Merge pull request #514 from hppritcha/topic/mpi_win_lock_all_man
man pages: fix problem with MPI_Win_lock_all
2015-04-07 17:17:30 -06:00