1
1
Граф коммитов

20878 Коммитов

Автор SHA1 Сообщение Дата
Ralph Castain
916f98a3ee Rename an HWLOC member of a union in the diff.h file to avoid a naming conflict with an external library - it isn't that HWLOC did something wrong, but rather that the name being used is so close to a type name that other folks has a tendency to #define it as well. We could argue with those folks that what they are doing is incorrect, but it is just easier to make a slight change and resolve the problem.
This commit was SVN r32675.
2014-09-07 15:42:05 +00:00
Edgar Gabriel
0f59ce6591 use the fbtl return value as originally intended, namely to retrieve the
number of bytes written and read. Status contains now the actual number of
bytes written for individual operations. For collective operations, this is
unfortunately not possible.

This commit was SVN r32674.
2014-09-07 15:14:57 +00:00
Ralph Castain
6323b226c7 Bring over some updates from the PMIx branch - mostly just minor cleanups. Make the direct grpcomm component no longer be the default. For now, we seem to be having problems with non-blocking fence operations, so make them not be the default under any scenario (e.g., when sm is the only btl in operation).
This commit was SVN r32673.
2014-09-06 19:19:44 +00:00
Ralph Castain
f1a33b6476 Use the accessor function to get the jobid and vpid
This commit was SVN r32672.
2014-09-06 19:18:21 +00:00
Howard Pritchard
fe2ea1f0fb fix handling of OPAL_DSTORE_LOCALITY and ref cnt
This commit was SVN r32671.
2014-09-05 21:36:19 +00:00
Gilles Gouaillardet
f0108f881f oshmem: silence warning
ensure OSHMEM_PROFILING is #define'd even if profiling is not supported

cmr=v1.8.3:reviewer=miked

This commit was SVN r32670.
2014-09-05 08:37:29 +00:00
Ralph Castain
ec51cbab9f We are failing to use the system dirname function because we are not correctly flagging that we found it. Modify opal_search_libs_core to set an "opal_have_foo" flag to indicate that we found the specified function, and then modify the have_dirname check to look for it.
cmr=v1.8.3:reviewer=jsquyres

This commit was SVN r32669.
2014-09-04 16:10:38 +00:00
Ralph Castain
41c6058153 Bring over changes to MXM from pmix branch:
MTL MXM: establish endpoint connection on the first communication when direct_modex used

This commit was SVN r32668.
2014-09-03 18:22:11 +00:00
Ralph Castain
a51d1d7a97 find_last_path_separator returns NULL if the filename doesn't contain a path separator in it - i.e., it's just a local file. So protect the loop to avoid a segfault
cmr=v1.8.3:reviewer=rolfv

This commit was SVN r32667.
2014-09-03 18:13:42 +00:00
Ralph Castain
94ffca4901 Correct the cutoff point for full modex operation as it is based on the number of nodes in the system, not the number of procs in the signature.
This commit was SVN r32666.
2014-09-03 17:28:12 +00:00
Ralph Castain
3fed455bbc If something goes wrong in add_procs, let's not segfault during finalize
This commit was SVN r32665.
2014-09-03 17:27:31 +00:00
Ralph Castain
2bfb18e004 Resolve some race conditions when async pmix modex modes are invoked. Since calls to "get" data can come both locally and remotely before data for a given proc has actually been received, we have to track all requests that cannot be immediately fulfilled and provide the data once it has been received.
This commit was SVN r32664.
2014-09-02 20:04:17 +00:00
Ralph Castain
b372cd02d0 Ensure the hwloc headers get installed when --with-devel-headers is given
This commit was SVN r32663.
2014-09-02 19:58:25 +00:00
Ralph Castain
4d186e6402 Properly protect the MCA parameters being registered by the OOB/TCP component when IPv6 is enabled
cmr=v1.8.3:reviewer=jsquyres

This commit was SVN r32662.
2014-09-02 14:53:00 +00:00
Ralph Castain
1c4870357b Per patch submitted by J. Randall, add missing library to LSF integration
cmr=v1.8.3:reviewer=rhc

This commit was SVN r32661.
2014-09-02 00:38:07 +00:00
Ralph Castain
f2b26bde4c Resolve a race condition that could cause us to hang during abnormal terminations due to multi-counting num_terminated
This commit was SVN r32660.
2014-09-02 00:32:52 +00:00
Gilles Gouaillardet
edfbeba7bf coll/ml: better error handling
when CHECK_AND_RECYCLE detects an error, a message is displayed
if the error occurs on an intrinsic communicator, then abort
the program (instead of trying to free the communicator)

cmr=v1.8.3:reviewer=hjelmn

This commit was SVN r32659.
2014-09-01 10:00:49 +00:00
Gilles Gouaillardet
c2bcda518f oshmem: shpalloc returns the errcode as described in OpenSHMEM 1.1 api
cmr=v1.8.3:reviewer=jladd

This commit was SVN r32658.
2014-09-01 08:14:13 +00:00
Ralph Castain
aae1bb4f44 Silence warning
This commit was SVN r32657.
2014-08-31 08:10:35 +00:00
Ralph Castain
d13fb37ef9 Add array types to opal_value_t
This commit was SVN r32656.
2014-08-31 08:07:03 +00:00
Ralph Castain
9500939042 Fix abstraction violation
This commit was SVN r32655.
2014-08-31 08:06:35 +00:00
Mike Dubman
4497dada00 build: add "-verok" flag to ignore autotools version check and continue anyway.
This commit was SVN r32654.
2014-08-31 07:27:34 +00:00
MPI Team
f5905c7111 Update git/hg ignore files
This commit was SVN r32653.
2014-08-31 05:00:20 +00:00
Ralph Castain
60eb7124ab Upgrade to hwloc 1.9.1
This commit was SVN r32652.
2014-08-31 03:13:06 +00:00
Ralph Castain
e49ca05f11 Remove unused variable
This commit was SVN r32651.
2014-08-31 03:11:50 +00:00
Ralph Castain
5cdbc00136 Re-enable the usock oob component. Ensure the TCP component promotes messages for other procs to the OOB base so that other components have a chance to send the relay. Seems to be passing MTT, so let's see how it works for others.
This commit was SVN r32650.
2014-08-30 19:33:46 +00:00
Ralph Castain
a2085a5916 Fix the PSM transport key generator to match prior releases
This commit was SVN r32649.
2014-08-30 00:48:25 +00:00
Ralph Castain
9ac75451ff Nathan had requested this before as he needs to know the #procs in the job to optimize the UGNI btl. Add the fetch for that data - the native pmix component already provides it, but ensure the Slurm PMI-1 support does too. If not found, fall back to the non-optimized number
This commit was SVN r32648.
2014-08-29 22:53:35 +00:00
Ralph Castain
cb0739dfd4 Update the regex to resolve a bug
This commit was SVN r32647.
2014-08-29 22:24:20 +00:00
Ralph Castain
f865ef61ab Need local_size returned by the Slurm components
This commit was SVN r32646.
2014-08-29 22:23:27 +00:00
Howard Pritchard
9a2891f2d6 handle PMIX_LOCAL_SIZE attr arg in cray pmix
This commit was SVN r32645.
2014-08-29 21:18:02 +00:00
Ralph Castain
8faabed2cd Add some further initialization and protection for zero-byte messages
This commit was SVN r32644.
2014-08-29 17:24:55 +00:00
Ralph Castain
2b225e3776 Cleanup a race condition regarding marking that waitpid_fired. We should always mark it as fired when we enter the wait_local_proc routine, and also mark it as no longer alive if iof_complete has also been found. If other places in the code also update those flags, there is no harm done.
This commit was SVN r32643.
2014-08-29 17:03:31 +00:00
Gilles Gouaillardet
6916bfc368 btl/openib: fix use of mca_btl_openib_component.default_recv_qps
- do not have mca_btl_openib_component.default_recv_qps point to the stack
- do not reset mca_btl_openib_component.default_recv_qps in btl_openib_component_open

cmr=v1.8.3:reviewer=miked

This commit was SVN r32642.
2014-08-29 04:41:34 +00:00
Gilles Gouaillardet
b8a2e90f2d btl/openib: fix a typo
cmr=v1.8.3:reviewer=miked

This commit was SVN r32639.
2014-08-29 04:23:42 +00:00
Ralph Castain
730e28349e Some minor uninitialized variable cleanups
This commit was SVN r32629.
2014-08-29 02:21:13 +00:00
Jeff Squyres
733316372b usnic: remove suggestion of enabling no-drop in the fabric
Reviewed by Reese Faucette

cmr=v1.8.3:reviewer=ompi-rm1.8

This commit was SVN r32628.
2014-08-28 23:56:56 +00:00
Jeff Squyres
f4238d65a5 fortran: also provide PMPI variants for MPI_Alloc_mem_cptr
r32622 was the first half of the fix -- we need the PMPI variants as well.

Refs trac:4882

This commit was SVN r32627.

The following SVN revision numbers were found above:
  r32622 --> open-mpi/ompi@cf0f734a98

The following Trac tickets were found above:
  Ticket 4882 --> https://svn.open-mpi.org/trac/ompi/ticket/4882
2014-08-28 23:47:38 +00:00
Howard Pritchard
2a12fd833d Fix compile problem from pmix merge
This commit was SVN r32626.
2014-08-28 22:14:12 +00:00
Howard Pritchard
51c73f116b switch check for ugni to use pkg-config
deprecate with-ugni in lanl/cray_xe6 platform file

This commit was SVN r32625.
2014-08-28 22:03:41 +00:00
Ralph Castain
fafdbeec0c Cleanup and enable the new daemon collective modules for more scalable operations. Thanks to Nadezhda Kogteva (Mellanox) for doing them.
This commit was SVN r32624.
2014-08-28 20:35:35 +00:00
Jeff Squyres
008454a4a8 make_dist_tarball: check for a dirty source tree
Have the make_dist_tarball script check to ensure that the source tree
is clean before continuing.  This ensures that we don't accidentally
build a distribution tarball with something that is not committed in
the repo.

There is a --dirtyok option to override this check, and if you access
this script via the "make_tarball" link, --dirtyok is added to the
default set of options.

cmr=v1.8.3:reviewer=rhc

This commit was SVN r32623.
2014-08-28 13:26:33 +00:00
Gilles Gouaillardet
cf0f734a98 Fortran: add mpi_alloc_mem_cptr like bindings when configured with --without-weak-symbols
cmr=v1.8.3:reviewer=jsquyres

This commit was SVN r32622.
2014-08-28 09:34:54 +00:00
Ralph Castain
b554cd7d86 Turn off the coll/ml component if --without-hwloc was given
cmr=v1.8.3:reviewer=jsquyres

This commit was SVN r32621.
2014-08-27 20:25:39 +00:00
Ralph Castain
731a878ff3 Add a bunch of debug to help track down the problem, and eventually find another place where comparison of signatures was incorrectly performed - use the dss compare operation to be consistent and safe
This commit was SVN r32620.
2014-08-27 19:52:20 +00:00
Ralph Castain
5fb7c7d23b Don't explicitly add the hostname to the data fetch when we already cached a remote blob
This commit was SVN r32619.
2014-08-27 16:18:05 +00:00
Ralph Castain
3c24770bce Protect debug printing on backend nodes
This commit was SVN r32618.
2014-08-27 16:17:28 +00:00
Ralph Castain
b87b69e977 Ensure the nodes get added to the job map on the remote nodes, add some debug to grpcomm daemon array construction
This commit was SVN r32617.
2014-08-27 16:16:46 +00:00
Ralph Castain
842aaf6167 Correctly end mapping oversubscribed nodes round-robin byslot
cmr=v1.8.3:reviewer=rhc

This commit was SVN r32616.
2014-08-27 16:15:18 +00:00
Jeff Squyres
d85527701a Fix MPI_COMM_SPLIT_TYPE with MPI_UNDEFINED
Thanks to Lisandro Dalcin for identifying the problem.

Fixes trac:4876

Submitted by George Boscila, reviewed by Jeff Squyres.

cmr=v1.8.3:reviewer=ompi-rm1.8

This commit was SVN r32615.

The following Trac tickets were found above:
  Ticket 4876 --> https://svn.open-mpi.org/trac/ompi/ticket/4876
2014-08-27 12:17:33 +00:00