Mangala Jyothi Bhaskar
4ff21d6178
Fixed offset data type in communication
...
This commit was SVN r32709.
2014-09-11 14:51:07 +00:00
Mangala Jyothi Bhaskar
6e5f2c8ae8
Fixed offset data type in communication
...
This commit was SVN r32708.
2014-09-11 14:50:30 +00:00
Ralph Castain
9e7e90265f
Temporarily make the direct grpcomm component the default until we can debug the other modules
...
This commit was SVN r32707.
2014-09-11 14:47:54 +00:00
Edgar Gabriel
4ccc0f5ea2
the length of the iov array should be limited to IOV_MAX, which is defined in limits.h
...
This commit was SVN r32706.
2014-09-10 21:59:45 +00:00
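A minimal sketch of the idea behind r32706, assuming the iov array is eventually handed to a vectored call such as `writev`. The helper name `writev_batched` is hypothetical and is not taken from the Open MPI source. It only illustrates capping each call at IOV_MAX entries.

```c
#include <limits.h>     /* IOV_MAX, where the platform exposes it */
#include <stddef.h>
#include <sys/types.h>  /* ssize_t */
#include <sys/uio.h>    /* struct iovec, writev */

#ifndef IOV_MAX
#define IOV_MAX 16      /* assumption: fall back to the POSIX minimum (_XOPEN_IOV_MAX) */
#endif

/* Hypothetical helper: write `count` iovec entries in batches of at most
 * IOV_MAX entries per writev call. Returns the total number of bytes
 * written, or -1 on the first writev error. Short writes are not retried
 * in this sketch. */
static ssize_t writev_batched(int fd, const struct iovec *iov, size_t count)
{
    ssize_t total = 0;
    size_t done = 0;

    while (done < count) {
        size_t batch = count - done;
        if (batch > (size_t)IOV_MAX) {
            batch = (size_t)IOV_MAX;   /* never pass more than IOV_MAX entries */
        }
        ssize_t n = writev(fd, iov + done, (int)batch);
        if (n < 0) {
            return -1;
        }
        total += n;
        done += batch;
    }
    return total;
}
```

Passing more than IOV_MAX entries in a single call typically fails with EINVAL, which is why the batch size is clamped before each call.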
Ralph Castain
a90f12ad1d
Save the clang settings for detecting alignment issues - I don't want to have to remember the cmd line jargon
...
This commit was SVN r32705.
2014-09-10 18:56:19 +00:00
Ralph Castain
cb2ad98f57
Silence an unused function warning
...
This commit was SVN r32704.
2014-09-10 17:36:34 +00:00
Ralph Castain
4eb6291334
Avoid conflicts when multiple collectives are underway in ORTE by giving each grpcomm component its own RML tag and posting persistent receives. We use the signature anyway to determine which collective the received message is addressing, so there is no need to post non-persistent receives.
...
This commit was SVN r32703.
2014-09-10 17:36:16 +00:00
Ralph Castain
a7c5b77d70
Just because the openib BTL can't reach a process doesn't mean it is a job-ending error. If we have other methods for reaching the process (e.g., sm for a local proc), then that's okay. If there is no method for reaching a proc, then that's an error - but the BML will report that situation.
...
The question of whether or not the openib BTL supports loopback is a separate question. It may be more appropriate to make the modex be PMIX_GLOBAL for cases where openib can support loopback so someone can run without a shared memory component. I'll leave that decision to the IB vendors.
This commit was SVN r32702.
2014-09-10 17:02:16 +00:00
Ralph Castain
ea11e63f59
Per patch from Tetsuya, allow the user to bind-to none when specifying multiple pe's/rank as requested by Reuti. This allows the user to reserve multiple "slots" in the allocation for each process while mapping, but not to bind the process to specific processing elements on the node.
...
Reviewed by rhc, so RM-approved to go across to v1.8.3
cmr=v1.8.3:reviewer=ompi-gk1.8
This commit was SVN r32701.
2014-09-10 15:52:18 +00:00
Edgar Gabriel
cc46b65a5e
the fbtl interfaces should really be an ssize_t not a size_t, since the return
...
value could be negative, which is allowed for ssize_t, but not for size_t.
This commit was SVN r32700.
2014-09-10 15:01:54 +00:00
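The reasoning in r32700 can be illustrated with a small sketch. The function `sketch_write` is a hypothetical stand-in for an I/O routine, not the fbtl interface itself. It returns the byte count on success and -1 on failure, and the two helpers show what the caller sees with a signed versus an unsigned result type.

```c
#include <stddef.h>
#include <sys/types.h>  /* ssize_t */

/* Hypothetical stand-in for an I/O routine: returns the number of bytes
 * "written", or -1 on error (here: a NULL buffer). */
static ssize_t sketch_write(const void *buf, size_t len)
{
    if (buf == NULL) {
        return -1;
    }
    return (ssize_t)len;
}

/* With ssize_t the error is visible as a negative value. */
static int error_detected_signed(void)
{
    ssize_t n = sketch_write(NULL, 8);
    return n < 0;
}

/* Stored in a size_t, the same -1 wraps around to a huge positive count. */
static size_t error_as_unsigned(void)
{
    return (size_t)sketch_write(NULL, 8);
}
```

With a `size_t` return type the caller would have to compare against a sentinel such as `(size_t)-1`, which is easy to mistake for a very large but valid byte count.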
Edgar Gabriel
599cb7b351
update the pvfs2 fbtl to return the number of bytes generated.
...
This commit was SVN r32699.
2014-09-10 13:32:06 +00:00
Ralph Castain
93948f0c4e
Resolve alignment issues when unpacking buffers
...
cmr=v1.8.3:reviewer=jsquyres
This commit was SVN r32698.
2014-09-10 10:19:16 +00:00
Gilles Gouaillardet
e71452d73a
Revert r32696
...
This commit was SVN r32697.
The following SVN revision numbers were found above:
r32696 --> open-mpi/ompi@e4c3500166
2014-09-10 04:35:47 +00:00
Gilles Gouaillardet
e4c3500166
Fix MPI_Status_set_elements[_x] for non predefined datatypes
...
Fixes trac:4896
cmr=v1.8.3:reviewer=bosilca
This commit was SVN r32696.
The following Trac tickets were found above:
Ticket 4896 --> https://svn.open-mpi.org/trac/ompi/ticket/4896
2014-09-10 02:41:29 +00:00
Ralph Castain
e671620ac7
Per request from Jeff: tune up the help messages for binding options
...
Refs trac:4898
This commit was SVN r32691.
The following Trac tickets were found above:
Ticket 4898 --> https://svn.open-mpi.org/trac/ompi/ticket/4898
2014-09-09 22:39:22 +00:00
Edgar Gabriel
3a5f4f72da
make the zero byte read/write scenarios work without the contiguous flag.
...
This commit was SVN r32690.
2014-09-09 16:26:14 +00:00
Edgar Gabriel
6a607caed8
fix some zero byte allocation scenarios.
...
This commit was SVN r32689.
2014-09-09 16:25:44 +00:00
Gilles Gouaillardet
63209eac5b
orte/util: use ORTE_JOB_FAMILY and ORTE_LOCAL_JOBID macros
...
This commit was SVN r32688.
2014-09-09 05:13:00 +00:00
Ralph Castain
4207b4c4ad
Improve the --bind-to help message to better indicate the default options under various values of np. Remove the warning message if the user doesn't specify a binding policy and we are overloaded
...
cmr=v1.8.3:reviewer=jsquyres
This commit was SVN r32687.
2014-09-08 21:03:51 +00:00
Ralph Castain
4df1aa63f7
Since we've run into the situation where someone puts a script wrapper around a launcher such as srun, we need to always protect MCA cmd line params with quotes. This means we also need to protect the backend from quotes coming into the system as part of a value, or else the parser gets confused.
...
So add a new function for wrapping MCA arguments, and tell the backend parser to ignore/remove leading/trailing quotes.
cmr=v1.8.3:reviewer=jsquyres
This commit was SVN r32686.
2014-09-08 20:38:46 +00:00
Ralph Castain
5649841e26
Provide missing include file - generates errors when used with Intel compilers
...
This commit was SVN r32685.
2014-09-08 19:04:40 +00:00
Edgar Gabriel
ed02927767
- do not set the contiguous memory option in the collective operations. It
...
should not be stored on the file handle anyway, since it is not a property of
the file.
- protect a realloc for zero byte scenarios.
This commit was SVN r32678.
2014-09-07 18:09:43 +00:00
Edgar Gabriel
0d425e2f74
resetting the counter for the iov array has to happen outside of the if statement.
...
This commit was SVN r32677.
2014-09-07 16:30:56 +00:00
Ralph Castain
e32d541c8d
Bring over a slight modification to the opal_init_test routine
...
This commit was SVN r32676.
2014-09-07 15:46:53 +00:00
Ralph Castain
916f98a3ee
Rename an HWLOC member of a union in the diff.h file to avoid a naming conflict with an external library - it isn't that HWLOC did something wrong, but rather that the name being used is so close to a type name that other folks have a tendency to #define it as well. We could argue with those folks that what they are doing is incorrect, but it is just easier to make a slight change and resolve the problem.
...
This commit was SVN r32675.
2014-09-07 15:42:05 +00:00
Edgar Gabriel
0f59ce6591
use the fbtl return value as originally intended, namely to retrieve the
...
number of bytes written and read. Status now contains the actual number of
bytes written for individual operations. For collective operations, this is
unfortunately not possible.
This commit was SVN r32674.
2014-09-07 15:14:57 +00:00
Ralph Castain
6323b226c7
Bring over some updates from the PMIx branch - mostly just minor cleanups. Make the direct grpcomm component no longer be the default. For now, we seem to be having problems with non-blocking fence operations, so make them not be the default under any scenario (e.g., when sm is the only btl in operation).
...
This commit was SVN r32673.
2014-09-06 19:19:44 +00:00
Ralph Castain
f1a33b6476
Use the accessor function to get the jobid and vpid
...
This commit was SVN r32672.
2014-09-06 19:18:21 +00:00
Howard Pritchard
fe2ea1f0fb
fix handling of OPAL_DSTORE_LOCALITY and ref cnt
...
This commit was SVN r32671.
2014-09-05 21:36:19 +00:00
Gilles Gouaillardet
f0108f881f
oshmem: silence warning
...
ensure OSHMEM_PROFILING is #define'd even if profiling is not supported
cmr=v1.8.3:reviewer=miked
This commit was SVN r32670.
2014-09-05 08:37:29 +00:00
Ralph Castain
ec51cbab9f
We are failing to use the system dirname function because we are not correctly flagging that we found it. Modify opal_search_libs_core to set an "opal_have_foo" flag to indicate that we found the specified function, and then modify the have_dirname check to look for it.
...
cmr=v1.8.3:reviewer=jsquyres
This commit was SVN r32669.
2014-09-04 16:10:38 +00:00
Ralph Castain
41c6058153
Bring over changes to MXM from pmix branch:
...
MTL MXM: establish endpoint connection on the first communication when direct_modex used
This commit was SVN r32668.
2014-09-03 18:22:11 +00:00
Ralph Castain
a51d1d7a97
find_last_path_separator returns NULL if the filename doesn't contain a path separator in it - i.e., it's just a local file. So protect the loop to avoid a segfault
...
cmr=v1.8.3:reviewer=rolfv
This commit was SVN r32667.
2014-09-03 18:13:42 +00:00
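A simplified sketch of the kind of guard r32667 describes. The function `last_sep` is a hypothetical stand-in built on `strrchr` and assumes `/` is the only path separator. The real `find_last_path_separator` and the loop it feeds are not reproduced here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in: returns a pointer to the last '/' in path,
 * or NULL if the name contains no separator at all. */
static const char *last_sep(const char *path)
{
    return strrchr(path, '/');
}

/* Return the final path component. The NULL check is the point of the
 * sketch: without it, a bare filename such as "a.out" would lead to a
 * dereference of a NULL pointer. */
static const char *base_name(const char *path)
{
    const char *sep = last_sep(path);
    if (sep == NULL) {
        return path;        /* just a local file, nothing to strip */
    }
    return sep + 1;
}
```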
Ralph Castain
94ffca4901
Correct the cutoff point for full modex operation as it is based on the number of nodes in the system, not the number of procs in the signature.
...
This commit was SVN r32666.
2014-09-03 17:28:12 +00:00
Ralph Castain
3fed455bbc
If something goes wrong in add_procs, let's not segfault during finalize
...
This commit was SVN r32665.
2014-09-03 17:27:31 +00:00
Ralph Castain
2bfb18e004
Resolve some race conditions when async pmix modex modes are invoked. Since calls to "get" data can come both locally and remotely before data for a given proc has actually been received, we have to track all requests that cannot be immediately fulfilled and provide the data once it has been received.
...
This commit was SVN r32664.
2014-09-02 20:04:17 +00:00
Ralph Castain
b372cd02d0
Ensure the hwloc headers get installed when --with-devel-headers is given
...
This commit was SVN r32663.
2014-09-02 19:58:25 +00:00
Ralph Castain
4d186e6402
Properly protect the MCA parameters being registered by the OOB/TCP component when IPv6 is enabled
...
cmr=v1.8.3:reviewer=jsquyres
This commit was SVN r32662.
2014-09-02 14:53:00 +00:00
Ralph Castain
1c4870357b
Per patch submitted by J. Randall, add missing library to LSF integration
...
cmr=v1.8.3:reviewer=rhc
This commit was SVN r32661.
2014-09-02 00:38:07 +00:00
Ralph Castain
f2b26bde4c
Resolve a race condition that could cause us to hang during abnormal terminations due to multi-counting num_terminated
...
This commit was SVN r32660.
2014-09-02 00:32:52 +00:00
Gilles Gouaillardet
edfbeba7bf
coll/ml: better error handling
...
when CHECK_AND_RECYCLE detects an error, a message is displayed.
if the error occurs on an intrinsic communicator, then abort
the program (instead of trying to free the communicator).
cmr=v1.8.3:reviewer=hjelmn
This commit was SVN r32659.
2014-09-01 10:00:49 +00:00
Gilles Gouaillardet
c2bcda518f
oshmem: shpalloc returns the errcode as described in OpenSHMEM 1.1 api
...
cmr=v1.8.3:reviewer=jladd
This commit was SVN r32658.
2014-09-01 08:14:13 +00:00
Ralph Castain
aae1bb4f44
Silence warning
...
This commit was SVN r32657.
2014-08-31 08:10:35 +00:00
Ralph Castain
d13fb37ef9
Add array types to opal_value_t
...
This commit was SVN r32656.
2014-08-31 08:07:03 +00:00
Ralph Castain
9500939042
Fix abstraction violation
...
This commit was SVN r32655.
2014-08-31 08:06:35 +00:00
Mike Dubman
4497dada00
build: add "-verok" flag to ignore autotools version check and continue anyway.
...
This commit was SVN r32654.
2014-08-31 07:27:34 +00:00
MPI Team
f5905c7111
Update git/hg ignore files
...
This commit was SVN r32653.
2014-08-31 05:00:20 +00:00
Ralph Castain
60eb7124ab
Upgrade to hwloc 1.9.1
...
This commit was SVN r32652.
2014-08-31 03:13:06 +00:00
Ralph Castain
e49ca05f11
Remove unused variable
...
This commit was SVN r32651.
2014-08-31 03:11:50 +00:00
Ralph Castain
5cdbc00136
Re-enable the usock oob component. Ensure the TCP component promotes messages for other procs to the OOB base so that other components have a chance to send the relay. Seems to be passing MTT, so let's see how it works for others.
...
This commit was SVN r32650.
2014-08-30 19:33:46 +00:00