Commit Graph

20902 Commits

Author SHA1 Message Date
Mangala Jyothi Bhaskar
4ff21d6178 Fixed offset data type in communication
This commit was SVN r32709.
2014-09-11 14:51:07 +00:00
Mangala Jyothi Bhaskar
6e5f2c8ae8 Fixed offset data type in communication
This commit was SVN r32708.
2014-09-11 14:50:30 +00:00
Ralph Castain
9e7e90265f Temporarily make the direct grpcomm component the default until we can debug the other modules
This commit was SVN r32707.
2014-09-11 14:47:54 +00:00
Edgar Gabriel
4ccc0f5ea2 the length of the iov array should be limited to IOV_MAX, which is defined in limits.h
This commit was SVN r32706.
2014-09-10 21:59:45 +00:00
Ralph Castain
a90f12ad1d Save the clang settings for detecting alignment issues - I don't want to have to remember the cmd line jango
This commit was SVN r32705.
2014-09-10 18:56:19 +00:00
Ralph Castain
cb2ad98f57 Silence an unused function warning
This commit was SVN r32704.
2014-09-10 17:36:34 +00:00
Ralph Castain
4eb6291334 Avoid conflicts when multiple collectives are underway in ORTE by giving each grpcomm component its own RML tag and posting persistent receives. We use the signature anyway to determine which collective the received message is addressing, so there is no need to post non-persistent receives.
This commit was SVN r32703.
2014-09-10 17:36:16 +00:00
Ralph Castain
a7c5b77d70 Just because the openib BTL can't reach a process doesn't mean it is a job-ending error. If we have other methods for reaching the process (e.g., sm for a local proc), then that's okay. If there is no method for reaching a proc, then that's an error - but the BML will report that situation.
The question of whether or not the openib BTL supports loopback is a separate question. It may be more appropriate to make the modex be PMIX_GLOBAL for cases where openib can support loopback so someone can run without a shared memory component. I'll leave that decision to the IB vendors.

This commit was SVN r32702.
2014-09-10 17:02:16 +00:00
Ralph Castain
ea11e63f59 Per patch from Tetsuya, allow the user to bind-to none when specifying multiple pe's/rank as requested by Reuti. This allows the user to reserve multiple "slots" in the allocation for each process while mapping, but not to bind the process to specific processing elements on the node.
Reviewed by rhc, so RM-approved to go across to v1.8.3

cmr=v1.8.3:reviewer=ompi-gk1.8

This commit was SVN r32701.
2014-09-10 15:52:18 +00:00
Edgar Gabriel
cc46b65a5e the fbtl interfaces should really be an ssize_t not a size_t, since the return
value could be negative, which is allowed for ssize_t, but not for size_t.

This commit was SVN r32700.
2014-09-10 15:01:54 +00:00
Edgar Gabriel
599cb7b351 update the pvfs2 fbtl to return the number of bytes generated.
This commit was SVN r32699.
2014-09-10 13:32:06 +00:00
Ralph Castain
93948f0c4e Resolve alignment issues when unpacking buffers
cmr=v1.8.3:reviewer=jsquyres

This commit was SVN r32698.
2014-09-10 10:19:16 +00:00
Gilles Gouaillardet
e71452d73a Revert r32696
This commit was SVN r32697.

The following SVN revision numbers were found above:
  r32696 --> open-mpi/ompi@e4c3500166
2014-09-10 04:35:47 +00:00
Gilles Gouaillardet
e4c3500166 Fix MPI_Status_set_elements[_x] for non predefined datatypes
Fixes trac:4896

cmr=v1.8.3:reviewer=bosilca

This commit was SVN r32696.

The following Trac tickets were found above:
  Ticket 4896 --> https://svn.open-mpi.org/trac/ompi/ticket/4896
2014-09-10 02:41:29 +00:00
Ralph Castain
e671620ac7 Per request from Jeff: tune up the help messages for binding options
Refs trac:4898

This commit was SVN r32691.

The following Trac tickets were found above:
  Ticket 4898 --> https://svn.open-mpi.org/trac/ompi/ticket/4898
2014-09-09 22:39:22 +00:00
Edgar Gabriel
3a5f4f72da make the zero byte read/write scenarios work without the contiguous flag.
This commit was SVN r32690.
2014-09-09 16:26:14 +00:00
Edgar Gabriel
6a607caed8 fix some zero byte allocation scenarios.
This commit was SVN r32689.
2014-09-09 16:25:44 +00:00
Gilles Gouaillardet
63209eac5b orte/util: use ORTE_JOB_FAMILY and ORTE_LOCAL_JOBID macros
This commit was SVN r32688.
2014-09-09 05:13:00 +00:00
Ralph Castain
4207b4c4ad Improve the --bind-to help message to better indicate the default options under various values of np. Remove the warning message if the user doesn't specify a binding policy and we are overloaded
cmr=v1.8.3:reviewer=jsquyres

This commit was SVN r32687.
2014-09-08 21:03:51 +00:00
Ralph Castain
4df1aa63f7 Since we've run into the situation where someone puts a script wrapper around a launcher such as srun, we need to always protect MCA cmd line params with quotes. This means we also need to protect the backend from quotes coming into the system as part of a value, or else the parser gets confused.
So add a new function for wrapping MCA arguments, and tell the backend parser to ignore/remove leading/trailing quotes.

cmr=v1.8.3:reviewer=jsquyres

This commit was SVN r32686.
2014-09-08 20:38:46 +00:00
Ralph Castain
5649841e26 Provide missing include file - generates errors when used with Intel compilers
This commit was SVN r32685.
2014-09-08 19:04:40 +00:00
Edgar Gabriel
ed02927767 - do not set the contiguous memory option in the collective operations. It
should not be stored on the file handle anyway, since it is not a property of
the file.
- protect a realloc for zero byte scenarios.

This commit was SVN r32678.
2014-09-07 18:09:43 +00:00
Edgar Gabriel
0d425e2f74 resetting the counter for the iov array has to happen outside of the if statement.
This commit was SVN r32677.
2014-09-07 16:30:56 +00:00
Ralph Castain
e32d541c8d Bring over a slight modification to the opal_init_test routine
This commit was SVN r32676.
2014-09-07 15:46:53 +00:00
Ralph Castain
916f98a3ee Rename an HWLOC member of a union in the diff.h file to avoid a naming conflict with an external library - it isn't that HWLOC did something wrong, but rather that the name being used is so close to a type name that other folks have a tendency to #define it as well. We could argue with those folks that what they are doing is incorrect, but it is just easier to make a slight change and resolve the problem.
This commit was SVN r32675.
2014-09-07 15:42:05 +00:00
Edgar Gabriel
0f59ce6591 use the fbtl return value as originally intended, namely to retrieve the
number of bytes written and read. Status contains now the actual number of
bytes written for individual operations. For collective operations, this is
unfortunately not possible.

This commit was SVN r32674.
2014-09-07 15:14:57 +00:00
Ralph Castain
6323b226c7 Bring over some updates from the PMIx branch - mostly just minor cleanups. Make the direct grpcomm component no longer be the default. For now, we seem to be having problems with non-blocking fence operations, so make them not be the default under any scenario (e.g., when sm is the only btl in operation).
This commit was SVN r32673.
2014-09-06 19:19:44 +00:00
Ralph Castain
f1a33b6476 Use the accessor function to get the jobid and vpid
This commit was SVN r32672.
2014-09-06 19:18:21 +00:00
Howard Pritchard
fe2ea1f0fb fix handling of OPAL_DSTORE_LOCALITY and ref cnt
This commit was SVN r32671.
2014-09-05 21:36:19 +00:00
Gilles Gouaillardet
f0108f881f oshmem: silence warning
ensure OSHMEM_PROFILING is #define'd even if profiling is not supported

cmr=v1.8.3:reviewer=miked

This commit was SVN r32670.
2014-09-05 08:37:29 +00:00
Ralph Castain
ec51cbab9f We are failing to use the system dirname function because we are not correctly flagging that we found it. Modify opal_search_libs_core to set an "opal_have_foo" flag to indicate that we found the specified function, and then modify the have_dirname check to look for it.
cmr=v1.8.3:reviewer=jsquyres

This commit was SVN r32669.
2014-09-04 16:10:38 +00:00
Ralph Castain
41c6058153 Bring over changes to MXM from pmix branch:
MTL MXM: establish endpoint connection on the first communication when direct_modex used

This commit was SVN r32668.
2014-09-03 18:22:11 +00:00
Ralph Castain
a51d1d7a97 find_last_path_separator returns NULL if the filename doesn't contain a path separator in it - i.e., it's just a local file. So protect the loop to avoid a segfault
cmr=v1.8.3:reviewer=rolfv

This commit was SVN r32667.
2014-09-03 18:13:42 +00:00
Ralph Castain
94ffca4901 Correct the cutoff point for full modex operation as it is based on the number of nodes in the system, not the number of procs in the signature.
This commit was SVN r32666.
2014-09-03 17:28:12 +00:00
Ralph Castain
3fed455bbc If something goes wrong in add_procs, let's not segfault during finalize
This commit was SVN r32665.
2014-09-03 17:27:31 +00:00
Ralph Castain
2bfb18e004 Resolve some race conditions when async pmix modex modes are invoked. Since calls to "get" data can come both locally and remotely before data for a given proc has actually been received, we have to track all requests that cannot be immediately fulfilled and provide the data once it has been received.
This commit was SVN r32664.
2014-09-02 20:04:17 +00:00
Ralph Castain
b372cd02d0 Ensure the hwloc headers get installed when --with-devel-headers is given
This commit was SVN r32663.
2014-09-02 19:58:25 +00:00
Ralph Castain
4d186e6402 Properly protect the MCA parameters being registered by the OOB/TCP component when IPv6 is enabled
cmr=v1.8.3:reviewer=jsquyres

This commit was SVN r32662.
2014-09-02 14:53:00 +00:00
Ralph Castain
1c4870357b Per patch submitted by J. Randall, add missing library to LSF integration
cmr=v1.8.3:reviewer=rhc

This commit was SVN r32661.
2014-09-02 00:38:07 +00:00
Ralph Castain
f2b26bde4c Resolve a race condition that could cause us to hang during abnormal terminations due to multi-counting num_terminated
This commit was SVN r32660.
2014-09-02 00:32:52 +00:00
Gilles Gouaillardet
edfbeba7bf coll/ml: better error handling
when CHECK_AND_RECYCLE detects an error, a message is displayed
if the error occurs on an intrinsic communicator, then abort
the program (instead of trying to free the communicator)

cmr=v1.8.3:reviewer=hjelmn

This commit was SVN r32659.
2014-09-01 10:00:49 +00:00
Gilles Gouaillardet
c2bcda518f oshmem: shpalloc returns the errcode as described in OpenSHMEM 1.1 api
cmr=v1.8.3:reviewer=jladd

This commit was SVN r32658.
2014-09-01 08:14:13 +00:00
Ralph Castain
aae1bb4f44 Silence warning
This commit was SVN r32657.
2014-08-31 08:10:35 +00:00
Ralph Castain
d13fb37ef9 Add array types to opal_value_t
This commit was SVN r32656.
2014-08-31 08:07:03 +00:00
Ralph Castain
9500939042 Fix abstraction violation
This commit was SVN r32655.
2014-08-31 08:06:35 +00:00
Mike Dubman
4497dada00 build: add "-verok" flag to ignore autotools version check and continue anyway.
This commit was SVN r32654.
2014-08-31 07:27:34 +00:00
MPI Team
f5905c7111 Update git/hg ignore files
This commit was SVN r32653.
2014-08-31 05:00:20 +00:00
Ralph Castain
60eb7124ab Upgrade to hwloc 1.9.1
This commit was SVN r32652.
2014-08-31 03:13:06 +00:00
Ralph Castain
e49ca05f11 Remove unused variable
This commit was SVN r32651.
2014-08-31 03:11:50 +00:00
Ralph Castain
5cdbc00136 Re-enable the usock oob component. Ensure the TCP component promotes messages for other procs to the OOB base so that other components have a chance to send the relay. Seems to be passing MTT, so let's see how it works for others.
This commit was SVN r32650.
2014-08-30 19:33:46 +00:00