Commit graph

1789 commits

Author SHA1 Message Date
Jeff Squyres
cf99f0c905 usnic: just add comments/explanations -- no code changes 2015-02-13 11:46:38 -07:00
Jeff Squyres
af61065b87 usnic: minor update of member field names 2015-02-13 11:46:38 -07:00
Jeff Squyres
8311428602 btl.h: whitespace cleanup
No code changes
2015-02-13 11:46:38 -07:00
Jeff Squyres
7971fd57f0 btl.h: add more description for reg/dereg functions 2015-02-13 11:46:38 -07:00
Nathan Hjelm
a3b739d117 btl/ugni: use pthread_join to wait on progress thread completion
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:38 -07:00
Nathan Hjelm
953efc3eb2 btl/openib: fix compilation issues with XRC
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:38 -07:00
Nathan Hjelm
a9763e123d add btl comment
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:38 -07:00
Nathan Hjelm
1e518504e4 btl/smcuda: update for BTL 3.0 interface 2015-02-13 11:46:37 -07:00
Nathan Hjelm
aba0675fe7 btl/vader: update for BTL 3.0 interface
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:37 -07:00
Nathan Hjelm
f8ac3fb1e8 btl/ugni: add support for atomic operations
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:37 -07:00
Nathan Hjelm
655604f509 btl/ugni: update for BTL 3.0 interface
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:37 -07:00
Nathan Hjelm
4972d97b8b btl/template: update for BTL 3.0 interface
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:37 -07:00
Nathan Hjelm
f241b6e0a7 btl/tcp: update for BTL 3.0 interface
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:36 -07:00
Nathan Hjelm
25176cad27 btl/sm: update for BTL 3.0 interface
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:36 -07:00
Nathan Hjelm
19abc19ad9 btl/self: update for BTL 3.0 interface
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:36 -07:00
Nathan Hjelm
f96d48a2e1 btl/scif: update for BTL 3.0 interface
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:36 -07:00
Nathan Hjelm
cf91156105 btl/openib: add atomic operation support
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:36 -07:00
Nathan Hjelm
74f1af4548 btl/openib: update for BTL 3.0 interface
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:36 -07:00
Nathan Hjelm
fc7397949c btl: require that btls handle descriptor = NULL in the btl_sendi function
The send inline optimization uses the btl_sendi function to achieve lower
latency and higher message rates. Before this commit BTLs were allowed to
assume the descriptor was non-NULL and were expected to return a valid
descriptor if the send could not be completed using btl_sendi. This
behavior was fine until the usage of btl_sendi was changed in ob1. This
commit allows the caller to specify NULL for the descriptor. The affected
btls have been updated to handle this case.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:36 -07:00
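The contract described in this commit can be sketched as follows. This is a minimal illustration, not the Open MPI code: `sketch_descriptor_t`, `sketch_sendi`, the status codes, and the inline size limit are all hypothetical stand-ins for the real definitions in btl.h.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real types and status codes live in btl.h. */
typedef struct sketch_descriptor {
    size_t size;
} sketch_descriptor_t;

enum { SKETCH_SUCCESS = 0, SKETCH_ERR_RESOURCE_BUSY = -1 };

/* Assumed limit above which this sketch refuses to send inline. */
#define SKETCH_INLINE_LIMIT 64

/* Send-inline entry point that tolerates descriptor == NULL. */
int sketch_sendi(size_t payload_size, sketch_descriptor_t **descriptor)
{
    if (payload_size <= SKETCH_INLINE_LIMIT) {
        /* Pretend the payload was sent inline; no descriptor is needed. */
        return SKETCH_SUCCESS;
    }

    /* The inline send could not be completed. Only hand back a descriptor
     * if the caller actually asked for one. */
    if (NULL != descriptor) {
        sketch_descriptor_t *des = malloc(sizeof(*des));
        if (NULL != des) {
            des->size = payload_size;
        }
        *descriptor = des;
    }
    return SKETCH_ERR_RESOURCE_BUSY;
}
```

A caller that passes NULL simply gets the error code back and falls through to its regular send path.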
Nathan Hjelm
593f97ae92 btl: add support for 64-bit atomic operations
This commit adds an interface for BTLs to export support for 64-bit atomic
operations on integers. BTLs that can support atomic operations should
implement these functions and set the appropriate btl_flags and btl_atomic_flags.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:36 -07:00
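One way a caller might consume such capability flags is sketched below. The flag name, the module layout, and the function names are assumptions made for illustration; only the idea of checking an atomic-flags mask before calling an optional function comes from the commit message. The plain addition stands in for a real network atomic.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bit; the real flag names are defined in btl.h. */
#define SKETCH_ATOMIC_SUPPORTS_ADD 0x1u

/* Hypothetical module: a flag mask plus an optional 64-bit add function. */
typedef struct sketch_module {
    uint32_t atomic_flags;
    int (*atomic_add64)(int64_t *target, int64_t operand);
} sketch_module_t;

/* Stand-in implementation: a plain add, not a real network atomic. */
int sketch_local_add64(int64_t *target, int64_t operand)
{
    *target += operand;
    return 0;
}

/* Only call the atomic if the module advertises and implements it. */
int sketch_try_atomic_add(const sketch_module_t *module, int64_t *target,
                          int64_t operand)
{
    if (0 == (module->atomic_flags & SKETCH_ATOMIC_SUPPORTS_ADD) ||
        NULL == module->atomic_add64) {
        return -1;
    }
    return module->atomic_add64(target, operand);
}
```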
Nathan Hjelm
f8e15ca83d Update the interface to provide a cleaner interface for RDMA operations.
The old BTL interface provided support for RDMA through the use of
the btl_prepare_src and btl_prepare_dst functions. These functions were
expected to prepare as much of the user buffer as possible for the RDMA
operation and return a descriptor. The descriptor contained segment
information on the prepared region. The btl user could then pass the
RDMA segment information to a remote peer. Once the peer received that
information it then packed it into a similar descriptor on the other
side that could then be passed into a single btl_put or btl_get
operation.

Changes:

 - Added functions to register and deregister memory regions with the
   btl. If no registration is needed a btl should set these function
   pointers to NULL. These functions take over for btl_prepare_src/dst
   and btl_free for RDMA operations. The caller should specify the
   maximum permissions needed on the memory.

 - Changed the function signatures for both btl_put and btl_get. In
   place of a prepared descriptor the caller should provide the source
   and destination addresses and registration handles as well as a
   new callback function. The callback will be provided with the local
   address and registration handle, callback context, callback data, and
   status. See mca_btl_base_rdma_completion_fn_t in btl.h.

 - Added a new btl constraint: MCA_BTL_REG_HANDLE_MAX_SIZE. This
   value specifies the maximum size of any btl's registration handle.

 - Removed the btl_prepare_dst function. This reflects the fact that
   RDMA operations no longer depend on "prepared" descriptors.

 - Removed the btl_seg_size member. There is no need for btls to
   subclass the mca_btl_base_segment_t class anymore.

 - Expose the btl's put/get limitations with new struct members:
   btl_put_limit, btl_put_alignment, btl_get_limit, btl_get_alignment.

 - Remove the mca_mpool_base_registration_t argument from the btl_prepare_src
   function. The argument was intended to support RDMA operations and is no
   longer necessary.

 - Remove des_remote/des_remote_count from the mca_btl_base_descriptor_t
   structure. This structure member was originally used to specify the remote
   segment for RDMA operations. Since the new btl interface no longer uses
   descriptors for RDMA this member no longer has a purpose. In addition
   to removing these members the local segment structure fields have been
   renamed from des_local/des_local_count to des_segments/des_segment_count.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:36 -07:00
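The register-then-put flow described in the change list can be sketched roughly as below. Every name here (`sketch_reg_handle_t`, `sketch_register_mem`, `sketch_put`, the callback type) is hypothetical and only mirrors the shape given in the commit message; the authoritative signatures are the ones in btl.h, such as mca_btl_base_rdma_completion_fn_t. The "remote" buffer lives in the same process and the transfer is a memcpy, purely to keep the sketch self-contained.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a BTL registration handle. */
typedef struct sketch_reg_handle {
    void  *base;
    size_t len;
} sketch_reg_handle_t;

/* Callback shaped after the commit text: local address, registration
 * handle, callback context, callback data, and status. */
typedef void (*sketch_rdma_completion_fn_t)(void *local_address,
                                            sketch_reg_handle_t *local_handle,
                                            void *context, void *cbdata,
                                            int status);

/* "Register" a region; a real BTL would pin memory or talk to the NIC. */
sketch_reg_handle_t sketch_register_mem(void *base, size_t len)
{
    sketch_reg_handle_t handle = { base, len };
    return handle;
}

/* Check that [addr, addr + size) lies inside the registered region. */
static int sketch_in_region(const sketch_reg_handle_t *h, const void *addr,
                            size_t size)
{
    uintptr_t start = (uintptr_t) h->base;
    uintptr_t a = (uintptr_t) addr;
    return a >= start && size <= h->len && a - start <= h->len - size;
}

/* Put: addresses and handles are passed directly, no prepared descriptor. */
int sketch_put(void *local_address, void *remote_address,
               sketch_reg_handle_t *local_handle,
               sketch_reg_handle_t *remote_handle, size_t size,
               sketch_rdma_completion_fn_t cbfunc, void *context, void *cbdata)
{
    if (!sketch_in_region(local_handle, local_address, size) ||
        !sketch_in_region(remote_handle, remote_address, size)) {
        return -1;
    }
    memcpy(remote_address, local_address, size);
    if (NULL != cbfunc) {
        cbfunc(local_address, local_handle, context, cbdata, 0);
    }
    return 0;
}

/* Example callback: counts successful completions through cbdata. */
void sketch_count_completion(void *local_address,
                             sketch_reg_handle_t *local_handle,
                             void *context, void *cbdata, int status)
{
    (void) local_address;
    (void) local_handle;
    (void) context;
    if (0 == status) {
        ++*(int *) cbdata;
    }
}
```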
Howard Pritchard
6a275f4489 Merge pull request #395 from hppritcha/topic/pmix_cray_kvs
pmix/cray: remove workaround for OBJ_RELEASE
2015-02-13 11:25:50 -07:00
Howard Pritchard
bd9d185951 pmix/cray: remove workaround for OBJ_RELEASE
Per feedback from rhc, manually set the base_ptr member
of the opal_buffer_t variable to NULL prior to calling
OBJ_RELEASE.  A similar feature of opal_dss.load also
exists so likewise reset the base_ptr to NULL prior to
invoking it.

Hopefully the opal_buffer_t struct does not change
frequently.

Minor cleanups to reduce output when pmix_base_verbose
MCA parameter is set.
2015-02-13 07:47:26 -08:00
Jeff Squyres
f7b4b23383 usnic: ensure to NULL-terminate the string/not overflow
This was CID 1269921.
2015-02-12 13:41:30 -08:00
Jeff Squyres
8febd41a39 usnic: fix minor memory leak
This was CID 1269859.
2015-02-12 13:41:30 -08:00
Jeff Squyres
4c074da1c2 usnic: fix minor memory leak
This was CID 1269853.
2015-02-12 13:41:30 -08:00
Jeff Squyres
a7ce2d406c usnic: don't bother comparing unsigned values for <0
This was CID 1269812.
2015-02-12 13:41:30 -08:00
Jeff Squyres
caacc6ad91 usnic: properly differentiate data pool vs. malloc
usnic_fls() can actually return 0, leading us to incorrectly free() a
buffer instead of OMPI_FREE_LIST_RETURN_MT'ing it.

So add an explicit bool in the struct that tracks whether the buffer
came from malloc or a freelist.

This was CID 1269660.
2015-02-12 13:41:30 -08:00
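The fix described here, an explicit flag recording where a buffer came from, might look roughly like this. The struct, the pool counter, and the function name are hypothetical; in the usnic code the free-list branch would use OMPI_FREE_LIST_RETURN_MT rather than the counter used in this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pool: only counts how many buffers were handed back. */
typedef struct sketch_pool {
    int returned;
} sketch_pool_t;

/* Buffer that remembers its own origin instead of inferring it. */
typedef struct sketch_buffer {
    void *data;
    bool  from_malloc;
} sketch_buffer_t;

/* Release a buffer through the path that matches its origin. */
void sketch_release(sketch_pool_t *pool, sketch_buffer_t *buf)
{
    if (buf->from_malloc) {
        free(buf->data);
    } else {
        /* Stand-in for returning the item to a free list. */
        pool->returned++;
    }
    buf->data = NULL;
}
```

Storing the origin explicitly avoids deriving it from a computed value that can legitimately be 0, which is the failure mode the commit message describes.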
Jeff Squyres
3b39535ebb usnic: ensure that the string is NULL-terminated
This was CID 1269666.
2015-02-12 13:41:30 -08:00
Jeff Squyres
41c6e26a38 usnic: ensure the copied string is NULL-terminated
This was CID 1269667.
2015-02-12 13:41:30 -08:00
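The pattern behind these string fixes is usually a bounded copy followed by an explicit terminator. The helper below is a generic sketch with a hypothetical name, not the code from the usnic commits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst without overflowing, always leaving dst terminated. */
void sketch_copy_string(char *dst, size_t dst_size, const char *src)
{
    assert(NULL != dst && NULL != src);
    if (0 == dst_size) {
        return;
    }
    /* strncpy() does not terminate dst when src fills the whole buffer,
     * so copy at most dst_size - 1 bytes and terminate by hand. */
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```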
Jeff Squyres
81585c0a7c usnic: strengthen the check-if-accept()-failed test
This was Coverity CID 1269801.
2015-02-12 13:41:30 -08:00
Jeff Squyres
117e6feaa1 shmem sysv: ensure we don't shmdt(NULL)
This was CID 71999.
2015-02-12 13:41:30 -08:00
Jeff Squyres
6d3a84514f mca_base_cmd_line.c: fix minor memory leak
This was CID 1269874.
2015-02-12 13:41:29 -08:00
Jeff Squyres
f8e334357d mca_base_pvar.c: protect removal from list
Only remove it from the list if it is actually on the list.

This was CID 1269758.
2015-02-12 13:41:29 -08:00
Nathan Hjelm
f1dc29b145 btl/vader: fix modex size when xpmem is in use
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-12 14:06:24 -07:00
Nathan Hjelm
49ba150972 mca/base: fix path string parsing
CID 993709
2015-02-12 13:03:46 -07:00
Jeff Squyres
00c878957c mca_base_var.c: add debug check for another programming error
Coverity alerted us to the fact that there are places where
the synonym_for param is hard-coded to -1 when calling
register_variable().  It would be a coding error if synonym_for==-1
and (flags & MCA_BASE_VAR_FLAG_SYNONYM)>0, so let's add that to the
debug-only check at the top of the function.

This was CID 993717.
2015-02-12 10:24:02 -08:00
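A debug-only check of the kind described above could be written along these lines. The flag value and the function name are hypothetical placeholders; the real flag is MCA_BASE_VAR_FLAG_SYNONYM and the real check sits at the top of register_variable().

```c
#include <stdbool.h>

/* Hypothetical stand-in for MCA_BASE_VAR_FLAG_SYNONYM. */
#define SKETCH_VAR_FLAG_SYNONYM 0x1u

/* Returns false for the combination the commit calls a coding error:
 * the synonym flag is set but no target variable index was given. */
bool sketch_synonym_args_valid(int synonym_for, unsigned int flags)
{
    return !(-1 == synonym_for && 0 != (flags & SKETCH_VAR_FLAG_SYNONYM));
}
```

In a debug build the caller would typically wrap this in an assertion, for example `assert(sketch_synonym_args_valid(synonym_for, flags));`, so the check costs nothing in optimized builds.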
Jeff Squyres
332943f1c3 pstat linux: ensure to close the file
This was CID 71983.
2015-02-12 10:24:02 -08:00
Jeff Squyres
6a64fe85a1 pstat linux: ensure read() returns >=0
This was CID 71182.
2015-02-12 10:24:02 -08:00
Jeff Squyres
8be0e0b0ca usnic: don't close fp upon error
Let the caller close fp.  Properly check for errors when calling
subroutines.

This was Coverity CID 1269995.
2015-02-12 10:24:01 -08:00
Howard Pritchard
0cf2b478e0 Merge pull request #391 from hppritcha/topic/cray_pmi_kvs
pmix/cray: initial kvs removal work
2015-02-11 19:55:34 -07:00
Howard Pritchard
9955834ff1 pmix/cray: initial kvs removal work
Remove use of the Cray PMI KVS - which is designed for a lightweight
MPI that exchanges only a minimal amount of connection info
(about 128 bytes per rank) - within cray/pmix.  Use Cray PMI
collective extensions instead.

This is the first of several steps to accelerate launch of
Open MPI on Cray systems using either native aprun or nativized
slurm.
2015-02-11 15:14:55 -08:00
Rolf vandeVaart
08dceda2c0 Fix logic for handling priority and eager RDMA. There was some refactoring that was done
in this code and it ended up changing the logic that is used to set up eager RDMA.
Rather than setting up eager RDMA with a high priority message, it did it the other
way around.  For some reason, CUDA-aware support did not like this.  So, basically,
restore the logic to the way it was prior to the refactoring.  The refactoring did not
intend to change this.  Lightly reviewed by hjelmn.
2015-02-11 16:38:36 -05:00
Jeff Squyres
4f1996df5d various: remove $(LTDLINCL) from Makefile.am's that didn't need it 2015-02-11 12:25:20 -08:00
Ralph Castain
3de8c5c7c6 Cleanup the munge support - the credential cannot be reused for multiple connections 2015-02-10 20:34:35 -08:00
George Bosilca
e173f9b0c0 Somehow we lost one of the most critical parameters,
allowing the PML to decide how to order the different
interconnects. Bring it back!
2015-02-10 20:32:05 -05:00
Ralph Castain
3ae3b96c17 Fix master compilation - a buried header dependency must have been removed. 2015-02-10 07:22:10 -08:00
Mike Dubman
6816e3421f Merge pull request #377 from regrant/ib_wr_fix
fix problem with get_pathrecord posting too many recv requests
2015-02-10 08:47:23 +02:00
Ralph Castain
bef830efef Fix debug output 2015-02-09 20:49:04 -08:00
Ralph Castain
07134f5b17 Add munge security 2015-02-09 20:49:03 -08:00