openmpi

Author	SHA1	Message	Date
Nathan Hjelm	655604f509	btl/ugni: update for BTL 3.0 interface Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:37 -07:00
Nathan Hjelm	4972d97b8b	btl/template: update for BTL 3.0 interface Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:37 -07:00
Nathan Hjelm	f241b6e0a7	btl/tcp: update for BTL 3.0 interface Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:36 -07:00
Nathan Hjelm	25176cad27	btl/sm: update for BTL 3.0 interface Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:36 -07:00
Nathan Hjelm	19abc19ad9	btl/self: update for BTL 3.0 interface Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:36 -07:00
Nathan Hjelm	f96d48a2e1	btl/scif: update for BTL 3.0 interface Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:36 -07:00
Nathan Hjelm	cf91156105	btl/openib: add atomic operation support Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:36 -07:00
Nathan Hjelm	74f1af4548	btl/openib: update for BTL 3.0 interface Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:36 -07:00
Nathan Hjelm	fc7397949c	btl: require that btls handle descriptor = NULL in the btl_sendi function The send inline optimization uses the btl_sendi function to achieve lower latency and higher message rates. Before this commit BTLs were allowed to assume the descriptor was non-NULL and were expected to return a valid descriptor if the send could not be completed using btl_sendi. This behavior was fine until the usage of btl_sendi was changed in ob1. This commit allows the caller to specify NULL for the descriptor. The affected btls have been updated to handle this case. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:36 -07:00
Nathan Hjelm	593f97ae92	btl: add support for 64-bit atomic operations This commit adds an interface for btl's to export support for 64-bit atomic operations on integers. BTL's that can support atomic operations should implement these functions and set the appropriate btl_flags and btl_atomic_flags. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:36 -07:00
Nathan Hjelm	f8e15ca83d	Update the interface to provide a cleaner interface for RDMA operations. The old BTL interface provided support for RDMA through the use of the btl_prepare_src and btl_prepare_dst functions. These functions were expected to prepare as much of the user buffer as possible for the RDMA operation and return a descriptor. The descriptor contained segment information on the prepared region. The btl user could then pass the RDMA segment information to a remote peer. Once the peer received that information it then packed it into a similar descriptor on the other side that could then be passed into a single btl_put or btl_get operation. Changes: - Added functions to register and deregister memory regions with the btl. If no registration is needed a btl should set these function pointers to NULL. These functions take over for btl_prepare_src/dst and btl_free for RDMA operations. The caller should specify the maximum permissions needed on the memory. - Changed the function signatures for both btl_put and btl_get. In place of a prepared descriptor the caller should provide the source and destination addresses and registration handles as well as a new callback function. The callback will be provided with the local address and registration handle, callback context, callback data, and status. See mca_btl_base_rdma_completion_fn_t in btl.h. - Added a new btl constraint: MCA_BTL_REG_HANDLE_MAX_SIZE. This value specifies the maximum size of any btl's registration handle. - Removed the btl_prepare_dst function. This reflects the fact that RDMA operations no longer depend on "prepared" descriptors. - Removed the btl_seg_size member. There is no need for btls to subclass the mca_btl_base_segment_t class anymore. - Expose the btl's put/get limitations with new struct members: btl_put_limit, btl_put_alignment, btl_get_limit, btl_get_alignment. - Remove the mca_mpool_base_registration_t argument from the btl_prepare_src function. The argument was intended to support RDMA operations and is no longer necessary. - Remove des_remote/des_remote_count from the mca_btl_base_descriptor_t structure. This structure member was originally used to specify the remote segment for RDMA operations. Since the new btl interface no longer uses descriptors for RDMA this member no longer has a purpose. In addition to removing these members the local segment structure fields have been renamed from des_local/des_local_count to des_segments/des_segment_count. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:36 -07:00
Howard Pritchard	6a275f4489	Merge pull request #395 from hppritcha/topic/pmix_cray_kvs pmix/cray: remove workaround for OBJ_RELEASE	2015-02-13 11:25:50 -07:00
Howard Pritchard	bd9d185951	pmix/cray: remove workaround for OBJ_RELEASE Per feedback from rhc, manually set the base_ptr member of the opal_buffer_t variable to NULL prior to calling OBJ_RELEASE. A similar feature of opal_dss.load also exists so likewise reset the base_ptr to NULL prior to invoking it. Hopefully the opal_buffer_t struct does not change frequently. Minor cleanups to reduce output when pmix_base_verbose mca parameter is set.	2015-02-13 07:47:26 -08:00
Jeff Squyres	f7b4b23383	usnic: ensure to NULL-terminate the string/not overflow This was CID 1269921.	2015-02-12 13:41:30 -08:00
Jeff Squyres	8febd41a39	usnic: fix minor memory leak This was CID 1269859.	2015-02-12 13:41:30 -08:00
Jeff Squyres	4c074da1c2	usnic: fix minor memory leak This was CID 1269853.	2015-02-12 13:41:30 -08:00
Jeff Squyres	a7ce2d406c	usnic: don't bother comparing unsigned values for <0 This was CID 1269812.	2015-02-12 13:41:30 -08:00
Jeff Squyres	caacc6ad91	usnic: properly differentiate data pool vs. malloc usnic_fls() can actually return 0, leading us to incorrectly free() a buffer instead of OMPI_FREE_LIST_RETURN_MT'ing it. So add an explicit bool in the struct that tracks whether the buffer came from malloc or a freelist. This was CID 1269660.	2015-02-12 13:41:30 -08:00
Jeff Squyres	3b39535ebb	usnic: ensure that the string is NULL-terminated This was CID 1269666.	2015-02-12 13:41:30 -08:00
Jeff Squyres	41c6e26a38	usnic: ensure the copied string is NULL-terminated This was CID 1269667	2015-02-12 13:41:30 -08:00
Jeff Squyres	81585c0a7c	usnic: strengthen the check-if-accept()-failed test This was Coverity CID 1269801.	2015-02-12 13:41:30 -08:00
Jeff Squyres	117e6feaa1	shmem sysv: ensure we don't shmdt(NULL) This was CID 71999.	2015-02-12 13:41:30 -08:00
Jeff Squyres	6d3a84514f	mca_base_cmd_line.c: fix minor memory leak This was CID 1269874.	2015-02-12 13:41:29 -08:00
Jeff Squyres	f8e334357d	mca_base_pvar.c: protect removal from list Only remove it from the list if it is actually on the list. This was CID 1269758.	2015-02-12 13:41:29 -08:00
Nathan Hjelm	f1dc29b145	btl/vader: fix modex size when xpmem is in use Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-12 14:06:24 -07:00
Nathan Hjelm	49ba150972	mca/base: fix path string parsing CID 993709	2015-02-12 13:03:46 -07:00
Jeff Squyres	00c878957c	mca_base_var.c: add debug check for another programming error Coverity alerted us to the fact that there are places where the synonym_for param is hard-coded to -1 when calling register_variable(). It would be a coding error if synonym_for==-1 and (flags & MCA_BASE_VAR_FLAG_SYNONYM)>0, so let's add that to the debug-only check at the top of the function. This was CID 993717.	2015-02-12 10:24:02 -08:00
Jeff Squyres	332943f1c3	pstat linux: ensure to close the file This was CID 71983.	2015-02-12 10:24:02 -08:00
Jeff Squyres	6a64fe85a1	pstat linux: ensure read() returns >=0 This was CID 71182.	2015-02-12 10:24:02 -08:00
Jeff Squyres	8be0e0b0ca	usnic: don't close fp upon error Let the caller close fp. Properly check for errors when calling subroutines. This was Coverity CID 1269995.	2015-02-12 10:24:01 -08:00
Howard Pritchard	0cf2b478e0	Merge pull request #391 from hppritcha/topic/cray_pmi_kvs pmix/cray: initial kvs removal work	2015-02-11 19:55:34 -07:00
Howard Pritchard	9955834ff1	pmix/cray: initial kvs removal work Remove use of the Cray PMI KVS - which is designed for a lightweight MPI that exchanges only a minimal amount of connection info (about 128 bytes per rank) - within cray/pmix. Use Cray PMI collective extensions instead. This is the first of several steps to accelerate launch of Open MPI on Cray systems using either native aprun or nativized slurm.	2015-02-11 15:14:55 -08:00
Rolf vandeVaart	08dceda2c0	Fix logic for handling priority and eager RDMA. There was some refactoring that was done in this code and it ended up changing the logic that is used to set up eager RDMA. Rather than setting up eager RDMA with a high priority message, it did it the other way around. For some reason, CUDA-aware support did not like this. So, basically, restore the logic to the way it was prior to the refactoring. The refactoring did not intend to change this. Lightly reviewed by hjelmn.	2015-02-11 16:38:36 -05:00
Jeff Squyres	4f1996df5d	various: remove $(LTDLINCL) from Makefile.am's that didn't need it	2015-02-11 12:25:20 -08:00
Ralph Castain	3de8c5c7c6	Cleanup the munge support - the credential cannot be reused for multiple connections	2015-02-10 20:34:35 -08:00
George Bosilca	e173f9b0c0	Somehow we lost one of the most critical parameter allowing the PML to decide how to order the different interconnects. Bring it back !	2015-02-10 20:32:05 -05:00
Ralph Castain	3ae3b96c17	Fix master compilation - a buried header dependency must have been removed.	2015-02-10 07:22:10 -08:00
Mike Dubman	6816e3421f	Merge pull request #377 from regrant/ib_wr_fix fix problem with get_pathrecord posting too many recv requests	2015-02-10 08:47:23 +02:00
Ralph Castain	bef830efef	Fix debug output	2015-02-09 20:49:04 -08:00
Ralph Castain	07134f5b17	Add munge security	2015-02-09 20:49:03 -08:00
Ralph Castain	a3275aa867	Once again, fix the blasted singleton comm_spawn	2015-02-05 17:34:25 -08:00
Jeff Squyres	0dbbffb753	pmix_base_frame: use the "= { 0 }" initializer Per open-mpi/ompi#381, convert the specific initialization of opal_pmix to use the generic "= { 0 }" initializer. This form can be used to initialize any type when the intent is just to zero out / assign some value.	2015-02-05 17:51:06 -05:00
Ralph Castain	4d882796b6	Silence warnings	2015-02-05 11:41:00 -08:00
Howard Pritchard	e508a4078e	Merge pull request #376 from regrant/ib_error_fix fixes OpenIB connect error reporting for ibv_* calls that return an errn...	2015-02-04 10:22:03 -07:00
Jeff Squyres	621af3aa07	pmix_base: fix global opal_pmix symbol for static linking on OS X OS X has weirdness when static linking. If a symbol is not initialized, it is put into the common block section, and Weird Things happen (linking when trying to use that global symbol will fail). If you initialize the variable, it goes into a different section (and linking to it will work). This link (that might go stale someday) has some information about OS X linker scope and treatment of symbol definitions: https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachOTopics/1-Articles/executing_files.html#//apple_ref/doc/uid/TP40001829-98432-TPXREF120 Fixes #375.	2015-02-04 12:12:31 -05:00
Ryan Grant	de93497789	fix problem with get_pathrecord posting too many recv requests	2015-02-04 09:53:58 -07:00
Ryan Grant	5d5e9bc1f8	fixes OpenIB connect error reporting for ibv_* calls that return an errno	2015-02-04 09:09:14 -07:00
Jeff Squyres	a3728f09af	libfabric: add another missing file to the Makefile.am	2015-02-04 04:02:27 -08:00
Jeff Squyres	66a680879e	libfabric: fix header file name in Makefile.am	2015-02-03 19:41:25 -08:00
Jeff Squyres	cb7cc171f9	usnic: update README.txt notes Update notes about copying the usnic BTL between master and the v1.8 branch.	2015-02-03 15:54:36 -08:00


1779 commits
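
For readers unfamiliar with the "= { 0 }" initializer mentioned in commits 0dbbffb753 and 621af3aa07, the idea can be sketched roughly as below. This is only an illustration under stated assumptions, not the Open MPI code: example_module_t, its members, and the helper functions are hypothetical stand-ins, and the real opal_pmix type looks different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a module struct; NOT the real opal_pmix type. */
typedef struct {
    int (*init)(void);
    int (*finalize)(void);
    void *context;
    int priority;
} example_module_t;

/* With an explicit initializer this is a full definition rather than a
 * tentative one. "= { 0 }" sets the first member to a null pointer and
 * zero-initializes every remaining member. */
example_module_t example_module = { 0 };

/* Returns 1 if every member of *m is zero/NULL, 0 otherwise. */
int example_module_is_zeroed(const example_module_t *m)
{
    return m->init == NULL && m->finalize == NULL &&
           m->context == NULL && m->priority == 0;
}

/* The same initializer also works for objects with automatic storage. */
int example_local_is_zeroed(void)
{
    example_module_t local = { 0 };
    return example_module_is_zeroed(&local);
}

/* Simple self-check that can be called from any test driver. */
void example_check(void)
{
    assert(example_module_is_zeroed(&example_module));
    assert(example_local_is_zeroed());
}
```

Some compilers may emit a missing-braces or missing-field-initializers warning for this form at high warning levels; the initialization itself is well defined in standard C.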