openmpi

Author	SHA1	Message	Date
Jeff Squyres	937bbbac34	libfabric: update to 8528d35551a78b5241e615c0e6ac5a711f96a03c Update to latest from libfabric Github master ofiwg/libfabric@8528d35551	2015-02-20 12:37:27 -08:00
Jeff Squyres	15be948d79	wrappers: _EXTRA_INCLUDES does not exist any more There were a few places where _EXTRA_INCLUDES (and derivatives) were still being used. This commit removes all of them.	2015-02-20 08:43:25 -08:00
Nathan Hjelm	cc750b00a6	btl: export local registration thresholds Some BTLs do not require local registration for some rdma transactions. For example: inline put on openib, fma put on ugni. This commit adds code to expose the local registration thresholds to BTL users. Optimized code can take advantage of this information to improve rdma performance.	2015-02-19 16:13:37 -07:00
Jeff Squyres	6098b84294	libfabric: pass the appropriate LDFLAGS to libfabric components When compiling against an external libfabric, ensure to also pass the appropriate -L flags so that the compiler/linker can find it.	2015-02-19 05:35:38 -08:00
Jeff Squyres	2d636147e3	reachable netlink: fix the component symbol name	2015-02-19 04:25:15 -08:00
Ralph Castain	008755ab17	Remove stale file reference	2015-02-18 18:36:08 -08:00
Nathan Hjelm	0e09b9298a	mca/base: add framework flag indicating a framework does not have dso components This flag is needed for a special case framework: dl. The framework is needed before any dl components can be used.	2015-02-18 14:03:51 -07:00
rhc54	ae16a168ec	Merge pull request #401 from rhc54/reachable Add reachable framework for determining TCP connections	2015-02-18 08:22:48 -08:00
Jeff Squyres	b66fc3aed9	opal_check_visibility.m4: remove extraneous sym link The sym link to this m4 is not necessary down in the component.	2015-02-18 03:40:25 -08:00
Jeff Squyres	f040ef09ff	libfabric: properly define HAVE_ALIAS_ATTRIBUTE @ggouaillardet identified that HAVE_ALIAS_ATTRIBUTE was not properly being defined in the embedded libfabric. This is because the embedded configury missed the test for it (i.e., the real configure.ac for libfabric always defines HAVE_ALIAS_ATTRIBUTE to 0 or 1 -- we didn't emulate that properly here in libfabric's configure.m4). Also, fix some grammar and properly escape another AC_MSG_CHECKING message in libfabric's configure.m4.	2015-02-18 03:26:34 -08:00
Gilles Gouaillardet	28714b60cb	btl/sm: fix misc errors as reported by Coverity as CIDs 711636 and 1269847	2015-02-18 17:05:19 +09:00
Ralph Castain	9ef523c152	Add reachable framework for determining TCP connections	2015-02-17 21:47:09 -08:00
Nathan Hjelm	298f238096	opal_lifo: add missing memory barrier to 64/32-bit atomic lifo implementation Need to ensure the head write is complete before updating the item's next pointer. References #371	2015-02-17 12:23:13 -07:00
Jeff Squyres	9cb047c1ee	libfabric: don't install the osd.h headers When configured --with-devel-headers, there's now 2 "osd.h" header files in libfabric (in different dirs). Automake's "install" target didn't like this, and errored out. Since embedding libfabric is a temporary measure, just avoid the problem by not installing any libfabric headers.	2015-02-17 07:10:12 -08:00
Gilles Gouaillardet	55948f2a6d	hwloc: fix misc memory leak as reported by Coverity with CID 1270441 (previous commit open-mpi/ompi@c25185f3a9 did not fully fix that one)	2015-02-17 14:06:15 +09:00
Jeff Squyres	9d7171e8f1	convert: remove unnecessary/unused opal_size2int() function The comments in the file even said "This file will hopefully not last long in the tree...".	2015-02-16 07:17:33 -08:00
Gilles Gouaillardet	16c8af6725	opal/dss: correctly handle incorrect parameter in opal_value_unload this fixes previous commit open-mpi/ompi@f9b3fb442e Thanks Ralph for the review !	2015-02-16 14:53:35 +09:00
Gilles Gouaillardet	f9b3fb442e	opal/dss: correctly handle data==NULL in opal_value_unload	2015-02-16 14:40:42 +09:00
Gilles Gouaillardet	da7ffb6448	btl/vader: fix memory leak as reported by Coverity with CID 1269904	2015-02-16 13:51:05 +09:00
Gilles Gouaillardet	c25185f3a9	opal/hwloc: fix misc memory leaks as reported by Coverity with CIDS 710631-710638, 1196705, 1196716, 1196717, 1196752, 1196753	2015-02-16 12:23:37 +09:00
Gilles Gouaillardet	8dd77c692e	opal/hwloc: fix misc bugs as reported by Coverity with CIDs 72224, 703566, 1196821, 1196842, 1196657 and 1196658	2015-02-16 11:59:48 +09:00
Gilles Gouaillardet	0ce59f2d29	pmix: fix misc memory leaks as reported by Coverity as CID 1269843, 1269854, 1269856, 1269857 and 1269858	2015-02-16 11:19:43 +09:00
Gilles Gouaillardet	ccbdf64de4	opal/util: fix memory leak in opal_util_init_sys_limits as reported by Coverity with CID 996174 previous commit (open-mpi/ompi@ca3a275823) did not fix this CID	2015-02-16 11:05:35 +09:00
George Bosilca	a7a4d6335e	Various cleanups.	2015-02-15 11:39:09 -05:00
George Bosilca	a4aa74d4b9	Fix the SM BTL.	2015-02-15 11:38:45 -05:00
George Bosilca	84994c7438	This comment seems to contradict the compiler's opportunities to optimize the unused data out.	2015-02-15 11:37:22 -05:00
Jeff Squyres	2ca14acaf0	libfabric: add missing files into Makefile.am	2015-02-14 05:01:29 -08:00
Jeff Squyres	955d8b7525	usnic: adapt for new libfabric API	2015-02-13 14:44:23 -08:00
Jeff Squyres	3abebe7251	libfabric: update to ofiwg/libfabric@06fdfbef98	2015-02-13 14:44:06 -08:00
Nathan Hjelm	1162093d34	btl/scif: fix debug build Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:39 -07:00
Jeff Squyres	67ee1e6cf8	usnic: restore compatibility between master and v1.8 Add the functions that changed between BTL 2.0 and 3.0 into compat.h and compat.c: * module.btl_prepare_src: the signature and body of this method changed between 2.0 and 3.0. However, the functions that this method calls did not need to change, so they are copied over wholesale (with the exception that they no longer accept the unused `registration` parameter). * module.btl_prepare_dst: this method does not exist in BTL 3.0. * module.btl_put: the signature and body of this method changed between 2.0 and 3.0.	2015-02-13 11:46:38 -07:00
Jeff Squyres	ad841d7ba3	usnic: update to BTL 3.0	2015-02-13 11:46:38 -07:00
Jeff Squyres	0a5fd8e36a	usnic: update README for new BTL 3.0 scheme details	2015-02-13 11:46:38 -07:00
Jeff Squyres	cf99f0c905	usnic: just add comments/explanations -- no code changes	2015-02-13 11:46:38 -07:00
Jeff Squyres	af61065b87	usnic: minor update of member field names	2015-02-13 11:46:38 -07:00
Jeff Squyres	8311428602	btl.h: whitespace cleanup No code changes	2015-02-13 11:46:38 -07:00
Jeff Squyres	7971fd57f0	btl.h: add more description for reg/dereg functions	2015-02-13 11:46:38 -07:00
Nathan Hjelm	a3b739d117	btl/ugni: use pthread_join to wait on progress thread completion Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:38 -07:00
Nathan Hjelm	953efc3eb2	btl/openib: fix compilation issues with XRC Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:38 -07:00
Nathan Hjelm	a9763e123d	add btl comment Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:38 -07:00
Nathan Hjelm	44fb8369ff	opal/convertor: add a function to get the pointer for an offset (instead of the current offset) Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:37 -07:00
Nathan Hjelm	1e518504e4	btl/smcuda: update for BTL 3.0 interface	2015-02-13 11:46:37 -07:00
Nathan Hjelm	aba0675fe7	btl/vader: update for BTL 3.0 interface Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:37 -07:00
Nathan Hjelm	f8ac3fb1e8	btl/ugni: add support for atomic operations Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:37 -07:00
Nathan Hjelm	655604f509	btl/ugni: update for BTL 3.0 interface Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:37 -07:00
Nathan Hjelm	4972d97b8b	btl/template: update for BTL 3.0 interface Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:37 -07:00
Nathan Hjelm	f241b6e0a7	btl/tcp: update for BTL 3.0 interface Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:36 -07:00
Nathan Hjelm	25176cad27	btl/sm: update for BTL 3.0 interface Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:36 -07:00
Nathan Hjelm	19abc19ad9	btl/self: update for BTL 3.0 interface Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:36 -07:00
Nathan Hjelm	f96d48a2e1	btl/scif: update for BTL 3.0 interface Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:36 -07:00
Nathan Hjelm	cf91156105	btl/openib: add atomic operation support Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:36 -07:00
Nathan Hjelm	74f1af4548	btl/openib: update for BTL 3.0 interface Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:36 -07:00
Nathan Hjelm	fc7397949c	btl: require that btls handle descriptor = NULL in the btl_sendi function The send inline optimization uses the btl_sendi function to achieve lower latency and higher message rates. Before this commit BTLs were allowed to assume the descriptor was non-NULL and were expected to return a valid descriptor if the send could not be completed using btl_sendi. This behavior was fine until the usage of btl_sendi was changed in ob1. This commit allows the caller to specify NULL for the descriptor. The affected btls have been updated to handle this case. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:36 -07:00
Nathan Hjelm	593f97ae92	btl: add support for 64-bit atomic operations This commit adds an interface for btl's to export support for 64-bit atomic operations on integers. BTL's that can support atomic operations should implement these functions and set the appropriate btl_flags and btl_atomic_flags. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:36 -07:00
Nathan Hjelm	f8e15ca83d	Update the interface to provide a cleaner interface for RDMA operations. The old BTL interface provided support for RDMA through the use of the btl_prepare_src and btl_prepare_dst functions. These functions were expected to prepare as much of the user buffer as possible for the RDMA operation and return a descriptor. The descriptor contained segment information on the prepared region. The btl user could then pass the RDMA segment information to a remote peer. Once the peer received that information it then packed it into a similar descriptor on the other side that could then be passed into a single btl_put or btl_get operation. Changes: - Added functions to register and deregister memory regions with the btl. If no registration is needed a btl should set these function pointers to NULL. These functions take over for btl_prepare_src/dst and btl_free for RDMA operations. The caller should specify the maximum permissions needed on the memory. - Changed the function signatures for both btl_put and btl_get. In place of a prepared descriptor the caller should provide the source and destination addresses and registration handles as well as a new callback function. The callback will be provided with the local address and registration handle, callback context, callback data, and status. See mca_btl_base_rdma_completion_fn_t in btl.h. - Added a new btl constraint: MCA_BTL_REG_HANDLE_MAX_SIZE. This value specifies the maximum size of any btl's registration handle. - Removed the btl_prepare_dst function. This reflects the fact that RDMA operations no longer depend on "prepared" descriptors. - Removed the btl_seg_size member. There is no need for btl's to subclass the mca_btl_base_segment_t class anymore. - Expose the btl's put/get limitations with new struct members: btl_put_limit, btl_put_alignment, btl_get_limit, btl_get_alignment. - Remove the mca_mpool_base_registration_t argument from the btl_prepare_src function. The argument was intended to support RDMA operations and is no longer necessary. - Remove des_remote/des_remote_count from the mca_btl_base_descriptor_t structure. This structure member was originally used to specify the remote segment for RDMA operations. Since the new btl interface no longer uses descriptors for RDMA this member no longer has a purpose. In addition to removing these members the local segment structure fields have been renamed from des_local/des_local_count to des_segments/des_segment_count. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:36 -07:00
Howard Pritchard	6a275f4489	Merge pull request #395 from hppritcha/topic/pmix_cray_kvs pmix/cray: remove workaround for OBJ_RELEASE	2015-02-13 11:25:50 -07:00
Howard Pritchard	bd9d185951	pmix/cray: remove workaround for OBJ_RELEASE Per feedback from rhc, manually set the base_ptr member of the opal_buffer_t variable to NULL prior to calling OBJ_RELEASE. A similar feature of opal_dss.load also exists so likewise reset the base_ptr to NULL prior to invoking it. Hopefully the opal_buffer_t struct does not change frequently. Minor cleanups to reduce output when pmix_base_verbose mca parameter is set.	2015-02-13 07:47:26 -08:00
Gilles Gouaillardet	ca3a275823	opal/util: fix misc memory leaks reported by Coverity fixes CID 996174, 996920, 1196735, 1196769 and 1196770	2015-02-13 14:28:59 +09:00
Jeff Squyres	f7b4b23383	usnic: ensure to NULL-terminate the string/not overflow This was CID 1269921.	2015-02-12 13:41:30 -08:00
Jeff Squyres	8febd41a39	usnic: fix minor memory leak This was CID 1269859.	2015-02-12 13:41:30 -08:00
Jeff Squyres	4c074da1c2	usnic: fix minor memory leak This was CID 1269853.	2015-02-12 13:41:30 -08:00
Jeff Squyres	a7ce2d406c	usnic: don't bother comparing unsigned values for <0 This was CID 1269812.	2015-02-12 13:41:30 -08:00
Jeff Squyres	caacc6ad91	usnic: properly differentiate data pool vs. malloc usnic_fls() can actually return 0, leading us to incorrectly free() a buffer instead of OMPI_FREE_LIST_RETURN_MT'ing it. So add an explicit bool in the struct that tracks whether the buffer came from malloc or a freelist. This was CID 1269660.	2015-02-12 13:41:30 -08:00
Jeff Squyres	3b39535ebb	usnic: ensure that the string is NULL-terminated This was CID 1269666.	2015-02-12 13:41:30 -08:00
Jeff Squyres	41c6e26a38	usnic: ensure the copied string is NULL-terminated This was CID 1269667	2015-02-12 13:41:30 -08:00
Jeff Squyres	81585c0a7c	usnic: strengthen the check-if-accept()-failed test This was Coverity CID 1269801.	2015-02-12 13:41:30 -08:00
Jeff Squyres	117e6feaa1	shmem sysv: ensure we don't shmdt(NULL) This was CID 71999.	2015-02-12 13:41:30 -08:00
Jeff Squyres	6d3a84514f	mca_base_cmd_line.c: fix minor memory leak This was CID 1269874.	2015-02-12 13:41:29 -08:00
Jeff Squyres	a1037cd70a	if.c: fix minor memory leak This was CID 1269846.	2015-02-12 13:41:29 -08:00
Jeff Squyres	4a85f759ec	opal_info_support.c: prevent a NULL pointer If NULL is passed in, then assume the caller meant "". This was CID 993714.	2015-02-12 13:41:29 -08:00
Jeff Squyres	29794af0e9	cmd_line.c: use strncat() instead of strcat() Be safe about appending to the end of strings. This was CID 71932 (and probably also others).	2015-02-12 13:41:29 -08:00
Jeff Squyres	f8e334357d	mca_base_pvar.c: protect removal from list Only remove it from the list if it is actually on the list. This was CID 1269758.	2015-02-12 13:41:29 -08:00
Jeff Squyres	e188c75edc	opal_environ.c: ensure "value" is a valid string for the setenv() case This was CID 1269764.	2015-02-12 13:41:29 -08:00
Nathan Hjelm	f1dc29b145	btl/vader: fix modex size when xpmem is in use Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-12 14:06:24 -07:00
Nathan Hjelm	49ba150972	mca/base: fix path string parsing CID 993709	2015-02-12 13:03:46 -07:00
Jeff Squyres	00c878957c	mca_base_var.c: add debug check for another programming error Coverity alerted us to the fact that there are places where the synonym_for param is hard-coded to -1 when calling register_variable(). It would be a coding error if synonym_for==-1 and (flags & MCA_BASE_VAR_FLAG_SYNONYM)>0, so let's add that to the debug-only check at the top of the function. This was CID 993717.	2015-02-12 10:24:02 -08:00
Jeff Squyres	167d72ec68	net.c: ensure to free the args in the error case This was CID 710643.	2015-02-12 10:24:02 -08:00
Jeff Squyres	332943f1c3	pstat linux: ensure to close the file This was CID 71983.	2015-02-12 10:24:02 -08:00
Jeff Squyres	6a64fe85a1	pstat linux: ensure read() returns >=0 This was CID 71182.	2015-02-12 10:24:02 -08:00
Jeff Squyres	8be0e0b0ca	usnic: don't close fp upon error Let the caller close fp. Properly check for errors when calling subroutines. This was Coverity CID 1269995.	2015-02-12 10:24:01 -08:00
Nathan Hjelm	1c8f8c6694	opal_fifo: add a couple of memory barriers to the cswap2 implementation	2015-02-12 11:01:40 -07:00
Howard Pritchard	0cf2b478e0	Merge pull request #391 from hppritcha/topic/cray_pmi_kvs pmix/cray: initial kvs removal work	2015-02-11 19:55:34 -07:00
Howard Pritchard	9955834ff1	pmix/cray: initial kvs removal work Remove use of the Cray PMI KVS - which is designed for a lightweight MPI that exchanges only a minimal amount of connection info (about 128 bytes per rank) - within cray/pmix. Use Cray PMI collective extensions instead. This is the first of several steps to accelerate launch of Open MPI on Cray systems using either native aprun or nativized slurm.	2015-02-11 15:14:55 -08:00
Rolf vandeVaart	08dceda2c0	Fix logic for handling priority and eager RDMA. There was some refactoring that was done in this code and it ended up changing the logic that is used to set up eager RDMA. Rather than setting up eager RDMA with a high priority message, it did it the other way around. For some reason, CUDA-aware support did not like this. So, basically, restore the logic to the way it was prior to the refactoring. The refactoring did not intend to change this. Lightly reviewed by hjelmn.	2015-02-11 16:38:36 -05:00
Jeff Squyres	08285c6361	lt_interface: properly check OPAL_HAVE_LTDL_ADVISE	2015-02-11 12:25:20 -08:00
Jeff Squyres	4f1996df5d	various: remove $(LTDLINCL) from Makefile.am's that didn't need it	2015-02-11 12:25:20 -08:00
Ralph Castain	3de8c5c7c6	Cleanup the munge support - the credential cannot be reused for multiple connections	2015-02-10 20:34:35 -08:00
George Bosilca	e173f9b0c0	Somehow we lost one of the most critical parameter allowing the PML to decide how to order the different interconnects. Bring it back !	2015-02-10 20:32:05 -05:00
George Bosilca	7f4c5fa96f	Add the displacement of the element to the safeguard check.	2015-02-10 20:13:36 -05:00
Ralph Castain	3ae3b96c17	Fix master compilation - a buried header dependency must have been removed.	2015-02-10 07:22:10 -08:00
Mike Dubman	6816e3421f	Merge pull request #377 from regrant/ib_wr_fix fix problem with get_pathrecord posting too many recv requests	2015-02-10 08:47:23 +02:00
Ralph Castain	bef830efef	Fix debug output	2015-02-09 20:49:04 -08:00
Ralph Castain	07134f5b17	Add munge security	2015-02-09 20:49:03 -08:00
Ralph Castain	a3275aa867	Once again, fix the blasted singleton comm_spawn	2015-02-05 17:34:25 -08:00
Jeff Squyres	0dbbffb753	pmix_base_frame: use the "= { 0 }" initializer Per open-mpi/ompi#381, convert the specific initialization of opal_pmix to use the generic "= { 0 }" initializer. This form can be used to initialize any type when the intent is just to zero out / assign some value.	2015-02-05 17:51:06 -05:00
Ralph Castain	f28238af59	Fix a race condition seen by Absoft during finalize. Stop the orte progress thread without cleaning it up, thus allowing the frameworks to still cancel their posted recv's. Then cleanup the memory footprint afterwards.	2015-02-05 11:41:37 -08:00
Ralph Castain	4d882796b6	Silence warnings	2015-02-05 11:41:00 -08:00
Howard Pritchard	e508a4078e	Merge pull request #376 from regrant/ib_error_fix fixes OpenIB connect error reporting for ibv_* calls that return an errn...	2015-02-04 10:22:03 -07:00
Jeff Squyres	621af3aa07	pmix_base: fix global opal_pmix symbol for static linking on OS X OS X has weirdness when static linking. If a symbol is not initialized, it is put into the common block section, and Weird Things happen (linking when trying to using that global symbol will fail). If you initialize the variable, it goes into a different section (and linking to it will work). This link (that might go stale someday) has some information about OS X linker scope and treatment of symbol definitions: https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachOTopics/1-Articles/executing_files.html#//apple_ref/doc/uid/TP40001829-98432-TPXREF120 Fixes #375.	2015-02-04 12:12:31 -05:00
Ryan Grant	de93497789	fix problem with get_pathrecord posting too many recv requests	2015-02-04 09:53:58 -07:00
Ryan Grant	5d5e9bc1f8	fixes OpenIB connect error reporting for ibv_* calls that return an errno	2015-02-04 09:09:14 -07:00
Jeff Squyres	a3728f09af	libfabric: add another missing file to the Makefile.am	2015-02-04 04:02:27 -08:00
Jeff Squyres	66a680879e	libfabric: fix header file name in Makefile.am	2015-02-03 19:41:25 -08:00
Jeff Squyres	cb7cc171f9	usnic: update README.txt notes Update notes about copying the usnic BTL between master and the v1.8 branch.	2015-02-03 15:54:36 -08:00
Jeff Squyres	edf7232e00	usnic: enable building with an external libfabric	2015-02-03 13:46:06 -08:00
Jeff Squyres	bfa54d5d7b	usnic: update to match new libfabric	2015-02-03 13:46:06 -08:00
Jeff Squyres	d2490d2fd8	libfabric: update Makefile.am to match new libfabric drop	2015-02-03 13:46:05 -08:00
Jeff Squyres	3dc0abfbc4	libfabric: update to (just past) 1.0rc1 Updated to Github ofiwg/libfabric@6b005d0d19.	2015-02-03 13:46:05 -08:00
Ralph Castain	d3267c200f	Add missing OMPI-changes to libevent 2.0.22	2015-02-02 20:57:40 -08:00
Jeff Squyres	965ccab6cc	libfabric: remove a few warnings Embedding libfabric is a temporary measure; I'm removing some warning notifications so that the output isn't so cluttered (we're getting the real warnings fixed upstream, but the OMPI community doesn't really care/need to see the warnings in the meantime).	2015-01-29 17:38:02 -08:00
Todd Kordenbrock	37e6096fe7	Copyright update.	2015-01-29 11:08:13 -06:00
Todd Kordenbrock	ca30e129e8	Add the option to use the Portals4 logical to physical table. This commit adds an MCA variable to select Portals4 logical addressing, populates the logical-to-physical mapping table and initializes the NI in this mode.	2015-01-29 11:08:13 -06:00
George Bosilca	b9a63cbe7a	One less warning.	2015-01-27 13:25:55 -05:00
Ralph Castain	294ebc907a	Fix singleton operations so they can work inside a slurm environment	2015-01-27 09:29:42 -06:00
Ralph Castain	ba25e8a0ce	Fix singletons	2015-01-27 09:29:42 -06:00
Ralph Castain	028b00154d	Complete implementation of the schizo framework to support OMPI component	2015-01-27 09:29:42 -06:00
Jeff Squyres	436223959d	usnic: update to match new libfabric APIs	2015-01-24 05:49:36 -08:00
Jeff Squyres	7d5755f62b	libfabric: update to ofiwg/libfabric@b3f7af4c67 Pull down a new embedded copy of libfabric from https://github.com/ofiwg/libfabric.	2015-01-24 05:48:48 -08:00
Howard Pritchard	4de512af66	Merge pull request #358 from hppritcha/topic/ugni_spawn_issue btl/ugni: use PMIX_GLOBAL for modex_send in ugni	2015-01-22 12:55:46 -06:00
Howard Pritchard	056daa05bf	btl/ugni: use PMIX_GLOBAL for modex_send in ugni Using PMIX_REMOTE is not the right thing for ugni BTL when it is possible that spawned ranks end up on the same node as some of the spawnee ranks.	2015-01-22 06:53:45 -08:00
Bert Wesarg	0d0a754c42	Remove VampirTrace.	2015-01-22 08:08:07 +01:00
Gilles Gouaillardet	9f80aa2d28	btl/openib: regression fix when rdmacm or udcm are disabled This fixes a regression introduced in open-mpi/ompi@661c35ca67 Thanks to Mark Santcroos for reporting this issue	2015-01-20 11:31:50 +09:00
George Bosilca	da83b084f5	Shifting the datatype around should alter its true LB and UB.	2015-01-19 02:28:17 -05:00
George Bosilca	3ae89dc686	Clarify some of the comments.	2015-01-19 02:26:59 -05:00
Rolf vandeVaart	66f6026214	Improve error message to help user figure out what to do	2015-01-16 13:55:27 -05:00
Jeff Squyres	65a279019e	usnic: fix typo in memchecker usage	2015-01-16 09:42:19 -08:00
Jeff Squyres	3969fe3a94	libfabric: ensure wrapper libs are loaded for static builds For static builds, we need to also set <framework>_<component>_WRAPPER_EXTRA_LIBS so that the wrappers know what other libraries to add to link executables.	2015-01-16 09:29:52 -08:00
Gilles Gouaillardet	661c35ca67	cleanup dead code caused by the removal of the --with-threads configure option	2015-01-16 19:13:59 +09:00
Gilles Gouaillardet	ac16970d21	opal_tree: use a safer syntax The Intel compiler incorrectly inlines this function, so use a safer syntax to get correctly generated code.	2015-01-16 18:45:55 +09:00
Gilles Gouaillardet	5687ce8a07	Revert "opal/lifo: fix type declaration when cmpset_128 is available" This reverts commit `1ba36175be`.	2015-01-16 15:18:07 +09:00
Gilles Gouaillardet	1ba36175be	opal/lifo: fix type declaration when cmpset_128 is available	2015-01-16 15:12:29 +09:00
Gilles Gouaillardet	b23126497c	Merge branch 'master' of https://github.com/open-mpi/ompi	2015-01-16 10:55:35 +09:00
Nathan Hjelm	006074c48d	Merge pull request #332 from hjelmn/openib_updates Openib updates	2015-01-15 15:05:18 -06:00
Jeff Squyres	d13c14ec82	CSCus22527: fix off-by-one error in checking the number of VFs Ensure to count this process when checking for how many VFs we need on the local server. (cherry picked from commit 386c01934e98cb8dcb48ff648ecdfb0c8677baa9)	2015-01-15 11:44:29 -08:00
Jeff Squyres	4685767b2d	libfabric: update usnic configury Use new common m4 macro for choosing between libnl3 and libnl.	2015-01-15 07:12:39 -08:00
Jeff Squyres	400b02e566	libfabric: update to github:ofiwg/libfabric HEAD Specifically: bbf0f3ea8e92c92a7cee56473ecdbbbb34cceb7d (15 Jan 2015)	2015-01-15 07:11:54 -08:00
Gilles Gouaillardet	bf6adedd70	atomic/ia32: silence warnings	2015-01-15 18:53:58 +09:00
Aurélien Bouteiller	f49981bb2a	Disable coalescing until pull request #332 gets in.	2015-01-14 14:12:47 -05:00
Nathan Hjelm	cf4975501d	rcache/vma: fix parent class of mca_rcache_vma_t There was a mismatch between the structure for mca_rcache_vma_t and the OBJ_CLASS_INSTANCE. One was opal_list_item_t and the other was ompi_free_list_item_t. The super class in the structure looks like it is the correct one. Changed the superclass in OBJ_CLASS_INSTANCE to match.	2015-01-14 10:21:24 -07:00
Jeff Squyres	e4e5e7dbc0	usnic: ensure to clean up nicely in case of low resources If there are not enough resources (e.g., low VFs), we can end up calling finalize_one_channel() on the same channel multiple times. So ensure to NULL out fields that we have freed already so that we do not try to free them a second time. Fixes CSCus26648.	2015-01-13 14:37:31 -08:00
Jeff Squyres	8807ae2497	usnic libfabric: also set the us_netmask_be field. From libfabric upstream commit ofiwg/libfabric@3976745. Part of the fix for CSCus22495.	2015-01-13 12:04:57 -08:00
Jeff Squyres	d00cede718	usnic: fix if_include/exclude of CIDR-specified networks Fix the ordering so that we obtain the usnic netmask information before we do the filtering based on CIDR-specified networks. Also requires upstream Github libfabric commit 3976745. Fixes CSCus22495.	2015-01-13 12:04:51 -08:00
Jeff Squyres	a220b92cf8	usnic: fix function name in opal_output	2015-01-13 12:04:07 -08:00
Gilles Gouaillardet	955f3c2730	configury: check existence of the atomic_init function in libfabric Intel compilers implement atomic_init in C++ only, so disable C11 atomics in libfabric for now	2015-01-13 16:39:41 +09:00
Gilles Gouaillardet	cbe0d26b2d	configury: do test the __STDC_NO_ATOMICS__ macro for libfabric	2015-01-13 16:06:37 +09:00
Jeff Squyres	5ed688a074	usnic: ensure that we only get "usnic"-named providers Also, a minor update to a verbose message.	2015-01-12 13:21:22 -08:00
Jeff Squyres	881b1dcf19	usnic: document libfabric abstractions Handy tips to remember the libfabric abstractions and what they correspond to in usnic/VIC terms.	2015-01-09 15:21:51 -08:00
Gilles Gouaillardet	194d9f84d3	btl/usnic: move call to check_reg_mem_basics() avoid annoying memlock related messages when there is no usnic device.	2015-01-09 11:37:45 +09:00
George Bosilca	1344097d35	Turn OFF the TCP dump mechanism.	2015-01-08 18:50:49 -05:00
George Bosilca	8ddd3b3b09	Cleanup the TCP dump mechanism.	2015-01-08 18:50:05 -05:00
Nathan Hjelm	c65f026fee	btl/vader: fix typo in xpmem setup	2015-01-08 12:52:38 -07:00
Nathan Hjelm	9f6faadd91	opal_fifo: add missing memory barrier in pop Thanks to Adrian Reber for reporting this. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-01-08 09:14:56 -07:00
Gilles Gouaillardet	4c29d8e247	btl/openib: silence warning (unused code)	2015-01-08 17:18:07 +09:00
Gilles Gouaillardet	8ab605d9c5	btl/tcp: fix overflow in mca_btl_tcp_endpoint_dump()	2015-01-08 15:40:16 +09:00
Nathan Hjelm	7d206ae769	btl/ugni: fix a couple of bugs Two fixes: - Do not try to return a mailbox to the free list if one wasn't allocated. - Do not try to tear down IRQ CQs if they were not created.	2015-01-07 13:48:17 -07:00
Dave Goodell	49069bc661	usnic: fix fi_av_insert (ARP resolution) bugs We had several problems in the old code: 1. We were specifying an arbitrary timeout (100 ms) and then abandoning all remaining pending AV insert operations. We would then free the endpoint buffer that we gave to fi_av_insert(), usually causing libfabric's progress thread to write to a freed buffer. 2. We were claiming in a show_help message that the timeout was controllable via an MCA parameter. This commit removes that parameter, since there's no good method for us to specify a timeout like this to libfabric right now. 3. We also weren't waiting for the correct number of fi_av_insert() operations to complete. We were waiting for nprocs, which is accidentally fine for 2 procs on separate hosts, but not for most other proc counts. Reviewed-by: Jeff Squyres <jsquyres@cisco.com>	2015-01-07 08:25:17 -08:00
Gilles Gouaillardet	06e071454e	btl/openib: cleanup duplicate code	2015-01-07 14:07:30 +09:00
Gilles Gouaillardet	135ecce0eb	btl/openib: rename OPAL_HAVE_XRCD macro into OPAL_HAVE_CONNECTX_XRC_DOMAINS	2015-01-07 13:27:25 +09:00
George Bosilca	bf62bed65f	Typo in the poll/epoll ops declaration.	2015-01-06 21:21:25 -05:00
Ralph Castain	a7c5ff2ace	Update to libevent 2.0.22-stable	2015-01-06 16:37:25 -08:00
Nathan Hjelm	6733d89cf9	btl/vader: fix return code check when opening ptrace_scope file	2015-01-06 15:17:56 -07:00
Nathan Hjelm	cde79bfa60	btl/openib: misc cleanup (tabs, etc) and put credit code into a common place (was duplicated in the send and sendi paths)	2015-01-06 11:39:23 -07:00
Nathan Hjelm	9bae131589	btl/openib: fix message coalescing There was a bug in the openib btl handling this valid sequence of calls: desc = btl_alloc (); btl_free (desc); When triggered the bug would cause either fragment loss or undefined behavior (SEGV, etc). The problem occurred because btl_alloc contained the logic to modify the pending fragment (length, etc) and these changes were not corrected if the fragment was freed instead of sent. To fix this issue I 1) moved some of the coalescing logic to the btl_send function, and 2) retry the coalesced fragment on btl_free if it was never sent. This appears to completely address the issue.	2015-01-06 11:39:16 -07:00
Nathan Hjelm	9aaac11648	btl/openib: fix receive queue source detection	2015-01-06 11:39:11 -07:00
Howard Pritchard	7df648f1cf	btl/openib: fix problems from commit `b3617e73` For systems with OFED's lacking XRC support, commit `b3617e73` broke the build of the openib btl. This commit addresses the issues introduced by this commit.	2015-01-06 11:31:12 -07:00
Ralph Castain	4c38c31ccf	Actually copy buffer contents when dss.copy of a buffer is requested	2015-01-06 09:09:06 -08:00
Gilles Gouaillardet	b3617e736e	btl/openib: add XRC support with OFED 3.12+ based on an original patch contributed by Bull.	2015-01-06 15:30:52 +09:00
Howard Pritchard	c857cc926c	Merge pull request #327 from hppritcha/topic/async_progress Topic/async progress	2015-01-05 16:20:44 -07:00
Dave Goodell	8afd8487f8	opal_stdint.h: fix "#pragma GCC" warnings This was more complicated than I would like, but it's just an unfortunate GCC/clang difference. I don't have access to all the C compilers out there, so this may still have problems with other compilers that implement some form of `#pragma GCC diagnostic` support but don't actually behave the same as some versions of GCC. fixes #323	2015-01-05 14:44:46 -08:00
Gilles Gouaillardet	9e9261e90a	pmix: correctly set locality flags in proc_flags do not use opal_process_info.cpuset which is not set at that time.	2014-12-26 15:37:08 +09:00
Howard Pritchard	0a6f841d5f	xpmem/config: simple xpmem search on Cray's Use the pkg-config related m4 functions to find out where Cray's xpmem.h and libxpmem are located on a system. With this commit, there is no longer any need to have to explicitly indicate an xpmem install location on the configure line, at least for Cray systems running CLE 4.X and 5.X.	2014-12-24 14:40:06 -07:00
Howard Pritchard	065c756860	btl/ugni: improve error handling Improve error handling when pthread functions return errors. Remove stale debug code.	2014-12-24 11:50:24 -07:00
Howard Pritchard	f8e354ce00	btl/ugni: add a request_progress_thread mca param Replace temporary environment variables with a MCA parameter for the ugni btl. A user wishing to use the ugni btl async. progress thread needs to set the request_progress_thread param to true. For example, using env. variable format: export OMPI_MCA_btl_ugni_request_progress_thread=1	2014-12-24 11:50:24 -07:00
Howard Pritchard	8b250cc15b	btl/ugni: more debug cleanup	2014-12-24 11:50:24 -07:00
Howard Pritchard	f0c519517b	btl/ugni: switch to using opal_progress Switch to invoking opal_progress from the async progress thread, rather than calling ugni btl specific progress.	2014-12-24 11:50:24 -07:00
Howard Pritchard	47747c1b27	btl/ugni: remove some debug output	2014-12-24 11:50:24 -07:00
Howard Pritchard	2d14c2a204	btl/ugni: switch to using tx cq irqs for rdma Verified via testing with unit tests, etc. that in fact BTE TX descriptors using CQs configured to generate IRQs were in fact working correctly on Cray XC. Disable send message back to self and just use IRQs generated by completion of TX descriptors posted to BTE.	2014-12-24 11:50:24 -07:00
Howard Pritchard	acd07d98da	btl/ugni: turn off chatty debug in irq cq setup	2014-12-24 11:50:24 -07:00
Howard Pritchard	0dec2f4af7	btl/ugni: mark btl frags for irqs as btl owned Make sure frags allocated to generate irqs to wake the progress thread, etc. set the MCA_BTL_DES_FLAGS_BTL_OWNERSHIP flag.	2014-12-24 11:50:23 -07:00
Howard Pritchard	d188f0bc6f	btl/ugni: honor enable_mpi_threads Honor enable_mpi_threads setting to enable the ugni btl async progress thread. If the app doesn't request thread-multiple the thread will not be created.	2014-12-24 11:50:23 -07:00
Howard Pritchard	43cdcb745f	btl/ugni: add missing mutex lock	2014-12-24 11:50:23 -07:00
Howard Pritchard	83bcbd1cf9	btl/ugni: compilation fixes Fix compilation problems in ugni btl associated with async progress additions.	2014-12-24 11:50:23 -07:00
Howard Pritchard	13ab8a9e5a	btl/ugni: use MCA_BTL_DES_FLAGS_SIGNAL Use MCA_BTL_DES_FLAGS_SIGNAL frag flag to indicate whether or not an interrupt needs to be delivered along with a control message going through smsg.	2014-12-24 11:50:23 -07:00
Howard Pritchard	3fc7b389ff	initial async progress changes for gni	2014-12-24 11:50:23 -07:00
Devendar Bureddy	ccafc62c07	OMPI: btl openib: fix max registrable memory calculation - By default, allow registering the maximum possible memory (i.e. 2 * total_memory). This behaviour can be turned off using the mca parameter "btl_openib_allow_max_memory_registration" - In the fallback case, use device specific parameters to calculate the memory limit.	2014-12-23 23:35:54 +02:00
Howard Pritchard	ffbf9738a3	btl/vader: disable SGI UV xpmem for now This commit allows master to build again on SGI UV systems. Fixes #322	2014-12-23 12:04:25 -07:00
Gilles Gouaillardet	f6da257477	configury: test external hwloc version is 1.8 or greater hwloc_topology_dup is only available from hwloc 1.8	2014-12-22 13:42:38 +09:00
Jeff Squyres	40dd4c5b76	configury: manually remove some stamp-h? files Due to what might be a bug in Automake, we need to remove stamp-h? files manually. See http://debbugs.gnu.org/cgi/bugreport.cgi?bug=19418.	2014-12-20 08:32:57 -08:00
Jeff Squyres	d5b3e5802e	libfabric configury: add more tests Properly test for some dependent libraries; don't just assume elsewhere in Open MPI's configury will find those libraries. Also consolidate some CPPFLAGS and clarify some comments.	2014-12-20 08:32:47 -08:00
Jeff Squyres	012e008649	libfabric configury: make AC_CONFIG_FILES be unconditional Also add the generated config.h file to .gitignore.	2014-12-20 08:32:47 -08:00
Jeff Squyres	45ef0352d7	libfabric: do a proper check for intrinsic atomics	2014-12-20 08:32:46 -08:00
Jeff Squyres	ff1364cbe4	Revert "libfabric: add missing header file" That wasn't a missing header file; in fact, it should have been .gitignored! This reverts commit `35bf5fc60c`.	2014-12-19 17:39:30 -08:00
Jeff Squyres	35bf5fc60c	libfabric: add missing header file	2014-12-19 17:33:11 -08:00
Jeff Squyres	e0f660cb9e	libfabric: fix clang compile error in usnic provider From ofiwg/libfabric@0078c93ae4	2014-12-19 15:45:16 -08:00
Jeff Squyres	75797c4f30	libfabric: update embedded libfabric configury To support the newly-copied libfabric downloaded from github ofiwg/libfabric@8da3957de3.	2014-12-19 14:45:30 -08:00
Jeff Squyres	e2362988a9	libfabric: update to ofiwg/libfabric@8da3957de3 Pull down a new embedded copy of libfabric from https://github.com/ofiwg/libfabric.	2014-12-19 14:45:21 -08:00
Howard Pritchard	91b0d03bf2	pmix/cray: remove dead code	2014-12-19 13:08:23 -08:00
Ralph Castain	123fdd603f	If we are using hwthread cpus, then default to binding there, letting the user override to whatever they want	2014-12-19 08:04:28 -08:00
Rolf vandeVaart	26482db736	Bump up max send size. Gives much better performance for GPU transfers while only decreasing host transfers by a small amount.	2014-12-18 13:22:58 -08:00
Jeff Squyres	de31b08a24	Merge pull request #319 from miked-mellanox/topic/opal_path_nfs_autofs skip check for autofs if fstype is autofs jenkins: check	2014-12-18 15:47:16 -05:00

3323 commits