openmpi

Author	SHA1	Message	Date
Ralph Castain	4e592ac434	Fix the tarball by providing the correct list of headers in the Makefile.am	2015-01-07 18:37:26 -08:00
Nathan Hjelm	7d206ae769	btl/ugni: fix a couple of bugs Two fixes: - Do not try to return a mailbox to the free list if one wasn't allocated. - Do not try to tear down IRQ CQs if they were not created.	2015-01-07 13:48:17 -07:00
mjbhaskar	2d33b0a745	A fix for memory corruption seen on 32 bit machines	2015-01-07 14:41:44 -06:00
mjbhaskar	27dfcaaab2	Merge branch 'master' of https://github.com/open-mpi/ompi	2015-01-07 14:39:23 -06:00
mjbhaskar	74f8ba2acb	A fix for memory corruption problem	2015-01-07 14:34:38 -06:00
Howard Pritchard	f34dd5f5fd	plm/alps: update copyright	2015-01-07 12:33:38 -07:00
Howard Pritchard	c454d11b01	plm/alps: fix orted abort hang problem Turns out the alps plm component wasn't changing the state of the job upon terminating the orted's in the case of an abnormal termination. This caused mpirun to hang with a zombie'd aprun process if an orted on a node in the job was killed via signal.	2015-01-07 12:31:41 -07:00
Nathan Hjelm	81dc3a5db9	Merge pull request #335 from hjelmn/osc_updates Osc updates	2015-01-07 11:16:55 -06:00
Dave Goodell	49069bc661	usnic: fix fi_av_insert (ARP resolution) bugs We had several problems in the old code: 1. We were specifying an arbitrary timeout (100 ms) and then abandoning all remaining pending AV insert operations. We would then free the endpoint buffer that we gave to fi_av_insert(), usually causing libfabric's progress thread to write to a freed buffer. 2. We were claiming in a show_help message that the timeout was controllable via an MCA parameter. This commit removes that parameter, since there's no good method for us to specify a timeout like this to libfabric right now. 3. We also weren't waiting for the correct number of fi_av_insert() operations to complete. We were waiting for nprocs, which is accidentally fine for 2 procs on separate hosts, but not for most other proc counts. Reviewed-by: Jeff Squyres <jsquyres@cisco.com>	2015-01-07 08:25:17 -08:00
Gilles Gouaillardet	06e071454e	btl/openib: cleanup duplicate code	2015-01-07 14:07:30 +09:00
Gilles Gouaillardet	135ecce0eb	btl/openib: rename OPAL_HAVE_XRCD macro into OPAL_HAVE_CONNECTX_XRC_DOMAINS	2015-01-07 13:27:25 +09:00
Ralph Castain	e0927895db	Grrr...how many files did they forget?	2015-01-06 19:40:18 -08:00
Ralph Castain	84c41429e9	Add missing file	2015-01-06 18:41:11 -08:00
George Bosilca	bf62bed65f	Typo in the poll/epoll ops declaration.	2015-01-06 21:21:25 -05:00
Ralph Castain	a7c5ff2ace	Update to libevent 2.0.22-stable	2015-01-06 16:37:25 -08:00
Howard Pritchard	061a587384	Merge pull request #336 from hppritcha/topic/odls_signal_fix odls/base: fix an edge case with signals	2015-01-06 16:11:22 -07:00
Howard Pritchard	f0f98f13b6	odls/base: fix an edge case with signals In the course of doing some testing with how orted's handle signaled child processes, found out that very often doing a kill -9 on a process on a node just results in the job hanging. The problem was that the orted odls/errmgr was not properly handling the exit_code being returned from waitpid. Now mark the proc state as ORTE_PROC_STATE_ABORTED_BY_SIG if the exit_code from waitpid indicates the process exited owing to a signal.	2015-01-06 15:42:38 -07:00
Nathan Hjelm	6733d89cf9	btl/vader: fix return code check when opening ptrace_scope file	2015-01-06 15:17:56 -07:00
Nathan Hjelm	e68ed2876c	osc/pt2pt: threading fixes and code cleanup	2015-01-06 13:39:16 -07:00
Nathan Hjelm	3d79806805	add more internal RMA error codes	2015-01-06 13:39:04 -07:00
Nathan Hjelm	9eba7b9d35	Rename the OSC "rdma" component to pt2pt to better reflect that it does not actually use btl rdma	2015-01-06 13:38:55 -07:00
Nathan Hjelm	cde79bfa60	btl/openib: misc cleanup (tabs, etc) and put credit code into a common place (was duplicated in the send and sendi paths)	2015-01-06 11:39:23 -07:00
Nathan Hjelm	9bae131589	btl/openib: fix message coalescing There was a bug in the openib btl handling this valid sequence of calls: desc = btl_alloc (); btl_free (desc); When triggered the bug would cause either fragment loss or undefined behavior (SEGV, etc). The problem occurred because btl_alloc contained the logic to modify the pending fragment (length, etc) and these changes were not corrected if the fragment was freed instead of sent. To fix this issue I 1) moved some of the coalescing logic to the btl_send function, and 2) retry the coalesced fragment on btl_free if it was never sent. This appears to completely address the issue.	2015-01-06 11:39:16 -07:00
Nathan Hjelm	9aaac11648	btl/openib: fix receive queue source detection	2015-01-06 11:39:11 -07:00
Howard Pritchard	7df648f1cf	btl/openib: fix problems from commit b3617e73 For systems with OFED's lacking XRC support, commit b3617e73 broke the build of the openib btl. This commit addresses the issues introduced by this commit.	2015-01-06 11:31:12 -07:00
Jeff Squyres	cab1379dfb	Fortran: only emit real16 and complex32 if supported This is the master version of @ggouaillardet's patch from open-mpi/ompi-release#148 (there was a minor conflict to fix and several fuzzings of line numbers).	2015-01-06 09:47:26 -08:00
Howard Pritchard	ec632001b1	Merge pull request #329 from ggouaillardet/topic/romio_refresh refresh ROMIO based on v3.2a2-84-gef1cf14	2015-01-06 10:27:20 -07:00
Ralph Castain	4c38c31ccf	Actually copy buffer contents when dss.copy of a buffer is requested	2015-01-06 09:09:06 -08:00
Jeff Squyres	e77838973d	Merge pull request #313 from ggouaillardet/topic/OFED_3_12 btl/openib: add XRC support with OFED 3.12+	2015-01-06 11:33:19 -05:00
Jeff Squyres	3d5a1bfb7b	Merge pull request #334 from yburette/topic/ofimtlbugfixes Topic/ofimtlbugfixes	2015-01-06 11:30:34 -05:00
Gilles Gouaillardet	0914de9eae	refresh ROMIO based on v3.2a2-84-gef1cf14	2015-01-06 19:43:58 +09:00
Gilles Gouaillardet	b3617e736e	btl/openib: add XRC support with OFED 3.12+ based on an original patch contributed by Bull.	2015-01-06 15:30:52 +09:00
Yohann Burette	f01dd429df	Reset pointer to NULL to prevent double-freeing.	2015-01-05 17:01:37 -08:00
Yohann Burette	1e24da90fe	Fix fi_av_insert return code test.	2015-01-05 17:01:37 -08:00
Yohann Burette	5944c294ad	Add return code testing for fi_mr_reg.	2015-01-05 17:01:37 -08:00
Howard Pritchard	c857cc926c	Merge pull request #327 from hppritcha/topic/async_progress Topic/async progress	2015-01-05 16:20:44 -07:00
Howard Pritchard	f009c8425e	Merge pull request #325 from hppritcha/topic/issue_324 opal/configury: allow param usage multiple times	2015-01-05 16:19:14 -07:00
Howard Pritchard	a179d6a1d7	opal/configury: add url ref to OPAL_FLAGS_UNIQ Add a reference to the git issue related to additions to OPAL_FLAGS_UNIQ to handle multiple instances of --param in the CFLAGS env. variable.	2015-01-05 16:01:18 -07:00
Dave Goodell	8afd8487f8	opal_stdint.h: fix "#pragma GCC" warnings This was more complicated than I would like, but it's just an unfortunate GCC/clang difference. I don't have access to all the C compilers out there, so this may still have problems with other compilers that implement some form of `#pragma GCC diagnostic` support but don't actually behave the same as some versions of GCC. fixes #323	2015-01-05 14:44:46 -08:00
Jeff Squyres	ce2008aa88	man pages: update non-blocking send descriptions As noted by Alexander Pozdneev, non-blocking sends are now able to access buffers in pending non-blocking send operations; the buffers just can't be modified.	2015-01-05 15:44:27 -05:00
Mike Dubman	0e4ce91f5f	Merge pull request #331 from miked-mellanox/topic/fix_mkey_recursion_master fix infinite recursion during mkey exchange at scale	2015-01-04 20:23:50 +02:00
Mike Dubman	54a072caaa	OSHMEM: fix infinite recursion and stack size violation send reply before posting the receive request again to limit the recursion size to number of receive requests. send can call opal_progress which calls this function again. If recv req is started stack size will be proportional to number of job ranks.	2015-01-04 16:31:19 +02:00
Devendar Bureddy	e732152304	HCOLL: Fix hcoll supported datatype checks correctly	2015-01-02 21:18:12 +02:00
Gilles Gouaillardet	e8d084e6b9	fix ABI fix Fix an undeleted line in open-mpi/ompi@24df0ed039 Thanks to Nick Papior Andersen for pointing this.	2014-12-28 18:07:51 +09:00
Gilles Gouaillardet	9e9261e90a	pmix: correctly set locality flags in proc_flags do not use opal_process_info.cpuset which is not set at that time.	2014-12-26 15:37:08 +09:00
Gilles Gouaillardet	24df0ed039	MPI_Comm_split_type: fix ABI compatibility ABI compatibility was previously broken in open-mpi/ompi@3deda3dc82	2014-12-25 19:43:58 +09:00
Howard Pritchard	a98441cb12	Merge pull request #328 from hppritcha/topic/xpmem_configury xpmem/config: simple xpmem search on Cray's	2014-12-24 15:04:10 -07:00
Howard Pritchard	0a6f841d5f	xpmem/config: simple xpmem search on Cray's Use the pkg-config related m4 functions to find out where Cray's xpmem.h and libxpmem are located on a system. With this commit, there is no longer any need to have to explicitly indicate an xpmem install location on the configure line, at least for Cray systems running CLE 4.X and 5.X.	2014-12-24 14:40:06 -07:00
Howard Pritchard	065c756860	btl/ugni: improve error handling Improve error handling when pthread functions return errors. Remove stale debug code.	2014-12-24 11:50:24 -07:00
Howard Pritchard	f8e354ce00	btl/ugni: add a request_progress_thread mca param Replace temporary environment variables with a MCA parameter for the ugni btl. A user wishing to use the ugni btl async. progress thread needs to set the request_progress_thread param to true. For example, using env. variable format: export OMPI_MCA_btl_ugni_request_progress_thread=1	2014-12-24 11:50:24 -07:00


21814 Commits