openmpi

Author	SHA1	Message	Date
Jeff Squyres	26705efad0	opal_check_alps: fix configure output There was a path where OPAL_CHECK_ALPS would exit its testing but still leave `opal_check_cray_alps_happy` blank. Fix that by setting it to "no". Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2019-10-18 11:30:00 -07:00
Edgar Gabriel	dce203ffc6	Merge pull request #7057 from edgargabriel/topic/romio321-status-set-elements-fix MPIR_Status_set_bytes: fix for large counts	2019-10-18 08:16:36 -05:00
Nathan Hjelm	b1ef5a40fa	Merge pull request #7016 from hjelmn/fix_btl_uct_from_yet_another_unannounced_api_break_in_the_openucx_uct_layer btl/uct: add support for OpenUCX v1.8 API changes	2019-10-17 06:27:18 -07:00
Jeff Squyres	b6c4d5c118	Merge pull request #7060 from jsquyres/pr/usnic-mca-updates BTL usnic MCA updates	2019-10-15 10:48:10 -04:00
Jeff Squyres	e1e6d8b85e	Merge pull request #7076 from ftab/pr/my-superlative-fix README: Remove info for plugins that aren't used anymore	2019-10-10 14:52:36 -04:00
Jeff Squyres	65fd12feff	Merge pull request #7081 from msbrowning/pr/fixed-README Removed text block from line 883 of README.	2019-10-10 14:52:23 -04:00
Mark Browning	77b3ff9d38	Remove the stale cr MPI extension Also removed text block from line 883 of README. Signed-off-by: Mark Browning <marksbrowning3@gmail.com>	2019-10-10 13:24:30 -04:00
Jeff Squyres	f7ee4463b3	Merge pull request #7079 from CalebProvost/hacktoberfest Edit README	2019-10-10 13:18:54 -04:00
Jeff Squyres	896ce76b64	Merge pull request #7082 from kizill/master Fix ipv6 improper address copy bug	2019-10-10 12:01:44 -04:00
Jeff Squyres	8f3583d3bd	Merge pull request #7073 from Joe-Downs/pr/fix-README README: edit "dist_graph topologies" to "communicator topologies"	2019-10-10 11:55:43 -04:00
Jeff Squyres	d736253079	Merge pull request #7074 from classicsman/pr/fix-README Deleted paragraph	2019-10-10 11:50:38 -04:00
Jeff Squyres	836a0766ae	Merge pull request #7072 from bfitzgit23/pr/fix-README README-fixed-bfitzgit23	2019-10-10 11:39:06 -04:00
CalebProvost	634054fb37	README: minor grammar fixes Signed-off-by: CalebProvost <DHX664@gmail.com>	2019-10-10 11:23:55 -04:00
Rick Gleitz	0c923c5428	README: deleted stale paragraph about fca component Signed-off-by: Rick Gleitz <rgleitz@jefflibrary.org>	2019-10-10 11:07:53 -04:00
Jeff Squyres	f77c3327d8	Merge pull request #7070 from Cfoster01/pr/fix-README updated readme to remove double space on line 297	2019-10-10 11:06:52 -04:00
Jeff Squyres	69eca3c599	Merge pull request #7069 from summonholmes/master Fix a typo: slopen -> dlopen	2019-10-10 11:06:12 -04:00
Jeff Squyres	f10e582f71	Merge pull request #7078 from santa65/pr/readme-fix fix address	2019-10-10 10:32:27 -04:00
bfitzgit23	38da109217	README: Removed stale sentence about --enable-mpi-thread-multiple Signed-off-by: bfitzgit23 <dfitz@me.com>	2019-10-10 10:22:09 -04:00
shanekimble	72b6292b69	Fix a typo: slopen -> dlopen Signed-off-by: shanekimble <skimble@edjanalytics.com>	2019-10-10 10:04:09 -04:00
santa magar	3bbf870fde	README: fix Knem URL Signed-off-by: santa magar <santa65thapa@yahoo.com>	2019-10-10 09:34:33 -04:00
Stanislav Kirillov	0e0763e006	fix ipv6 btl connection bug Signed-off-by: Stanislav Kirillov <staskirillof@yandex.ru>	2019-10-10 11:20:37 +00:00
Dennis Field	e72b93bf60	README: Remove info for plugins that aren't used anymore Signed-off-by: Dennis Field <fury@xibase.com>	2019-10-09 20:38:50 -04:00
Joe Downs	dd6a4f3950	README: edit to "communicator topologies" Signed-off-by: Joe Downs <joedowns502@gmail.com>	2019-10-09 20:22:26 -04:00
Foster	252e98c474	updated readme to remove double space on line 297 Signed-off-by: Foster <CCF6703@yum.com>	2019-10-09 20:17:46 -04:00
Geoff Paulsen	4e1e6f8972	Merge pull request #6993 from awlauria/fix_warnings_master Fix miscellaneous compiler warnings.	2019-10-09 09:17:02 -05:00
Gilles Gouaillardet	096da7b3b5	Merge pull request #7061 from ggouaillardet/topic/ucx_zero_size_ddt pml/ucx: correctly handle zero size datatypes	2019-10-09 17:28:18 +09:00
Gilles Gouaillardet	33361aa124	pml/ucx: correctly handle zero size datatypes zero-size derived datatypes are now flagged as OPAL_DATATYPE_FLAG_CONTIGUOUS so update mca_pml_ucx_init_datatype() to correctly handle them. Since 'size' is a 'size_t', the assertion can simply be removed. Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2019-10-09 16:54:00 +09:00
Gilles Gouaillardet	8906f8cdc6	Merge pull request #7062 from ggouaillardet/topic/travis_distcheck travis: fix make distcheck	2019-10-09 13:30:14 +09:00
Gilles Gouaillardet	d37f35244f	travis: fix make distcheck bad side effect occurs when CPPFLAGS is set in the environment, so set it (and LDFLAGS too) on the configure command line. Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2019-10-09 12:57:54 +09:00
Jeff Squyres	3080033a8c	btl/usnic: set retrans_timeout back down to 5ms Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2019-10-08 11:17:54 -07:00
Jeff Squyres	132e4cab3b	btl/usnic: set ack_iteration_delay default to 4 It was previously accidentally set to 0. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2019-10-08 11:17:30 -07:00
Edgar Gabriel	8a3abbf803	MPIR_Status_set_bytes: fix for large count sizes Change the ncounts argument to MPI_Count and use MPI_Status_set_elements_x for enabling read/write operations beyond the 2GB limit. Thanks to Richard Warren from the HDF5 group for reporting the issue and providing the suggested fix for romio. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2019-10-08 10:47:02 -05:00
Edgar Gabriel	ea7e1ea859	Merge pull request #7046 from edgargabriel/pr/hdf5-2gb-bug comomn_ompio_file_read/write: fix 2GB limiting issue	2019-10-05 10:39:09 -05:00
Edgar Gabriel	a130f569df	comomn_ompio_file_read/write: fix 2GB limiting issue individual read/write operations exceeding 2GB fail in ompio due to improper conversions from size_t to int in two different locations. This commit fixes an issue reported by Richard Warren from the HDF5 group. Fixes Issue #397 Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2019-10-05 09:50:02 -05:00
Jeff Squyres	a49ae7f034	Merge pull request #7038 from jsquyres/pr/usnic-fixes-and-optimizations btl/usnic fixes and optimizations	2019-10-04 19:38:50 -04:00
Jeff Squyres	fe7f772f21	btl/usnic: properly size freelist items Move the prefix area from the head to the body in relevant size computations. This fixes a problem in high traffic situations where usNIC may have sent from unregistered memory. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2019-10-04 14:40:56 -07:00
Jeff Squyres	27e3040dfe	btl/usnic: cap the number of resends per progress iteration New MCA param: btl_usnic_max_resends_per_iteration. This is the max number of resends we'll do in a single pass through usNIC component progress. This prevents progress from getting stuck in an endless loop of retransmissions (i.e., if more retransmissions are triggered during the sending of retransmissions). Specifically: we need to leave the resend loop to allow receives to happen (which may ACK messages we have sent previously, and therefore cause pending resends to be moot). Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2019-10-04 13:05:51 -07:00
Jeff Squyres	3cc95d86b2	btl/usnic: increase default retrans_timeout Significantly increase the default retrans timeout. If the retrans timeout is too soon, we can end up in a retransmission storm where the logic will continually re-transmit the same frames during a single run through the usNIC progress function (because the timer for a single frame expires before we have run through re-transmitting all the frames pending re-transmission). Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2019-10-04 13:05:51 -07:00
Jeff Squyres	968b1a51b5	btl/usnic: clarifications and fixes regarding ACKs New MCA parameter: btl_usnic_ack_iteration_delay. Set this to the number of times through the usNIC component progress function before sending a standalone ACK (vs. piggy-backing the ACK on any other send going to the target peer). Use "ticks" language to clarify that we're really counting the number of times through the usNIC component DATA_CHANNEL completion check (to check for incoming messages) -- it has no relation to wall clock time whatsoever. Also slightly change the channel-checking scheme in usNIC component progress: only check the PRIORITY channel once (vs. checking it once, not finding anything, and then falling through the progress_2() where we check PRIORITY again and then check the DATA channel). As before, if our "progress" libevent fires, increment the tick counter enough to guarantee that all endpoints that need an ACK will get triggered to send standalone ACKs the next time through progress, if necessary. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2019-10-04 13:05:51 -07:00
Jeff Squyres	ce2910a28a	btl/usnic: s/get_nsec/get_nticks/g Rename "get_nsec()" to "get_ticks()" to more accurately reflect that this function has no correlation to wall clock time at all. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2019-10-04 13:05:51 -07:00
Jeff Squyres	f3429d7a44	btl/usnic: pack a wire data struct Might as well save a few bytes when sending this struct across the network via the __opal_attribute_packed__ attribute. That being said, also re-order the elements in this struct so that there's no holes to begin with. Do this so that the compiler/runtime won't effect (slow) unaligned reads/writes because of the __opal_attribute_packed__ attribute. The "packed" attribute is really more about defensive programming (e.g., if we make a mistake and have a hole, "packed" will remove it for us). *** Do not bring this commit back to existing/already-released release branches: it will cause incompatibility, since it effectively changes the usNIC BTL wire protocol. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2019-10-04 13:05:51 -07:00
Josh Hursey	b774b47428	Merge pull request #7032 from jjhursey/fix-sigkill-wait Fix the sigkill timeout sleep to prevent SIGCHLD from preventing completion	2019-10-02 14:48:27 -05:00
Joshua Hursey	0e8a97c598	Fix the sigkill timeout sleep to prevent SIGCHLD from preventing completion. * The user can set `-mca odls_base_sigkill_timeout 30` to have ORTE wait 30 seconds before sending SIGTERM then another 30 seconds before sending SIGKILL to remaining processes. This usually happens on an abnormal termination. Sometimes the user wants to delay the cleanup to give the system time to write out corefile or run other diagnostics. * The problem is that child processes may be completing while ORTE is in this loop. The SIGCHLD will interrupt the `sleep` system call. Without the loop the sleep could effectively be ignored in this case. - Sleep returns the amount of time remaining to sleep. If it was interrupted by a signal then it is a positive number less than or equal to the parameter passed to it. If it slept the whole time then it returns 0. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2019-10-02 14:41:34 -04:00
Austen Lauria	0d4004cc3c	Fix miscellaneous compiler warnings. Signed-off-by: Austen Lauria <awlauria@us.ibm.com>	2019-10-01 16:27:25 -04:00
Jeff Squyres	7ddfa6950b	Merge pull request #7028 from jsquyres/pr/fix-ofi-mtl mtl/ofi: replace OMPI_UNLIKELY with OPAL version	2019-10-01 13:40:21 -04:00
Howard Pritchard	d6d73b7724	mtl/ofi: replace OMPI_UNLIKELY with OPAL version one off patch for v4.0.x. for some reason commit on master didn't have this problem. Signed-off-by: Howard Pritchard <howardp@lanl.gov> (cherry picked from commit 5f3dbdb5c8a94a4f426ecca1a3a91c83035f956c) Note that this commit is actually a cherry-pick from the v4.0.x branch. This is the opposite direction than what we normally do: we usually commit to master first and then cherry-pick to the release branches (vs. the other way around). As is probably evident from the original commit message above, through a comedy of errors, this commit was actually applied to the v4.0.x branch first and then cherry-picked back to master (i.e., the problem did exist in the original master commit 3aca4af548a3d781b6b52f89f4d6c7e66d379609, but it was not recognized at the time). Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2019-10-01 09:52:27 -07:00
Gilles Gouaillardet	280856928a	Merge pull request #7026 from ggouaillardet/topic/openpmix_refresh pmix/pmix4x: refresh to the latest open PMIx master	2019-10-01 18:15:53 +09:00
Gilles Gouaillardet	1c4a3598d0	pmix/pmix4x: refresh to the latest open PMIx master refresh to openpmix/openpmix@ea3b29b1a4 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2019-10-01 14:27:22 +09:00
Ralph Castain	f4371f7f94	Merge pull request #7022 from rhc54/topic/oob2 Remove stale references to orte_oob_base.ev_base	2019-09-29 19:53:12 -07:00
Ralph Castain	7444b32494	Remove stale references to orte_oob_base.ev_base The oob is restricted to the main event base Signed-off-by: Ralph Castain <rhc@pmix.org>	2019-09-29 18:52:03 -07:00
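
Note on commit 0e8a97c598 (sigkill timeout sleep): the message describes re-issuing `sleep` in a loop because a signal such as SIGCHLD can cut it short. The following is a minimal sketch of that pattern, not the actual ORTE code. The helper name `sleep_uninterrupted` and its interruption counter are hypothetical and exist only for illustration. The sketch relies on the POSIX behavior that `sleep()` returns 0 when the full time has elapsed and the number of unslept seconds when a signal handler interrupted it.

```c
#include <unistd.h>

/* Hypothetical helper, not taken from Open MPI: sleep for the full
 * requested duration even if signals (e.g. SIGCHLD) interrupt sleep().
 * Returns how many times the sleep was cut short and resumed. */
static unsigned int sleep_uninterrupted(unsigned int seconds)
{
    unsigned int remaining = seconds;
    unsigned int interruptions = 0;

    while (remaining > 0) {
        /* sleep() returns 0 if it slept the whole time, otherwise the
         * number of seconds left when a signal interrupted it. */
        remaining = sleep(remaining);
        if (remaining > 0) {
            ++interruptions;
        }
    }
    return interruptions;
}
```

A caller would use something like `sleep_uninterrupted(30)` wherever a single `sleep(30)` was previously assumed to wait the full 30 seconds. Without the loop, a child process exiting during the wait could end the delay early.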


30136 commits