openmpi

Author	SHA1	Message	Date
Ralph Castain	4774eb8b5a	Update NEWS with 1.10.5 items Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2016-12-19 16:48:45 -08:00
Xin Zhao	2d77912c19	Revert "PML/SPML/UCX: add UCX MT support to PML and SPML." This reverts commit `0ecf3c951c`. Signed-off-by: Xin Zhao <xinz@mellanox.com>	2016-12-19 18:57:48 +02:00
rhc54	0acdcebab2	Merge pull request #2601 from rhc54/topic/dbgr Transfer across final fixes from debugger attach work	2016-12-19 03:56:42 -08:00
Ralph Castain	256b5adac5	Transfer across final fixes from debugger attach work Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2016-12-19 00:34:27 -08:00
rhc54	c1b8538216	Merge pull request #2600 from rhc54/topic/dbg Transfer debugger support changes	2016-12-17 20:13:40 -08:00
Ralph Castain	c6f6f40529	Transfer debugger support changes Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2016-12-17 18:14:46 -08:00
rhc54	54c4925f3f	Merge pull request #2598 from rhc54/topic/debugger Transfer back changes from debugger attach work	2016-12-17 13:09:38 -08:00
Nathan Hjelm	16a2f09cd5	Merge pull request #2596 from hjelmn/x86_rtdtsc opal/timer: add code to check if rtdtsc is core invariant	2016-12-17 11:14:49 -07:00
Ralph Castain	269753f5c1	Transfer back changes from debugger attach work Silence warning Remove debug Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2016-12-17 10:00:52 -08:00
Joshua Gerrard	3332a7d630	Fixed memory leak and some -Werror=unused-result warnings Signed-off-by: Joshua Gerrard <joshuagerrard+ompi-commit@protonmail.com>	2016-12-17 17:43:14 +00:00
Jeff Squyres	bd1828c54d	Merge pull request #2451 from martinkontsek/master master: Add arguments to rpmbuild script and update README.	2016-12-17 12:28:59 -05:00
rhc54	49328eb1ca	Merge pull request #2597 from rhc54/topic/flux Add a flux component for LLNL	2016-12-17 07:40:14 -08:00
Ralph Castain	215d6290e0	Add a flux component for LLNL Fine tuning of flux component Fix a few minor issues with the initial cut: * Job id could be obtained from the PMI kvsname like SLURM, but simpler to getenv (FLUX_JOB_ID) * Flux pmi-1 doesn't define PMI_BOOL, PMI_TRUE, PMI_FALSE * Flux pmi-1 maps the deprecated PMI_Get_kvs_domain_id() to PMI_KVS_Get_my_name() internally, so just call that instead. * Drop residual slurm references. Add wrappers for PMI functions so that if HAVE_FLUX_PMI_LIBRARY is not defined, the component can dlopen libpmi.so at location specified by the FLUX_PMI_LIBRARY_PATH env variable, which adds flexibility. If HAVE_FLUX_PMI_LIBRARY is defined, link with libpmi.so at build time in the usual way. Update configury for flux component Update m4 so the configure options work as follows: --with-flux-pmi Build Flux PMI support (default: yes) --with-flux-pmi-library Link Flux PMI support with PMI library at build time. Otherwise the library is opened at runtime at location specified by FLUX_PMI_LIBRARY_PATH environment variable. Use this option to enable Flux support when building statically or without dlopen support (default: no) If the latter option is provided, the library/header is located at build time using the pkg-config module 'flux-pmi'. Otherwise there is no library/header dependency. Handle the case where ompi is configured with --disable-dlopen or --enable-static. In those cases, don't build the component unless --with-flux-pmi-library is provided. It is fatal if the user explicitly requests --with-flux-pmi but it cannot be built (e.g. due to --disable-dlopen). Add a schizo/flux component Update schizo/flux component Eliminate slurm-specific usage cases. Since the module is only loaded if FLUX_JOB_ID is set, there are only two cases to handle: 1) App was launched indirectly through mpirun. This is not yet supported with Flux, but hook remains in case this mode is supported in the future. 2) App was launched directly by Flux, with Flux providing CPU binding, if any. Fix up white space in pmix/flux component Drop non-blocking fence from pmix:flux component The flux PMI-1 library is not thread safe, therefore register a regular blocking fence callback instead of the thread-shifting fencenb(). pmix/flux component avoids extra PMI_KVS_Gets Keys stored into the base cache under the wildcard rank are not intended to be part of the global key namespace. These keys therefore should not trigger a PMI_KVS_Get() if they are not found in the cache. Minor pmix/flux component cleanup pmix/flux: drop code for fetching unused pmix_id pmix/flux: err_exit must return error Problem: in flux_init(), although 'ret' (variable holding err_exit return code) is initialized to OPAL_ERROR, the variable is reused as a temporary result code, so if there are some successes followed by a failure that doesn't set 'ret', flux_init() could return success with PMI not initialized. Ensure that a "goto err_exit" returns OPAL_ERROR if 'ret' is not set to some other error code. pmix/flux: don't mix OPAL_ and PMI_ return codes Problem: flux_init() can return both PMI_ and OPAL_ return codes. Although OPAL_SUCCESS and PMI_SUCCESS are both defined as 0, other codes are not compatible. Ensure that flux_init() consistently uses 'rc' for PMI_ return codes and 'ret' for OPAL_ return codes. pmix/flux: factor out repeated code for cache put Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2016-12-16 18:26:38 -08:00
Nathan Hjelm	a718743a5c	opal/timer: add code to check if rtdtsc is core invariant Newer x86 processors have a core invariant tsc. On these systems it is safe to use the rtdtsc instruction as a monotonic timer. This commit adds a new function to the opal timer code to check if the timer backend is monotonic. On x86 it checks the appropriate bit and on other architectures it parrots back the OPAL_TIMER_MONOTONIC value. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-12-16 15:11:50 -07:00
Josh Hursey	ced245d093	Merge pull request #2590 from jjhursey/topic/osc-pt2pt-1-thread-fixes Topic/osc pt2pt 1 thread fixes	2016-12-16 12:26:59 -06:00
Mark Allen	eec1d5bf2e	osc/pt2pt: Fix hang with Put and Win_lock_all * When using `MPI_Put` with `MPI_Win_lock_all` a hang is possible since the `put` is waiting on `eager_send_active` to become `true` but that variable might not be reset in the case of `MPI_Win_lock_all` depending on other incoming events (e.g., `post` or ACKs of lock requests). Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2016-12-16 11:52:53 -05:00
Mark Allen	0d1336b4a8	osc/pt2pt: Fix Lock/Unlock and Get wrong answer * When using `MPI_Lock`/`MPI_Unlock` with `MPI_Get` and non-contiguous datatypes it is possible that the unlock finishes too early before the data is actually present in the recv buffer. * We need to wait for the irecv to complete before unlocking the target. This commit waits for the outgoing fragment counts to become equal before unlocking. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2016-12-16 11:52:51 -05:00
Mark Allen	1ebf9fd3a4	osc/pt2pt: Fix PSCW after Fence wrong answer. * If the user uses PSCW synchronization after a Fence then the previous epoch is not reset which can cause the PSCW to transfer data before it is ready leading to wrong answers. * This commit resets the `eager_send_active` in the start call. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2016-12-16 11:52:49 -05:00
Joshua Ladd	d8c1a3de3a	Merge pull request #2589 from xinzhao3/topic/ucx-mt-support PML/SPML/UCX: add UCX MT support to PML and SPML.	2016-12-16 08:53:50 -05:00
Xin Zhao	0ecf3c951c	PML/SPML/UCX: add UCX MT support to PML and SPML. Signed-off-by: Xin Zhao <xinz@mellanox.com>	2016-12-15 23:59:15 +02:00
Martin Kontsek	30d076a2f7	Add arguments to rpmbuild script and update README, implement pull request suggestions. Signed-off-by: Martin Kontsek <mkontsek@cisco.com>	2016-12-15 11:18:41 -08:00
rhc54	00b87ea829	Merge pull request #2584 from rhc54/topic/warnings Reduce the flood of warnings due to uninitialized variables, mismatch…	2016-12-15 10:09:01 -08:00
rhc54	e84f738c11	Merge pull request #2587 from rhc54/topic/oversub Ensure that we don't bind-by-default in an oversubscribed condition	2016-12-15 09:59:48 -08:00
Ralph Castain	2af677b1cf	Ensure that we don't bind-by-default in an oversubscribed condition Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2016-12-15 07:58:52 -08:00
Ralph Castain	585540bcee	Reduce the flood of warnings due to uninitialized variables, mismatched types, and unused things to a more bearable trickle Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2016-12-14 16:33:50 -08:00
Gilles Gouaillardet	a019095b84	pmix2x/class: correctly handle concurrent class initialization (back-ported from upstream commit pmix/master@ceedbd67fd) Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2016-12-15 09:07:24 +09:00
rhc54	15b6eaf2d4	Merge pull request #2562 from rhc54/topic/pmix2 Update the PMIx2 support to include the latest shared memory optimizations	2016-12-14 15:18:33 -08:00
Ralph Castain	884fb7fcf2	Update the PMIx2 support to include the latest shared memory optimizations Update ORTE support for dynamic PMIx operations e.g., PMIx_Spawn Update to track master Ensure that --disable-pmix-dstore actually disables the dstore. Sync to a few debugger updates Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2016-12-14 15:00:10 -08:00
rhc54	db32d1d600	Merge pull request #2577 from rhc54/topic/exitcode Ensure jobs that fail always return a non-zero exit code.	2016-12-14 10:35:07 -08:00
Jeff Squyres	3cb3220094	AUTHORS: update via make-authors.pl script Update .mailmap to catch some inconsistencies. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-12-14 10:28:48 -08:00
Jeff Squyres	a28ae984ee	make-authors: we no longer require organizations Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-12-14 10:20:56 -08:00
Ralph Castain	9f69b0183f	Ensure jobs that fail always return a non-zero exit code. Thanks to Ashley Pittman for the report. Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2016-12-14 09:41:06 -08:00
rhc54	82110dcb53	Merge pull request #2575 from rhc54/topic/tmpdir Use the server tmpdir instead of the system tmpdir for tool contact files	2016-12-14 09:40:13 -08:00
Ralph Castain	1961a1c22a	Use the server tmpdir instead of the system tmpdir for tool contact files Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2016-12-14 08:42:09 -08:00
Alex Mikheev	67d66c2326	oshmem: sshmem: make mmap allocator a default instead of verbs By default use mmap() to allocate memory for the symmetric heap. It is safer and more portable choice than sysv and verbs. Signed-off-by: Alex Mikheev <alexm@mellanox.com>	2016-12-14 13:31:16 +02:00
Nathan Hjelm	8155124adc	Merge pull request #2558 from hjelmn/datatype_fix ompi/datatype: fix bug in darray that causes MPI/IO failures	2016-12-13 14:02:15 -07:00
Yossi	fa6e263821	Merge pull request #2537 from alinask/topic/pml-spml-ucx-api PML/SPML/UCX: Adapt to the API changes in the UCX lib.	2016-12-13 20:01:47 +02:00
Nathan Hjelm	eb439228b1	ompi/datatype: fix bug in darray that causes MPI/IO failures This commit fixes errors in the lb and extent of darray datatypes. For these datatypes the lb should be the start offset of the rank's data in the array and the extent should be the size of the entire datatype. In master the lb was always 0 and the extent was always too small. This commit updates the call to opal_datatype_resize to set the correct lb and fixes the extent calculation. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-12-13 09:25:16 -07:00
Jeff Squyres	f9e8a55a0e	Merge pull request #2543 from ggouaillardet/topic/dll_bit_reproducible ompi/debuggers: make the binary bit reproducible	2016-12-09 06:35:47 -05:00
KAWASHIMA Takahiro	ae056d957c	Merge pull request #2545 from kawashima-fj/pr/inactive-persistent-request ompi/request: Fix a persistent request creation bug	2016-12-09 08:42:31 +09:00
Jeff Squyres	1187212f5d	scaling.pl: minor change to perl quoting Makes emacs syntax highlighting work better. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-12-08 09:25:08 -08:00
Ralph Castain	d5a428b646	Scaling test should only launch one proc/node Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2016-12-08 09:24:22 -08:00
KAWASHIMA Takahiro	6510800c16	ompi/request: Fix a persistent request creation bug According to the MPI-3.1 p.52 and p.53 (cited below), a request created by `MPI_*_INIT` but not yet started by `MPI_START` or `MPI_STARTALL` is inactive therefore `MPI_WAIT` or its friends must return immediately if such a request is passed. The current implementation hangs in `MPI_WAIT` and its friends in such case because a persistent request is initialized as `req_complete = REQUEST_PENDING`. This commit fixes the initialization. Also, this commit fixes internal requests used in `MPI_PROBE` and `MPI_IPROBE` which was marked wrongly as persistent. MPI-3.1 p.52: We shall use the following terminology: A null handle is a handle with value MPI_REQUEST_NULL. A persistent request and the handle to it are inactive if the request is not associated with any ongoing communication (see Section 3.9). A handle is active if it is neither null nor inactive. An empty status is a status which is set to return tag = MPI_ANY_TAG, source = MPI_ANY_SOURCE, error = MPI_SUCCESS, and is also internally configured so that calls to MPI_GET_COUNT, MPI_GET_ELEMENTS, and MPI_GET_ELEMENTS_X return count = 0 and MPI_TEST_CANCELLED returns false. We set a status variable to empty when the value returned by it is not significant. Status is set in this way so as to prevent errors due to accesses of stale information. MPI-3.1 p.53: One is allowed to call MPI_WAIT with a null or inactive request argument. In this case the operation returns immediately with empty status. Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>	2016-12-08 21:42:05 +09:00
Alina Sklarevich	e9d2d029c6	PML/SPML/UCX: Adapt to the API changes in the UCX lib. Signed-off-by: Alina Sklarevich <alinas@mellanox.com>	2016-12-08 11:33:29 +02:00
Gilles Gouaillardet	804a784fce	Merge pull request #2544 from ggouaillardet/topic/mca_spml_yoda_get spml/yoda: fix support for BTLs that do not register memory in mca_sp…	2016-12-08 17:26:07 +09:00
Gilles Gouaillardet	062ed9c919	spml/yoda: fix support for BTLs that do not register memory in mca_spml_yoda_get() Refs open-mpi/ompi#2499 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2016-12-08 15:56:25 +09:00
Gilles Gouaillardet	4d8f606420	ompi/debuggers: make the binary bit reproducible instead of compilation date __DATE__, use a MPI_Get_library_version() like string Thanks Alastair McKinstry for the report Fixes open-mpi/ompi#2518 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2016-12-08 13:46:43 +09:00
rhc54	341ab683de	Merge pull request #2532 from rhc54/topic/pmixptl Update to latest PMIx master + PTL branch	2016-12-07 17:28:22 -08:00
rhc54	25a3e27b07	Merge pull request #2542 from rhc54/topic/ashley Correctly cleanup the local children and node map info on remote orte…	2016-12-07 15:53:08 -08:00
Ralph Castain	e1aa7939ef	Correctly cleanup the local children and node map info on remote orteds upon job completion. Ensure that register_nspace only includes procs from that job in the proc map Thanks to Ashley Pittman for the report Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2016-12-07 13:53:00 -08:00


26368 commits