Commit graph

25898 commits

Author SHA1 Message Date
Artem Polyakov
55ac3b0be3 orte/schizo: fix binding detection in slurm component
in SLURM 16.05 the SLURM_CPU_BIND_TYPE is equal to "mask_cpu:"
instead of "mask_cpu". Account for that.
2016-08-26 09:55:52 +03:00
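The fix above amounts to accepting the value with or without the trailing colon. A minimal sketch of such a check follows; `is_mask_cpu_bind` is a hypothetical helper written for illustration and is not the actual orte/schizo code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the actual orte/schizo code: returns true when
 * the value of SLURM_CPU_BIND_TYPE names a CPU mask binding, whether or
 * not it carries the trailing ':' reported for SLURM 16.05. */
static bool is_mask_cpu_bind(const char *bind_type)
{
    static const char prefix[] = "mask_cpu";
    size_t n = sizeof(prefix) - 1;

    if (bind_type == NULL || strncmp(bind_type, prefix, n) != 0) {
        return false;
    }
    /* Accept exactly "mask_cpu" or "mask_cpu:"; reject longer names. */
    return bind_type[n] == '\0' ||
           (bind_type[n] == ':' && bind_type[n + 1] == '\0');
}
```

A caller would typically pass the result of `getenv("SLURM_CPU_BIND_TYPE")`, which may be NULL when the variable is unset.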
Gilles Gouaillardet
e4bf915e75 pmix3x: remove auto-generated file
remove opal/mca/pmix/pmix3x/pmix/src/include/pmix_config.h.in
.gitignore is correct, so it seems this file was added before .gitignore was updated
2016-08-26 15:00:18 +09:00
rhc54
c0fff60e59 Merge pull request #2017 from rhc54/topic/pmixconfig
Update configury to support multiple PMIx versions
2016-08-25 21:36:34 -05:00
Ralph Castain
af67f16422 Update configury to support multiple PMIx versions, rename pmix2x component to pmix3x for support of PMIx master
Update support for external v1.1.x and v2.x libraries. Minor corrections to the v3.x component
2016-08-25 18:19:05 -07:00
Gilles Gouaillardet
277c319389 opal/util: fix (again and again) incorrect type casting in opal_path_df
and silence CID 1371767

this fixes previous commits :
 - open-mpi/ompi@2eec8970ff
 - open-mpi/ompi@a439afce5b
2016-08-26 09:42:45 +09:00
Nathan Hjelm
89c2f4974c Merge pull request #2016 from hjelmn/wait_sync
opal/wait_sync: add #if protection on header
2016-08-25 15:13:09 -07:00
Nathan Hjelm
f3d4eaeaf7 Merge pull request #2013 from hjelmn/osc_rdma_fix
osc/rdma: fix bug in dynamic memory window tracking code
2016-08-25 13:42:27 -07:00
Nathan Hjelm
de32c779e2 opal/wait_sync: add #if protection on header
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-08-25 14:31:52 -06:00
rhc54
19b0f4db9f Merge pull request #1995 from rhc54/topic/pe-per-rank
Change the behavior of cpus-per-rank.
2016-08-25 14:38:12 -05:00
Edgar Gabriel
1ba03d38ec io/ompio: protect remaining functions in multi-threaded scenarios
protect the remaining functions where necessary by a mutex lock
to avoid problems in multi-threaded executions. Some functions
do not require that in my opinion, and I provided an explanation
in those cases.
2016-08-25 13:45:51 -05:00
Nathan Hjelm
e53de7ecbe osc/rdma: fix bug in dynamic memory window tracking code
This commit fixes an ordering bug in the code that keeps track of all
attached memory windows. The code is intended to keep the memory
regions sorted but was often inserting at the wrong index. Thanks to
Christoph Niethammer for reporting the issue. The reproducer will be
added to nightly MTT testing.

Fixes open-mpi/ompi#2012

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-08-25 12:08:46 -06:00
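The ordering problem described above, inserting into a list that must stay sorted, can be illustrated with a lower-bound binary search. This is only a sketch under assumptions: `region_t`, `region_insert_index`, and `region_insert` are hypothetical names, and the real osc/rdma data structures differ.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical region record; the real osc/rdma structures differ. */
typedef struct {
    uintptr_t base;
    size_t    len;
} region_t;

/* Index of the first region whose base is >= 'base' (binary search). */
static size_t region_insert_index(const region_t *regions, size_t count,
                                  uintptr_t base)
{
    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (regions[mid].base < base) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return lo;
}

/* Insert while keeping the array sorted by base address. The caller
 * must guarantee room for count + 1 entries. Returns the new count. */
static size_t region_insert(region_t *regions, size_t count, region_t r)
{
    size_t idx = region_insert_index(regions, count, r.base);
    /* Shift the tail up by one slot to open a gap at idx. */
    memmove(&regions[idx + 1], &regions[idx],
            (count - idx) * sizeof(region_t));
    regions[idx] = r;
    return count + 1;
}
```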
Nathan Hjelm
7af138f83b osc/pt2pt: fix possible race in peer locking
It is possible for another thread to process a lock ack before the
peer is set as locked. In this case either setting the locked or the
eager active flag might clobber the other thread. To address this the
flags have been made volatile and are set atomically. Since there is
no opal_atomic_or or opal_atomic_and function just use cmpset for
now.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-08-25 09:28:25 -06:00
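Building an atomic OR out of a compare-and-swap, as the message above describes, can be sketched as follows. The sketch uses C11 `<stdatomic.h>` as a stand-in for the OPAL cmpset call, and the flag names are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical flag bits; the real peer flags in osc/pt2pt may differ. */
#define PEER_FLAG_LOCKED       0x1u
#define PEER_FLAG_EAGER_ACTIVE 0x2u

/* Emulate an atomic OR with a compare-and-swap loop. Returns the value
 * of the flags before the update. */
static uint32_t flags_atomic_or(_Atomic uint32_t *flags, uint32_t bits)
{
    uint32_t old = atomic_load(flags);
    /* On failure, 'old' is refreshed with the current value, so the
     * loop retries with up-to-date flags and never drops a bit set
     * concurrently by another thread. */
    while (!atomic_compare_exchange_weak(flags, &old, old | bits)) {
    }
    return old;
}
```

C11 itself provides `atomic_fetch_or`; the loop is written out here only to mirror the cmpset approach the commit mentions.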
Nathan Hjelm
c082068953 Merge pull request #2006 from hjelmn/osc_pt2pt_fix
osc/pt2pt: fix several bugs
2016-08-25 09:19:29 -06:00
rhc54
17a210f7f0 Merge pull request #2008 from rhc54/topic/binding
Correct the binding algorithm to decouple it from oversubscribe.
2016-08-25 09:25:33 -05:00
Edgar Gabriel
1cee83cc1b use the common/ interfaces in file_preallocate instead of the io_ompio_ interfaces.
Necessary for avoiding potential deadlock situations in multi-threaded scenarios.
2016-08-25 08:55:12 -05:00
Jeff Squyres
0d19cc4a13 README: fix a bunch of typos
Thanks to Paul Hargrove for pointing them out.  Really.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-08-25 09:15:27 -04:00
Jeff Squyres
f56b16f079 usnic: remove unused variable
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-08-25 03:53:18 -07:00
Jeff Squyres
9717bcb7e6 btl/usnic: remove stale comment
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-08-25 03:53:18 -07:00
Jeff Squyres
6f5e377fe0 btl/usnic: update for libfabric v1.4
With libfabric v1.4, the usnic provider changed the values of its
fabric and domain name strings (compared to libfabric <v1.4).  Update
the Open MPI usNIC BTL to handle both pre-v1.4 and v1.4 fabric/domain
names.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-08-25 03:53:17 -07:00
rhc54
b563c9e303 Merge pull request #2003 from rhc54/topic/sync
Set the default value of both barrier counters to zero, thus ensuring the coll/sync component is off by default
2016-08-24 23:18:58 -05:00
Ralph Castain
440eae90ec Correct the binding algorithm to decouple it from oversubscribe.
Oversubscribe stipulates that we allow more procs on the node than assigned slots - it has nothing to do with the number of available pe's. Let overload directives handle the pe situation.
2016-08-24 21:17:22 -07:00
George Bosilca
3adff9d323 Fixes #1793.
Reshape the tearing down process (connection close) to prevent race
conditions between the main thread and the progress thread.

Minor cleanups.
2016-08-24 22:45:19 -04:00
Nathan Hjelm
70f8a6e792 osc/pt2pt: fix several bugs
This commit fixes some bugs uncovered during thread testing of
2.0.1rc1. With these fixes the component is running cleanly with
threads.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-08-24 14:35:45 -06:00
Nathan Hjelm
6de64ddbc1 Merge pull request #2005 from hjelmn/ugni_fix
btl/ugni: actually make the endpoint lock recursive
2016-08-24 11:05:27 -06:00
Nathan Hjelm
83062db7cb btl/ugni: actually make the endpoint lock recursive
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-08-24 10:36:08 -06:00
Ralph Castain
bcf5ac3971 Set the default value of both barrier counters to zero, thus ensuring the coll/sync component is off by default
2016-08-24 07:51:32 -07:00
Gilles Gouaillardet
2eec8970ff opal/util: fix (again) incorrect type casting in opal_path_df
this fixes previous commit open-mpi/ompi@a439afce5b
2016-08-24 12:50:15 +09:00
rhc54
b12e43fc03 Merge pull request #2001 from ggouaillardet/topic/pmix2x_sec_native
fix sec/native module under Solaris and other misc issues
2016-08-23 22:47:05 -05:00
Gilles Gouaillardet
02847d9e7b pmix2x: dstore: add missing <fcntl.h> include file in pmix_esh.c
(back-ported from upstream pmix/master@5c66ffe0f0)
2016-08-24 11:18:46 +09:00
Gilles Gouaillardet
c11e8163f8 pmix2x: sec/native: fix the pmix_native module under solaris by using getpeerucred()
and fail with a user friendly message if no method is available:
"sec: native cannot validate_cred on this system"

(back-ported from upstream pmix/master@c474a1fc60)
2016-08-24 11:18:40 +09:00
Gilles Gouaillardet
e91292aa41 pmix2x: configury: add missing check for <netdb.h> header file
(back-ported from upstream pmix/master@e54ce6d423)
2016-08-24 11:18:32 +09:00
Gilles Gouaillardet
a439afce5b opal/util: fix incorrect type casting in opal_path_df
2016-08-24 10:26:13 +09:00
Ralph Castain
22844b0dc6 Balance priorities to ensure something is below sync
2016-08-23 17:33:45 -07:00
Ralph Castain
540f23c4dd Adjust priority of coll/sync downwards
2016-08-23 17:12:48 -07:00
Jeff Squyres
b5d03c6eea Merge pull request #1996 from bharatpotnuri/patching
Add Chelsio T6 adapter device parameters.
2016-08-23 13:04:26 -04:00
Jeff Squyres
a0a1849101 README: restrict OS X and Oracle Studio compiler versions
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-08-23 09:46:30 -07:00
Edgar Gabriel
41ed4a28d2 add the protective lock around read and write operations in ompio
2016-08-23 11:07:58 -05:00
Jeff Squyres
997431696a opal_check_cma: make consistent with rest of configury
Split the CMA test into two parts so that the back-end test only has
to be run once.  Fail if --with-cma is specified and cannot be
provided.  Remove a few useless quotes.  Change
$ompi_check_cma_need_defs and $ompi_check_cma_happy to be numeric
values.  Finally, remove a bunch of tabs.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-08-23 07:26:47 -07:00
Jeff Squyres
065b93600d AUTHORS: Fix minor typos
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-08-23 06:32:57 -07:00
Potnuri Bharat Teja
9b7f9ece20 Add Chelsio T6 adapter device parameters.
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
2016-08-23 10:38:13 +05:30
Ralph Castain
92102304b6 Minor typo - init the job_data stdin_target field to 0 for default behavior. Add test.
2016-08-22 21:03:45 -07:00
Gilles Gouaillardet
93e73841f9 ess/singleton: push all PMIX_* environment variables, regardless of how many there are
2016-08-23 09:46:55 +09:00
Gilles Gouaillardet
a1e8e58a8a ess/singleton: expects 4 PMIX_* environment variables or more
2016-08-23 09:34:03 +09:00
Howard Pritchard
696121cc4a Merge pull request #1988 from hppritcha/topic/another_ofi_fix
mtl/ofi: fix a botched assignment of av_type
2016-08-22 17:59:59 -06:00
Ralph Castain
7de4d6922b Change the behavior of cpus-per-rank.
We previously counted each cpu against the #slots. However, IBM has pointed out that "slot" is equated to the number of processes allowed to run on each node, and not the number of cpus on the node. This has been a continuing source of confusion, so make the distinction a "hard" one.

Each process occupies a "slot". We automatically set #slots = #cpus if nothing else is told to us. If you want to run more procs than slots, you must tell us to allow oversubscription.

A process can utilize multiple pe's if that option is given. If you try to bind more than one proc to a given pe, then we will error out unless you tell us to allow overloading.
2016-08-22 15:54:41 -07:00
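The slot and pe rules in this message can be summarized in a small check. The sketch below is a simplified model, not the ORTE mapper: `node_t`, `map_status_t`, and `check_mapping` are hypothetical, and overloading is approximated as "more pe's requested than the node has", which forces at least one pe to be shared.

```c
#include <stdbool.h>

typedef enum {
    MAP_OK,
    MAP_ERR_OVERSUBSCRIBED, /* more procs than slots */
    MAP_ERR_OVERLOADED      /* more than one proc on some pe */
} map_status_t;

/* Hypothetical node description. */
typedef struct {
    int cpus;  /* processing elements (pe's) on the node */
    int slots; /* processes allowed; <= 0 means "not specified" */
} node_t;

static map_status_t check_mapping(node_t node, int nprocs, int pes_per_proc,
                                  bool allow_oversubscribe,
                                  bool allow_overload)
{
    /* Default: #slots = #cpus when nothing else was specified. */
    int slots = node.slots > 0 ? node.slots : node.cpus;

    /* Each process occupies one slot, however many pe's it uses. */
    if (nprocs > slots && !allow_oversubscribe) {
        return MAP_ERR_OVERSUBSCRIBED;
    }
    /* Requesting more pe's than exist means some pe must be shared. */
    if (nprocs * pes_per_proc > node.cpus && !allow_overload) {
        return MAP_ERR_OVERLOADED;
    }
    return MAP_OK;
}
```

Under this model, 4 processes with 2 pe's each fit on an 8-cpu node without any extra directive, because they occupy 4 slots rather than 8.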
Ralph Castain
6549c878a9 Silence the warnings
2016-08-22 15:35:27 -07:00
rhc54
aa21013da3 Merge pull request #1994 from rhc54/topic/unify
Unify the PMIx2x components and minor cleanup of coll/sync
2016-08-22 17:20:24 -05:00
Ralph Castain
639dbdb7ea For maintainability, fold the external PMIx 2.x integration into the internal PMIx 2.x library component.
This ensures that we always stay in sync with the two as that is becoming a problem.
2016-08-22 13:28:55 -07:00
Ralph Castain
871bedb103 Add missing "const" qualifiers
2016-08-22 12:54:24 -07:00
Jeff Squyres
17ca44b25e Merge pull request #1984 from jsquyres/pr/auto-generate-AUTHORS
Be able to auto-generate AUTHORS and preserve org affiliations
2016-08-22 15:37:22 -04:00