Commit Graph

27910 Commits

Author SHA1 Message Date
Ralph Castain
3b3ce243bb Merge pull request #4214 from karasevb/pmix1_hang_fix
pmix: fixed immediate request for PMIx v1.2
2017-09-19 06:51:25 -07:00
Ralph Castain
48bbf707c3 Merge pull request #4232 from rhc54/topic/local
Implement support for "local" range when publishing data
2017-09-18 20:18:06 -07:00
Ralph Castain
5708872112 Implement support for "local" range when publishing data
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 2d54f7e0dd3a47260b0b2634aae3361316005933)
2017-09-18 19:34:08 -07:00
Jeff Squyres
2e5e7b8891 Merge pull request #4224 from bwbarrett/graph-coverity
util: Fix graph allocation size
2017-09-18 15:04:34 -04:00
Edgar Gabriel
76a8c67575 io/ompio: add a new grouping option avoiding communication
the new grouping option simple+ performs all calculations used
for the aggregator selection as if the default file view would be used,
thus avoiding communication in file_set_view altogether. This mode
is useful for applications that do not set a file view, but use
explicit offset operations on the default file view.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-09-18 12:30:34 -05:00
Ralph Castain
08c93091f7 Merge pull request #4223 from rhc54/topic/stale
Remove stale tools
2017-09-18 09:43:06 -07:00
Josh Hursey
252be7ffb0 Merge pull request #4215 from jjhursey/fix/plm-lsf-rc
plm/lsf: Improve error message if lsb_launch fails
2017-09-18 11:14:25 -05:00
Josh Hursey
5cb5eb68f5 Merge pull request #4204 from jjhursey/fix/master/without-lsf
Fix --without-lsf and LSF in default search path
2017-09-18 11:04:02 -05:00
Ralph Castain
ed508010b4 Remove stale tools
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-18 07:30:47 -07:00
Antoine Dechaume
08e5ab4d9a Fix: Outdated README link #4220
2017-09-18 11:31:07 +02:00
Boris Karasev
2929f52ffc pmix1: fixed immediate request
This fixes a hang on immediate PMIx requests. PMIx v1.2 does not support
the info key `PMIX_IMMEDIATE`, which leads to the hang. For such requests
the fix uses the key `PMIX_OPTIONAL` so that the request does not go to the server.

Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2017-09-18 09:17:44 +03:00
Brian Barrett
abbe2ffb9f util: Fix graph allocation size
Fix an allocation bug that could occur on non-LP64 platforms.
match_edges_out is an array of integers representing the
edges of the graph (where vertices are ints), with two ints
for every edge.  The previous code allocated enough space
for num_edges * sizeof(int*), which happens to be the same
as num_edges * 2 * sizeof(int) on LP64 platforms, but would
be wrong on all other platforms.

Fixes: CID 1417754

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-17 19:49:26 +00:00
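
The allocation bug described in commit abbe2ffb9f can be illustrated with a minimal sketch. The function names below are hypothetical and are not taken from the Open MPI source; the sketch only assumes what the commit message states, namely that the edge array holds two ints per edge.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not the Open MPI function): bytes needed for an
 * edge list stored as two ints (source vertex, target vertex) per edge. */
static size_t edge_array_bytes(size_t num_edges)
{
    return num_edges * 2 * sizeof(int);
}

/* The size computation the commit message describes as incorrect: it only
 * matches the one above when a pointer is exactly twice as wide as an int,
 * as on LP64 platforms. */
static size_t edge_array_bytes_buggy(size_t num_edges)
{
    return num_edges * sizeof(int *);
}

/* Allocate a zeroed edge list; returns NULL on overflow or failure. */
static int *alloc_edge_array(size_t num_edges)
{
    if (num_edges > (size_t)-1 / (2 * sizeof(int))) {
        return NULL;
    }
    return calloc(num_edges * 2, sizeof(int));
}
```

On a platform with 4-byte ints and 4-byte pointers, the second computation yields half the required size, so writes to the second half of the edge list would run past the end of the buffer.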
Ralph Castain
79f82f2c6d Merge pull request #4217 from rhc54/topic/dvm
Complete the fix of the ORTE DVM.
2017-09-16 14:53:24 -07:00
Ralph Castain
3c914a7a97 Complete the fix of the ORTE DVM. We will now use "prun" instead of "orterun -hnp foo" to execute jobs. This provides the feature of automatic discovery of the orte-dvm so you don't need to manually enter URIs or contact file locations. All IO is forwarded to prun.
Still in the "needs to be done" category:

* mapping/ranking/binding options aren't correctly supported

* if the DVM encounters some errors (e.g., not enough resources for the job), the resulting error is globally set and impacts any subsequent job submission

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-16 13:13:07 -07:00
Brian Barrett
bffcc3bca0 util: move graph solver from usnic to util
Cisco wrote a bipartite graph solver to properly solve
interface pair selection for usNIC.  Using the reachable
framework, the TCP BTL (and possibly the runtime network
code) can use the graph solver to make more optimal pair
selection.  Jeff was happy to have the code more broadly
used, but didn't have time to do the move, hence this
commit.

There are a couple of minor changes to the code compared
to the usNIC version.  Obviously, the functions have
been renamed to match naming convention for their new
home.  Since it's easier to write unit tests for
util/ code, the unit tests have been made first class
tests run at "make check" time.  This last bit required
moving some of the definitions into a new header,
bipartite_graph_internal.h, so that they could be
included in both the library code and the test code.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-15 15:08:47 -07:00
Joshua Hursey
89c1aaf646 plm/lsf: Improve error message if lsb_launch fails
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-09-15 09:45:58 -05:00
Rainer Keller
d529c289db Fails to compile with F77 fixed-form compiled programs...
Convert to F77 notation and split into two (shorter) lines.
Also, make use of the SHMEM_MAX_NAME_LEN definition, by moving
that first.

Signed-off-by: Rainer Keller <rainer.keller@hft-stuttgart.de>
2017-09-15 15:09:43 +02:00
Ralph Castain
f69466d633 Merge pull request #4213 from rhc54/topic/dvm2
Backport changes from PMIx reference server
2017-09-14 13:17:53 -07:00
Ralph Castain
7c7d8a69a0 Backport changes from PMIx reference server
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-14 11:48:56 -07:00
Nathan Hjelm
0851122cce btl/openib/udcm: add support for connection across subnets
This commit adds the code necessary to support forming connections across
subnets. The primary changes are to 1) add the gid to the modex, and 2)
use the gid to create the address handle.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2017-09-14 06:42:06 -10:00
Ralph Castain
8d336ddcc0 Merge pull request #4209 from rhc54/topic/foobar
Only build prun if building --with-devel-headers
2017-09-13 13:07:29 -07:00
Ralph Castain
3f8908871b Since the DVM is now tied to prun, don't build the DVM either unless prun can be built
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-13 11:55:10 -07:00
Ralph Castain
589cc03d8e Only build prun if building --with-devel-headers
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-13 11:38:11 -07:00
Ralph Castain
0a3d8af4c2 Merge pull request #4202 from anandhis/master
Choosing provider when user requests generic transport "fabric"
2017-09-13 11:21:24 -07:00
Ralph Castain
27f15b67d7 Merge pull request #4210 from rhc54/topic/pup
Update to track PMIx master
2017-09-13 11:15:22 -07:00
Ralph Castain
691237801b Update to track PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-13 10:21:44 -07:00
Ralph Castain
df4bd83fcb Merge pull request #4206 from rhc54/topic/prun
Add a new launcher "prun" for starting applications against the ORTE DVM.
2017-09-13 06:55:30 -07:00
Ralph Castain
bbd83fd4c0 Add a new launcher "prun" for starting applications against the ORTE DVM.
Unlike "orterun", "prun" is a PMIx-only program that discovers the DVM connection instead of requiring that we explicitly provide it. Only build "prun" if PMIx v2.x is available.

This gets the DVM working again, but still is showing problems for multiple executions. I'll detail those in a separate issue. Thus, the DVM should still be considered "broken".

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-12 21:40:41 -07:00
Joshua Hursey
24a8b5c574 config/lsf: Fix case where --without-lsf and LSF in CFLAGS/LDFLAGS search path
* Reference Issue #3546
 * If the user specified `--without-lsf` then do not check for it
   on the system, even if it is there. This can lead to the build
   failure identified in the issue above.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-09-12 21:17:37 -04:00
Joshua Hursey
fe97d3ef5a config/withdir: Make case for --without-X more clear
* Will display a message acknowledging the configure setting
   instead of 'simple ok' which is misleading.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-09-12 21:17:07 -04:00
Joshua Hursey
39a83a25c1 config/lsf: Adjust whitespace
* Spaces not tabs, and indent properly
 * No functional changes here

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-09-12 21:16:33 -04:00
anandhi
4d7de8882f Checking for generic transport "fabric" in mca parameter rml_ofi_transports
to choose the first available non-socket provider.
	modified:   orte/mca/rml/ofi/rml_ofi_component.c
	modified:   orte/mca/rml/ofi/rml_ofi_send.c

Signed-off-by: Anandhi Jayakumar <anandhi.s.jayakumar@intel.com>
2017-09-12 15:39:55 -07:00
Ralph Castain
d41069795f Merge pull request #4200 from rhc54/topic/cov
Silence coverity warnings
2017-09-12 10:29:32 -07:00
Ralph Castain
88eac797fb Silence coverity warnings
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-12 09:14:36 -07:00
Brian Barrett
637ebf60f9 atomics: Remove requirement of 64 bit atomics
Remove two of the three instances of components requiring
64 bit atomics, even on 32 bit systems.  The SM OSC component
also uses 64 bit atomics, but is a more complicated fix that
will follow this one.  Currently, no one is testing on
platforms that don't provide 64 bit atomics (even in 32 bit
mode), but with the removal of the non-inline assembly for
IA32, the older compilers on Absoft's test systems now
result in no practical way to call cmpxchg8 in 32 bit mode.
At that point, these failures started popping up.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-11 19:50:10 -07:00
Ralph Castain
6775b2a9c6 Merge pull request #4198 from rhc54/topic/dvmrepair
Repair the ORTE DVM
2017-09-11 18:40:06 -07:00
Ralph Castain
3477079804 Repair the ORTE DVM
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-11 17:38:21 -07:00
Nathan Hjelm
7cdda24206 osc/sm: do not require 64-bit atomic math
This commit fixes a compile issue on 32-bit systems that do not
support 64-bit atomic math. The active target path was using 64-bit
atomics exclusively to support PSCW. This commit updates the code to
use either 32 or 64-bit atomic math depending on what is available.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-09-11 14:10:38 -10:00
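
The idea in commit 7cdda24206 of choosing 32- or 64-bit atomic math depending on availability can be sketched as follows. This is not the osc/sm code: the type and function names are hypothetical, and the C11 `ATOMIC_LLONG_LOCK_FREE` macro is used here as a stand-in for whatever availability check Open MPI performs at configure time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumption: a lock-free `long long` means 64-bit atomic math is usable.
 * Otherwise fall back to a 32-bit counter. */
#if ATOMIC_LLONG_LOCK_FREE == 2
typedef unsigned long long counter_value_t;
typedef _Atomic unsigned long long counter_t;
#else
typedef uint_least32_t counter_value_t;
typedef atomic_uint_least32_t counter_t;
#endif

/* Atomically add n to the counter and return the new value.
 * Callers never need to know which width was selected. */
static counter_value_t counter_add(counter_t *c, counter_value_t n)
{
    return atomic_fetch_add(c, n) + n;
}
```

Hiding the width behind a typedef keeps the calling code identical on both kinds of platform, at the cost of a smaller counter range where only 32-bit atomics exist.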
Clement Foyer
d5c192c825 Fix typos. Fix improper output on test. Reorder benchmarks.
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>
2017-09-11 17:37:25 +02:00
Brian Barrett
29a53b0269 git: Ignore OSHMEM C++ wrapper artifacts
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-08 08:54:08 -07:00
Josh Hursey
392129063b Merge pull request #4191 from jjhursey/fix/global_rank
orte/pmix: Always seed environment with global rank
2017-09-08 09:39:50 -05:00
Joshua Hursey
420ca65f4f orte/pmix: Always seed environment with global rank
* Even if we are only launching one app context, we might call spawn
   later and the remote groups might want their global rank information.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-09-08 08:53:49 -05:00
Brian Barrett
5602d3b9c2 atomics: Remove cmpset_64 on IA32
The recent changes to remove non-inline atomics have caused
a cascade of issues with cmpset_64 on IA32.  cmpxchg8 requires
the use of a bunch of registers (2 for every operand, 3 operands),
and one of them is ebx, which is used by the compiler to do
shared library things.  Some compilers don't deal well with
ebx being clobbered (I'm looking at you, gcc 4.1).  Rather than
continue trying to fight, remove cmpset_64 from the supported
atomic operations on IA32.  Other 32 bit platforms (MIPS32,
SPARC32, ARM, etc.) already don't support a 64 bit compare-and-
swap, so while this might slightly reduce performance, it will
at least be correct.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-07 12:19:34 -07:00
Ralph Castain
afe7f6983b Merge pull request #4184 from rhc54/topic/pmix
Update to track PMIx master
2017-09-06 15:19:01 -07:00
Brian Barrett
ff3ff28a00 NEWS: Remove duplicate "master" items
Both the C++ and Vampir notes appear in release branch notes
already, so remove from the "not on release branch" section.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-06 13:31:30 -07:00
Nathan Hjelm
4bba8774f4 monitoring: fix MPI_T regression
The monitoring code causes MPI_T based tools to segfault when
monitoring is disabled. This happens because the performance
variables remain registered after the common/monitoring
component is dlclosed due to a missing variable registration
flag. This commit adds the necessary flag to all the registered
performance variables.

The issue on github is #4162. Close when applied to master.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-09-06 14:24:35 -06:00
Ralph Castain
cbc114e923 Update to track PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-06 13:15:24 -07:00
Jeff Squyres
41c7230bc4 Merge pull request #4179 from jsquyres/pr/opal-path-nfs-razzem-frazzem
opal_path_nfs: ensure arrays are always long enough
2017-09-06 11:16:44 -04:00
Jeff Squyres
dee8cfbfd0 opal_path_nfs: ensure arrays are always long enough
This test used to have fixed-sized arrays for the mounts that it was
checking.  However, we periodically run across machines with more
mounts than can fit into those fixed-size arrays.  Rather than
periodically increasing the size of those arrays (after re-discovering
that the error is due to fixed-size arrays), just count how many
entries there are and make arrays that are big enough.

Additionally, add a check to ensure that we don't go over the max size
of the array when reading/filling them.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-09-06 07:01:45 -07:00
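
The approach described in commit dee8cfbfd0 (count the entries first, size the arrays to fit, and still bound-check while filling) might look roughly like the sketch below. The function names and the newline-separated input format are assumptions made for illustration; the actual test reads the system's mount information.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* First pass: count non-empty lines in a newline-separated table. */
static size_t count_entries(const char *table)
{
    size_t n = 0;
    const char *p = table;
    while (*p != '\0') {
        const char *end = strchr(p, '\n');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        if (len > 0) {
            n++;
        }
        p += len;
        if (*p == '\n') {
            p++;
        }
    }
    return n;
}

/* Second pass: copy each entry into out[], never storing more than max
 * entries. Returns the number of entries actually stored. */
static size_t fill_entries(const char *table, char **out, size_t max)
{
    size_t n = 0;
    const char *p = table;
    while (*p != '\0' && n < max) {
        const char *end = strchr(p, '\n');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        if (len > 0) {
            out[n] = malloc(len + 1);
            if (out[n] == NULL) {
                break;
            }
            memcpy(out[n], p, len);
            out[n][len] = '\0';
            n++;
        }
        p += len;
        if (*p == '\n') {
            p++;
        }
    }
    return n;
}
```

A caller would allocate `count_entries(table)` slots and pass that same count as `max`, so the fill pass stays in bounds even if the table grows between the two passes.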
bosilca
dc538e9675 Merge pull request #1177 from bosilca/topic/large_msg
Topic/large msg
2017-09-05 13:30:19 -04:00