Ralph Castain
18d9fdfd8d
Restore full topology comparison to support inventory monitoring
2014-12-09 01:33:06 -08:00
Ralph Castain
9b2f8cd840
Add the processor architecture to the topology signature
2014-12-09 01:17:00 -08:00
Howard Pritchard
3a14c8eeff
fix build for cray xc
...
The recent addition of embedded libfabric broke the build on Cray XC/XE.
This commit fixes this problem.
2014-12-08 22:21:13 -08:00
Ralph Castain
bb529ebd8e
Revise the way we handle hetero nodes, as users are finding this (a) a significant surprise and (b) confusing as to when it is required. Try to automate it a bit by creating a topology "signature" that mpirun can share on the cmd line with the remote daemons, allowing them to check whether they match. This isn't comprehensive, of course: for now, it only checks the number of each type of hwloc object on the node. This is good enough to pick up major differences (e.g., where we have different numbers of sockets or assigned core bindings).
...
Retain the hetero-nodes flag for those cases where the user *knows* that there are differences and our automated system isn't good enough to see them.
Will obviously require further refinement as we find out which variances it can detect, and which it cannot.
2014-12-08 15:38:14 -08:00
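The signature idea in bb529ebd8e can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the object kinds, the `topo_signature` and `topo_matches` names, and the `2S:16C:32P` string layout are all assumptions, and the real implementation counts hwloc objects rather than taking a plain array.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical object kinds; the real code counts hwloc object types. */
enum { TOPO_SOCKET, TOPO_CORE, TOPO_PU, TOPO_NKINDS };

/* Write a signature such as "2S:16C:32P" into buf.
 * Returns 0 on success, -1 if buf is too small. */
static int topo_signature(const unsigned counts[TOPO_NKINDS],
                          char *buf, size_t len)
{
    int n = snprintf(buf, len, "%uS:%uC:%uP",
                     counts[TOPO_SOCKET], counts[TOPO_CORE], counts[TOPO_PU]);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* A daemon compares its local signature with the one sent by mpirun.
 * Returns nonzero when the two signatures are identical. */
static int topo_matches(const char *local_sig, const char *remote_sig)
{
    return strcmp(local_sig, remote_sig) == 0;
}
```

Because only counts are compared, two nodes with the same number of each object but a different layout would still match, which is why the hetero-nodes flag is retained.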
Yohann Burette
f33a9afd22
libfabric: fix typo in Makefile.am
2014-12-08 13:19:43 -08:00
Jeff Squyres
ac8e9d103c
libfabric: need to make AM_CONDITIONALs always be run
...
Ensure that the usnic-specific AM_CONDITIONAL for the embedded
libfabric is always run.
2014-12-08 11:51:26 -08:00
Jeff Squyres
d64881f040
psm_am.h: add missing file from libfabric snapshot
...
This is just about to be fixed upstream, but "make dist" was not
including this file in the libfabric tarball.
2014-12-08 11:39:08 -08:00
Jeff Squyres
d02756cdbb
libfabric: various configury updates
...
1. Ensure that CFLAGS is overridden properly. Move the setting of CFLAGS outside the AM_CONDITIONAL so that Automake doesn't get confused (because CFLAGS is already set inside an AM_CONDITIONAL -- moving it outside the conditional ensures that this local CFLAGS override trumps all other CFLAGS overrides).
2. Only build libfabric on Linux. Add a little more configury to ensure that we only try to build libfabric on Linux.
3. Remove a dead/unused file
4. Fix typo in condition check
5. Use "false", not "/bin/false"
2014-12-08 11:39:07 -08:00
Jeff Squyres
92818d1fa5
usnic: remove SVN-style $Id$ tokens (and #idents)
...
This commit is also upstream in libfabric.
2014-12-08 11:39:07 -08:00
Jeff Squyres
9547345b18
usnic: fix show_help message
...
Rename a few symbols to use libfabric-friendly names. Fix a show_help
message when fi_av_insert times out.
2014-12-08 11:39:07 -08:00
Jeff Squyres
8e49cc754f
usnic: update to latest libfabric API changes
2014-12-08 11:37:37 -08:00
Jeff Squyres
c4e8d67515
libfabric: sync to upstream libfabric github
...
Bring down the latest from the libfabric github, as of
9d051567c8eb7adc2af89516f94c7d0539152948.
2014-12-08 11:37:37 -08:00
Jeff Squyres
7a96b58882
common verbs: remove usnic-specific code
...
Now that the usnic BTL uses libfabric, we can remove the
usnic-specific code from opal/mca/common/verbs.
2014-12-08 11:37:37 -08:00
Jeff Squyres
984982790a
usnic: convert from verbs to libfabric (yay!)
...
This commit represents the conversion of the usnic BTL from verbs to
libfabric.
For the moment, libfabric is embedded in Open MPI (currently in the
usnic BTL). This is because the libfabric API is still changing, and
also has not yet been released. Ultimately, this embedded copy of
libfabric will likely disappear and the usnic BTL will rely on an
external installation of libfabric.
New configure options:
* --with-libfabric: will cause configure to fail if libfabric support
cannot be built
* --without-libfabric: will prevent libfabric support from being built
* --with-libfabric=DIR: use an external libfabric installation
* --with-libfabric-libdir=LIBDIR: when paired with --with-libfabric=DIR,
use LIBDIR for the libfabric installation library dir
The --with-libnl3[-libdir] arguments are now gone.
2014-12-08 11:37:37 -08:00
George Bosilca
04a4cbd77a
Fix the clock_gettime monotonic timer. Thanks to Gilles for the
...
first sketch of the patch.
2014-12-04 00:20:56 -05:00
Jeff Squyres
983bd49f11
opal_timer_require_monotonic: change to bool / level 5
2014-12-03 17:09:43 -08:00
Jeff Squyres
8880b070b8
Merge pull request #295 from jsquyres/topic/bosilca-accurate-timers
...
Topic/bosilca accurate timers
2014-12-03 19:46:14 -05:00
Howard Pritchard
c67afadcfc
Merge pull request #289 from hppritcha/topic/remove_pmi
...
Topic/remove pmi
2014-12-03 16:58:35 -07:00
Nathan Hjelm
f989fe27b8
btl/vader: workaround to make jenkins happy
2014-12-03 15:51:58 -07:00
Todd Kordenbrock
c0c680bccb
Portals4 BTL: Do not disqualify if a peer does not put Portals4 BTL modex info
...
If OPAL_MODEX_RECV() returns OPAL_ERR_NOT_FOUND, the peer didn't
send any Portals4 BTL info. This is not a fatal error. Instead of
disqualifying the Portals4 BTL just ignore that peer.
@jsquyres reported this in #194.
2014-12-03 14:22:10 -06:00
Howard Pritchard
c75dccede1
pmix/cray: remove finalize call from comp close
...
The finalize call in component close method is
no longer being matched by an equivalent init call,
so remove this call in the close method.
2014-12-03 09:44:18 -07:00
Ralph Castain
d9b23c1054
Increment the init_count in the Slurm pmix components so they correctly respond to calls to pmix.initialized
2014-12-02 20:20:29 -08:00
Ralph Castain
cb15cc06e1
Minor changes per Jeff's request on PR for 1.8.4
2014-12-02 19:54:10 -08:00
George Bosilca
a35d2b9fb5
Update copyrights and mark ia32 timers as non-monotonic.
2014-12-01 14:03:54 -08:00
George Bosilca
5277fd5aa2
Various cleanups.
2014-12-01 14:03:47 -08:00
George Bosilca
00300f464d
Add support for clock_gettime on Linux. Allow the user to
...
request a monotonic timer via MCA parameters.
2014-12-01 14:03:40 -08:00
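For context on 00300f464d, a monotonic timer built on `clock_gettime` might look like the sketch below. The `timer_usec` helper and its `require_monotonic` argument are hypothetical stand-ins for the MCA-controlled behavior; only `clock_gettime`, `CLOCK_MONOTONIC`, and `CLOCK_REALTIME` are real POSIX names.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: return the current time in microseconds.
 * A nonzero require_monotonic selects CLOCK_MONOTONIC, which never
 * jumps backwards; otherwise the wall clock is used.
 * Returns 0 if the clock cannot be read. */
static uint64_t timer_usec(int require_monotonic)
{
    struct timespec ts;
    clockid_t id = require_monotonic ? CLOCK_MONOTONIC : CLOCK_REALTIME;

    if (clock_gettime(id, &ts) != 0) {
        return 0;
    }
    return (uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u;
}
```

Intervals measured with the monotonic clock are unaffected by wall-clock adjustments such as NTP steps, which is the usual reason for preferring it.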
Ralph Castain
960ef34988
Ensure the LSF ras adds the hosts to the allocation. Correctly handle the semi-colon vs comma situation in hwloc slot_lists
2014-11-30 14:37:37 -08:00
Ralph Castain
3f9d9ae8b6
Provide tighter LSF integration by correctly handling scenarios where the user has asked LSF to assign bindings. Fix a couple of typos in lex parser definitions. Tell hostfile parser to ignore binding designations in hostfiles. Add an attribute to indicate that cpusets were provided as physical cpu ids.
...
Once validated, a version of this will be backported to the v1.8.4 release.
2014-11-30 11:50:31 -08:00
George Bosilca
dee243c58d
ompi_proc_finalize has an interesting side effect. A proc is
...
inserted in the ompi_proc_list as soon as it is created and it
is removed only upon the call to the destructor. In ompi_proc_finalize
we loop over all procs in ompi_proc_list and release them once.
However, as a proc is not removed from this list right away, we
decrease the ref count for each proc until it reaches zero and the
proc is finally removed. Thus, we cannot clean the BML/BTL after
the call to ompi_proc_finalize.
A quick fix is to delay the call to ompi_proc_finalize until all
other frameworks have been finalized, and then the behavior
depicted above will give the expected outcome.
2014-11-28 18:26:36 -05:00
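The side effect described in dee243c58d can be reproduced with a small model. This is only a sketch under stated assumptions: `struct proc`, `proc_list`, and the `proc_*` functions are hypothetical stand-ins for `ompi_proc_t`, `ompi_proc_list`, and the OPAL object macros, not the real implementation.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for ompi_proc_t and ompi_proc_list. */
struct proc {
    int refcount;
    struct proc *next;
};

static struct proc *proc_list = NULL;

/* Constructor: the proc joins the global list as soon as it is created. */
static struct proc *proc_new(void)
{
    struct proc *p = malloc(sizeof *p);
    if (p == NULL) {
        return NULL;
    }
    p->refcount = 1;
    p->next = proc_list;
    proc_list = p;
    return p;
}

/* Another subsystem (e.g. a BTL endpoint) takes a reference. */
static void proc_retain(struct proc *p)
{
    p->refcount++;
}

/* Drop one reference; only the destructor path unlinks the proc. */
static void proc_release(struct proc *p)
{
    struct proc **pp = &proc_list;

    if (--p->refcount > 0) {
        return;
    }
    while (*pp != p) {
        pp = &(*pp)->next;
    }
    *pp = p->next;
    free(p);
}

/* Finalize: keep releasing the head of the list until the list is empty.
 * Returns the number of releases issued. */
static int proc_finalize(void)
{
    int releases = 0;

    while (proc_list != NULL) {
        proc_release(proc_list);
        releases++;
    }
    return releases;
}
```

With two procs, one of which holds an extra reference, `proc_finalize()` issues three releases. The extra reference is consumed by the loop, so any owner that releases its proc afterwards would touch freed memory, which is why finalization of the other frameworks has to come first.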
bosilca
8cae899a42
Merge pull request #285 from bosilca/master
...
Reenable high accuracy timers
2014-11-25 17:09:34 -05:00
Gilles Gouaillardet
578fe41788
fix hangs introduced by previous commit a6744b81777ab8247908350bd15cca49bedf5208
2014-11-25 17:50:44 +09:00
Gilles Gouaillardet
a6744b8177
fix misc memory leaks specific to the master
2014-11-25 13:52:10 +09:00
George Bosilca
261684858f
Improved support for OSX timers.
2014-11-24 17:15:49 -05:00
George Bosilca
1877dfd0df
On Darwin make sure the field we expect to be 0 is indeed 0.
2014-11-24 14:16:36 -05:00
George Bosilca
766cfece36
Remove useless header.
2014-11-24 00:57:54 -05:00
George Bosilca
5f49a11b29
Minor cleanups.
2014-11-24 00:44:50 -05:00
George Bosilca
e27759956f
Allow the use of the optimized timers
2014-11-23 23:51:13 -05:00
George Bosilca
324e43909d
Enable CUDA support on Mac OS X.
2014-11-20 13:51:10 -06:00
Gilles Gouaillardet
758f7ab768
Revert "btl/vader: use FRAG_ALLOC_USER when single_copy_mechanism is VADER_NONE"
...
as discussed with @hjelmn in open-mpi/ompi-release#86
This reverts commit d2d7f39a4bc8dde7e8c480f248819ec0d0e6be62.
2014-11-20 16:04:55 +09:00
Nathan Hjelm
1b564f62bd
Revert "Merge pull request #275 from hjelmn/btlmod"
...
This reverts commit ccaecf0fd6c862877e6a1e2643f95fa956c87769, reversing
changes made to 6a19bf85dde5306f559f09952cf3919d97f52502.
2014-11-19 23:22:43 -07:00
Nathan Hjelm
b1f9569b7d
Revert "btl/openib: fix warnings"
...
This reverts commit 6e6c786b49dc9df70c87bb8348c12b6f68de173b.
2014-11-19 23:16:16 -07:00
Nathan Hjelm
6e6c786b49
btl/openib: fix warnings
2014-11-19 15:57:01 -07:00
Nathan Hjelm
ccaecf0fd6
Merge pull request #275 from hjelmn/btlmod
...
Updated the btl interface. Please update your components.
2014-11-19 15:01:40 -07:00
Ralph Castain
6a19bf85dd
Add a little debug
2014-11-19 13:45:00 -08:00
Ralph Castain
27be73c6fb
Allow callers to dstore.open to not request a specific list of desired components, but just take the highest priority one that is available
2014-11-19 13:31:31 -08:00
Nathan Hjelm
2b579610f2
btl/openib: fix compilation issues with XRC
2014-11-19 11:44:48 -07:00
Nathan Hjelm
2a382c2ec1
add btl comment
2014-11-19 11:33:04 -07:00
Nathan Hjelm
bf7daac388
btl/openib: add atomic operation support
2014-11-19 11:33:04 -07:00
Nathan Hjelm
45d1fac8af
ugni thread safety fixes
2014-11-19 11:33:03 -07:00
Nathan Hjelm
5e7c77c576
btl/ugni: add support for atomic operations
2014-11-19 11:33:03 -07:00