1
1
Граф коммитов

20933 Коммитов

Автор SHA1 Сообщение Дата
Rolf vandeVaart
8db1f89dd1 Small change to allow CUDA-aware to work with non-reduction nonblocking collectives.
Only used when CUDA-aware feature compiled in.

This commit was SVN r32750.
2014-09-17 16:55:01 +00:00
Ralph Castain
3a437cbdb3 Silence set-but-not-used warning when timing isn't enabled
This commit was SVN r32749.
2014-09-17 00:40:10 +00:00
Ralph Castain
414f4e9783 Try to provide a real hostname for the remote host to aid in debugging
Refs trac:4908

This commit was SVN r32748.

The following Trac tickets were found above:
  Ticket 4908 --> https://svn.open-mpi.org/trac/ompi/ticket/4908
2014-09-17 00:39:49 +00:00
Jeff Squyres
9dc49c5f92 oob_tcp_connection: print "<unknown>" instead of "NULL"
"NULL" doesn't meany anything to the user, and is somewhat confusing
to see in an error message.  "<unknown>" at least indicates that
there's an error, and we know who the peer is.

This commit was SVN r32747.
2014-09-16 22:47:57 +00:00
Ralph Castain
09aecea55a Can't use show_help as the RML has already been enabled, but we haven't successfully connected back to the HNP. So use opal_output instead and hardwire the message.
Refs trac:4908

This commit was SVN r32746.

The following Trac tickets were found above:
  Ticket 4908 --> https://svn.open-mpi.org/trac/ompi/ticket/4908
2014-09-16 22:21:02 +00:00
Ralph Castain
d50c8ba65f Per patch from Gilles, cleanup some errors that surface when building with PGI. Verified by Tetsuya, reviewed okay by Jeff.
RM-approved

cmr=v1.8.3:reviewer=ompi-gk1.8

This commit was SVN r32745.
2014-09-16 19:07:02 +00:00
Ralph Castain
4bbc9a28d6 Try to resolve the simultaneous connection problem by being a little more careful about the choice of returned status when a connection is refused. As before, have the higher vpid of the two peers retry the connection, while the lower one waits. This can happen in a couple of places, so try to hit them all. Since this is hard to test, will ask Gilles to give it a try since he's the one who is seeing it.
cmr=v1.8.3:reviewer=rhc

This commit was SVN r32744.
2014-09-16 18:59:36 +00:00
Ralph Castain
a74428513d Provide a better help message when we are unable to complete a connection due to a firewall.
cmr=v1.8.3:reviewer=jsquyres

This commit was SVN r32743.
2014-09-16 16:28:29 +00:00
Vasily Filipov
ff10b25e7d warnings (caused by commit r32735) fix.
reviewed by miked
    cmr=v1.8.3:reviewer=ompi-rm1.8 

This commit was SVN r32740.

The following SVN revision numbers were found above:
  r32735 --> open-mpi/ompi@5fecf65daf
2014-09-16 06:33:49 +00:00
MPI Team
5f01f7ef5f Update git/hg ignore files
This commit was SVN r32739.
2014-09-16 05:00:23 +00:00
Ralph Castain
dfb952fa78 [Contribution from Artem - moved it to svn from git for him]
Replace our old, clunky timing setup with a much nicer one that is only available if configured with --enable-timing. Add a tool for profiling clock differences between the nodes so you can get more precise timing measurements. I'll ask Artem to update the Github wiki with full instructions on how to use this setup.

This commit was SVN r32738.
2014-09-15 18:00:46 +00:00
Brad Benton
9d39903a53 update author affiliation
This commit was SVN r32737.
2014-09-15 15:02:58 +00:00
Vasily Filipov
e26af91a64 BTL/OPENIB: set "max_lmc" param to be "1" and not "all available values" by default.
cmr=v1.8.3:reviewer=miked 

This commit was SVN r32736.
2014-09-15 13:56:41 +00:00
Vasily Filipov
5fecf65daf OMPI/COLL/Tuned: add command line params for thresholds to decide if small/intermediate MSGs alltoall algorithm will be used.
cmr=v1.8.3:reviewer=miked

This commit was SVN r32735.
2014-09-15 12:34:21 +00:00
Alex Mikheev
31d0724a08 OMPI: btl openib: fix detection of max registarable memory
Deal with the case when mlx4 module is loaded but device
is not present

cmr=v1.8.3:reviewer=miked

This commit was SVN r32734.
2014-09-15 12:17:23 +00:00
Ralph Castain
fad4384463 Not sure how we could get to this point without having already detected the error, but just to be safe - check for end-of-array and return if error.
Refs trac:4897

This commit was SVN r32731.

The following Trac tickets were found above:
  Ticket 4897 --> https://svn.open-mpi.org/trac/ompi/ticket/4897
2014-09-13 02:23:30 +00:00
Jeff Squyres
e95ed94a94 plm_rsh_module.c: output to the framework output
Trivial fix from r32686: don't output to stream 0, but rather to
orte_plm_base_framework.framework_output (this is the way it was
before r32686).  In reality, this is going to end up being stream 0,
anyway, but we might as well be pedantically correct...

Refs trac:4897.

This commit was SVN r32726.

The following SVN revision numbers were found above:
  r32686 --> open-mpi/ompi@4df1aa63f7

The following Trac tickets were found above:
  Ticket 4897 --> https://svn.open-mpi.org/trac/ompi/ticket/4897
2014-09-13 00:46:35 +00:00
Jeff Squyres
66aeadacff opal_search_libs: correctly AC_DEFINE results of search
1. It is not sufficient to put the result of m4_toupper() in a
variable and use that variable as the variable name in
AC_DEFINE_UNQUOTED.  Instead, just use m4_toupper() directly in
AC_DEFINE_UNQUOTED.  Also, save the result value in a "permanent"
variable that isn't erased, just in case autoconf decides to be lazy
about instantiating the body AC_DEFINE_UNQUOTED and move it later
(this is probably overkill :-) ).
1. Use the OMPI Way of always defining macros (to 0 or 1).  Then also
slightly change the logic in util/basename.c to just check
OPAL_HAVE_DIRNAME (because it will always be defined).

Refs trac:4894

This commit was SVN r32723.

The following Trac tickets were found above:
  Ticket 4894 --> https://svn.open-mpi.org/trac/ompi/ticket/4894
2014-09-13 00:28:30 +00:00
Jeff Squyres
01e62b1994 orte_check_lsf.m4: improve the search for LSF libraries
Based on a patch from Joshua Randall, improve the search for LSF
libraries in orte_check_lsf.m4.  This improvement theorhetically
allows the LSF libraries to work in a variety of OSs by properly
searching for its dependent libraries using a variety of library
names.

Fixes trac:4889.

This commit was SVN r32722.

The following Trac tickets were found above:
  Ticket 4889 --> https://svn.open-mpi.org/trac/ompi/ticket/4889
2014-09-12 23:52:18 +00:00
Jeff Squyres
2024d28a65 opal_search_libs.m4: remove redundant AC_MSG_CHECKING/RESULT
The AC_MSG_CHECKING/AC_MSG_RESULT in OPAL_SEARCH_LIBS_CORE were
generating recursive redundant output (i.e., AC_SEARCH_LIBS already
outputs "checking for library containing X... -lY".

Also add the "uppername" logic to OPAL_SEARCH_LIBS_COMPONENT so that
these two macros stay symmetrical.

Refs trac:4894.

This commit was SVN r32721.

The following Trac tickets were found above:
  Ticket 4894 --> https://svn.open-mpi.org/trac/ompi/ticket/4894
2014-09-12 23:27:33 +00:00
Ralph Castain
7269dae2da Per patch from Samuel Thibault, silence warning from Clang
This commit was SVN r32720.
2014-09-12 22:22:11 +00:00
Ralph Castain
0445052a1c Check for multiple declarations of a given MCA param and error out if detected as that can create an ambiguous definition of the param value.
Refs trac:4897

This commit was SVN r32719.

The following Trac tickets were found above:
  Ticket 4897 --> https://svn.open-mpi.org/trac/ompi/ticket/4897
2014-09-12 22:21:30 +00:00
Mangala Jyothi Bhaskar
dc05b709a7 it is ok to not have a sharedfp component selected, as long as no
sharedfp functionality is being used. Return an error however if no
sharedfp component is selected and the applications calls a
file_read/write_shared function.

This commit was SVN r32718.
2014-09-12 21:15:58 +00:00
Edgar Gabriel
597177cd8b silence a warning regarding the return value of the fbtl's.
This commit was SVN r32717.
2014-09-12 18:01:30 +00:00
Jeff Squyres
d244b7b860 mca_base_var: fix possibilty of unaligned variable assignments
Add a debugging check that ensures that the registered storage is
aligned appropriately for the type that is specified.

When we know that the storage is properly aligned, we can cast the
mbv_storage to the appropriate type and then simply do the assignment.
We used to do this assignment via a union, but clang's
-fsanitizer=alignment complained about this.

This commit was SVN r32716.
2014-09-11 23:02:49 +00:00
Ralph Castain
1f2c5863f0 Revert r32675 in favor of a different solution proposed by Brice
This commit was SVN r32715.

The following SVN revision numbers were found above:
  r32675 --> open-mpi/ompi@916f98a3ee
2014-09-11 21:58:48 +00:00
Howard Pritchard
e43715574a remove ignored restrct return type qualifier
The use of restrict in the return type qualifier for mca_btl_vader_reserve_fbox
is being ignored by gnu compiler.  for newer gcc, one sees this warning only
with -Wignored-qualifiers set, but for older variants of gcc it was reported
that numerous warning messages about this ignored qualifier were being
generated as vader is being compiled.

The warning reported by gcc is

btl_vader_fbox.h:53:47: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
 static inline mca_btl_vader_fbox_t * restrict mca_btl_vader_reserve_fbox (struct mca_btl_base_endpoint_t *ep, const size_t size)

This commit was SVN r32714.
2014-09-11 21:12:41 +00:00
Rolf vandeVaart
9a2bab0e27 Add support for detecting CUDA managed memory. Disabled for now.
This commit was SVN r32713.
2014-09-11 21:07:17 +00:00
Howard Pritchard
820b34e5d2 Fix bad cut/paste for commit c19e7369
This commit was SVN r32712.
2014-09-11 21:00:04 +00:00
Howard Pritchard
d07c5674a3 Fix potential double free in cray pmi cray_fini
This commit was SVN r32711.
2014-09-11 20:30:40 +00:00
Mangala Jyothi Bhaskar
cd78a3a026 Fixed offset data type in communication
This commit was SVN r32710.
2014-09-11 14:51:30 +00:00
Mangala Jyothi Bhaskar
4ff21d6178 Fixed offset data type in communication
This commit was SVN r32709.
2014-09-11 14:51:07 +00:00
Mangala Jyothi Bhaskar
6e5f2c8ae8 Fixed offset data type in communication
This commit was SVN r32708.
2014-09-11 14:50:30 +00:00
Ralph Castain
9e7e90265f Temporarily make the direct grpcomm component the default until we can debug the other modules
This commit was SVN r32707.
2014-09-11 14:47:54 +00:00
Edgar Gabriel
4ccc0f5ea2 the length of the iov array should be limited to IOV_MAX, which is defined in limits.h
This commit was SVN r32706.
2014-09-10 21:59:45 +00:00
Ralph Castain
a90f12ad1d Save the clang settings for detecting alignment issues - I don't want to have to remember the cmd line jango
This commit was SVN r32705.
2014-09-10 18:56:19 +00:00
Ralph Castain
cb2ad98f57 Silence an unused function warning
This commit was SVN r32704.
2014-09-10 17:36:34 +00:00
Ralph Castain
4eb6291334 Avoid conflicts when multiple collectives are underway in ORTE by giving each grpcomm component its own RML tag and posting persistent receives. We use the signature anyway to determine which collective the received message is addressing, so there is no need to post non-persistent receives.
This commit was SVN r32703.
2014-09-10 17:36:16 +00:00
Ralph Castain
a7c5b77d70 Just because the openib BTL can't reach a process doesn't mean it is a job-ending error. If we have other methods for reaching the process (e.g., sm for a local proc), then that's okay. If there is no method for reaching a proc, then that's an error - but the BML will report that situation.
The question of whether or not the openib BTL supports loopback is a separate question. It may be more appropriate to make the modex be PMIX_GLOBAL for cases where openib can support loopback so someone can run without a shared memory component. I'll leave that decision to the IB vendors.

This commit was SVN r32702.
2014-09-10 17:02:16 +00:00
Ralph Castain
ea11e63f59 Per patch from Tetsuya, allow the user to bind-to none when specifying multiple pe's/rank as requested by Reuti. This allows the user to reserve multiple "slots" in the allocation for each process while mapping, but not to bind the process to specific processing elements on the node.
Reviewed by rhc, so RM-approved to go across to v1.8.3

cmr=v1.8.3:reviewer=ompi-gk1.8

This commit was SVN r32701.
2014-09-10 15:52:18 +00:00
Edgar Gabriel
cc46b65a5e the fbtl interfaces should really be an ssize_t not a size_t, since the return
value could be negative, which is allowed for ssize_t, but not for size_t.

This commit was SVN r32700.
2014-09-10 15:01:54 +00:00
Edgar Gabriel
599cb7b351 update the pvfs2 fbtl to return the number of bytes generated.
This commit was SVN r32699.
2014-09-10 13:32:06 +00:00
Ralph Castain
93948f0c4e Resolve alignment issues when unpacking buffers
cmr=v1.8.3:reviewer=jsquyres

This commit was SVN r32698.
2014-09-10 10:19:16 +00:00
Gilles Gouaillardet
e71452d73a Revert r32696
This commit was SVN r32697.

The following SVN revision numbers were found above:
  r32696 --> open-mpi/ompi@e4c3500166
2014-09-10 04:35:47 +00:00
Gilles Gouaillardet
e4c3500166 Fix MPI_Status_set_elements[_x] for non predefined datatypes
Fixes trac:4896

cmr=v1.8.3:reviewer=bosilca

This commit was SVN r32696.

The following Trac tickets were found above:
  Ticket 4896 --> https://svn.open-mpi.org/trac/ompi/ticket/4896
2014-09-10 02:41:29 +00:00
Ralph Castain
e671620ac7 Per request from Jeff: tune up the help messages for binding options
Refs trac:4898

This commit was SVN r32691.

The following Trac tickets were found above:
  Ticket 4898 --> https://svn.open-mpi.org/trac/ompi/ticket/4898
2014-09-09 22:39:22 +00:00
Edgar Gabriel
3a5f4f72da make the zero byte read/write scenarios work without the contiguous flag.
This commit was SVN r32690.
2014-09-09 16:26:14 +00:00
Edgar Gabriel
6a607caed8 fix some zero byte allocation scenarios.
This commit was SVN r32689.
2014-09-09 16:25:44 +00:00
Gilles Gouaillardet
63209eac5b orte/util: use ORTE_JOB_FAMILY and ORTE_LOCAL_JOBID macros
This commit was SVN r32688.
2014-09-09 05:13:00 +00:00
Ralph Castain
4207b4c4ad Improve the --bind-to help message to better indicate the default options under various values of np. Remove the warning message if the user doesn't specify a binding policy and we are overloaded
cmr=v1.8.3:reviewer=jsquyres

This commit was SVN r32687.
2014-09-08 21:03:51 +00:00