Commit Graph

19126 Commits

Author SHA1 Message Date
Mike Dubman
80f4e02e0a Several changes:
- Modifications to coll/hcoll component related to the changes in the libhcoll API. 
  Now, hcoll_destroy_context accepts one more parameter that indicates if the context was
  really destroyed as a result of the call. 
  This new "non-blocking" context destruction fixes a hang discovered in IMB with mcast enabled.
- Clean up any leftover contexts on the comm_world destruction.

fixed by Val, reviewed by miked
cmr=v1.7.4:reviewer=ompi-rm1.7

This commit was SVN r30055.
2013-12-23 06:57:12 +00:00
Jeff Squyres
1448522d15 In an MPI_IBCAST, we cannot shortcut if there's only 1 process.
cmr=v1.7.4:reviewer=brbarret:subject=Fix IBCAST for COMM_SELF
-This line, and those below, will be ignored--

M    c/ibcast.c

This commit was SVN r30054.
2013-12-22 22:55:58 +00:00
Jeff Squyres
71ec6c1617 Remove unnecessary "mpi.h"; move opal headers to the top.
This commit was SVN r30053.
2013-12-22 20:38:43 +00:00
George Bosilca
24879f9def Code cleanup while chasing valgrind complaints.
This commit was SVN r30048.
2013-12-21 23:28:14 +00:00
George Bosilca
b324884375 This might explain the current difficulties with the mapping...
This commit was SVN r30047.
2013-12-21 23:26:13 +00:00
George Bosilca
5aa0837250 Init all fields (valgrind cleanup).
This commit was SVN r30046.
2013-12-21 23:24:29 +00:00
George Bosilca
38cbaeaa82 Try to impose a little bit of consistency on how we parse lists of
modules by enforcing the use of OPAL list accessors.

This commit was SVN r30045.
2013-12-21 23:23:33 +00:00
Ralph Castain
264150872b Add a bunch of debug output to the OOB connection completion code so we can track down a handshake problem. Available in optimized builds as well as debug ones by setting -mca oob_base_verbose 10
No review will be required as this is just debug code for those helping us debug the 1.7.4 release candidates

cmr-=v1.7.4:reviewer=ompi-gk1.7

This commit was SVN r30043.
2013-12-21 16:09:26 +00:00
Ralph Castain
042ed95e4e Remove an annoying warning. If the user excludes a non-existent interface, there is no reason to warn - the interface may simply not exist on that node.
cmr=v1.7.4:reviewer=jsquyres:subject=Remove an annoying warning

This commit was SVN r30042.
2013-12-21 01:51:11 +00:00
Ralph Castain
9c768df8b8 Resolve an unexpected behavior in hostfile allocations. Now that we filter allocations to determine what will be used for mapping, let the initial global pool be the union of nodes from all sources (default hostfile, hostfiles, and dash-hosts). Each app will filter down to only those specified for it using its own hostfile and dash-host options.
cmr=v1.7.4:reviewer=jsquyres:subject=Resolve an unexpected behavior in hostfile allocations

This commit was SVN r30040.
2013-12-21 01:38:27 +00:00
Jeff Squyres
6f6c3cc21c Per http://www.open-mpi.org/community/lists/devel/2013/12/13532.php,
r30016 was not enough to solve the issue.

So properly prefix all the shell variables used in opal_setup_java.m4
(one of them had an orte_ prefix -- oops!).  Now we won't get any
conflicts.

Refs trac:4015

This commit was SVN r30037.

The following SVN revision numbers were found above:
  r30016 --> open-mpi/ompi@35dfd26f9e

The following Trac tickets were found above:
  Ticket 4015 --> https://svn.open-mpi.org/trac/ompi/ticket/4015
2013-12-20 22:42:49 +00:00
Adrian Reber
53a70fe87f Trying to get the C/R code to compile again. (send_*_nb)
This patch changes all send/send_buffer occurrences in the C/R code
to send_nb/send_buffer_nb.
The new code compiles but does not work.

Changes from V1:
* #ifdef out the code (so it is preserved for later re-design)
* marked the broken C/R code with ENABLE_FT_FIXED

Changes from V2:
* just replace the blocking calls with the non-blocking calls
* all #ifdef's introduced in V1 are gone
* send_* returns error code or ORTE_SUCCESS (not the number of bytes)

This commit was SVN r30036.
2013-12-20 21:58:28 +00:00
Adrian Reber
a3813d37c7 Trying to get the C/R code to compile again. (recv_*_nb)
This patch changes all recv/recv_buffer occurrences in the C/R code
to recv_nb/recv_buffer_nb.
The old code is still there but disabled using ifdefs (ENABLE_FT_FIXED).
The new code compiles but does not work.

Changes from V1:
* #ifdef out the code (so it is preserved for later re-design)
* marked the broken C/R code with ENABLE_FT_FIXED

Changes from V2:
* only #ifdef out the code where the behaviour is changed
  (used to be blocking; now non-blocking)

This commit was SVN r30035.
2013-12-20 21:05:40 +00:00
Rolf vandeVaart
695d854cd8 Fix return value.
This commit was SVN r30034.
2013-12-20 20:57:04 +00:00
Ralph Castain
31248c0985 Correctly add support for the "env" MPI_Info key during comm_spawn, update the "map-by", "rank-by", and "bind-to" Info key behaviors to match the new mapping/ranking/binding system, and update all docs and comments to match.
Fix comm_spawn on a single host - with the new default mapping scheme, we were incorrectly computing the number of procs to put on the node.

Refs trac:4003

This commit was SVN r30033.

The following Trac tickets were found above:
  Ticket 4003 --> https://svn.open-mpi.org/trac/ompi/ticket/4003
2013-12-20 20:42:39 +00:00
Rolf vandeVaart
4cd1958deb Fix so we do not get warnings when running on system without CUDA software installed and CUDA-aware compiled in.
This commit was SVN r30032.
2013-12-20 20:39:25 +00:00
Ralph Castain
0098f9f51a Remove remaining stale references
Refs trac:4006

This commit was SVN r30027.

The following Trac tickets were found above:
  Ticket 4006 --> https://svn.open-mpi.org/trac/ompi/ticket/4006
2013-12-20 17:48:28 +00:00
Dave Goodell
bd901a68ed usnic: fix 'fls' warnings+errors
The old version caused compilation errors on Solaris.  Thanks to Paul
Hargrove for testing and reporting the bug:

  http://www.open-mpi.org/community/lists/devel/2013/12/13520.php

cmr=v1.7.4:reviewer=jsquyres

This commit was SVN r30025.
2013-12-20 17:37:22 +00:00
Mike Dubman
92fdbbd7b1 Implementing comment #1 from http://www.open-mpi.org/community/lists/devel/2013/12/13523.php
Refs trac:4011

This commit was SVN r30024.

The following Trac tickets were found above:
  Ticket 4011 --> https://svn.open-mpi.org/trac/ompi/ticket/4011
2013-12-20 16:53:28 +00:00
Jeff Squyres
4cfb24069e Update svn:ignore to ignore *.log and *.trs files generated by "make check"
This commit was SVN r30023.
2013-12-20 16:40:30 +00:00
Jeff Squyres
f026bdb68b Remove unused variable
Refs trac:4004

This commit was SVN r30021.

The following Trac tickets were found above:
  Ticket 4004 --> https://svn.open-mpi.org/trac/ompi/ticket/4004
2013-12-20 16:16:24 +00:00
George Bosilca
178c340992 rearrange the fields to remove a gap in the datatype.
This commit was SVN r30020.
2013-12-20 15:57:56 +00:00
George Bosilca
7178492dd5 Correctly initialize and finalize all the datatype classes. No memory leaks
remain in the datatype engine.

This commit was SVN r30019.
2013-12-20 15:57:10 +00:00
George Bosilca
a85194ae96 Clean up all the datatype tests to avoid any memory leaks or RUI from valgrind.
This commit was SVN r30018.
2013-12-20 15:55:09 +00:00
Jeff Squyres
802c89680a Protect hwloc/configure/m4's use of some temporary shell variables
Fix problem reported by Paul Hargrove:
http://www.open-mpi.org/community/lists/devel/2013/12/13519.php

cmr=v1.7.4:reviewer=brbarret

This commit was SVN r30013.
2013-12-20 14:48:40 +00:00
Ralph Castain
71b52fe861 Ensure that comm_spawn'd procs get user-specified forwarded envars
Thanks to Tim Miller for reporting the regression from the 1.6 series

cmr=v1.7.4:reviewer=jsquyres:subject=Ensure that comm_spawn'd procs get user-specified forwarded envars

This commit was SVN r30012.
2013-12-20 14:47:35 +00:00
Ralph Castain
7cf0fc5578 One more round of sys_limit fixes...sigh
Refs trac:4010

This commit was SVN r30011.

The following Trac tickets were found above:
  Ticket 4010 --> https://svn.open-mpi.org/trac/ompi/ticket/4010
2013-12-20 14:44:51 +00:00
Ralph Castain
e49c16b975 Grrr....use #if instead of #ifdef
Refs trac:4010

This commit was SVN r30010.

The following Trac tickets were found above:
  Ticket 4010 --> https://svn.open-mpi.org/trac/ompi/ticket/4010
2013-12-20 14:24:26 +00:00
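
The `#if` versus `#ifdef` distinction behind commit e49c16b975 can be sketched as follows. This is a minimal illustration, assuming a configure-style macro that is always defined and set to 0 or 1; the name `HAVE_FEATURE_X` is hypothetical and does not come from the Open MPI source. `#ifdef` only asks whether the macro exists, while `#if` tests its value.

```c
#include <assert.h>

/* Hypothetical feature macro, as a configure script might emit it:
 * always defined, with the value 0 meaning "not available". */
#define HAVE_FEATURE_X 0

/* Returns 1 if the #ifdef-guarded branch was compiled in. */
static int seen_by_ifdef(void)
{
#ifdef HAVE_FEATURE_X
    return 1;   /* taken: the macro is defined, even though its value is 0 */
#else
    return 0;
#endif
}

/* Returns 1 if the #if-guarded branch was compiled in. */
static int seen_by_if(void)
{
#if HAVE_FEATURE_X
    return 1;
#else
    return 0;   /* taken: the macro's value is 0 */
#endif
}

/* Sanity check of the two behaviours described above. */
static void check_feature_tests(void)
{
    assert(seen_by_ifdef() == 1);
    assert(seen_by_if() == 0);
}
```

With macros of this kind, an `#ifdef` guard would compile the guarded code even on systems where the feature was detected as absent, which is presumably the class of mistake the commit corrects.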
Ralph Castain
6e6351959d Check for all the RLIMIT_foo constants that we use, and update the limit checks to use the new #define values. Fix a bug where failure of some might lead to incorrect bracketing.
Refs trac:4010

This commit was SVN r30009.

The following Trac tickets were found above:
  Ticket 4010 --> https://svn.open-mpi.org/trac/ompi/ticket/4010
2013-12-20 14:09:43 +00:00
Jeff Squyres
4739850931 As reported by Paul Hargrove
(http://www.open-mpi.org/community/lists/devel/2013/12/13521.php),
OpenBSD-5 #define's MIN and MAX, so we need to #undef them.

cmr=v1.7.4:reviewer=rhc:subject=undef MIN and MAX for OpenBSD-5

This commit was SVN r30007.
2013-12-20 11:40:59 +00:00
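
The `#undef` approach described in commit 4739850931 can be sketched roughly as below. The first pair of definitions is a stand-in for what a system header might provide (an assumption for illustration; the actual OpenBSD definitions may differ), and `clamp_int` is a hypothetical helper, not an Open MPI function.

```c
#include <assert.h>

/* Stand-in for a system header that already defines MIN and MAX
 * (assumption for illustration only). */
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Project side: discard any earlier definitions, then supply our own.
 * Without the #undef, redefining a macro with a different body is
 * diagnosed by the compiler. */
#ifdef MIN
#undef MIN
#endif
#ifdef MAX
#undef MAX
#endif
#define MIN(a, b) (((a) < (b)) ? (a) : (b))
#define MAX(a, b) (((a) > (b)) ? (a) : (b))

/* Hypothetical helper: clamp a value into [lo, hi] using the macros. */
static int clamp_int(int value, int lo, int hi)
{
    assert(lo <= hi);
    return MAX(lo, MIN(value, hi));
}
```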
Jeff Squyres
090ce4187a Fix compiler errors on Solaris, NetBSD, and OpenBSD:
 * Per
   http://www.open-mpi.org/community/lists/devel/2013/12/13504.php,
   protect usage of struct ifreq->ifr_hwaddr
 * Per
   http://www.open-mpi.org/community/lists/devel/2013/12/13503.php,
   avoid #define conflict with the token "if_mtu"
 * Also fix some whitespace and string naming issues in opal/util/if.c

Tested by Paul Hargrove.

Refs trac:4010

This commit was SVN r30006.

The following Trac tickets were found above:
  Ticket 4010 --> https://svn.open-mpi.org/trac/ompi/ticket/4010
2013-12-20 11:17:30 +00:00
Mike Dubman
d78a9cdd77 add rpath on mca_mtl_mxm.so to point to /path/to/mxm/lib/libmxm.so which was detected at configure time
This *should* fix the following situation:

1 mxm.rpm installs an /etc/ld.so.conf.d/mxm.conf file during rpm install, with the library path set to /opt/mellanox/mxm/lib
2 someone extracts mxm.rpm into $HOME/mxm and compiles OMPI against the new mxm location
3 at runtime, the OMPI built in step (2) picks up MXM from step (1) instead of from step (2)

cmr=v1.7.4:reviewer=ompi-rm1.7

This commit was SVN r30005.
2013-12-20 11:15:41 +00:00
Mike Dubman
6dbce7f9f8 fix compile warning
Refs trac:3763

This commit was SVN r30004.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2013-12-20 11:03:09 +00:00
Ralph Castain
f15b0c9863 Add protections around the various system limits to protect code on unusual systems
Thanks to Paul Hargrove for reporting it on OpenBSD-5

cmr=v1.7.4:reviewer=jsquyres

This commit was SVN r30003.
2013-12-20 03:18:07 +00:00
Ralph Castain
6959ba5577 Add missing include file.
Thanks to Paul Hargrove for spotting it.

cmr=v1.7.4:reviewer=jsquyres

This commit was SVN r29998.
2013-12-19 23:39:21 +00:00
Nathan Hjelm
653babc737 Fix a couple issues with the mca_base_var system:
 - Use ->boolval for booleans when creating a string.
 - Solaris has some issue with the ?: used in one of the find functions. Use an if instead.
 - Change all instances of index -> vari to avoid issues with redefining index.

cmr=v1.7.4:reviewer=jsquyres

This commit was SVN r29997.
2013-12-19 23:28:17 +00:00
Nathan Hjelm
ee9cd13b90 Remove opal_recursion_depth_counter and opal_progress_thread_count. These counters add two
atomics in the critical path and are not currently used. We can bring them back if there
turns out to be a good use for them.

cmr=v1.7.4:reviewer=brbarret

This commit was SVN r29994.
2013-12-19 23:15:27 +00:00
Dave Goodell
0c6b292442 romio: pick "infinitely stale" fix from upstream
Some NFS scenarios can result in an infinite ESTALE return, which will
hang ROMIO.  This commit causes ROMIO to error out after a large number
of retries instead of spinning forever.

This is MPICH commit b250d338:

http://git.mpich.org/mpich.git/commit/b250d338e66667a8a1071a5f73a4151fd59f83b2

cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r29993.
2013-12-19 22:55:26 +00:00
Jeff Squyres
e4097c5cc9 George pointed out that r29991 wasn't quite right. This patch is
added on top of r29991 and:

 * Consolidates the _debug variables in opal_datatype_internal.h and
   opal_convertor.h
 * Puts the DO_DEBUG macros back in the .c files, because they are
   slightly different from each other

Refs trac:4004

This commit was SVN r29992.

The following SVN revision numbers were found above:
  r29991 --> open-mpi/ompi@a88e143127

The following Trac tickets were found above:
  Ticket 4004 --> https://svn.open-mpi.org/trac/ompi/ticket/4004
2013-12-19 22:54:27 +00:00
Jeff Squyres
a88e143127 Fixes trac:3989: opal_pack_debug was instantiated as a bool in one file
and extern'ed as an int in another.  This caused a SIGBUS on
Solaris/SPARC.

This commit properly moves the extern to a .h file so that it's the
same in all files.  It also moves the DO_DEBUG to the header file,
because it was defined to the same thing in multiple .c files.

cmr=v1.7.4:reviewer=bosilca:subject=fix SPARC SIGBUS in opal convertor code

This commit was SVN r29991.

The following Trac tickets were found above:
  Ticket 3989 --> https://svn.open-mpi.org/trac/ompi/ticket/3989
2013-12-19 21:38:51 +00:00
Ralph Castain
b745078535 Support user-provided envars for comm_spawn using info key "env"
Thanks to Tom Fogal for the request

cmr=v1.7.4:reviewer=jsquyres

This commit was SVN r29990.
2013-12-19 20:59:50 +00:00
Ralph Castain
d47d2569f3 We stripped the process info packing routine to minimize message size when sending the launch message, but tools still require all the info. So modify the tool-hnp handshake to explicitly add the missing info
Refs trac:3992

This commit was SVN r29989.

The following Trac tickets were found above:
  Ticket 3992 --> https://svn.open-mpi.org/trac/ompi/ticket/3992
2013-12-19 20:42:20 +00:00
Jeff Squyres
3a14adef63 Remove the comments around these assignments; otherwise, we won't get
function pointers set to the _map functions, and we get segv's in MTT
testing (e.g., the C++ suite, which actually calls MPI_Cart_map and
MPI_Graph_map).

cmr=v1.7.4:reviewer=bosilca:subject=Fix topo _map function pointer assignments

This commit was SVN r29988.
2013-12-19 20:41:32 +00:00
Yossi Etigin
6ab4aba9e6 Fix missing include of show_help.h in mtl mxm.
cmr=v1.7.4:reviewer=jsquyres

This commit was SVN r29987.
2013-12-19 19:37:21 +00:00
Ralph Castain
79af9825ac Update of patch from Takahiro Kawashima
Refs trac:3986

This commit was SVN r29984.

The following Trac tickets were found above:
  Ticket 3986 --> https://svn.open-mpi.org/trac/ompi/ticket/3986
2013-12-19 17:22:37 +00:00
Jeff Squyres
42e3e5cd4b Fixes trac:3990: ensure we don't SIGBUS on SPARC by forcing a memory copy
and preventing access to potentially unaligned data.

Reviewed by Dave Goodell.  Tested by Siegmarr Gross.

cmr=v1.7.4:reviewer=ompi-rm1.7:subject=fix SPARC SIGBUS in opal net code

This commit was SVN r29983.

The following Trac tickets were found above:
  Ticket 3990 --> https://svn.open-mpi.org/trac/ompi/ticket/3990
2013-12-19 16:51:34 +00:00
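
The "force a memory copy" technique named in commit 42e3e5cd4b can be sketched as follows. This is a generic illustration rather than the actual Open MPI change: `load_u32_unaligned` and `roundtrip_at_offset_1` are hypothetical names. On strict-alignment architectures such as SPARC, dereferencing a misaligned `uint32_t *` can raise SIGBUS, whereas `memcpy` has no alignment requirement.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the Open MPI source): read a 32-bit value
 * from a byte pointer that may not be 4-byte aligned. */
static uint32_t load_u32_unaligned(const unsigned char *src)
{
    uint32_t value;
    assert(src != NULL);
    /* memcpy works on bytes, so it is safe where *(const uint32_t *)src
     * could fault on a misaligned address. */
    memcpy(&value, src, sizeof(value));
    return value;
}

/* Store a value at offset 1 of a byte buffer (a possibly misaligned
 * address) and read it back through the helper. */
static uint32_t roundtrip_at_offset_1(uint32_t in)
{
    unsigned char buf[sizeof(uint32_t) + 1];
    memcpy(buf + 1, &in, sizeof(in));
    return load_u32_unaligned(buf + 1);
}
```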
Ralph Castain
55cd65b149 Don't warn about binding (process and/or memory) if the node cannot do it or if we would overload, but it wasn't specifically requested by the user (i.e., it is the result of the default policy). Instead, just don't bind and quietly move along.
Reset topology usage for each node as we bind as multiple nodes may be linked to the same topology object. This will need to be revisited for scale as it does take some non-zero time to reset the usage each iteration. However, storing individual topology objects for every node consumes memory, so it's a tradeoff.

cmr=v1.7.4:reviewer=jsquyres:subject=Eliminate excessive binding/memory warnings

This commit was SVN r29978.
2013-12-19 16:31:45 +00:00
Ralph Castain
2a6376fcf5 Update platform files
cmr=v1.7.4:reviewer=ompi-gk1.7

This commit was SVN r29977.
2013-12-19 15:38:28 +00:00
Mike Dubman
d70f93b2dc fix corrupted verbose output in oshmem
set yoda prio lower than ikrit
fix anon unions in ikrit
Refs trac:3763

This commit was SVN r29976.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2013-12-19 11:59:32 +00:00
Ralph Castain
9b32dacb6c Ensure we don't abort if a tool cannot send a message - the orte/util/comm library used by tools to query mpirun knows how to handle this situation.
Refs trac:3992

This commit was SVN r29975.

The following Trac tickets were found above:
  Ticket 3992 --> https://svn.open-mpi.org/trac/ompi/ticket/3992
2013-12-19 07:10:36 +00:00