1
1
Граф коммитов

19792 Коммитов

Автор SHA1 Сообщение Дата
Nathan Hjelm
3f469d08e7 coll/ml: increase the number of allowed processes in a local reduce and
add checks to see if the bcol module can support allreduce.

cmr=v1.7.5:reviewer=manjugv

This commit was SVN r31096.
2014-03-17 23:10:19 +00:00
Pavel Shamis
fba1edbf14 Removing ml include from bcol_ptpcoll.h.
It is not really required.

This commit was SVN r31095.
2014-03-17 22:58:40 +00:00
Ralph Castain
545ac7dc58 Remove the job_control_forwarding logic as we want *any* signal to go to all members of the process group
Refs trac:4404

This commit was SVN r31094.

The following Trac tickets were found above:
  Ticket 4404 --> https://svn.open-mpi.org/trac/ompi/ticket/4404
2014-03-17 22:45:33 +00:00
Ralph Castain
5a868028a8 Revert r31091 - the functionality didn't disappear, but moved into the MPI layer :-(
This commit was SVN r31093.

The following SVN revision numbers were found above:
  r31091 --> open-mpi/ompi@edf680855e
2014-03-17 22:30:03 +00:00
Ralph Castain
99c9ecaed0 Ensure that we send the specified signal to the entire process group of each member of the pid provided to us. This ensures that any children spawned by our children also see the signal
cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r31092.
2014-03-17 22:12:15 +00:00
Ralph Castain
edf680855e Restore locality computation to the nidmap code - don't know how/when it was removed, but that was not good
cmr=v1.7.5:reviewer=hjelmn

This commit was SVN r31091.
2014-03-17 21:59:25 +00:00
Ralph Castain
45196d222b Minor cleanup of the node state definitions - using the enum allows the debuggers to pretty-print the value
This commit was SVN r31090.
2014-03-17 21:27:58 +00:00
Ralph Castain
796dfe5ada Do a little cleanup - only resusage needs the node/proc info, so remove it from the sensor base
This commit was SVN r31089.
2014-03-17 21:26:46 +00:00
Ralph Castain
7bb8dbade6 Extend the regular expression parsing support
This commit was SVN r31088.
2014-03-17 21:25:05 +00:00
Ralph Castain
0257d32eeb There is no OOB component object - it is a simple struct with an opal_list_item_t element at the beginning
cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r31087.
2014-03-17 21:23:59 +00:00
Ralph Castain
38e02890aa ORTE doesn't care about cxx flags
cmr=v1.8:reviewer=jsquyres

This commit was SVN r31086.
2014-03-17 21:21:54 +00:00
Nathan Hjelm
7ec19358df MCA/base: document that is is valid for the string_value parameter to
an enumerator's mca_base_var_enum_sfv_fn_t can be NULL.

cmr=v1.7.5:ticket=trac:4398:reviewer=ompi-gk1.7

This commit was SVN r31085.

The following Trac tickets were found above:
  Ticket 4398 --> https://svn.open-mpi.org/trac/ompi/ticket/4398
2014-03-17 18:52:54 +00:00
Ralph Castain
f259d50ed7 Fully fix the PMI2 warning - turned out to be larger than originally thought due to the way the function was being handled across multiple files. Properly resolve the problem by not compiling the file if PMI2 is not desired, and then appropriately setting the visibility of the function within the module
Refs trac:4400

This commit was SVN r31084.

The following Trac tickets were found above:
  Ticket 4400 --> https://svn.open-mpi.org/trac/ompi/ticket/4400
2014-03-17 17:36:37 +00:00
Ralph Castain
e152449be4 Silence warning
cmr=v1.7.5:reviewer=ompi-gk1.7

This commit was SVN r31083.
2014-03-17 17:05:24 +00:00
Nathan Hjelm
b9dfe84b05 Fix segmentation fault in handling of boolean variables in mca_base_var_set_value.
cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r31082.
2014-03-17 14:58:30 +00:00
Nathan Hjelm
f92579dce5 coll/ml: fix a case not correctly handled by r31071
In r31071 I modified the logic to not increment the hierarchy level if
no processes were selected by that sbgp. That fixed a problem seen on
systems where we don't support process binding. The problem is there
is a case where we actually did select processes yet the number of
selected processes is 0. We need to increment the hierarchy in this case
as well.

This should fix the segmentation fault found by recent MTT runs. Once
this is committed to 1.7.5 remove the .ompi_ignore's from coll/ml and
bcol/ptpcoll. Tested with ompi-tests/ibm.

cmr=v1.7.5:reviewer=rhc

This commit was SVN r31081.

The following SVN revision numbers were found above:
  r31071 --> open-mpi/ompi@1911d97044
2014-03-15 22:37:28 +00:00
Ralph Castain
b248b27637 Remove a check that prevented mpirun from exiting when it should in the single-node case
Refs trac:4393

This commit was SVN r31080.

The following Trac tickets were found above:
  Ticket 4393 --> https://svn.open-mpi.org/trac/ompi/ticket/4393
2014-03-15 15:25:44 +00:00
Jeff Squyres
34d92315ae Remove extraneous "while(0)".
Oops.

cmr=v1.7.5:ticket=trac:4395

This commit was SVN r31075.

The following Trac tickets were found above:
  Ticket 4395 --> https://svn.open-mpi.org/trac/ompi/ticket/4395
2014-03-14 20:41:54 +00:00
Jeff Squyres
06a58affca Fix minor hwloc memory leak in sbgp/basesmsocket
cmr=v1.8:reviewer=hjelmn

This commit was SVN r31074.
2014-03-14 20:40:12 +00:00
Jeff Squyres
036db91f3d For the love of all that is holy, do not put 1MB arrays on the stack.
This was causing JVMs to run out of stack space, and all manner of
badness ensued.

Instead, use the heap -- that's what it's there for.

cmr=v1.7.5:reviewer=rhc:subject=make coll/ml use the heap for large debug array

This commit was SVN r31073.
2014-03-14 20:39:39 +00:00
Rolf vandeVaart
ce5274652f Add some additional verbose output per this RFC
http://www.open-mpi.org/community/lists/devel/2014/03/14282.php
Reviewed by Jeff Squyres

This commit was SVN r31072.
2014-03-14 20:17:47 +00:00
Nathan Hjelm
1911d97044 coll/ml: fix assertion failure that occurs when level 0 of the hierarchy
fails to select any processes on any nodes.

Also modified basesmsocket to only print debugging info to the framework
output.

cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r31071.
2014-03-14 19:39:00 +00:00
Ralph Castain
fbc5e3b773 Deal with the corner case where we encounter an error when attempting to launch a daemon. In this case, we will order abnormal termination before daemons callback to us, and thus any attempt to send them a "die" message will fail. Ensure that mpirun at least exits cleanly in this scenario, thereby allowing the remote daemons that did get launched to commit suicide when comm fails.
cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r31068.
2014-03-14 15:32:30 +00:00
Jeff Squyres
8e8154645b Rearrange ordering of redirection
This prevents "/usr/bin/which: no oshmem_info ..." messages from
appearing.

cmr=v1.7.5:reviewer=rhc

This commit was SVN r31067.
2014-03-14 15:23:18 +00:00
Mike Dubman
cf9f5f9c4c enable oshmem in mlnx platform file
cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r31065.
2014-03-14 08:18:55 +00:00
Jeff Squyres
616e62bb9e Fix --enable|disable-oshmem configure switch
* Fixed the AC_ARG_ENABLE for the oshmem option.
* Fixed the AM_CONDITIONALs values for the projects.
* Re-added (and slightly simplified) the use of PROJECT_OSHMEM in
  various Makefile.am's to disable building OSHMEM stuff

Submitted by Jeff, reviewed and approved by Ralph.

cmr=v1.7.5:reviewer=ompi-gk1.7

This commit was SVN r31062.
2014-03-13 21:23:04 +00:00
Jeff Squyres
16cab57ec5 Fix some set-but-not-used compiler warnings.
cmr=v1.8:reviewer=miked

This commit was SVN r31061.
2014-03-13 21:20:36 +00:00
Ralph Castain
2abed09d7c Continue to resolve priority issues. Cleanup the case of forced termination in mpirun during launch processing by ensuring we can respond to socket closures, and ensuring that the remote daemons correctly close their sockets when terminating.
Jeff: please test a variety of conditions to ensure we get this right

cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r31058.
2014-03-13 04:02:24 +00:00
Jeff Squyres
24020ef1e3 Refs trac:4372: 3rd and hopefully final addendum to Fortran API fixes for the RMA functions
These parameters should not be marked as INTENT(OUT) (they aren't in
the MPI-3 standard).

This commit was SVN r31056.

The following Trac tickets were found above:
  Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
2014-03-12 22:55:57 +00:00
Ralph Castain
cd72aa9b66 Per Dave's comment, bzero has portability issues and little advantage over a simple memset. So let's use the safer solution.
cmr=v1.7.5:reviewer=dgoodell:subject=replace bzero with memset

This commit was SVN r31055.
2014-03-12 22:55:47 +00:00
Ralph Castain
ac421c931d The random number generator changes were incomplete (typo errors) in some places, and is missing the required declspec's for visibility.
cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r31053.
2014-03-12 22:37:27 +00:00
Nathan Hjelm
e70809e169 osc/rdma: fix the spelling of incoming
cmr=v1.7.5:ticket=trac:4379

This commit was SVN r31050.

The following Trac tickets were found above:
  Ticket 4379 --> https://svn.open-mpi.org/trac/ompi/ticket/4379
2014-03-12 21:43:23 +00:00
Jeff Squyres
ccff41383c Refs trac:4372: Another addendum to Fortran API fixes for the RMA functions
* Several parameters should not be marked as INTENT(OUT) (they aren't in
  the MPI-3 standard).
* Added missing PMPI F08 OMPI interfaces

This commit was SVN r31049.

The following Trac tickets were found above:
  Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
2014-03-12 20:22:15 +00:00
Jeff Squyres
8a5a832085 Refs trac:4372: Addendum to Fortran API fixes for the RMA functions
These parameters should not be marked as INTENT(OUT) (they aren't in
the MPI-3 standard).

This commit was SVN r31048.

The following Trac tickets were found above:
  Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
2014-03-12 19:59:04 +00:00
Nathan Hjelm
d0009938a6 osc/rdma: tighten semantics a bit more
It is not valid to call flush outside a passive target epoch nor is
it valid to call lock/lock_all when no_locks is set. In the former
we were just semantically incorrect and the later would crash and
burn.

cmr=v1.7.5:ticket=trac:4382

This commit was SVN r31046.

The following Trac tickets were found above:
  Ticket 4382 --> https://svn.open-mpi.org/trac/ompi/ticket/4382
2014-03-12 18:53:47 +00:00
Nathan Hjelm
1fc9a55d08 osc/rdma: do not use MPI_SOURCE to determine the peer in an send operation.
This fixes a bug in r31029 which removes the use of the pml base request
(also not a good way since cm doesn't use the base request). We now allocate
a data structure (ugh) to determine the needed information. Tested with
mtt/onesided.

cmr=v1.7.5:ticket=trac:4379

This commit was SVN r31044.

The following SVN revision numbers were found above:
  r31029 --> open-mpi/ompi@29e00f9161

The following Trac tickets were found above:
  Ticket 4379 --> https://svn.open-mpi.org/trac/ompi/ticket/4379
2014-03-12 17:14:11 +00:00
Nathan Hjelm
6648a46963 rma: fix semantic errors in osc/rdma and MPI_Win_fence
- Return an error if the caller specified both MPI_MODE_NOPRECEDE and
   MPI_MODE_NOSUCCEED to MPI_Win_fence.

 - Return an error if the caller attempts to enter an active target
   epoch while already in a passive target epoch.

 - End an active target epoch if MPI_Win_fence is called with
   MPI_MODE_NOSUCCEED.

cmr=v1.7.5:ticket=trac:4382

This commit was SVN r31043.

The following Trac tickets were found above:
  Ticket 4382 --> https://svn.open-mpi.org/trac/ompi/ticket/4382
2014-03-12 17:14:03 +00:00
Ralph Castain
f56f37d364 Shifting to an event-driven RTE raises some interesting issues during shutdown. We want the last messages to get thru, but also need to correctly shutdown the virtual machine. This requires a delicate balancing act across event priorities, and the need to check for termination conditions in places where related events get processed.
Change the priority of comm_failure and job_termination events to ensure we process final messages prior to terminating. Check for termination conditions when processing proc termination events as we may order proc termination when the daemon gets an exit command, but we can't see the proc actually terminate until we get out of that message event.

Jeff: probably easiest to review this by testing. I tested it under both Slurm and rsh on v1.7.5 as well as trunk

cmr=v1.7.5:reviewer=jsquyres:subject=resolve event priorities during VM shutdown

This commit was SVN r31042.
2014-03-12 16:49:58 +00:00
Nathan Hjelm
51916c5b41 osc/rdma: now that the access epoch is not open after MPI_Win_create* we
need to enable the access epoch in MPI_Win_fence.

I missed this change when I fixed the semantics of MPI_Win_create. With
this commit our one-sided MTT runs are now running clean.

cmr=v1.7.5:reviewer=dgoodell

This commit was SVN r31041.
2014-03-12 16:11:15 +00:00
Jeff Squyres
3120ec2b96 Also bump the Fortran MPI version constants to 3.0
cmr=v1.7.5:ticket=trac:4371

This commit was SVN r31038.

The following Trac tickets were found above:
  Ticket 4371 --> https://svn.open-mpi.org/trac/ompi/ticket/4371
2014-03-12 16:06:15 +00:00
Nathan Hjelm
c173344141 Add new MPI-3.1 tools interface functions.
See https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/377

This ticket adds the following functions to the standard:

 - MPI_T_cvar_get_index, MPI_T_pvar_get_index, and MPI_T_category_get_index

The ticket has passed and the functions are part of MPI-3.1 that will
be released sometime later this year. In Open MPI the functions expose
existing internal functionality so they are low-risk to add to 1.8.0. I
will leave it up to Ralph whether he wants to accept these into 1.8.

cmr=v1.8:reviewer=rhc

This commit was SVN r31037.
2014-03-12 16:03:39 +00:00
Jeff Squyres
c6fb1b51b1 Remove "medium" RMA interfaces
We no longer specify interfaces with choice buffers in the TKR "mpi"
module implementation -- MPI-3 prohibits it (see r30169 and r30170 for
more details).

cmr=v1.7.5:ticket=trac:4372

This commit was SVN r31033.

The following SVN revision numbers were found above:
  r30169 --> open-mpi/ompi@759ee33fd4
  r30170 --> open-mpi/ompi@776f6144af

The following Trac tickets were found above:
  Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
2014-03-12 15:51:42 +00:00
Jeff Squyres
53248a90f3 Add missing files in Makefile.am listing
cmr=v1.7.5:ticket=trac:4372

This commit was SVN r31032.

The following Trac tickets were found above:
  Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
2014-03-12 15:48:55 +00:00
Nathan Hjelm
61f30d992a coll/ml: reduce has some issues when using non-contiguous datatypes. until
these issues are resolved disable coll/ml reduce.

cmr=v1.7.5:reviewer=manjugv

This commit was SVN r31030.
2014-03-12 14:39:16 +00:00
Nathan Hjelm
29e00f9161 osc/rdma: fix issues with mpi_leave_pinned when using rdma capable btls
It seems we can't release accumulate buffers in completion callbacks
because the btls don't release registration resources until after the
callback has fired. The fix is to keep track of the unused buffers and
free them later. This should resolve issues when running IMB-EXT and
IMB-RMA.

cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r31029.
2014-03-12 14:39:03 +00:00
Jeff Squyres
b8cd2878e7 Fix more pragma argument names to match dummy argument names.
Issue found by Absoft MTT runs.

cmr=v1.7.5:ticket=trac:4372

This commit was SVN r31028.

The following Trac tickets were found above:
  Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
2014-03-12 14:30:51 +00:00
Jeff Squyres
f3b5670dce Fix pragma argument names to match dummy argument names.
Issue found by Absoft MTT runs.

cmr=v1.7.5:ticket=trac:4372

This commit was SVN r31027.

The following Trac tickets were found above:
  Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
2014-03-12 14:27:43 +00:00
Jeff Squyres
3523e315b8 Fix ompi function in Fortran RMA bindings
cmr=v1.7.5:ticket=trac:4372

This commit was SVN r31026.

The following Trac tickets were found above:
  Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
2014-03-12 14:12:10 +00:00
Ralph Castain
a254d2db34 Silence warning when CR is not enabled
This commit was SVN r31025.
2014-03-12 13:47:03 +00:00
Jeff Squyres
90903813b1 Update .gitignore
This commit was SVN r31024.
2014-03-12 13:20:50 +00:00