1
1
Граф коммитов

19809 Коммитов

Автор SHA1 Сообщение Дата
Nathan Hjelm
8b2d723fd4 coll/ml: fix valgrind warning about reading uninitialed value
This isn't causing any errors that I know about but it does fix an
annoying valgrind warning. Simple fix, no review required.

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r31130.
2014-03-18 21:26:17 +00:00
Nathan Hjelm
d9c8bf3785 coll/ml: move error messages to verbose output
There are situations where coll/ml does not initialize properly. These will
eventually need to be fixed but in the meantime it is better to not always
print an error message because the collective framework can still fall back
on another collective module. This commit reduces the verbose output.

cmr=v1.7.5:reviewer=manjugv

This commit was SVN r31129.
2014-03-18 21:26:10 +00:00
Nathan Hjelm
97d7315dd2 coll/ml: do not assert if a barrier algorithm is not available
It is usually not a good idea to assert when something is not implemented
or something goes wrong. Replace asserts with debug output and return.

cmr=v1.7.5:reviewer=manjugv

This commit was SVN r31128.
2014-03-18 21:26:04 +00:00
Nathan Hjelm
bddd6542b7 sbgp/basesmsocket: do not recalculate process locality
The necessary information is stored in the proc object. There is no need
to allgather the local process data to determine if another rank is on
the same socket.

cmr=v1.7.5:reviewer=manjugv

This commit was SVN r31127.
2014-03-18 21:25:57 +00:00
Nathan Hjelm
22f64bb62b Addendum to r31096. Up basesmuma algorithm limits to 1M.
After discussion with Manju we decided to update these the process count
limits of the shared memory collectives to an arbitrarily large number.

cmr=v1.7.5:ticket=trac:4405

This commit was SVN r31126.

The following SVN revision numbers were found above:
  r31096 --> open-mpi/ompi@3f469d08e7

The following Trac tickets were found above:
  Ticket 4405 --> https://svn.open-mpi.org/trac/ompi/ticket/4405
2014-03-18 21:25:49 +00:00
Ralph Castain
518ba55cf4 Ensure MPIEXEC_TIMEOUT calls the correct state to exit
cmr=v1.7.5:reviewer=dgoodell

This commit was SVN r31125.
2014-03-18 20:12:02 +00:00
Ralph Castain
06f0a9584b Sync NEWS and README with 1.7.5
This commit was SVN r31124.
2014-03-18 18:42:38 +00:00
Ralph Castain
d82bd5f3cf Sync NEWS and README with 1.7.5
This commit was SVN r31122.
2014-03-18 18:38:38 +00:00
Ralph Castain
543271b9de Set the locality prior to calling add_procs so bozos like Jeff get it at the right time
Refs trac:4411

This commit was SVN r31119.

The following Trac tickets were found above:
  Ticket 4411 --> https://svn.open-mpi.org/trac/ompi/ticket/4411
2014-03-18 17:57:27 +00:00
Ralph Castain
3323c47ab4 Ensure all procs set locality for all remote procs in the multi-way intercomm_create problem
Refs trac:4411

This commit was SVN r31118.

The following Trac tickets were found above:
  Ticket 4411 --> https://svn.open-mpi.org/trac/ompi/ticket/4411
2014-03-18 16:55:15 +00:00
Jeff Squyres
7933de4928 Fix segv when ibv_create_ah fails.
* Ensure that all endpoints[x] values are initialized to NULL
* If ibv_create_ah fails, remove each endpoint from the
  module->all_endpoints list so that the endpoint can be destructed
  properly.

Submitted by Jeff Squyres, reviewed by Dave Goodell.

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r31111.
2014-03-18 15:52:55 +00:00
Adrian Reber
62dea2be84 opal-restart: fix variable passing from opal-restart to CRS
opal-restart.c disables crs module selection by setting
crs_base_do_not_select to true using opal_setenv() before
opal_init(). After opal_init() it enables module selection
by changing the variable back to false using opal_setenv().
This does not work anymore as the variables are only read
from the environment during variable registration.
This changes the second opal_setenv() to mca_base_var_set_value()

This commit was SVN r31108.
2014-03-18 15:28:42 +00:00
Mike Dubman
84a6330b27 OSHMEM: SHMEM_API_ macro fix
The verbose level was initialized too early
group shmem mca params together

fixed by Roman, reviewed by Igor/Mike

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r31107.
2014-03-18 15:07:04 +00:00
Ralph Castain
554da83865 Set the locality for remote procs even after a comm_spawn. Ensure we store our own local cpuset upon launch so it will be shared during comm_join.
This provides full locality - i.e., not just node-level, but all the way down to whatever common binding level exists between the procs.

cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r31106.
2014-03-18 14:51:07 +00:00
Jeff Squyres
5efd961149 Remove unnecessary \n's in ML_VERBOSE and ML_ERROR.
Also fixed spelling: IS_NOT_RECHABLE -> IS_NOT_REACHABLE.

Also mark a few places where opal_show_help() should have been used;
Manju will take care of these.

This commit was SVN r31104.
2014-03-18 12:24:32 +00:00
Ralph Castain
0aa23cdc35 Cleanup copy/paste errors to ensure we progress the launch
cmr=v1.7.5:reviewer=rhc

This commit was SVN r31102.
2014-03-18 01:24:49 +00:00
Ralph Castain
9a8d2d9989 Add a missing "break" statement that otherwise causes the fetch of any string object to return an error
cmr=v1.7.5:reviewer=hjelmn

This commit was SVN r31101.
2014-03-18 01:14:34 +00:00
Nathan Hjelm
3f469d08e7 coll/ml: increase the number of allowed processes in a local reduce and
add checks to see if the bcol module can support allreduce.

cmr=v1.7.5:reviewer=manjugv

This commit was SVN r31096.
2014-03-17 23:10:19 +00:00
Pavel Shamis
fba1edbf14 Removing ml include from bcol_ptpcoll.h.
It is not really required.

This commit was SVN r31095.
2014-03-17 22:58:40 +00:00
Ralph Castain
545ac7dc58 Remove the job_control_forwarding logic as we want *any* signal to go to all members of the process group
Refs trac:4404

This commit was SVN r31094.

The following Trac tickets were found above:
  Ticket 4404 --> https://svn.open-mpi.org/trac/ompi/ticket/4404
2014-03-17 22:45:33 +00:00
Ralph Castain
5a868028a8 Revert r31091 - the functionality didn't disappear, but moved into the MPI layer :-(
This commit was SVN r31093.

The following SVN revision numbers were found above:
  r31091 --> open-mpi/ompi@edf680855e
2014-03-17 22:30:03 +00:00
Ralph Castain
99c9ecaed0 Ensure that we send the specified signal to the entire process group of each member of the pid provided to us. This ensures that any children spawned by our children also see the signal
cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r31092.
2014-03-17 22:12:15 +00:00
Ralph Castain
edf680855e Restore locality computation to the nidmap code - don't know how/when it was removed, but that was not good
cmr=v1.7.5:reviewer=hjelmn

This commit was SVN r31091.
2014-03-17 21:59:25 +00:00
Ralph Castain
45196d222b Minor cleanup of the node state definitions - using the enum allows the debuggers to pretty-print the value
This commit was SVN r31090.
2014-03-17 21:27:58 +00:00
Ralph Castain
796dfe5ada Do a little cleanup - only resusage needs the node/proc info, so remove it from the sensor base
This commit was SVN r31089.
2014-03-17 21:26:46 +00:00
Ralph Castain
7bb8dbade6 Extend the regular expression parsing support
This commit was SVN r31088.
2014-03-17 21:25:05 +00:00
Ralph Castain
0257d32eeb There is no OOB component object - it is a simple struct with an opal_list_item_t element at the beginning
cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r31087.
2014-03-17 21:23:59 +00:00
Ralph Castain
38e02890aa ORTE doesn't care about cxx flags
cmr=v1.8:reviewer=jsquyres

This commit was SVN r31086.
2014-03-17 21:21:54 +00:00
Nathan Hjelm
7ec19358df MCA/base: document that is is valid for the string_value parameter to
an enumerator's mca_base_var_enum_sfv_fn_t can be NULL.

cmr=v1.7.5:ticket=trac:4398:reviewer=ompi-gk1.7

This commit was SVN r31085.

The following Trac tickets were found above:
  Ticket 4398 --> https://svn.open-mpi.org/trac/ompi/ticket/4398
2014-03-17 18:52:54 +00:00
Ralph Castain
f259d50ed7 Fully fix the PMI2 warning - turned out to be larger than originally thought due to the way the function was being handled across multiple files. Properly resolve the problem by not compiling the file if PMI2 is not desired, and then appropriately setting the visibility of the function within the module
Refs trac:4400

This commit was SVN r31084.

The following Trac tickets were found above:
  Ticket 4400 --> https://svn.open-mpi.org/trac/ompi/ticket/4400
2014-03-17 17:36:37 +00:00
Ralph Castain
e152449be4 Silence warning
cmr=v1.7.5:reviewer=ompi-gk1.7

This commit was SVN r31083.
2014-03-17 17:05:24 +00:00
Nathan Hjelm
b9dfe84b05 Fix segmentation fault in handling of boolean variables in mca_base_var_set_value.
cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r31082.
2014-03-17 14:58:30 +00:00
Nathan Hjelm
f92579dce5 coll/ml: fix a case not correctly handled by r31071
In r31071 I modified the logic to not increment the hierarchy level if
no processes were selected by that sbgp. That fixed a problem seen on
systems where we don't support process binding. The problem is there
is a case where we actually did select processes yet the number of
selected processes is 0. We need to increment the hierarchy in this case
as well.

This should fix the segmentation fault found by recent MTT runs. Once
this is committed to 1.7.5 remove the .ompi_ignore's from coll/ml and
bcol/ptpcoll. Tested with ompi-tests/ibm.

cmr=v1.7.5:reviewer=rhc

This commit was SVN r31081.

The following SVN revision numbers were found above:
  r31071 --> open-mpi/ompi@1911d97044
2014-03-15 22:37:28 +00:00
Ralph Castain
b248b27637 Remove a check that prevented mpirun from exiting when it should in the single-node case
Refs trac:4393

This commit was SVN r31080.

The following Trac tickets were found above:
  Ticket 4393 --> https://svn.open-mpi.org/trac/ompi/ticket/4393
2014-03-15 15:25:44 +00:00
Jeff Squyres
34d92315ae Remove extraneous "while(0)".
Oops.

cmr=v1.7.5:ticket=trac:4395

This commit was SVN r31075.

The following Trac tickets were found above:
  Ticket 4395 --> https://svn.open-mpi.org/trac/ompi/ticket/4395
2014-03-14 20:41:54 +00:00
Jeff Squyres
06a58affca Fix minor hwloc memory leak in sbgp/basesmsocket
cmr=v1.8:reviewer=hjelmn

This commit was SVN r31074.
2014-03-14 20:40:12 +00:00
Jeff Squyres
036db91f3d For the love of all that is holy, do not put 1MB arrays on the stack.
This was causing JVMs to run out of stack space, and all manner of
badness ensued.

Instead, use the heap -- that's what it's there for.

cmr=v1.7.5:reviewer=rhc:subject=make coll/ml use the heap for large debug array

This commit was SVN r31073.
2014-03-14 20:39:39 +00:00
Rolf vandeVaart
ce5274652f Add some additional verbose output per this RFC
http://www.open-mpi.org/community/lists/devel/2014/03/14282.php
Reviewed by Jeff Squyres

This commit was SVN r31072.
2014-03-14 20:17:47 +00:00
Nathan Hjelm
1911d97044 coll/ml: fix assertion failure that occurs when level 0 of the hierarchy
fails to select any processes on any nodes.

Also modified basesmsocket to only print debugging info to the framework
output.

cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r31071.
2014-03-14 19:39:00 +00:00
Ralph Castain
fbc5e3b773 Deal with the corner case where we encounter an error when attempting to launch a daemon. In this case, we will order abnormal termination before daemons callback to us, and thus any attempt to send them a "die" message will fail. Ensure that mpirun at least exits cleanly in this scenario, thereby allowing the remote daemons that did get launched to commit suicide when comm fails.
cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r31068.
2014-03-14 15:32:30 +00:00
Jeff Squyres
8e8154645b Rearrange ordering of redirection
This prevents "/usr/bin/which: no oshmem_info ..." messages from
appearing.

cmr=v1.7.5:reviewer=rhc

This commit was SVN r31067.
2014-03-14 15:23:18 +00:00
Mike Dubman
cf9f5f9c4c enable oshmem in mlnx platform file
cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r31065.
2014-03-14 08:18:55 +00:00
Jeff Squyres
616e62bb9e Fix --enable|disable-oshmem configure switch
* Fixed the AC_ARG_ENABLE for the oshmem option.
* Fixed the AM_CONDITIONALs values for the projects.
* Re-added (and slightly simplified) the use of PROJECT_OSHMEM in
  various Makefile.am's to disable building OSHMEM stuff

Submitted by Jeff, reviewed and approved by Ralph.

cmr=v1.7.5:reviewer=ompi-gk1.7

This commit was SVN r31062.
2014-03-13 21:23:04 +00:00
Jeff Squyres
16cab57ec5 Fix some set-but-not-used compiler warnings.
cmr=v1.8:reviewer=miked

This commit was SVN r31061.
2014-03-13 21:20:36 +00:00
Ralph Castain
2abed09d7c Continue to resolve priority issues. Cleanup the case of forced termination in mpirun during launch processing by ensuring we can respond to socket closures, and ensuring that the remote daemons correctly close their sockets when terminating.
Jeff: please test a variety of conditions to ensure we get this right

cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r31058.
2014-03-13 04:02:24 +00:00
Jeff Squyres
24020ef1e3 Refs trac:4372: 3rd and hopefully final addendum to Fortran API fixes for the RMA functions
These parameters should not be marked as INTENT(OUT) (they aren't in
the MPI-3 standard).

This commit was SVN r31056.

The following Trac tickets were found above:
  Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
2014-03-12 22:55:57 +00:00
Ralph Castain
cd72aa9b66 Per Dave's comment, bzero has portability issues and little advantage over a simple memset. So let's use the safer solution.
cmr=v1.7.5:reviewer=dgoodell:subject=replace bzero with memset

This commit was SVN r31055.
2014-03-12 22:55:47 +00:00
Ralph Castain
ac421c931d The random number generator changes were incomplete (typo errors) in some places, and is missing the required declspec's for visibility.
cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r31053.
2014-03-12 22:37:27 +00:00
Nathan Hjelm
e70809e169 osc/rdma: fix the spelling of incoming
cmr=v1.7.5:ticket=trac:4379

This commit was SVN r31050.

The following Trac tickets were found above:
  Ticket 4379 --> https://svn.open-mpi.org/trac/ompi/ticket/4379
2014-03-12 21:43:23 +00:00
Jeff Squyres
ccff41383c Refs trac:4372: Another addendum to Fortran API fixes for the RMA functions
* Several parameters should not be marked as INTENT(OUT) (they aren't in
  the MPI-3 standard).
* Added missing PMPI F08 OMPI interfaces

This commit was SVN r31049.

The following Trac tickets were found above:
  Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
2014-03-12 20:22:15 +00:00