Commit graph

12449 commits

Author SHA1 Message Date
Ralph Castain
edb3d99687 Update SLURM environmental variables used to describe allocation. Retain backwards compatibility to SLURM 1.1 and earlier versions.
This commit was SVN r19647.
2008-09-26 02:38:37 +00:00
Tim Mattox
aaae28b6c7 Resync the NEWS file with the 1.2 branch.
This commit was SVN r19646.
2008-09-25 21:48:17 +00:00
Aurelien Bouteiller
4be474f727 CRS is now an opal framework. It should use OPAL version defines.
This commit was SVN r19643.
2008-09-25 21:01:04 +00:00
Kenneth Matney
91bbc6b919 Change algorithm from spawning a shell that spawns another shell, and
thereby runs apstat twice; and in the process thereof reads the ALPS
appinfo file TWICE; and in addition, experiences a failure sometimes
which causes mpirun to hang.  Change this to a looped read attempt
that breaks on success, thereby avoiding failure (except in the most

This commit was SVN r19642.
2008-09-25 20:44:16 +00:00
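The r19642 message above is truncated, so the actual change is not visible here. As a rough illustration of the "looped read attempt that breaks on success" idea only, the sketch below retries a read through a hypothetical callback (`read_attempt_fn`, `read_with_retry`, and `succeed_after_countdown` are invented names, not Open MPI or ALPS code) and stops at the first success.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: returns true when one read attempt succeeds. */
typedef bool (*read_attempt_fn)(void *ctx);

/*
 * Try the read up to max_attempts times and stop at the first success.
 * Returns the number of attempts used, or -1 if every attempt failed.
 */
int read_with_retry(read_attempt_fn attempt, void *ctx, int max_attempts)
{
    assert(attempt != NULL);
    for (int i = 1; i <= max_attempts; ++i) {
        if (attempt(ctx)) {
            return i;   /* break on success */
        }
    }
    return -1;
}

/* Illustrative attempt: fails until the failure counter reaches zero. */
bool succeed_after_countdown(void *ctx)
{
    int *remaining_failures = ctx;
    if (*remaining_failures > 0) {
        --*remaining_failures;
        return false;
    }
    return true;
}
```

Calling `read_with_retry(succeed_after_countdown, &n, 5)` with `n` set to 2 succeeds on the third attempt. A real caller would presumably pause between attempts; that detail is left out here.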
Tim Mattox
821d81304f Testing...
This commit was SVN r19641.
2008-09-25 20:34:00 +00:00
Jeff Squyres
8de0663ae0 Increase the size of MPI_MAX_PORT_NAME from 256 to 1024.
Rationale:

 1. This value has already changed since v1.2 (v1.2 MPI_MAX_PORT_NAME
    == 36).  Hence, this commit simply increases the value from a
    previous change.
 2. The change does increase OMPI's memory footprint slightly, but
    only when using MPI-2 dynamics.  So it is expected that the change
    will have minimal impact on the overall footprint.
 3. The change is helpful for nodes that have 4 or more IP networks
    (e.g., regular ethernet and multiple IP-over-<pick your favorite
    high-speed network> networks).  Without this change, invoking
    MPI_COMM_SPAWN on hosts with 4 or more IP networks will fail
    because we'll exceed 256 bytes for the port name.  Some OMPI
    developer test clusters already have this kind of configuration
    (e.g., Cisco); it is expected that this is not too common in the
    real world yet, but with "manycore" coming, having multiple
    IP-based networks in a single server will likely become more
    common.

This commit was SVN r19638.
2008-09-25 16:47:17 +00:00
Ralph Castain
037231fbcb Modify the node_rank and local_rank fields to be uint16_t so we can handle more than 256 procs/node. Change the type to a defined one so that any future change can be easily done, if required.
This commit was SVN r19637.
2008-09-25 13:39:08 +00:00
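To illustrate the r19637 change above: a minimal sketch, assuming the old fields were 8 bits wide (the message implies this with its 256 procs/node limit but does not say so). The names `example_local_rank_t`, `EXAMPLE_LOCAL_RANK_MAX`, `rank_as_uint8`, and `rank_as_defined_type` are hypothetical and not taken from the Open MPI source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type name; the point is that the width is defined in one place. */
typedef uint16_t example_local_rank_t;
#define EXAMPLE_LOCAL_RANK_MAX UINT16_MAX

/* What an 8-bit field would store for a given process index (wraps modulo 256). */
uint8_t rank_as_uint8(unsigned int index)
{
    return (uint8_t)index;
}

/* The same index stored in the wider, typedef'd field. */
example_local_rank_t rank_as_defined_type(unsigned int index)
{
    assert(index <= EXAMPLE_LOCAL_RANK_MAX);
    return (example_local_rank_t)index;
}
```

With an 8-bit field, process index 256 wraps around to 0 and collides with the first process on the node. With the typedef, a later widening would only need the one definition changed.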
Ralph Castain
55738aeabe Very tiny modification of the output when displaying mca param values to clarify that ones found in the environment could have also been set on the cmd line - we don't have a way to distinguish them internally.
This commit was SVN r19636.
2008-09-25 13:08:17 +00:00
Tim Mattox
6a3e28a3b6 Resync the NEWS file with the 1.3 branch.
This commit was SVN r19635.
2008-09-24 21:06:34 +00:00
Tim Mattox
fae18d1ea2 Ugh, whitespace differences suck... fixing.
This commit was SVN r19632.
2008-09-24 20:51:18 +00:00
Tim Mattox
1d5a6602b6 Resync the NEWS on the trunk with the 1.2 branch.
This commit was SVN r19631.
2008-09-24 20:41:17 +00:00
Jeff Squyres
627f1ecd36 Oops -- fix silly cut-n-paste error...
This commit was SVN r19630.
2008-09-24 20:17:54 +00:00
Jeff Squyres
d85aaf521a Also show the process name; it *is* useful, at least to us developers ;-)
This commit was SVN r19629.
2008-09-24 19:32:34 +00:00
Josh Hursey
77e6b72c06 Update my entry to reflect all of the affiliations.
This commit was SVN r19627.
2008-09-24 17:34:58 +00:00
Jeff Squyres
78a25cf116 Commit a few missing header files, etc.
This commit was SVN r19626.
2008-09-24 15:41:42 +00:00
Brian Barrett
1f69ae5356 Add SNL affiliation for me. SNL is plural.
This commit was SVN r19625.
2008-09-24 15:20:45 +00:00
Ralph Castain
8d1ecdb361 Correct the creation of MPIR_Proctable so that the structs in the array correspond to the order of the ranks.
This commit was SVN r19624.
2008-09-24 14:55:46 +00:00
Jeff Squyres
bbfac2dfb5 Based on a review by Ralph, no need to call getpid() or gethostname();
we already have them in orte_process_info.  Refs trac:1523.

This commit was SVN r19615.

The following Trac tickets were found above:
  Ticket 1523 --> https://svn.open-mpi.org/trac/ompi/ticket/1523
2008-09-23 20:04:34 +00:00
Jeff Squyres
2879de60a1 Update a name spelling
This commit was SVN r19614.
2008-09-23 19:59:57 +00:00
Jeff Squyres
ca323aae8e Very minor updates. Refs trac:1399.
This commit was SVN r19613.

The following Trac tickets were found above:
  Ticket 1399 --> https://svn.open-mpi.org/trac/ompi/ticket/1399
2008-09-23 19:50:31 +00:00
Jeff Squyres
ef6a216771 Update AUTHORS file with all the IDs that have committed so far on the
OMPI trunk.  Need all organizations to ensure I got spellings and
affiliations correct.  

Also commit a helper script to help keep AUTHORS up to date on the
trunk; it should be run before we create release branches.

This commit was SVN r19612.
2008-09-23 19:38:53 +00:00
Jeff Squyres
4c558ed637 Enable aggregation checking for "*** An error occurred..." MPI layer
help messages so that users only see the message once instead of N
times when their MPI app crashes.

Note that there is a tradeoff here -- we now call malloc in this
particular "show the error" code path.  This shouldn't usually be a
problem, because the errors typically displayed through this mechanism
are MPI API argument problems (e.g., sending a negative count to
MPI_SEND), and not memory errors.  But such API argument errors could
be a consequence of a prior memory error, so there's a nonzero
chance that the error message will fail to print because malloc
failed.  In this case, the user can disable help message aggregation
(via the orte_base_want_aggregate MCA parameter) and we'll fall back
to the no-malloc code path (but without aggregation).

Note that we won't aggregate before MPI_INIT or after MPI_FINALIZE.
So if you call an MPI function before MPI_INIT / after MPI_FINALIZE,
you'll still see the error message N times.  Nothing we can do about
that; we need ORTE to do the aggregation properly (which is obviously
unavailable before MPI_INIT / after MPI_FINALIZE).

This commit was SVN r19611.
2008-09-23 17:19:24 +00:00
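The malloc tradeoff described in r19611 above can be pictured with a small single-process sketch. This is not the ORTE implementation, which aggregates across processes; `struct seen_topic`, `aggregate_topic`, and `free_topics` are hypothetical names. It only shows why remembering which messages were already shown needs an allocation, and how a caller could fall back when that allocation fails.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: one entry per distinct help-message topic. */
struct seen_topic {
    char *topic;
    int count;
    struct seen_topic *next;
};

/*
 * Record one occurrence of topic. Returns 1 the first time a topic is seen
 * (the caller should print it), 0 for a repeat, and -1 if malloc failed
 * (the caller should print without aggregation instead).
 */
int aggregate_topic(struct seen_topic **list, const char *topic)
{
    assert(list != NULL && topic != NULL);
    for (struct seen_topic *t = *list; t != NULL; t = t->next) {
        if (strcmp(t->topic, topic) == 0) {
            t->count++;
            return 0;
        }
    }
    struct seen_topic *t = malloc(sizeof(*t));
    if (t == NULL) {
        return -1;
    }
    size_t len = strlen(topic) + 1;
    t->topic = malloc(len);
    if (t->topic == NULL) {
        free(t);
        return -1;
    }
    memcpy(t->topic, topic, len);
    t->count = 1;
    t->next = *list;   /* newest topic becomes the list head */
    *list = t;
    return 1;
}

/* Release every record in the list. */
void free_topics(struct seen_topic *list)
{
    while (list != NULL) {
        struct seen_topic *next = list->next;
        free(list->topic);
        free(list);
        list = next;
    }
}
```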
Jeff Squyres
a676874f47 Disable global ID resolution when sparse groups are used. Tested by
Terry and George in the non-sparse-groups scenarios.  Fixes trac:1464.

Will file a new ticket to actually resolve IDs when sparse groups are
used.

This commit was SVN r19610.

The following Trac tickets were found above:
  Ticket 1464 --> https://svn.open-mpi.org/trac/ompi/ticket/1464
2008-09-23 16:27:01 +00:00
Pavel Shamis
bd09bbf851 Disabling IBCM support by default.
The component still is not stable.

This commit was SVN r19609.
2008-09-23 15:57:55 +00:00
Ralph Castain
e64b79f30f Modify the --display-map and --display-alloc per note on devel list to reduce info for user understanding.
Add --display-devel-map and --display-devel-alloc to display all the detailed info we used to provide - it is only of use/interest to developers anyway and confuses users.

This commit was SVN r19608.
2008-09-23 15:46:34 +00:00
Josh Hursey
90c936b292 Cleanup BLCR configure logic. Add a '--with-blcr-libdir' option to allow a user to specify a library directory outside of the '--with-blcr' option.
Needs to be moved to v1.3

This commit was SVN r19607.
2008-09-22 19:48:47 +00:00
Josh Hursey
36b824effd Make sure to protect the symbol, so builds that do not involve threads will build properly.
Thanks to Jeff for pointing this out to me.

This commit was SVN r19606.
2008-09-22 19:03:41 +00:00
Kenneth Matney
68248a32ef Add #include for stdio.h to allow make check to run with gcc 4.2.4 (on
Cray XT platform).

This commit was SVN r19605.
2008-09-22 18:00:30 +00:00
Jeff Squyres
e0a991a8c2 Print out a message telling the user how to enable non-aggregated help
/ error messages.

This commit was SVN r19604.
2008-09-22 17:42:56 +00:00
Jeff Squyres
d6696c46a6 Oops -- sometimes we actually pass NULL for the error_code. Make sure
to handle that nicely without segv'ing.

This commit was SVN r19603.
2008-09-22 17:41:39 +00:00
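As an illustration of the r19603 fix above, here is a minimal sketch of guarding an optional `error_code` pointer before dereferencing it. The function `format_error` and its message format are hypothetical; the actual Open MPI code path is not shown in this log.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: format an error line into buf. error_code may be NULL,
 * in which case the code is reported as unknown instead of being dereferenced.
 * Returns the value of snprintf (the length the full message needs).
 */
int format_error(char *buf, size_t buflen, const char *what, const int *error_code)
{
    assert(buf != NULL && what != NULL);
    if (error_code == NULL) {
        return snprintf(buf, buflen, "%s: error code unknown", what);
    }
    return snprintf(buf, buflen, "%s: error code %d", what, *error_code);
}
```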
Josh Hursey
0cd65bfaa8 Fix a SIGPIPE that may occur when checkpointing a restarted process. This was a result of calling system() in the BLCR CRS. After inspection and testing it was determined that the operation was no longer necessary. So the call was removed thus fixing the bug.
This commit was SVN r19601.
2008-09-22 16:49:56 +00:00
Jeff Squyres
8eccda391a Fix comment to match the code.
This commit was SVN r19598.
2008-09-20 12:35:48 +00:00
Jeff Squyres
02f2cbe85a * Added bullet about upgrading autotools
* Added bullet about removing duplicate error messages
* Some minor grammar and syntax fixes.

This commit was SVN r19597.
2008-09-20 11:42:59 +00:00
Jeff Squyres
5fd742e769 Add in the standardized way to notify a debugger if the MPI job is
about to abort.  Fixes trac:1509.

This commit was SVN r19596.

The following Trac tickets were found above:
  Ticket 1509 --> https://svn.open-mpi.org/trac/ompi/ticket/1509
2008-09-20 11:34:37 +00:00
Matthias Jurenz
16561fa297 Added config.h.in to svn:ignore
This commit was SVN r19593.
2008-09-19 15:17:36 +00:00
Matthias Jurenz
5755d35045 Removed - This file will be created by autotools
This commit was SVN r19591.
2008-09-19 15:09:46 +00:00
Jeff Squyres
53967f2b4e Merge in PLPA v1.2rc2 (README fixes, new version of Autotools, and
have PLPA report its version correctly).

This commit was SVN r19590.
2008-09-19 15:05:03 +00:00
Jeff Squyres
b1ff61b19e Update to PLPA v1.2rc1
This commit was SVN r19589.
2008-09-19 14:49:53 +00:00
Jeff Squyres
7d119a1c3b Fix CID 1116: ensure to check return code (patch approved by George
:-) ).

This commit was SVN r19584.
2008-09-19 13:28:04 +00:00
Jeff Squyres
d0a8be6d2f Fix CID 1117: ensure to check return values.
This commit was SVN r19583.
2008-09-19 13:27:30 +00:00
Lenny Verkhovsky
ca0a5ea60b Fixed the warnings on the crays.
base/paffinity_base_service.c:153: warning: 'phys_core' may be used uninitialized in this function
base/paffinity_base_service.c:153: note: 'phys_core' was declared here

This commit was SVN r19580.
2008-09-18 11:31:12 +00:00
Matthias Jurenz
d42592113b Fixed compiler warning (unused variable)
This commit was SVN r19577.
2008-09-17 14:39:19 +00:00
Josh Hursey
778e387618 fix a compiler warning
This commit was SVN r19574.
2008-09-17 14:01:31 +00:00
Josh Hursey
80d05cf957 Cleanup the patch from r19566.
Thanks to George and Jeff for pointing out a better way to do this.

This commit was SVN r19573.

The following SVN revision numbers were found above:
  r19566 --> open-mpi/ompi@351c3a3a86
2008-09-17 13:55:21 +00:00
Jeff Squyres
d2d06008a0 Change the default value of mpi_leave_pinned to -1, meaning that we'll
figure it out at runtime (really meaning: we'll still default to "0"
unless something explicitly overrides to 1, such as the openib BTL).
This way, ompi_info doesn't confusingly report mpi_leave_pinned==0
while we actually end up running with mpi_leave_pinned==1.

Fixes trac:1502.

This commit was SVN r19571.

The following Trac tickets were found above:
  Ticket 1502 --> https://svn.open-mpi.org/trac/ompi/ticket/1502
2008-09-16 22:06:14 +00:00
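The "-1 means figure it out at runtime" convention from r19571 above can be sketched as a tri-state resolution. This is a simplified illustration, not the Open MPI code: `resolve_leave_pinned` and `component_wants_it` are hypothetical names, and the rule that an explicit 0 or 1 always wins is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Resolve a tri-state setting: -1 means "decide at runtime", while 0 and 1
 * are explicit choices. component_wants_it is a hypothetical flag standing
 * in for a component (such as a BTL) that asks for the feature.
 */
int resolve_leave_pinned(int configured, bool component_wants_it)
{
    assert(configured >= -1 && configured <= 1);
    if (configured != -1) {
        return configured;              /* assumed: explicit setting wins */
    }
    return component_wants_it ? 1 : 0;  /* stays 0 unless overridden to 1 */
}
```

Reporting the stored value as -1 rather than 0 is what keeps the displayed default from contradicting the value used at run time.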
Josh Hursey
351c3a3a86 The ft_event function needs access to the bml_r2_remove_btl_progress() to ensure
that all progress events are flushed as needed across a checkpoint/restart.

This commit was SVN r19566.
2008-09-16 19:06:53 +00:00
Jeff Squyres
270f482fea Addendum to r19561: also remove a comment that is no longer true and
some code that is commented out.

This commit was SVN r19564.

The following SVN revision numbers were found above:
  r19561 --> open-mpi/ompi@17e65369be
2008-09-16 13:02:10 +00:00
George Bosilca
6a9514ee08 Make the code match the comment. I checked with Jelena, and based on the papers we
published this is the expected algorithm for the specified message and communicator
size.

This commit closes ticket #1330.

This commit was SVN r19563.
2008-09-15 23:28:40 +00:00
George Bosilca
acd3406aa7 Never drop messages. No never no more.
This is supposed to fix the ticket #1460.

This commit was SVN r19562.
2008-09-15 23:04:18 +00:00
George Bosilca
17e65369be Fix the deadlock when we run out of resources on the BTLs. Move the progress
function from the BML into the PML. The BTL progress functions are now directly
registered with the event library.

This commit was SVN r19561.
2008-09-15 22:56:23 +00:00