This completes the minor changes required to the PLS components. Basically, there is a small change required to the parameter list of the orted cmd functions. I caught and did it for xcpu and poe, in addition to the components listed in my email - so I think that only leaves xgrid unconverted.
The orted fail-to-start mods will also make changes in the PLS components, but those can be localized so they come in one at a time.
This commit was SVN r14499.
Test for system limits (where known) prior to doing things like fork and pipe since some systems aren't very nice about it when we try to exceed such limits.
This commit was SVN r14494.
There is a binomial algorithm in the code (i.e., the HNP would send to a subset of the orteds, which then relay it on according to the typical log-2 algo), but that has a bug in it so the code won't let you select it even if you tried (and the mca param doesn't show, so you'd *really* have to try).
This also involved a slight change to the oob.xcast API, so propagated that as required.
Note: this has *only* been tested on rsh, SLURM, and Bproc environments (now that it has been transferred to the OMPI trunk, I'll need to re-test it [only done rsh so far]). It should work fine on any environment that uses the ORTE daemons - anywhere else, you are on your own... :-)
Also, correct a mistake where the orte_debug_flag was declared an int, but the mca param was set as a bool. Move the storage for that flag to the orte/runtime/params.c and orte/runtime/params.h files appropriately.
This commit was SVN r14475.
finally brings in functionality that is already on the 1.2 branch, and
was developed and tested in the v1.2ofed branch (and other places).
Short version of new features:
* Support for ibv_fork_init()
* Automatically fill in the openib BTL bandwidth value by
querying the HCA port
* Installdirs functionality
* Fixes to always use -I in the Fortran wrapper compilers (#924)
* Gleb's mpool updates
* Remove some kruft in btl/openib/configure.m4, therefore
fixing the harmless warnings noted in #665
* Bunches of updates to the Linux RPM spec file
I.e., effectively the same thing that r14411 brought to the v1.2
branch.
Also effectively brought in r14432 and r14433 (some fixes on top of
the original r14411 commit to v1.2). Still need to bring in the moral
equivalent of r14445 after this commit (fixes to installdirs).
This commit was SVN r14449.
The following SVN revision numbers were found above:
r14411 --> open-mpi/ompi@83b31314ae
r14432 --> open-mpi/ompi@a48f160595
r14433 --> open-mpi/ompi@68f346d2bc
r14445 --> open-mpi/ompi@13d366b827
Create a sentinel value in the metadata file to clearly indicate
that the sequence number is complete (versus in progress). This
way we do not try to restart from an invalid sequence number
which can lead to badness.
This commit was SVN r14423.
the program it try to spawn is missing.
Description of the problem: When the rsh pls try to spawn a local
process which is missing (such as a removed orted) the orterun
deadlock.
Description of the fix: The forked child deal with finding the
program to be executed. If it fails to find it, then instead of
calling exit (as a normal forked program is expected to do) it
continue the execution using a execution path it was never
expected to use (back in orterun and then main). Bad things
happens as expected. Forcing the child to use exit when it fails
to find the orted (and forcing the child to use exit everywhere
instead of return) correct the logic of the rsh pls and make it
behave as expected.
This commit was SVN r14377.
Per discussions with Brian and Ralph, make a slight correction in
where components are installed. Use $pkglibdir, not $libdir/openmpi,
so that when compiled in the orte trunk, components are installed to
the right directory (because the component search patch is checking
$pkglibdir).
This commit was SVN r14345.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r14289
follow the same behavior as before, the changes just make sure the check is done
in linear time and the memory usage is kept to a minimum.
This commit was SVN r14331.
as the IOF components lack the required finalize function. Instead rely on the
module finalize. Read the comment or more informations.
This commit was SVN r14323.
Collect the base 'orted' command line into a base function since most of the
PLS components were duplicating this code. Add AMCA parameter command line
component to the base set.
Add Aggregate MCA parameter support to the following PLS components:
- gridengine
- process
- slurm
- poe
- tm
Improve support for 'rsh' component.
Did/could not support the following components:
- bproc
- proxy
- xcpu
- cnos
- xgrid
The above components had peculiar needs that made it non-trivial to add an
option. The authors of these components need to help in supporting this
new option.
I was only able to test the SLURM and RSH components due to system availability.
The others should work without problem.
This commit was SVN r14284.
The following Trac tickets were found above:
Ticket 976 --> https://svn.open-mpi.org/trac/ompi/ticket/976
- Make it so that all the GPR pointer arrays are allocated initially at 16 elements instead of 512. This saves (on a 64 bit machine) approximately 4*(# procs + # nodes) KB.
- Fix up the segment prealloc function so that preallocating an existant segment is not an error, and make the areas where we do large inserts use it.
Fix the orte_pointer_array to efficiently implement setting its size. Before we just realloced the array one block at a time until the desired size was reached. Now we resize it all in one realloc.
This commit was SVN r14264.
* Remove the connect() timeout code, as it had some nasty race conditions
when connections were established as the trigger was firing. A better
solution has been found for the cluster where this was needed, so just
removing it was easiest.
* When a fatal error (too many connection failures) occurs, set an error
on messages in the queue even if there isn't an active message. The
first message to any peer will be queued without being active (and
so will all subsequent messages until the connection is established),
and the orteds will hang until that first message completes. So if
an orted can never contact it's peer, it will never exit and just sit
waiting for that message to complete.
* Cover an interesting RST condition in the connect code. A connection
can complete the three-way handshake, the connector can even send
some data, but the server side will drop the connection because it
can't move it from the half-connected to fully-connected state because
of space shortage in the listen backlog queue. This causes a RST to
be received first time that recv() is called, which will be when waiting
for the remote side of the OOB ack. In this case, transition the
connection back into a CLOSED state and try to connect again.
* Add levels of debugging, rather than all or nothing, each building on
the previous level. 0 (default) is hard errors. 1 is connection
error debugging info. 2 is all connection info. 3 is more state
info. 4 includes all message info.
* Add some hopefully useful comments
This commit was SVN r14261.
deal with the PLS RSH. Remove support for unknown user (i.e. if the user is
not known by the system, then it shouldn't be allowed to spawn anything).
This commit was SVN r14232.
OPAL_FREE_LIST_WAIT/RETURN will not use locks in a non-threaded build
conditionaly use locks if non-threaded around the OPAL_FREE_LIST_WAIT/RETURN
seems to fix the issue
Tested at 4K processes and seems to work..
This commit was SVN r14135.
listen thread, but we're not the HNP. This is better than not starting up
any listen mode, which is what we were doing before :/
This commit was SVN r14133.
__opal_attribute_nonnull__, __opal_attribute_warn_unused_result__,
__opal_attribute_malloc__, __opal_attribute_sentinel__ and
__opal_attribute_format__
This commit was SVN r14078.
This merge adds Checkpoint/Restart support to Open MPI. The initial
frameworks and components support a LAM/MPI-like implementation.
This commit follows the risk assessment presented to the Open MPI core
development group on Feb. 22, 2007.
This commit closes trac:158
More details to follow.
This commit was SVN r14051.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r13912
The following Trac tickets were found above:
Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158
* Do not empty the list of in-flight frags during _close(); the OOB
callback will still occur (_send_cb()) and try to remove the frag
from the list, which will then result in an assert failure (debug
builds).
* Add one more fix for a possible problem -- add an extra RETAIN /
RELEASE pair on the endpoint to ensure that it is not actually
freed before all in-flight frags have drained.
This commit was SVN r13953.
The following Trac tickets were found above:
Ticket 921 --> https://svn.open-mpi.org/trac/ompi/ticket/921
- mca_base_param_file_prefix
(Default: NULL)
This is the fullname of the "-am" mpirun option. Used to specify a ':'
separated list of AMCA parameter set files.
- mca_base_param_file_path
(Default: $SYSCONFDIR/amca-param-sets/:$CWD)
The path to search for AMCA files with relative paths. A warning will be
printed if the AMCA file cannot be found.
* Added a new function "mca_base_param_recache_files" the re-reads the file
configurations. This is used internally to help bootstrap the MCA system.
* Added a new orterun/mpirun command line option '-am' that aliases for the
mca_base_param_file_prefix MCA parameter
* Exposed the opal_path_access function as it is generally useful in other
places in the code.
* New function "opal_cmd_line_make_opt_mca" which will allow you to append a
new command line option with MCA parameter identifiers to set at the same
time. Previously this could only be done at command line declaration time.
* Added a new directory under the $pkgdatadir named "amca-param-sets" where all
the 'shipped with' Open MPI AMCA parameter sets are placed. This is the first
place to search for AMCA sets with relative paths.
* An example.conf AMCA parameter set file is located in
contrib/amca-param-sets/.
* Jeff Squyres contributed an OpenIB AMCA set for benchmarking.
Note: You will need to autogen with this commit as it adds a configure param.
Sorry :(
This commit was SVN r13867.
its bigger than the timeout for the connect() call, just don't register
the handler by default and fall back to connect() timing out. Should give
much happier performance on big clusters.
This commit was SVN r13639.
Adjust the RMAPS mapped_node object to propagate the required launch_id info now included in the ras_node object. This provides support for those few systems that don't use nodename to launch, but instead want some id (typically an index into the array of allocated nodes). This value gets set for each node in the RAS - the RMAPS just propagates it for easy launch.
This commit was SVN r13581.
This change utilizes the new num_processors function. I also left the mods made to ompi_mpi_init and the bug fix for the default value of mpi_yield_when_idle. Note that the mods to mpi_init will not really take effect as the mca param will now *always* be set (either by user or odls). We will need those mods later, so no point in removing them now.
This commit was SVN r13519.
1. if the user has specified sched_yield, we simply do what we are told
2. if they didn't specify anything, try to get the number of processors on this node. Note that we already now get the number of local procs in our job that are sharing this node - that now comes in through the proc callback and is stored in the ompi_proc_t structures.
3. if we can get the number of processors, compare that to the number of local procs from my job that are sharing my node. If the number of local procs exceeds the number of processors, then set sched_yield to true. If not, then be a hog and set sched_yield to false
4. if we can't get the number of processors, default to conservative behavior and set sched_yield to true.
Note that I have not yet dealt with the need to dynamically adjust this setting as more processes are added via comm_spawn. So far, we are *only* looking within our own job. Given that we have now moved this logic to mpi_init (and away from the orteds), it isn't yet clear to me how a process will be informed about the number of procs in *other* jobs that are also sharing this node.
Something to continue to ponder.
This commit was SVN r13430.
Tested this functionality quite a bit more and made some fixes:
* Print far fewer help messages
* Fix one additional deadlock upon error
* Change some ORTE_LOG messages to silent (because they're not
errors)
* Some code got re-indented, sorry...
Discussed and reviewed with Ralph.
This commit was SVN r13375.
The following Trac tickets were found above:
Ticket 726 --> https://svn.open-mpi.org/trac/ompi/ticket/726
and George on these refinements):
* Rename the static OBJ initializer macro to be
OPAL_OBJ_STATIC_INIT(class)
* Ensure that all static OBJ initializations get a refcount of 1
(doesn't ''really'' matter, since they're static, it should never
get to the point where the OBJ is DESTRUCTed, but more correct
nonetheless)
* Add a "magic number" to the OBJ when compiling with debug support.
The magic number does some rudimentary support to ensure that
you're operating on a valid OBJ (and fails an assertion if you're
not). Check to ensure that the memory contains the magic number
when performing actions of OBJ's. Also remove the magic number
when DESTRUCTing OBJs, so that if, for example, an OBJ is
DESTRUCTed more than once, we'll fail the magic number assert.
This commit was SVN r13338.
The following SVN revision numbers were found above:
r13227 --> open-mpi/ompi@96030de97b
r13228 --> open-mpi/ompi@c2e9075d29