Short description: major changes include -
1. singletons now fork/exec a local daemon to manage their operations.
2. the orte daemon code now resides in libopen-rte
3. daemons no longer use the orte triggering system during startup. Instead, they directly call back to their parent pls component to report ready to operate. A base function to count the callbacks has been provided.
I have modified all the pls components except xcpu and poe (don't understand either well enough to do it). Full functionality has been verified for rsh, SLURM, and TM systems. Compile has been verified for xgrid and gridengine.
This commit was SVN r15390.
VxWorks. Still some issues remaining, I'm sure.
Refs trac:1010
This commit was SVN r15320.
The following Trac tickets were found above:
Ticket 1010 --> https://svn.open-mpi.org/trac/ompi/ticket/1010
types and add STOP_AT_FIRST_PRIORITY type for framework configuration,
which allows all components at the highest priority that succeeds to
succeed
* Use STOP_AT_FIRST_PRIORITY type for gpr framework, so that the null
component isn't built when the replica and proxy components are
available.
This commit was SVN r15286.
The default odls has been updated and works fine. The process odls has been updated, but I could not verify its operation. The bproc ODLS has not been updated yet. Ralph will look at it soon.
This commit was SVN r15257.
* Remove the 'opal_mca_base_param_use_amca_sets' global variable
* Harness the fact that you can (read should) call the cmd_line functions
before initializing opal_init_util(). This pushes the MCA/GMCA/AMCA
command line options into the environment before OPAL inits and starts
to use these values. By putting the cmd_line parse before opal_init_util
in orterun and orted we only parse the *MCA parameter files once, and
correctly (alleviating the need to 'recache' the files on init.)
* Small bits of cleanup.
This commit was SVN r15219.
Per suggestion, if we don't find a valid shell via getpwuid(), also
check the $SHELL environment variable. Also perform a few minor
cleanups along the way.
This commit was SVN r15156.
The following Trac tickets were found above:
Ticket 1060 --> https://svn.open-mpi.org/trac/ompi/ticket/1060
The SnapC Full local Coordinator used this argument to attach to the job the
daemon would be launching. So once this option was removed C/R support broke.
This commit has the local coordinator attach to the job just before it is
launched by the ODLS module. This is a much cleaner solution, and will
eventually allow the SnapC modules to attach to multiple jobs launched
on a single machine.
This commit fixes the C/R regression introduced in r15007.
This commit was SVN r15121.
The following SVN revision numbers were found above:
r15007 --> open-mpi/ompi@85df3bd92f
the multiple threads accessing the OOB/registry asynchronously via the
callbacks. The quickest solution (but definitively not the cleanest) is
to serialize these callbacks in such a way that at any given time
only one thread can execute a callbacks.
This commit was SVN r15086.
through the win dll using multiple threads, we have to insure that
the oob callbacks happens only in a synchronous way or really bad
things happens with the current design (blocking messages from a receive
callback).
This commit was SVN r15069.
The warning was indicative of overly-complex code anyway. So I
removed the "first" bool and simply use a sentinel value in seq_min to
indicate that nothing has changed. Note that this is "correct enough"
for the moment -- more fixes will come in this area with tickets #1049
and/or #1051.
This commit was SVN r15013.
single threaded builds. In its default configuration, all this does
is ensure that there's at least a good chance of threads building
based on non-threaded development (since the variable names will be
checked). There is also code to make sure that a "mutex" is never
"double locked" when using the conditional macro mutex operations.
This is off by default because there are a number of places in both
ORTE and OMPI where this alarm spews mega bytes of errors on a
simple test. So we have some work to do on our path towards
thread support.
Also removed the macro versions of the non-conditional thread locks,
as the only places they were used, the author of the code intended
to use the conditional thread locks. So now you have upper-case
macros for conditional thread locks and lowercase functions for
non-conditional locks. Simple, right? :).
This commit was SVN r15011.
1. generalize orte_rml.xcast to become a general broadcast-like messaging system. Messages can now be sent to any tag on the daemons or processes. Note that any message sent via xcast will be delivered to ALL processes in the specified job - you don't get to pick and choose. At a later date, we will introduce an augmented capability that will use the daemons as relays, but will allow you to send to a specified array of process names.
2. extended orte_rml.xcast so it supports more scalable message routing methodologies. At the moment, we support three: (a) direct, which sends the message directly to all recipients; (b) linear, which sends the message to the local daemon on each node, which then relays it to its own local procs; and (b) binomial, which sends the message via a binomial algo across all the daemons, each of which then relays to its own local procs. The crossover points between the algos are adjustable via MCA param, or you can simply demand that a specific algo be used.
3. orteds no longer exhibit two types of behavior: bootproxy or VM. Orteds now always behave like they are part of a virtual machine - they simply launch a job if mpirun tells them to do so. This is another step towards creating an "orteboot" functionality, but also provided a clean system for supporting message relaying.
Note one major impact of this commit: multiple daemons on a node cannot be supported any longer! Only a single daemon/node is now allowed.
This commit is known to break support for the following environments: POE, Xgrid, Xcpu, Windows. It has been tested on rsh, SLURM, and Bproc. Modifications for TM support have been made but could not be verified due to machine problems at LANL. Modifications for SGE have been made but could not be verified. The developers for the non-verified environments will be separately notified along with suggestions on how to fix the problems.
This commit was SVN r15007.
structures in the system. Instead of using memcmp, use the ns function.
This won't cause a problem as long as all three elements of the name are
ints, but if they have different sizes, alignment and padding rules
can cause memcmp() to compare padding space, which rarely holds a sane
value.
This commit was SVN r14998.
framework. Updated pointers to match current definitions.
* Trimmed some dead wood while I was at it:
* No need for component close function that does nothing
* Use BEGIN/END_C_DECLS
* Use recent MCA param register function
* Ditch MCA param orte_iof_debug (it wasn't used anywhere)
* Use MCA param orte_iof_override properly in the code (i.e., look
up the value once and use the cached value later)
This commit was SVN r14981.
generalized component include/exclude infrastructure. This commit
removes the oob_base_include and oob_base_exclude MCA params because
they have long-since been handled by the "oob" MCA parameter in the
MCA base.
This commit was SVN r14979.
A bunch of fixes from the /tmp/iof-fixes branch that fix up ''some''
(but not ''all'') of the problems that we have seen with iof:
* Reading very large files via stdin redirected to orteun (Sun saw
this)
* Reading a little bit of a large file redirected to orterun's stdin
and then either closing stdin or exiting the process
The Big Change was to make the proxy iof (the one running in non-HNP
orteds) send back a "I'm closing the stream" ACK back to the service
iof. This tells the HNP that there will be nothing more coming from
that peer, and therefore the iof forward should be removed.
Many other minor cleanups/fixes, terminology changes, and
documentation additions are included in this commit as well. However,
there are still some pretty big outstanding issues with IOF that are
not addressed either by #967 or this commit. A few examples:
* IOF was designed to allow multiple subscribers to a single stream.
We're not entirely sure that this works (for one thing, there is
nothing in the ORTE/OMPI code base that uses this functionality).
* There are also resources leaked when processes/jobs exit (per
Ralph's first comment on this ticket).
* There is no feedback to close orterun's stdin when all subscribers
to the corresponding stream have closed stdin.
This commit was SVN r14967.
The following Trac tickets were found above:
Ticket 967 --> https://svn.open-mpi.org/trac/ompi/ticket/967
This correctly repairs the problem by enabling the GPR's "get" function to correctly handle NULL data values.
This commit was SVN r14916.
The following SVN revision numbers were found above:
r14910 --> open-mpi/ompi@0757467d77
The only place where we attempt to store/retrieve attributes is in the RMAPS framework in support of comm_spawn. So this is where things broke down. The fix was simply to say "if the attribute data type is ORTE_UNDEF, then treat it like a boolean with value true". Trivial fix - solves problem.
This commit was SVN r14910.
symbols in them and environ is defined only in the final application
(probably in crt1.o). Apple provides a function for getting at the
environment, so use that instead if it's available.
This commit was SVN r14857.
by using a small hash function before doing the strcmp. The hask key for each
registry entry is computed when it is added to the registry. When we're doing a
query, instead of comparing the 2 strings we first check if the hash key match,
and if they do match then we compare the 2 strings in order to make sure we
eliminate collisions from our answers.
There is some benefit in terms of performance. It's hardly visible for few
processes, but it start showing up when the number of processes increase. In fact
the number of strcmp in the trace file drastically decrease. The main reason it
works well, is because most of the keys start with basically the same chars
(such as orte-blahblah) which transform the strcmp on a loop over few chars.
This commit was SVN r14791.
Rename the oob_tcp_include and oob_tcp_exclude MCA parameters to be
oob_tcp_if_include and oob_tcp_if_exclude (to match the convention
with btl_tcp_if_[in|ex]clude). Keep "hidden" synonyms oob_tcp_include
and oob_tcp_exclude in case anyone is actually using them (and some
users undoubtedly are), but do not have them show up in ompi_info
--param output. Instead, the new "oob_tcp_if_*" names will show up in
ompi_info output.
This commit was SVN r14746.
The primary change that underlies all this is in the OOB. Specifically, the problem in the code until now has been that the OOB attempts to resolve an address when we call the "send" to an unknown recipient. The OOB would then wait forever if that recipient never actually started (and hence, never reported back its OOB contact info). In the case of an orted that failed to start, we would correctly detect that the orted hadn't started, but then we would attempt to order all orteds (including the one that failed to start) to die. This would cause the OOB to "hang" the system.
Unfortunately, revising how the OOB resolves addresses introduced a number of additional problems. Specifically, and most troublesome, was the fact that comm_spawn involved the immediate transmission of the rendezvous point from parent-to-child after the child was spawned. The current code used the OOB address resolution as a "barrier" - basically, the parent would attempt to send the info to the child, and then "hold" there until the child's contact info had arrived (meaning the child had started) and the send could be completed.
Note that this also caused comm_spawn to "hang" the entire system if the child never started... The app-failed-to-start helped improve that behavior - this code provides additional relief.
With this change, the OOB will return an ADDRESSEE_UNKNOWN error if you attempt to send to a recipient whose contact info isn't already in the OOB's hash tables. To resolve comm_spawn issues, we also now force the cross-sharing of connection info between parent and child jobs during spawn.
Finally, to aid in setting triggers to the right values, we introduce the "arith" API for the GPR. This function allows you to atomically change the value in a registry location (either divide, multiply, add, or subtract) by the provided operand. It is equivalent to first fetching the value using a "get", then modifying it, and then putting the result back into the registry via a "put".
This commit was SVN r14711.
To be precise, given this hypothetical launching pattern:
host1: vpids 0, 2, 4, 6
host2: vpids 1, 3, 5, 7
The local_rank for these procs would be:
host1: vpids 0->local_rank 0, v2->lr1, v4->lr2, v6->lr3
host2: vpids 1->local_rank 0, v3->lr1, v5->lr2, v7->lr3
and the number of local procs on each node would be four. If vpid=0 then does a comm_spawn of one process on host1, the values of the parent job would remain unchanged. The local_rank of the child process would be 0 and its num_local_procs would be 1 since it is in a separate jobid.
I have verified this functionality for the rsh case - need to verify that slurm and other cases also get the right values. Some consolidation of common code is probably going to occur in the SDS components to make this simpler and more maintainable in the future.
This commit was SVN r14706.
* Move ipv6comat.h code into opal_config_bottom.h and change into some
more intelligent testing of structures
* Change opal's if interface to use sockaddr instead of sockaddr_storage,
as the RFCs suggest we do
* Move the networking code in opal that isn't directly related to if
detection into net.h
* Add quicky function to get the port out of either a sockaddr_in
or sockaddr_in6, saving a bunch of code in the oob.
* Update TCP oob and btl with new interface
This commit was SVN r14679.
* Require Autoconf 2.60 or higher and remove some cruft
required for AC 2.59 or the AC 2.59 / AC 2.60 mix
* Remove a bunch of now unnecessary AC_SUBST calls
* Use the libtool-provided variables for the -I and
library to use when compiling against ltdl
Fixes trac:1000
This commit was SVN r14652.
The following Trac tickets were found above:
Ticket 1000 --> https://svn.open-mpi.org/trac/ompi/ticket/1000
via the visibility feature that is provided by some compilers.
Per default this feature is disabled, to enable it you need to
configure with --enable-visibility and obviously you need a compiler
with visibility support. Please refer to the wiki for more information.
https://svn.open-mpi.org/trac/ompi/wiki/Visibility
This commit was SVN r14582.
assumptions in the FT restart code for the ORTE layer.
This fixes those problems by having the RML completely shutdown and
restart the OOB framework (instead of just the module as before).
This makes it much easier to manage, and maintainable as the OOB
changes in the future.
The SDS now does communication as part of its startup procedure, so
we need to make sure we restart the RML before the SDS so that it can
communicate properly.
OOB base [close|open] used a static bool to determine if they have
been called previously or not. I needed to expose this boolean so
that I can close() then open() the oob base in the restart procedure.
The functionality has not changed, we just now have the ability to
open/close the framework as many times as we need to as long as we
always call them in that order. (So calling open twice in a row is not allowed
as before, it is only allowed if you open(), close(), then open() again).
Things seem to be working now.
This commit was SVN r14515.
- make opal_sockaddr2str() take a sockaddr_storage instead of a sockaddr_in6
so that it works for IPv4 and IPv6 addresses, and remove a whole bunch
of #ifs in the OOOB code.
- Fix a compiler warning in the TCP BTL due to run-time determined
array size by making it a dynamicly allocated array.
- Fix the unpacking code of IPv4 addresses when using IPv6 support, so
that the address is in the correct location (instead of in an IPv6
structure, use an IPv4 structure). Refs trac:1005.
This commit was SVN r14514.
The following Trac tickets were found above:
Ticket 1005 --> https://svn.open-mpi.org/trac/ompi/ticket/1005
This completes the minor changes required to the PLS components. Basically, there is a small change required to the parameter list of the orted cmd functions. I caught and did it for xcpu and poe, in addition to the components listed in my email - so I think that only leaves xgrid unconverted.
The orted fail-to-start mods will also make changes in the PLS components, but those can be localized so they come in one at a time.
This commit was SVN r14499.
Test for system limits (where known) prior to doing things like fork and pipe since some systems aren't very nice about it when we try to exceed such limits.
This commit was SVN r14494.
There is a binomial algorithm in the code (i.e., the HNP would send to a subset of the orteds, which then relay it on according to the typical log-2 algo), but that has a bug in it so the code won't let you select it even if you tried (and the mca param doesn't show, so you'd *really* have to try).
This also involved a slight change to the oob.xcast API, so propagated that as required.
Note: this has *only* been tested on rsh, SLURM, and Bproc environments (now that it has been transferred to the OMPI trunk, I'll need to re-test it [only done rsh so far]). It should work fine on any environment that uses the ORTE daemons - anywhere else, you are on your own... :-)
Also, correct a mistake where the orte_debug_flag was declared an int, but the mca param was set as a bool. Move the storage for that flag to the orte/runtime/params.c and orte/runtime/params.h files appropriately.
This commit was SVN r14475.
finally brings in functionality that is already on the 1.2 branch, and
was developed and tested in the v1.2ofed branch (and other places).
Short version of new features:
* Support for ibv_fork_init()
* Automatically fill in the openib BTL bandwidth value by
querying the HCA port
* Installdirs functionality
* Fixes to always use -I in the Fortran wrapper compilers (#924)
* Gleb's mpool updates
* Remove some kruft in btl/openib/configure.m4, therefore
fixing the harmless warnings noted in #665
* Bunches of updates to the Linux RPM spec file
I.e., effectively the same thing that r14411 brought to the v1.2
branch.
Also effectively brought in r14432 and r14433 (some fixes on top of
the original r14411 commit to v1.2). Still need to bring in the moral
equivalent of r14445 after this commit (fixes to installdirs).
This commit was SVN r14449.
The following SVN revision numbers were found above:
r14411 --> open-mpi/ompi@83b31314ae
r14432 --> open-mpi/ompi@a48f160595
r14433 --> open-mpi/ompi@68f346d2bc
r14445 --> open-mpi/ompi@13d366b827
Create a sentinel value in the metadata file to clearly indicate
that the sequence number is complete (versus in progress). This
way we do not try to restart from an invalid sequence number
which can lead to badness.
This commit was SVN r14423.
the program it try to spawn is missing.
Description of the problem: When the rsh pls try to spawn a local
process which is missing (such as a removed orted) the orterun
deadlock.
Description of the fix: The forked child deal with finding the
program to be executed. If it fails to find it, then instead of
calling exit (as a normal forked program is expected to do) it
continue the execution using a execution path it was never
expected to use (back in orterun and then main). Bad things
happens as expected. Forcing the child to use exit when it fails
to find the orted (and forcing the child to use exit everywhere
instead of return) correct the logic of the rsh pls and make it
behave as expected.
This commit was SVN r14377.
Per discussions with Brian and Ralph, make a slight correction in
where components are installed. Use $pkglibdir, not $libdir/openmpi,
so that when compiled in the orte trunk, components are installed to
the right directory (because the component search patch is checking
$pkglibdir).
This commit was SVN r14345.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r14289
follow the same behavior as before, the changes just make sure the check is done
in linear time and the memory usage is kept to a minimum.
This commit was SVN r14331.
as the IOF components lack the required finalize function. Instead rely on the
module finalize. Read the comment or more informations.
This commit was SVN r14323.
Collect the base 'orted' command line into a base function since most of the
PLS components were duplicating this code. Add AMCA parameter command line
component to the base set.
Add Aggregate MCA parameter support to the following PLS components:
- gridengine
- process
- slurm
- poe
- tm
Improve support for 'rsh' component.
Did/could not support the following components:
- bproc
- proxy
- xcpu
- cnos
- xgrid
The above components had peculiar needs that made it non-trivial to add an
option. The authors of these components need to help in supporting this
new option.
I was only able to test the SLURM and RSH components due to system availability.
The others should work without problem.
This commit was SVN r14284.
The following Trac tickets were found above:
Ticket 976 --> https://svn.open-mpi.org/trac/ompi/ticket/976
- Make it so that all the GPR pointer arrays are allocated initially at 16 elements instead of 512. This saves (on a 64 bit machine) approximately 4*(# procs + # nodes) KB.
- Fix up the segment prealloc function so that preallocating an existant segment is not an error, and make the areas where we do large inserts use it.
Fix the orte_pointer_array to efficiently implement setting its size. Before we just realloced the array one block at a time until the desired size was reached. Now we resize it all in one realloc.
This commit was SVN r14264.
* Remove the connect() timeout code, as it had some nasty race conditions
when connections were established as the trigger was firing. A better
solution has been found for the cluster where this was needed, so just
removing it was easiest.
* When a fatal error (too many connection failures) occurs, set an error
on messages in the queue even if there isn't an active message. The
first message to any peer will be queued without being active (and
so will all subsequent messages until the connection is established),
and the orteds will hang until that first message completes. So if
an orted can never contact it's peer, it will never exit and just sit
waiting for that message to complete.
* Cover an interesting RST condition in the connect code. A connection
can complete the three-way handshake, the connector can even send
some data, but the server side will drop the connection because it
can't move it from the half-connected to fully-connected state because
of space shortage in the listen backlog queue. This causes a RST to
be received first time that recv() is called, which will be when waiting
for the remote side of the OOB ack. In this case, transition the
connection back into a CLOSED state and try to connect again.
* Add levels of debugging, rather than all or nothing, each building on
the previous level. 0 (default) is hard errors. 1 is connection
error debugging info. 2 is all connection info. 3 is more state
info. 4 includes all message info.
* Add some hopefully useful comments
This commit was SVN r14261.
deal with the PLS RSH. Remove support for unknown user (i.e. if the user is
not known by the system, then it shouldn't be allowed to spawn anything).
This commit was SVN r14232.
OPAL_FREE_LIST_WAIT/RETURN will not use locks in a non-threaded build
conditionaly use locks if non-threaded around the OPAL_FREE_LIST_WAIT/RETURN
seems to fix the issue
Tested at 4K processes and seems to work..
This commit was SVN r14135.
listen thread, but we're not the HNP. This is better than not starting up
any listen mode, which is what we were doing before :/
This commit was SVN r14133.
__opal_attribute_nonnull__, __opal_attribute_warn_unused_result__,
__opal_attribute_malloc__, __opal_attribute_sentinel__ and
__opal_attribute_format__
This commit was SVN r14078.
This merge adds Checkpoint/Restart support to Open MPI. The initial
frameworks and components support a LAM/MPI-like implementation.
This commit follows the risk assessment presented to the Open MPI core
development group on Feb. 22, 2007.
This commit closes trac:158
More details to follow.
This commit was SVN r14051.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r13912
The following Trac tickets were found above:
Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158
* Do not empty the list of in-flight frags during _close(); the OOB
callback will still occur (_send_cb()) and try to remove the frag
from the list, which will then result in an assert failure (debug
builds).
* Add one more fix for a possible problem -- add an extra RETAIN /
RELEASE pair on the endpoint to ensure that it is not actually
freed before all in-flight frags have drained.
This commit was SVN r13953.
The following Trac tickets were found above:
Ticket 921 --> https://svn.open-mpi.org/trac/ompi/ticket/921
- mca_base_param_file_prefix
(Default: NULL)
This is the fullname of the "-am" mpirun option. Used to specify a ':'
separated list of AMCA parameter set files.
- mca_base_param_file_path
(Default: $SYSCONFDIR/amca-param-sets/:$CWD)
The path to search for AMCA files with relative paths. A warning will be
printed if the AMCA file cannot be found.
* Added a new function "mca_base_param_recache_files" the re-reads the file
configurations. This is used internally to help bootstrap the MCA system.
* Added a new orterun/mpirun command line option '-am' that aliases for the
mca_base_param_file_prefix MCA parameter
* Exposed the opal_path_access function as it is generally useful in other
places in the code.
* New function "opal_cmd_line_make_opt_mca" which will allow you to append a
new command line option with MCA parameter identifiers to set at the same
time. Previously this could only be done at command line declaration time.
* Added a new directory under the $pkgdatadir named "amca-param-sets" where all
the 'shipped with' Open MPI AMCA parameter sets are placed. This is the first
place to search for AMCA sets with relative paths.
* An example.conf AMCA parameter set file is located in
contrib/amca-param-sets/.
* Jeff Squyres contributed an OpenIB AMCA set for benchmarking.
Note: You will need to autogen with this commit as it adds a configure param.
Sorry :(
This commit was SVN r13867.
its bigger than the timeout for the connect() call, just don't register
the handler by default and fall back to connect() timing out. Should give
much happier performance on big clusters.
This commit was SVN r13639.
Adjust the RMAPS mapped_node object to propagate the required launch_id info now included in the ras_node object. This provides support for those few systems that don't use nodename to launch, but instead want some id (typically an index into the array of allocated nodes). This value gets set for each node in the RAS - the RMAPS just propagates it for easy launch.
This commit was SVN r13581.
This change utilizes the new num_processors function. I also left the mods made to ompi_mpi_init and the bug fix for the default value of mpi_yield_when_idle. Note that the mods to mpi_init will not really take effect as the mca param will now *always* be set (either by user or odls). We will need those mods later, so no point in removing them now.
This commit was SVN r13519.
1. if the user has specified sched_yield, we simply do what we are told
2. if they didn't specify anything, try to get the number of processors on this node. Note that we already now get the number of local procs in our job that are sharing this node - that now comes in through the proc callback and is stored in the ompi_proc_t structures.
3. if we can get the number of processors, compare that to the number of local procs from my job that are sharing my node. If the number of local procs exceeds the number of processors, then set sched_yield to true. If not, then be a hog and set sched_yield to false
4. if we can't get the number of processors, default to conservative behavior and set sched_yield to true.
Note that I have not yet dealt with the need to dynamically adjust this setting as more processes are added via comm_spawn. So far, we are *only* looking within our own job. Given that we have now moved this logic to mpi_init (and away from the orteds), it isn't yet clear to me how a process will be informed about the number of procs in *other* jobs that are also sharing this node.
Something to continue to ponder.
This commit was SVN r13430.
Tested this functionality quite a bit more and made some fixes:
* Print far fewer help messages
* Fix one additional deadlock upon error
* Change some ORTE_LOG messages to silent (because they're not
errors)
* Some code got re-indented, sorry...
Discussed and reviewed with Ralph.
This commit was SVN r13375.
The following Trac tickets were found above:
Ticket 726 --> https://svn.open-mpi.org/trac/ompi/ticket/726
and George on these refinements):
* Rename the static OBJ initializer macro to be
OPAL_OBJ_STATIC_INIT(class)
* Ensure that all static OBJ initializations get a refcount of 1
(doesn't ''really'' matter, since they're static, it should never
get to the point where the OBJ is DESTRUCTed, but more correct
nonetheless)
* Add a "magic number" to the OBJ when compiling with debug support.
The magic number does some rudimentary support to ensure that
you're operating on a valid OBJ (and fails an assertion if you're
not). Check to ensure that the memory contains the magic number
when performing actions of OBJ's. Also remove the magic number
when DESTRUCTing OBJs, so that if, for example, an OBJ is
DESTRUCTed more than once, we'll fail the magic number assert.
This commit was SVN r13338.
The following SVN revision numbers were found above:
r13227 --> open-mpi/ompi@96030de97b
r13228 --> open-mpi/ompi@c2e9075d29
then we modify the argv, forcing the reallocation of the array. With luck
the saved pointer still have a meaning ... without execve return with error
14 (EFAULT).
This commit was SVN r13321.
The following SVN revision numbers were found above:
r12059 --> open-mpi/ompi@ae79894bad
1. add a "cancel_operation" API to the pls components that allows orterun to demand that an orted operation (e.g., terminate_job) be immediately cancelled and abandoned.
2. changes the pls orted commands from blocking to non-blocking. This allows us to interrupt those operations should an orted be non-responsive. The change also adds an orte_abort_timeout that limits how long orterun will automatically wait for the orteds to respond - if the terminate command, for example, doesn't see orted response within that time, then we printout an appropriate error message and just give up.
3. modifies orterun to allow multiple ctrl-c's to simply abort the program even if the orteds have not responded
4. does some cleanup on the orte-level mca params so that their implementation looks a lot more like that of ompi - makes it easier to maintain. This change also includes the definition of an orte_abort_timeout struct and associated MCA param (can't have too many!) so you can set the time after which orterun gives up on waiting for orteds to respond
This needs more testing before migrating to 1.2.
This commit was SVN r13304.
This fix includes two parts: (a) we now initialize the keyval pointer locations to NULL after the malloc, and (b) we now OBJ_NEW the keyvals prior to storing info in them.
BTW, in case anyone reads this and wonders why we don't just OBJ_NEW the keyvals in create_value, the reason is simply that some places in the code use static keyvals and simply assign those addresses into the value object's array. So not everyone wants to OBJ_NEW keyvals - by not forcing it here in create_value, we give the user the flexibility to do whatever they want.
This commit was SVN r13300.
- Make it so the SLURM ras can handle different nodelist configurations
- Some code cleanup and better/more informative error messages and error handling
This commit was SVN r13271.
The following Trac tickets were found above:
Ticket 801 --> https://svn.open-mpi.org/trac/ompi/ticket/801
then exec the "srun..." from there. But somewhere along the line, we
switched to having a copy of environ and modifying that. It looks
like we forgot to update the stuff for --prefix behavior. So this
commit fixes the setenv's for PATH and LD_LIBRARY_PATH to modify the
environ copy (not environ itself) so that the values properly get
passed down to the srun environment via execve().
This restores --prefix behavior in the SLURM pls.
This commit was SVN r13239.
function prototype lives. Without this, we get compile
warnings. In addition, for 64-bit Solaris, we get a
segmentation fault from orterun without this include.
This commit was SVN r13065.
the connect() timeout, so that we'll use that rather than our own timeout by
defualt. There timeout was set low for Big Red, but causes problems for very
large clusters, as there's no way to wire them up in 10 seconds most of the
time.
This commit was SVN r13062.
components that use configure.m4 for configuration or are always built.
The macro has not been needed since moving to configure types other than
configure.stub
Fixes trac:590
This commit was SVN r13031.
The following Trac tickets were found above:
Ticket 590 --> https://svn.open-mpi.org/trac/ompi/ticket/590
I know it's just a technicality, but it is time to address such things rather than just letting them continue to propagate. :-)
This commit was SVN r12954.
This has now been corrected. The singleton startup will dutifully call the mapper framework so that the proper data storage locations get initialized. Unfortunately, we then had to instruct the RMAPS not to allocate a vpid range for this job - otherwise, it would make a mistake and think there were two processes in it. Hence, a change was required to RMAPS to tell it "map this job, but don't allocate a vpid range for it".
This change will need to migrate across to 1.2 after it "soaks" the appropriate time.
This commit was SVN r12952.
is allocated on a per comm_world instance, with the lowest rank
in comm_world on the given host creating and initializing the file,
and then notifying the remaining files via the OOB.
Reviewed: Ralph Castain, Brian Barrett
Addressing ticket #674.
This commit was SVN r12949.
rc (which is -1 or 4 if we hit this case) resulted in an odd error that a
signal killed the proc (instead of a startup error, as is reality).
Instead, use the W_EXITCODE macro (if available) to build up an exit
code that has an error code for exit status, but does not make it look
like the process died from a signal
This commit was SVN r12890.
1. no -np provided - put one proc/node across all allocated nodes
2. -np N provided, N > #nodes - we print a pretty error message and exit
3. -np N provided, N <= #nodes - put one proc/node across N nodes
I also added a new orte constant (ORTE_ERR_SILENT) that allows us to pass up the chain that an error was encountered, but NOT print ORTE_ERROR_LOG messages. This is intended to be used for cases where the error we encounter is NOT an orte error, but rather is one associated with incorrect user input (e.g., the preceding case 2). In such cases, there is no point in printing an ORTE_ERROR_LOG chain of messages as it isn't an orte error.
This commit was SVN r12821.
I found only two places that were looking at the tokens:
1. the odls - we used the tokens to separately process the globals container data from everything else. In this case, I left the subscription that returned the globals data alone, but "stripped" the subscription that returned the launch data for the procs. These subscriptions have nothing to do with the xcast message.
2. the pml_base_modex - the callback function was getting process names from the returned tokens. Actually, this function was doing a very bad thing - it was assuming that the first token returned was *always* the process name. This is currently true, but is one of those assumptions that someone could have easily changed - and suddenly found the system inexplicably failing. I modified the function to (a) get the name sent back to us, (b) "stripped" the value structures of tokens and segment strings, and (c) correctly obtained process names from the returned values. I also reindented the heck out of the code so it was legible (at least, to my old eyes).
This commit was SVN r12813.
Obviously, people like bproc will have to get the app_num via another avenue...but that's a problem for another day. Several options are easily available.
This commit was SVN r12788.
1. implement and enable the non-described buffer operations. I will send out a more detailed explanation separately. However, this mode of operation (which is now the default) significantly reduces message size during startup. If you want the described buffers, set the mca param "-mca dss_describe_buffer 1".
2. revise the xcast system to support both linear and binomial tree broadcast methods. Since we are seeing scenarios where the binomiall tree can cause problems, I have made the linear method the default. To run with the binomial tree, set the mca param "-mca oob_xcast_mode binomial".
3. add some detailed timing reports to the xcast operation. These are enabled via "-mca oob_xcast_timing 1".
4. add some more unit tests for the dss and gpr (focused on support for the non-described buffer)
This commit was SVN r12722.
Ralph identified the problem, I tracked down ''where'' the fd was
being closed, and Brian figured out ''why'' (and the fix).
What was happening is that a remote process was closing its
stdout/stderr and therefore sending a 0-byte IOF message to mpirun.
mpirun, in turn, closed the iof endpoint associated with that stream
(i.e., stdout/stderr). IOF does this to handle the case where
mpirun's stdin is closed -- this therefore causes the stdin on all the
ORTE-started processes to have their stdin's closed as well.
So the workaround here is to check that if we get a 0-byte IOF message
on a sink (indicating a remote closure), and if that sink is the
special stdout or stderr stream, don't actually close anything in the
local process.
This commit was SVN r12691.
The following Trac tickets were found above:
Ticket 635 --> https://svn.open-mpi.org/trac/ompi/ticket/635
because they are in ORTE, not OMPI. Also, remove the ORTE_PROCESS_NAME macros
in iof base as they are duplicates of the ones that were in ns_types, which
meant that bad things happened if you changed what an orte_process_name_t
looked like.
This commit was SVN r12646.
We were burned again by the fact that the bproc state monitor creates entries on the node segment for *all* the nodes in the cluster when it is opened during orte_init. As a result, the bjs allocator was never being called, and the system merrily assumed that *all* nodes in the cluster had been allocated to it.
To fix this, I removed a test that had been inserted into the allocation procedure that checked for a non-zero node segment. This was an old artifact - the RAS components already know that they are not to overwrite any existing node segment entries (at least, bproc does - I will check the others. For now, I just want to save the bproc fix on this machine).
This commit was SVN r12640.