Add --enable-orterun-prefix-by-default (and a synonym:
--enable-mpirun-prefix-by-default) to make orterun always behave as if
"--prefix $prefix" was given on the command line (where $prefix is the
value given to the --prefix option to configure). This prevents many
rsh/ssh users from needing to modify their shell startup files to set
the LD_LIBRARY_PATH for Open MPI (they will still need to set PATH or
otherwise find the OMPI executables to mpicc/mpirun/etc. their MPI
applications).
Also added --noprefix option to orterun to disable this behavior.
Finally, note that even if --enable-orterun-prefix-by-default is
specified, if the user specifies --prefix or /path/to/mpirun, these
options will override the default value of the prefix ($prefix).
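The precedence described above can be sketched in C as follows. This is only an illustration, not the orterun code: resolve_prefix and its parameters are hypothetical names, and checking --prefix before /path/to/mpirun is an assumption (the text only says that both override the configured default).

```c
#include <stddef.h>

/*
 * Hypothetical helper: choose the prefix to use for remote launches.
 *   cli_prefix     - value given to --prefix on the command line, or NULL
 *   argv0_prefix   - prefix derived from an absolute /path/to/mpirun, or NULL
 *   default_prefix - configure-time $prefix when the build used
 *                    --enable-orterun-prefix-by-default, NULL otherwise
 *   noprefix       - nonzero if --noprefix was given
 * Returns NULL when no prefix should be applied.
 */
const char *resolve_prefix(const char *cli_prefix,
                           const char *argv0_prefix,
                           const char *default_prefix,
                           int noprefix)
{
    if (cli_prefix != NULL) {
        return cli_prefix;      /* assumption: --prefix is checked first */
    }
    if (argv0_prefix != NULL) {
        return argv0_prefix;    /* /path/to/mpirun also beats the default */
    }
    if (noprefix) {
        return NULL;            /* --noprefix disables only the default */
    }
    return default_prefix;      /* may itself be NULL */
}
```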
This commit was SVN r11669.
The following Trac tickets were found above:
Ticket 377 --> https://svn.open-mpi.org/trac/ompi/ticket/377
Allow the POE RAS to be compiled for Linux as well as AIX.
The POE RAS is really a LoadLeveler RAS, and IU now has
a cluster that uses LoadLeveler in a Linux environment (BigRed).
This seems to be the only thing we need to do so far to run
Open MPI on BigRed. Yay :)
This commit was SVN r11600.
set to 1 or 0 instead of the user defined number or default (128).
This caused the PLS to deadlock when using '--debug-daemons' with
more than 2 processes. :(
svn blame says that it was broken in r11347
It is *not* a problem on v1.1 or v1.2 branches.
Bug spotted by Tim Mattox and myself.
This commit was SVN r11575.
The following SVN revision numbers were found above:
r11347 --> open-mpi/ompi@f52c10d18e
- everything statically built (dynamically opened).
- OPAL, ORTE and OMPI static libraries and all the components
as dynamic files(DLL).
- everything as dynamic files (DLL).
This commit was SVN r11461.
create a process component which uses CreateProcess to spawn the child.
Special care should be taken in order to correctly redirect the stdin,
stdout and stderr of the child process.
This commit was SVN r11405.
- Remove extra NULL argument from rsh module.
This commit was SVN r11377.
The following SVN revision numbers were found above:
r11347 --> open-mpi/ompi@f52c10d18e
- use the OPAL functions for PATH and environment variables
- make all headers C++ friendly
- no unnamed structures
- no implicit cast.
Plus a full implementation for the orte_wait functions.
This commit was SVN r11347.
different macros, one for each project. Therefore, now we have OPAL_DECLSPEC,
ORTE_DECLSPEC and OMPI_DECLSPEC. Please use them based on the sub-project.
This commit was SVN r11270.
Other changes:
1. Remove the old xcpu components as they are not functional.
2. Fix a "bug" in orterun whereby we called dump_aborted_procs even when we normally terminated. There is still some kind of bug in this procedure, however, as we appear to be calling the orterun job_state_callback function every time a process terminates (instead of only once when they have all terminated). I'll continue digging into that one.
This will require an autogen/configure, I'm afraid.
This commit was SVN r11228.
Clean up the remainder of the size_t references in the runtime itself. Convert to orte_std_cntr_t wherever it makes sense (only avoid those places where the actual memory size is referenced).
Remove the obsolete oob barrier function (we actually obsoleted it a long time ago - just never bothered to clean it up).
I have done my best to go through all the components and catch everything, even if I couldn't test compile them since I wasn't on that type of system. Still, I cannot guarantee that problems won't show up when you test this on specific systems. Usually, these will just show as "warning: comparison between signed and unsigned" notes which are easily fixed (just change a size_t to orte_std_cntr_t).
In some places, people didn't use size_t, but instead used some other variant (e.g., I found several places with uint32_t). I tried to catch all of them, but...
Once we get all the instances caught and fixed, this should once and for all resolve many of the heterogeneity problems.
This commit was SVN r11204.
Note that some compile warnings are generated here because of the direct inclusion of an orte include file in the program. Not entirely sure why that is happening (it is a relatively new phenomenon), but it doesn't cause any problems in terms of operation.
This commit was SVN r11175.
Fixed a few very minor compiler complaints in the pls_gridengine_module.c file. ISO C is less forgiving about where variables get declared.
This commit was SVN r11156.
- indent / whitespace cleanup
- don't set --daemon-debug when pls debug is given, as it seems to make
the daemons abort.
This commit was SVN r11113.
The following SVN revision numbers were found above:
r11109 --> open-mpi/ompi@da7df6d257
1. Introduces a flag for the type of buffer that now allows a user to either have a fully described or a completely non-described buffer. In the latter case, no data type descriptions are included in the buffer. This obviously limits what we can do for debugging purposes, but the intent here was to provide an optimized communications capability for those wanting it.
Note that individual buffers can be designated for either type using the orte_dss.set_buffer_type command. In other words, the buffer type can be set dynamically - it isn't a configuration setting at all. The type will default to fully described. A buffer MUST be empty to set its type - this is checked by the set_buffer_type command, and you will receive an error if you violate that rule.
IMPORTANT NOTE: ORTE 1.x actually will NOT work with non-described buffers. This capability should therefore NOT be used until we tell you it is okay. For now, it is here simply so we can begin bringing over parts of ORTE 2.0. The problem is that ORTE 1.x depends upon the transmission of non-hard-cast data types such as size_t. These "soft" types currently utilize a "peek" function to see their actual type in the buffer - obviously, without description, the system has no idea how to unpack these "soft" types. We will deal with this later - for now, please don't use the non-described buffer option.
2. Introduces the orte_std_cntr_t type. This will become the replacement for the size_t's used throughout ORTE 1.x. At the moment, it is actually typedef'd to size_t for backward compatibility.
3. Introduces the orte_dss.arith API that supports arbitrary arithmetic functions on numeric data types. Calling the function with any other data type will generate an error.
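The "a buffer MUST be empty to set its type" rule from item 1 can be sketched as below. This is a minimal illustration and not the orte_dss code: the struct, the enum, the error codes and the sketch_* function names are all hypothetical, and the real orte_dss.set_buffer_type signature may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two buffer types described above. */
typedef enum {
    SKETCH_BUF_FULLY_DESCRIBED = 0,   /* the default */
    SKETCH_BUF_NON_DESCRIBED   = 1
} sketch_buf_type_t;

typedef struct {
    sketch_buf_type_t type;
    size_t bytes_used;                /* 0 means the buffer is empty */
} sketch_buffer_t;

#define SKETCH_SUCCESS        0
#define SKETCH_ERR_NOT_EMPTY (-1)

/* New buffers default to fully described. */
void sketch_buffer_init(sketch_buffer_t *buf)
{
    assert(buf != NULL);
    buf->type = SKETCH_BUF_FULLY_DESCRIBED;
    buf->bytes_used = 0;
}

/* The type may be changed at run time, but only while the buffer is empty. */
int sketch_set_buffer_type(sketch_buffer_t *buf, sketch_buf_type_t type)
{
    assert(buf != NULL);
    if (buf->bytes_used != 0) {
        return SKETCH_ERR_NOT_EMPTY;
    }
    buf->type = type;
    return SKETCH_SUCCESS;
}
```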
This commit was SVN r11075.
correctly with MPI_Comm_spawn.
The problem with MPI_Comm_spawn was that the 'parent' process was
rmgr.create'ing and then rmgr.launch'ing the children via the rmgr proxy
component. The HNP saw these commands and processed them normally, but
since we never went through the HNP's rmgr (urm component) spawn()
logic the triggers and key/value pairs were never created. So the
children were launched correctly, but since the HNP did not
have any triggers set up, it never triggered the xcast for the
children to finish orte_init().
This fix puts the trigger and key/value pair initialization in
rmgr_urm_spawn() for the 'mpirun a.out' case, *and* in the
rmgr_base_unpack routine that deals with the creation of the
job for the child as requested by the proxy component. This
will allow the triggers to be registered for the proxy's request
which only happens during MPI_Comm_spawn*
Small change for a lot of debugging. Note that this reverts r11037
to its previous version, and adds a newline to handle the spawn
cases.
This commit was SVN r11046.
The following SVN revision numbers were found above:
r11037 --> open-mpi/ompi@5813fb7d2a
Reverting this file (changeset from commit r10708) to its previous
version fixes the problem.
This should be moved to the v1.1 branch where it is also broken.
This commit was SVN r11037.
The following SVN revision numbers were found above:
r10708 --> open-mpi/ompi@febc143d8c
r10841, so revert it (and its fixes) out. Will bring back once cleaned up from
the code used in the tbird experiment
This commit was SVN r10991.
The following SVN revision numbers were found above:
r10841 --> open-mpi/ompi@dfa1221c3b
Added another system-level test function for ORTE that just spins until terminated by a ctrl-c signal.
Modified orterun - added a couple of newlines to the output when abnormally terminating so the prompt always is on a new line.
This commit was SVN r10866.
handler before the write() and de-register it afterwards. Determine
if the write() succeeded or failed by the return of write().
This commit was SVN r10858.
than $(LN_S). This causes problems with Windows and probably
elsewhere (re: #200). So use a slightly different trick to get the
right header selected for the MEMCPY and TIMER components.
* Using the same trick used to solve the AC_CONFIG_LINKS problem,
stop using a separate header file for direct calling in the
PML and MTL. This lets me remove some icky code in ompi_mca.m4
that was more fragile than I really liked.
This commit was SVN r10841.
keep the resulting tm_event_t that is generated because the back-end
TM library actually caches a bunch of stuff on it for internal
processing, and doesn't let go of it until tm_poll().
tm_event_t's are similar to (but slightly different than)
MPI_Requests: you can't do a million MPI_Isend()'s on a single
MPI_Request -- a) you need an array of MPI_Request's to fill and b)
you need to keep them around until all the requests have completed.
This commit was SVN r10820.
Jeff: this needs to be back-patched to our supported prior releases. I'll try to verify how far back we need to go - my initial guess is probably all of them.
This commit was SVN r10801.
- change -no_oversubscribe to -nooversubscribe (to be similar to
-nolocal)
- Added text to orterun.1 describing slots and -nooversubscribe
Still need to add text about "mpirun a.out" functionality, and RHC
wants to make some minor edits, so committing for synchronization.
This commit was SVN r10800.
Since Jeff and I are going to a branch for T-bird, we have restored the trunk to its prior state to avoid any possibility of disturbing it.
This commit was SVN r10774.
Please report any abnormal behavior during launch, though, as we would like to understand what (if any) impact is seen. I couldn't see any on small jobs (the modulo functions render this number down pretty low).
This commit was SVN r10763.
- orte-clean.c : check to see if the base session directory is empty
and delete it if it is.
- orte_universe_exists.c : Fix a downstream problem resulting from
George's r10718 commit. Don't use the 'fulldirpath' since
that is no longer guaranteed to be the absolute path
to the session directory. Construct this value outside of that
function from the prefix and frontend vars.
This commit was SVN r10741.
The following SVN revision numbers were found above:
r10718 --> open-mpi/ompi@47eef2e002
so that it does not return an error when no universe is passed to it.
Also put back in the 'Slots In Use' column as it is now working properly
per Ralph's recent ras commits. Still not sure what 'Slots Alloc' is meant
to represent, so left that as #if 0'd out for the moment.
This commit was SVN r10739.
The following SVN revision numbers were found above:
r10718 --> open-mpi/ompi@47eef2e002
Update the help text to report errors when not following that rule.
Also updated the RMAPS help text to reflect the reorganization of some of the round-robin code into the base.
The new functionality has been tested under Mac OS-X and on Odin using an MPI program. Both byslot and bynode mapping have been checked and verified. Operational support for other systems needs to be verified - I respectfully request people's help in doing so.
This commit was SVN r10708.
1. Modifies the RAS framework so it correctly stores and retrieves the actual slots in use, not just those that were allocated. Although the RAS node structure had storage for the number of slots in use, it turned out that the base function for storing and retrieving that information ignored what was in the field and simply set it equal to the number of slots allocated. This has now been fixed.
2. Modified the RMAPS framework so it updates the registry with the actual number of slots used by the mapping. Note that daemons are still NOT counted in this process as daemons are NOT mapped at this time. This will be fixed in 2.0, but will not be addressed in 1.x.
3. Added a new MCA parameter "rmaps_base_no_oversubscribe" that tells the system not to oversubscribe nodes even if the underlying environment permits it. The default is to oversubscribe if needed and the underlying environment permits it. I'm sure someone may argue "why would a user do that?", but it turns out that (looking ahead to dynamic resource reservations) sometimes users won't know how many nodes or slots they've been given in advance - this just allows them to say "hey, I'd rather not run if I didn't get enough".
4. Reorganizes the RMAPS framework to more easily support multiple components. A lot of the logic in the round_robin mapper was very valuable to any component - this has been moved to the base so others can take advantage of it.
5. Added a new test program "hello_nodename" - just does "hello_world" but also prints out the name of the node it is on.
6. Made the orte_ras_node_t object a full ORTE data type so it can more easily be copied, packed, etc. This proved helpful for the RMAPS code reorganization and might be of use elsewhere too.
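The oversubscription decision described in item 3 can be sketched as a small predicate. This is a hedged illustration only: sketch_mapping_allowed and its arguments are hypothetical and do not correspond to actual RMAPS functions; only the parameter name rmaps_base_no_oversubscribe comes from the text above.

```c
#include <assert.h>

/*
 * Hypothetical predicate: may the job be mapped at all?
 *   slots_available           - slots allocated to the job
 *   procs_requested           - processes the user asked for
 *   env_permits_oversubscribe - nonzero if the environment allows it
 *   no_oversubscribe          - value of rmaps_base_no_oversubscribe
 * Returns 1 if mapping may proceed, 0 if the job should refuse to run.
 */
int sketch_mapping_allowed(int slots_available,
                           int procs_requested,
                           int env_permits_oversubscribe,
                           int no_oversubscribe)
{
    assert(slots_available >= 0 && procs_requested >= 0);

    if (procs_requested <= slots_available) {
        return 1;   /* enough slots, nothing to decide */
    }
    if (no_oversubscribe) {
        return 0;   /* user said: rather not run than oversubscribe */
    }
    /* Default: oversubscribe if needed and the environment permits it. */
    return env_permits_oversubscribe ? 1 : 0;
}
```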
This commit was SVN r10697.
using a pty for everything, which drops all buffered data on the floor when
close() is called on the daemon side, meaning EOF has some issues. Instead,
do the same thing we do for other starters that use the fork() pls -- use
a pipe/fifo for stdin and stderr and a pty for stdout. This is good enough
for what we need and avoids most of the issues with ptys.
This commit was SVN r10692.
Basically, the problem was that the allocator was grabbing everything on the cluster for which the user had access privilege. Thus, if a user had two sessions operable, each with its own allocation, mpirun in each session would grab both sets of nodes and use them. Not very polite.
This commit was SVN r10683.
* num_children should really be an int instead of size_t,
since 'size_t' is unsigned and num_children can (in rare cases)
drop below 0, and we don't want it to wrap around to SIZE_MAX or some
such.
* I figured out that this problem only happened to me because I use
the pls_fork_reap_timeout MCA parameter and thus the only time that
the code in pls_fork_module.c to waitpid is executed is if this is
not set to 0 (I had it set to 1 to give my procs time to exit). I
adjusted the loop from while{...} to do{...}while; so that it is
executed at least once for consistency.
* de-register the SIGCHLD callback for the pid before we attempt
to kill it, so that we don't leave the door open for both the
waitpids (the one in the callback, and the one in this function)
to race to see who can wait on the child.
* Move the 'thread release' to outside the for loop for a bit of an
optimization, and always set the value to 0 since we want to
finish after this function.
* Added a help message for the case when we can't send a kill()
signal to the process. Should never happen, but all is possible
in the wild wild west of HPC.
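Two of the points above (an unsigned counter wrapping below zero, and a while loop that never runs when the timeout is 0) can be shown with a small standalone sketch. The function names are hypothetical and the loop bodies only count iterations; they stand in for the waitpid attempts and are not the pls_fork_module.c code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* An unsigned counter cannot go negative: 0 - 1 wraps to SIZE_MAX. */
size_t decrement_unsigned(size_t n)
{
    return n - 1;
}

/* A signed counter goes to -1, which a "<= 0" test still catches. */
int decrement_signed(int n)
{
    return n - 1;
}

/* while: the body is skipped entirely when timeout is 0. */
int polls_with_while(int timeout)
{
    int polls = 0, elapsed = 0;
    assert(timeout >= 0);
    while (elapsed < timeout) {
        ++polls;        /* stands in for one waitpid() attempt */
        ++elapsed;
    }
    return polls;
}

/* do/while: the body always runs at least once, even with timeout 0. */
int polls_with_do_while(int timeout)
{
    int polls = 0, elapsed = 0;
    assert(timeout >= 0);
    do {
        ++polls;        /* stands in for one waitpid() attempt */
        ++elapsed;
    } while (elapsed < timeout);
    return polls;
}
```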
This commit was SVN r10666.
When we force an application to terminate (via CTRL-C to mpirun)
we send an out-of-band message to the orted to reap its children.
The fork PLS was doing an internal waitpid but never releasing or
updating the information and signaling the condition variable. So
the fork PLS callback for SIGCHLD registered with the event library
and this waitpid are in a bit of a race to 'waitpid' for the children.
Since the PLS callback was the only one that handled the signal properly
when it 'won' then things were great -- as in the normal termination case.
But when it 'lost' -- as in the abnormal termination case -- the orted
never received the proper signal that its children had gone away.
We want to preserve the internal fork PLS callback since it allows
for a timeout while waiting for the child, which the event library
won't do.
This allows both to exist, and behave properly.
This was introduced in r9068.
The ticket is still open since the orteds still hang in other
situations. This is a fix for one of the causes.
This commit was SVN r10662.
The following SVN revision numbers were found above:
r9068 --> open-mpi/ompi@c2c2daa966
/tmp/tm-merge). Validated by RHC. Summary:
- Add --nolocal (and -nolocal) options to orterun
- Make some scalability improvements to the tm pls
This commit was SVN r10651.
with an error status (< 0) then the req buffer is NULL. Put checks around the
OBJ_RELEASE(req) calls so that we don't try to release NULL :/
This commit was SVN r10641.
After seeing the ugliness that is removing directories in the
codebase, I decided to push this down into OPAL by extending
opal/os_create_dirpath.(c|h) to contain some more functionality.
In this process I renamed 'os_create_dirpath' to 'os_dirpath' since it
is a bit more general now.
Added a few functions to:
- check if a directory is empty
- check to see if the access permissions are set correctly
- destroy the directory at the end of the dirpath
- By using a caller callback function (a la Perl, I believe)
for every file, the caller can have fine grained control over
whether a specific file is deleted or not.
This simplifies things a bit for orte_session_dir_(finalize|cleanup)
as it should no longer contain any of this functionality, but uses
these functions to do the work.
From the external perspective nothing has changed, from the
developer point of view we have some cleaner, more generic code.
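The callback idea described above can be sketched with plain POSIX calls. This is a simplified, non-recursive illustration and not the opal os_dirpath code: every sketch_* name is hypothetical, and the real functions may have different signatures and semantics.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Caller-supplied policy: return nonzero if `name` inside `dir` may be deleted. */
typedef int (*sketch_delete_cb_t)(const char *dir, const char *name);

static int is_dot_entry(const char *name)
{
    return strcmp(name, ".") == 0 || strcmp(name, "..") == 0;
}

/* Returns 1 if empty, 0 if not empty, -1 if the directory cannot be opened. */
int sketch_dir_is_empty(const char *dir)
{
    DIR *dp = opendir(dir);
    struct dirent *ent;
    int empty = 1;

    if (dp == NULL) return -1;
    while ((ent = readdir(dp)) != NULL) {
        if (!is_dot_entry(ent->d_name)) { empty = 0; break; }
    }
    closedir(dp);
    return empty;
}

/*
 * Delete every file the callback approves (a NULL callback approves all),
 * then remove the directory itself if nothing is left.
 * Returns 0 if the directory was removed, 1 if files remain, -1 on error.
 */
int sketch_dir_destroy(const char *dir, sketch_delete_cb_t may_delete)
{
    DIR *dp = opendir(dir);
    struct dirent *ent;
    int rc = 0, empty;

    if (dp == NULL) return -1;
    while (rc == 0 && (ent = readdir(dp)) != NULL) {
        size_t len;
        char *path;

        if (is_dot_entry(ent->d_name)) continue;
        if (may_delete != NULL && !may_delete(dir, ent->d_name)) continue;

        len = strlen(dir) + strlen(ent->d_name) + 2;   /* '/' and NUL */
        path = malloc(len);
        if (path == NULL) { rc = -1; break; }
        snprintf(path, len, "%s/%s", dir, ent->d_name);
        if (unlink(path) != 0) rc = -1;                /* subdirectories fail here */
        free(path);
    }
    closedir(dp);
    if (rc != 0) return -1;

    empty = sketch_dir_is_empty(dir);
    if (empty < 0) return -1;
    if (empty == 0) return 1;
    return rmdir(dir) == 0 ? 0 : -1;
}

/* Example policy (hypothetical): delete everything except "keep.me". */
int sketch_spare_keep_me(const char *dir, const char *name)
{
    (void)dir;
    return strcmp(name, "keep.me") != 0;
}
```

Passing the policy as a function pointer keeps the directory walk generic: the caller alone decides which files survive.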
This commit was SVN r10640.
Since we don't properly handle connecting/disconnecting from multiple
universes, only connect to the first one (or the user specified one).
This is a bug that needs to be fixed, but involves some deep magic in
ORTE.
Print the node segment upon request (-n option).
{{{
Node Name | Arch | Cell ID | State   | Slots | Slots Max |
----------------------------------------------------------
odin001   |      | 0       | Unknown | 2     | 4         |
odin002   |      | 0       | Unknown | 2     | 5         |
odin003   |      | 0       | Unknown | 2     | 6         |
odin004   |      | 0       | Unknown | 2     | 7         |
}}}
Since node_slots_alloc and node_slots_inuse are not properly updated
in the GPR don't print those values.
This commit was SVN r10633.
This moves the logic to create the symbolic links for:
- mpirun
- mpiexec
- ompi-ps
- ompi-clean
and their respective man pages to the ompi level from
the orte layer.
This is a bit pedantic, but orte shouldn't be doing the
work of ompi since that is a bit of an abstraction break.
Note: need to autogen.sh to get this. Sorry :(
This commit was SVN r10602.
per a request.
Currently it is not working well. That will soon change
as it just needs a bit of attention and testing to
make it lots-mo-betta.
This commit was SVN r10556.
per a request for its functionality into the main trunk.
This command provides basic information about a running job. It
needs a bit of attention, but works fine in its current iteration.
Please play with it, and let's try to work out all the leftover bugs.
Pending action for this tool:
It has been requested that the tool be changed slightly to allow
it to be called via a function call from internal libraries
(e.g. orteconsole).
This commit was SVN r10554.
from the tmp/jjhursey-ft-cr branch.
In this commit we change the way universe names are created.
Before we by default first created "default-universe" then
if there was a conflict we created "default-universe-PID"
where PID is the PID of the HNP.
Now we create "default-universe-PID" all the time (when
a default universe name is used). This makes it much
easier when trying to find a HNP from an outside app
(e.g. orte-ps, orteconsole, ...)
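The naming scheme can be sketched as follows. This is an illustration under stated assumptions, not the ORTE code: both function names are hypothetical, and matching on the fixed "default-universe-" prefix is just one way an outside tool might recognize such names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_UNIVERSE_PREFIX "default-universe-"

/* Build "default-universe-PID". Returns 0 on success, -1 if `out` is too small. */
int sketch_default_universe_name(char *out, size_t out_len, long pid)
{
    int n;

    assert(out != NULL && out_len > 0);
    n = snprintf(out, out_len, SKETCH_UNIVERSE_PREFIX "%ld", pid);
    return (n >= 0 && (size_t)n < out_len) ? 0 : -1;
}

/* Because every default name now carries the PID, a tool can match on the prefix. */
int sketch_is_default_universe(const char *name)
{
    return strncmp(name, SKETCH_UNIVERSE_PREFIX,
                   strlen(SKETCH_UNIVERSE_PREFIX)) == 0;
}
```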
This also adds a "search" function to find all of the
universes on the machine. This is useful in many contexts
when trying to find a persistent daemon or when trying to
connect to a HNP.
This commit also makes orte_universe_t an opal_object_t,
which is something that needed to happen, and only affected
the SDS in one of its base functions.
I was asked to bring this over to aid in fixing orteconsole
and orteprobe. Due to the change of orte_universe_t to
an object orteprobe may need to be updated to reflect this
change. Since orteprobe needs to be looked at anyway I'll
leave this to Ralph to take care of.
*Note*:
These changes do not depend upon any of the FT work (but
the FT work does depend upon them). These were brought over
to help in fixing some of the ORTE tool set that require
the functionality laid out in this patch.
Testing:
Ran the 'ibm' tests before and after this change, and all was
as well as before the change. If anyone notices additional
irregularities in the system let me know. But none are expected.
This commit was SVN r10550.
SIGUSR2. This can be extended later if needed to include other
signals we should forward to the user processes (TSTP and CONT,
perhaps?)
* Since the signal handlers don't actually run in signal context, we
can use malloc/fprintf/etc. So clean up some of the signal handler
code so that we don't keep message buffers around for the life of
the process
This commit was SVN r10496.
This commit will apply cleanly to the v1.1 branch, and should
be moved over once I get someone to verify it.
The problem is outlined in the bug. The fix was to move the
setting of the app context index (idx) before we put it in the
GPR so that it is propagated to the GPR.
The reason this hasn't bitten us before is because we init
app->idx to 0, which is correct most of the time. The exception is
MPI_Comm_spawn_multiple, where we put in more than
one app context and thus care about correct indexing.
This was causing down the line memory corruption by overrunning
the mapping array. This commit also puts in a check to make
sure that we error out if we ever try to do that again.
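The two parts of the fix (assign each app context its index before it is stored, and refuse an index that would overrun the mapping array) can be sketched like this. The struct and function names are hypothetical and much simpler than the real ORTE types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down app context. */
typedef struct {
    int idx;            /* position of this context within the job */
    const char *app;    /* executable name */
} sketch_app_context_t;

/* Assign indices up front, before the contexts are stored anywhere. */
void sketch_assign_indices(sketch_app_context_t *apps, size_t num_apps)
{
    size_t i;

    assert(apps != NULL || num_apps == 0);
    for (i = 0; i < num_apps; ++i) {
        apps[i].idx = (int)i;
    }
}

/* Returns 0 if idx is a valid slot in a mapping array of num_maps entries, -1 otherwise. */
int sketch_check_app_index(const sketch_app_context_t *app, size_t num_maps)
{
    assert(app != NULL);
    if (app->idx < 0 || (size_t)app->idx >= num_maps) {
        return -1;      /* would overrun the mapping array: error out */
    }
    return 0;
}
```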
This commit was SVN r10380.
mpirun/orterun now has an option to print the version number. If -V/--version
is given, it will print the version number. If it's the only option, we
exit cleanly. Otherwise, we continue on as if --version wasn't given
(except we've printed the version number).
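The behavior described above can be sketched as a small classifier. This is not the orterun option parser: the enum and function are hypothetical, and treating "the only option" as "argc == 2" is a simplifying assumption.

```c
#include <string.h>

typedef enum {
    SKETCH_VERSION_NOT_REQUESTED,       /* neither -V nor --version given */
    SKETCH_VERSION_PRINT_AND_EXIT,      /* version flag was the only option */
    SKETCH_VERSION_PRINT_AND_CONTINUE   /* print, then run as usual */
} sketch_version_action_t;

/* Decide what to do about -V/--version; the caller does the actual printing. */
sketch_version_action_t sketch_version_action(int argc, char **argv)
{
    int i, seen = 0;

    for (i = 1; i < argc; ++i) {
        if (strcmp(argv[i], "-V") == 0 || strcmp(argv[i], "--version") == 0) {
            seen = 1;
        }
    }
    if (!seen) {
        return SKETCH_VERSION_NOT_REQUESTED;
    }
    return (argc == 2) ? SKETCH_VERSION_PRINT_AND_EXIT
                       : SKETCH_VERSION_PRINT_AND_CONTINUE;
}
```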
--This line, and those below, will be ignored--
M orte/tools/orterun/orterun.c
M orte/tools/orterun/help-orterun.txt
This commit was SVN r10276.
1. Changed the RMGR and PLS APIs to add "signal_job" and "signal_proc" entry points. Only the "signal_job" entries are implemented - none of the components have implementations for "signal_proc" at this time. Thus, you can signal all of the procs in a job, but cannot currently signal only one specific proc.
2. Implemented those new API functions in all components except xgrid (Brian will do so very soon). Only the rsh/ssh and fork modules have been tested, however, and only under OS-X.
3. Added signal traps and callback functions for SIGUSR1/2 to orterun/mpirun that catch those signals and call the appropriate commands to propagate them out to all processes in the job.
4. Added a new test directory under the orte branch to (eventually) hold unit and system level tests for just the run-time. Since our test branch of the repository is under restricted access, people working on the RTE were continually developing their own system-level tests - thus making it hard to help diagnose problems. I have moved the more commonly-used functions here, and added one specifically for testing the SIGUSR1/2 functionality.
I will be contacting people directly to seek help with testing the changes on more environments. Other than compile issues, you should see absolutely no change in behavior on any of your systems - this additional functionality is transparent to anyone who does not issue a SIGUSR1/2 to mpirun.
Ralph
This commit was SVN r10258.
of $libdir and $bindir (i.e., was correctly doing local launches, but
was still using $prefix/lib and $prefix/bin for remote launches).
[Re-]Fixes OFED bug 59.
This commit was SVN r10207.
The following SVN revision numbers were found above:
r9930 --> open-mpi/ompi@1d6902296c