A new file has been added (new_pack.c). It is not yet used, but in it I am trying to create
a cleaner pack/unpack function. This file is still a work in progress.
New! In debug mode there is a gdb-friendly function on which you can set a breakpoint
(ompi_ddt_safeguard_pointer_debug_breakpoint) in order to inspect the state of
the convertor and the internal variables when the ddt engine tries to pack/unpack
outside the user buffer.
A new field in the convertor structure. This field is initialized in the convertor_init_for_*
functions and points to the datatype description that has to be used by the pack/unpack
functions. Until now each of them had a test at the beginning to detect which data
representation to use (normal or optimized). Now this field points directly to the correct
one.
Rename local variables to make it easier to understand what they are supposed to
represent.
Add a new function ompi_convertor_set_start_position which can be used to set the
current position in the convertor depending on the datatype attached to the convertor.
An improved version of convertor_init_generic which works with convertors coming from a cache.
Only the necessary fields get modified.
A lot more cleanups ...
This commit was SVN r5950.
Use the correct indentation.
Now we can force the progress function to grab as many events as possible
(in order to avoid starvation for the send queue).
Add more elements to the unexpected queue (internal buffers used to temporarily
store the data for the unexpected messages).
Decrease the number of variables in some functions (cleanup).
Avoid using goto ...
This commit was SVN r5949.
Split the datatypes in 3 categories:
1. basic datatypes: count always one and the datatype is always
contiguous
2. complex datatypes composed of one basic type with a count. Most of
the time these datatypes will be contiguous.
3. complex datatypes composed of 2 basic types. Depending on the
architecture these types can be non-contiguous.
Reorder the defines to match the previous categories. Add some
comments to describe the changes in the files.
Clean-up the flags:
- DT_FLAG_PREDEFINED is attached to all predefined datatypes.
- DT_FLAG_CONTIGUOUS is attached to all contiguous ones. This flag is
detected at runtime depending on the architecture.
This commit was SVN r5946.
-bynode and -byslot orterun command line parameters now set a single
MCA param (ras_base_schedule_policy) which is looked up by the
following components to decide which RAS base API function to invoke:
ras base bjs
ras base host
rmaps round_robin
This commit was SVN r5941.
* turn off echoing on the pty (which was what r5925 was trying to
do).
With this patch, stdin forwarding to rank 0 looks good, with the exception
of the initialization delay until stage gate 1.
This commit was SVN r5930.
The following SVN revision numbers were found above:
r5925 --> open-mpi/ompi@e406a4b1aa
setup. Now uses fork() instead of ssh if the target nodename is the
same as the current nodename (which will happen if the user gave
"localhost" or just the hostname without the domain) or if the
target nodename is local according to ompi_ifislocal() (which will
happen if the user gave a FQDN).
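The launch decision described above can be sketched roughly as follows. This is an illustrative Python model only, not the rsh pls code: choose_launcher and the is_local callback are hypothetical names, with is_local standing in for ompi_ifislocal().

```python
def choose_launcher(target, current, is_local):
    """Return "fork" for a local target and "ssh" otherwise.

    target:   nodename the process should be started on
    current:  nodename of the node we are running on
    is_local: hypothetical stand-in for ompi_ifislocal(); takes a
              nodename and returns True if it refers to this machine
    """
    # Same name as the current node ("localhost" or the bare hostname).
    if target == current:
        return "fork"
    # Different spelling (e.g. an FQDN) that still refers to this machine.
    if is_local(target):
        return "fork"
    # Anything else is treated as a remote node.
    return "ssh"
```

The string comparison is tried first so that the (assumed more expensive) locality check only runs when the names differ.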
This commit was SVN r5916.
orte_system_info.nodename so that cleanup and the like occur
correctly. Otherwise, the daemon on localhost and an MPI process
can have different ideas on what the local nodename is, and that
leads to all kinds of badness with both process killing and cleanup.
Also fixes the annoying ssh keys problem when sshing to localhost.
- modify the rsh pls to ssh to localhost if the target nodename is the
same as orte_system_info.nodename AND is not resolvable (i.e., ssh to
it would fail). Otherwise, ssh to nodename. This should work around
the issues Ralph was seeing with ssh failing on his laptop (since
the above change undid the previous fix to this problem).
- Small change to ompi_ifislocal() to squelch a warning message about
unresolvable hostnames when checking to see if a name is, in fact,
resolvable.
- Force each ORTE process to have the same nodename field as its starting
daemon (assuming it was started using the fork pls), so that the
fork pls can properly kill the process, and cleanup its session
directory on abnormal exit.
This commit was SVN r5914.
of the started job (which should be rank 0 of the started MPI job). Still
some issues for Tim / Ralph to work out (below). Only works from MPI_Init
onward. Remaining issues:
- Need to move the orte_rmgr_urm_wireup_stdin() call from STG1 to
when everyone sets LAUNCHED state. Tim/Ralph are going to look
at adding this code.
- stdin frags are not properly acked, leading to some shutdown
workarounds. Tim is going to look at this one.
- Probably somehow related to the 2nd point, stdin text appears
to be echoed by the IOF framework.
This commit was SVN r5913.
[10.4] with gfortran 4.0) who need to be able to add flags to compile
simple Fortran executables that use libc routines.
Notably, for Tiger with gfortran 4.0 installed, you'll need to:
./configure F77=gfortran FC=gfortran LIBS=-lSystemStubs
This commit was SVN r5909.
call the memory pool to do special memory allocations, and extended
the mpool so that it will do the allocations and keep track of them in
a tree. Currently, if you pass MPI_INFO_NULL to MPI_Alloc_mem, we will
try to allocate the memory and register it with as many mpools as
possible. Alternatively, one can pass an info object with the names of
the mpools as keys, and from these we decide which mpools to register
the new memory with.
- fixed some comments in the allocator and fixed a minor bug
- extended the red black tree test and made a minor correction
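The mpool selection rule for MPI_Alloc_mem described above can be modeled as follows. This is a sketch in Python rather than the actual C implementation: select_mpools is a hypothetical name, None stands in for MPI_INFO_NULL, a plain dict stands in for an info object, and the mpool names in the usage note are made up.

```python
def select_mpools(available, info=None):
    """Return the names of the mpools to try registering new memory with.

    available: names of the mpools that are currently usable
    info:      None models MPI_INFO_NULL; otherwise a dict whose keys
               are mpool names (a stand-in for an MPI info object)
    """
    if info is None:
        # No hints given: every mpool is a candidate.
        return list(available)
    # Hints given: only mpools named as keys in the info object.
    return [name for name in available if name in info]
```

For example, select_mpools(["mpool_a", "mpool_b"], {"mpool_b": ""}) selects only "mpool_b". The sketch only picks candidates; whether each registration actually succeeds is a separate question that it does not model.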
This commit was SVN r5902.
I spoke with Tim about this the other day -- he gave me the green
light to go ahead with this, but it turned into a bigger job than I
thought it would be. I revamped how the default RAS scheduling and
round_robin RMAPS mapping occurs. The previous algorithms were pretty
brain dead, and ignored the "slots" and "max_slots" tokens in
hostfiles. I considered this a big enough problem to fix it for the
beta (because there is currently no way to control where processes are
launched on SMPs).
There are still some more bells and whistles that I'd like to implement,
but there's no hurry, and they can go on the trunk at any time. My
patches below are for what I considered "essential", and do the
following:
- honor the "slots" and "max-slots" tokens in the hostfile (and all
their synonyms), meaning that we allocate/map until we fill slots,
and if there are still more processes to allocate/map, we keep going
until we fill max-slots (i.e., only oversubscribe a node if we have
to).
- offer two different algorithms, currently supported by two new
options to orterun. Remember that there are two parts here -- slot
allocation and process mapping. Slot allocation controls how many
processes we'll be running on a node. After that decision has been
made, process mapping effectively controls where the ranks of
MPI_COMM_WORLD (MCW) are placed. Some of the examples given below
don't make sense unless you remember that there is a difference
between the two (which makes total sense, but you have to think
about it in terms of both things):
1. "-bynode": allocates/maps one process per node in a round-robin
   fashion until all slots on the node are taken. If we still have more
   processes after all slots are taken, then keep going until all
   max-slots are taken. Examples:
   - The hostfile:
       eddie slots=2 max-slots=4
       vogon slots=4 max-slots=8
   - orterun -bynode -np 6 -hostfile hostfile a.out
       eddie: MCW ranks 0, 2
       vogon: MCW ranks 1, 3, 4, 5
   - orterun -bynode -np 8 -hostfile hostfile a.out
       eddie: MCW ranks 0, 2, 4
       vogon: MCW ranks 1, 3, 5, 6, 7
     -> the algorithm oversubscribes all nodes "equally" (until each
        node's max_slots is hit, of course)
   - orterun -bynode -np 12 -hostfile hostfile a.out
       eddie: MCW ranks 0, 2, 4, 6
       vogon: MCW ranks 1, 3, 5, 7, 8, 9, 10, 11
2. "-byslot" (this is the default if you don't specify -bynode):
   greedily takes all available slots on a node for a job before moving
   on to the next node. If we still have processes to allocate/schedule,
   then oversubscribe all nodes equally (i.e., go round robin on all
   nodes until each node's max_slots is hit). Examples:
   - The hostfile:
       eddie slots=2 max-slots=4
       vogon slots=4 max-slots=8
   - orterun -np 6 -hostfile hostfile a.out
       eddie: MCW ranks 0, 1
       vogon: MCW ranks 2, 3, 4, 5
   - orterun -np 8 -hostfile hostfile a.out
       eddie: MCW ranks 0, 1, 2
       vogon: MCW ranks 3, 4, 5, 6, 7
     -> the algorithm oversubscribes all nodes "equally" (until max_slots
        is hit)
   - orterun -np 12 -hostfile hostfile a.out
       eddie: MCW ranks 0, 1, 2, 3
       vogon: MCW ranks 4, 5, 6, 7, 8, 9, 10, 11
The above examples are fairly contrived, and it's not clear from them
that you can get different allocation answers in all cases (the
mapping differences are obvious). Consider the following allocation
example:
- The hostfile:
    eddie count=4
    vogon count=4
    earth count=4
    deep-thought count=4
- orterun -np 8 -hostfile hostfile a.out
    eddie: 4 slots will be allocated
    vogon: 4 slots will be allocated
    earth: no slots allocated
    deep-thought: no slots allocated
- orterun -bynode -np 8 -hostfile hostfile a.out
    eddie: 2 slots will be allocated
    vogon: 2 slots will be allocated
    earth: 2 slots will be allocated
    deep-thought: 2 slots will be allocated
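The two-step scheme described above (slot allocation first, then rank mapping) can be modeled in a few lines. The following is only a Python sketch that reproduces the examples in this message; it is not the RAS/RMAPS code. The names allocate, map_ranks and place are hypothetical, and treating a missing max-slots value as "no cap" is an assumption of the sketch.

```python
def allocate(nodes, nprocs, policy):
    """Decide how many processes each node gets.

    nodes:  list of (name, slots, max_slots); max_slots=None means no
            cap (an assumption of this sketch)
    policy: "bynode" or "byslot"
    """
    alloc = {name: 0 for name, _, _ in nodes}
    remaining = nprocs

    def round_robin(use_max):
        # Hand out one process per node per pass until nothing fits.
        nonlocal remaining
        progressed = True
        while remaining > 0 and progressed:
            progressed = False
            for name, slots, max_slots in nodes:
                limit = max_slots if use_max else slots
                if remaining > 0 and (limit is None or alloc[name] < limit):
                    alloc[name] += 1
                    remaining -= 1
                    progressed = True

    if policy == "bynode":
        round_robin(use_max=False)
    else:
        # byslot: greedily fill each node's slots before moving on.
        for name, slots, _ in nodes:
            take = min(slots, remaining)
            alloc[name] += take
            remaining -= take
    # Oversubscribe round-robin only if processes are still left over.
    round_robin(use_max=True)
    if remaining > 0:
        raise ValueError("not enough max-slots for %d processes" % nprocs)
    return alloc


def map_ranks(nodes, alloc, policy):
    """Assign MCW ranks to nodes given the per-node allocation."""
    placement = {name: [] for name, _, _ in nodes}
    left = dict(alloc)
    total = sum(alloc.values())
    rank = 0
    if policy == "byslot":
        # Consecutive ranks fill one node's allocation at a time.
        for name, _, _ in nodes:
            for _ in range(alloc[name]):
                placement[name].append(rank)
                rank += 1
    else:
        # bynode: deal ranks out one per node per pass.
        while rank < total:
            for name, _, _ in nodes:
                if left[name] > 0:
                    placement[name].append(rank)
                    left[name] -= 1
                    rank += 1
    return placement


def place(nodes, nprocs, policy="byslot"):
    """Run both steps: allocation, then mapping."""
    return map_ranks(nodes, allocate(nodes, nprocs, policy), policy)
```

With nodes = [("eddie", 2, 4), ("vogon", 4, 8)], place(nodes, 8, "bynode") puts ranks 0, 2, 4 on eddie and 1, 3, 5, 6, 7 on vogon, matching the -bynode -np 8 example. Keeping allocation and mapping as separate functions mirrors the point made above that the two decisions are distinct.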
This commit was SVN r5894.
remove the MPI_ERR_INIT_FINALIZE() macro. Also check to see how we
invoke the errhandler if an error occurs (i.e., the action depends on
whether we're between MPI_INIT and MPI_FINALIZE or not).
This commit was SVN r5891.
additions from his previous commit:
- Properly propagate error upwards if we have a localhost+other_node
error
- Added logic to handle multiple instances of the same hostname
- Added logic to properly increment the slot count for multiple
instances. For example, a hostfile with:
foo.example.com
foo.example.com slots=4
foo.example.com slots=8
would result in a single host with a slot count of 13 (i.e., if no
slot count is specified, 1 is assumed)
- Revised the localhost logic a bit -- some cases are ok (e.g.,
specifying localhost multiple times is ok, as long as there are no
other hosts)
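The slot-count accumulation for repeated hostnames can be illustrated with a small model. This is a Python sketch, not the hostfile parser itself: merge_hostfile is a hypothetical name, and it only understands the "slots" token (none of its synonyms, and no max-slots).

```python
def merge_hostfile(lines):
    """Sum the slot counts of all entries that share a hostname.

    lines: hostfile lines such as "foo.example.com slots=4".
    An entry with no slot count contributes 1.
    """
    totals = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        host, slots = fields[0], 1
        for token in fields[1:]:
            key, _, value = token.partition("=")
            if key == "slots":
                slots = int(value)
        # Repeated hostnames accumulate instead of overwriting.
        totals[host] = totals.get(host, 0) + slots
    return totals
```

Feeding it the three foo.example.com lines from the example above gives a single entry with 1 + 4 + 8 = 13 slots.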
This commit was SVN r5886.
The problem was that the displacement was increased even when the current memcpy completely
succeeded. It is not a problem in most cases ... except when we completely finish a
data.
This commit was SVN r5885.