know what my local rank is, and therefore set my paffinity ID as
appropriate. Specifically, we're no longer relying on the
special/secret mpi_paffinity_processor MCA parameter that the orted
would set for us.
This allows processor affinity to be used in environments where the
orted is not used (e.g., bproc and, hopefully in the not-too-distant
future, SLURM).
This commit was SVN r13352.
The following SVN revision numbers were found above:
r13351 --> open-mpi/ompi@a338b7e533
Over to Jeff now for modifying mpi_init accordingly.
Until Jeff makes his changes, nobody should see anything different as the new info just isn't used by anything!
This commit was SVN r13351.
MPICH2 for "small" commutative operations in the reduce_scatter basic
implementation. "small" is currently pretty big, as it doesn't take
much to beat reduce/scatterv. Need to do much more than this for
better all around performance of MPI_Reduce_scatter, but this was enough
to solve the problems I was having.
This commit was SVN r13348.
Found another place where we were incorrectly casting a C++ MPI handle
array to the corresponding C array type and hoping for the best (which
won't work at all). This commit fixes things so that we now do the
proper conversion between C<-->C++ handles.
This commit was SVN r13346.
intended. In truth, any unique value will do, but he's right that
having a regular, recognizable, repeatable value is probably much
better for debugging purposes.
This commit was SVN r13341.
and George on these refinements):
* Rename the static OBJ initializer macro to be
OPAL_OBJ_STATIC_INIT(class)
* Ensure that all static OBJ initializations get a refcount of 1
(doesn't ''really'' matter, since they're static, it should never
get to the point where the OBJ is DESTRUCTed, but more correct
nonetheless)
* Add a "magic number" to the OBJ when compiling with debug support.
The magic number provides a rudimentary check that you're
operating on a valid OBJ (and fails an assertion if you're not):
we check that the memory contains the magic number when performing
actions on OBJs. Also remove the magic number when DESTRUCTing
OBJs, so that if, for example, an OBJ is DESTRUCTed more than
once, we'll fail the magic number assert.
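For readers unfamiliar with the technique, here is a minimal sketch of a debug "magic number" check of the kind described above. It is not the OPAL object code: every name (demo_obj_t, DEMO_OBJ_MAGIC, DEMO_OBJ_STATIC_INIT and the demo_obj_* functions) and the magic value itself are made up for illustration, and the real implementation only compiles the field in debug builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names and value; not the OPAL object system. */
#define DEMO_OBJ_MAGIC UINT64_C(0x0123456789abcdef)

typedef struct demo_obj {
    uint64_t magic;    /* equals DEMO_OBJ_MAGIC while the object is alive */
    int      refcount;
} demo_obj_t;

/* Static initializer: magic number set, refcount of 1. */
#define DEMO_OBJ_STATIC_INIT { DEMO_OBJ_MAGIC, 1 }

/* Returns 1 if the memory looks like a live object, 0 otherwise. */
static int demo_obj_is_valid(const demo_obj_t *obj)
{
    return obj != NULL && obj->magic == DEMO_OBJ_MAGIC;
}

static void demo_obj_construct(demo_obj_t *obj)
{
    obj->magic = DEMO_OBJ_MAGIC;
    obj->refcount = 1;
}

static void demo_obj_retain(demo_obj_t *obj)
{
    assert(demo_obj_is_valid(obj));  /* catches use of a dead object */
    obj->refcount++;
}

static void demo_obj_destruct(demo_obj_t *obj)
{
    assert(demo_obj_is_valid(obj));  /* a second destruct trips here */
    obj->magic = 0;                  /* remove the magic number */
    obj->refcount = 0;
}
```

Clearing the magic field in the destructor is what makes a double destruct detectable: the second call no longer finds the expected value and the assertion fails.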
This commit was SVN r13338.
The following SVN revision numbers were found above:
r13227 --> open-mpi/ompi@96030de97b
r13228 --> open-mpi/ompi@c2e9075d29
- Post isends in reverse order of posting irecvs.
If the messages arrive approximately in order, this should
minimize the time spent in matching the requests.
I did not see any performance difference over MX up to 64 nodes, but
the change makes sense and may have some impact when we have (many)
more nodes.
This commit was SVN r13337.
then we modify the argv, forcing the reallocation of the array. With luck,
the saved pointer still has a meaning ... without luck, execve returns with
error 14 (EFAULT).
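A minimal sketch of the general hazard, assuming the usual pattern of a NULL-terminated argv grown with realloc: a pointer into the array may dangle once the array moves, whereas an index stays valid. The helpers demo_argv_append and demo_argv_free are hypothetical and are not the functions touched by this commit.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append a copy of arg to a NULL-terminated argv.
 * realloc may move the array, so any pointer saved into the old array
 * must be treated as invalid afterwards.
 * Returns 0 on success, -1 on allocation failure. */
static int demo_argv_append(char ***argv, int *argc, const char *arg)
{
    size_t len = strlen(arg) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, arg, len);

    char **grown = realloc(*argv, (size_t)(*argc + 2) * sizeof(char *));
    if (grown == NULL) {
        free(copy);
        return -1;
    }
    grown[*argc] = copy;
    grown[*argc + 1] = NULL;   /* keep the array NULL-terminated */
    *argv = grown;
    (*argc)++;
    return 0;
}

/* Hypothetical helper: free every string and then the array itself. */
static void demo_argv_free(char **argv)
{
    if (argv == NULL) {
        return;
    }
    for (char **p = argv; *p != NULL; ++p) {
        free(*p);
    }
    free(argv);
}
```

The safe pattern is to remember the index of the slot of interest (for example `int slot = argc - 1;` right after the append) and to re-read `argv[slot]` after any further appends, rather than keeping `&argv[slot]` around.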
This commit was SVN r13321.
The following SVN revision numbers were found above:
r12059 --> open-mpi/ompi@ae79894bad
the ORTE_DAEMON_CMD type, which, unfortunately, is used all over
the place. Without this, we get the following errors:
[msc01:12341] [0,0,0] ORTE_ERROR_LOG: Data pack failed in file ../../ompi-trunk/orte/dss/dss_pack.c at line 83
[msc01:12341] [0,0,0] ORTE_ERROR_LOG: Data pack failed in file ../../ompi-trunk/orte/dss/dss_pack.c at line 58
[msc01:12341] [0,0,0] ORTE_ERROR_LOG: Data pack failed in file ../../../../ompi-trunk/orte/mca/pls/base/pls_base_orted_cmds.c at line 136
This commit was SVN r13320.
- The Verification check only checked that a file that's in SVN is there,
which AM would have complained about during make dist, so it's really
a pointless check
- No need to remove / restore autogen.sh, as AM isn't going to put it
in the tarball anyway, and even if it did, this step would only
cause it to fail during make dist. All this step did was erase any
changes you had made to autogen.sh when you ran make_dist_tarball,
which really sucks.
This commit was SVN r13307.
1. adds a "cancel_operation" API to the pls components that allows orterun to demand that an orted operation (e.g., terminate_job) be immediately cancelled and abandoned.
2. changes the pls orted commands from blocking to non-blocking. This allows us to interrupt those operations should an orted be non-responsive. The change also adds an orte_abort_timeout that limits how long orterun will automatically wait for the orteds to respond: if the terminate command, for example, doesn't see an orted response within that time, then we print out an appropriate error message and just give up.
3. modifies orterun so that multiple ctrl-c's simply abort the program even if the orteds have not responded.
4. does some cleanup on the orte-level mca params so that their implementation looks a lot more like that of ompi, which makes them easier to maintain. This change also includes the definition of an orte_abort_timeout struct and associated MCA param (can't have too many!) so you can set the time after which orterun gives up on waiting for orteds to respond.
This needs more testing before migrating to 1.2.
This commit was SVN r13304.
This fix includes two parts: (a) we now initialize the keyval pointer locations to NULL after the malloc, and (b) we now OBJ_NEW the keyvals prior to storing info in them.
BTW, in case anyone reads this and wonders why we don't just OBJ_NEW the keyvals in create_value, the reason is simply that some places in the code use static keyvals and simply assign those addresses into the value object's array. So not everyone wants to OBJ_NEW keyvals - by not forcing it here in create_value, we give the user the flexibility to do whatever they want.
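A minimal sketch of the two-part pattern described above, under stated assumptions: the types and functions (demo_keyval_t, demo_value_t, demo_create_value, demo_set_keyval, demo_free_value) are hypothetical stand-ins, and plain malloc takes the place of OBJ_NEW. The create function only sets every slot to NULL; the caller allocates a keyval before storing into it.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the real value/keyval objects. */
typedef struct demo_keyval {
    const char *key;
    int         value;
} demo_keyval_t;

typedef struct demo_value {
    size_t          cnt;
    demo_keyval_t **keyvals;
} demo_value_t;

/* Part (a): allocate the slot array and set every slot to NULL.
 * The keyvals themselves are deliberately not allocated here. */
static demo_value_t *demo_create_value(size_t cnt)
{
    demo_value_t *v = malloc(sizeof(*v));
    if (v == NULL) {
        return NULL;
    }
    v->keyvals = malloc(cnt * sizeof(*v->keyvals));
    if (v->keyvals == NULL && cnt > 0) {
        free(v);
        return NULL;
    }
    for (size_t i = 0; i < cnt; ++i) {
        v->keyvals[i] = NULL;
    }
    v->cnt = cnt;
    return v;
}

/* Part (b): allocate the keyval before storing info in it. */
static int demo_set_keyval(demo_value_t *v, size_t i,
                           const char *key, int value)
{
    if (i >= v->cnt) {
        return -1;
    }
    if (v->keyvals[i] == NULL) {
        v->keyvals[i] = malloc(sizeof(demo_keyval_t));
        if (v->keyvals[i] == NULL) {
            return -1;
        }
    }
    v->keyvals[i]->key = key;
    v->keyvals[i]->value = value;
    return 0;
}

/* Assumes every non-NULL slot was heap-allocated by demo_set_keyval. */
static void demo_free_value(demo_value_t *v)
{
    for (size_t i = 0; i < v->cnt; ++i) {
        free(v->keyvals[i]);
    }
    free(v->keyvals);
    free(v);
}
```

Because demo_create_value leaves the slots NULL, a caller could just as well point a slot at a static keyval instead of allocating one, which is the flexibility the note above is after; such a caller would then need its own cleanup rather than demo_free_value.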
This commit was SVN r13300.
configured with --disable-mpi-cxx so that the default -I flags in the
wrapper compilers don't point to a directory that doesn't exist.
Thanks to Martin Audet for identifying the problem.
This commit was SVN r13296.
comment explaining the patch in the patch.
Refs trac:574
This commit was SVN r13276.
The following Trac tickets were found above:
Ticket 574 --> https://svn.open-mpi.org/trac/ompi/ticket/574
- Make it so the SLURM ras can handle different nodelist configurations
- Some code cleanup and better/more informative error messages and error handling
This commit was SVN r13271.
The following Trac tickets were found above:
Ticket 801 --> https://svn.open-mpi.org/trac/ompi/ticket/801