Add ability to store the RM's jobid string to tag the notifier message so that the sys admin knows what job had the problem.
This commit was SVN r21687.
OMPI
and a language agnostic part in OPAL. The convertor is completely
moved into OPAL. This offers several benefits as described in RFC
http://www.open-mpi.org/community/lists/devel/2009/07/6387.php
namely:
- Fewer basic types (int* and float* types, boolean and wchar
- Fixing naming scheme to ompi-nomenclature.
- Usability outside of the ompi-layer.
- Due to the fixed nature of simple opal types, their information is
completely
known at compile time and therefore constified
- With fewer datatypes (22), the actual sizes of bit-field types may be
reduced
from 64 to 32 bits, allowing reorganizing the opal_datatype
structure, eliminating holes and keeping data required in convertor
(upon send/recv) in one cacheline...
This has implications to the convertor-datastructure and other parts
of the code.
- Several performance tests have been run, the netpipe latency does not
change with
this patch on Linux/x86-64 on the smoky cluster.
- Extensive tests have been done to verify correctness (no new
regressions) using:
1. mpi_test_suite on linux/x86-64 using clean ompi-trunk and
ompi-ddt:
a. running both trunk and ompi-ddt resulted in no differences
(except for MPI_SHORT_INT and MPI_TYPE_MIX_LB_UB do now run
correctly).
b. with --enable-memchecker and running under valgrind (one buglet
when run with static found in test-suite, commited)
2. ibm testsuite on linux/x86-64 using clean ompi-trunk and ompi-ddt:
all passed (except for the dynamic/ tests failed!! as trunk/MTT)
3. compilation and usage of HDF5 tests on Jaguar using PGI and
PathScale compilers.
4. compilation and usage on Scicortex.
- Please note, that for the heterogeneous case, (-m32 compiled
binaries/ompi), neither
ompi-trunk, nor ompi-ddt branch would successfully launch.
This commit was SVN r21641.
point, the event engine has been shut down until btl finalization is
done, so opal_progress in the wait loop is not an option - we have
to drain from inside the btl.
Clean up the looping structure for the finalize routine
Update copyrights.
This commit was SVN r21620.
IPv4 and IPv6) is outside the legal boundaries. This fixes trac:1869.
This commit was SVN r21612.
The following Trac tickets were found above:
Ticket 1869 --> https://svn.open-mpi.org/trac/ompi/ticket/1869
btl_sm.c: In function ‘mca_btl_sm_sendi’:
btl_sm.c:734: warning: comparison between signed and unsigned
btl_sm.c: In function ‘mca_btl_sm_send’:
btl_sm.c:812: warning: comparison between signed and unsigned
This commit was SVN r21552.
The following SVN revision numbers were found above:
r21551 --> open-mpi/ompi@bd995d26b4
- poll FIFO occasionally even if just sending messages
- retry pending sends more often
- just before trying a new send
- as part of mca_btl_sm_component_progress
Maintain two new mca_btl_sm_component variables, num_outstanding_frags
and num_pending_sends, to keep overhead low.
Drain only one message fragment from the FIFO per btl_sm_component_progress
call (rather than drain until empty, which in retrospect everyone considers
to have been a mistake).
This commit was SVN r21551.
This commit was SVN r21533.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r21524
(now works on both big and little endian machines)
* Be a little more flexible when looking for active devices in
btl_openib_component.c
* Add device name and port number to lots of verbose and help
messages
* Add a bunch of verbose messages to give insight into what is
occurring during all the CPC wireups
This commit was SVN r21418.
the debugger plugins into files suffixed by _dbg.h.
This commit was SVN r21404.
The following Trac tickets were found above:
Ticket 1931 --> https://svn.open-mpi.org/trac/ompi/ticket/1931
Yes, friends, our favorite PCIE BTL has resurfaced as mgmt vacillates over its existence. This is an updated version that actually mostly works, in its final stages of debugging.
Some generalization still remains to be done...
This commit was SVN r21358.
not end up in OPAL
- Will post an updated patch for the OMPI_ALIGNMENT_ parts (within C).
This commit was SVN r21342.
The following SVN revision numbers were found above:
r21330 --> open-mpi/ompi@95596d1814
into the OPAL namespace, eliminating cases like opal/util/arch.c
testing for ompi_fortran_logical_t.
As this is processor- and compiler-related information
(e.g. does the compiler/architecture support REAL*16)
this should have been on the OPAL layer.
- Unifies f77 code using MPI_Flogical instead of opal_fortran_logical_t
- Tested locally (Linux/x86-64) with mpich and intel testsuite
but would like to get this week-ends MTT output
- PLEASE NOTE: configure-internal macro-names and
ompi_cv_ variables have not been changed, so that
external platform (not in contrib/) files still work.
This commit was SVN r21330.
well..)
- As Jeff suggested, for m4 macros, dont use _ OPAL, but
rather OPAL_ prefix
- Set the variable before AC_SUBST, so that replacement happens
in f77 header-file, too.
This commit was SVN r21316.
happens when hierarch is used. . Two major items:
- modify the comm_activate step to take an additional argument, indicating
whether the new communicatio has to go through the collective selection
step. This is not required sometimes (e.g. when a process calls
MPI_COMM_SPLIT with color=MPI_UNDEFINED), and contributed significantly to
the exhaustion of cids.
- when freeing a communicator, check whether we can reuse the block of cids
assigned to that comm. This only works if the current front of the cid
assignment (cid_block_start) is right ater the block of cids assigned to this
comm.
Fixes trac:1904
Fixes trac:1926
This commit was SVN r21296.
The following Trac tickets were found above:
Ticket 1904 --> https://svn.open-mpi.org/trac/ompi/ticket/1904
Ticket 1926 --> https://svn.open-mpi.org/trac/ompi/ticket/1926
shm_fifos values are only partially updated, and this leads to wrong values
for the offset. Moving the write barrier at the right place, plus forcing
some read barriers might help.
In addition I get rid of the sm_offset array which is completely useless.
This commit was SVN r21253.