Wire the security check into ORTE's OOB handshake, and add a "version" check to ensure that both ends are from the same ORTE version. If they are not, report the mismatch and refuse the connection.
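A minimal sketch of the accept logic described above, not the actual OOB code. The handshake_t layout, the credential_ok() stand-in for the security framework, and the order of the two checks are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical handshake header: the field names are illustrative,
 * not ORTE's actual wire format. */
typedef struct {
    char version[32];     /* sender's version string */
    char credential[64];  /* opaque security credential */
} handshake_t;

/* Hypothetical stand-in for the security framework's authentication. */
static bool credential_ok(const char *cred)
{
    return strcmp(cred, "secret") == 0;
}

/* Accept the peer only if the credential validates and its version
 * matches ours; otherwise report the reason and refuse. */
static bool accept_peer(const handshake_t *peer, const char *my_version)
{
    if (!credential_ok(peer->credential)) {
        fprintf(stderr, "handshake refused: authentication failed\n");
        return false;
    }
    if (strcmp(peer->version, my_version) != 0) {
        fprintf(stderr,
                "handshake refused: peer version %s != local version %s\n",
                peer->version, my_version);
        return false;
    }
    return true;
}
```

A caller would close the socket whenever accept_peer() returns false, so a mismatched peer never reaches the messaging layer.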
Fixes trac:4171
cmr=v1.7.5:reviewer=jsquyres:subject=Add a security framework for authenticating connections
This commit was SVN r30551.
The following Trac tickets were found above:
Ticket 4171 --> https://svn.open-mpi.org/trac/ompi/ticket/4171
During the commits to make the C/R code compile again the
blocking receive calls in snapc_full_app.c were
replaced by non-blocking receive calls.
This commit adds ORTE_WAIT_FOR_COMPLETION()
after each non-blocking receive to wait for the data.
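The pattern can be sketched in a self-contained form as follows. Everything here is a hypothetical stand-in: recv_nb(), progress(), and WAIT_FOR_COMPLETION() only imitate the shape of a non-blocking receive, an event loop, and a wait macro, and are not the ORTE definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type for a completed receive. */
typedef void (*recv_cb_t)(int data, void *cbdata);

static recv_cb_t pending_cb = NULL;
static void *pending_cbdata = NULL;
static int ticks_until_delivery = 0;

/* Post a non-blocking receive: returns immediately, cb fires later. */
static void recv_nb(recv_cb_t cb, void *cbdata)
{
    pending_cb = cb;
    pending_cbdata = cbdata;
    ticks_until_delivery = 3;  /* pretend the message needs 3 progress calls */
}

/* Drive progress one step; deliver the message once it "arrives". */
static void progress(void)
{
    if (pending_cb != NULL && --ticks_until_delivery == 0) {
        recv_cb_t cb = pending_cb;
        pending_cb = NULL;
        cb(42, pending_cbdata);
    }
}

/* Spin the progress loop for as long as the flag is still set. */
#define WAIT_FOR_COMPLETION(flag) \
    do {                          \
        while (flag) {            \
            progress();           \
        }                         \
    } while (0)

typedef struct {
    volatile bool active;  /* cleared by the callback */
    int value;             /* payload stored by the callback */
} recv_state_t;

static void on_recv(int data, void *cbdata)
{
    recv_state_t *st = (recv_state_t *)cbdata;
    st->value = data;
    st->active = false;
}

/* Blocking behaviour rebuilt on top of the non-blocking call. */
static int blocking_recv(void)
{
    recv_state_t st = { true, 0 };
    recv_nb(on_recv, &st);
    WAIT_FOR_COMPLETION(st.active);  /* without this, st.value is read too early */
    return st.value;
}
```

The key point is that the caller's state must stay alive until the callback has cleared the flag, which is what the wait after each receive guarantees.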
This commit was SVN r30487.
During the commits to make the C/R code compile again, the
blocking receive calls were replaced by non-blocking ones,
which broke the code. This patch uses ORTE_WAIT_FOR_COMPLETION()
to wait until the non-blocking calls have finished.
This commit was SVN r30486.
The sstore component was still using static buffers
for send_buffer_nb(). This patch changes opal_buffer_t buffer;
to opal_buffer_t *buffer;
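A rough illustration of why the buffer has to live on the heap when the send is non-blocking: the send only queues a pointer, and the data is read after the posting function has returned. buffer_t, demo_send_nb(), release_cb(), and complete_send() are hypothetical stand-ins, not the OPAL/ORTE API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for opal_buffer_t. */
typedef struct {
    char *bytes;
    size_t len;
} buffer_t;

typedef void (*send_cb_t)(buffer_t *buf);

static buffer_t *queued_buf = NULL;
static send_cb_t queued_cb = NULL;
static int released = 0;

/* Allocate a buffer on the heap and copy the message into it. */
static buffer_t *buffer_new(const char *msg)
{
    buffer_t *buf = malloc(sizeof(*buf));
    if (buf == NULL) {
        return NULL;
    }
    buf->len = strlen(msg) + 1;
    buf->bytes = malloc(buf->len);
    if (buf->bytes == NULL) {
        free(buf);
        return NULL;
    }
    memcpy(buf->bytes, msg, buf->len);
    return buf;
}

/* Non-blocking send: only queues the pointer, the data is used later. */
static void demo_send_nb(buffer_t *buf, send_cb_t cb)
{
    queued_buf = buf;
    queued_cb = cb;
}

/* Completion callback: the one place where the buffer is released. */
static void release_cb(buffer_t *buf)
{
    free(buf->bytes);
    free(buf);
    released++;
}

/* Called later from the progress loop, once the send has finished. */
static void complete_send(void)
{
    if (queued_buf != NULL) {
        buffer_t *buf = queued_buf;
        queued_buf = NULL;
        queued_cb(buf);
    }
}

/* Post a message. A stack-allocated buffer_t here would be gone by the
 * time complete_send() runs, so the buffer is heap-allocated and must
 * not be freed by the caller after posting. */
static int post_message(const char *msg)
{
    buffer_t *buf = buffer_new(msg);
    if (buf == NULL) {
        return -1;
    }
    demo_send_nb(buf, release_cb);
    return 0;
}
```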
This commit was SVN r30485.
The snapc component was still using static buffers
for send_buffer_nb(). This patch changes opal_buffer_t buffer;
to opal_buffer_t *buffer;
This commit was SVN r30484.
Right after starting the communication with orterun the buffer
containing the message is deleted. This patch removes the deletion
of the buffer which is now done by orte_rml_send_callback(). This is
now also the callback function used by orte_rml.send_buffer_nb().
The previous callback hnp_receiver() was introduced by an
earlier patch which was only trying to get the code to compile again.
This commit was SVN r30405.
The function
int orte_snapc_base_select(bool seed, bool app);
wants to know whether it is called by an application or not. Therefore
it expects 'bool app' as its second parameter. The caller used to pass
'!ORTE_PROC_IS_DAEMON', which is not correct if the process is
a tool or an HNP. This patch changes it to ORTE_PROC_IS_APP, which
is true only if the process is an application.
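The difference between the two expressions can be seen in a small sketch. The proc_type_t enum and both helper functions are hypothetical; they only model the four process kinds named above rather than ORTE's real process-type flags.

```c
#include <stdbool.h>

/* Hypothetical model of the process kinds mentioned above. */
typedef enum { PROC_HNP, PROC_DAEMON, PROC_TOOL, PROC_APP } proc_type_t;

/* Old argument: "anything that is not a daemon is an application". */
static bool old_app_flag(proc_type_t t)
{
    return t != PROC_DAEMON;
}

/* New argument: true only for a real application process. */
static bool new_app_flag(proc_type_t t)
{
    return t == PROC_APP;
}
```

For PROC_TOOL and PROC_HNP the old flag is true while the new one is false, which is exactly the case the patch corrects.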
This commit was SVN r30404.
* don't return null if someone wants to print ORTE_SUCCESS
* rename some stale process types
* keep show_help local if we are in standalone operation as there is nobody to send it to
cmr=v1.7.5:reviewer=jsquyres
This commit was SVN r30400.
* use the global flags for linux and apple being found instead of re-doing the case statements
* update select procedure to ignore components that measure the same thing (e.g., resusage and sigar), taking the higher priority module
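One possible shape for such a selection pass is sketched below. The component_t struct, the "measures" key, the tie-break rule, and any priority values used with it are assumptions for illustration; the real sensor framework may identify overlapping components differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical component descriptor. */
typedef struct {
    const char *name;      /* component name, e.g. "resusage" */
    const char *measures;  /* what it measures (assumed grouping key) */
    int priority;          /* higher wins within a group */
    int selected;          /* set to 1 if the component is kept */
} component_t;

/* Keep one component per measured quantity: the one with the highest
 * priority. Ties go to the earlier entry. Returns the number kept. */
static size_t select_components(component_t *comps, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        comps[i].selected = 1;
        for (size_t j = 0; j < n; j++) {
            if (j == i || strcmp(comps[i].measures, comps[j].measures) != 0) {
                continue;
            }
            if (comps[j].priority > comps[i].priority ||
                (comps[j].priority == comps[i].priority && j < i)) {
                comps[i].selected = 0;  /* a better duplicate exists */
                break;
            }
        }
        count += (size_t)comps[i].selected;
    }
    return count;
}
```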
cmr=v1.7.5:reviewer=jsquyres:subject=Cleanup the sensor code
This commit was SVN r30368.
Refs trac:4117. Please use this commit rather than the patch attached to
the ticket; the patch had a few mistakes in the tweaked wording.
This commit was SVN r30362.
The following SVN revision numbers were found above:
r30298 --> open-mpi/ompi@58479399c3
The following Trac tickets were found above:
Ticket 4117 --> https://svn.open-mpi.org/trac/ompi/ticket/4117
NOTE: launch performance will be absolutely awful if you do this with BTLs that aren't configured to modex_recv on first message!
Even with "modex on demand", we still have to do a barrier in place of the modex - we simply don't move any data around, which does reduce the time impact. The barrier is required to ensure that the other proc has in fact registered all its BTL info and therefore is prepared to hand over a complete data package. Otherwise, you may not get the info you need. In addition, the shared memory BTL can fail to properly rendezvous as it expects the barrier to be in place.
This behavior will *only* take effect under the following conditions:
1. launched via mpirun
2. #procs is greater than ompi_hostname_cutoff, which defaults to UINT32_MAX
3. mca param rte_orte_direct_modex is set to 1. At the moment, we are having problems getting this param to register properly, so only the first two conditions are in effect. Still, the bottom line is you have to *want* this behavior to get it.
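The three conditions amount to a simple predicate, sketched here as intended rather than as currently implemented (the text above notes that the third condition is not yet effective). The function name and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decide whether to use the direct ("on demand") modex.
 * hostname_cutoff models ompi_hostname_cutoff, whose default of
 * UINT32_MAX means the second condition can never hold by default. */
static bool use_direct_modex(bool launched_by_mpirun, uint32_t nprocs,
                             uint32_t hostname_cutoff, bool direct_modex_param)
{
    return launched_by_mpirun
        && nprocs > hostname_cutoff
        && direct_modex_param;
}
```

With the default cutoff the predicate is always false, which matches the statement that the behavior has to be requested explicitly.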
The planned next evolution of this will be to make the direct modex be non-blocking - this will require two fixes:
1. if the remote proc doesn't have the required info, then let it delay its response until it does. This means we need a way for the MPI layer to tell the RTE "I am done entering modex data".
2. adjust the SM rendezvous logic to loop until the required file has been created
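As a rough idea of what the second fix could look like, the sketch below polls for a file with a bounded number of attempts. wait_for_file(), its parameters, and the bounded retry policy are assumptions; the actual SM rendezvous code and its file naming are not shown in this log.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

/* Poll until `path` exists, sleeping delay_ms between attempts.
 * Returns true if the file appeared, false after max_attempts tries. */
static bool wait_for_file(const char *path, int max_attempts, long delay_ms)
{
    struct stat sb;
    struct timespec delay = { delay_ms / 1000, (delay_ms % 1000) * 1000000L };

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (stat(path, &sb) == 0) {
            return true;           /* the peer has created the file */
        }
        nanosleep(&delay, NULL);   /* back off before checking again */
    }
    return false;                  /* give up so the caller can report an error */
}
```

Bounding the loop is a design choice made here so that a peer that never creates the file produces an error instead of a hang.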
Creating a placeholder to bring this over to 1.7.5 when ready.
cmr=v1.7.5:reviewer=hjelmn:subject=Enable direct modex at scale
This commit was SVN r30259.
specifically delineate that we're referring to the process' rank in
MPI_COMM_WORLD.
Refs trac:4068
This commit was SVN r30181.
The following Trac tickets were found above:
Ticket 4068 --> https://svn.open-mpi.org/trac/ompi/ticket/4068
Thanks to Paul Hargrove for reporting the problem on NetBSD.
cmr=v1.7.4:reviewer=jsquyres:subject=Handle the case of nodes that do not report cores
This commit was SVN r30180.
configury/Makefile.am changes; this commit renames the internal
installdirs.h framework struct field names to match the configury macro
names:
* pkgdatadir -> ompidatadir
* pkglibdir -> ompilibdir
* pkgincludedir -> ompiincludedir
This commit was SVN r30145.
The following SVN revision numbers were found above:
r30140 --> open-mpi/ompi@8b778903d8
pkg{data,lib,includedir}, use our own ompi{data,lib,includedir}, which is
always set to {datadir,libdir,includedir}/openmpi. This will keep us from
having help files in prefix/share/open-rte when building without Open MPI,
but in prefix/share/openmpi when building with Open MPI.
This commit was SVN r30140.
After the last refactoring in rmaps, map-by dist:hca was disabled.
This commit re-enables it.
found/fixed by Elena, reviewed by miked
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30118.
Fixes trac:4043
cmr=v1.7.4:reviewer=jsquyres:subject=Ensure that rankfile-provided allocations are correctly handled
This commit was SVN r30106.
The following Trac tickets were found above:
Ticket 4043 --> https://svn.open-mpi.org/trac/ompi/ticket/4043
No review will be required as this is just debug code for those helping us debug the 1.7.4 release candidates
cmr-=v1.7.4:reviewer=ompi-gk1.7
This commit was SVN r30043.
This patch changes all send/send_buffer occurrences in the C/R code
to send_nb/send_buffer_nb.
The new code compiles but does not work.
Changes from V1:
* #ifdef out the code (so it is preserved for later re-design)
* marked the broken C/R code with ENABLE_FT_FIXED
Changes from V2:
* just replace the blocking calls with the non-blocking calls
* all #ifdef's introduced in V1 are gone
* send_* returns error code or ORTE_SUCCESS (not the number of bytes)
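The change in return convention matters for any caller that compared the result against a byte count. The sketch below contrasts the two checks; DEMO_SUCCESS, DEMO_ERROR, and both helpers are hypothetical, and the old-style comparison is an assumed example of how a caller might have checked the blocking call.

```c
#include <stdbool.h>

#define DEMO_SUCCESS 0     /* stand-in for ORTE_SUCCESS */
#define DEMO_ERROR   (-1)  /* stand-in for any error code */

/* Old-style check (assumed): success means "all bytes were sent". */
static bool old_style_ok(int rc, int expected_bytes)
{
    return rc == expected_bytes;
}

/* New-style check: success is a status code, not a byte count. */
static bool new_style_ok(int rc)
{
    return rc == DEMO_SUCCESS;
}
```

Applied to a successful non-blocking send of a non-empty message, the old-style check reports failure because the status code does not equal the message length, so such callers need to switch to the status comparison.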
This commit was SVN r30036.
This patch changes all recv/recv_buffer occurrences in the C/R code
to recv_nb/recv_buffer_nb.
The old code is still there but disabled using ifdefs (ENABLE_FT_FIXED).
The new code compiles but does not work.
Changes from V1:
* #ifdef out the code (so it is preserved for later re-design)
* marked the broken C/R code with ENABLE_FT_FIXED
Changes from V2:
* only #ifdef out the code where the behaviour is changed
(used to be blocking; now non-blocking)
This commit was SVN r30035.
Fix comm_spawn on a single host - with the new default mapping scheme, we were incorrectly computing the number of procs to put on the node.
Refs trac:4003
This commit was SVN r30033.
The following Trac tickets were found above:
Ticket 4003 --> https://svn.open-mpi.org/trac/ompi/ticket/4003
Thanks to Tim Miller for reporting the regression from the 1.6 series
cmr=v1.7.4:reviewer=jsquyres:subject=Ensure that comm_spawn'd procs get user-specified forwarded envars
This commit was SVN r30012.
Reset topology usage for each node as we bind, because multiple nodes may be linked to the same topology object. This will need to be revisited for scale, as resetting the usage on every iteration takes a non-zero amount of time. However, storing an individual topology object for every node consumes memory, so it's a tradeoff.
cmr=v1.7.4:reviewer=jsquyres:subject=Eliminate excessive binding/memory warnings
This commit was SVN r29978.