Also, per chat with Jeff, modified the Makefile.am's of a few orte tools so that they were consistent in the way we generate the ompi-equivalent cmds.
This commit was SVN r20165.
This commit fixes trac:1691
Thanks to Matthias Hovestadt for identifying this issue.
This commit was SVN r20114.
The following Trac tickets were found above:
Ticket 1691 --> https://svn.open-mpi.org/trac/ompi/ticket/1691
corrections to non-windows files (but within ifdef __WINDOWS__)
type casts, event library for windows use win32.
in orte runtime, add windows sockets handling and object construction.
This commit was SVN r20110.
changes to the already existing ccp components
event/win32.c: merge old FD handling into new
opal_installdirs_windows.c:fix the registry handling
This commit was SVN r20109.
Basically, the remaining problem turned out to be:
1. closing stdout/stderr during orte_finalize of mpirun
2. inadvertently setting up a write event on fd = -1
3. devising a scheme to more accurately track when the stdin write event was active vs closed so it only got released once
This passed prelim MTT testing by Jeff and Tim, but should soak for awhile before migrating to 1.3.
This commit was SVN r20106.
The following SVN revision numbers were found above:
r20064 --> open-mpi/ompi@a07660aea8
r20068 --> open-mpi/ompi@ec930d14a9
r20074 --> open-mpi/ompi@2940309613
1. minor modification to include two new opal MCA params:
(a) opal_profile: outputs what components were selected by each framework
currently enabled for most, but not all, frameworks
(b) opal_profile_file: name of file that contains profile info required
for modex
2. introduction of two new tools:
(a) ompi-probe: MPI process that simply calls MPI_Init/Finalize with
opal_profile set. Also reports back the rml IP address for all
interfaces on the node
(b) ompi-profiler: uses ompi-probe to create the profile_file, also
reports out a summary of what framework components are actually
being used to help with configuration options
3. modification of the grpcomm basic component to utilize the
profile file in place of the modex where possible
4. modification of orterun so it properly sees opal mca params and
handles opal_profile correctly to ensure we don't get its profile
5. similar mod to orted as for orterun
6. addition of new test that calls orte_init followed by calls to
grpcomm.barrier
This is all completely benign unless actively selected. At the moment, it only supports modex-less launch for openib-based systems. Minor mod to the TCP btl would be required to enable it as well, if people are interested. Similarly, anyone interested in enabling other BTL's for modex-less operation should let me know and I'll give you the magic details.
This seems to significantly improve scalability provided the file can be locally located on the nodes. I'm looking at an alternative means of disseminating the info (perhaps in launch message) as an option for removing that constraint.
This commit was SVN r20098.
and components from the home directory for platforms that are bad at
reading in files from home directory at scale (like Red Storm)
This commit was SVN r20069.
1. fix a bug in pml_ob1_recvreq/sendreq.c, buffer was made defined where the request has already been released.
2. complete memchecker support for collective functions.
3. change the wrongly spelled function name of memchecker, i.e. '*_isaddressible' should be '*_isaddressable'
This commit was SVN r20043.
because we wasted a good amount of time today assuming that it was
returning the actual netmask. Specifically, we were confused why it
returned 0x18 instead of 0xffffff00 for a class C subnet (the
head-smacking moment wasn't until [much] later when we converted 0x18
to decimal, which is 24. Then the Clue Light(tm) went on).
This commit was SVN r20002.
This happens with really long paths as part of the variable name.
Found in MTT testing (where the paths are long). This will need to be moved to v1.3
This commit was SVN r19989.
size. The correct way is to cast to an integer type that has the same length, and
then allow the compiler to upgrade to the read type.
This commit was SVN r19944.
parameters on the connecting side also. Also move define of IF_NAMESIZE
into if.h file. And lastly, add one verbose debug message which may be
useful if we run into other issues like this.
This commit fixes trac:1573.
This commit was SVN r19932.
The following Trac tickets were found above:
Ticket 1573 --> https://svn.open-mpi.org/trac/ompi/ticket/1573
the sign of the divident), we have to cast the pointer to an uintptr_t in
order to be able to correctly compute how to align it on the cache line.
Rported and solved by Stephan Kramer. Thanks Stephan.
This commit was SVN r19778.
Valgrind header files if they weren't already in the compiler's
default header file search path. This commit fixes that typo and adds
a little more infrastructure (via an AC_SUBST) to pass in the relevant
CPPFLAGS to the build system for the valgrind memchecker component.
This commit was SVN r19764.
There is still a problem with OpenIB and threads (external to C/R functionality). It has been reported in Ticket #1539
Additionally:
* Fix a file cleanup bug in CRS Base.
* Fix a possible deadlock in the TCP ft_event function
* Add a mca_base_param_deregister() function to MCA base
* Add whole process checkpoint timers
* Add support for BTL: OpenIB, MX, Shared Memory
* Add support Mpool: rdma, sm
* Sundry bounds checking an cleanup in some scattered functions
This commit was SVN r19756.
Fix a finalize bug with the C/R thread. There is a race in the way we were finalizing the C/R thread such that, given a particular interleaving of threads, the pthread_join would stall because the C/R thread was never released properly.
Found in MTT regression testing on Odin.
All and all if you were not compiling with FT & threading support you would never see this problem.
This commit was SVN r19708.
Refers to ticket #1548. Although this appears to fix the problem, the ticket will be held open pending further test prior to transition to the 1.3 branch.
This commit was SVN r19674.
base/paffinity_base_service.c:153: warning: 'phys_core' may be used uninitialized in this function
base/paffinity_base_service.c:153: note: 'phys_core' was declared here
This commit was SVN r19580.