openmpi/orte/mca
Josh Hursey 88aa45dd52 Commit to bring online OpenIB, MX, and shared memory support for Open MPI's checkpoint/restart functionality. Some tuning is still needed, but basic functionality is in place.
There is still a problem with OpenIB and threads (external to C/R functionality). It has been reported in Ticket #1539.

Additionally:
* Fix a file cleanup bug in CRS Base.
* Fix a possible deadlock in the TCP ft_event function
* Add a mca_base_param_deregister() function to MCA base
* Add whole process checkpoint timers
* Add support for BTL: OpenIB, MX, Shared Memory
* Add support for Mpool: rdma, sm
* Sundry bounds checking and cleanup in some scattered functions

This commit was SVN r19756.
2008-10-16 15:09:00 +00:00
..
errmgr - The intel compiler does not play nice with the 2008-08-08 16:26:09 +00:00
ess Modify the node_rank and local_rank fields to be uint16_t so we can handle more than 256 procs/node. Change the type to a defined one so that any future change can be easily done, if required. 2008-09-25 13:39:08 +00:00
filem Some more work on the man pages: 2008-08-07 19:20:40 +00:00
grpcomm Fixes trac:1392, #1400 2008-07-28 22:40:57 +00:00
iof Application processes should not open/close the IOF framework - there is nothing in that framework for application procs to do. 2008-08-22 01:28:19 +00:00
notifier - make sure that the system has the header files. 2008-08-25 13:56:10 +00:00
odls Revise the daemon collective system to handle comm_spawn patterns that cross into new nodes that are not direct children on the routing tree of the HNP. 2008-10-02 20:08:27 +00:00
oob Commit to bring online OpenIB, MX, and shared memory support for Open MPI's checkpoint/restart functionality. Some tuning is still needed, but basic functionality is in place. 2008-10-16 15:09:00 +00:00
plm Fix a problem in the plm "failed to start" code observed by Jeff. When we are unable to launch to a specific node because it doesn't exist or is down, the system would hang and/or segv. The reason for the hang was that we were "firing" the orted exit trigger prior to its timer event being defined - thus "locking" that one-shot and preventing it from firing when we actually were ready to use it. 2008-10-16 14:21:37 +00:00
ras Prettify the user level display of allocation and map to make it easier to see and understand 2008-09-28 16:44:09 +00:00
rmaps Correct a bug in the bookmarking code that incorrectly looked at #slots instead of #slots_allocated, thus causing slot reductions in hostfiles to be ignored when selecting our starting node. 2008-09-29 14:09:02 +00:00
rml Revise the daemon collective system to handle comm_spawn patterns that cross into new nodes that are not direct children on the routing tree of the HNP. 2008-10-02 20:08:27 +00:00
routed Let the HNP only update the routing tree if necessary. Enable some debug output 2008-10-03 13:41:08 +00:00
snapc Commit to bring online OpenIB, MX, and shared memory support for Open MPI's checkpoint/restart functionality. Some tuning is still needed, but basic functionality is in place. 2008-10-16 15:09:00 +00:00