openmpi/orte/mca
Ralph Castain 1ace83c470 Enable modex-less launch. Consists of:
1. minor modification to include two new opal MCA params:
   (a) opal_profile: outputs what components were selected by each framework
       currently enabled for most, but not all, frameworks
   (b) opal_profile_file: name of file that contains profile info required
       for modex

2. introduction of two new tools:
   (a) ompi-probe: MPI process that simply calls MPI_Init/Finalize with
       opal_profile set. Also reports back the rml IP address for all
       interfaces on the node
   (b) ompi-profiler: uses ompi-probe to create the profile_file, also
       reports out a summary of what framework components are actually
       being used to help with configuration options

3. modification of the grpcomm basic component to utilize the
   profile file in place of the modex where possible

4. modification of orterun so it properly sees opal mca params and
   handles opal_profile correctly to ensure we don't get its profile

5. similar mod to orted as for orterun

6. addition of new test that calls orte_init followed by calls to
   grpcomm.barrier

This is all completely benign unless actively selected. At the moment, it only supports modex-less launch for openib-based systems. A minor mod to the TCP BTL would be required to enable it there as well, if people are interested. Similarly, anyone interested in enabling other BTLs for modex-less operation should let me know and I'll give you the magic details.

This seems to significantly improve scalability, provided the file can be locally located on the nodes. I'm looking at an alternative means of disseminating the info (perhaps in the launch message) as an option for removing that constraint.

This commit was SVN r20098.
2008-12-09 23:49:02 +00:00
..
errmgr - The intel compiler does not play nice with the 2008-08-08 16:26:09 +00:00
ess Fix the ft_event function in response to r20022. Also make the structure cleanup match the finalize() function a bit more closely. 2008-12-02 21:18:32 +00:00
filem Some more work on the man pages: 2008-08-07 19:20:40 +00:00
grpcomm Enable modex-less launch. Consists of: 2008-12-09 23:49:02 +00:00
iof Revert r20074, r20068, and r20064: remove the IOF proc completion code pending further off-trunk work. 2008-12-09 17:11:59 +00:00
notifier Correct the notifier default module to include the new added API 2008-11-13 18:03:41 +00:00
odls Revert r20074, r20068, and r20064: remove the IOF proc completion code pending further off-trunk work. 2008-12-09 17:11:59 +00:00
oob Ensure we know how to route to a different job family when it connects to us 2008-11-03 14:25:14 +00:00
plm Revert r20074, r20068, and r20064: remove the IOF proc completion code pending further off-trunk work. 2008-12-09 17:11:59 +00:00
ras May as well have the other "clean" outputs use the same channel 2008-12-08 19:37:22 +00:00
rmaps May as well have the other "clean" outputs use the same channel 2008-12-08 19:37:22 +00:00
rml Enable modex-less launch. Consists of: 2008-12-09 23:49:02 +00:00
routed To support comm_spawn in fully routed environments, daemons need to know the route to all procs in their job family. They already had this information, but were not retaining it. The infrastructure to do so has existed for some time - just never had the time to complete it. 2008-11-18 15:35:50 +00:00
snapc fix some typos. should be moved to v1.3 2008-11-10 19:05:26 +00:00