Nathan Hjelm
211e2dbdf3
clean up tab characters
...
This commit was SVN r25413.
2011-11-02 15:07:57 +00:00
Ralph Castain
f00753881e
Handle the case where mpirun -is- of the same topology as the compute nodes.
...
This commit was SVN r25412.
2011-11-01 22:26:03 +00:00
Jeff Squyres
12d4280d0b
Fix a bunch of memory leaks
...
This commit was SVN r25411.
2011-11-01 20:22:49 +00:00
Jeff Squyres
4fe26b0392
Fix some minor memory leaks
...
This commit was SVN r25410.
2011-11-01 20:22:26 +00:00
Jeff Squyres
308b88e1a7
Make use of the opal_dss_initialized variable for finalize protection
...
This commit was SVN r25409.
2011-11-01 20:22:12 +00:00
Ralph Castain
d28dd55d33
Minimize the amount of topology info returned by the daemons. Most clusters, especially at scale, use the same node topology on every node, so there is no reason to return the topology from every daemon. Borrow a page from the --hetero-apps page and let users indicate that the node topology differs by adding a --hetero-nodes option to mpirun. If the option is set, then every daemon returns topology info. If not set, then only daemon vpid=1 returns it.
We always want one daemon to return the topology as the head node is often different from the compute nodes. Having one daemon return the compute node topology allows us to detect any such difference. All compute nodes are then set to the same topology.
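The reporting rule described above can be sketched as a single predicate. This is a minimal illustration only, not the code from this commit: the function name `should_report_topology` and its parameters are hypothetical, and the real daemon logic may differ.

```c
#include <stdbool.h>

/* Hypothetical helper (not an ORTE function): decide whether the daemon
 * with the given vpid should send its topology back to mpirun. */
static bool should_report_topology(unsigned int vpid, bool hetero_nodes)
{
    if (hetero_nodes) {
        /* --hetero-nodes was given: every daemon reports its topology. */
        return true;
    }
    /* Otherwise only daemon vpid=1 reports, standing in for all
     * compute nodes. */
    return 1 == vpid;
}
```

Under this sketch, a run without --hetero-nodes receives exactly one compute-node topology, which can then be compared against the head node's.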
This commit was SVN r25408.
2011-11-01 18:43:10 +00:00
Ralph Castain
14966e0f8f
Cleanup PMI startup - if a component isn't selected, it should finalize PMI IFF it started it. Otherwise, components that aren't selected can finalize PMI when it is in use by other parts of the system.
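As a rough sketch of the "finalize IFF it started it" rule: each component remembers whether it was the one that initialized PMI and only tears it down in that case. Everything here is an assumption for illustration. The `fake_pmi_*` functions stand in for the real PMI calls so the snippet compiles on its own, and `component_t`, `component_query`, and `component_not_selected` are hypothetical names, not the ones used in the tree.

```c
#include <stdbool.h>

/* Stand-ins for the real PMI library so the sketch is self-contained. */
static bool fake_pmi_running = false;
static void fake_pmi_init(void)     { fake_pmi_running = true; }
static void fake_pmi_finalize(void) { fake_pmi_running = false; }

/* Hypothetical per-component state. */
typedef struct {
    bool started_pmi;   /* true only if this component initialized PMI */
} component_t;

/* Called when the component is asked whether it can run. */
static void component_query(component_t *c)
{
    c->started_pmi = false;
    if (!fake_pmi_running) {
        fake_pmi_init();
        c->started_pmi = true;   /* remember that we own the init */
    }
}

/* Called when the component was not selected. */
static void component_not_selected(component_t *c)
{
    if (c->started_pmi) {        /* finalize only what we started */
        fake_pmi_finalize();
        c->started_pmi = false;
    }
}
```

With this ownership flag, an unselected component that found PMI already running leaves it untouched for whichever part of the system started it.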
...
This commit was SVN r25407.
2011-11-01 16:25:12 +00:00
Mike Dubman
3edd77ea25
update mxm plugin to mxm api change: pass synchronous request as an opcode instead of a flag
...
This commit was SVN r25403.
2011-10-31 22:36:15 +00:00
Ralph Castain
4368199c86
Missing include
...
This commit was SVN r25402.
2011-10-31 13:39:57 +00:00
Mike Dubman
6b50ba22a6
select mxm ptl based on user preferences
...
This commit was SVN r25401.
2011-10-31 10:17:43 +00:00
Ralph Castain
96332a2859
Fix typo
...
This commit was SVN r25400.
2011-10-30 13:23:42 +00:00
Ralph Castain
71ed8e3cd3
Bring back the local node's binding capabilities along with its topology. Clean up indentation.
...
This commit was SVN r25399.
2011-10-30 13:20:16 +00:00
Ralph Castain
d492b20975
Bozo check for topology info
...
This commit was SVN r25398.
2011-10-30 11:49:38 +00:00
Ralph Castain
4232115a98
Ensure pruning remains within the current job/app being mapped.
...
This commit was SVN r25397.
2011-10-30 00:02:20 +00:00
Ralph Castain
648c85b41b
Add a simple pattern mapper as an example of how to use the topology info to create desired mappings. Let the user specify a pattern based on resource types, and map that pattern across all available nodes as resources permit.
...
Don't automatically display the topology for each node when --display-devel-map is set as it can overwhelm the reader. Use a separate flag --display-topo to get it.
This commit was SVN r25396.
2011-10-29 15:12:45 +00:00
Ralph Castain
12a589130a
Add some debug
...
This commit was SVN r25395.
2011-10-29 15:07:58 +00:00
Ralph Castain
965b04d1a5
Use the new utilities to get a topology that reflects available cpus
...
This commit was SVN r25394.
2011-10-29 15:07:36 +00:00
Ralph Castain
7ba4675adf
Bring over some useful utilities and definitions for working with hwloc inside ORTE/OMPI. Cache frequently computed info to save processing time when handling multiple nodes with the same topology. Deal with available cpus as defined by online vs allowed vs user-specified limits. Help deal with hwloc's unfortunate decision to lump all caches in the same object type.
...
This commit was SVN r25393.
2011-10-29 14:58:58 +00:00
Jeff Squyres
6092b50ebb
Fix the cases where the default values of MCA params were not always
handled properly when MCA parameters are re-registered and their types
change. Specifically, this case was broken:
1. Register an int MCA param with a non-zero default value
2. Re-register the same MCA param as a string with a NULL default value
The 2nd step would cause a segv because the first int default value
wasn't being reset properly. Here's sample code that shows the issue:
{{{
{
    int ibogus;
    char *sbogus;
    opal_init(&argc, &argv);
    mca_base_param_reg_int_name("type", "name", "help", false, false, 3, &ibogus);
    printf("Ibogus: %d\n", ibogus);
    mca_base_param_reg_string_name("type", "name", "help", false, false, NULL, &sbogus);
    printf("Sbogus: %s\n", (NULL == sbogus) ? "NULL" : sbogus);
    exit(0);
}
}}}
This commit fixes the problem from the sample code above as well as
a similar issue for file-set MCA params and override values. It
also resets default values for MCA params initially registered as a
string but then re-registered as an int.
This commit was SVN r25392.
2011-10-29 12:29:31 +00:00
Jeff Squyres
b0bd0b3924
Don't use slashes in the date field ('/') because it'll confuse other
sed scripts in a build of that tarball (e.g., when substituting the
date into man pages).
This commit was SVN r25389.
2011-10-29 11:50:08 +00:00
Ralph Castain
e50bcbf028
Add the ability to specify a topology-containing xml file to describe the simulated nodes to support mapping tests against arbitrary topologies
...
This commit was SVN r25388.
2011-10-29 02:01:11 +00:00
Ralph Castain
21d45b0807
Just some cleanup in case of error
...
This commit was SVN r25387.
2011-10-29 01:55:19 +00:00
Ralph Castain
7fa5f82d70
Add simulator component to support testing of large scale mapping methods. Automatically sets do-not-resolve and do-not-launch, and creates however many nodes the user wants to simulate in the system.
...
This commit was SVN r25386.
2011-10-28 23:48:53 +00:00
Ralph Castain
691aa0a6dd
Set ignores
...
This commit was SVN r25385.
2011-10-28 23:47:33 +00:00
Jeff Squyres
6cfcfb95e4
Mark nightly tarballs through their release dates
...
This commit was SVN r25382.
2011-10-28 19:43:59 +00:00
Ralph Castain
e2eb8d5f78
Remove bad param registration - that param was already registered as an int_name in another location.
...
This commit was SVN r25381.
2011-10-28 19:14:43 +00:00
Josh Hursey
6726590b1c
Remove the 'ess_node_rank' accessor from here. This caused running under 'tm' to segv at the orteds.
...
It just looks like this part of the component was not updated during r25331. It was removed from the 'env' and 'slurm' environments in that patch. It looks like 'tm' was updated, but did not get this particular piece.
This commit was SVN r25380.
The following SVN revision numbers were found above:
r25331 --> open-mpi/ompi@b44f8d4b28
2011-10-28 17:41:35 +00:00
Josh Hursey
59ff1dbbfb
Fix indentation problem that caused a segv when running without regex.
...
This was introduced in r25063.
This commit was SVN r25379.
The following SVN revision numbers were found above:
r25063 --> open-mpi/ompi@e58623cd5b
2011-10-28 13:39:32 +00:00
Samuel Gutierrez
0ba13e2f8e
fix typo. use PMI_Initialized for init status instead of PMI_Init.
...
This commit was SVN r25378.
2011-10-27 22:41:50 +00:00
Samuel Gutierrez
922e41a318
fix typo. use PMI_Initialized for init status instead of PMI_Init.
...
This commit was SVN r25377.
2011-10-27 22:27:30 +00:00
Nathan Hjelm
ee087de073
added fast boxes to vader
...
This commit was SVN r25376.
2011-10-27 20:22:46 +00:00
Mike Dubman
f96ae43e23
pass jobid to mxm/sm module
...
This commit was SVN r25375.
2011-10-27 13:14:52 +00:00
Nathan Hjelm
82efe131dc
made btl_vader_max_inline_send a configurable parameter and updated and enabled sendi
...
This commit was SVN r25374.
2011-10-26 22:15:42 +00:00
Nathan Hjelm
033179d6ac
fixed bug in frag initialization
...
This commit was SVN r25373.
2011-10-26 19:29:37 +00:00
George Bosilca
6fdb040eef
ORTE_ERROR to OPAL_ERROR.
...
This commit was SVN r25372.
2011-10-26 15:59:43 +00:00
George Bosilca
9d8e84142f
Survivor!!!
...
This commit was SVN r25371.
2011-10-26 00:58:55 +00:00
Samuel Gutierrez
ae66347c7a
added GNI configure script.
...
This commit was SVN r25370.
2011-10-25 22:15:16 +00:00
Nathan Hjelm
05114ffb51
fixed off by one error
...
This commit was SVN r25369.
2011-10-25 22:07:47 +00:00
George Bosilca
72f731f25f
The SM2 collective component has not been updated in a long
time. Rich, the original developer, agrees with this removal.
This commit was SVN r25368.
2011-10-25 22:07:09 +00:00
Nathan Hjelm
e887d595c7
fix potential bug with non-contiguous sends
...
This commit was SVN r25367.
2011-10-25 19:21:45 +00:00
Ralph Castain
951d72692c
Reverse the #if direction so we report daemon failure to the errmgr - otherwise, we just hang if a daemon fails to start.
...
Reviewed with Josh.
This commit was SVN r25366.
2011-10-25 19:09:52 +00:00
Nathan Hjelm
433cfa3665
use single copy for some sends
...
This commit was SVN r25365.
2011-10-25 18:38:42 +00:00
Mike Dubman
9ffeeb69d9
fix help message
...
This commit was SVN r25364.
2011-10-25 14:02:43 +00:00
Samuel Gutierrez
c646c93eec
remove unneeded flags from cray xe6 platform file.
...
This commit was SVN r25363.
2011-10-24 18:42:43 +00:00
Samuel Gutierrez
663f4546f5
fix define typo in psm mtl.
...
This commit was SVN r25362.
2011-10-24 18:38:12 +00:00
Ralph Castain
c55cba55a7
Totally trivial spelling fix
...
This commit was SVN r25361.
2011-10-24 14:06:33 +00:00
Ralph Castain
a7cbc25658
Minor cleanups - check hwloc returns everywhere. Thanks to Chris Yeoh for pointing this out.
...
This commit was SVN r25360.
2011-10-24 14:05:26 +00:00
Ralph Castain
955d8e7d46
Allow apps to use pmi when launched by mpirun, if desired, without affecting daemons
...
This commit was SVN r25359.
2011-10-23 15:57:13 +00:00
Nathan Hjelm
e8af0d8589
don't use alps paffinity
...
This commit was SVN r25358.
2011-10-21 22:52:03 +00:00
Abhishek Kulkarni
46952e9008
Fix C/R functionality in trunk. Intra-node checkpointing of a job now works as expected.
...
Signed-off-by: Abhishek Kulkarni <adkulkar@osl.iu.edu>
This commit was SVN r25357.
2011-10-21 22:07:35 +00:00