Ralph Castain
648c85b41b
Add a simple pattern mapper as an example of how to use the topology info to create desired mappings. Let the user specify a pattern based on resource types, and map that pattern across all available nodes as resources permit.
Don't automatically display the topology for each node when --display-devel-map is set as it can overwhelm the reader. Use a separate flag --display-topo to get it.
This commit was SVN r25396.
2011-10-29 15:12:45 +00:00
Ralph Castain
12a589130a
Add some debug
This commit was SVN r25395.
2011-10-29 15:07:58 +00:00
Ralph Castain
965b04d1a5
Use the new utilities to get a topology that reflects available cpus
This commit was SVN r25394.
2011-10-29 15:07:36 +00:00
Ralph Castain
7ba4675adf
Bring over some useful utilities and definitions for working with hwloc inside ORTE/OMPI. Cache frequently computed info to save processing time when handling multiple nodes with the same topology. Deal with available cpus as defined by online vs allowed vs user-specified limits. Help deal with hwloc's unfortunate decision to lump all caches in the same object type.
This commit was SVN r25393.
2011-10-29 14:58:58 +00:00
Jeff Squyres
6092b50ebb
Fix the cases where the default values of MCA params were not always
handled properly when MCA parameters are re-registered and their types
change. Specifically, this case was broken:
1. Register an int MCA param with a non-zero default value
2. Re-register the same MCA param as a string with a NULL default value
The 2nd step would cause a segv because the first int default value
wasn't being reset properly. Here's sample code that shows the issue:
{{{
{
    int ibogus;
    char *sbogus;

    opal_init(&argc, &argv);
    mca_base_param_reg_int_name("type", "name", "help", false, false, 3, &ibogus);
    printf("Ibogus: %d\n", ibogus);
    mca_base_param_reg_string_name("type", "name", "help", false, false, NULL, &sbogus);
    printf("Sbogus: %s\n", (NULL == sbogus) ? "NULL" : sbogus);
    exit(0);
}
}}}
This commit fixes the problem from the sample code above as well as
a similar issue for file-set MCA params and override values. It
also resets default values for MCA params initially registered as a
string but then re-registered as an int.
This commit was SVN r25392.
2011-10-29 12:29:31 +00:00
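The kind of reset that r25392 describes can be illustrated with a small, self-contained sketch. This is not the Open MPI implementation: the `param_t` structure and the `param_release`, `param_set_int`, and `param_set_string` helpers are hypothetical names invented here. The only point is that a default value stored under one type should be released and overwritten before the same parameter is reused under another type.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type tag; not part of the MCA parameter API. */
typedef enum { PARAM_INT, PARAM_STRING } param_type_t;

/* Hypothetical parameter record holding one default value. */
typedef struct {
    param_type_t type;
    union {
        int i;
        char *s;
    } def;
} param_t;

/* Release anything the current default owns before it is replaced. */
static void param_release(param_t *p)
{
    if (PARAM_STRING == p->type) {
        free(p->def.s);
        p->def.s = NULL;
    }
}

/* (Re-)register the parameter as an int with the given default. */
static void param_set_int(param_t *p, int value)
{
    assert(NULL != p);
    param_release(p);
    p->type = PARAM_INT;
    p->def.i = value;
}

/* (Re-)register the parameter as a string; a NULL default is allowed.
 * Returns 0 on success, -1 if the copy could not be allocated. */
static int param_set_string(param_t *p, const char *value)
{
    char *copy = NULL;

    assert(NULL != p);
    if (NULL != value) {
        copy = malloc(strlen(value) + 1);
        if (NULL == copy) {
            return -1;
        }
        strcpy(copy, value);
    }
    param_release(p);
    p->type = PARAM_STRING;
    p->def.s = copy;   /* overwrites any stale int default */
    return 0;
}
```

Starting from `param_t p = { PARAM_INT, { 0 } }`, calling `param_set_int(&p, 3)` and then `param_set_string(&p, NULL)` leaves `p.def.s` as a real NULL pointer instead of the leftover integer 3 reinterpreted as a pointer.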
Jeff Squyres
b0bd0b3924
Don't use slashes in the date field ('/') because it'll confuse other
sed scripts in a build of that tarball (e.g., when substituting the
date into man pages).
This commit was SVN r25389.
2011-10-29 11:50:08 +00:00
Ralph Castain
e50bcbf028
Add the ability to specify a topology-containing xml file to describe the simulated nodes to support mapping tests against arbitrary topologies
This commit was SVN r25388.
2011-10-29 02:01:11 +00:00
Ralph Castain
21d45b0807
Just some cleanup in case of error
This commit was SVN r25387.
2011-10-29 01:55:19 +00:00
Ralph Castain
7fa5f82d70
Add simulator component to support testing of large scale mapping methods. Automatically sets do-not-resolve and do-not-launch, and creates however many nodes the user wants to simulate in the system.
This commit was SVN r25386.
2011-10-28 23:48:53 +00:00
Ralph Castain
691aa0a6dd
Set ignores
This commit was SVN r25385.
2011-10-28 23:47:33 +00:00
Jeff Squyres
6cfcfb95e4
Mark nightly tarballs through their release dates
This commit was SVN r25382.
2011-10-28 19:43:59 +00:00
Ralph Castain
e2eb8d5f78
Remove bad param registration - that param was already registered as an int_name in another location.
This commit was SVN r25381.
2011-10-28 19:14:43 +00:00
Josh Hursey
6726590b1c
Remove the 'ess_node_rank' accessor from here. This caused running under 'tm' to segv at the orteds.
It just looks like this part of the component was not updated during r25331. It was removed from the 'env' and 'slurm' environments in that patch. It looks like 'tm' was updated, but did not get this particular piece.
This commit was SVN r25380.
The following SVN revision numbers were found above:
r25331 --> open-mpi/ompi@b44f8d4b28
2011-10-28 17:41:35 +00:00
Josh Hursey
59ff1dbbfb
Fix indentation problem that caused a segv when running without regex.
This was introduced in r25063.
This commit was SVN r25379.
The following SVN revision numbers were found above:
r25063 --> open-mpi/ompi@e58623cd5b
2011-10-28 13:39:32 +00:00
Samuel Gutierrez
0ba13e2f8e
fix typo. use PMI_Initialized for init status instead of PMI_Init.
This commit was SVN r25378.
2011-10-27 22:41:50 +00:00
Samuel Gutierrez
922e41a318
fix typo. use PMI_Initialized for init status instead of PMI_Init.
This commit was SVN r25377.
2011-10-27 22:27:30 +00:00
Nathan Hjelm
ee087de073
added fast boxes to vader
This commit was SVN r25376.
2011-10-27 20:22:46 +00:00
Mike Dubman
f96ae43e23
pass jobid to mxm/sm module
This commit was SVN r25375.
2011-10-27 13:14:52 +00:00
Nathan Hjelm
82efe131dc
made btl_vader_max_inline_send a configurable parameter and updated and enabled sendi
This commit was SVN r25374.
2011-10-26 22:15:42 +00:00
Nathan Hjelm
033179d6ac
fixed bug in frag initialization
This commit was SVN r25373.
2011-10-26 19:29:37 +00:00
George Bosilca
6fdb040eef
ORTE_ERROR to OPAL_ERROR.
This commit was SVN r25372.
2011-10-26 15:59:43 +00:00
George Bosilca
9d8e84142f
Survivor!!!
This commit was SVN r25371.
2011-10-26 00:58:55 +00:00
Samuel Gutierrez
ae66347c7a
added GNI configure script.
This commit was SVN r25370.
2011-10-25 22:15:16 +00:00
Nathan Hjelm
05114ffb51
fixed off by one error
This commit was SVN r25369.
2011-10-25 22:07:47 +00:00
George Bosilca
72f731f25f
The SM2 collective component has not been updated in a long
time. Rich, the original developer, agrees with this removal.
This commit was SVN r25368.
2011-10-25 22:07:09 +00:00
Nathan Hjelm
e887d595c7
fix potential bug with non-contiguous sends
This commit was SVN r25367.
2011-10-25 19:21:45 +00:00
Ralph Castain
951d72692c
Reverse the #if direction so we report daemon failure to the errmgr - otherwise, we just hang if a daemon fails to start.
Reviewed with Josh.
This commit was SVN r25366.
2011-10-25 19:09:52 +00:00
Nathan Hjelm
433cfa3665
use single copy for some sends
This commit was SVN r25365.
2011-10-25 18:38:42 +00:00
Mike Dubman
9ffeeb69d9
fix help message
This commit was SVN r25364.
2011-10-25 14:02:43 +00:00
Samuel Gutierrez
c646c93eec
remove unneeded flags from cray xe6 platform file.
This commit was SVN r25363.
2011-10-24 18:42:43 +00:00
Samuel Gutierrez
663f4546f5
fix define typo in psm mtl.
This commit was SVN r25362.
2011-10-24 18:38:12 +00:00
Ralph Castain
c55cba55a7
Totally trivial spelling fix
This commit was SVN r25361.
2011-10-24 14:06:33 +00:00
Ralph Castain
a7cbc25658
Minor cleanups - check hwloc returns everywhere. Thanks to Chris Yeoh for pointing this out.
This commit was SVN r25360.
2011-10-24 14:05:26 +00:00
Ralph Castain
955d8e7d46
Allow apps to use pmi when launched by mpirun, if desired, without affecting daemons
This commit was SVN r25359.
2011-10-23 15:57:13 +00:00
Nathan Hjelm
e8af0d8589
don't use alps paffinity
This commit was SVN r25358.
2011-10-21 22:52:03 +00:00
Abhishek Kulkarni
46952e9008
Fix C/R functionality in trunk. Intra-node checkpointing of a job now works as expected.
Signed-off-by: Abhishek Kulkarni <adkulkar@osl.iu.edu>
This commit was SVN r25357.
2011-10-21 22:07:35 +00:00
Samuel Gutierrez
949364d2d6
update LANL Cray XE6 platform files to include PMI support.
This commit was SVN r25356.
2011-10-21 21:05:23 +00:00
Nathan Hjelm
7b1172b346
need a terminating character in the decoded string
This commit was SVN r25355.
2011-10-21 16:46:28 +00:00
Nathan Hjelm
fb19f56965
Cray doesn't define PMI2_SUCCESS
This commit was SVN r25354.
2011-10-21 16:34:22 +00:00
Nathan Hjelm
cd257ac707
fixed typo in pmi grpcomm
This commit was SVN r25353.
2011-10-21 16:28:36 +00:00
Nathan Hjelm
cd68dbe2b8
only try to build vader if xpmem is installed. unignore vader
This commit was SVN r25352.
2011-10-21 15:45:05 +00:00
Shiqing Fan
5711414eb7
Fix Windows build
This commit was SVN r25351.
2011-10-21 14:46:58 +00:00
Ralph Castain
53ef085567
Fix a minor issue seen by Jeff in specific failure pathway
This commit was SVN r25350.
2011-10-21 14:44:48 +00:00
Jeff Squyres
cbafea8f69
Add a DEPENDENCIES line so that if you edit something down in the
hwloc tree, it'll get picked up by the component (and therefore by
libopen-pal).
Thanks to Terry for finding the problem.
This commit was SVN r25349.
2011-10-21 11:39:52 +00:00
Ralph Castain
3e72fccacf
Cray's PMI implementation is quite different from slurm's - they extended PMI-1 by adding some, but not all, of the PMI-2 APIs. So you can't just switch to using PMI-2 functions as it isn't a complete implementation. Instead, you have to selectively figure out which ones they have in PMI-2, and use any missing ones from PMI-1. What fun.
Modify the configure logic and the PMI components to accommodate Cray's approach. Refactor the PMI error reporting code so it resides in only one place. Cray actually decided -not- to define the PMI-2 error codes, so we have to use the PMI-1 codes instead. More fun.
This commit was SVN r25348.
2011-10-21 04:54:38 +00:00
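The selection strategy described in r25348 above (use a PMI-2 call where the implementation provides one, otherwise fall back to the PMI-1 equivalent) can be sketched roughly as follows. This is a minimal illustration, not the code from the commit. The commit does the selection in configure logic, whereas this sketch uses a run-time function pointer, and every name in it (`kvs_put_fn`, `fake_pmi1_put`, `fake_pmi2_put`, `select_put`) is a hypothetical stand-in rather than part of any real PMI library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signature for a key/value "put" call; NOT the real PMI API. */
typedef int (*kvs_put_fn)(const char *key, const char *value);

/* Stand-in for a PMI-1 style implementation. It returns 1 so a caller can
 * tell which version was selected. */
static int fake_pmi1_put(const char *key, const char *value)
{
    (void)key;
    (void)value;
    return 1;
}

/* Stand-in for a PMI-2 style implementation. It returns 2. */
static int fake_pmi2_put(const char *key, const char *value)
{
    (void)key;
    (void)value;
    return 2;
}

/* Prefer the PMI-2 entry point when it exists; otherwise fall back to the
 * PMI-1 one. A NULL pmi2 pointer models "the vendor did not ship this
 * PMI-2 function". The PMI-1 fallback must always be supplied. */
static kvs_put_fn select_put(kvs_put_fn pmi2, kvs_put_fn pmi1)
{
    assert(NULL != pmi1);
    return (NULL != pmi2) ? pmi2 : pmi1;
}
```

With these stand-ins, `select_put(NULL, fake_pmi1_put)` yields the PMI-1 function and `select_put(fake_pmi2_put, fake_pmi1_put)` yields the PMI-2 one. Making the choice per function rather than per library version is what accommodates a partial PMI-2 implementation.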
Ralph Castain
e2adc8fa3a
Ignore until Nathan can fix - probably configure problem
This commit was SVN r25347.
2011-10-21 03:43:01 +00:00
Ralph Castain
5947f61b86
Remove windows reference for now
This commit was SVN r25346.
2011-10-21 01:19:03 +00:00
Nathan Hjelm
414677a082
default to no xpmem support
This commit was SVN r25345.
2011-10-20 22:13:45 +00:00
Nathan Hjelm
ce29170968
update lanl xe6 platform files for vader
This commit was SVN r25344.
2011-10-20 21:50:53 +00:00
Nathan Hjelm
808a73a5c5
removed erroneous add of .deps
This commit was SVN r25343.
2011-10-20 21:41:51 +00:00