Terry Dontje
13be2d2a00
Correct a typo in the odls call to orte_show_help: "odle" should be "odls"
...
This commit was SVN r21979.
2009-09-21 13:22:37 +00:00
Ralph Castain
7138fd131f
Final cleanup on new paffinity "if-avail" messages, plus fix one bug reported by Terry
...
This commit was SVN r21978.
2009-09-19 17:43:21 +00:00
Ralph Castain
2028017554
Modify the paffinity system to handle binding directives that are "soft" - i.e., when someone directs that we bind if the system supports it. This allows community members to distribute OMPI with default MCA param files that direct general binding policies, without having the distributed software fail if the system cannot support those policies.
...
The new options work by adding an ":if-avail" qualifier to the "bind-to-socket" and "bind-to-core" MCA params. If the system does not support this capability, the job will launch anyway. Without the qualifier, the job will abort with an error message indicating that the required functionality is not supported on this system.
This commit was SVN r21975.
2009-09-18 19:48:42 +00:00
Ralph Castain
98a4450df6
Fix the seq mapper by initializing the proc object to NULL before claiming a slot for it
...
This commit was SVN r21969.
2009-09-17 05:18:37 +00:00
Ralph Castain
ae31af7dec
Enable monitoring if configured to do so. Update the sensor framework
...
This commit was SVN r21964.
2009-09-09 21:00:27 +00:00
Ralph Castain
5fb3d13c24
Cleanup some pointer array addressing
...
This commit was SVN r21963.
2009-09-09 20:59:17 +00:00
Ralph Castain
e554fc282d
Add some diagnostic output when daemons die
...
This commit was SVN r21960.
2009-09-09 18:16:50 +00:00
Ralph Castain
c20d977a30
Report the allocate event, if requested
...
This commit was SVN r21959.
2009-09-09 17:47:58 +00:00
Ralph Castain
2688ad2c9f
Ensure the odls_types are included when referencing the APIs
...
This commit was SVN r21958.
2009-09-09 17:47:13 +00:00
Ralph Castain
cb7f608006
Remove debug output
...
This commit was SVN r21957.
2009-09-09 17:46:28 +00:00
Ralph Castain
51b13b3d5c
A few minor cleanups in where threads are unlocked.
...
Reset mpirun's exit code when we restart failed procs
This commit was SVN r21955.
2009-09-09 05:31:06 +00:00
Ralph Castain
8ae4b55d16
Enable a new command line option, --report-events, that instructs mpirun to RML-report specific events during the job's life to the requestor.
...
This commit was SVN r21954.
2009-09-09 05:28:45 +00:00
Ralph Castain
c877b1a5f8
Silence a compiler warning about no format
...
This commit was SVN r21951.
2009-09-08 15:03:14 +00:00
Ralph Castain
81b8bc5b54
Silence a compiler warning about no format
...
This commit was SVN r21950.
2009-09-08 15:02:48 +00:00
Ralph Castain
142036f2c0
Issue an error message and abort if the user requests a number of processes that conflicts with nperxxx directives when evaluated against available resources
...
This commit was SVN r21949.
2009-09-07 03:36:10 +00:00
Ralph Castain
ca09e8f604
Minor modification required to allow opal_paffinity_alone to default to bind-to-core
...
This commit was SVN r21948.
2009-09-05 15:24:26 +00:00
Ralph Castain
17444243f7
Correct the bit mask to properly set the binding policy
...
This commit was SVN r21934.
2009-09-03 17:58:23 +00:00
Jeff Squyres
e1fe03ad44
Minor grammar fixes, and use "#" for separating lines, not blank lines.
...
This commit was SVN r21931.
2009-09-03 07:02:21 +00:00
Ralph Castain
0421a49844
Update the xml support to allow -xml-file foo whereby we redirect all xml formatted output (and ONLY xml formatted output) to a specified file
...
This commit was SVN r21930.
2009-09-02 18:03:10 +00:00
Ralph Castain
d3d34f8f15
Correct a bug in the assignment of node index value. Ensure we set the app number so that MPI attributes get set correctly.
...
This commit was SVN r21927.
2009-09-02 01:15:44 +00:00
Ralph Castain
50ca27c1c8
Ensure that procs launched natively by slurm do not mistakenly identify themselves as daemons to the system
...
This commit was SVN r21926.
2009-09-01 17:57:15 +00:00
Lenny Verkhovsky
2a594fec6c
Added a help message to the rankfile mapper for failures caused by using an alias instead of the full hostname
...
This commit was SVN r21919.
2009-09-01 11:17:32 +00:00
Ralph Castain
59645c5c8e
Per direction from the slurm team, change the envar we look at to get our allocation
...
This commit was SVN r21915.
2009-08-30 15:57:27 +00:00
Ralph Castain
0394a4884d
Set up cpus-per-proc and cpus-per-rank as synonyms, both in mca params and on mpirun cmd line
...
This commit was SVN r21914.
2009-08-30 14:30:36 +00:00
Ralph Castain
ef4cdeeb69
Fix round-robin mapping when bind-to-socket in cases where #procs > #sockets and #cores
...
This commit was SVN r21913.
2009-08-29 03:36:21 +00:00
Ralph Castain
433673c64f
Report bindings in all cases, including external bindings and slot lists
...
This commit was SVN r21911.
2009-08-28 13:58:46 +00:00
Ralph Castain
3ef028ca23
Trap mpirun error messages in xml format
...
This commit was SVN r21910.
2009-08-28 02:46:15 +00:00
Ralph Castain
59f08dd2ff
Support the combination of npersocket and bind-to-core
...
This commit was SVN r21909.
2009-08-28 02:31:26 +00:00
Shiqing Fan
fb777134cf
Adjust the command string length.
...
This commit was SVN r21905.
2009-08-27 13:42:55 +00:00
Ralph Castain
2d27bc9824
Default npersocket to bind-to-socket unless otherwise directed
...
This commit was SVN r21904.
2009-08-27 13:21:14 +00:00
Shiqing Fan
bdb45b25c6
Skip a few more signal events on Windows for orted, because signal events don't work with select() on Windows.
...
This commit was SVN r21903.
2009-08-27 13:17:13 +00:00
Shiqing Fan
ffd55631bc
Deal with the case when the prefix is NULL.
...
This commit was SVN r21902.
2009-08-27 13:11:18 +00:00
Ralph Castain
01ba0eaa47
Correctly handle npersocket corner cases
...
This commit was SVN r21901.
2009-08-27 11:25:48 +00:00
Ralph Castain
12d6595c41
Do not deprecate the rankfile mca param.
...
This commit was SVN r21900.
2009-08-27 10:40:51 +00:00
Shiqing Fan
4119497c5a
Use select() for Windows events by default.
...
For historical reasons, we didn't use select() for Windows events. It can now be merged back in for better performance on Windows.
This commit was SVN r21899.
2009-08-27 08:11:56 +00:00
Shiqing Fan
6e57772ebd
Remove a few redundant security functions, and use USERDOMAIN environment variable for the domain name.
...
This commit was SVN r21898.
2009-08-27 08:00:49 +00:00
Shiqing Fan
1b6db85988
Complete the support for building on UNC paths.
...
This commit was SVN r21897.
2009-08-27 07:57:26 +00:00
Ralph Castain
5e710928a5
Revise the new binding system slightly:
...
1. finalize the logic for properly respecting externally assigned bindings. Thanks to Chris Samuel for his help with this. Still needs some acid testing, but appears to now work.
2. remove the double-logic of requiring opal_paffinity_alone AND bind-to-foo. If the user specifies bind-to-foo, trust her and just do it.
This commit was SVN r21885.
2009-08-26 02:01:49 +00:00
Ralph Castain
2016a3180b
Silence compiler warnings about uninitialized variables
...
This commit was SVN r21883.
2009-08-26 01:56:39 +00:00
Ralph Castain
9ad33a4688
Silence compiler warning about uninitialized variable
...
This commit was SVN r21882.
2009-08-26 01:56:11 +00:00
Ralph Castain
0e528e994f
Revert last commit - went to wrong repo!
...
Didn't we just have that happen the other day too? :-)
This commit was SVN r21878.
2009-08-25 13:06:14 +00:00
Ralph Castain
15d12b240b
Sync to r21876
...
This commit was SVN r21877.
The following SVN revision numbers were found above:
r21876 --> open-mpi/ompi@ef970293f0
2009-08-25 13:04:12 +00:00
Ralph Castain
0bd12e99ff
Fix typo - we want to detect bindings, not set them in the mask.
...
Thanks to Chris Samuel for finding it!
This commit was SVN r21875.
2009-08-25 11:53:05 +00:00
Ralph Castain
509cc0553c
When directly launched by an RM, flag that a process is operating without daemons - i.e., standalone. Provide an error string for the new socket_not_available error. Use errmgr.abort to exit when we cannot get a socket, and ensure that the slurmd module returns the proper exit status for slurm 2.0
...
This commit was SVN r21868.
2009-08-22 02:58:20 +00:00
Ralph Castain
7370235c3e
Create a more specific error code for when specific sockets are not available. Ensure that slurm 2.0 gets the expected error return if the process can't start for that reason so it can take corrective action.
...
This commit was SVN r21867.
2009-08-21 21:28:15 +00:00
Ralph Castain
7183179f56
Provide native integration with SLURM 2.0's OMPI support
...
This commit was SVN r21865.
2009-08-21 18:03:34 +00:00
Ralph Castain
35f8b68de6
Note to self: save all changes before committing
...
This commit was SVN r21863.
2009-08-21 12:54:29 +00:00
Ralph Castain
535408d6c2
Answer a Jeff-ism and check malloc for NULL return - for all xml formatting errors, revert to at least showing the non-xml formatted message
...
This commit was SVN r21862.
2009-08-21 12:41:54 +00:00
Shiqing Fan
baa81a6525
Add a missing header to compile on Windows.
...
This commit was SVN r21861.
2009-08-21 07:20:21 +00:00
Ralph Castain
2e0bd04755
Ensure that show_help messages are properly xml formatted
...
This commit was SVN r21858.
2009-08-20 19:23:26 +00:00