Ralph Castain
afb0db5b6f
Okay, Jeff - just for you...flow the show help thru the orte functions so help messages will be aggregated
This commit was SVN r28007.
2013-02-01 00:35:48 +00:00
Ralph Castain
53e0ed71b0
Disqualify slurm module even if slurm support was configured into the build if we don't have an allocation and haven't enabled dynamic allocations
This commit was SVN r27995.
2013-01-31 18:15:47 +00:00
Ralph Castain
166f512924
Add some useful debug to the heartbeat sensor
This commit was SVN r27994.
2013-01-31 18:01:13 +00:00
Ralph Castain
ca9605773b
If sensors are enabled, then the daemons need to have their proc->node field linked to their local node object
This commit was SVN r27991.
2013-01-31 16:38:57 +00:00
Ralph Castain
c87fa68f9b
Cleanup the resource usage sensor, letting the db handle any printing requests.
This commit was SVN r27990.
2013-01-31 15:20:56 +00:00
Ralph Castain
9625757a71
Add new database component for printing "add_log" info
This commit was SVN r27989.
2013-01-31 15:19:39 +00:00
Ralph Castain
8e8e95ca6b
Silence error report - just because someone only defines ipv4 static ports doesn't make a fatal error
This commit was SVN r27976.
2013-01-29 23:48:22 +00:00
Jeff Squyres
8e25b927ab
Clean some minor warnings: remove variables that were set but never used.
This commit was SVN r27974.
2013-01-29 23:35:42 +00:00
Ralph Castain
112f8eedb1
Handle the case where rankfile is providing the allocation
This commit was SVN r27971.
2013-01-29 20:37:58 +00:00
Nathan Hjelm
666bd826dc
fix alps configury
This commit was SVN r27962.
2013-01-29 15:44:30 +00:00
Brian Barrett
b8442ba505
Revamp the handling of wrapper compiler flags. The user flags, main configure
flags, and mca flags are kept separate until the very end. The main configure
wrapper flags should now be modified by using the OPAL_WRAPPER_FLAGS_ADD
macro. MCA components should either let <framework>_<component>_{LIBS,LDFLAGS}
be copied over OR set <framework>_<component>_WRAPPER_EXTRA_{LIBS,LDFLAGS}.
The situations in which WRAPPER CPPFLAGS can be set by MCA components were
made very small to match the one use case where it makes sense.
This commit was SVN r27950.
2013-01-29 00:00:43 +00:00
Ralph Castain
cfaefb3286
Remove the only place where PMI was used outside a component, and relocate that code to common/pmi.
This commit was SVN r27944.
2013-01-28 20:14:51 +00:00
Brian Barrett
f42783ae1a
Move the RTE framework change into the trunk. With this change, all non-CR
runtime code goes through one of the rte, dpm, or pubsub frameworks.
This commit was SVN r27934.
2013-01-27 23:25:10 +00:00
Ralph Castain
6eaf601ae6
Good ol' Cray changed the way node/cpu allocation is handled in their latest release of ALPS, and so our allocator is broken. Adjust for the revised method, but preserve the older method for those Cray users who have not updated their system.
cmr:v1.7
This commit was SVN r27911.
2013-01-25 21:53:31 +00:00
Ralph Castain
f6b4db0b79
Fix rank_file operations. We changed the syntax to use semi-colons between multiple slot assignments so that we could use the comma to separate specific cores, but somehow the flex definitions didn't get updated to accept that character. We also incorrectly zeroed the bitmap between slot assignment sections, so with multiple slot assignments only the last one in the list took effect.
This commit was SVN r27908.
2013-01-25 18:33:25 +00:00
Ralph Castain
2504da1ac9
Remove stale code - message arrival time doesn't really mean much anymore.
This commit was SVN r27905.
2013-01-24 23:02:02 +00:00
Brian Barrett
0e799a93c3
Automake will ship the .in file whether or not the conditional is taken,
so don't install orte_wrapper_script when it's not used
This commit was SVN r27902.
2013-01-24 21:36:25 +00:00
Ralph Castain
9bfb2b989b
Silence warning
This commit was SVN r27901.
2013-01-24 19:38:51 +00:00
Ralph Castain
4b310473a1
Correct the computation of the daemon vpid
cmr:v1.7
This commit was SVN r27899.
2013-01-24 18:04:53 +00:00
Ralph Castain
b403ca5bd8
Silence warning
This commit was SVN r27897.
2013-01-23 22:17:08 +00:00
Ralph Castain
4d34d30a97
Silence warning
This commit was SVN r27896.
2013-01-23 22:16:48 +00:00
Ralph Castain
6e2cabb87f
Remove duplicate code
This commit was SVN r27889.
2013-01-23 02:07:06 +00:00
Ralph Castain
a591fbf06f
Add initial support for dynamic allocations. At this time, only Slurm supports the new capability, which will be included in an upcoming release.
Add hooks for supporting dynamic allocation and deallocation to support application-driven requests and fault recovery operations.
This commit was SVN r27879.
2013-01-20 00:33:42 +00:00
Ralph Castain
e4673f3283
Add new job state
This commit was SVN r27878.
2013-01-20 00:30:27 +00:00
Ralph Castain
7aa80b984d
Add new test program
This commit was SVN r27877.
2013-01-20 00:29:45 +00:00
Ralph Castain
7102d7c5f7
Ick - brain is fried. Take that test out as it isn't needed on a regular basis
This commit was SVN r27875.
2013-01-19 14:48:31 +00:00
Ralph Castain
38786457cb
Add new test
This commit was SVN r27874.
2013-01-19 14:46:23 +00:00
Ralph Castain
73387e50e2
Add missing variable def - thanks to Paul Hargrove for spotting.
This commit was SVN r27865.
2013-01-18 14:32:53 +00:00
George Bosilca
e69dc00460
Don't duplicate headers or global variables.
This commit was SVN r27864.
2013-01-18 11:51:56 +00:00
Ralph Castain
c96cc2d5a0
In order to properly connect to debuggers like STAT, we need to get the hostname in its unstripped version for the MPIR_proctab. Unfortunately, we need a stripped version for Cray's alps launcher. So when we are stripping the hostname prefix, retain alias hostnames and add the ability to specify an alias to use in the proctab.
This commit was SVN r27863.
2013-01-18 05:00:05 +00:00
Ralph Castain
54266837e9
Remove use of param_find function as that function will be disappearing
This commit was SVN r27831.
2013-01-15 19:50:38 +00:00
Ralph Castain
5b8de0b9f4
Ouch - opal_progress calls event_loop with a NO_BLOCK flag. So when run without progress threads, the ORTE tools were not blocking in the event lib as they should be. Avoid calling opal_progress inside ORTE by directly using the event_loop call instead of ORTE_WAIT_FOR_COMPLETION as parts of the OMPI layer are using that macro.
Thanks to George for spotting the problem.
This commit was SVN r27815.
2013-01-14 23:06:42 +00:00
Ralph Castain
97ee683275
Track parentage of procs
This commit was SVN r27758.
2013-01-08 04:41:12 +00:00
Ralph Castain
aea6787918
Add new routed component with self-healing connections - based on radix component - for use in monitoring system
This commit was SVN r27757.
2013-01-08 04:40:35 +00:00
Ralph Castain
c9a596b487
Remove unused var
This commit was SVN r27756.
2013-01-08 04:39:30 +00:00
Ralph Castain
beddf3b379
Add required rml tag
This commit was SVN r27751.
2013-01-05 06:32:20 +00:00
Ralph Castain
bee8bf5d8f
Update the sensor framework to report stats back to the HNP if requested by including the data in heartbeats.
This commit was SVN r27748.
2013-01-05 06:30:20 +00:00
Ralph Castain
c71e119bbb
Extend the db framework to add support for logging data to databases without duplicating all the modex-related storage.
This commit was SVN r27746.
2013-01-05 06:28:09 +00:00
George Bosilca
34eecb8956
Be more explicit about the operation (store or update). Complain loudly
if something goes wrong.
This commit was SVN r27743.
2013-01-04 20:47:25 +00:00
Ralph Castain
cc29f8ff95
Attempt to fix the stupid Cray PMI problem
This commit was SVN r27742.
2013-01-04 02:53:42 +00:00
Nathan Hjelm
6a9ab9b221
Change orte_startup_timeout to be in seconds and remove the 10-second maximum
This commit was SVN r27741.
2013-01-03 23:56:34 +00:00
Ralph Castain
81a8e21939
Need to have the event thread running during init/finalize, but we still have a problem with cleanup - so comment out the event_base_free for now.
This commit was SVN r27738.
2013-01-03 02:16:57 +00:00
Ralph Castain
c65de32218
Cleanup the PMI subsystems to support Sam's "rml-less" shared memory wireup. Only retrieve keys that are specifically requested, and only when they are requested. Let string values be segmented across multiple keys, but don't do it for anything else.
This commit was SVN r27737.
2013-01-03 02:16:10 +00:00
Ralph Castain
c1690f403e
Remove non-existent file
This commit was SVN r27730.
2012-12-29 02:21:50 +00:00
Ralph Castain
68329b516c
Cleanup stale test codes
This commit was SVN r27729.
2012-12-28 16:52:51 +00:00
Ralph Castain
d1163ebbf2
Ensure we cleanup DFS worker threads during finalize to avoid segfaulting in MCA param cleanup
This commit was SVN r27723.
2012-12-25 21:17:35 +00:00
Ralph Castain
64da742d5f
Remove the orte_finalize_event variable - no longer needed
This commit was SVN r27722.
2012-12-25 19:33:20 +00:00
Ralph Castain
cada035f38
Fix the segfault problem in the orteds - turns out it only occurred with progress threads enabled. Ensure the thread gets started at the right time (at the end of init), although the event base gets created earlier. Remove the finalize event as we can instead use the loopbreak call to exit the event loop.
This commit was SVN r27721.
2012-12-25 19:30:18 +00:00
Ralph Castain
c8e34813b6
THIS IS A TEMPORARY FIX - do not finalize opal as the parameter system has been broken and will segfault when finalized.
THIS PATCH MUST BE REMOVED WHEN THE PARAMETER SYSTEM HAS BEEN FIXED.
This commit was SVN r27720.
2012-12-24 18:42:19 +00:00
Ralph Castain
72bea688f1
Fix typo
This commit was SVN r27717.
2012-12-23 18:13:39 +00:00