openmpi

Author	SHA1	Message	Date
Ralph Castain	039b7acfb5	Fix the quoting algorithm so only rsh command lines get quoted values cmr=v1.8.2:reviewer=jsquyres This commit was SVN r32586.	2014-08-22 22:47:38 +00:00
Ralph Castain	aec5cd08bd	Per the PMIx RFC: WHAT: Merge the PMIx branch into the devel repo, creating a new OPAL “pmix” framework to abstract PMI support for all RTEs. Replace the ORTE daemon-level collectives with a new PMIx server and update the ORTE grpcomm framework to support server-to-server collectives WHY: We’ve had problems dealing with variations in PMI implementations, and need to extend the existing PMI definitions to meet exascale requirements. WHEN: Mon, Aug 25 WHERE: https://github.com/rhc54/ompi-svn-mirror.git Several community members have been working on a refactoring of the current PMI support within OMPI. Although the APIs are common, Slurm and Cray implement a different range of capabilities, and package them differently. For example, Cray provides an integrated PMI-1/2 library, while Slurm separates the two and requires the user to specify the one to be used at runtime. In addition, several bugs in the Slurm implementations have caused problems requiring extra coding. All this has led to a slew of #if’s in the PMI code and bugs when the corner-case logic for one implementation accidentally traps the other. Extending this support to other implementations would have increased this complexity to an unacceptable level. Accordingly, we have: * created a new OPAL “pmix” framework to abstract the PMI support, with separate components for Cray, Slurm PMI-1, and Slurm PMI-2 implementations. * Replaced the current ORTE grpcomm daemon-based collective operation with an integrated PMIx server, and updated the grpcomm APIs to provide more flexible, multi-algorithm support for collective operations. At this time, only the xcast and allgather operations are supported. * Replaced the current global collective id with a signature based on the names of the participating procs. 
This allows an unlimited number of collectives to be executed by any group of processes, subject to the requirement that only one collective can be active at a time for a unique combination of procs. Note that a proc can be involved in any number of simultaneous collectives - it is the specific combination of procs that is subject to the constraint * removed the prior OMPI/OPAL modex code * added new macros for executing modex send/recv to simplify use of the new APIs. The send macros allow the caller to specify whether or not the BTL supports async modex operations - if so, then the non-blocking “fence” operation is used, if the active PMIx component supports it. Otherwise, the default is a full blocking modex exchange as we currently perform. * retained the current flag that directs us to use a blocking fence operation, but only to retrieve data upon demand This commit was SVN r32570.	2014-08-21 18:56:47 +00:00
Jeff Squyres	1551339eba	rsh: revert part of r32517: keep the quoting As part of reviewing CMR #4860, I talked through r32517 with Ralph. In an attempt to fix various rsh quoting problems, r32517 removed all the quoting from the main code path and then only added it back in at the end in some cases. This commit puts back the quoting parts that were removed in r32517 (r32517 fixed 2 other important bugs: a) change "--<foo>" to "--mca <foo_equivalent> 1" so that de-duplication works, and b) change a != to ==). refs trac:4860 This commit was SVN r32524. The following SVN revision numbers were found above: r32517 --> open-mpi/ompi@7342bce58f The following Trac tickets were found above: Ticket 4860 --> https://svn.open-mpi.org/trac/ompi/ticket/4860	2014-08-13 19:27:10 +00:00
Ralph Castain	7342bce58f	Cleanup the over-aggressive quoting of params on the orted cmd line. Remove duplicates caused by passing on both cmd line shortcuts and the mca param version of the same thing. Fixes trac:4857 cmr=v1.8.2:reviewer=jsquyres This commit was SVN r32517. The following Trac tickets were found above: Ticket 4857 --> https://svn.open-mpi.org/trac/ompi/ticket/4857	2014-08-13 03:51:04 +00:00
George Bosilca	de7191132d	Remove a few warnings. This commit was SVN r32506.	2014-08-11 13:34:44 +00:00
Ralph Castain	0cad281a92	Single-word cmd line values for orted are dealt with in orte_plm_base_orted_append_basic_args, so protect against special characters there. Have the rsh module only deal with multi-word arguments as those were skipped by orte_plm_base_orted_append_basic_args. Refs trac:4802 This commit was SVN r32293. The following Trac tickets were found above: Ticket 4802 --> https://svn.open-mpi.org/trac/ompi/ticket/4802	2014-07-23 17:06:51 +00:00
Ralph Castain	a94a97bd50	Cleanup the passing of MCA params on the orted cmd line in ssh by ensuring that we quote all values since they could be multi-word and/or contain special characters. Thanks to Dirk Schubert for pointing it out. cmr=v1.8.2:reviewer=jsquyres This commit was SVN r32280.	2014-07-22 18:22:06 +00:00
Ralph Castain	6c5e592785	Revert r32222, r32210, and r32203 as they created a problem when daemon collectives did not involve app procs on every node. Instead, modify the ompi/mca/rte/orte/rte_orte.h to add a new function that allows apps to request new daemon collective ids for use in barrier and modex operations. This will only appear in ORTE-based installations, but it is only being used by a couple of researchers at the moment. Update the orte/test/mpi/coll_test.c test to show the revised example. This commit was SVN r32234. The following SVN revision numbers were found above: r32203 --> open-mpi/ompi@a523dba41d r32210 --> open-mpi/ompi@2ce11ed5c4 r32222 --> open-mpi/ompi@d55f16db50	2014-07-15 03:48:00 +00:00
Ralph Castain	1feaffbb15	Get the blasted singleton comm_spawn working again. There remain problems with the Slurm interaction in this use-case as the PMI components (if configured to build) try to run even when a Slurm allocation hasn't been made, but I leave that to someone else to resolve. I did, however, tell the Slurm ess to quit interfering with applications launched in this use-case by ORTE daemons, so things do work when inside a Slurm allocation. Also discovered that the rsh launcher is not picking up --enable-orterun-prefix-by-default when invoked during singleton comm_spawn, but I was unable to see why that was happening and ran out of time. cmr=v1.8.2:reviewer=rhc This commit was SVN r32229.	2014-07-13 14:47:22 +00:00
Ralph Castain	a523dba41d	NOTE: this modifies the MPI-RTE interface We have been getting several requests for new collectives that need to be inserted in various places of the MPI layer, all in support of either checkpoint/restart or various research efforts. Until now, this would require that the collective id's be generated at launch, which required modifications to ORTE and other places. We chose not to make collectives reusable as the race conditions associated with resetting collective counters are daunting. This commit extends the collective system to allow self-generation of collective id's that the daemons need to support, thereby allowing developers to request any number of collectives for their work. There is one restriction: RTE collectives must occur at the process level - i.e., we don't currently have a way of tagging the collective to a specific thread. From the comment in the code: * In order to allow scalable * generation of collective id's, they are formed as: * * top 32-bits are the jobid of the procs involved in * the collective. For collectives across multiple jobs * (e.g., in a connect_accept), the daemon jobid will * be used as the id will be issued by mpirun. This * won't cause problems because daemons don't use the * collective_id * * bottom 32-bits are a rolling counter that recycles * when the max is hit. The daemon will cleanup each * collective upon completion, so this means a job can * never have more than 2^32 collectives going on at a time. If someone needs more than that - they've got * a problem. * * Note that this means (for now) that RTE-level collectives * cannot be done by individual threads - they must be * done at the overall process level. This is required as * there is no guaranteed ordering for the collective id's, * and all the participants must agree on the id of the * collective they are executing. 
So if thread A on one * process asks for a collective id before thread B does, * but B asks before A on another process, the collectives will * be mixed and not result in the expected behavior. We may * find a way to relax this requirement in the future by * adding a thread context id to the jobid field (maybe taking the * lower 16-bits of that field). This commit includes a test program (orte/test/mpi/coll_test.c) that cycles 100 times across barrier and modex collectives. This commit was SVN r32203.	2014-07-10 18:53:12 +00:00
Ralph Castain	8c85ca350e	Remove debug This commit was SVN r32200.	2014-07-10 18:28:24 +00:00
Ralph Castain	356e7ea904	Move all collective id's into the attributes and let the job pack/unpack take care of them instead of singling them out. Add the envars just prior to forking the children instead of into the launch message itself. Remove a few #if CR as the attributes functionality can handle this condition now. This commit was SVN r32133.	2014-07-03 15:58:13 +00:00
Adrian Reber	cabf1d4e68	use the orte attributes in the FT code to fix compile errors This commit was SVN r32093.	2014-06-26 03:19:17 +00:00
Nathan Hjelm	563eaf0726	Fix support for Cray alps The alps ras and plm components were broken by recent changes in ORTE. This commit resolves those issues. Changes: - Define PMI2_SUCCESS if it isn't defined. This fixes a problem with Cray's PMI implementation which does not define (for some reason) PMI2_SUCCESS. We had previously just used PMI_SUCCESS. - Add missing definition and a typo in pml_alps_module. - launch_id is no longer available in the orte_node_t structure. Use the attribute lookup to get the value. - Do not use an O(n^2) sorting algorithm when putting alps nodes in order. Use opal_list_sort instead (O(nlogn)). This commit was SVN r32076.	2014-06-24 21:29:04 +00:00
Ralph Castain	3f032d39e8	Mark the proc as alive so waitpid callback system doesn't immediately activate the callback Refs trac:4717 This commit was SVN r32026. The following Trac tickets were found above: Ticket 4717 --> https://svn.open-mpi.org/trac/ompi/ticket/4717	2014-06-18 14:04:55 +00:00
Ralph Castain	8e7c0257f0	Cleanup some missed updates to orte_wait_cb as params have changed Refs trac:4717 This commit was SVN r32025. The following Trac tickets were found above: Ticket 4717 --> https://svn.open-mpi.org/trac/ompi/ticket/4717	2014-06-17 23:40:31 +00:00
Ralph Castain	5216bd5558	Multiple sigchld reports can occur within a single event callback, so have to reap them until none remain. Also, need to ensure the daemon is flagged as alive prior to calling wait_cb Refs trac:4717 This commit was SVN r32020. The following Trac tickets were found above: Ticket 4717 --> https://svn.open-mpi.org/trac/ompi/ticket/4717	2014-06-17 18:46:40 +00:00
Ralph Castain	42bf7466fc	This isn't as big a change as it appears - a change in one place caused a whole bunch of files to require updated #include's due to some arcane linkage. Rework the orte_wait code to reflect the introduction of the state machine. If we are in cleanup mode and just want to kill all our local children, then there is no reason to be polite about it as that introduces very long delays at scale. Just kill the procs and move on. Refs trac:4717 This commit was SVN r32019. The following Trac tickets were found above: Ticket 4717 --> https://svn.open-mpi.org/trac/ompi/ticket/4717	2014-06-17 17:57:51 +00:00
Ralph Castain	b2413a6b88	Cannot update the proc state prior to activating the state machine as some callback functions need to compare the prior proc state against the new one. cmr=v1.8.2:reviewer=jsquyres This commit was SVN r31949.	2014-06-04 03:40:08 +00:00
Ralph Castain	c5384d44d7	Protect against NULL result in get_attr This commit was SVN r31947.	2014-06-04 03:09:37 +00:00
Ralph Castain	f1978fba7c	Cleanup a set of typos on the orte_get_attribute call This commit was SVN r31942.	2014-06-03 20:36:38 +00:00
Ralph Castain	5668f085a3	Silence some useless warnings, and fix a missed update in the tm plm This commit was SVN r31930.	2014-06-02 17:57:56 +00:00
Ralph Castain	742c0d2284	Fix typo that would cause a segfault if orte_startup_timeout was set This commit was SVN r31929.	2014-06-02 15:59:18 +00:00
Ralph Castain	65a35d92ef	Cleanup compile issues - missing updates to some plm components and the slurm ras component This commit was SVN r31921.	2014-06-01 17:59:06 +00:00
Ralph Castain	8736a1c138	Per RFC: http://www.open-mpi.org/community/lists/devel/2014/05/14822.php Revamp the ORTE global data structures to reduce memory footprint and add new features. Add ability to control/set cpu frequency, though this can only be done if the sys admin has setup the system to support it (or you run as root). This commit was SVN r31916.	2014-06-01 16:14:10 +00:00
Nathan Hjelm	041b72b0cc	plm/alps: better workaround for the noisy cray pmi implementation This commit is a slightly better workaround to prevent messages of the form: [unset]:_pmi_alps_get_apid:alps_app_lli_put_request failed [unset]:_pmi_alps_get_appLayout:pmi_alps_get_apid returned with error: Bad file descriptor It works by completely disabling PMI in the application process when using mpirun. This should not be an issue for any apps. cmr=v1.8.2:reviewer=rhc This commit was SVN r31882.	2014-05-22 16:04:36 +00:00
Nathan Hjelm	2a57e71a47	plm/alps: fix typo introduced in r31589 This commit was SVN r31747. The following SVN revision numbers were found above: r31589 --> open-mpi/ompi@445b552d3a	2014-05-13 22:36:54 +00:00
Ralph Castain	5602156a1c	Use the correct abstraction layer name for the data dirs This commit was SVN r31684.	2014-05-08 14:32:24 +00:00
Ralph Castain	11faab1091	The final step of the RFC: convert the <foo>libdir and friends to fit their respective code areas, and equate them all at the top. Note that we can't entirely separate things as the opal_install_dirs framework can't handle separated locations for the various trees. This commit was SVN r31679.	2014-05-08 02:01:35 +00:00
Ralph Castain	445b552d3a	Try again to get an error message printed when a daemon fails to successfully report back to mpirun. In this case, there is no guaranteed way for the daemon to output the error report itself - we don't have a connection back to the HNP, and we have tied stderr off to /dev/null (for good reasons). So the HNP has to detect the failure itself and report it. The HNP can't know the precise reason, of course - all it knows is that the daemon failed. So output a generic error message that provides guidance on probable causes. Refs trac:4571 This commit was SVN r31589. The following Trac tickets were found above: Ticket 4571 --> https://svn.open-mpi.org/trac/ompi/ticket/4571	2014-05-01 19:48:21 +00:00
Ralph Castain	238ecea311	When we comm_spawn, we really want to respect the original -host directives and not expand the daemon virtual machine unless directed to do so in the comm_spawn command. Otherwise, we will automatically launch daemons on every node in the allocation. cmr=v1.8.2:reviewer=rhc:subject=respect vm boundaries during comm_spawn This commit was SVN r31578.	2014-04-30 22:26:18 +00:00
Jeff Squyres	ea4c916096	plm_slurm_module.c: don't leave the extra fd to /dev/null open Prior to r29058, this same logic was in place (i.e., ensure that the extra fd to /dev/null is closed). It looks like it was accidentally removed in the ORTE conversion to the state machine in r29058. This ''might'' have something to do with many hangs that we're seeing in Cisco MTT with jobs that exhibit failure (e.g., call MPI_ABORT)...? cmr=v1.8.2:reviewer=rhc This commit was SVN r31469. The following SVN revision numbers were found above: r29058 --> open-mpi/ompi@a200e4f865	2014-04-21 20:09:15 +00:00
Ralph Castain	a368e84e70	Per the RFC, remove the sensor framework from the ORTE code area, relocating it offsite to the ORCM code area. Also update some ignores to ensure we don't pickup crosstalk in components This commit was SVN r31403.	2014-04-15 21:48:24 +00:00
Nathan Hjelm	9df795d1dd	plm/alps: silence annoying warning message when using Cray PMI 3.x or newer This commit adds a workaround for messages printed by the Cray PMI library when launching using mpirun. We are still talking with Cray to find a better fix but this will silence the warnings for now. cmr=v1.8.1:reviewer=manjugv This commit was SVN r31352.	2014-04-08 21:54:10 +00:00
Dave Goodell	19efa09540	plm/slurm: tweak /dev/null usage (#4489 ) See the ticket for more details. cmr=v1.8.1:reviewer=rhc:ticket=4489 This commit was SVN r31351. The following Trac tickets were found above: Ticket 4489 --> https://svn.open-mpi.org/trac/ompi/ticket/4489	2014-04-08 21:46:07 +00:00
Ralph Castain	957c9ecf53	Okay, silence the anality by simplifying the already irrelevant code, thus allowing us to turn our attention to things that actually matter Refs trac:4489 This commit was SVN r31348. The following Trac tickets were found above: Ticket 4489 --> https://svn.open-mpi.org/trac/ompi/ticket/4489	2014-04-08 19:51:11 +00:00
Ralph Castain	8ce98ccc8d	Not sure when this got messed up, but correct the stdout/stderr redirection on the srun command so we don't get all those slurm warnings cmr=v1.8.1:reviewer=dgoodell:subject=silence srun warning output This commit was SVN r31308.	2014-04-04 04:23:31 +00:00
Ralph Castain	3fdcaeab97	Fix a problem where we need to abort due to a mapping failure, but we are in a managed environment and thus the orteds have not wired up. Thus, if we send the exit message across the routed network, the remote daemons won't have a way to relay the message along - and we won't exit. If we are aborting, then set the flags so the HNP directly sends an exit command to each daemon. Make it the halt_vm command so the remote daemon doesn't try to relay it, but instead just exits without waiting for its routed children to exit first. cmr=v1.8.1:reviewer=jsquyres:subject=fix hangs due to abort prior to daemon wireup This commit was SVN r31304.	2014-04-02 04:17:55 +00:00
Ralph Castain	70ee3fb000	Ensure that orted's are not bound to single processors if the TaskAffinity option is set by default. Thanks to Artem Polyakov for the patch, and for his patience in explaining the situation. Reviewed with Moe Jette to ensure this was correct, and confirmed by me. RM-approved cmr=v1.8:reviewer=ompi-gk1.8 This commit was SVN r31288.	2014-03-29 18:30:38 +00:00
Ralph Castain	bd9bd2ff16	Be consistent in our handling of the "only HNP in allocation" case when setting up the VM. Thanks to Tetsuya Mishima for the suggestion. cmr=v1.8:reviewer=rhc This commit was SVN r31195.	2014-03-24 15:28:09 +00:00
Ralph Castain	d17f811ff5	Surrender to the tyranny of C++ and give up on enum for node states, as nice as that would be, in favor of retaining memory footprint constraints. This commit was SVN r31149.	2014-03-19 16:15:24 +00:00
Ralph Castain	0aa23cdc35	Cleanup copy/paste errors to ensure we progress the launch cmr=v1.7.5:reviewer=rhc This commit was SVN r31102.	2014-03-18 01:24:49 +00:00
Ralph Castain	45196d222b	Minor cleanup of the node state definitions - using the enum allows the debuggers to pretty-print the value This commit was SVN r31090.	2014-03-17 21:27:58 +00:00
Ralph Castain	b248b27637	Remove a check that prevented mpirun from exiting when it should in the single-node case Refs trac:4393 This commit was SVN r31080. The following Trac tickets were found above: Ticket 4393 --> https://svn.open-mpi.org/trac/ompi/ticket/4393	2014-03-15 15:25:44 +00:00
Ralph Castain	fbc5e3b773	Deal with the corner case where we encounter an error when attempting to launch a daemon. In this case, we will order abnormal termination before daemons callback to us, and thus any attempt to send them a "die" message will fail. Ensure that mpirun at least exits cleanly in this scenario, thereby allowing the remote daemons that did get launched to commit suicide when comm fails. cmr=v1.7.5:reviewer=jsquyres This commit was SVN r31068.	2014-03-14 15:32:30 +00:00
Adrian Reber	7304b700e1	Fix the newly added FT event state when compiling --with-ft This commit was SVN r30988.	2014-03-11 13:20:08 +00:00
Ralph Castain	7a44af375c	Add an FT event state and set the state machine to callback to the OOB base ft event when activated This commit was SVN r30950.	2014-03-06 02:44:29 +00:00
Ralph Castain	c9465d97b4	Resolve a race condition when responding to a SIGTERM to ensure that any final message from the application is correctly output. Remove a duplicate command, reduce the priority of the daemon exit command to MSG so that the IOF will have a chance to output cached messages. Update the signal trapping test. Thanks to Paul Kapinos for reporting the problem. cmr=v1.7.5:reviewer=jsquyres:subject=resolve a race condition This commit was SVN r30942.	2014-03-05 04:38:17 +00:00
Ralph Castain	0ac97761cc	Now that we are binding by default, the issue of #slots and what to do when oversubscribed has become a bit more complicated. This isn't a problem in managed environments as we are always provided an accurate assignment for the #slots, or when -host is used to define the allocation since we automatically assume one slot for every time a node is named. The problem arises when a hostfile is used, and the user provides host names without specifying the slots= parameter. In these cases, we assign slots=1, but automatically allow oversubscription since that number isn't confirmed. We then provide a separate parameter by which the user can direct that we assign the number of slots based on the sensed hardware - e.g., by telling us to set the #slots equal to the #cores on each node. However, this has been set to "off" by default. In order to make this a little less complex for the user, set the default such that we automatically set #slots equal to #cores (or #hwt's if use_hwthreads_as_cpus has been set) only for those cases where the user provides names in a hostfile but does not provide slot information. Also clean up a couple of issues in the mapping/binding system: * ensure we only override the binding directive if we are oversubscribed and overload is not allowed * ensure that the MPI procs don't attempt to bind themselves if they are launched by an orted as any binding directive (no matter what it was) would have been serviced by the orted on launch * minor cleanup to the warning message when oversubscribed and binding was requested cmr=v1.7.5:reviewer=rhc:subject=update mapping/binding system This commit was SVN r30909.	2014-03-03 16:46:37 +00:00
Ralph Castain	0dc5f50d27	Add a plm component for local-only operation that doesn't require rsh/ssh to be installed. Requested by Fedora packagers for testing purposes. cmr=v1.7.5:reviewer=jsquyres:subject=Add a plm component for local-only operation This commit was SVN r30645.	2014-02-09 15:53:10 +00:00


571 commits
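
Note on commit a523dba41d: the message describes collective ids formed from the jobid in the top 32 bits and a rolling counter in the bottom 32 bits. The sketch below shows one way such an id could be packed and unpacked. It is only an illustration of the layout described in that message; the type and function names (coll_id_t, coll_id_make, coll_id_next and so on) are hypothetical and are not taken from the ORTE sources.

```c
#include <stdint.h>

/* Hypothetical type: a 64-bit collective id as described in the
 * commit message (top 32 bits = jobid, bottom 32 bits = counter). */
typedef uint64_t coll_id_t;

/* Pack a jobid and a counter value into one 64-bit id. */
static coll_id_t coll_id_make(uint32_t jobid, uint32_t counter)
{
    return ((coll_id_t)jobid << 32) | (coll_id_t)counter;
}

/* Recover the jobid stored in the top 32 bits. */
static uint32_t coll_id_jobid(coll_id_t id)
{
    return (uint32_t)(id >> 32);
}

/* Recover the counter stored in the bottom 32 bits. */
static uint32_t coll_id_counter(coll_id_t id)
{
    return (uint32_t)(id & 0xFFFFFFFFu);
}

/* Issue the next id for a job and advance the caller's counter.
 * Unsigned arithmetic wraps to 0 after UINT32_MAX, which gives the
 * "rolling counter that recycles" behaviour. */
static coll_id_t coll_id_next(uint32_t jobid, uint32_t *counter)
{
    coll_id_t id = coll_id_make(jobid, *counter);
    *counter += 1u;
    return id;
}
```

Because the counter is per process and unordered across threads, two threads calling something like coll_id_next in different orders on different processes would disagree on ids, which is the reason the commit restricts RTE collectives to the process level.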