openmpi

Author	SHA1	Message	Date
Nathan Hjelm	4d92c9989e	more c99 updates This commit does two things. It removes checks for C99 required headers (stdlib.h, string.h, signal.h, etc). Additionally it removes definitions for required C99 types (intptr_t, int64_t, int32_t, etc). Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2015-06-25 10:14:13 -06:00
Ralph Castain	869041f770	Purge whitespace from the repo	2015-06-23 20:59:57 -07:00
Ralph Castain	c21cd1c91e	Ensure the ssh session is dead	2015-05-23 08:14:29 -07:00
Ralph Castain	920562d9b4	Ensure that all ssh sessions are terminated when abnormally terminating the job	2015-05-23 08:14:29 -07:00
Ralph Castain	8e3f0b1d33	Ensure the --tree-spawn option is inside any parens from the sh and ksh shell support	2015-05-06 15:18:15 -07:00
Jeff Squyres	11e8c2096b	plm rsh: assign some levels to the rsh PLM MCA params	2015-04-20 16:18:57 -07:00
Nathan Hjelm	45e053dbce	orte: use C99 subobject naming for component initialization This commit helps future-proof orte components by initializing each component member by name. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-04-18 10:29:58 -06:00
Ralph Castain	34b53ac3dc	Silence Coverity warnings	2015-04-18 07:48:22 -07:00
Ralph Castain	12bfb27161	Redo in cleaner form: Per request from Andy Rieb, add ability to pass PATH and LD_LIBRARY_PATH elements to ssh command	2015-04-17 16:11:37 -07:00
Nathan Hjelm	b68d66bb9b	MCA: Add the project/project version to the MCA base component This commit adds support for project_framework_component_* parameter matching. This is the first step in allowing the same framework name in multiple projects. This change also bumps the MCA component version to 2.1.0. All master frameworks have been updated to use the new component versioning macro. An mca.h has been added to each project to add a project specific versioning macro of the form PROJECT_MCA_VERSION_2_1_0. Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2015-03-27 10:59:04 -06:00
Gilles Gouaillardet	7de3f35b90	plm/rsh: fix misc memory leaks as reported by Coverity with CIDs 71091, 71230, 71231, 72274, 72389, 1196718 and 1196719	2015-03-05 20:03:37 +09:00
Jeff Squyres	e2223cd9bf	plm_rsh: ensure cwd array is \0-terminated This was CID 72257	2015-02-24 15:24:08 -05:00
Howard Pritchard	bf89131f9e	add owner files to opal/ompi/orte mca directories This commit adds an owner file in each of the component directories for each framework. This allows for a simple script to parse the contents of the files and generate, among other things, tables to be used on the project's wiki page. Currently there are two "fields" in the file, an owner and a status. A tool to parse the files and generate tables for the wiki page will be added in a subsequent commit.	2015-02-22 15:10:23 -07:00
Ralph Castain	894acb0aa8	configury: new OPAL_SET_MCA_PREFIX/ORTE_SET_MCA_CMD_LINE_ID macros These two macros set the MCA prefix and MCA cmd line id, respectively. Specifically, MCA parameters will be named PREFIX<foo> in the environment, and the cmd line will use -ID foo bar. These macros must be called during configure.ac and a value supplied. In the case of Open MPI, the values given are PREFIX=OMPI_MCA_ and ID=mca. Other projects (such as ORCM) will call these macros with their own unique values. For example, ORCM uses PREFIX=ORCM_MCA_ and ID=omca This scheme is necessary to allow running Open MPI applications under systems that use their own versions of ORTE and OPAL. For example, when running OMPI applications under ORCM, we need the MCA params passed to the ORCM daemons to be separated from those recognized by the OMPI application.	2014-10-22 18:57:40 -07:00
Ralph Castain	b6aa691e0a	Fix incorrect implementation of new MCA param mca_base_env_list - it was not picking up envars and forwarding them, but only worked if you explicitly set a value for the envar. Ensure it works for both direct and indirect launch modes. Remove stale code as this replaced orte_forward_envars. Ensure it doesn't get passed to the ORTE daemons.	2014-10-16 12:58:56 -07:00
Jeff Squyres	e95ed94a94	plm_rsh_module.c: output to the framework output Trivial fix from r32686: don't output to stream 0, but rather to orte_plm_base_framework.framework_output (this is the way it was before r32686). In reality, this is going to end up being stream 0, anyway, but we might as well be pedantically correct... Refs trac:4897. This commit was SVN r32726. The following SVN revision numbers were found above: r32686 --> open-mpi/ompi@4df1aa63f7 The following Trac tickets were found above: Ticket 4897 --> https://svn.open-mpi.org/trac/ompi/ticket/4897	2014-09-13 00:46:35 +00:00
Ralph Castain	4df1aa63f7	Since we've run into the situation where someone puts a script wrapper around a launcher such as srun, we need to always protect MCA cmd line params with quotes. This means we also need to protect the backend from quotes coming into the system as part of a value, or else the parser gets confused. So add a new function for wrapping MCA arguments, and tell the backend parser to ignore/remove leading/trailing quotes. cmr=v1.8.3:reviewer=jsquyres This commit was SVN r32686.	2014-09-08 20:38:46 +00:00
Ralph Castain	039b7acfb5	Fix the quoting algorithm so only rsh command lines get quoted values cmr=v1.8.2:reviewer=jsquyres This commit was SVN r32586.	2014-08-22 22:47:38 +00:00
Ralph Castain	aec5cd08bd	Per the PMIx RFC: WHAT: Merge the PMIx branch into the devel repo, creating a new OPAL “pmix” framework to abstract PMI support for all RTEs. Replace the ORTE daemon-level collectives with a new PMIx server and update the ORTE grpcomm framework to support server-to-server collectives WHY: We’ve had problems dealing with variations in PMI implementations, and need to extend the existing PMI definitions to meet exascale requirements. WHEN: Mon, Aug 25 WHERE: https://github.com/rhc54/ompi-svn-mirror.git Several community members have been working on a refactoring of the current PMI support within OMPI. Although the APIs are common, Slurm and Cray implement a different range of capabilities, and package them differently. For example, Cray provides an integrated PMI-1/2 library, while Slurm separates the two and requires the user to specify the one to be used at runtime. In addition, several bugs in the Slurm implementations have caused problems requiring extra coding. All this has led to a slew of #if’s in the PMI code and bugs when the corner-case logic for one implementation accidentally traps the other. Extending this support to other implementations would have increased this complexity to an unacceptable level. Accordingly, we have: * created a new OPAL “pmix” framework to abstract the PMI support, with separate components for Cray, Slurm PMI-1, and Slurm PMI-2 implementations. * Replaced the current ORTE grpcomm daemon-based collective operation with an integrated PMIx server, and updated the grpcomm APIs to provide more flexible, multi-algorithm support for collective operations. At this time, only the xcast and allgather operations are supported. * Replaced the current global collective id with a signature based on the names of the participating procs. 
This allows an unlimited number of collectives to be executed by any group of processes, subject to the requirement that only one collective can be active at a time for a unique combination of procs. Note that a proc can be involved in any number of simultaneous collectives - it is the specific combination of procs that is subject to the constraint * removed the prior OMPI/OPAL modex code * added new macros for executing modex send/recv to simplify use of the new APIs. The send macros allow the caller to specify whether or not the BTL supports async modex operations - if so, then the non-blocking “fence” operation is used, if the active PMIx component supports it. Otherwise, the default is a full blocking modex exchange as we currently perform. * retained the current flag that directs us to use a blocking fence operation, but only to retrieve data upon demand This commit was SVN r32570.	2014-08-21 18:56:47 +00:00
Jeff Squyres	1551339eba	rsh: revert part of r32517: keep the quoting As part of reviewing CMR #4860, I talked through r32517 with Ralph. In an attempt to fix various rsh quoting problems, r32517 removed all the quoting from the main code path and then only added it back in at the end in some cases. This commit puts back the quoting parts that were removed in r32517 (r32517 fixed 2 other important bugs: a) change "--<foo>" to "--mca <foo_equivalent> 1" so that de-duplication works, and b) change a != to ==). refs trac:4860 This commit was SVN r32524. The following SVN revision numbers were found above: r32517 --> open-mpi/ompi@7342bce58f The following Trac tickets were found above: Ticket 4860 --> https://svn.open-mpi.org/trac/ompi/ticket/4860	2014-08-13 19:27:10 +00:00
Ralph Castain	7342bce58f	Cleanup the over-aggressive quoting of params on the orted cmd line. Remove duplicates caused by passing on both cmd line shortcuts and the mca param version of the same thing. Fixes trac:4857 cmr=v1.8.2:reviewer=jsquyres This commit was SVN r32517. The following Trac tickets were found above: Ticket 4857 --> https://svn.open-mpi.org/trac/ompi/ticket/4857	2014-08-13 03:51:04 +00:00
Ralph Castain	0cad281a92	Single-word cmd line values for orted are dealt with in orte_plm_base_orted_append_basic_args, so protect against special characters there. Have the rsh module only deal with multi-word arguments as those were skipped by orte_plm_base_orted_append_basic_args. Refs trac:4802 This commit was SVN r32293. The following Trac tickets were found above: Ticket 4802 --> https://svn.open-mpi.org/trac/ompi/ticket/4802	2014-07-23 17:06:51 +00:00
Ralph Castain	a94a97bd50	Cleanup the passing of MCA params on the orted cmd line in ssh by ensuring that we quote all values since they could be multi-word and/or contain special characters. Thanks to Dirk Schubert for pointing it out. cmr=v1.8.2:reviewer=jsquyres This commit was SVN r32280.	2014-07-22 18:22:06 +00:00
Ralph Castain	5216bd5558	Multiple sigchld reports can occur within a single event callback, so have to reap them until none remain. Also, need to ensure the daemon is flagged as alive prior to calling wait_cb Refs trac:4717 This commit was SVN r32020. The following Trac tickets were found above: Ticket 4717 --> https://svn.open-mpi.org/trac/ompi/ticket/4717	2014-06-17 18:46:40 +00:00
Ralph Castain	42bf7466fc	This isn't as big a change as it appears - a change in one place caused a whole bunch of files to require updated #include's due to some arcane linkage. Rework the orte_wait code to reflect the introduction of the state machine. If we are in cleanup mode and just want to kill all our local children, then there is no reason to be polite about it as that introduces very long delays at scale. Just kill the procs and move on. Refs trac:4717 This commit was SVN r32019. The following Trac tickets were found above: Ticket 4717 --> https://svn.open-mpi.org/trac/ompi/ticket/4717	2014-06-17 17:57:51 +00:00
Ralph Castain	f1978fba7c	Cleanup a set of typos on the orte_get_attribute call This commit was SVN r31942.	2014-06-03 20:36:38 +00:00
Ralph Castain	5668f085a3	Silence some useless warnings, and fix a missed update in the tm plm This commit was SVN r31930.	2014-06-02 17:57:56 +00:00
Ralph Castain	8736a1c138	Per RFC: http://www.open-mpi.org/community/lists/devel/2014/05/14822.php Revamp the ORTE global data structures to reduce memory footprint and add new features. Add ability to control/set cpu frequency, though this can only be done if the sys admin has setup the system to support it (or you run as root). This commit was SVN r31916.	2014-06-01 16:14:10 +00:00
Ralph Castain	5602156a1c	Use the correct abstraction layer name for the data dirs This commit was SVN r31684.	2014-05-08 14:32:24 +00:00
Ralph Castain	11faab1091	The final step of the RFC: convert the <foo>libdir and friends to fit their respective code areas, and equate them all at the top. Note that we can't entirely separate things as the opal_install_dirs framework can't handle separated locations for the various trees. This commit was SVN r31679.	2014-05-08 02:01:35 +00:00
Ralph Castain	0aa23cdc35	Cleanup copy/paste errors to ensure we progress the launch cmr=v1.7.5:reviewer=rhc This commit was SVN r31102.	2014-03-18 01:24:49 +00:00
Ralph Castain	fbc5e3b773	Deal with the corner case where we encounter an error when attempting to launch a daemon. In this case, we will order abnormal termination before daemons callback to us, and thus any attempt to send them a "die" message will fail. Ensure that mpirun at least exits cleanly in this scenario, thereby allowing the remote daemons that did get launched to commit suicide when comm fails. cmr=v1.7.5:reviewer=jsquyres This commit was SVN r31068.	2014-03-14 15:32:30 +00:00
Ralph Castain	c9465d97b4	Resolve a race condition when responding to a SIGTERM to ensure that any final message from the application is correctly output. Remove a duplicate command, reduce the priority of the daemon exit command to MSG so that the IOF will have a chance to output cached messages. Update the signal trapping test. Thanks to Paul Kapinos for reporting the problem. cmr=v1.7.5:reviewer=jsquyres:subject=resolve a race condition This commit was SVN r30942.	2014-03-05 04:38:17 +00:00
Ralph Castain	e3cb4b4a5b	Grant Nathan his wish - add a --disable-getpwuid to the configure options and protect all users of that code so it disappears if disabled. cmr=v1.7.5:reviewer=hjelmn:subject=disable getpwuid if requested This commit was SVN r30413.	2014-01-24 19:18:37 +00:00
Brian Barrett	8b778903d8	Fix longstanding issue with our multi-project support. Rather than using pkg{data,lib,includedir}, use our own ompi{data,lib,includedir}, which is always set to {datadir,libdir,includedir}/openmpi. This will keep us from having help files in prefix/share/open-rte when building without Open MPI, but in prefix/share/openmpi when building with Open MPI. This commit was SVN r30140.	2014-01-07 22:11:15 +00:00
Nathan Hjelm	3be4536d9b	Cleanup various leaks in ompi_info reported by valgrind. cmr=v1.7.4:reviewer=jsquyres This commit was SVN r30058.	2013-12-23 17:47:43 +00:00
Ralph Castain	a200e4f865	As per the RFC, bring in the ORTE async progress code and the rewrite of OOB: * THIS RFC INCLUDES A MINOR CHANGE TO THE MPI-RTE INTERFACE * Note: during the course of this work, it was necessary to completely separate the MPI and RTE progress engines. There were multiple places in the MPI layer where ORTE_WAIT_FOR_COMPLETION was being used. A new OMPI_WAIT_FOR_COMPLETION macro was created (defined in ompi/mca/rte/rte.h) that simply cycles across opal_progress until the provided flag becomes false. Places where the MPI layer blocked waiting for RTE to complete an event have been modified to use this macro. *************************************************************************************** I am reissuing this RFC because of the time that has passed since its original release. Since its initial release and review, I have debugged it further to ensure it fully supports tests like loop_spawn. It therefore seems ready for merge back to the trunk. Given its prior review, I have set the timeout for one week. The code is in https://bitbucket.org/rhc/ompi-oob2 WHAT: Rewrite of ORTE OOB WHY: Support asynchronous progress and a host of other features WHEN: Wed, August 21 SYNOPSIS: The current OOB has served us well, but a number of limitations have been identified over the years. Specifically: * it is only progressed when called via opal_progress, which can lead to hangs or recursive calls into libevent (which is not supported by that code) * we've had issues when multiple NICs are available as the code doesn't "shift" messages between transports - thus, all nodes had to be available via the same TCP interface. 
* the OOB "unloads" incoming opal_buffer_t objects during the transmission, thus preventing use of OBJ_RETAIN in the code when repeatedly sending the same message to multiple recipients * there is no failover mechanism across NICs - if the selected NIC (or its attached switch) fails, we are forced to abort * only one transport (i.e., component) can be "active" The revised OOB resolves these problems: * async progress is used for all application processes, with the progress thread blocking in the event library * each available TCP NIC is supported by its own TCP module. The ability to asynchronously progress each module independently is provided, but not enabled by default (a runtime MCA parameter turns it "on") * multi-address TCP NICs (e.g., a NIC with both an IPv4 and IPv6 address, or with virtual interfaces) are supported - reachability is determined by comparing the contact info for a peer against all addresses within the range covered by the address/mask pairs for the NIC. * a message that arrives on one TCP NIC is automatically shifted to whatever NIC that is connected to the next "hop" if that peer cannot be reached by the incoming NIC. If no TCP module will reach the peer, then the OOB attempts to send the message via all other available components - if none can reach the peer, then an "error" is reported back to the RML, which then calls the errmgr for instructions. * opal_buffer_t now conforms to standard object rules re OBJ_RETAIN as we no longer "unload" the incoming object * NIC failure is reported to the TCP component, which then tries to resend the message across any other available TCP NIC. If that doesn't work, then the message is given back to the OOB base to try using other components. 
If all that fails, then the error is reported to the RML, which reports to the errmgr for instructions * obviously from the above, multiple OOB components (e.g., TCP and UD) can be active in parallel * the matching code has been moved to the RML (and out of the OOB/TCP component) so it is independent of transport * routing is done by the individual OOB modules (as opposed to the RML). Thus, both routed and non-routed transports can simultaneously be active * all blocking send/recv APIs have been removed. Everything operates asynchronously. KNOWN LIMITATIONS: * although provision is made for component failover as described above, the code for doing so has not been fully implemented yet. At the moment, if all connections for a given peer fail, the errmgr is notified of a "lost connection", which by default results in termination of the job if it was a lifeline * the IPv6 code is present and compiles, but is not complete. Since the current IPv6 support in the OOB doesn't work anyway, I don't consider this a blocker * routing is performed at the individual module level, yet the active routed component is selected on a global basis. We probably should update that to reflect that different transports may need/choose to route in different ways * obviously, not every error path has been tested nor necessarily covered * determining abnormal termination is more challenging than in the old code as we now potentially have multiple ways of connecting to a process. Ideally, we would declare "connection failed" when all transports can no longer reach the process, but that requires some additional (possibly complex) code. For now, the code replicates the old behavior only somewhat modified - i.e., if a module sees its connection fail, it checks to see if it is a lifeline. If so, it notifies the errmgr that the lifeline is lost - otherwise, it notifies the errmgr that a non-lifeline connection was lost. 
* reachability is determined solely on the basis of a shared subnet address/mask - more sophisticated algorithms (e.g., the one used in the tcp btl) are required to handle routing via gateways * the RML needs to assign sequence numbers to each message on a per-peer basis. The receiving RML will then deliver messages in order, thus preventing out-of-order messaging in the case where messages travel across different transports or a message needs to be redirected/resent due to failure of a NIC This commit was SVN r29058.	2013-08-22 16:37:40 +00:00
Ralph Castain	698b4ad6e7	Fix the parameter handling so no-tree-spawn isn't getting reversed This commit was SVN r28300.	2013-04-07 15:48:25 +00:00
Ralph Castain	e6ae088813	Cleanup error outputs when a daemon fails to start This commit was SVN r28261.	2013-03-28 16:51:19 +00:00
Nathan Hjelm	c041156f60	Update ORTE frameworks to use the MCA framework system. This commit was SVN r28240.	2013-03-27 21:14:43 +00:00
Nathan Hjelm	cf377db823	MCA/base: Add new MCA variable system Features: - Support for an override parameter file (openmpi-mca-param-override.conf). Variable values in this file can not be overridden by any file or environment value. - Support for boolean, unsigned, and unsigned long long variables. - Support for true/false values. - Support for enumerations on integer variables. - Support for MPIT scope, verbosity, and binding. - Support for command line source. - Support for setting variable source via the environment using OMPI_MCA_SOURCE_<var name>=source (either command or file:filename) - Cleaner API. - Support for variable groups (equivalent to MPIT categories). Notes: - Variables must be created with a backing store (char *, int , or bool *) that must live at least as long as the variable. - Creating a variable with the MCA_BASE_VAR_FLAG_SETTABLE enables the use of mca_base_var_set_value() to change the value. - String values are duplicated when the variable is registered. It is up to the caller to free the original value if necessary. The new value will be freed by the mca_base_var system and must not be freed by the user. - Variables with constant scope may not be settable. - Variable groups (and all associated variables) are deregistered when the component is closed or the component repository item is freed. This prevents a segmentation fault from accessing a variable after its component is unloaded. - After some discussion we decided we should remove the automatic registration of component priority variables. Few components actually made use of this feature. - The enumerator interface was updated to be general enough to handle future uses of the interface. - The code to generate ompi_info output has been moved into the MCA variable system. See mca_base_var_dump(). 
opal: update core and components to mca_base_var system orte: update core and components to mca_base_var system ompi: update core and components to mca_base_var system This commit also modifies the rmaps framework. The following variables were moved from ppr and lama: rmaps_base_pernode, rmaps_base_n_pernode, rmaps_base_n_persocket. Both lama and ppr create synonyms for these variables. This commit was SVN r28236.	2013-03-27 21:09:41 +00:00
Ralph Castain	147c6ff9e7	Clean out the cruft leftover from the use_common_ports experiment cmr:v1.7 This commit was SVN r28184.	2013-03-20 15:07:43 +00:00
Ralph Castain	cf9796accd	Remove the old configure option for disabling full rte support - we now use the OMPI rte framework for such purposes This commit was SVN r28134.	2013-02-28 01:35:55 +00:00
Ralph Castain	f2ec35536e	Fix a bug that prevented MCA params from being forwarded to daemons upon launch cmr:v1.7 This commit was SVN r27621.	2012-11-18 17:55:26 +00:00
Nathan Hjelm	df9bd0ed59	fix bug in plm/rsh that could add extraneous mca options to the orted argv cmr:v1.7 This commit was SVN r27526.	2012-10-30 19:40:04 +00:00
Ralph Castain	7574d6673b	If someone provides the launch_agent cmd, then don't prefix it cmr:v1.7 This commit was SVN r27473.	2012-10-24 16:14:04 +00:00
Ralph Castain	98580c117b	Introduce staged execution. If you don't have adequate resources to run everything without oversubscribing, don't want to oversubscribe, and aren't using MPI, then staged execution lets you (a) run as many procs as there are available resources, and (b) start additional procs as others complete and free up resources. Adds a new mapper as well as a new state machine. Remove some stale configure.m4's we no longer need. Optimize the nidmaps a bit by only sending info that has changed each time, instead of sending a complete copy of everything. Makes no difference for the typical MPI job - only impacts things like staged execution where we are sending multiple (possibly many) launch messages. This commit was SVN r27165.	2012-08-28 21:20:17 +00:00
Ralph Castain	8bc6694a62	Ensure the daemons don't incorrectly declare a failed launch This commit was SVN r26875.	2012-07-26 19:05:06 +00:00
Ralph Castain	07846f12ae	Reconnect the rsh/ssh error reporting code for remote spawns to report failure to launch. Ensure the HNP correctly reports non-zero exit status when ssh encounters a problem. Thanks to Terry for spotting it! This commit was SVN r26868.	2012-07-25 21:46:45 +00:00
Ralph Castain	0dfe29b1a6	Roll in the rest of the modex change. Eliminate all non-modex API access of RTE info from the MPI layer - in some cases, the info was already present (either in the ompi_proc_t or in the orte_process_info struct) and no call was necessary. This removes all calls to orte_ess from the MPI layer. Calls to orte_grpcomm remain required. Update all the orte ess components to remove their associated APIs for retrieving proc data. Update the grpcomm API to reflect transfer of set/get modex info to the db framework. Note that this doesn't recreate the old GPR. This is strictly a local db storage that may (at some point) obtain any missing data from the local daemon as part of an async methodology. The framework allows us to experiment with such methods without perturbing the default one. This commit was SVN r26678.	2012-06-27 14:53:55 +00:00


180 commits