openmpi

Author	SHA1	Message	Date
Ralph Castain	a013f3059f	For scalability reasons, and to make life easier for the poor Cray-ites, don't bang on the system for the username - we'll just use the uid.	2015-03-19 21:24:13 -07:00
Jeff Squyres	a026456bef	(orte|ompi|oshmem)info tools: convert to opal_dl interface Note that this commit removes option:lt_dladvise from the various "info" tools output. This technically breaks our CLI "ABI" because we're not deprecating it / replacing it with an alias to some other "info" tool output. Although the dl/libltdl component contains a "have_lt_dladvise" MCA var that contains the same information, the "option:lt_dladvise" output from the various "info" tools is *not* an MCA var, and therefore we can't alias it. So it just has to die.	2015-03-09 08:18:13 -07:00
Gilles Gouaillardet	4c0eb11e08	orterun: fix misc errors as reported by Coverity with CIDs 70700, 71039, 710651	2015-03-09 11:57:18 +09:00
Gilles Gouaillardet	33841361c0	orte-clean: use pclose instead of fclose as reported by Coverity with CID 1287029	2015-03-09 11:17:59 +09:00
Gilles Gouaillardet	4e7b5240e4	orte/tools: fix misc memory leaks as reported by Coverity with CIDs 70700, 71039, 71854, 72384 and 710651	2015-03-05 14:06:18 +09:00
Jeff Squyres	4f54fedf05	orterun: ensure to set used_num_procs=true after finding that token This was CID 71687.	2015-02-24 15:25:39 -05:00
Jeff Squyres	15be948d79	wrappers: _EXTRA_INCLUDES does not exist any more There were a few places where _EXTRA_INCLUDES (and derivatives) were still being used. This commit removes all of them.	2015-02-20 08:43:25 -08:00
Jeff Squyres	9b716d946e	wrappers: fix errant @{libdir} reference in pkg-config files The RPATH support added a @{libdir} token into <package>_WRAPPER_EXTRA_LDFLAGS. However, these flags are also substituted into the pkg-config data files, and they don't understand the @{foo} notation. So convert @{libdir} into ${libdir}, which pkg-config does understand. Thanks to Christoph Junghans (@junghans) for notifying us of the issue. Fixes #406.	2015-02-20 08:43:19 -08:00
Gilles Gouaillardet	8dc4f30fae	orte/tools: fix NULL pointer dereference as reported by Coverity with CIDs 1196671 and 1196824	2015-02-17 15:45:06 +09:00
Gilles Gouaillardet	8fe8079080	Fix a build failure when configure'd with --without-hwloc see http://mtt.open-mpi.org/index.php?do_redir=2235	2015-02-16 10:31:09 +09:00
Jeff Squyres	3ac1d0dae5	*-info: add "lt_dladvise support" lines	2015-02-11 12:25:20 -08:00
Ralph Castain	ce56c0a2cf	Oops - remove debug/exit	2015-02-11 10:14:06 -08:00
Ralph Castain	46fb850bb0	Continue adding support for options on orte-submit - still need to shift some of the MCA params to job object attributes	2015-02-10 13:56:14 -08:00
Ralph Castain	116fcaff2c	Start adding support for cmd line options to orte-submit	2015-02-10 12:13:21 -08:00
rhc54	cf3f4def48	Merge pull request #386 from marksantcroos/master Add debug option to orte-dvm. Looks fine - thanks	2015-02-10 11:38:52 -08:00
Ralph Castain	df2cd96772	Display the local/global attribute flag more prominently. Mark the attributes as global in orte-submit so they will be communicated	2015-02-10 10:47:32 -08:00
Mark Santcroos	ff6a69a68d	Add debug option to orte-dvm.	2015-02-10 13:02:23 -05:00
Ralph Castain	063e4c9989	Cleanup the pretty-print of odls cmds as some were missing. Add a new cmd to terminate the DVM, which the HNP will use to turn around and issue an xcast to the DVM.	2015-02-10 08:27:13 -08:00
Ralph Castain	3ae3b96c17	Fix master compilation - a buried header dependency must have been removed.	2015-02-10 07:22:10 -08:00
Ralph Castain	3478def791	Ensure that nodes get included in the nidmap when spawning a new DVM job - we really only need to do this once, but for now we do it for every job until we work out how to avoid the duplication. Remove debug from orte-dvm tool	2015-02-09 23:47:46 -05:00
Ralph Castain	ef13ba7db3	Add debug-daemons option to orte-dvm	2015-02-09 11:08:45 -05:00
Ralph Castain	2b0b012460	Continue refinement of the DVM operations. Send the spawn request to the right place (it helps) as it isn't a comm_spawn request and has to be treated a little differently. Ensure IO gets forwarded back to the tool. Ensure the tool outputs show_help locally as there is no place to send it.	2015-02-04 06:21:54 -08:00
Ralph Castain	7299cc3ab9	Cleanup the communications handshake so that orte-submit properly terminates upon job completion, and properly sends the terminate command to orte-dvm	2015-02-03 07:25:43 -08:00
Ralph Castain	4dba298e6e	Update orte-submit manpage, add the ompi-* versions of orte-dvm and orte-submit manpages	2015-02-01 15:46:40 -08:00
Ralph Castain	e303a9b1d6	Provide an orte-dvm man page. Provide an option to orte-submit for terminating the DVM	2015-02-01 12:14:44 -08:00
Ralph Castain	ec5ccb76cf	Enable persistent ORTE DVM so users can execute multiple OMPI jobs within an allocation without restarting the DVM every time.	2015-01-30 11:00:43 -08:00
Ralph Castain	028b00154d	Complete implementation of the schizo framework to support OMPI component	2015-01-27 09:29:42 -06:00
Gilles Gouaillardet	661c35ca67	cleanup dead code caused by the removal of the --with-threads configure option	2015-01-16 19:13:59 +09:00
Mike Dubman	f83d6045aa	ORTE: undeprecate -x var=val in mpirun. mpirun -x var=val is back; it is actually a useful alias for -mca mca_base_env_list "var=val"	2014-11-12 10:51:15 +02:00
Ralph Castain	d0704ef118	Restore handling of physical processors in rankfiles. Note that the prior implementation was likely incorrect as it falsely assumed that physical core indices were unique, which isn't always true. Stipulate that physical rankfiles can only include PU numbers, and bind the result to the core that contains that physical PU. Update the mpirun man page to cover the new use-case.	2014-11-10 14:00:40 -08:00
Ralph Castain	738c3e1d72	Ensure that mpirun correctly selects the HNP ess component without attempting to init the PMI subsystem as mpirun won't be supported anyway, so let's avoid the error message. Also, daemons launched by the plm/slurm component must use the ess/slurm module as we cannot trust the Slurm PMI_Init functions to correctly tell us when PMI support is available.	2014-11-03 21:35:42 -08:00
Gilles Gouaillardet	eef7590e58	wrappers: add the $(EXEEXT) extension to the installed symbolic links	2014-10-28 16:42:51 +09:00
Ralph Castain	894acb0aa8	configury: new OPAL_SET_MCA_PREFIX/ORTE_SET_MCA_CMD_LINE_ID macros These two macros set the MCA prefix and MCA cmd line id, respectively. Specifically, MCA parameters will be named PREFIX<foo> in the environment, and the cmd line will use -ID foo bar. These macros must be called during configure.ac and a value supplied. In the case of Open MPI, the values given are PREFIX=OMPI_MCA_ and ID=mca. Other projects (such as ORCM) will call these macros with their own unique values. For example, ORCM uses PREFIX=ORCM_MCA_ and ID=omca This scheme is necessary to allow running Open MPI applications under systems that use their own versions of ORTE and OPAL. For example, when running OMPI applications under ORCM, we need the MCA params passed to the ORCM daemons to be separated from those recognized by the OMPI application.	2014-10-22 18:57:40 -07:00
Jeff Squyres	c22e1ae33b	configury: new OPAL_SET_LIB_PREFIX/ORTE_SET_LIB_PREFIX macros These two macros set the prefix for the OPAL and ORTE libraries, respectively. Specifically, the OPAL library will be named libPREFIXopen-pal.la and the ORTE library will be named libPREFIXopen-rte.la. These macros must be called, even if the prefix argument is empty. The intent is that Open MPI will call these macros with an empty prefix, but other projects (such as ORCM) will call these macros with a non-empty prefix. For example, ORCM libraries can be named liborcm-open-pal.la and liborcm-open-rte.la. This scheme is necessary to allow running Open MPI applications under systems that use their own versions of ORTE and OPAL. For example, when running MPI applications under ORTE, if the ORTE and OPAL libraries between OMPI and ORCM are not identical (which, because they are released at different times, are likely to be different), we need to ensure that the OMPI applications link against their ORTE and OPAL libraries, but the ORCM executables link against their ORTE and OPAL libraries.	2014-10-22 10:32:19 -07:00
Jeff Squyres	01fd96bfa5	Revert "Provide a mechanism by which an upstream project can rename the OPAL and ORTE libraries. This is required by projects such as ORCM that have their own ORTE and OPAL libraries in order to avoid library confusion. By renaming their version of the libraries, the OMPI applications can correctly dynamically load the correct one for their build." This reverts commit `63f619f871`.	2014-10-22 10:32:11 -07:00
Jeff Squyres	206eade32c	mpirun.1in: whitespace cleanup Whitespace cleanup only; no content changes.	2014-10-20 05:18:25 -07:00
Jeff Squyres	9529289319	mpirun.1in: more updates about binding/etc. Follow on to `91e9686` and `f9d620e`.	2014-10-20 05:17:49 -07:00
Ralph Castain	91e96861dd	Cleanup the orterun man page per review by Gus Correa	2014-10-19 10:21:50 -07:00
Ralph Castain	f9d620e3a7	Update the orterun man page	2014-10-16 21:05:04 -07:00
Ralph Castain	ecbae03009	Fix typo	2014-10-16 13:30:06 -07:00
Ralph Castain	b6aa691e0a	Fix incorrect implementation of new MCA param mca_base_env_list - it was not picking up envars and forwarding them, but only worked if you explicitly set a value for the envar. Ensure it works for both direct and indirect launch modes. Remove stale code as this replaced orte_forward_envars. Ensure it doesn't get passed to the ORTE daemons.	2014-10-16 12:58:56 -07:00
Ralph Castain	63f619f871	Provide a mechanism by which an upstream project can rename the OPAL and ORTE libraries. This is required by projects such as ORCM that have their own ORTE and OPAL libraries in order to avoid library confusion. By renaming their version of the libraries, the OMPI applications can correctly dynamically load the correct one for their build.	2014-10-10 11:39:08 -07:00
Jeff Squyres	72704441a2	URLs: update URLs for GitHub	2014-10-01 14:44:09 -07:00
Ralph Castain	84810b80fd	Cover the remaining code paths for Java apps to define class path Refs trac:4926 This commit was SVN r32823. The following Trac tickets were found above: Ticket 4926 --> https://svn.open-mpi.org/trac/ompi/ticket/4926	2014-09-30 22:27:03 +00:00
Ralph Castain	040a69c38b	Correct the classpath to correctly include the local directory so Java programs find the application class cmr=v1.8.4:reviewer=jsquyres This commit was SVN r32817.	2014-09-30 16:35:12 +00:00
Ralph Castain	0445052a1c	Check for multiple declarations of a given MCA param and error out if detected as that can create an ambiguous definition of the param value. Refs trac:4897 This commit was SVN r32719. The following Trac tickets were found above: Ticket 4897 --> https://svn.open-mpi.org/trac/ompi/ticket/4897	2014-09-12 22:21:30 +00:00
Ralph Castain	e671620ac7	Per request from Jeff: tune up the help messages for binding options Refs trac:4898 This commit was SVN r32691. The following Trac tickets were found above: Ticket 4898 --> https://svn.open-mpi.org/trac/ompi/ticket/4898	2014-09-09 22:39:22 +00:00
Ralph Castain	4207b4c4ad	Improve the --bind-to help message to better indicate the default options under various values of np. Remove the warning message if the user doesn't specify a binding policy and we are overloaded cmr=v1.8.3:reviewer=jsquyres This commit was SVN r32687.	2014-09-08 21:03:51 +00:00
Ralph Castain	4df1aa63f7	Since we've run into the situation where someone puts a script wrapper around a launcher such as srun, we need to always protect MCA cmd line params with quotes. This means we also need to protect the backend from quotes coming into the system as part of a value, or else the parser gets confused. So add a new function for wrapping MCA arguments, and tell the backend parser to ignore/remove leading/trailing quotes. cmr=v1.8.3:reviewer=jsquyres This commit was SVN r32686.	2014-09-08 20:38:46 +00:00
Ralph Castain	aec5cd08bd	Per the PMIx RFC: WHAT: Merge the PMIx branch into the devel repo, creating a new OPAL “pmix” framework to abstract PMI support for all RTEs. Replace the ORTE daemon-level collectives with a new PMIx server and update the ORTE grpcomm framework to support server-to-server collectives WHY: We’ve had problems dealing with variations in PMI implementations, and need to extend the existing PMI definitions to meet exascale requirements. WHEN: Mon, Aug 25 WHERE: https://github.com/rhc54/ompi-svn-mirror.git Several community members have been working on a refactoring of the current PMI support within OMPI. Although the APIs are common, Slurm and Cray implement a different range of capabilities, and package them differently. For example, Cray provides an integrated PMI-1/2 library, while Slurm separates the two and requires the user to specify the one to be used at runtime. In addition, several bugs in the Slurm implementations have caused problems requiring extra coding. All this has led to a slew of #if’s in the PMI code and bugs when the corner-case logic for one implementation accidentally traps the other. Extending this support to other implementations would have increased this complexity to an unacceptable level. Accordingly, we have: * created a new OPAL “pmix” framework to abstract the PMI support, with separate components for Cray, Slurm PMI-1, and Slurm PMI-2 implementations. * Replaced the current ORTE grpcomm daemon-based collective operation with an integrated PMIx server, and updated the grpcomm APIs to provide more flexible, multi-algorithm support for collective operations. At this time, only the xcast and allgather operations are supported. * Replaced the current global collective id with a signature based on the names of the participating procs. This allows an unlimited number of collectives to be executed by any group of processes, subject to the requirement that only one collective can be active at a time for a unique combination of procs. Note that a proc can be involved in any number of simultaneous collectives - it is the specific combination of procs that is subject to the constraint * removed the prior OMPI/OPAL modex code * added new macros for executing modex send/recv to simplify use of the new APIs. The send macros allow the caller to specify whether or not the BTL supports async modex operations - if so, then the non-blocking “fence” operation is used, if the active PMIx component supports it. Otherwise, the default is a full blocking modex exchange as we currently perform. * retained the current flag that directs us to use a blocking fence operation, but only to retrieve data upon demand This commit was SVN r32570.	2014-08-21 18:56:47 +00:00
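
Note on commit 9b716d946e: the message describes converting the wrapper-style @{libdir} token into ${libdir}, which pkg-config understands. The following is only a minimal sketch of that substitution. The function name and the sample flag string are hypothetical, and Open MPI's build system performs the conversion in its own configury rather than in Python.

```python
import re


def to_pkgconfig_syntax(flags: str) -> str:
    """Rewrite wrapper-style @{name} tokens as pkg-config ${name} variables.

    Hypothetical helper for illustration; not part of Open MPI.
    """
    # @{libdir} -> ${libdir}; text without @{...} tokens is left untouched.
    return re.sub(r"@\{(\w+)\}", r"${\1}", flags)


if __name__ == "__main__":
    # Assumed example of RPATH-style linker flags carrying the token.
    sample = "-Wl,-rpath -Wl,@{libdir} -Wl,--enable-new-dtags"
    print(to_pkgconfig_syntax(sample))
```

Flags that contain no @{...} token pass through unchanged, so the same helper could be applied to every flags variable before it is written into a .pc file.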


878 commits