openmpi

Автор	SHA1	Сообщение	Дата
Boris Karasev	3796307a57	timings: added new timing points Signed-off-by: Boris Karasev <karasev.b@gmail.com>	2018-03-21 05:16:25 +02:00
Ralph Castain	0434b615b5	Update ORTE to support PMIx v3 This is a point-in-time update that includes support for several new PMIx features, mostly focused on debuggers and "instant on": * initial prototype support for PMIx-based debuggers. For the moment, this is restricted to using the DVM. Supports direct launch of apps under debugger control, and indirect launch using prun as the intermediate launcher. Includes ability for debuggers to control the environment of both the launcher and the spawned app procs. Work continues on completing support for indirect launch * IO forwarding for tools. Output of apps launched under tool control is directed to the tool and output there - includes support for XML formatting and output to files. Stdin can be forwarded from the tool to apps, but this hasn't been implemented in ORTE yet. * Fabric integration for "instant on". Enable collection of network "blobs" to be delivered to network libraries on compute nodes prior to local proc spawn. Infrastructure is in place - implementation will come later. * Harvesting and forwarding of envars. Enable network plugins to harvest envars and include them in the launch msg for setting the environment prior to local proc spawn. Currently, only OmniPath is supported. PMIx MCA params control which envars are included, and also allows envars to be excluded. Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2018-03-02 02:00:31 -08:00
Ralph Castain	fe9b584c05	Fully support OMPI spawn options. Fix a bug in the round-robin mappers where we weren't adding nodes to the job map node array, and so resources were not released Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 285d8cfef74ffc899e9c51e1d9c597b7fb2ceb89)	2017-09-21 10:29:27 -07:00
Gabe Saba	8f2df42055	reachable: Initialize / Finalize reachable framework Initialize the reachable framework during opal_init() and tear it back down during opal_finalize(). The framework was never used, so the lack of initialization didn't matter, but this is a required step in using the framework. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2017-09-19 19:42:54 -07:00
Ralph Castain	3c914a7a97	Complete the fix of the ORTE DVM. We will now use "prun" instead of "orterun -hnp foo" to execute jobs. This provides the feature of automatic discovery of the orte-dvm so you don't need to manually enter URI's or contact file locations. All IO is forwarded to prun. Still in the "needs to be done" category: * mapping/ranking/binding options aren't correctly supported * if the DVM encounters some errors (e.g., not enough resources for the job), the resulting error is globally set and impacts any subsequent job submission Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-09-16 13:13:07 -07:00
Ralph Castain	acd60a2cc4	Add missing constant to error-strings Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-06-06 16:10:52 -07:00
Ralph Castain	c86f71376a	Increase fine grain of timing info Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-04-20 00:17:40 -07:00
Ralph Castain	d645557fa0	Update to include the PMIx 2.0 APIs for monitoring and job control. Include required integration, but leave the monitors off for now. Move the sensor framework out of ORTE as it is being absorbed into PMIx Fix typo and silence warnings Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-03-21 17:47:08 -07:00
Ralph Castain	83199979ba	Remove the stale opal/sec framework Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-03-02 15:41:56 -08:00
Gilles Gouaillardet	b3a2bdda7b	opal/threads: manually invoke thread-specific key destructors on the main thread. there is no such thing as pthread_join(main_thread), so key destructors are never invoked on the main thread, which causes valgrind report some memory leaks. Manually store and then invoke the key destructors and make valgrind happy. Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 13:46:35 +09:00
Gilles Gouaillardet	c9aeccb84e	opal/if: open the if framework once in opal_init_util the if framework is no more open in opal_if*, which plugs several memory leaks Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2016-12-01 14:24:30 +09:00
Gilles Gouaillardet	8cc3f288c9	opal: fix opal_class_finalize() usage the class system can be initialized/finalized as many times as we like, so there is no more need to have opal_class_finalize() invoked in a destructor Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2016-10-26 15:15:54 +09:00
Ralph Castain	0ea1cff733	Implement notification of completion on comm_spawn'd child jobs. Add a configure flag to enable PMIx 3's shared memory datastore, and set it disable by default so that comm_spawn functions again. Will reverse the default once that feature is fully functional	2016-09-01 13:10:10 -07:00
Ralph Castain	20a91c2baf	Add a new --continuous flag to mpirun that directs ORTE to let a job continue running as app procs terminate. Don't attempt to restart them. Add event notification of abnormally terminating procs, and demonstrate that in the mpi_spin test program. Cleanup debug message	2016-07-13 15:28:33 -07:00
Ralph Castain	6e434d6785	Add support for PMIx tool connections and queries. Initially only support a request to list all known namespaces (jobids) from ORTE, but other folks will extend that support to include additional information Update to match PMIx RFC Fix configury to point to correct libevent and hwloc locations	2016-06-29 19:19:19 -07:00
Ralph Castain	5d330d5220	Enable the PMIx event notification capability and use that for all error notifications, including debugger release. This capability requires use of PMIx 2.0 or above as the features are not available with earlier PMIx releases. When OMPI master is built against an earlier external version, it will fallback to the prior behavior - i.e., debugger will be released via RML and all notifications will go strictly to the default error handler. Add PMIx 2.0 Remove PMIx 1.1.4 Cleanup copying of component Add missing file Touchup a typo in the Makefile.am Update the pmix ext114 component Minor cleanups and resync to master Update to latest PMIx 2.x Update to the PMIx event notification branch latest changes	2016-06-14 13:08:41 -07:00
Jeff Squyres	5071602c59	PSM/PSM2: Disable signal handler hijacking by default Per discussion on https://github.com/open-mpi/ompi/pull/1767 (and some subsequent phone calls and off-issue email discussions), the PSM library is hijacking signal handlers by default. Specifically: unless the environment variables `IPATH_NO_BACKTRACE=1` (for PSM / Intel TrueScale) is set, the library constructor for this library will hijack various signal handlers for the purpose of invoking its own error reporting mechanisms. This may be a bit surprising, but is not a problem, per se. The real problem is that older versions of at least the PSM library do not unregister these signal handlers upon being unloaded from memory. Hence, a segv can actually result in a double segv (i.e., the original segv and then another segv when the now-non-existent signal handler is invoked). This PSM signal hijacking subverts Open MPI's own signal reporting mechanism, which may be a bit surprising for some users (particularly those who do not have Intel TrueScale). As such, we disable it by default so that Open MPI's own error-reporting mechanisms are used. Additionally, there is a typo in the library destructor for the PSM2 library that may cause problems in the unloading of its signal handlers. This problem can be avoided by setting `HFI_NO_BACKTRACE=1` (for PSM2 / Intel OmniPath). This is further compounded by the fact that the PSM / PSM2 libraries can be loaded by the OFI MTL and the usNIC BTL (because they are loaded by libfabric), even when there is no Intel networking hardware present. Having the PSM/PSM2 libraries behave this way when no Intel hardware is present is clearly undesirable (and is likely to be fixed in future releases of the PSM/PSM2 libraries). This commit sets the following two environment variables to disable this behavior from the PSM/PSM2 libraries (if they are not already set): * IPATH_NO_BACKTRACE=1 * HFI_NO_BACKTRACE=1 If the user has set these variables before invoking Open MPI, we will not override their values (i.e., their preferences will be honored). Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-06-14 11:45:23 -07:00
Ralph Castain	80f4e3b872	Fix the --tune problem by searching the argv for MCA params in advance of opal_init_util. Only search the first app_context as we historically have done - we can debate whether or not to search all app_contexts	2016-05-23 21:09:44 -07:00
Ralph Castain	42ecffb6d0	Move the registration of MCA params out of the init of the var system - put them in with the rest of the OPAL MCA param registrations Take another shot at untangling the spaghetti orterun: fix for command line parsing orte-submit calls opal_init_util () before parsing out MCA command line options (-mca, -am, etc). This prevents mpirun from setting opal MCA variables for some frameworks as well as the MCA base. This is because when a framework is opened all of its variables are set to read-only. Eventually we want to lift this restriction on some MCA variables but since -mca is affected we must parse out the MCA command line options before opal_init_util(). This commit fixes the bug by adding a new option to opal_cmd_line_parse (ignore unknown option) so orte-submit can pre-parse the command line for MCA options. Signed-off-by: Nathan Hjelm <hjelmn@me.com> Minor cleanups to avoid releasing/recreating the cmd line	2016-05-20 09:59:50 -07:00
Nathan Hjelm	41f00b7465	memory/patcher: initialize patcher framework when needed This commit moves the patcher framework initialization to the memory/patcher component. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-04 12:46:42 -06:00
Karol Mroz	e1c64e6e59	opal: standardize on max hostname length Define OPAL_MAXHOSTNAMELEN to be either: (MAXHOSTNAMELEN + 1) or (limits.h:HOST_NAME_MAX + 1) or (255 + 1) For pmix code, define above using PMIX_MAXHOSTNAMELEN. Fixup opal layer to use the new max. Signed-off-by: Karol Mroz <mroz.karol@gmail.com>	2016-04-24 08:19:47 +02:00
Nathan Hjelm	c2b6fbb124	opal/memory: move initialization to first rcache creation Because of the removal of the linux memory component it is no longer necessary to initialize the memory component in opal_init(). This commit moves the initialization to the creation of the first rcache component. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-04-13 17:21:46 -06:00
Nathan Hjelm	27f8a4e806	opal: add code patcher framework This commit adds a framework to abstract runtime code patching. Components in the new framework can provide functions for either patching a named function or a function pointer. The later functionality is not being used but may provide a way to allow memory hooks when dlopen functionality is disabled. This commit adds two different flavors of code patching. The first is provided by the overwrite component. This component overwrites the first several instructions of the target function with code to jump to the provided hook function. The hook is expected to provide the full functionality of the hooked function. The linux patcher component is based on the memory hooks in ucx. It only works on linux and operates by overwriting function pointers in the symbol table. In this case the hook is free to call the original function using the function pointer returned by dlsym. Both components restore the original functions when the patcher framework closes. Changes had to be made to support Power/PowerPC with the Linux dynamic loader patcher. Some of the changes: - Move code necessary for powerpc/power support to the patcher base. The code is needed by both the overwrite and linux components. - Move patch structure down to base and move the patch list to mca_patcher_base_module_t. The structure has been modified to include a function pointer to the function that will unapply the patch. This allows the mixing of multiple different types of patches in the patch_list. - Update linux patching code to keep track of the matching between got entry and original (unpatched) address. This allows us to completely clean up the patch on finalize. All patchers keep track of the changes they made so that they can be reversed when the patcher framework is closed. At this time there are bugs in the Linux dynamic loader patcher so its priority is lower than the overwrite patcher. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-04-13 17:16:13 -06:00
Ralph Castain	60a7bc2e50	Enable the PMIx notification callback system. This currently is only supported by the pmix120 component, which is not selected by default. All other components will ignore error registration requests, and thus do not support debugger attach when launched via mpirun. Note that direct launched applications will support such attachment, but may not do so in a scalable fashion. Fixes ##1225	2016-02-18 09:29:12 -08:00
Gilles Gouaillardet	0b3e3c6817	opal/runtime: add missing #include <unistd.h> Thanks Marco Atzeri for contributing the original patch	2015-12-24 14:41:56 +09:00
Ralph Castain	869041f770	Purge whitespace from the repo	2015-06-23 20:59:57 -07:00
Gilles Gouaillardet	c809aace47	initialize common symbols from opal A few uninitialized common symbols are remaining: common symbols generated by flex : * opal/util/keyval/keyval_lex.l: opal_util_keyval_yyleng * opal/util/keyval/keyval_lex.o: opal_util_keyval_yytext * opal/util/show_help_lex.l: opal_show_help_yyleng * opal/util/show_help_lex.l: opal_show_help_yytext common symbol generated by "external" hwloc library: * opal/mca/hwloc/hwloc191/hwloc/src/components.o: component_map	2015-05-08 09:48:51 +09:00
Nathan Hjelm	8287e1d28f	Merge pull request #528 from hjelmn/add_destructor Add opal destructor/fini function	2015-04-20 11:30:15 -06:00
Nathan Hjelm	662460b06b	Modify destructor function configury Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-04-20 09:51:06 -06:00
Nathan Hjelm	38589c46c0	opal: add a destructor/fini function to opal This commit is related to an RFC from June 2014. Disscussion can be found at: http://www.open-mpi.org/community/lists/devel/2014/07/15140.php The finalize function is set using either the linker option -fini or __attribute__((destructor)) depending on compiler support. I have confirmed that this hybrid approach works with all the major compilers. The attribute is supported by gcc, clang, llvm, xlc, and icc. The fini function will support pgi. If a compiler/linker combination does not support either the destructor or fini function a message will be printed on re-init indicating it is not supported (an improvement over the current behavior-- SEGV). I moved the following to the destructor function: - Class system finalize. This solves a bug when MPI_T_finalize is called before MPI_Init. The only downside to this change is we will leave the footprint of the opal class system after MPI_Finalize. This footprint should be relatively small. This is an alternative to #517 but the two PRs are not mutually-exclusive (with some modifications). This commit should also be safe for 1.8.x as it does not change internal or external ABI (#517 changes internal ABI). Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-04-16 19:53:52 -06:00
Nathan Hjelm	e794658f2d	Merge pull request #516 from hjelmn/repository_update RFC: Repository update	2015-04-15 10:03:08 -06:00
Nathan Hjelm	c954f457d9	mca/base: update the way dynamic components are handled This commit is a rework of the component repository. The changes included in this commit are: - Remove the component dependency code based off .ompi_info files. This code is legacy code dating back 10 years that and is no longer used. - Move the plugin scanning code to the component repository. New calls have been added to add new scanning paths, query available components, and dlopen/load components. - Pass the framework down to mca_base_component_find/filter. Eventually the framework structure will be used to further validate components before they are used. - Add support to the MCA framework system to disable scanning for dlopened components on open (support already existed in register). This is really only relevant to installdirs as it has no register function and no DSO components. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-04-14 15:55:33 -06:00
Nathan Hjelm	a7b0c00ab6	fix memory leaks and valgrind errors This commit fixes several vagrind errors. Included: - installdirs did not correctly reinitialize all pointers to NULL at close. This causes valgrind errors on a subsequent call to opal_init_tool. - several opal strings were leaked by opal_deregister_params which was setting them to NULL instead of letting them be freed by the MCA variable system. - move opal_net_init to AFTER the variable system is initialized and opal's MCA variables have been registered. opal_net_init uses a variable registered by opal_register_params! - do not leak ompi_mpi_main_thread when it is allocated by MPI_T_init_thread. - do not overwrite ompi_mpi_main_thread if it is already set (by MPI_T_init_thread). - mca_base_var: read_files was overwritting mca_base_var_file_list even if it was non-NULL. - mca_base_var: set all file global variables to initial states on finalize. - btl/vader: decrement enumerator reference count to ensure that it is freed. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-04-11 09:28:35 -06:00
Jeff Squyres	336626dafe	spelling: trivial spelling fix s/interupted/interrupted/gi	2015-02-27 18:30:43 -08:00
Gilles Gouaillardet	661c35ca67	cleanup dead code caused by the removal of the --with-threads configure option	2015-01-16 19:13:59 +09:00
Ralph Castain	fd6a044b7f	Cleanup some cruft resulting from the move of the btl's to opal. We had created the ability to delay modex operations, which included a need to delay retrieving hostname info for remote procs. This allowed us to not retrieve the modex info until first message unless required - the hostname is generally only required for debug and error messages. Properly setup the opal_process_info structure early in the initialization procedure. Define the local hostname right at the beginning of opal_init so all parts of opal can use it. Overlay that during orte_init as the user may choose to remove fqdn and strip prefixes during that time. Setup the job_session_dir and other such info immediately when it becomes available during orte_init.	2014-10-03 16:02:57 -06:00
Ralph Castain	e32d541c8d	Bring over a slight modification to the opal_init_test routine This commit was SVN r32676.	2014-09-07 15:46:53 +00:00
Ralph Castain	aec5cd08bd	Per the PMIx RFC: WHAT: Merge the PMIx branch into the devel repo, creating a new OPAL “lmix” framework to abstract PMI support for all RTEs. Replace the ORTE daemon-level collectives with a new PMIx server and update the ORTE grpcomm framework to support server-to-server collectives WHY: We’ve had problems dealing with variations in PMI implementations, and need to extend the existing PMI definitions to meet exascale requirements. WHEN: Mon, Aug 25 WHERE: https://github.com/rhc54/ompi-svn-mirror.git Several community members have been working on a refactoring of the current PMI support within OMPI. Although the APIs are common, Slurm and Cray implement a different range of capabilities, and package them differently. For example, Cray provides an integrated PMI-1/2 library, while Slurm separates the two and requires the user to specify the one to be used at runtime. In addition, several bugs in the Slurm implementations have caused problems requiring extra coding. All this has led to a slew of #if’s in the PMI code and bugs when the corner-case logic for one implementation accidentally traps the other. Extending this support to other implementations would have increased this complexity to an unacceptable level. Accordingly, we have: * created a new OPAL “pmix” framework to abstract the PMI support, with separate components for Cray, Slurm PMI-1, and Slurm PMI-2 implementations. * Replaced the current ORTE grpcomm daemon-based collective operation with an integrated PMIx server, and updated the grpcomm APIs to provide more flexible, multi-algorithm support for collective operations. At this time, only the xcast and allgather operations are supported. * Replaced the current global collective id with a signature based on the names of the participating procs. The allows an unlimited number of collectives to be executed by any group of processes, subject to the requirement that only one collective can be active at a time for a unique combination of procs. Note that a proc can be involved in any number of simultaneous collectives - it is the specific combination of procs that is subject to the constraint * removed the prior OMPI/OPAL modex code * added new macros for executing modex send/recv to simplify use of the new APIs. The send macros allow the caller to specify whether or not the BTL supports async modex operations - if so, then the non-blocking “fence” operation is used, if the active PMIx component supports it. Otherwise, the default is a full blocking modex exchange as we currently perform. * retained the current flag that directs us to use a blocking fence operation, but only to retrieve data upon demand This commit was SVN r32570.	2014-08-21 18:56:47 +00:00
Ralph Castain	a347b19dc1	Add missing include This commit was SVN r32406.	2014-08-01 18:49:37 +00:00
Ralph Castain	552c9ca5a0	George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-) WHAT: Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have express interest in having a more generic communication infrastructure, without all the OMPI layer dependencies. This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP. Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose. UTK with support from Sandia, developped a version of Open MPI where the entire communication infrastucture has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to completing this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs. A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic. This commit was SVN r32317.	2014-07-26 00:47:28 +00:00
George Bosilca	77c74e8872	Don't iniaitlize twice the event framework (it is already initialized the init_tool). This commit was SVN r32239.	2014-07-15 05:04:29 +00:00
Ralph Castain	230336b6a8	Upgrade the security framework to avoid multiple hits against the global security server. Add support for future case where mpirun assings a global security credential for a given run, though we need to work out how to handle connect-accept from other mpirun's in that case. Remove a bunch of duplicate code in the OOB by consolidating the connection handshake code. Refs trac:4221 This commit was SVN r30554. The following Trac tickets were found above: Ticket 4221 --> https://svn.open-mpi.org/trac/ompi/ticket/4221	2014-02-04 14:47:04 +00:00
Ralph Castain	5980b7e042	Add a security framework for authenticating connections - we will add LDAP, Kerberos, and Keystone support in the next month. For now, just put a placeholder "basic" module that does the minimum. Wire the security check into ORTE's OOB handshake, and add a "version" check to ensure that both ends are from the same ORTE version. If not, report the mismatch and refuse the connection Fixes trac:4171 cmr=v1.7.5:reviewer=jsquyres:subject=Add a security framework for authenticating connections This commit was SVN r30551. The following Trac tickets were found above: Ticket 4171 --> https://svn.open-mpi.org/trac/ompi/ticket/4171	2014-02-04 01:38:45 +00:00
Ralph Castain	83e32aadb7	Add a variant of opal_init/finalize for running unit tests This commit was SVN r30497.	2014-01-30 11:14:36 +00:00
Ralph Castain	26fbb4e77b	Necessary constants for postgress module This commit was SVN r30338.	2014-01-20 19:58:56 +00:00
Ralph Castain	10ca1c1b04	Turns out that there was exactly ONE place in all of the OMPI code base that still referred to OPAL_TRACE, though a few places retained the include file for no reason. So no point in letting this sit as it is clearly an unused "feature". This commit was SVN r28789.	2013-07-14 18:57:20 +00:00
Nathan Hjelm	721779d7ab	Per RFC: remove old MCA parameter system. This commit was SVN r28541.	2013-05-20 15:36:13 +00:00
Ralph Castain	1f011bef99	Cleanup the updated sys limits capability. Fix a few copy/paste bugs (my bad). Shift the limit set to the ODLS default module so that we sete the limits for all apps, even those that don't call opal_init. Leave it in opal_init as well to support direct-launch apps, but ensure we only set the limits once by removing the envar after launch by ODLS. Provide some nice error messages if we fail to set the limits. Since the user had to specifically request we set the limit, treat failure as an error-out situation. This commit was SVN r28288.	2013-04-04 16:00:17 +00:00
Nathan Hjelm	17315bf360	Now that the entire codebase has been updated to use the MCA framework system remove the last calls to the MCA parameter system. This commit was SVN r28242.	2013-03-27 21:17:53 +00:00
Nathan Hjelm	365cf48db5	Update OPAL frameworks to use the MCA framework system. This commit was SVN r28239.	2013-03-27 21:11:47 +00:00

1 2 3

138 Коммитов