openmpi

Автор	SHA1	Сообщение	Дата
Ralph Castain	83199979ba	Remove the stale opal/sec framework Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-03-02 15:41:56 -08:00
Gilles Gouaillardet	b3a2bdda7b	opal/threads: manually invoke thread-specific key destructors on the main thread. there is no such thing as pthread_join(main_thread), so key destructors are never invoked on the main thread, which causes valgrind report some memory leaks. Manually store and then invoke the key destructors and make valgrind happy. Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 13:46:35 +09:00
Gilles Gouaillardet	c9aeccb84e	opal/if: open the if framework once in opal_init_util the if framework is no more open in opal_if*, which plugs several memory leaks Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2016-12-01 14:24:30 +09:00
Gilles Gouaillardet	8cc3f288c9	opal: fix opal_class_finalize() usage the class system can be initialized/finalized as many times as we like, so there is no more need to have opal_class_finalize() invoked in a destructor Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2016-10-26 15:15:54 +09:00
Ralph Castain	0ea1cff733	Implement notification of completion on comm_spawn'd child jobs. Add a configure flag to enable PMIx 3's shared memory datastore, and set it disable by default so that comm_spawn functions again. Will reverse the default once that feature is fully functional	2016-09-01 13:10:10 -07:00
Ralph Castain	20a91c2baf	Add a new --continuous flag to mpirun that directs ORTE to let a job continue running as app procs terminate. Don't attempt to restart them. Add event notification of abnormally terminating procs, and demonstrate that in the mpi_spin test program. Cleanup debug message	2016-07-13 15:28:33 -07:00
Ralph Castain	6e434d6785	Add support for PMIx tool connections and queries. Initially only support a request to list all known namespaces (jobids) from ORTE, but other folks will extend that support to include additional information Update to match PMIx RFC Fix configury to point to correct libevent and hwloc locations	2016-06-29 19:19:19 -07:00
Ralph Castain	5d330d5220	Enable the PMIx event notification capability and use that for all error notifications, including debugger release. This capability requires use of PMIx 2.0 or above as the features are not available with earlier PMIx releases. When OMPI master is built against an earlier external version, it will fallback to the prior behavior - i.e., debugger will be released via RML and all notifications will go strictly to the default error handler. Add PMIx 2.0 Remove PMIx 1.1.4 Cleanup copying of component Add missing file Touchup a typo in the Makefile.am Update the pmix ext114 component Minor cleanups and resync to master Update to latest PMIx 2.x Update to the PMIx event notification branch latest changes	2016-06-14 13:08:41 -07:00
Jeff Squyres	5071602c59	PSM/PSM2: Disable signal handler hijacking by default Per discussion on https://github.com/open-mpi/ompi/pull/1767 (and some subsequent phone calls and off-issue email discussions), the PSM library is hijacking signal handlers by default. Specifically: unless the environment variables `IPATH_NO_BACKTRACE=1` (for PSM / Intel TrueScale) is set, the library constructor for this library will hijack various signal handlers for the purpose of invoking its own error reporting mechanisms. This may be a bit surprising, but is not a problem, per se. The real problem is that older versions of at least the PSM library do not unregister these signal handlers upon being unloaded from memory. Hence, a segv can actually result in a double segv (i.e., the original segv and then another segv when the now-non-existent signal handler is invoked). This PSM signal hijacking subverts Open MPI's own signal reporting mechanism, which may be a bit surprising for some users (particularly those who do not have Intel TrueScale). As such, we disable it by default so that Open MPI's own error-reporting mechanisms are used. Additionally, there is a typo in the library destructor for the PSM2 library that may cause problems in the unloading of its signal handlers. This problem can be avoided by setting `HFI_NO_BACKTRACE=1` (for PSM2 / Intel OmniPath). This is further compounded by the fact that the PSM / PSM2 libraries can be loaded by the OFI MTL and the usNIC BTL (because they are loaded by libfabric), even when there is no Intel networking hardware present. Having the PSM/PSM2 libraries behave this way when no Intel hardware is present is clearly undesirable (and is likely to be fixed in future releases of the PSM/PSM2 libraries). This commit sets the following two environment variables to disable this behavior from the PSM/PSM2 libraries (if they are not already set): * IPATH_NO_BACKTRACE=1 * HFI_NO_BACKTRACE=1 If the user has set these variables before invoking Open MPI, we will not override their values (i.e., their preferences will be honored). Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-06-14 11:45:23 -07:00
Ralph Castain	80f4e3b872	Fix the --tune problem by searching the argv for MCA params in advance of opal_init_util. Only search the first app_context as we historically have done - we can debate whether or not to search all app_contexts	2016-05-23 21:09:44 -07:00
Ralph Castain	42ecffb6d0	Move the registration of MCA params out of the init of the var system - put them in with the rest of the OPAL MCA param registrations Take another shot at untangling the spaghetti orterun: fix for command line parsing orte-submit calls opal_init_util () before parsing out MCA command line options (-mca, -am, etc). This prevents mpirun from setting opal MCA variables for some frameworks as well as the MCA base. This is because when a framework is opened all of its variables are set to read-only. Eventually we want to lift this restriction on some MCA variables but since -mca is affected we must parse out the MCA command line options before opal_init_util(). This commit fixes the bug by adding a new option to opal_cmd_line_parse (ignore unknown option) so orte-submit can pre-parse the command line for MCA options. Signed-off-by: Nathan Hjelm <hjelmn@me.com> Minor cleanups to avoid releasing/recreating the cmd line	2016-05-20 09:59:50 -07:00
Nathan Hjelm	41f00b7465	memory/patcher: initialize patcher framework when needed This commit moves the patcher framework initialization to the memory/patcher component. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-04 12:46:42 -06:00
Karol Mroz	e1c64e6e59	opal: standardize on max hostname length Define OPAL_MAXHOSTNAMELEN to be either: (MAXHOSTNAMELEN + 1) or (limits.h:HOST_NAME_MAX + 1) or (255 + 1) For pmix code, define above using PMIX_MAXHOSTNAMELEN. Fixup opal layer to use the new max. Signed-off-by: Karol Mroz <mroz.karol@gmail.com>	2016-04-24 08:19:47 +02:00
Nathan Hjelm	c2b6fbb124	opal/memory: move initialization to first rcache creation Because of the removal of the linux memory component it is no longer necessary to initialize the memory component in opal_init(). This commit moves the initialization to the creation of the first rcache component. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-04-13 17:21:46 -06:00
Nathan Hjelm	27f8a4e806	opal: add code patcher framework This commit adds a framework to abstract runtime code patching. Components in the new framework can provide functions for either patching a named function or a function pointer. The later functionality is not being used but may provide a way to allow memory hooks when dlopen functionality is disabled. This commit adds two different flavors of code patching. The first is provided by the overwrite component. This component overwrites the first several instructions of the target function with code to jump to the provided hook function. The hook is expected to provide the full functionality of the hooked function. The linux patcher component is based on the memory hooks in ucx. It only works on linux and operates by overwriting function pointers in the symbol table. In this case the hook is free to call the original function using the function pointer returned by dlsym. Both components restore the original functions when the patcher framework closes. Changes had to be made to support Power/PowerPC with the Linux dynamic loader patcher. Some of the changes: - Move code necessary for powerpc/power support to the patcher base. The code is needed by both the overwrite and linux components. - Move patch structure down to base and move the patch list to mca_patcher_base_module_t. The structure has been modified to include a function pointer to the function that will unapply the patch. This allows the mixing of multiple different types of patches in the patch_list. - Update linux patching code to keep track of the matching between got entry and original (unpatched) address. This allows us to completely clean up the patch on finalize. All patchers keep track of the changes they made so that they can be reversed when the patcher framework is closed. At this time there are bugs in the Linux dynamic loader patcher so its priority is lower than the overwrite patcher. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-04-13 17:16:13 -06:00
Ralph Castain	60a7bc2e50	Enable the PMIx notification callback system. This currently is only supported by the pmix120 component, which is not selected by default. All other components will ignore error registration requests, and thus do not support debugger attach when launched via mpirun. Note that direct launched applications will support such attachment, but may not do so in a scalable fashion. Fixes ##1225	2016-02-18 09:29:12 -08:00
Gilles Gouaillardet	0b3e3c6817	opal/runtime: add missing #include <unistd.h> Thanks Marco Atzeri for contributing the original patch	2015-12-24 14:41:56 +09:00
Ralph Castain	869041f770	Purge whitespace from the repo	2015-06-23 20:59:57 -07:00
Gilles Gouaillardet	c809aace47	initialize common symbols from opal A few uninitialized common symbols are remaining: common symbols generated by flex : * opal/util/keyval/keyval_lex.l: opal_util_keyval_yyleng * opal/util/keyval/keyval_lex.o: opal_util_keyval_yytext * opal/util/show_help_lex.l: opal_show_help_yyleng * opal/util/show_help_lex.l: opal_show_help_yytext common symbol generated by "external" hwloc library: * opal/mca/hwloc/hwloc191/hwloc/src/components.o: component_map	2015-05-08 09:48:51 +09:00
Nathan Hjelm	8287e1d28f	Merge pull request #528 from hjelmn/add_destructor Add opal destructor/fini function	2015-04-20 11:30:15 -06:00
Nathan Hjelm	662460b06b	Modify destructor function configury Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-04-20 09:51:06 -06:00
Nathan Hjelm	38589c46c0	opal: add a destructor/fini function to opal This commit is related to an RFC from June 2014. Disscussion can be found at: http://www.open-mpi.org/community/lists/devel/2014/07/15140.php The finalize function is set using either the linker option -fini or __attribute__((destructor)) depending on compiler support. I have confirmed that this hybrid approach works with all the major compilers. The attribute is supported by gcc, clang, llvm, xlc, and icc. The fini function will support pgi. If a compiler/linker combination does not support either the destructor or fini function a message will be printed on re-init indicating it is not supported (an improvement over the current behavior-- SEGV). I moved the following to the destructor function: - Class system finalize. This solves a bug when MPI_T_finalize is called before MPI_Init. The only downside to this change is we will leave the footprint of the opal class system after MPI_Finalize. This footprint should be relatively small. This is an alternative to #517 but the two PRs are not mutually-exclusive (with some modifications). This commit should also be safe for 1.8.x as it does not change internal or external ABI (#517 changes internal ABI). Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-04-16 19:53:52 -06:00
Nathan Hjelm	e794658f2d	Merge pull request #516 from hjelmn/repository_update RFC: Repository update	2015-04-15 10:03:08 -06:00
Nathan Hjelm	c954f457d9	mca/base: update the way dynamic components are handled This commit is a rework of the component repository. The changes included in this commit are: - Remove the component dependency code based off .ompi_info files. This code is legacy code dating back 10 years that and is no longer used. - Move the plugin scanning code to the component repository. New calls have been added to add new scanning paths, query available components, and dlopen/load components. - Pass the framework down to mca_base_component_find/filter. Eventually the framework structure will be used to further validate components before they are used. - Add support to the MCA framework system to disable scanning for dlopened components on open (support already existed in register). This is really only relevant to installdirs as it has no register function and no DSO components. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-04-14 15:55:33 -06:00
Nathan Hjelm	a7b0c00ab6	fix memory leaks and valgrind errors This commit fixes several vagrind errors. Included: - installdirs did not correctly reinitialize all pointers to NULL at close. This causes valgrind errors on a subsequent call to opal_init_tool. - several opal strings were leaked by opal_deregister_params which was setting them to NULL instead of letting them be freed by the MCA variable system. - move opal_net_init to AFTER the variable system is initialized and opal's MCA variables have been registered. opal_net_init uses a variable registered by opal_register_params! - do not leak ompi_mpi_main_thread when it is allocated by MPI_T_init_thread. - do not overwrite ompi_mpi_main_thread if it is already set (by MPI_T_init_thread). - mca_base_var: read_files was overwritting mca_base_var_file_list even if it was non-NULL. - mca_base_var: set all file global variables to initial states on finalize. - btl/vader: decrement enumerator reference count to ensure that it is freed. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-04-11 09:28:35 -06:00
Jeff Squyres	336626dafe	spelling: trivial spelling fix s/interupted/interrupted/gi	2015-02-27 18:30:43 -08:00
Gilles Gouaillardet	661c35ca67	cleanup dead code caused by the removal of the --with-threads configure option	2015-01-16 19:13:59 +09:00
Ralph Castain	fd6a044b7f	Cleanup some cruft resulting from the move of the btl's to opal. We had created the ability to delay modex operations, which included a need to delay retrieving hostname info for remote procs. This allowed us to not retrieve the modex info until first message unless required - the hostname is generally only required for debug and error messages. Properly setup the opal_process_info structure early in the initialization procedure. Define the local hostname right at the beginning of opal_init so all parts of opal can use it. Overlay that during orte_init as the user may choose to remove fqdn and strip prefixes during that time. Setup the job_session_dir and other such info immediately when it becomes available during orte_init.	2014-10-03 16:02:57 -06:00
Ralph Castain	e32d541c8d	Bring over a slight modification to the opal_init_test routine This commit was SVN r32676.	2014-09-07 15:46:53 +00:00
Ralph Castain	aec5cd08bd	Per the PMIx RFC: WHAT: Merge the PMIx branch into the devel repo, creating a new OPAL “lmix” framework to abstract PMI support for all RTEs. Replace the ORTE daemon-level collectives with a new PMIx server and update the ORTE grpcomm framework to support server-to-server collectives WHY: We’ve had problems dealing with variations in PMI implementations, and need to extend the existing PMI definitions to meet exascale requirements. WHEN: Mon, Aug 25 WHERE: https://github.com/rhc54/ompi-svn-mirror.git Several community members have been working on a refactoring of the current PMI support within OMPI. Although the APIs are common, Slurm and Cray implement a different range of capabilities, and package them differently. For example, Cray provides an integrated PMI-1/2 library, while Slurm separates the two and requires the user to specify the one to be used at runtime. In addition, several bugs in the Slurm implementations have caused problems requiring extra coding. All this has led to a slew of #if’s in the PMI code and bugs when the corner-case logic for one implementation accidentally traps the other. Extending this support to other implementations would have increased this complexity to an unacceptable level. Accordingly, we have: * created a new OPAL “pmix” framework to abstract the PMI support, with separate components for Cray, Slurm PMI-1, and Slurm PMI-2 implementations. * Replaced the current ORTE grpcomm daemon-based collective operation with an integrated PMIx server, and updated the grpcomm APIs to provide more flexible, multi-algorithm support for collective operations. At this time, only the xcast and allgather operations are supported. * Replaced the current global collective id with a signature based on the names of the participating procs. The allows an unlimited number of collectives to be executed by any group of processes, subject to the requirement that only one collective can be active at a time for a unique combination of procs. Note that a proc can be involved in any number of simultaneous collectives - it is the specific combination of procs that is subject to the constraint * removed the prior OMPI/OPAL modex code * added new macros for executing modex send/recv to simplify use of the new APIs. The send macros allow the caller to specify whether or not the BTL supports async modex operations - if so, then the non-blocking “fence” operation is used, if the active PMIx component supports it. Otherwise, the default is a full blocking modex exchange as we currently perform. * retained the current flag that directs us to use a blocking fence operation, but only to retrieve data upon demand This commit was SVN r32570.	2014-08-21 18:56:47 +00:00
Ralph Castain	a347b19dc1	Add missing include This commit was SVN r32406.	2014-08-01 18:49:37 +00:00
Ralph Castain	552c9ca5a0	George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-) WHAT: Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have express interest in having a more generic communication infrastructure, without all the OMPI layer dependencies. This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP. Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose. UTK with support from Sandia, developped a version of Open MPI where the entire communication infrastucture has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to completing this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs. A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic. This commit was SVN r32317.	2014-07-26 00:47:28 +00:00
George Bosilca	77c74e8872	Don't iniaitlize twice the event framework (it is already initialized the init_tool). This commit was SVN r32239.	2014-07-15 05:04:29 +00:00
Ralph Castain	230336b6a8	Upgrade the security framework to avoid multiple hits against the global security server. Add support for future case where mpirun assings a global security credential for a given run, though we need to work out how to handle connect-accept from other mpirun's in that case. Remove a bunch of duplicate code in the OOB by consolidating the connection handshake code. Refs trac:4221 This commit was SVN r30554. The following Trac tickets were found above: Ticket 4221 --> https://svn.open-mpi.org/trac/ompi/ticket/4221	2014-02-04 14:47:04 +00:00
Ralph Castain	5980b7e042	Add a security framework for authenticating connections - we will add LDAP, Kerberos, and Keystone support in the next month. For now, just put a placeholder "basic" module that does the minimum. Wire the security check into ORTE's OOB handshake, and add a "version" check to ensure that both ends are from the same ORTE version. If not, report the mismatch and refuse the connection Fixes trac:4171 cmr=v1.7.5:reviewer=jsquyres:subject=Add a security framework for authenticating connections This commit was SVN r30551. The following Trac tickets were found above: Ticket 4171 --> https://svn.open-mpi.org/trac/ompi/ticket/4171	2014-02-04 01:38:45 +00:00
Ralph Castain	83e32aadb7	Add a variant of opal_init/finalize for running unit tests This commit was SVN r30497.	2014-01-30 11:14:36 +00:00
Ralph Castain	26fbb4e77b	Necessary constants for postgress module This commit was SVN r30338.	2014-01-20 19:58:56 +00:00
Ralph Castain	10ca1c1b04	Turns out that there was exactly ONE place in all of the OMPI code base that still referred to OPAL_TRACE, though a few places retained the include file for no reason. So no point in letting this sit as it is clearly an unused "feature". This commit was SVN r28789.	2013-07-14 18:57:20 +00:00
Nathan Hjelm	721779d7ab	Per RFC: remove old MCA parameter system. This commit was SVN r28541.	2013-05-20 15:36:13 +00:00
Ralph Castain	1f011bef99	Cleanup the updated sys limits capability. Fix a few copy/paste bugs (my bad). Shift the limit set to the ODLS default module so that we sete the limits for all apps, even those that don't call opal_init. Leave it in opal_init as well to support direct-launch apps, but ensure we only set the limits once by removing the envar after launch by ODLS. Provide some nice error messages if we fail to set the limits. Since the user had to specifically request we set the limit, treat failure as an error-out situation. This commit was SVN r28288.	2013-04-04 16:00:17 +00:00
Nathan Hjelm	17315bf360	Now that the entire codebase has been updated to use the MCA framework system remove the last calls to the MCA parameter system. This commit was SVN r28242.	2013-03-27 21:17:53 +00:00
Nathan Hjelm	365cf48db5	Update OPAL frameworks to use the MCA framework system. This commit was SVN r28239.	2013-03-27 21:11:47 +00:00
Nathan Hjelm	cf377db823	MCA/base: Add new MCA variable system Features: - Support for an override parameter file (openmpi-mca-param-override.conf). Variable values in this file can not be overridden by any file or environment value. - Support for boolean, unsigned, and unsigned long long variables. - Support for true/false values. - Support for enumerations on integer variables. - Support for MPIT scope, verbosity, and binding. - Support for command line source. - Support for setting variable source via the environment using OMPI_MCA_SOURCE_<var name>=source (either command or file:filename) - Cleaner API. - Support for variable groups (equivalent to MPIT categories). Notes: - Variables must be created with a backing store (char *, int , or bool *) that must live at least as long as the variable. - Creating a variable with the MCA_BASE_VAR_FLAG_SETTABLE enables the use of mca_base_var_set_value() to change the value. - String values are duplicated when the variable is registered. It is up to the caller to free the original value if necessary. The new value will be freed by the mca_base_var system and must not be freed by the user. - Variables with constant scope may not be settable. - Variable groups (and all associated variables) are deregistered when the component is closed or the component repository item is freed. This prevents a segmentation fault from accessing a variable after its component is unloaded. - After some discussion we decided we should remove the automatic registration of component priority variables. Few component actually made use of this feature. - The enumerator interface was updated to be general enough to handle future uses of the interface. - The code to generate ompi_info output has been moved into the MCA variable system. See mca_base_var_dump(). opal: update core and components to mca_base_var system orte: update core and components to mca_base_var system ompi: update core and components to mca_base_var system This commit also modifies the rmaps framework. The following variables were moved from ppr and lama: rmaps_base_pernode, rmaps_base_n_pernode, rmaps_base_n_persocket. Both lama and ppr create synonyms for these variables. This commit was SVN r28236.	2013-03-27 21:09:41 +00:00
Ralph Castain	b7f0e46319	Provide a nicer error message when someone gives a bad signal number to opal_signal cmr:v1.7.1 This commit was SVN r28188.	2013-03-20 15:30:59 +00:00
Ralph Castain	bd9265c560	Per the meeting on moving the BTLs to OPAL, move the ORTE database "db" framework to OPAL so the relocated BTLs can access it. Because the data is indexed by process, this requires that we define a new "opal_identifier_t" that corresponds to the orte_process_name_t struct. In order to support multiple run-times, this is defined in opal/mca/db/db_types.h as a uint64_t without identifying the meaning of any part of that data. A few changes were required to support this move: 1. the PMI component used to identify rte-related data (e.g., host name, bind level) and package them as a unit to reduce the number of PMI keys. This code was moved up to the ORTE layer as the OPAL layer has no understanding of these concepts. In addition, the component locally stored data based on process jobid/vpid - this could no longer be supported (see below for the solution). 2. the hash component was updated to use the new opal_identifier_t instead of orte_process_name_t as its index for storing data in the hash tables. Previously, we did a hash on the vpid and stored the data in a 32-bit hash table. In the revised system, we don't see a separate "vpid" field - we only have a 64-bit opaque value. The orte_process_name_t hash turned out to do nothing useful, so we now store the data in a 64-bit hash table. Preliminary tests didn't show any identifiable change in behavior or performance, but we'll have to see if a move back to the 32-bit table is required at some later time. 3. the db framework was a "select one" system. However, since the PMI component could no longer use its internal storage system, the framework has now been changed to a "select many" mode of operation. This allows the hash component to handle all internal storage, while the PMI component only handles pushing/pulling things from the PMI system. This was something we had planned for some time - when fetching data, we first check internal storage to see if we already have it, and then automatically go to the global system to look for it if we don't. Accordingly, the framework was provided with a custom query function used during "select" that lets you seperately specify the "store" and "fetch" ordering. 4. the ORTE grpcomm and ess/pmi components, and the nidmap code, were updated to work with the new db framework and to specify internal/global storage options. No changes were made to the MPI layer, except for modifying the ORTE component of the OMPI/rte framework to support the new db framework. This commit was SVN r28112.	2013-02-26 17:50:04 +00:00
Brian Barrett	fc3df11e08	Remove the (only two) fortran constants from OPAL. The only places that actually care if opal_pointer_array is limited to handle_max already passes that in as the max_size during init, so don't need it there. The arch constant was a bit more difficult, so pass that in during MPI init and leave empty otherwise. This is to help with the effort to allow building ompi against an external opal or orte. This commit was SVN r27817.	2013-01-15 01:27:36 +00:00
Jeff Squyres	2ba10c37fe	Per RFC, bring in the following changes: * Remove paffinity, maffinity, and carto frameworks -- they've been wholly replaced by hwloc. * Move ompi_mpi_init() affinity-setting/checking code down to ORTE. * Update sm, smcuda, wv, and openib components to no longer use carto. Instead, use hwloc data. There are still optimizations possible in the sm/smcuda BTLs (i.e., making multiple mpools). Also, the old carto-based code found out how many NUMA nodes were ''available'' -- not how many were used ''in this job''. The new hwloc-using code computes the same value -- it was not updated to calculate how many NUMA nodes are used ''by this job.'' * Note that I cannot compile the smcuda and wv BTLs -- I ''think'' they're right, but they need to be verified by their owners. * The openib component now does a bunch of stuff to figure out where "near" OpenFabrics devices are. '''THIS IS A CHANGE IN DEFAULT BEHAVIOR!!''' and still needs to be verified by OpenFabrics vendors (I do not have a NUMA machine with an OpenFabrics device that is a non-uniform distance from multiple different NUMA nodes). * Completely rewrite the OMPI_Affinity_str() routine from the "affinity" mpiext extension. This extension now understands hyperthreads; the output format of it has changed a bit to reflect this new information. * Bunches of minor changes around the code base to update names/types from maffinity/paffinity-based names to hwloc-based names. * Add some helper functions into the hwloc base, mainly having to do with the fact that we have the hwloc data reporting ''all'' topology information, but sometimes you really only want the (online \| available) data. This commit was SVN r26391.	2012-05-07 14:52:54 +00:00
Jeff Squyres	aba398ce09	Per RFC (http://www.open-mpi.org/community/lists/devel/2012/04/10905.php), set opal_cache_line_size via hwloc data, if we have it. opal_cache_line_size will be set to an hwloc-inspired value by the end of orte_init(), but will always have a safe value to use (i.e., a default value 128) -- even before opal_init() has completed. Default to the same value of 128 that Open MPI has used for several years if a) we have no hwloc data, or b) we weren't able to find L2 objects in the hwloc data. This commit was SVN r26322.	2012-04-24 17:31:06 +00:00
Ralph Castain	bd8b4f7f1e	Sorry for mid-day commit, but I had promised on the call to do this upon my return. Roll in the ORTE state machine. Remove last traces of opal_sos. Remove UTK epoch code. Please see the various emails about the state machine change for details. I'll send something out later with more info on the new arch. This commit was SVN r26242.	2012-04-06 14:23:13 +00:00
Jeff Squyres	63a96e92b5	In a recent v1.5 branch issue, it took a while to figure out that paffinity hwloc was returning "NOT_SUPPORTED" when the real problem was that the underlying hwloc simply hadn't been initialized yet. So let's clearly delineate this case: return OPAL_ERR_NOT_INITIALIZED if the underlying hwloc is not initialized. This commit was SVN r25902.	2012-02-10 18:29:52 +00:00

1 2 3

130 Коммитов