in generated executables on systems that support it. Use
--disable-wrapper-rpath to disable this behavior. See text in
README about --disable-wrapper-rpath for more details.
This commit was SVN r28479.
The following Trac tickets were found above:
Ticket 376 --> https://svn.open-mpi.org/trac/ompi/ticket/376
added by hwloc's embedding so that it doesn't appear in
libhwloc_embedded.la (and therefore propogate all the way up to
libmpi.la).
Committed upstream in hwloc SVN r5588.
This commit was SVN r28457.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r5588
everything out before using it.
This is not in response to any known bug, but rather just a
pre-emptive, defensive move to help prevent bugs in code that forgets
to initialize a field.
This commit was SVN r28343.
Other changes:
- Added a flag to the MCA variable system to indicate a variable should go away
when its group does. Both mca_base_framework_var_register() and
mca_base_component_var_register() set this flag.
Notes:
- mca_base_components_open is deprecated. It will be removed in a future commit.
- All frameworks should use MCA_BASE_FRAMEWORK_DECLARE to declare their
framework structure.
- All calls to framework open/close functions should be changed to use the
mca_base_framework_* functions.
- Instead of special-casing installdirs a flag was added to prevent calling
into the variable system when opening a framework.
- Ralph: Clarify the functional definition of the "register" function in the
MCA framework object - it had the same name as another function that does a
totally different thing.
- As per discussion with Ralph the behavior of mca_base_framework_register()
is to always call mca_base_framework_components_register() if the framework's
register function was successful. This removed the need for frameworks to
have to call this function directly.
This commit was SVN r28237.
Features:
- Support for an override parameter file (openmpi-mca-param-override.conf).
Variable values in this file can not be overridden by any file or environment
value.
- Support for boolean, unsigned, and unsigned long long variables.
- Support for true/false values.
- Support for enumerations on integer variables.
- Support for MPIT scope, verbosity, and binding.
- Support for command line source.
- Support for setting variable source via the environment using
OMPI_MCA_SOURCE_<var name>=source (either command or file:filename)
- Cleaner API.
- Support for variable groups (equivalent to MPIT categories).
Notes:
- Variables must be created with a backing store (char **, int *, or bool *)
that must live at least as long as the variable.
- Creating a variable with the MCA_BASE_VAR_FLAG_SETTABLE enables the use of
mca_base_var_set_value() to change the value.
- String values are duplicated when the variable is registered. It is up to
the caller to free the original value if necessary. The new value will be
freed by the mca_base_var system and must not be freed by the user.
- Variables with constant scope may not be settable.
- Variable groups (and all associated variables) are deregistered when the
component is closed or the component repository item is freed. This
prevents a segmentation fault from accessing a variable after its component
is unloaded.
- After some discussion we decided we should remove the automatic registration
of component priority variables. Few component actually made use of this
feature.
- The enumerator interface was updated to be general enough to handle
future uses of the interface.
- The code to generate ompi_info output has been moved into the MCA variable
system. See mca_base_var_dump().
opal: update core and components to mca_base_var system
orte: update core and components to mca_base_var system
ompi: update core and components to mca_base_var system
This commit also modifies the rmaps framework. The following variables were
moved from ppr and lama: rmaps_base_pernode, rmaps_base_n_pernode,
rmaps_base_n_persocket. Both lama and ppr create synonyms for these variables.
This commit was SVN r28236.
binding. This fix was included in the upstream 1.6 series, but not
the upstream 1.5 series, and was therefore missed when we brought
1.5.2 to OMPI.
This commit was SVN r28212.
The following SVN revision numbers were found above:
r28040 --> open-mpi/ompi@3d44f97572
At the same time, fix a minor issue where the init hook was being called twice, once by the libc malloc and once by our malloc by removing the call from our malloc.
This commit was SVN r28202.
library to multiple libraries that are implicitly sucked into the executable
as a dependency of libmpi. The initialize hook isn't visible to libc on some
linux distributions when it's in libopal and libopal isn't explicity linked
into the executable. The fix is to have a duplicate initialize hook in
libmpi as well as libopal. *sigh*.
This commit was SVN r28164.
A few changes were required to support this move:
1. the PMI component used to identify rte-related data (e.g., host name, bind level) and package them as a unit to reduce the number of PMI keys. This code was moved up to the ORTE layer as the OPAL layer has no understanding of these concepts. In addition, the component locally stored data based on process jobid/vpid - this could no longer be supported (see below for the solution).
2. the hash component was updated to use the new opal_identifier_t instead of orte_process_name_t as its index for storing data in the hash tables. Previously, we did a hash on the vpid and stored the data in a 32-bit hash table. In the revised system, we don't see a separate "vpid" field - we only have a 64-bit opaque value. The orte_process_name_t hash turned out to do nothing useful, so we now store the data in a 64-bit hash table. Preliminary tests didn't show any identifiable change in behavior or performance, but we'll have to see if a move back to the 32-bit table is required at some later time.
3. the db framework was a "select one" system. However, since the PMI component could no longer use its internal storage system, the framework has now been changed to a "select many" mode of operation. This allows the hash component to handle all internal storage, while the PMI component only handles pushing/pulling things from the PMI system. This was something we had planned for some time - when fetching data, we first check internal storage to see if we already have it, and then automatically go to the global system to look for it if we don't. Accordingly, the framework was provided with a custom query function used during "select" that lets you seperately specify the "store" and "fetch" ordering.
4. the ORTE grpcomm and ess/pmi components, and the nidmap code, were updated to work with the new db framework and to specify internal/global storage options.
No changes were made to the MPI layer, except for modifying the ORTE component of the OMPI/rte framework to support the new db framework.
This commit was SVN r28112.
processor_bind to see if we're bound to a single core.
If not, THEN check lgroup affinity. Already CMR'ed to
v1.6 (trac 3507) and fixed upstream in hwloc (r5295).
This commit was SVN r28040.
The following SVN revision numbers were found above:
r5295 --> open-mpi/ompi@6df8cb0f02
flags, and mca flags are kept seperate until the very end. The main configure
wrapper flags should now be modified by using the OPAL_WRAPPER_FLAGS_ADD
macro. MCA components should either let <framework>_<component>_{LIBS,LDFLAGS}
be copied over OR set <framework>_<component>_WRAPPER_EXTRA_{LIBS,LDFLAGS}.
The situations in which WRAPPER CPPFLAGS can be set by MCA components was
made very small to match the one use case where it makes sense.
This commit was SVN r27950.
it that the others did: move the "I won!" code up into the POST_CONFIG
macro. Also, fix a long-standing typo when restoring the $CPPFLAGS (!).
This commit was SVN r27813.
STOP_AT_FIRST. And move the side-effect-inducing code in
hwloc142/configure.m4 up to POST_CONFIG.
Also change the priority of the external hwloc component to 90 so that
it is evaluated before the internal component (as a direct result of
changing to STOP_AT_FIRST).
This commit was SVN r27796.
The following SVN revision numbers were found above:
r27794 --> open-mpi/ompi@569a60c2de
event framework to STOP_AT_FIRST, and then moves a bunch of
side-effect-inducing code in the libevent2019 configure.m4 up to
POST_CONFIG.
== More detail ==
Change the event framework from STOP_AT_FIRST_PRIORITY to
STOP_AT_FIRST. This means that only one component can win (vs. all
STOP_AT_FIRST_PRIORITY, in which multiple components of the same
priority can all win).
You still need to ensure that there are no side-effects from the
winner, however, so check for winning during POST_CONFIG, and set
things like the base_include there.
This simplifies the configury quite a bit -- you don't have to assume
that mulitple components can win: zero or one components will win.
Also change the libevent 2019 priority to 50 so that some other
(developer-specific/local) component could win, if it wanted to.
This commit was SVN r27794.
party configure.in scripts to be configure.ac so that Automake stops
complaining about them.
This commit was SVN r27791.
The following SVN revision numbers were found above:
r27790 --> open-mpi/ompi@675a2f5c48
using the modex or RML to share sm initialization information, have node rank 0
create a file containing initialization information in a well-known place. Then
during add_procs, the rest of the node processes requiring sm BTL initialization
will just read from that file to complete their initialization.
This commit was SVN r27789.
Carns, change to use access(.., F_OK) instead of stat() to check for
the presence of files.
Also remove redundant check for FAKEROOTKEY, and update all comments
to match.
This commit was SVN r27785.
config/ directory. We split them apart a while ago in the hopes that
it would simplify things, but it didn't really (e.g., because there
were still some ompi/opal .m4 files in the top-level config/
directory, resulting in developer confusion where any given m4 macro
was defined).
So this commit consolidates them back into the top-level directory for
simplicity.
There's still (at least) two changes that would be nice to make:
1. Split any generated .m4 file (e.g., autogen-generated .m4 files)
into a separate directory somewhere so that a top-level -Iconfig/
will only get our explicitly defined macros, not the autogen stuff
(e.g., with libevent2019 needing to get the visibility macro, but
NOT all the autogen-generated inclusion of component configure.m4
files).
1. Change configure to be of the form:
{{{
# ...a small amount of preamble/setup...
OPAL_SETUP
m4_ifdef([project_orte], [ORTE_SETUP])
m4_ifdef([project_ompi], [OMPI_SETUP])
# ...a small amount of finishing stuff...
}}}
I doubt we'll ever get anything as clean as that, but that would be
the goal to shoot for.
This commit was SVN r27704.
The old behavior of mca_base_param_deregister could cause the indices of other mca parameters to change. This could potentially cause problems if a mca user saves and later references an affected index.
This commit was SVN r27633.
Reasoning: The old behavior was a little confusing. mca_base_components_open does not open an output stream so it is a little unexpected that mca_base_components_close does. To add to this several frameworks (that don't use mca_base_components_close) failed to close their output in the framework close function and others closed their output a second time. This change is an improvement to the symantics of mca_base_components_open/close as they are now symetric in their functionality.
This commit was SVN r27570.
pml/v:
- If vprotocol is not being used vprotocol_include_list is leaked. Assume vprotocol never takes ownership (see below) and always free the string.
coll/ml:
- (patch verified) calling mca_base_param_lookup_string after mca_base_param_reg_string is unnecessary. The call to mca_base_param_lookup_string causes the value returned by mca_base_param_reg_string to be leaked.
- Need to free mca_coll_ml_component.config_file_name on component close.
btl/openib:
- calling mca_base_param_lookup_string after mca_base_param_reg_string is unnecessary. The call to mca_base_param_lookup_string causes the value returned by mca_base_param_reg_string to be leaked.
vprotocol/base:
- There was no way for pml/v to determine if vprotocol took ownership of vprotocol_include_list. Fix by always never ownership (use strdup).
mca/base:
- param_lookup will result in storage->stringval to be a newly allocated string if the mca parameter has a string value. ensure this string is always freed.
cmr:v1.7
This commit was SVN r27569.
It appears the problem was not with the command line parser but the rsh plm. I don't know why this problem was not occuring before the command line parser changes but it appears to be resolved now.
This commit was SVN r27527.
The following SVN revision numbers were found above:
r27451 --> open-mpi/ompi@d59034e6ef
r27456 --> open-mpi/ompi@ecdbf34937
Not sure what happened here, but the resulting trunk wouldn't even configure. After spending time fixing that problem, I found it wouldn't compile due to multiple syntax errors that had been introduced in both the OPAL and OMPI layer. This raised questions as to the completeness of the work.
Given that the author is departing, I pinged Jeff about it and we agreed to revert this for now. Hopefully, it can either be fixed by the author prior to actual departure, or someone else can pick it up (now that it is in the history) and fix it.
This commit was SVN r27511.
The following SVN revision numbers were found above:
r27508 --> open-mpi/ompi@12c3c743de
r27509 --> open-mpi/ompi@79e4a8ca38
r27510 --> open-mpi/ompi@1ad5ff625a
opal_shmem_segment_create by testing whether or not the target mount has enough
space to accommodate the shared-memory backing store. Fixes trac:2827. Will work
with Shiqing to add Windows support (if required).
This commit was SVN r27433.
The following Trac tickets were found above:
Ticket 2827 --> https://svn.open-mpi.org/trac/ompi/ticket/2827
* Use the hwloc logical index, not the os_index. Fixes problems with
opal_hwloc_base_cset2str() output (e.g., --report-bindings output)
on machines where the os_index is not tightly packed in the range
![0, n-1]
This commit was SVN r27394.
ompi/mca/sbgp/basesmsocket
orte/mca/rmaps/lama
Remove stale configure.params files from the sbgp framework as the OMPI build system no longer looks at those files.
This commit was SVN r27377.
Cannot start the data clearing at the root object level as the root object has a different struct attached to userdata.
This commit was SVN r27357.
The following Trac tickets were found above:
Ticket 3322 --> https://svn.open-mpi.org/trac/ompi/ticket/3322
This now results in the procs being bound within their assigned location. It also causes us to use only the 0th HT on a core unless --use-hwthread-cpus has been specified (in which case, we use all the HTs in a core). Bind to core binds you to all HTs regardless - the --use-hwthread-cpus only impacts the oversubscribed determination and when binding to HT.
cmr:v1.7
This commit was SVN r27342.
We ran into a case where the OMPI SVN trunk grew a new acceptable MCA
parameter value, but this new value was not accepted on the v1.6
branch (hwloc_base_mem_bind_failure_action -- on the trunk it accepts
the value "silent", but on the older v1.6 branch, it doesn't). If you
set "hwloc_base_mem_bind_failure_action=silent" in the default MCA
params file and then accidentally ran with the v1.6 branch, every OMPI
executable (including ompi_info) just failed because hwloc_base_open()
would say "hey, 'silent' is not a valid value for
hwloc_base_mem_bind_failure_action!". Kaboom.
The only problem is that it didn't give you any indication of where
this value was being set. Quite maddening, from a user perspective.
So we changed the ompi_info handles this case. If any framework open
function return OMPI_ERR_BAD_PARAM (either because its base MCA params
got a bad value or because one of its component register/open
functions return OMPI_ERR_BAD_PARAM), ompi_info will stop, print out
a warning that it received and error, and then dump out the parameters
that it has received so far in the framework that had a problem.
At a minimum, this will show the user the MCA param that had an error
(it's usually the last one), and ''where it was set from'' (so that
they can go fix it).
We updated ompi_info to check for O???_ERR_BAD_PARAM from each from
the framework opens. Also updated the doxygen docs in mca.h for this
O???_BAD_PARAM behavior. And we noticed that mca.h had MCA_SUCCESS
and MCA_ERR_??? codes. Why? I think we used them in exactly one
place in the code base (mca_base_components_open.c). So we deleted
those and just used the normal OPAL_* codes instead.
While we were doing this, we also cleaned up a little memory
management during ompi_info/orte-info/opal-info finalization.
Valgrind still reports a truckload of memory still in use at ompi_info
termination, but they mostly look to be components not freeing
memory/resources properly (and outside the scope of this fix).
This commit was SVN r27306.
The following Trac tickets were found above:
Ticket 3275 --> https://svn.open-mpi.org/trac/ompi/ticket/3275
* NULL's out the hwloc_obj_t->userdata in
hwloc_base_util.c:free_object() and
hwloc_base_util.c:opal_hwloc_base_free_topology() after it has been
OBJ_RELEASE'd.
* Adds a userdata field to opal_hwloc_topo_data_t. This field will
be used in an upcoming rmaps component ("lama") to cache some
associated data during hardware tree traversals.
This commit was SVN r26938.
Update all the orte ess components to remove their associated APIs for retrieving proc data. Update the grpcomm API to reflect transfer of set/get modex info to the db framework.
Note that this doesn't recreate the old GPR. This is strictly a local db storage that may (at some point) obtain any missing data from the local daemon as part of an async methodology. The framework allows us to experiment with such methods without perturbing the default one.
This commit was SVN r26678.
* Add new configure command line options and deprecate some old ones:
* --with-verbs replaces --with-openib
* --with-verbs-libdir replaces --with-openib-libdir
* If you specify --with-openib[-libdir] without
--with-verbs[-libdir], you'll get a "these options have been
deprecated!" warning, but then they'll act just like
--with-verbs[--libdir].
'''Sidenote:''' Note that we are not renaming any components at this
time, nor are we renaming the top-level OMPI_CHECK_OPENIB m4 macro
(which is pretty strongly tied to the openib BTL and is bastaridzed
by the ofud BTL). Note that there will likely be more changes in
this area coming soon (next week?) when some long-standing changes
move to the SVN trunk: some openib BTL infrastructure will move to
ompi/mca/common, and its configury gets split up / refactored.
We extend our philosophy of other --with-<foo> configure options of
--with-verbs to ''all'' verbs-lovin components:
* If you specify --with-verbs, then all verbs-lovin' components must
configure successfully (or abort). This currently means: OOB ud,
BTL ofud, BTL openib.
* If you specify --with-verbs=DIR, then all verbs-lovin' component
must configure successfully (or abort), and will use DIR to find
verbs headers and libraries.
* If you specify --without-verbs, then all verbs-lovin' components
will be ignored.
This commit also fixes a problem where the --with-openib=DIR form
would not use DIR for ''all'' verbs-lovin' components (I think only
BTL openib and BTL ofud used that DIR). Now all of them do, as does
hwloc (because hwloc has some !OpenFabrics helper functions that
require ibv types from verbs.h).
There's a little new m4 infrastructure worth mentioning:
* If you create a new verbs-lovin' component (i.e., a component that
need verbs), your configure.m4 should
AC_REQUIRE([OPAL_CHECK_VERBS_DIR]).
* You can then use three global shell variables: $opal_want_verbs,
$opal_verbs_dir, $opal_verbs_libdir, which will be set as follows:
* opal_want_verbs will be "yes" and opal_verbs_dir and
opal_verbs_libdir will both be set to directory values, '''OR'''
* opal_want_verbs will be "no" and opal_verbs_dir and
opal_verbs_libdir will both be set empty
This commit was SVN r26640.
* opal_hwloc_base_cset2str(): Make a human-readable string of a
hwloc_cpuset_t (e.g., socket 2[core 3[hwt 1]])
* opal_hwloc_base_cset2mapstr(): Make a map-like string of a
hwloc_cpuset_t (e.g., [B./..])
This commit was SVN r26532.
will be the final solution. But I'm committing it now so that
Oracle's Solaris Studio builds can resume.
The issue is that the C++ bindings are now (eventually) including
<hwloc.h>. We use !__hwloc_inline__ and #define it to an appropriate
value at compile-time. The issue is that when we're compiling C++
code, we should just set !__hwloc_inline__ to "inline", because that's
a keyword in the C++ language (as opposed to !__inline__, or
somesuch).
This commit was SVN r26418.
helper file, even if we find that the system has <infiniband/verbs.h>.
The reason is because there are some inline functions in that verbs
helper file that invoke ibv_* functions. Some linkers (e.g., Solaris
Studio Compilers) will instantiate those static inline functions --
even if we don't use them -- and therefore we need to be able to
resolve the ibv_* symbols at link time.
But since -libverbs is only specified in places where we use other
ibv_* functions (e.g., the OpenFabrics-based BTLs), that means that
linking random executables can/will fail (e.g., orterun).
So instead, introduce a new #define: OPAL_HWLOC_WANT_VERBS_HELPER. If
this macro is set to 1 before including opal/mca/hwloc/hwloc.h, then
you'll also get the hwloc OpenFabrics verbs helper header file (*if*
hwloc found <infiniband/verbs.h> -- otherwise, it'll #error).
This commit was SVN r26417.
* Remove paffinity, maffinity, and carto frameworks -- they've been
wholly replaced by hwloc.
* Move ompi_mpi_init() affinity-setting/checking code down to ORTE.
* Update sm, smcuda, wv, and openib components to no longer use carto.
Instead, use hwloc data. There are still optimizations possible in
the sm/smcuda BTLs (i.e., making multiple mpools). Also, the old
carto-based code found out how many NUMA nodes were ''available''
-- not how many were used ''in this job''. The new hwloc-using
code computes the same value -- it was not updated to calculate how
many NUMA nodes are used ''by this job.''
* Note that I cannot compile the smcuda and wv BTLs -- I ''think''
they're right, but they need to be verified by their owners.
* The openib component now does a bunch of stuff to figure out where
"near" OpenFabrics devices are. '''THIS IS A CHANGE IN DEFAULT
BEHAVIOR!!''' and still needs to be verified by OpenFabrics vendors
(I do not have a NUMA machine with an OpenFabrics device that is a
non-uniform distance from multiple different NUMA nodes).
* Completely rewrite the OMPI_Affinity_str() routine from the
"affinity" mpiext extension. This extension now understands
hyperthreads; the output format of it has changed a bit to reflect
this new information.
* Bunches of minor changes around the code base to update names/types
from maffinity/paffinity-based names to hwloc-based names.
* Add some helper functions into the hwloc base, mainly having to do
with the fact that we have the hwloc data reporting ''all''
topology information, but sometimes you really only want the
(online | available) data.
This commit was SVN r26391.
(http://www.open-mpi.org/community/lists/devel/2012/04/10905.php), set
opal_cache_line_size via hwloc data, if we have it.
opal_cache_line_size will be set to an hwloc-inspired value by the end
of orte_init(), but will always have a safe value to use (i.e., a
default value 128) -- even before opal_init() has completed.
Default to the same value of 128 that Open MPI has used for several
years if a) we have no hwloc data, or b) we weren't able to find L2
objects in the hwloc data.
This commit was SVN r26322.
Roll in the ORTE state machine. Remove last traces of opal_sos. Remove UTK epoch code.
Please see the various emails about the state machine change for details. I'll send something out later with more info on the new arch.
This commit was SVN r26242.
fakeroot. If so, exit out of the pre-main hook immediately (without
calling functions such as stat, which will be replaced by fakeroot to
things that are not safe to call in a pre-main environment).
This commit was SVN r26203.
The following Trac tickets were found above:
Ticket 2812 --> https://svn.open-mpi.org/trac/ompi/ticket/2812
Worked with Oracle to verify that hwloc PCI detection is correctly
disabled on the Suse 10/64 bit platform and is enabled by default on
all other platforms. The --[en|dis]able-hwloc-pci switch is also
available for manual override of the configure decision about hwloc
PCI support.
This commit was SVN r26175.
This commit was SVN r26165.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r4402
hwloc component weren't reverse applied to the external hwloc
component. Additionaly, if we add stuff to LDFLAGS/LIBS, we also may
need to append (DY)LD_LIBRARY_PATH (here in this configure process
only), otherwise future configure tests may fail because they can't
find libhwloc.so (e.g., if you --with-hwloc=/some/path, we need to add
/some/path/lib to (DY)LD_LIBRARY_PATH).
This commit was SVN r26082.
The following Trac tickets were found above:
Ticket 3043 --> https://svn.open-mpi.org/trac/ompi/ticket/3043
This commit was SVN r26037.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r4340
This commit was SVN r25987.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r4319
This commit was SVN r25986.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r4314
Easiest solution is to set the include in a POST_CONFIG macro based on
whether the configure system says the component was selected or not.
This commit was SVN r25968.
1. no binding support - indicated by a negative return code from get_cpubind
2. binding supported, but not bound - the bitset returned by get_cpubind is the same as the available cpuset
3. binding supported and bound - bitset from get_cpubind is a subset of available cpuset
4. only one cpu is available - in this case, get_cpubind matches the available cpuset, but we are effectively bound
This commit was SVN r25957.
in the tarball. Thanks to Paul Hargrove for the fix.
This commit was SVN r25952.
The following Trac tickets were found above:
Ticket 2951 --> https://svn.open-mpi.org/trac/ompi/ticket/2951
paffinity hwloc was returning "NOT_SUPPORTED" when the real problem
was that the underlying hwloc simply hadn't been initialized yet. So
let's clearly delineate this case: return OPAL_ERR_NOT_INITIALIZED if
the underlying hwloc is not initialized.
This commit was SVN r25902.
into account when configuring the components of the faramework. If
--without-memory-manager was given, then we really don't want any
memory managers to be used.
This commit was SVN r25807.
The following Trac tickets were found above:
Ticket 2844 --> https://svn.open-mpi.org/trac/ompi/ticket/2844
problem on SuSE 10 (which might be related to Oracle's dual-bitness
builds, but we aren't completely sure yet).
So just turn it off for now, and bring this over to v1.5. Find a
proper fix (that enables pci support properly) for trunk/v1.7 later.
This commit was SVN r25769.
The following Trac tickets were found above:
Ticket 2952 --> https://svn.open-mpi.org/trac/ompi/ticket/2952
This commit was SVN r25759.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r4094
This commit was SVN r25707.
The following SVN revision numbers were found above:
r4102 --> open-mpi/ompi@8961ca568d
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r4104
.ompi_ignored to allow other developers to test with it. It is
expected that we'll remove the .ompi_ignore here soon, and
simultaneously remove the hwloc 1.2.2ompi component.
There was one very minor patch added to stock hwloc 1.3.1 in
hwloc/config/hwloc.m4:
{{{
--- hwloc-1.3.1/config/hwloc.m4 2011-12-14
+++ ompi3/opal/mca/hwloc/hwloc131/hwloc/config/hwloc.m4
@@ -583,6 +583,7 @@
])
fi
AC_SUBST(HWLOC_PCI_LIBS)
+ HWLOC_LIBS="$HWLOC_LIBS $HWLOC_PCI_LIBS"
# If we asked for pci support but couldn't deliver, fail
AS_IF([test "$enable_pci" = "yes" -a "$hwloc_pci_happy" = "no"],
[AC_MSG_WARN([Specified --enable-pci switch, but could
not])
}}}
This will be pushed upstream to hwloc.
This commit was SVN r25706.
No need for it to be in the base (we mistakenly thought it was used in
multiple shmem components).
This commit was SVN r25662.
The following SVN revision numbers were found above:
r25652 --> open-mpi/ompi@7e223b5799
* Fixes trac:2329 : Improves the error message, and ensures opal-restart will not segv in opal_finalize.
This commit was SVN r25586.
The following Trac tickets were found above:
Ticket 2329 --> https://svn.open-mpi.org/trac/ompi/ticket/2329
to -I the opal/config directory so that autoregen can find our .m4
files.
This commit was SVN r25564.
The following SVN revision numbers were found above:
r25511 --> open-mpi/ompi@5751c45916
supposed to. I.e., half-baked/not complete stuff.
This commit backs out all of r25545. Sorry folks!
This commit was SVN r25546.
The following SVN revision numbers were found above:
r25545 --> open-mpi/ompi@7f9ae11faf
to make MPI_IN_PLACE (and other sentinel Fortran constants) work on OS
X, we need to use the following compiler (linker) flag:
-Wl,-commons,use_dylibs
So if we're compiling on OS X, test to see if that flag works with the
compiler. If so, add it to the wrapper FFLAGS and FCFLAGS (note that
per a future update, we'll only have one Fortran compiler anyway).
Fixes trac:1982.
This commit was SVN r25545.
The following Trac tickets were found above:
Ticket 1982 --> https://svn.open-mpi.org/trac/ompi/ticket/1982
https://svn.open-mpi.org/trac/ompi/wiki/ProcessPlacement
The wiki page is incomplete at the moment, but I hope to complete it over the next few days. I will provide updates on the devel list. As the wiki page states, the default and most commonly used options remain unchanged (except as noted below). New, esoteric and complex options have been added, but unless you are a true masochist, you are unlikely to use many of them beyond perhaps an initial curiosity-motivated experimentation.
In a nutshell, this commit revamps the map/rank/bind procedure to take into account topology info on the compute nodes. I have, for the most part, preserved the default behaviors, with three notable exceptions:
1. I have at long last bowed my head in submission to the system admin's of managed clusters. For years, they have complained about our default of allowing users to oversubscribe nodes - i.e., to run more processes on a node than allocated slots. Accordingly, I have modified the default behavior: if you are running off of hostfile/dash-host allocated nodes, then the default is to allow oversubscription. If you are running off of RM-allocated nodes, then the default is to NOT allow oversubscription. Flags to override these behaviors are provided, so this only affects the default behavior.
2. both cpus/rank and stride have been removed. The latter was demanded by those who didn't understand the purpose behind it - and I agreed as the users who requested it are no longer using it. The former was removed temporarily pending implementation.
3. vm launch is now the sole method for starting OMPI. It was just too darned hard to maintain multiple launch procedures - maybe someday, provided someone can demonstrate a reason to do so.
As Jeff stated, it is impossible to fully test a change of this size. I have tested it on Linux and Mac, covering all the default and simple options, singletons, and comm_spawn. That said, I'm sure others will find problems, so I'll be watching MTT results until this stabilizes.
This commit was SVN r25476.