Commit graph

1187 commits

Author SHA1 Message Date
Nathan Hjelm
61d331d5b5 MCA/base: fix some warnings and an error in the MCA variable system
This commit was SVN r28909.
2013-07-22 17:52:39 +00:00
Brian Barrett
0d8b57211a add missing include
This commit was SVN r28900.
2013-07-21 20:18:17 +00:00
Nathan Hjelm
1e8ba2b8cf fix condition in common/pmi init that caused pmi to fail if PMI2_Init succeeds
This commit was SVN r28856.
2013-07-19 02:43:42 +00:00
Ralph Castain
4eb0dfa039 This has apparently been wrong for some time! Fix the common/pmi libraries so we build them dynamic so they can be properly linked into the components that use them. Define required library version numbers and do some other cuteness to make it all work.
cmr:v1.7.3:reviewer=jsquyres

This commit was SVN r28842.
2013-07-18 18:42:42 +00:00
Ralph Castain
92c6b806b9 Based on a patch submitted by Piotr Lesnicki of Bull, cleanup the PMI2 support. This has not been tested yet on multiple environments (e.g., Cray), so it needs more evaluation prior to moving to the 1.7 branch.
cmr:v1.7.3:reviewer=rhc

This commit was SVN r28837.
2013-07-18 14:46:07 +00:00
Nathan Hjelm
b88509af36 don't close components that failed to register. cmr:v1.7:reviewer=rhc
This commit was SVN r28823.
2013-07-17 19:49:05 +00:00
Nathan Hjelm
d446675526 MCA: Per-RFC, add support for performance variables
This commit adds an API for registering and querying performance
variables (mca_base_pvar) in the MCA base. The existing MCA variable
system API has been updated to reflect the new API: MCA variable
groups have performance variables, and new types have been added (double,
unsigned long long) to reflect what is required by the MPI_T
interface. Additionally, the MCA variable group code has been split
into its own set of files: mca_base_var_group.[ch].

Details of the new API can be found in doxygen comments in the header:
mca_base_pvar.h.

Other changes to the variable system:

 - Use an opal_hash_table to speed up variable/group lookup.

 - Clean up code associated with MCA variable types.

 - Registered performance variables are printed by ompi_info -a. In the
   future an option should be added to control this behavior.

Changes to OMPI:

 - Added full support for the MPI_T performance variable interface.

This commit was SVN r28800.
2013-07-16 16:02:13 +00:00
Jeff Squyres
14424daf4c Remove auto-generated file
This commit was SVN r28784.
2013-07-13 20:55:09 +00:00
Nathan Hjelm
8f9b7926ec mca/base: fix component selection negation. cmr:v1.7:reviewer=jsquyres
This commit was SVN r28770.
2013-07-12 17:55:20 +00:00
Ralph Castain
b001d31c27 Per RFC, remove libevent 2.0.19 and leave 2.0.21 as the default
This commit was SVN r28767.
2013-07-12 16:37:15 +00:00
Jeff Squyres
9252afdcd9 Updates and tweaks to the documentation of the new MCA parameter
system (written in conjunction with Nathan).

This commit was SVN r28758.
2013-07-11 20:04:51 +00:00
Nathan Hjelm
a694bcb6b6 Add support for the MCA variable information level to ompi_info.
Add an option to ompi_info (-l, --level) that takes a number in the
interval [1,9]. Only MCA variables up to this level will be printed.
The default level is 1.

Print the level as part of both the parsable and readable output.
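
The level filter described in this message can be sketched as follows. This is a minimal Python illustration, not ompi_info's actual code; the function name, the variable names, and the levels in the sample list are all hypothetical.

```python
def filter_by_level(variables, max_level=1):
    """Keep the (name, level) pairs whose level is at most max_level."""
    if not 1 <= max_level <= 9:
        raise ValueError("level must be between 1 and 9")
    return [(name, level) for name, level in variables if level <= max_level]


# Hypothetical variables and levels, for illustration only.
sample = [("var_basic", 1), ("var_tuner", 5), ("var_developer", 9)]

print(filter_by_level(sample))      # default level 1: only var_basic
print(filter_by_level(sample, 5))   # var_basic and var_tuner
```

With the default level of 1, only the most basic variables are shown; raising the level reveals progressively more.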

This commit was SVN r28750.
2013-07-10 18:52:36 +00:00
Ralph Castain
028f5ee7a6 Cleanup some bitrot from moving the db framework to opal and from the new mca param system
This commit was SVN r28741.
2013-07-09 14:37:08 +00:00
Ralph Castain
315da8125d Remove stale headers
cmr:v1.7.3:reviewer=jsquyres

This commit was SVN r28732.
2013-07-08 18:26:58 +00:00
Ralph Castain
eac174e624 For purposes of testing the RFC, make libevent2021 the default for now so it gets tested by MTT
This commit was SVN r28730.
2013-07-05 23:14:22 +00:00
Brian Barrett
ea9cee73c1 Per RFC, remove darwin backtrace, since OS X 10.5 and later support the
execinfo() interface (which has been the default for OMPI to use on Darwin)

This commit was SVN r28727.
2013-07-05 19:06:27 +00:00
Ralph Castain
21c8041a40 Update libevent 2021 component so it also only warns once when detecting reentrant behavior
This commit was SVN r28721.
2013-07-04 04:41:04 +00:00
Ralph Castain
45fad1ddcc We really should be closing the event framework when told to do so.
cmr:v1.7.3,reviewer=jsquyres

This commit was SVN r28714.
2013-07-03 16:57:14 +00:00
Ralph Castain
9166a8cc95 Per telecon today, add a flag so we only warn once about reentrant libevent loops - this will allow developers to better diagnose the problem as we won't swamp filesystems with warning messages.
This commit was SVN r28712.
2013-07-03 04:51:36 +00:00
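
The warn-once flag from commit 9166a8cc95 follows a common pattern, sketched here in Python. The class and method names are hypothetical and do not come from the libevent component itself.

```python
import sys


class OnceWarner:
    """Emit a warning the first time warn() is called, then stay silent."""

    def __init__(self):
        self._warned = False  # the "already warned" flag

    def warn(self, message):
        """Return True if the warning was printed, False if suppressed."""
        if self._warned:
            return False
        self._warned = True
        print("WARNING: " + message, file=sys.stderr)
        return True


warner = OnceWarner()
for _ in range(3):
    warner.warn("reentrant event loop detected")  # printed only once
```

Keeping the flag per object rather than per call site means a long-running job prints one message instead of one per loop iteration.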
Jeff Squyres
ad16bcd6d1 Followup from Justin Bronder: Looks like I spoke too soon. The
sandbox team has informed me that they are getting rid of SANDBOX_PID
in the future and that using SANDBOX_ON would be preferred.

This commit was SVN r28708.
2013-07-03 01:38:26 +00:00
Jeff Squyres
fea15ec34e Add memory hooks override for Gentoo sandbox v2.5, too. Thanks to
Justin Bronder for the patch.

This commit was SVN r28702.
2013-07-02 12:34:51 +00:00
Ralph Castain
446e33a5d8 There are cases where we want to use the novm state machine, but the backend node topology differs from that where mpirun is executing. In those cases, we can wind up thinking we are oversubscribed because the head node has fewer cores than the compute nodes.
To resolve this situation, add the ability to specify a backend topology file that mpirun shall use for its mapping operations. Create a new "set_topology" function in opal hwloc to support it.

This commit was SVN r28682.
2013-06-27 03:04:50 +00:00
Jeff Squyres
dd25421d48 Convert strcpy() to strncpy(), and just to be extra-super paranoid,
use memset(0) for extra bonus points.

This commit was SVN r28668.
2013-06-22 12:21:18 +00:00
Joshua Ladd
0b5c1f2ea8 Add 'generic' support for PMI2 (previously, we checked for PMI2 only on Cray systems.) If your resource manager (e.g. SLURM) has support for PMI2, then the --with-pmi configure flag will enable its usage. If you don't have PMI2, then you will fall back to regular old PMI1. This patch was submitted by Ralph Castain and reviewed and pushed by Josh Ladd. This should be added to cmr:v1.7:reviewer=jladd
This commit was SVN r28666.
2013-06-21 15:28:14 +00:00
Nathan Hjelm
518d1fe200 Fix two typos that prevented alps direct launch from working
This commit was SVN r28628.
2013-06-13 17:04:08 +00:00
Joshua Ladd
46362d2761 Stomps compiler warnings in HCA min-dist calculation. This should be added to cmr:v1.7:reviewer=jladd
This commit was SVN r28620.
2013-06-12 16:25:25 +00:00
Tom Naughton
d86c3ce669 + remove autogenerated 'install-sh'
This commit was SVN r28602.
2013-06-07 20:40:24 +00:00
Jeff Squyres
6d173af329 This commit introduces a new "mindist" ORTE RMAPS mapper, as well as
some relevant updates/new functionality in the opal/mca/hwloc and
orte/mca/rmaps bases.  This work was mainly developed by Mellanox,
with a bunch of advice from Ralph Castain, and some minor advice from
Brice Goglin and Jeff Squyres.

Even though this is mainly Mellanox's work, Jeff is committing only
for logistical reasons (he holds the hg+svn combo tree, and can
therefore commit it directly back to SVN).

-----

Implemented distance-based mapping algorithm as a new "mindist"
component in the rmaps framework.  It maps processes by NUMA node
according to PCI locality information as reported by the BIOS, from
the closest to the device to the furthest.

To use this algorithm, specify:

   {{{mpirun --map-by dist:<device_name>}}}

where <device_name> can be mlx5_0, ib0, etc.

There are two modes provided:

 1. bynode: load-balancing across nodes
 1. byslot: go through slots sequentially (i.e., the first nodes are
     more loaded)

These options are regulated by the optional ''span'' modifier; the
command line parameter looks like:

    {{{mpirun --map-by dist:<device_name>,span}}}

So, for example, if there are 2 nodes, each with 8 cores, and we'd
like to run 10 processes, the mindist algorithm will place 8 processes
on the first node and 2 on the second by default. But if you want to
place 5 processes on each node, you can add a span modifier in your
command line to do that.
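
The difference between the default placement and the span modifier can be modelled in a few lines of Python. This is a simplified sketch of the two policies as described in the example, not the mindist component's code; the function names are hypothetical and oversubscription is simply rejected.

```python
def map_byslot(slots_per_node, nprocs):
    """Default policy: fill each node in order before moving to the next."""
    placed = []
    remaining = nprocs
    for slots in slots_per_node:
        take = min(slots, remaining)
        placed.append(take)
        remaining -= take
    if remaining:
        raise ValueError("not enough slots")
    return placed


def map_span(slots_per_node, nprocs):
    """Span policy: deal processes out round-robin to balance the nodes."""
    placed = [0] * len(slots_per_node)
    remaining = nprocs
    while remaining:
        progressed = False
        for i, slots in enumerate(slots_per_node):
            if remaining and placed[i] < slots:
                placed[i] += 1
                remaining -= 1
                progressed = True
        if not progressed:
            raise ValueError("not enough slots")
    return placed


print(map_byslot([8, 8], 10))  # [8, 2]
print(map_span([8, 8], 10))    # [5, 5]
```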

If there are two NUMA nodes on the node, each with 4 cores, and we run
6 processes, the mindist algorithm will try to find the NUMA node
closest to the specified device, and if successful, it will place 4
processes on that NUMA node and leave the remaining two for the next
NUMA node.

You can also specify the number of cpus per MPI process. This option
is handled so that we map as many processes to the closest NUMA as we
can (number of available processors at the NUMA divided by number of
cpus per rank) and then go on with the next closest NUMA.
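
The capacity rule in the paragraph above (available processors divided by cpus per rank, closest NUMA node first) can be sketched like this. The function and the NUMA node names are hypothetical, and the list is assumed to be already sorted by distance to the device.

```python
def map_by_distance(numa_cpus, nprocs, cpus_per_rank=1):
    """Place processes on NUMA nodes, closest first.

    numa_cpus: list of (name, free_cpu_count), sorted closest-first.
    Returns (placement, leftover): placement is a list of
    (name, process_count); leftover is the number of unplaced processes.
    """
    if cpus_per_rank < 1:
        raise ValueError("cpus_per_rank must be at least 1")
    placement = []
    remaining = nprocs
    for name, free_cpus in numa_cpus:
        capacity = free_cpus // cpus_per_rank  # ranks that fit on this node
        take = min(capacity, remaining)
        placement.append((name, take))
        remaining -= take
    return placement, remaining


numa = [("numa0", 4), ("numa1", 4)]  # numa0 is assumed closest to the device
print(map_by_distance(numa, 6))                    # 4 on numa0, 2 on numa1
print(map_by_distance(numa, 3, cpus_per_rank=2))   # 2 on numa0, 1 on numa1
```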

The default binding option for this mapping is bind-to-numa. It applies
if you don't specify any binding policy. If you specify a binding
level that is "lower" than NUMA (i.e., hwthread, core, socket),
processes are bound at whatever level you specify.

This commit was SVN r28552.
2013-05-22 13:04:40 +00:00
Jeff Squyres
55382c1bf8 Bring over upstream hwloc trunk commit
https://svn.open-mpi.org/trac/hwloc/changeset/5592 to fix the merging
of groups when they are I/O objects.

This commit was SVN r28551.
2013-05-22 12:34:59 +00:00
Nathan Hjelm
721779d7ab Per RFC: remove old MCA parameter system.
This commit was SVN r28541.
2013-05-20 15:36:13 +00:00
Jeff Squyres
089c632cce Remove a bunch of dead code: gcc 4.7 warns of set-but-unused
variables.  So get rid of them.

This commit was SVN r28538.
2013-05-17 21:45:49 +00:00
Jeff Squyres
4d9da92e60 Fixes trac:376: by default the wrapper compilers will enable rpath support
in generated executables on systems that support it.  Use
--disable-wrapper-rpath to disable this behavior.  See text in
README about --disable-wrapper-rpath for more details.

This commit was SVN r28479.

The following Trac tickets were found above:
  Ticket 376 --> https://svn.open-mpi.org/trac/ompi/ticket/376
2013-05-11 00:49:17 +00:00
Jeff Squyres
cad1d920b2 Check to ensure that we have struct ifreq.ifr_mtu before we try to use
it, because although Solaris has SIOCGIFMTU, it curiously does not
have ifreq.ifr_mtu.

This commit was SVN r28460.
2013-05-07 13:51:50 +00:00
Jeff Squyres
4b9b3a81ff Update the list of post-1.5.2 r numbers from hwloc that we have
committed here.

This commit was SVN r28458.
2013-05-07 01:22:06 +00:00
Jeff Squyres
ee0cdf86fd Fix issue raised by Stefan Friedel: remove an extraneous -L that is
added by hwloc's embedding so that it doesn't appear in
libhwloc_embedded.la (and therefore propagate all the way up to
libmpi.la).

Committed upstream in hwloc SVN r5588.

This commit was SVN r28457.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r5588
2013-05-07 01:21:18 +00:00
Ralph Castain
527ea1d090 Per the RFC, always enable libevent thread support.
This commit was SVN r28443.
2013-05-03 15:39:05 +00:00
Ralph Castain
4c0dcb1aa2 Update ignores and remove build product
This commit was SVN r28412.
2013-04-29 19:02:03 +00:00
Ralph Castain
5d7a93c032 Add the ability to use an external version of libevent. Clearly not recommended at this time. I've verified that it works in limited scenarios, but more thorough testing and performance impacts need to be assessed.
Interesting how many includes had to be fixed here and there to fill in missing dependencies :-)

This commit was SVN r28411.
2013-04-29 17:02:37 +00:00
Ralph Castain
3052acd968 Fix minor typo
This commit was SVN r28410.
2013-04-29 17:02:11 +00:00
Ralph Castain
3818e88365 Remove and ignore build products
This commit was SVN r28404.
2013-04-27 00:07:18 +00:00
Ralph Castain
c081a520a3 Fix --without-hwloc
This commit was SVN r28396.
2013-04-25 19:13:56 +00:00
Nathan Hjelm
bccf8c657a Per RFC add initial support for the MPI 3.0 tools interface.
Current MPI_T support:
  - Full cvar interface.
  - Full categories interface.
  - No pvar support at this time.

This commit was SVN r28376.
2013-04-24 15:59:23 +00:00
Ralph Castain
d721437c8d Somebody (accidentally) removed the instructions for updating libevent releases in OMPI, so replace them with at least an outline on how to do it.
This commit was SVN r28349.
2013-04-22 17:05:56 +00:00
Ralph Castain
1dc65b5fd7 Update libevent to 2.0.21-stable, but currently ignore it for all but those testing it
This commit was SVN r28348.
2013-04-22 17:01:07 +00:00
Ralph Castain
6c6681e880 Fix an error in a test in the libevent configure.ac that we introduced - there are two brackets around the entire test code, so no need for double-brackets around the array indices within it
cmr:v1.7.2

This commit was SVN r28347.
2013-04-22 15:29:44 +00:00
Jeff Squyres
e88881c25f Also support getting the MAC and MTU.
This commit was SVN r28344.
2013-04-17 22:17:42 +00:00
Jeff Squyres
eb012c2aad Defensive programming: add a constructor for opal_if_t that zeros
everything out before using it.  

This is not in response to any known bug, but rather just a
pre-emptive, defensive move to help prevent bugs in code that forgets
to initialize a field.

This commit was SVN r28343.
2013-04-17 22:09:02 +00:00
Jeff Squyres
349ee654c1 Fix some --without-hwloc compile errors. Also remove one
assigned-but-not-used variable assignment.

This commit was SVN r28321.
2013-04-10 15:08:31 +00:00
Jeff Squyres
aef371c8f6 Fix bug introduced by r28236: make declaration and instantiation agree
on "const".

This commit was SVN r28320.

The following SVN revision numbers were found above:
  r28236 --> open-mpi/ompi@cf377db823
2013-04-10 14:10:47 +00:00
Ralph Castain
45af6cf59e The move of the orte_db framework to opal required that we create an opaque opal_identifier_t type as OPAL cannot know anything about the ORTE process name. However, passing a value down to opal and then having the db components reference it causes alignment issues on Solaris Sparc platforms. So pass the pointer instead and do the old "memcpy" trick to avoid the problem.
This commit was SVN r28308.
2013-04-08 23:34:16 +00:00
Ralph Castain
4dbc468c3c Remove stale file
This commit was SVN r28299.
2013-04-07 13:52:48 +00:00
Ralph Castain
c121a784ae Remove some weird code around opal_db_close and cleanup that framework's open/close operation
This commit was SVN r28298.
2013-04-07 13:52:28 +00:00
Ralph Castain
10257b8b43 Add missing include
This commit was SVN r28297.
2013-04-07 01:32:08 +00:00
Ralph Castain
1067b1f5ee Add a little debug
This commit was SVN r28295.
2013-04-06 15:24:35 +00:00
Ralph Castain
3bfa53eb91 Cleanup (again) the solaris topology code in hwloc...sigh.
This commit was SVN r28294.
2013-04-06 14:45:32 +00:00
Ralph Castain
ec00fa3132 Fix missing variable declaration in hwloc 1.5.2
This commit was SVN r28293.
2013-04-05 17:43:34 +00:00
Ralph Castain
39a4e93e44 Correct the includes so that compiling with devel headers works
This commit was SVN r28267.
2013-03-30 16:25:24 +00:00
Ralph Castain
d12eed0703 Silence warning
This commit was SVN r28249.
2013-03-27 22:07:29 +00:00
Nathan Hjelm
3b3506717e de-deprecate mca_base_param_init and mca_base_param_finalize as they will be needed until the mca_base_param shim layer goes away
This commit was SVN r28248.
2013-03-27 22:07:23 +00:00
Ralph Castain
95cf39b224 Fix non-updated opal_output channel
This commit was SVN r28245.
2013-03-27 21:57:24 +00:00
Nathan Hjelm
c041156f60 Update ORTE frameworks to use the MCA framework system.
This commit was SVN r28240.
2013-03-27 21:14:43 +00:00
Nathan Hjelm
365cf48db5 Update OPAL frameworks to use the MCA framework system.
This commit was SVN r28239.
2013-03-27 21:11:47 +00:00
Nathan Hjelm
020b9991a4 Introduce the MCA framework system. This formalizes the interface frameworks must provide.
Other changes:
 - Added a flag to the MCA variable system to indicate a variable should go away
   when its group does. Both mca_base_framework_var_register() and
   mca_base_component_var_register() set this flag.

Notes:
 - mca_base_components_open is deprecated. It will be removed in a future commit.
 - All frameworks should use MCA_BASE_FRAMEWORK_DECLARE to declare their
   framework structure.
 - All calls to framework open/close functions should be changed to use the
   mca_base_framework_* functions.
 - Instead of special-casing installdirs a flag was added to prevent calling
   into the variable system when opening a framework.
 - Ralph: Clarify the functional definition of the "register" function in the
   MCA framework object - it had the same name as another function that does a
   totally different thing.
 - As per discussion with Ralph the behavior of mca_base_framework_register()
   is to always call mca_base_framework_components_register() if the framework's
   register function was successful. This removed the need for frameworks to
   have to call this function directly.

This commit was SVN r28237.
2013-03-27 21:10:18 +00:00
Nathan Hjelm
cf377db823 MCA/base: Add new MCA variable system
Features:
 - Support for an override parameter file (openmpi-mca-param-override.conf).
   Variable values in this file cannot be overridden by any file or environment
   value.
 - Support for boolean, unsigned, and unsigned long long variables.
 - Support for true/false values.
 - Support for enumerations on integer variables.
 - Support for MPIT scope, verbosity, and binding.
 - Support for command line source.
 - Support for setting variable source via the environment using
   OMPI_MCA_SOURCE_<var name>=source (either command or file:filename)
 - Cleaner API.
 - Support for variable groups (equivalent to MPIT categories).
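
As an illustration of the OMPI_MCA_SOURCE_<var name> feature in the list above, the following Python sketch parses such a value. It assumes the accepted values are exactly the literal "command" or the prefix "file:" followed by a filename, as the text reads; the function name and the variable name used below are hypothetical.

```python
import os


def parse_var_source(var_name, environ=None):
    """Return ("command", None), ("file", filename), or None if unset."""
    environ = os.environ if environ is None else environ
    value = environ.get("OMPI_MCA_SOURCE_" + var_name)
    if value is None:
        return None
    if value == "command":
        return ("command", None)
    if value.startswith("file:"):
        return ("file", value[len("file:"):])
    raise ValueError("unrecognized source: " + value)


# A plain dict stands in for the process environment here.
env = {"OMPI_MCA_SOURCE_example_var": "file:/tmp/params.conf"}
print(parse_var_source("example_var", env))  # ('file', '/tmp/params.conf')
```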

Notes:
 - Variables must be created with a backing store (char **, int *, or bool *)
   that must live at least as long as the variable.
 - Creating a variable with the MCA_BASE_VAR_FLAG_SETTABLE enables the use of
   mca_base_var_set_value() to change the value.
 - String values are duplicated when the variable is registered. It is up to
   the caller to free the original value if necessary. The new value will be
   freed by the mca_base_var system and must not be freed by the user.
 - Variables with constant scope may not be settable.
 - Variable groups (and all associated variables) are deregistered when the
   component is closed or the component repository item is freed. This
   prevents a segmentation fault from accessing a variable after its component
   is unloaded.
 - After some discussion we decided we should remove the automatic registration
   of component priority variables. Few components actually made use of this
   feature.
 - The enumerator interface was updated to be general enough to handle
   future uses of the interface.
 - The code to generate ompi_info output has been moved into the MCA variable
   system. See mca_base_var_dump().

opal: update core and components to mca_base_var system
orte: update core and components to mca_base_var system
ompi: update core and components to mca_base_var system

This commit also modifies the rmaps framework. The following variables were
moved from ppr and lama: rmaps_base_pernode, rmaps_base_n_pernode,
rmaps_base_n_persocket. Both lama and ppr create synonyms for these variables.

This commit was SVN r28236.
2013-03-27 21:09:41 +00:00
Jeff Squyres
1a048d6ee6 Remove a duplicate variable declaration.
This commit was SVN r28224.
2013-03-27 01:15:27 +00:00
Ralph Castain
317915225c Finish the binding cleanup by removing the no-longer-used binding level scheme. This proved to be fallible as there is no guarantee that the hierarchy it used matched physical reality of the machine (e.g., is L3 "above" the socket or not). Still have to complete the ppr update, but get the rest of it correct.
This commit was SVN r28223.
2013-03-26 20:09:49 +00:00
Ralph Castain
6ee32767d4 Restore the cpus-per-proc option for byslot and bynode mapping. Remove the bind_idx (which recorded the index of the hwloc object where the proc was bound) as this would no longer be unique, and just use the bitmap as the standard reference for location. Update the relative locality computation to take bitmaps as its argument.
This commit was SVN r28219.
2013-03-26 18:27:50 +00:00
Jeff Squyres
6c8d0450a3 Update the post-hwloc-1.5.2 patch list.
This commit was SVN r28218.
2013-03-26 16:18:52 +00:00
Jeff Squyres
f79716dfd4 Include <hwloc.h> so that the symbols in this file are subject to the
<hwloc/rename.h> renaming.

This commit was SVN r28215.
2013-03-26 15:49:52 +00:00
George Bosilca
a856f926de Remove a bunch of unused variables.
This commit was SVN r28213.
2013-03-26 14:34:29 +00:00
Jeff Squyres
6695b5e17a Re-apply r28040 from Eugene: a post-hwloc release fix for Solaris
binding.  This fix was included in the upstream 1.6 series, but not
the upstream 1.5 series, and was therefore missed when we brought
1.5.2 to OMPI.

This commit was SVN r28212.

The following SVN revision numbers were found above:
  r28040 --> open-mpi/ompi@3d44f97572
2013-03-26 13:27:23 +00:00
Ralph Castain
8a79d37ac2 Fix a few bugs in the hwloc integration code. The "set binding policy" macro should flag that the policy was indeed set. Some systems don't report sockets, so the print functions need to check for that condition.
cmr:v1.7

This commit was SVN r28209.
2013-03-25 17:51:45 +00:00
Brian Barrett
bc3ca9e009 Make the linux memory component do the failure path if it was disabled.
This commit was SVN r28206.
2013-03-22 16:56:09 +00:00
Brian Barrett
6c3f986d79 * Fix issue with duplicate symbol for the initialize hook due to it existing in both libmpi and libopen-pal by removing the one for libopen-pal. This won't work if we eventually need registration caching in opal/orte, but I'm hoping that by that point, OFED will have gotten off its butt and properly integrated ummunotify into the verbs layer so that this code can go away.
At the same time, fix a minor issue where the init hook was being called twice, once by the libc malloc and once by our malloc by removing the call from our malloc.

This commit was SVN r28202.
2013-03-21 23:05:54 +00:00
Jeff Squyres
e5838e6121 Don't mandate PCI support, because this will make builds on platforms
that don't have libpciaccess fail (e.g., OS X, or any machine without
libpciaccess).

This commit was SVN r28181.
2013-03-19 16:20:08 +00:00
Jeff Squyres
90802410a8 Update hwloc from 1.5.1 to 1.5.2. Re-enable hwloc PCI support by
default, since it will now use libpciaccess (if available).

This commit was SVN r28178.
2013-03-18 23:02:56 +00:00
Jeff Squyres
f8bbfacf65 Fix CID 967922: minor memory leak possibility.
This commit was SVN r28175.
2013-03-15 17:59:00 +00:00
Brian Barrett
fc2b3b8d46 Ugh. Work around an issue with memory hooks and the change from one big
library to multiple libraries that are implicitly sucked into the executable
as a dependency of libmpi.  The initialize hook isn't visible to libc on some
linux distributions when it's in libopal and libopal isn't explicitly linked
into the executable.  The fix is to have a duplicate initialize hook in
libmpi as well as libopal.  *sigh*.

This commit was SVN r28164.
2013-03-11 19:22:24 +00:00
Ralph Castain
a4b6fb241f Remove all remaining vestiges of the Windows integration
This commit was SVN r28137.
2013-02-28 17:31:47 +00:00
Ralph Castain
8d2fa3693b First cut at removing the native Windows support. Remove all the Windows-specific components, and the .windows files sprinkled around. Remove the Windows platform files and MTT scripts. Update the NEWS to point Windows users to the cygwin package.
This commit was SVN r28116.
2013-02-26 20:44:56 +00:00
Ralph Castain
9479635e31 Missing include here too...
This commit was SVN r28115.
2013-02-26 20:21:10 +00:00
Ralph Castain
8b8333da3e Add missing include
This commit was SVN r28114.
2013-02-26 19:56:05 +00:00
Ralph Castain
e413596705 Add the loopexit API to the opal_event definitions
This commit was SVN r28113.
2013-02-26 19:27:26 +00:00
Ralph Castain
bd9265c560 Per the meeting on moving the BTLs to OPAL, move the ORTE database "db" framework to OPAL so the relocated BTLs can access it. Because the data is indexed by process, this requires that we define a new "opal_identifier_t" that corresponds to the orte_process_name_t struct. In order to support multiple run-times, this is defined in opal/mca/db/db_types.h as a uint64_t without identifying the meaning of any part of that data.
A few changes were required to support this move:

1. the PMI component used to identify rte-related data (e.g., host name, bind level) and package them as a unit to reduce the number of PMI keys. This code was moved up to the ORTE layer as the OPAL layer has no understanding of these concepts. In addition, the component locally stored data based on process jobid/vpid - this could no longer be supported (see below for the solution).

2. the hash component was updated to use the new opal_identifier_t instead of orte_process_name_t as its index for storing data in the hash tables. Previously, we did a hash on the vpid and stored the data in a 32-bit hash table. In the revised system, we don't see a separate "vpid" field - we only have a 64-bit opaque value. The orte_process_name_t hash turned out to do nothing useful, so we now store the data in a 64-bit hash table. Preliminary tests didn't show any identifiable change in behavior or performance, but we'll have to see if a move back to the 32-bit table is required at some later time.

3. the db framework was a "select one" system. However, since the PMI component could no longer use its internal storage system, the framework has now been changed to a "select many" mode of operation. This allows the hash component to handle all internal storage, while the PMI component only handles pushing/pulling things from the PMI system. This was something we had planned for some time - when fetching data, we first check internal storage to see if we already have it, and then automatically go to the global system to look for it if we don't. Accordingly, the framework was provided with a custom query function used during "select" that lets you separately specify the "store" and "fetch" ordering.

4. the ORTE grpcomm and ess/pmi components, and the nidmap code,  were updated to work with the new db framework and to specify internal/global storage options.

No changes were made to the MPI layer, except for modifying the ORTE component of the OMPI/rte framework to support the new db framework.
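
The fetch ordering described in item 3 (check internal storage first, then go to the global system) can be sketched as below. This is a toy Python model under stated assumptions: the Component and Db classes are hypothetical stand-ins for the hash and PMI components, and the real framework's query function and priorities are not modelled.

```python
class Component:
    """Hypothetical stand-in for a db component such as hash or pmi."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def store(self, proc_id, key, value):
        self.data[(proc_id, key)] = value

    def fetch(self, proc_id, key):
        return self.data.get((proc_id, key))


class Db:
    """Consult components in fetch order and return the first hit."""

    def __init__(self, fetch_order):
        self.fetch_order = fetch_order
        self.consulted = []  # names of components asked, for illustration

    def fetch(self, proc_id, key):
        for component in self.fetch_order:
            self.consulted.append(component.name)
            value = component.fetch(proc_id, key)
            if value is not None:
                return value
        return None


local_store = Component("hash")   # internal storage
global_store = Component("pmi")   # global system
global_store.store(42, "hostname", "node01")

db = Db([local_store, global_store])
print(db.fetch(42, "hostname"))   # node01
print(db.consulted)               # ['hash', 'pmi']
```

The process identifier is a plain integer here, mirroring the opaque 64-bit opal_identifier_t mentioned in the message.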

This commit was SVN r28112.
2013-02-26 17:50:04 +00:00
Brian Barrett
7c3e42a689 Work around issue shown in #3505 by not linking against libpci by default.
This commit was SVN r28076.
2013-02-19 16:19:33 +00:00
Jeff Squyres
acefc1588e Patch for Cygwin support: Use S_IRWXU for shmget() and include
<sys/stat.h>.  Thanks to Marco Atzeri for reporting the issue and
providing an initial patch.

This commit was SVN r28060.
2013-02-15 14:31:58 +00:00
Ralph Castain
037918e7b4 Correctly parse the rank file slot_list when given "S:C" - the first position holds the socket, so start looking for cores at posn=1
This commit was SVN r28054.
2013-02-13 13:06:03 +00:00
Eugene Loh
3d44f97572 Fix hwloc get-cpubind routine for Solaris. FIRST, check
processor_bind to see if we're bound to a single core.
If not, THEN check lgroup affinity.  Already CMR'ed to
v1.6 (trac 3507) and fixed upstream in hwloc (r5295).

This commit was SVN r28040.

The following SVN revision numbers were found above:
  r5295 --> open-mpi/ompi@6df8cb0f02
2013-02-10 04:02:19 +00:00
Nathan Hjelm
05a8958bb0 shmem_RUNTIME_QUERY_hint is not really read_only as it is set from the environment not the default value
This commit was SVN r28005.
2013-01-31 23:41:00 +00:00
Brian Barrett
b8442ba505 Revamp the handling of wrapper compiler flags. The user flags, main configure
flags, and mca flags are kept separate until the very end.  The main configure
wrapper flags should now be modified by using the OPAL_WRAPPER_FLAGS_ADD
macro.  MCA components should either let <framework>_<component>_{LIBS,LDFLAGS}
be copied over OR set <framework>_<component>_WRAPPER_EXTRA_{LIBS,LDFLAGS}.
The situations in which WRAPPER CPPFLAGS can be set by MCA components were
made very small to match the one use case where it makes sense.

This commit was SVN r27950.
2013-01-29 00:00:43 +00:00
Ralph Castain
f6b4db0b79 Fix rank_file operations. We changed the syntax to use semi-colons between multiple slot assignments so that we could use the comma to separate specific cores, but somehow the flex definitions didn't get updated to accept that character. We also incorrectly zero'd the bitmap between slot assignment sections, so with multiple slot assignments only the last one in the list took effect.
This commit was SVN r27908.
2013-01-25 18:33:25 +00:00
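
The two fixes in commit f6b4db0b79 (semicolons between slot assignments, and not resetting the bitmap between sections) can be illustrated with a small Python parser. The grammar here is a simplified assumption that handles only the socket:cores form with comma lists and ranges; it is not the flex definition used by the rank_file component, and a set of (socket, core) pairs stands in for the bitmap.

```python
def parse_slot_list(slot_list):
    """Parse e.g. "0:0,1;1:2-3" into a set of (socket, core) pairs."""
    bound = set()  # accumulated across all sections, never reset per section
    for section in slot_list.split(";"):
        # The first field is the socket; cores start after the colon.
        socket_str, _, cores_str = section.partition(":")
        socket = int(socket_str)
        for item in cores_str.split(","):
            low, _, high = item.partition("-")
            first = int(low)
            last = int(high) if high else first
            for core in range(first, last + 1):
                bound.add((socket, core))
    return bound


print(sorted(parse_slot_list("0:0,1;1:2-3")))
# [(0, 0), (0, 1), (1, 2), (1, 3)]
```

Had the set been cleared at the top of each section, only the cores from the final section would survive, which is the behavior the commit message describes as the bug.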
Brian Barrett
cdf0325589 Keep libevent headers from being installed in the wrong place. The top level
Makefile.am gets them installed in the right place already

This commit was SVN r27903.
2013-01-24 22:23:51 +00:00
Brian Barrett
4f41f5ce5b OPAL_WRAPPER_EXTRA_CPPFLAGS is the wrong variable, want to set
WRAPPER_EXTRA_CPPFLAGS

This commit was SVN r27886.
2013-01-21 23:37:35 +00:00
Ralph Castain
92e297d1fa Pack/unpack the disk and net stats so they get passed along
This commit was SVN r27844.
2013-01-16 21:54:48 +00:00
Samuel Gutierrez
cba06776f1 Fix copy and paste error in linux memory component debug output.
This commit was SVN r27842.
2013-01-16 18:27:57 +00:00
Ralph Castain
f29f1b731c Extend the node statistics to include disk and network traffic data.
This commit was SVN r27834.
2013-01-15 22:42:36 +00:00
Ralph Castain
2379b7369f Hey Jeff - AC_HELP_STRING takes *two* arguments, dude!
This commit was SVN r27820.
2013-01-15 15:25:58 +00:00
Jeff Squyres
e30d9a2bfb The "external" hwloc component didn't have the same fixes applied to
it that the others did: move the "I won!" code up into the POST_CONFIG
macro.  Also, fix a long-standing typo when restoring the $CPPFLAGS (!).

This commit was SVN r27813.
2013-01-14 21:44:47 +00:00
Jeff Squyres
423208932e HWLOC_DO_AM_CONDITIONALS must be run unconditionally.
This commit was SVN r27812.
2013-01-14 21:43:16 +00:00
Jeff Squyres
c17ec83de3 Add some post-v1.5.1 release hwloc bug fixes
This commit was SVN r27805.
2013-01-14 16:25:21 +00:00