1
1
Граф коммитов

2576 Коммитов

Автор SHA1 Сообщение Дата
Nathan Hjelm
99adeb7f6e Fix support for complex datatypes when fortran is not available but _Complex is
This commit was SVN r28951.
2013-07-25 19:08:21 +00:00
Nathan Hjelm
ebbb32120a MCA/base: variable system updates
- Use an enumerator to handle bool values.

 - Fix a leak in the variable enumerator.

 - Fix a leak in an orte parameter.

This commit was SVN r28949.
2013-07-25 15:42:01 +00:00
Ralph Castain
41f97931e9 Need to include module-level CPPFLAGS so it can build
This commit was SVN r28947.
2013-07-24 23:07:43 +00:00
Nathan Hjelm
c4c69b4ddf MPI-3: add support for large counts using derived datatypes
Add support for MPI_Count type and MPI_COUNT datatype and add the required
MPI-3 functions MPI_Get_elements_x, MPI_Status_set_elements_x,
MPI_Type_get_extent_x, MPI_Type_get_true_extent_x, and MPI_Type_size_x.
This commit adds only the C bindings. Fortran bindins will be added in
another commit. For now the MPI_Count type is define to have the same size
as MPI_Offset. The type is required to be at least as large as MPI_Offset
and MPI_Aint. The type was initially intended to be a ssize_t (if it was
the same size as a long long) but there were issues compiling romio with
that definition (despite the inclusion of stddef.h).

I updated the datatype engine to use size_t instead of uint32_t to support
large datatypes. This will require some review to make sure that 1) the
changes are beneficial, 2) nothing was broken by the change (I doubt
anything was), and 3) there are no performance regressions due to this
change.

Increase the maximum number of predifined datatypes to support MPI_Count

Put common get_elements code to ompi/datatype/ompi_datatype_get_elements.c

Update MPI_Get_count to reflect changes in MPI-3 (return MPI_UNDEFINED when the count is too large for an int)

This commit was SVN r28932.
2013-07-23 15:35:14 +00:00
Ralph Castain
6c1a140e99 Per request from Nathan, add a "commit" API to the opal db framework. This allows him to aggregate keys to work around the Cray's severe PMI limitations
This commit was SVN r28917.
2013-07-22 22:57:16 +00:00
Jeff Squyres
49b5342130 After talking with Nathan, update some comments/documentation about
the new MCA var and pvar systems.

This commit was SVN r28913.
2013-07-22 20:34:42 +00:00
Nathan Hjelm
61d331d5b5 MCA/base: fix some warnings and an error in the MCA variable system
This commit was SVN r28909.
2013-07-22 17:52:39 +00:00
Brian Barrett
0d8b57211a add missing include
This commit was SVN r28900.
2013-07-21 20:18:17 +00:00
Nathan Hjelm
1e8ba2b8cf fix condition in common/pmi init that c caused pmi to fail if PMI2_Init succeeds
This commit was SVN r28856.
2013-07-19 02:43:42 +00:00
Ralph Castain
4eb0dfa039 This has apparently been wrong for some time! Fix the common/pmi libraries so we build them dynamic so they can be properly linked into the components that use them. Define required library version numbers and so some other cuteness to make it all work.
cmr:v1.7.3:reviewer=jsquyres

This commit was SVN r28842.
2013-07-18 18:42:42 +00:00
Ralph Castain
92c6b806b9 Based on a patch submitted by Piotr Lesnicki of Bull, cleanup the PMI2 support. This has not been tested yet on multiple environments (e.g., Cray), so it needs more evaluation prior to moving to the 1.7 branch.
cmr:v1.7.3:reviewer=rhc

This commit was SVN r28837.
2013-07-18 14:46:07 +00:00
Nathan Hjelm
b88509af36 don't close components that failed to register. cmr:v1.7:reviewer=rhc
This commit was SVN r28823.
2013-07-17 19:49:05 +00:00
Nathan Hjelm
456de007a8 ignore unavailable components when registering
This commit was SVN r28802.
2013-07-16 16:02:33 +00:00
Nathan Hjelm
d446675526 MCA: Per-RFC, add support for performance variables
This commit adds an API for registering and querying performance
variables (mca_base_pvar) in the MCA base. The existing MCA variable
system API has been updated to reflect the new API: MCA variable
groups have performance variables, and new types have been added (double,
unsigned long long) to reflect what is required by the MPI_T
interface. Additionally, the MCA variable group code has been split
into its own set of files: mca_base_var_group.[ch].

Details of the new API can be found in doxygen comments in the header:
mca_base_pvar.h.

Other changes to the variable system:

 - Use an opal_hash_table to speed up variable/group lookup.

 - Clean up code associated with MCA variable types.

 - Registered performance variables are printed by ompi_info -a. In the
   future an option should be added to control this behavior.

Changes to OMPI:

 - Added full support for the MPI_T performance variable interface.

This commit was SVN r28800.
2013-07-16 16:02:13 +00:00
Ralph Castain
10ca1c1b04 Turns out that there was exactly ONE place in all of the OMPI code base that still referred to OPAL_TRACE, though a few places retained the include file for no reason. So no point in letting this sit as it is clearly an unused "feature".
This commit was SVN r28789.
2013-07-14 18:57:20 +00:00
Jeff Squyres
14424daf4c Remove auto-generated file
This commit was SVN r28784.
2013-07-13 20:55:09 +00:00
Nathan Hjelm
8f9b7926ec mca/base: fix component selection negation. cmr:v1.7:reviewer=jsquyres
This commit was SVN r28770.
2013-07-12 17:55:20 +00:00
Ralph Castain
b001d31c27 Per RFC, remove libevent 2.0.19 and leave 2.0.21 as the default
This commit was SVN r28767.
2013-07-12 16:37:15 +00:00
Jeff Squyres
9252afdcd9 Updates and tweaks to the documentation of the new MCA parameter
system (written in conjunction with Nathan).

This commit was SVN r28758.
2013-07-11 20:04:51 +00:00
Jeff Squyres
bdb45a2e4f Add an oh-so-slightly faster variant of the hotel "checkin" action
(since this is used in the fast path) for when you ''know'' that there
will be a room available:

 * Don't do the last_unoccupied_room check
 * Return void

This commit was SVN r28757.
2013-07-11 20:00:37 +00:00
Nathan Hjelm
a694bcb6b6 Add support for the MCA variable information level to ompi_info.
Add an option to ompi_info (-l, --level) that takes a number in the
interval (1,9). Only MCA variables up to this level will be printed.
The default level is 1.

Print the level as part of both the parsable and readable output.

This commit was SVN r28750.
2013-07-10 18:52:36 +00:00
Ralph Castain
028f5ee7a6 Cleanup some bitrot from moving the db framework to opal and from the new mca param system
This commit was SVN r28741.
2013-07-09 14:37:08 +00:00
Ralph Castain
315da8125d Remove stale headers
cmr:v1.7.3:reviewer=jsquyres

This commit was SVN r28732.
2013-07-08 18:26:58 +00:00
Ralph Castain
2ccc0438af On some systems, pthread_kill is actually in the "signals.h" header, so include it
This commit was SVN r28731.
2013-07-08 17:40:38 +00:00
Ralph Castain
eac174e624 For purposes of testing the RFC, make libevent2021 the default for now so it gets tested by MTT
This commit was SVN r28730.
2013-07-05 23:14:22 +00:00
Brian Barrett
ea9cee73c1 Per RFC, remove darwin backtrace, since OS X since 10.5 has supported the
execinfo() interface (which has been the default for OMPI to use on Darwin)

This commit was SVN r28727.
2013-07-05 19:06:27 +00:00
Ralph Castain
21c8041a40 Update libevent 2021 component so it also only warns once when detecting reentrant behavior
This commit was SVN r28721.
2013-07-04 04:41:04 +00:00
Ralph Castain
bd65937bf3 If we enable ipv6, we resolve a hosts addresses and check them all against our local interfaces to determine if the given host is us. However, if we don't enable ipv6, we only checked the first address returned. This can cause us to incorrectly identify a hostname as "not us".
Make -disable-ipv6 behave the same as --enable-ipv6 by checking all the returned addresses.

This commit was SVN r28716.
2013-07-03 21:41:36 +00:00
Ralph Castain
45fad1ddcc We really should be closing the event framework when told to do so.
cmr:v1.7.3,reviewer=jsquyres

This commit was SVN r28714.
2013-07-03 16:57:14 +00:00
Ralph Castain
9166a8cc95 Per telecon today, add a flag so we only warn once about reentrant libevent loops - this will allow developers to better diagnose the problem as we won't swamp filesystems with warning messages.
This commit was SVN r28712.
2013-07-03 04:51:36 +00:00
Jeff Squyres
ad16bcd6d1 Followup from Justin Bronder: Looks like I spoke too soon. The
sandbox team has informed me that they are getting rid of SANDBOX_PID
in the future and that using SANDBOX_ON would be preferred.

This commit was SVN r28708.
2013-07-03 01:38:26 +00:00
Jeff Squyres
fea15ec34e Add memory hooks override for Gentoo sandbox v2.5, too. Thanks to
Justin Bronder for the patch.

This commit was SVN r28702.
2013-07-02 12:34:51 +00:00
George Bosilca
a5bda43cfc Small typo.
This commit was SVN r28689.
2013-07-01 16:48:45 +00:00
Ralph Castain
446e33a5d8 There are cases where we want to use the novm state machine, but the backend node topology differs from that where mpirun is executing. In those cases, we can wind up thinking we are oversubscribed because the head node has fewer cores than the compute nodes.
To resolve this situation, add the ability to specify a backend topology file that mpirun shall use for its mapping operations. Create a new "set_topology" function in opal hwloc to support it.

This commit was SVN r28682.
2013-06-27 03:04:50 +00:00
Jeff Squyres
dd25421d48 Convert strcpy() to strncpy(), and just to be extra-super paranoid,
use memset(0) for extra bonus points.

This commit was SVN r28668.
2013-06-22 12:21:18 +00:00
Joshua Ladd
0b5c1f2ea8 Add 'generic' support for PMI2 (previously, we checked for PMI2 only on Cray systems.) If your resource manager (e.g. SLURM) has support for PMI2, then the --with-pmi configure flag will enable its usage. If you don't have PMI2, then you will fallback to regular old PMI1. This patch was submitted by Ralph Castain and reviewed and pushed by Josh Ladd. This should be added to cmr:v1.7:reviewer=jladd
This commit was SVN r28666.
2013-06-21 15:28:14 +00:00
George Bosilca
f5a55ccb39 Various cleanups.
This commit was SVN r28647.
2013-06-15 16:23:11 +00:00
George Bosilca
a6c3477e89 Remove useless include.
This commit was SVN r28646.
2013-06-15 16:07:30 +00:00
Nathan Hjelm
8924140916 Per RFC: use a better hash algorithm for the opal_hash_table_*_ptr functions.
Chose the crc32 function present in opal/util/crc.c as the hash function. The
performance should be sufficient for most cases. If not we can always change
the function again.

This commit was SVN r28629.
2013-06-13 17:11:04 +00:00
Nathan Hjelm
518d1fe200 Fix two typos that prevented alps direct launch from working
This commit was SVN r28628.
2013-06-13 17:04:08 +00:00
Joshua Ladd
46362d2761 Stomps compiler warnings in HCA min-dist calculation. This should be added to cmr:v1.7:reviewer=jladd
This commit was SVN r28620.
2013-06-12 16:25:25 +00:00
Tom Naughton
d86c3ce669 + remove autogenerated 'install-sh'
This commit was SVN r28602.
2013-06-07 20:40:24 +00:00
Rolf vandeVaart
62ab008017 Fix SEGV because missing CUDA initialization.
This commit was SVN r28601.
2013-06-07 18:31:36 +00:00
Rolf vandeVaart
1230029aa1 The debug messages were swapped. Fixed.
This commit was SVN r28600.
2013-06-07 17:23:41 +00:00
George Bosilca
72877f078f Based on the MPI 3.0 count equal to zero has a clear meaning, no modification
of the original datatype are allowed (not in type map nor extent). Make it
clear in the code.
Allow 0-count cases to the contiguous memory check.

This commit was SVN r28568.
2013-05-29 16:02:54 +00:00
Jeff Squyres
6d173af329 This commit introduces a new "mindist" ORTE RMAPS mapper, as well as
some relevant updates/new functionality in the opal/mca/hwloc and
orte/mca/rmaps bases.  This work was mainly developed by Mellanox,
with a bunch of advice from Ralph Castain, and some minor advice from
Brice Goglin and Jeff Squyres.

Even though this is mainly Mellanox's work, Jeff is committing only
for logistical reasons (he holds the hg+svn combo tree, and can
therefore commit it directly back to SVN).

-----

Implemented distance-based mapping algorithm as a new "mindist"
component in the rmaps framework.  It allows mapping processes by NUMA
due to PCI locality information as reported by the BIOS - from the
closest to device to furthest.

To use this algorithm, specify:

   {{{mpirun --map-by dist:<device_name>}}}

where <device_name> can be mlx5_0, ib0, etc.

There are two modes provided:

 1. bynode: load-balancing across nodes
 1. byslot: go through slots sequentially (i.e., the first nodes are
     more loaded)

These options are regulated by the optional ''span'' modifier; the
command line parameter looks like:

    {{{mpirun --map-by dist:<device_name>,span}}}

So, for example, if there are 2 nodes, each with 8 cores, and we'd
like to run 10 processes, the mindist algorithm will place 8 processes
to the first node and 2 to the second by default. But if you want to
place 5 processes to each node, you can add a span modifier in your
command line to do that.

If there are two NUMA nodes on the node, each with 4 cores, and we run
6 processes, the mindist algorithm will try to find the NUMA closest
to the specified device, and if successful, it will place 4 processes
on that NUMA but leaving the remaining two to the next NUMA node.

You can also specify the number of cpus per MPI process. This option
is handled so that we map as many processes to the closest NUMA as we
can (number of available processors at the NUMA divided by number of
cpus per rank) and then go on with the next closest NUMA.

The default binding option for this mapping is bind-to-numa. It works
if you don't specify any binding policy. But if you specified binding
level that was "lower" than NUMA (i.e hwthread, core, socket) it would
bind to whatever level you specify.

This commit was SVN r28552.
2013-05-22 13:04:40 +00:00
Jeff Squyres
55382c1bf8 Bring over upstream hwloc trunk commit
https://svn.open-mpi.org/trac/hwloc/changeset/5592 to fix the merging
of groups when they are I/O objects.

This commit was SVN r28551.
2013-05-22 12:34:59 +00:00
Nathan Hjelm
721779d7ab Per RFC: remove old MCA parameter system.
This commit was SVN r28541.
2013-05-20 15:36:13 +00:00
Jeff Squyres
089c632cce Remove a bunch of dead code: gcc 4.7 warns of set-but-unused
variables.  So get rid of them.

This commit was SVN r28538.
2013-05-17 21:45:49 +00:00
Ralph Castain
1ec13d530c Allow simple way to request comparison to full address regardless of addr family
This commit was SVN r28519.
2013-05-14 22:08:39 +00:00
Ralph Castain
eb2edb4b2b Silence warning
This commit was SVN r28516.
2013-05-14 22:00:01 +00:00
Rolf vandeVaart
8a8ea9ba1b Fix compile error in optimize build for CUDA-aware code.
This commit was SVN r28512.
2013-05-14 21:07:27 +00:00
Ralph Castain
37088f23d8 When ipv6 disabled, we still have getaddrinfo, so use it when checking common networks for resolving to kindex
This commit was SVN r28496.
2013-05-14 15:54:46 +00:00
Ralph Castain
3fc1bafd82 fix typo
This commit was SVN r28490.
2013-05-14 12:36:45 +00:00
Ralph Castain
f4f07bdb21 Ensure the opal_ifaddrtokindex function considers the full range of address space by using the netmask
This commit was SVN r28487.
2013-05-14 03:37:44 +00:00
Ralph Castain
c33219a51b Extend the bitmap API a bit to provide a test if all bits zero
This commit was SVN r28486.
2013-05-14 03:34:57 +00:00
Jeff Squyres
4d9da92e60 Fixes trac:376: bu default the wrappr compilers will enable rpath support
in generated executables on systems that support it.  Use
--disable-wrapper-rpath to disable this behavior.  See text in
README about --disable-wrapper-rpath for more details.

This commit was SVN r28479.

The following Trac tickets were found above:
  Ticket 376 --> https://svn.open-mpi.org/trac/ompi/ticket/376
2013-05-11 00:49:17 +00:00
Jeff Squyres
cad1d920b2 Check to ensure that we have struct ifreq.ifr_mtu before we try to use
it, because Solaris although has SIOCFIGMTU, it curiously does not
have ifreq.ifr_mtu.

This commit was SVN r28460.
2013-05-07 13:51:50 +00:00
Jeff Squyres
4b9b3a81ff Update the list of post-1.5.2 r numbers from hwloc that we have
committed here.

This commit was SVN r28458.
2013-05-07 01:22:06 +00:00
Jeff Squyres
ee0cdf86fd Fix issue raised by Stefan Friedel: remove an extraneous -L that is
added by hwloc's embedding so that it doesn't appear in
libhwloc_embedded.la (and therefore propogate all the way up to
libmpi.la). 

Committed upstream in hwloc SVN r5588.

This commit was SVN r28457.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r5588
2013-05-07 01:21:18 +00:00
Ralph Castain
527ea1d090 Per the RFC, always enable libevent thread support.
This commit was SVN r28443.
2013-05-03 15:39:05 +00:00
George Bosilca
1169ebdff8 Indentation.
This commit was SVN r28426.
2013-04-30 23:26:23 +00:00
Ralph Castain
4c0dcb1aa2 Update ignores and remove build product
This commit was SVN r28412.
2013-04-29 19:02:03 +00:00
Ralph Castain
5d7a93c032 Add the ability to use an external version of libevent. Clearly not recommended at this time. I've verified that it works in limited scenarios, but more thorough testing and performance impacts need to be assessed.
Interesting how many includes had to be fixed here and there to fill in missing dependencies :-)

This commit was SVN r28411.
2013-04-29 17:02:37 +00:00
Ralph Castain
3052acd968 Fix minor typo
This commit was SVN r28410.
2013-04-29 17:02:11 +00:00
Ralph Castain
3818e88365 Remove and ignore build products
This commit was SVN r28404.
2013-04-27 00:07:18 +00:00
Jeff Squyres
5bf9fffacd As initially reported by Eric Chamberland in
http://www.open-mpi.org/community/lists/users/2013/04/21689.php, the
assert in opal_datatype_is_contiguous_memory_layout() is not always
correct -- he supplied a test case where it was not valid,
essentially:

 1. Call MPI_Type_create_indexed_block(0, ..., &newtype) and commit newtype
 1. Call MPI_Type_create_resized(newtype, 0, nonzero_value, &resized) and commit resized
 1. Call MPI_File_set_view with resized

This will eventually call opal_datatype_is_contiguous_memory_layout(),
and the assert will fail.  After some consultation with George, it was
determined that the assert() is basically good, but it needs to also
check for (count != 0).

This commit was SVN r28398.
2013-04-25 20:54:25 +00:00
Ralph Castain
b73f25e839 Add a function to return the kernel index of the corresponding interface from an IPv4/6 string or hostname
This commit was SVN r28397.
2013-04-25 19:40:34 +00:00
Ralph Castain
c081a520a3 Fix --without-hwloc
This commit was SVN r28396.
2013-04-25 19:13:56 +00:00
Ralph Castain
cef639f578 Ahem....cleanup a copy/paste error in naming of these functions
This commit was SVN r28395.
2013-04-25 15:21:53 +00:00
Nathan Hjelm
c50b99005d fix typo in opal_info_show_component_version and clean up more from ompi_info
This commit was SVN r28389.
2013-04-24 22:07:06 +00:00
Nathan Hjelm
4896b3bc4b clean up some ompi_info code
This commit was SVN r28388.
2013-04-24 21:37:24 +00:00
Ralph Castain
4fae24f2f1 Crud - missed this file, needs to go with prior commit, will add to cmr
This commit was SVN r28382.
2013-04-24 17:47:18 +00:00
Nathan Hjelm
bccf8c657a Per RFC add initial support for the MPI 3.0 tools interface.
Current MPI_T support:
  - Full cvar interface.
  - Full categories interface.
  - No pvar support at this time.

This commit was SVN r28376.
2013-04-24 15:59:23 +00:00
Ralph Castain
d721437c8d Somebody (accidentally) removed the instructions for updating libevent releases in OMPI, so replace them with at least an outline on how to do it.
This commit was SVN r28349.
2013-04-22 17:05:56 +00:00
Ralph Castain
1dc65b5fd7 Update libevent to 2.0.21-stable, but currently ignore it for all but those testing it
This commit was SVN r28348.
2013-04-22 17:01:07 +00:00
Ralph Castain
6c6681e880 Fix an error in a test in the libevent configure.ac that we introduced - there are two brackets around the entire test code, so no need for double-brackets around the array indices within it
cmr:v1.7.2

This commit was SVN r28347.
2013-04-22 15:29:44 +00:00
Rolf vandeVaart
5e1dde419c Fix some compile errors in CUDA-aware code that has crept in.
This commit was SVN r28346.
2013-04-18 15:34:16 +00:00
Jeff Squyres
c722440411 Add public functions for retrieving the MAC and MTU (paired with
r28344).

This commit was SVN r28345.

The following SVN revision numbers were found above:
  r28344 --> open-mpi/ompi@e88881c25f
2013-04-17 22:32:32 +00:00
Jeff Squyres
e88881c25f Also support getting the MAC and MTU.
This commit was SVN r28344.
2013-04-17 22:17:42 +00:00
Jeff Squyres
eb012c2aad Defensive programming: add a constructor for opal_if_t that zeros
everything out before using it.  

This is not in response to any known bug, but rather just a
pre-emptive, defensive move to help prevent bugs in code that forgets
to initialize a field.

This commit was SVN r28343.
2013-04-17 22:09:02 +00:00
Ralph Castain
d0e34adacb Add debug
This commit was SVN r28331.
2013-04-15 13:09:43 +00:00
Jeff Squyres
349ee654c1 Fix some --without-hwloc compile errors. Also remove one
assigned-but-not-used variable assignment.

This commit was SVN r28321.
2013-04-10 15:08:31 +00:00
Jeff Squyres
aef371c8f6 Fix bug introduced by r28236: make declaration and instantiation agree
on "const".

This commit was SVN r28320.

The following SVN revision numbers were found above:
  r28236 --> open-mpi/ompi@cf377db823
2013-04-10 14:10:47 +00:00
George Bosilca
43e4d3654e Fix an issue identified by Thomas Jahns and his colleague when the data
representation is not correctly optimized (it is off by the extend).

During the data representation process, if the opportunity to merge several
items appear, we replace them with the new merged element. However, if one
of the components of this merged element was comming from a "loop representation"
then the new first element of this loop must have a displacement moved by the
extent of the loop.

This commit was SVN r28319.
2013-04-09 23:01:54 +00:00
Ralph Castain
45af6cf59e The move of the orte_db framework to opal required that we create an opaque opal_identifier_t type as OPAL cannot know anything about the ORTE process name. However, passing a value down to opal and then having the db components reference it causes alignment issues on Solaris Sparc platforms. So pass the pointer instead and do the old "memcpy" trick to avoid the problem.
This commit was SVN r28308.
2013-04-08 23:34:16 +00:00
George Bosilca
c5909bffe8 Make the opal_convertor_raw similar to opal_convertor_pack and _unpack, by
allowing it to handle completed convertors. In this case it will return a
length of zero and an iov_count set to zero.

This commit was SVN r28305.
2013-04-08 13:49:14 +00:00
Ralph Castain
4dbc468c3c Remove stale file
This commit was SVN r28299.
2013-04-07 13:52:48 +00:00
Ralph Castain
c121a784ae Remove some weird code around opal_db_close and cleanup that framework's open/close operation
This commit was SVN r28298.
2013-04-07 13:52:28 +00:00
Ralph Castain
10257b8b43 Add missing include
This commit was SVN r28297.
2013-04-07 01:32:08 +00:00
Ralph Castain
1067b1f5ee Add a little debug
This commit was SVN r28295.
2013-04-06 15:24:35 +00:00
Ralph Castain
3bfa53eb91 Cleanup (again) the solaris topology code in hwloc...sigh.
This commit was SVN r28294.
2013-04-06 14:45:32 +00:00
Ralph Castain
ec00fa3132 Fix missing variable declaration in hwloc 1.5.2
This commit was SVN r28293.
2013-04-05 17:43:34 +00:00
Ralph Castain
1f011bef99 Cleanup the updated sys limits capability. Fix a few copy/paste bugs (my bad). Shift the limit set to the ODLS default module so that we sete the limits for all apps, even those that don't call opal_init. Leave it in opal_init as well to support direct-launch apps, but ensure we only set the limits once by removing the envar after launch by ODLS.
Provide some nice error messages if we fail to set the limits. Since the user had to specifically request we set the limit, treat failure as an error-out situation.

This commit was SVN r28288.
2013-04-04 16:00:17 +00:00
Ralph Castain
d09a9e8096 Upgrade the system limit code to support a broader range of parameters. For now, we support stack size, #open files, #children, and file size we can c
reate. Continue to support the old "1" or "0" options for backward compatibility.

This commit was SVN r28282.
2013-04-03 18:57:53 +00:00
Ralph Castain
39a4e93e44 Correct the includes so that compiling with devel headers works
This commit was SVN r28267.
2013-03-30 16:25:24 +00:00
Ralph Castain
a9dc5a31f2 Fix verbosity setting
This commit was SVN r28251.
2013-03-27 22:12:01 +00:00
Ralph Castain
d12eed0703 Silence warning
This commit was SVN r28249.
2013-03-27 22:07:29 +00:00
Nathan Hjelm
3b3506717e de-deprecate mca_base_param_init mca_base_param_finalize as they will be needed until the mca_base_param shim layer goes away
This commit was SVN r28248.
2013-03-27 22:07:23 +00:00
Ralph Castain
95cf39b224 Fix non-updated opal_output channel
This commit was SVN r28245.
2013-03-27 21:57:24 +00:00
Nathan Hjelm
17315bf360 Now that the entire codebase has been updated to use the MCA framework
system remove the last calls to the MCA parameter system.

This commit was SVN r28242.
2013-03-27 21:17:53 +00:00
Nathan Hjelm
9d4a26f47d Update OMPI frameworks to use the MCA framework system.
Notes:
  - This commit also eliminates the need for an available components list in use
    in several frameworks. None of the code in question was making use of the
    priority field of the priority component list item so these extra lists were
    removed.
  - Cleaned up selection code in several frameworks to sort lists using opal_list_sort.
  - Cleans up the ompi/orte-info functions. Expose the functions that construct the
    list of params so they can be used elsewhere.

patches for mtl/portals4 from brian

missed a few output variables in openib

This commit was SVN r28241.
2013-03-27 21:17:31 +00:00
Nathan Hjelm
c041156f60 Update ORTE frameworks to use the MCA framework system.
This commit was SVN r28240.
2013-03-27 21:14:43 +00:00
Nathan Hjelm
365cf48db5 Update OPAL frameworks to use the MCA framework system.
This commit was SVN r28239.
2013-03-27 21:11:47 +00:00
Nathan Hjelm
c3b67d0187 Automatically generate a list of installed frameworks in project/include/project/frameworks.h
This commit was SVN r28238.
2013-03-27 21:10:32 +00:00
Nathan Hjelm
020b9991a4 Introduce the MCA framework system. This formalizes the interface frameworks must provide.
Other changes:
 - Added a flag to the MCA variable system to indicate a variable should go away
   when its group does. Both mca_base_framework_var_register() and
   mca_base_component_var_register() set this flag.

Notes:
 - mca_base_components_open is deprecated. It will be removed in a future commit.
 - All frameworks should use MCA_BASE_FRAMEWORK_DECLARE to declare their
   framework structure.
 - All calls to framework open/close functions should be changed to use the
   mca_base_framework_* functions.
 - Instead of special-casing installdirs a flag was added to prevent calling
   into the variable system when opening a framework.
 - Ralph: Clarify the functional definition of the "register" function in the
   MCA framework object - it had the same name as another function that does a
   totally different thing.
 - As per discussion with Ralph the behavior of mca_base_framework_register()
   is to always call mca_base_framework_components_register() if the framework's
   register function was successful. This removed the need for frameworks to
   have to call this function directly.

This commit was SVN r28237.
2013-03-27 21:10:18 +00:00
Nathan Hjelm
cf377db823 MCA/base: Add new MCA variable system
Features:
 - Support for an override parameter file (openmpi-mca-param-override.conf).
   Variable values in this file can not be overridden by any file or environment
   value.
 - Support for boolean, unsigned, and unsigned long long variables.
 - Support for true/false values.
 - Support for enumerations on integer variables.
 - Support for MPIT scope, verbosity, and binding.
 - Support for command line source.
 - Support for setting variable source via the environment using
   OMPI_MCA_SOURCE_<var name>=source (either command or file:filename)
 - Cleaner API.
 - Support for variable groups (equivalent to MPIT categories).

Notes:
 - Variables must be created with a backing store (char **, int *, or bool *)
   that must live at least as long as the variable.
 - Creating a variable with the MCA_BASE_VAR_FLAG_SETTABLE enables the use of
   mca_base_var_set_value() to change the value.
 - String values are duplicated when the variable is registered. It is up to
   the caller to free the original value if necessary. The new value will be
   freed by the mca_base_var system and must not be freed by the user.
 - Variables with constant scope may not be settable.
 - Variable groups (and all associated variables) are deregistered when the
   component is closed or the component repository item is freed. This
   prevents a segmentation fault from accessing a variable after its component
   is unloaded.
 - After some discussion we decided we should remove the automatic registration
   of component priority variables. Few component actually made use of this
   feature.
 - The enumerator interface was updated to be general enough to handle
   future uses of the interface.
 - The code to generate ompi_info output has been moved into the MCA variable
   system. See mca_base_var_dump().

opal: update core and components to mca_base_var system
orte: update core and components to mca_base_var system
ompi: update core and components to mca_base_var system

This commit also modifies the rmaps framework. The following variables were
moved from ppr and lama: rmaps_base_pernode, rmaps_base_n_pernode,
rmaps_base_n_persocket. Both lama and ppr create synonyms for these variables.

This commit was SVN r28236.
2013-03-27 21:09:41 +00:00
Jeff Squyres
1a048d6ee6 Remove a duplicate variable declaration.
This commit was SVN r28224.
2013-03-27 01:15:27 +00:00
Ralph Castain
317915225c Finish the binding cleanup by removing the no-longer-used binding level scheme. This proved to be fallible as there is no guarantee that the hierarchy it used matched physical reality of the machine (e.g., is L3 "above" the socket or not). Still have to complete the ppr update, but get the rest of it correct.
This commit was SVN r28223.
2013-03-26 20:09:49 +00:00
Ralph Castain
6ee32767d4 Restore the cpus-per-proc option for byslot and bynode mapping. Remove the bind_idx (which recorded the index of the hwloc object where the proc was bound) as this would no longer be unique, and just use the bitmap as the standard reference for location. Update the relative locality computation to take bitmaps as its argument.
This commit was SVN r28219.
2013-03-26 18:27:50 +00:00
Jeff Squyres
6c8d0450a3 Update the post-hwloc-1.5.2 patch list.
This commit was SVN r28218.
2013-03-26 16:18:52 +00:00
Jeff Squyres
f79716dfd4 Include <hwloc.h> so that the symbols in this file are subject to the
<hwloc/rename.h> renaming.

This commit was SVN r28215.
2013-03-26 15:49:52 +00:00
George Bosilca
a856f926de Remove a bunch of unused variables.
This commit was SVN r28213.
2013-03-26 14:34:29 +00:00
Jeff Squyres
6695b5e17a Re-apply r28040 from Eugene: a post-hwloc release fix for Solaris
binding.  This fix was included in the upstream 1.6 series, but not
the upstream 1.5 series, and was therefore missed when we brought
1.5.2 to OMPI.

This commit was SVN r28212.

The following SVN revision numbers were found above:
  r28040 --> open-mpi/ompi@3d44f97572
2013-03-26 13:27:23 +00:00
Ralph Castain
8a79d37ac2 Fix a few bugs in the hwloc integration code. The "set binding policy" macro should flag that the policy was indeed set. Some systems don't report sockets, so the print functions need to check for that condition.
cmr:v1.7

This commit was SVN r28209.
2013-03-25 17:51:45 +00:00
Brian Barrett
bc3ca9e009 Make the linux memory component do the failure path if it was disabled.
This commit was SVN r28206.
2013-03-22 16:56:09 +00:00
Brian Barrett
6c3f986d79 * Fix issue with duplicate symbol for the initialize hook due to it existing in both libmpi and libopen-pal by removing the one for libopen-pal. This won't work if we eventually need registration caching in opal/orte, but I'm hoping that by that point, OFED will have gotten off its butt and properly integrated ummunotify into the verbs layer so that this code can go away.
At the same time, fix a minor issue where the init hook was being called twice, once by the libc malloc and once by our malloc by removing the call from our malloc.

This commit was SVN r28202.
2013-03-21 23:05:54 +00:00
Jeff Squyres
3938b85182 Fix CID 752007: missing break statements.
This commit was SVN r28191.
2013-03-21 11:04:36 +00:00
Ralph Castain
b7f0e46319 Provide a nicer error message when someone gives a bad signal number to opal_signal
cmr:v1.7.1

This commit was SVN r28188.
2013-03-20 15:30:59 +00:00
Jeff Squyres
e5838e6121 Don't mandate PCI support, because this will make builds on platforms
that don't have libpciaccess fail (e.g., OS X, or any machine without
libpciaccess).

This commit was SVN r28181.
2013-03-19 16:20:08 +00:00
Jeff Squyres
7f34dc266b Add missing unlocks. Fixes CID 967022 (which covers the unlock on
line 627; there's probably another CID for the unlock added on line
537).

This commit was SVN r28179.
2013-03-18 23:19:25 +00:00
Jeff Squyres
90802410a8 Update hwloc from 1.5.1 to 1.5.2. Re-enable hwloc PCI support by
default, since it will now use libpciaccess (if available).

This commit was SVN r28178.
2013-03-18 23:02:56 +00:00
Jeff Squyres
f8bbfacf65 Fix CID 967922: minor memory leak possibility.
This commit was SVN r28175.
2013-03-15 17:59:00 +00:00
Brian Barrett
fc2b3b8d46 Ugh. Work around an issue with memory hooks and the change from one big
library to multiple libraries that are implicitly sucked into the executable
as a dependency of libmpi.  The initialize hook isn't visible to libc on some
linux distributions when it's in libopal and libopal isn't explicity linked
into the executable.  The fix is to have a duplicate initialize hook in
libmpi as well as libopal.  *sigh*.

This commit was SVN r28164.
2013-03-11 19:22:24 +00:00
George Bosilca
6a933e7593 Use the libs not some weird path.
This commit was SVN r28139.
2013-02-28 22:34:47 +00:00
Ralph Castain
a4b6fb241f Remove all remaining vestiges of the Windows integration
This commit was SVN r28137.
2013-02-28 17:31:47 +00:00
Ralph Castain
e71b40fdcb If we are redirecting to files, ensure we don't create duplicate file descriptors for output streams going to the same file. If we do, then the output gets completely jumbled - best to avoid that problem.
This commit was SVN r28136.
2013-02-28 17:21:53 +00:00
Ralph Castain
8d2fa3693b First cut at removing the native Windows support. Remove all the Windows-specific components, and the .windows files sprinkled around. Remove the Windows platform files and MTT scripts. Update the NEWS to point Windows users to the cygwin package.
This commit was SVN r28116.
2013-02-26 20:44:56 +00:00
Ralph Castain
9479635e31 Missing include here too...
This commit was SVN r28115.
2013-02-26 20:21:10 +00:00
Ralph Castain
8b8333da3e Add missing include
This commit was SVN r28114.
2013-02-26 19:56:05 +00:00
Ralph Castain
e413596705 Add the loopexit API to the opal_event definitions
This commit was SVN r28113.
2013-02-26 19:27:26 +00:00
Ralph Castain
bd9265c560 Per the meeting on moving the BTLs to OPAL, move the ORTE database "db" framework to OPAL so the relocated BTLs can access it. Because the data is indexed by process, this requires that we define a new "opal_identifier_t" that corresponds to the orte_process_name_t struct. In order to support multiple run-times, this is defined in opal/mca/db/db_types.h as a uint64_t without identifying the meaning of any part of that data.
A few changes were required to support this move:

1. the PMI component used to identify rte-related data (e.g., host name, bind level) and package them as a unit to reduce the number of PMI keys. This code was moved up to the ORTE layer as the OPAL layer has no understanding of these concepts. In addition, the component locally stored data based on process jobid/vpid - this could no longer be supported (see below for the solution).

2. the hash component was updated to use the new opal_identifier_t instead of orte_process_name_t as its index for storing data in the hash tables. Previously, we did a hash on the vpid and stored the data in a 32-bit hash table. In the revised system, we don't see a separate "vpid" field - we only have a 64-bit opaque value. The orte_process_name_t hash turned out to do nothing useful, so we now store the data in a 64-bit hash table. Preliminary tests didn't show any identifiable change in behavior or performance, but we'll have to see if a move back to the 32-bit table is required at some later time.

3. the db framework was a "select one" system. However, since the PMI component could no longer use its internal storage system, the framework has now been changed to a "select many" mode of operation. This allows the hash component to handle all internal storage, while the PMI component only handles pushing/pulling things from the PMI system. This was something we had planned for some time - when fetching data, we first check internal storage to see if we already have it, and then automatically go to the global system to look for it if we don't. Accordingly, the framework was provided with a custom query function used during "select" that lets you seperately specify the "store" and "fetch" ordering.

4. the ORTE grpcomm and ess/pmi components, and the nidmap code,  were updated to work with the new db framework and to specify internal/global storage options.

No changes were made to the MPI layer, except for modifying the ORTE component of the OMPI/rte framework to support the new db framework.

This commit was SVN r28112.
2013-02-26 17:50:04 +00:00
Brian Barrett
7c3e42a689 Work around issue shown in #3505 by not linking against libpci by default.
This commit was SVN r28076.
2013-02-19 16:19:33 +00:00
Jeff Squyres
acefc1588e Patch for Cygwin support: Use S_IRWXU for shmget() and include
<sys/stat.h>.  Thanks to Marco Atzeri for reporting the issue and
providing an initial patch.

This commit was SVN r28060.
2013-02-15 14:31:58 +00:00
Ralph Castain
037918e7b4 Correctly parse the rank file slot_list when given "S:C" - the first position holds the socket, so start looking for cores at posn=1
This commit was SVN r28054.
2013-02-13 13:06:03 +00:00
Brian Barrett
504a6d036f * Rather than use the extra_includes directive, add the extra includes (which is really just -I${includedir}/openmpi/ for devel headers) to CPPFLAGS, since all the other necessary -Is for devel headers (like libevent and hwloc) are added to CPPFLAGS.
* Clean up ${includedir} and ${libdir} for script wrapper compilers
* Update script wrapper compilers to work like the C wrapper compilers w.r.t static and dynamic linking
* Remove the ORTE script wrapper compilers since they didn't support the ${includedir} stuff and Ralph said they weren't used anymore.

This commit was SVN r28052.
2013-02-13 00:33:05 +00:00
Joshua Ladd
70ad711337 Backing out the Open SHMEM project
This commit was SVN r28050.
2013-02-12 17:45:27 +00:00
Mike Dubman
ff384daab4 Added new project: oshmem.
This commit was SVN r28048.
2013-02-12 15:33:21 +00:00
Brian Barrett
33cb4d21fe Need to include libltdl's includes so that the lt wrappers can compile
This commit was SVN r28042.
2013-02-12 00:41:03 +00:00
Rolf vandeVaart
6843f02b37 Add wrapper functions to LTDL functions so other parts of the library can access the LTDL functionality.
Reviewed by jsquyres. 

This commit was SVN r28041.
2013-02-11 15:11:47 +00:00
Eugene Loh
3d44f97572 Fix hwloc get-cpubind routine for Solaris. FIRST, check
processor_bind to see if we're bound to a single core.
If not, THEN check lgroup affinity.  Already CMR'ed to
v1.6 (trac 3507) and fixed upstream in hwloc (r5295).

This commit was SVN r28040.

The following SVN revision numbers were found above:
  r5295 --> open-mpi/ompi@6df8cb0f02
2013-02-10 04:02:19 +00:00
Brian Barrett
57b21014f8 Fix issue where the static inline part of the declaration would be improperly
set when using C++

This commit was SVN r28034.
2013-02-05 18:15:32 +00:00
Ralph Castain
6dd4a8cdf9 The opal_list_t destructor doesn't release the items on the list prior to destructing or releasing it. Provide two convenience macros for doing so.
This commit was SVN r28029.
2013-02-04 19:42:57 +00:00
Rolf vandeVaart
82fb093955 Revert changeset 28011. This can break the build on some systems.
This commit was SVN r28017.
2013-02-01 20:41:47 +00:00
Rolf vandeVaart
79b623d7e3 Add wrapper interface to LTDL functions so that other parts of the library can access the LTDL functionality.
Reviewed by jsquyres.

This commit was SVN r28011.
2013-02-01 14:11:39 +00:00
Nathan Hjelm
05a8958bb0 shmem_RUNTIME_QUERY_hint is not really read_only as it is set from the environment not the default value
This commit was SVN r28005.
2013-01-31 23:41:00 +00:00
Rolf vandeVaart
3d1f9d3b29 Fix bug introduced by changeset 27986.
This commit was SVN r27999.
2013-01-31 20:36:02 +00:00
Rolf vandeVaart
729caaf0cd Remove any dependency on libcuda.so in opal layer. All changes are within OMPI_CUDA_SUPPORT code.
This commit was SVN r27986.
2013-01-30 23:07:32 +00:00
Nathan Hjelm
4bfb701115 add iterator macros for opal_list_t
This commit was SVN r27985.
2013-01-30 19:02:55 +00:00
Rolf vandeVaart
94a78dda0d Fix previous checkin. Need to fall through to get initialized variable set.
This commit was SVN r27977.
2013-01-30 00:01:42 +00:00
Brian Barrett
29aaa21c5a Fix some warnings when we don't have sockets or syslog
This commit was SVN r27973.
2013-01-29 23:02:26 +00:00
Rolf vandeVaart
aa04de4f1e Add run-time parameter to enable and disable CUDA GPU support.
This commit was SVN r27970.
2013-01-29 20:24:04 +00:00
Jeff Squyres
ef8ab5507e Per discussion on the devel list, revert r27882.
Leif would like to revamp the ARM support in a different way, and will
submit a patch to do so in the future.

This commit was SVN r27960.

The following SVN revision numbers were found above:
  r27882 --> open-mpi/ompi@8649b5eece
2013-01-29 14:10:46 +00:00
Brian Barrett
b8442ba505 Revamp the handling of wrapper compiler flags. The user flags, main configure
flags, and mca flags are kept seperate until the very end.  The main configure
wrapper flags should now be modified by using the OPAL_WRAPPER_FLAGS_ADD
macro.  MCA components should either let <framework>_<component>_{LIBS,LDFLAGS}
be copied over OR set <framework>_<component>_WRAPPER_EXTRA_{LIBS,LDFLAGS}.
The situations in which WRAPPER CPPFLAGS can be set by MCA components was
made very small to match the one use case where it makes sense.

This commit was SVN r27950.
2013-01-29 00:00:43 +00:00
Ralph Castain
f6b4db0b79 Fix rank_file operations. We changed the syntax to use semi-colons between multiple slot assignments so that we could use the comma to separate specific cores, but somehow the flex definitions didn't get updated to accept that character. We also incorrectly zero'd the bitmap between slot assignment sections, and so multiple slot assignments only wound up making the last one in the list.
This commit was SVN r27908.
2013-01-25 18:33:25 +00:00
Brian Barrett
cdf0325589 Keep libevent headers from being installed in the wrong place. The top level
Makefile.am gets them installed in the right place already

This commit was SVN r27903.
2013-01-24 22:23:51 +00:00
Ralph Castain
c7fe108601 Cleanup a warning - if we are doing linkall and cannot find either static or dynamic libs, than that is clearly an unrecoverable error. Print a nice message and exit.
This commit was SVN r27895.
2013-01-23 22:16:32 +00:00
Brian Barrett
4f41f5ce5b OPAL_WRAPPER_EXTRA_CPPFLAGS is the wrong variable, want to set
WRAPPER_EXTRA_CPPFLAGS

This commit was SVN r27886.
2013-01-21 23:37:35 +00:00
George Bosilca
8649b5eece The patch from ticket #3469 adapted for the trunk.
This commit was SVN r27882.
2013-01-21 11:45:05 +00:00
George Bosilca
15b18cd2cf Make CMA compile and run.
This commit was SVN r27873.
2013-01-19 14:27:54 +00:00
Rolf vandeVaart
f63c88701f Improve CUDA GPU transfers over openib BTL. Use aynchronous copies.
This is RFC that was submitted in July and December of 2012.

This commit was SVN r27862.
2013-01-17 22:34:43 +00:00
Ralph Castain
92e297d1fa Pack/unpack the disk and net stats so they get passed along
This commit was SVN r27844.
2013-01-16 21:54:48 +00:00
Samuel Gutierrez
cba06776f1 Fix copy and paste error in linux memory component debug output.
This commit was SVN r27842.
2013-01-16 18:27:57 +00:00
Ralph Castain
f29f1b731c Extend the node statistics to include disk and network traffic data.
This commit was SVN r27834.
2013-01-15 22:42:36 +00:00
Brian Barrett
579cf4adcd After discussion with Jeff, don't do C++ inline assembly (there is a non-inline
version still avaiable for C++).  This is yet another push to try to make
OPAL a C only interface...

This commit was SVN r27828.
2013-01-15 17:04:42 +00:00
Ralph Castain
2379b7369f Hey Jeff - AC_HELP_STRING takes *two* arguments, dude!
This commit was SVN r27820.
2013-01-15 15:25:58 +00:00
Brian Barrett
fc3df11e08 Remove the (only two) fortran constants from OPAL. The only places that
actually care if opal_pointer_array is limited to handle_max already passes
that in as the max_size during init, so don't need it there.  The arch
constant was a bit more difficult, so pass that in during MPI init and
leave empty otherwise.

This is to help with the effort to allow building ompi against an external
opal or orte.

This commit was SVN r27817.
2013-01-15 01:27:36 +00:00
Jeff Squyres
e30d9a2bfb The "external" hwloc component didn't have the same fixes applied to
it that the others did: move the "I won!" code up into the POST_CONFIG
macro.  Also, fix a long-standing typo when restoring the $CPPFLAGS (!).

This commit was SVN r27813.
2013-01-14 21:44:47 +00:00
Jeff Squyres
423208932e HWLOC_DO_AM_CONDITIONALS must be run unconditionally.
This commit was SVN r27812.
2013-01-14 21:43:16 +00:00
Jeff Squyres
c17ec83de3 Add some post-v1.5.1 release hwloc bug fixes
This commit was SVN r27805.
2013-01-14 16:25:21 +00:00
Jeff Squyres
c7cb363da9 Remove some more generated files.
This commit was SVN r27800.
2013-01-12 03:30:43 +00:00
Jeff Squyres
4d6f026941 Fix a typo.
This commit was SVN r27799.
2013-01-12 03:30:29 +00:00
Ralph Castain
4d43585a1e Cleanup new hwloc install - remove build products that were accidentally included in the commit, remove non-existent file from Makefile.am
This commit was SVN r27798.
2013-01-12 03:21:53 +00:00
Jeff Squyres
3ce170d463 Update the embedded hwloc from v1.4.2 to v1.5.1.
This commit was SVN r27797.
2013-01-12 02:08:04 +00:00
Jeff Squyres
427c154800 Similar to r27794, simplify the hwloc framework by changing it to
STOP_AT_FIRST.  And move the side-effect-inducing code in
hwloc142/configure.m4 up to POST_CONFIG.

Also change the priority of the external hwloc component to 90 so that
it is evaluated before the internal component (as a direct result of
changing to STOP_AT_FIRST).

This commit was SVN r27796.

The following SVN revision numbers were found above:
  r27794 --> open-mpi/ompi@569a60c2de
2013-01-12 01:48:53 +00:00
Jeff Squyres
a0874b61e6 Remove debugging message.
This commit was SVN r27795.
2013-01-12 01:33:54 +00:00
Jeff Squyres
569a60c2de In short: this commit removes a bunch of code by switching the opal
event framework to STOP_AT_FIRST, and then moves a bunch of
side-effect-inducing code in the libevent2019 configure.m4 up to
POST_CONFIG.

== More detail ==

Change the event framework from STOP_AT_FIRST_PRIORITY to
STOP_AT_FIRST.  This means that only one component can win (vs. all
STOP_AT_FIRST_PRIORITY, in which multiple components of the same
priority can all win).

You still need to ensure that there are no side-effects from the
winner, however, so check for winning during POST_CONFIG, and set
things like the base_include there.

This simplifies the configury quite a bit -- you don't have to assume
that mulitple components can win: zero or one components will win.

Also change the libevent 2019 priority to 50 so that some other
(developer-specific/local) component could win, if it wanted to.

This commit was SVN r27794.
2013-01-12 01:28:37 +00:00
Jeff Squyres
b2d5d1e348 Along with the Automake 1.13.x changes in r27790, rename these third
party configure.in scripts to be configure.ac so that Automake stops
complaining about them.

This commit was SVN r27791.

The following SVN revision numbers were found above:
  r27790 --> open-mpi/ompi@675a2f5c48
2013-01-11 20:26:19 +00:00
Jeff Squyres
675a2f5c48 Updates for Automake 1.13.x. Without these changes, Automake 1.13.x
will error out, due to use of the
previously-deprecated-and-now-removed AM_CONFIG_HEADER macro.

This commit was SVN r27790.
2013-01-11 20:20:02 +00:00
Samuel Gutierrez
4c28c8cbd0 New sm BTL initialization take two. This approach is pretty simple. Instead of
using the modex or RML to share sm initialization information, have node rank 0
create a file containing initialization information in a well-known place. Then
during add_procs, the rest of the node processes requiring sm BTL initialization
will just read from that file to complete their initialization.

This commit was SVN r27789.
2013-01-11 16:24:56 +00:00
Jeff Squyres
e9ae2567f0 Based on a bug report and suggested fix from Darshan maintainer Phil
Carns, change to use access(.., F_OK) instead of stat() to check for
the presence of files.

Also remove redundant check for FAKEROOTKEY, and update all comments
to match.

This commit was SVN r27785.
2013-01-10 14:43:07 +00:00
Ralph Castain
756d2441a8 Actually output the values from opal_value_t
This commit was SVN r27750.
2013-01-05 06:31:47 +00:00
Ralph Castain
4834fb7e6d Minor change to the way we record test data
This commit was SVN r27749.
2013-01-05 06:31:20 +00:00
Samuel Gutierrez
c4acd20eb9 Backout r27739.
This commit was SVN r27745.

The following SVN revision numbers were found above:
  r27739 --> open-mpi/ompi@a159bfaf25
2013-01-05 01:54:23 +00:00
Samuel Gutierrez
a159bfaf25 sm BTL initialization via modex, as discussed at last year's meeting.
This commit was SVN r27739.
2013-01-03 21:52:20 +00:00
Ralph Castain
cada035f38 Fix the segfault problem in the orteds - turns out it only occurred with progress threads enabled. Ensure the thread gets started at the right time (at the end of init), although the event base gets created earlier. Remove the finalize event as we can instead use the loopbreak call to exit the event loop.
This commit was SVN r27721.
2012-12-25 19:30:18 +00:00
Ralph Castain
6046812952 Add float and struct timeval fields to the opal_value_t object, and provide dss support for those data types
This commit was SVN r27705.
2012-12-19 00:14:19 +00:00
Jeff Squyres
b29b852281 Consolidate all the opal/orte/ompi .m4 files back to the top-level
config/ directory.  We split them apart a while ago in the hopes that
it would simplify things, but it didn't really (e.g., because there
were still some ompi/opal .m4 files in the top-level config/
directory, resulting in developer confusion where any given m4 macro
was defined).

So this commit consolidates them back into the top-level directory for
simplicity.  

There's still (at least) two changes that would be nice to make:

 1. Split any generated .m4 file (e.g., autogen-generated .m4 files)
    into a separate directory somewhere so that a top-level -Iconfig/
    will only get our explicitly defined macros, not the autogen stuff
    (e.g., with libevent2019 needing to get the visibility macro, but
    NOT all the autogen-generated inclusion of component configure.m4
    files).
 1. Change configure to be of the form:
{{{
# ...a small amount of preamble/setup...
OPAL_SETUP
m4_ifdef([project_orte], [ORTE_SETUP])
m4_ifdef([project_ompi], [OMPI_SETUP])
# ...a small amount of finishing stuff...
}}}

I doubt we'll ever get anything as clean as that, but that would be
the goal to shoot for.

This commit was SVN r27704.
2012-12-19 00:00:36 +00:00
Ralph Castain
885fc8432d Fix the printing and handling of sample times in stats objects
This commit was SVN r27681.
2012-12-18 03:45:09 +00:00
Jeff Squyres
f779b1ded9 Put back the static-library-detection stuff from r27668, with some
additional functionality.  Rationale (refs trac:3422):

 * Normal MPI applications only ever use the MPI API. Hence, -lmpi is
   sufficient (they'll never directly call ORTE or OPAL
   functions). This is arguably the most common case.
 * That being said, we do have some test programs (e.g., those in
   orte/test/mpi) that call MPI functions but also call ORTE/OPAL
   functions. I've also written the occasional MPI test program that
   calls opal_output, for example (there even might be a few tests in
   the IBM test suite that directly call ORTE/OPAL functions).
   * Even though this is not a common case, these applications should
     also compile/link with mpicc.
   * So we should add a --openmpi:linkall option that will also link
     in whatever is necessary to call ORTE/OPAL functions
   * Yes, we could hard-code "-lopen-rte -lopen-pal" in Makefiles, but
     we do reserve the right to change those library names and/or add
     others someday, so it's better to abstract out the names and let
     the wrapper supply whatever is necessary.
 * ORTE programs, however, are different. They almost always call OPAL
   functions (e.g., if they want to send a message, they must use the
   OPAL DSS). As such, it seems like the ORTE programs should always
   link in OPAL.

Therefore:

 * Add undocumented --openmpi:linkall flag to the wrapper compilers.
   See the comment in opal_wrapper.c for an explanation of what it
   does.  This flag is only intended for Open MPI developers -- not
   end users.  That's why it's undocumented.
 * Update orte/test/mpi/Makefile.am to add --openmpi:linkall
 * Make ortecc/ortec++'s wrapper data text files always explicitly
   link in libopen-pal

This commit was SVN r27670.

The following SVN revision numbers were found above:
  r27668 --> open-mpi/ompi@cf845897aa

The following Trac tickets were found above:
  Ticket 3422 --> https://svn.open-mpi.org/trac/ompi/ticket/3422
2012-12-13 22:31:37 +00:00
Jeff Squyres
cf845897aa Temporarily revert r27662 and r27667 because something wonky is
happening on OS X.  Grumble...

This commit was SVN r27668.

The following SVN revision numbers were found above:
  r27662 --> open-mpi/ompi@97cc916007
  r27667 --> open-mpi/ompi@529f6244ca
2012-12-11 23:08:14 +00:00
Jeff Squyres
529f6244ca If a user supplies both (some form of --static) and (some form of
--dynamic), use the one that was farthest to the right on the command
line.

This commit was SVN r27667.
2012-12-11 21:25:00 +00:00
Jeff Squyres
97cc916007 Per discussion at the Open MPI developer meeting last week:
1. Restore libopen-pal.la, libopen-rte.la, and libmpi.la to be
    separate entities (i.e., don't have libopen-rte.la include
    libopen-pal.la, and don't have libmpi.la include libopen-pal.la).
    Yay!
 1. Consequently, make the wrapper compilers look for flags indicating
    that the user wants to compile statically (currently: -static,
    !--static, -Bstatic, and "-Wl," in front of all of those).  If it
    is, follow a 6-way matrix for determinining which libraries to
    list on the underlying command line.
 1. To support that, add the name of a token static and dynamic
    library to look for in each of the wrapper compiler data files.
 1. Fix a long-standing typo in the opalcc wrapper data file.

This commit was SVN r27662.
2012-12-11 01:46:59 +00:00
Nathan Hjelm
3e1b13b13a Re-add support for old flex (2.5.4a and earlier) while still cleaning up properly in new flex.
This commit was SVN r27657.
2012-12-07 00:12:43 +00:00
Nathan Hjelm
5449c45444 Per RFC: Make mca_base_param_deregister usable by changing its behavior to create a hole in the parameter array instead of deleting the parameter.
The old behavior of mca_base_param_deregister could cause the indices of other mca parameters to change. This could potentially cause problems if a mca user saves and later references an affected index.

This commit was SVN r27633.
2012-11-26 20:55:02 +00:00
Nathan Hjelm
a427a7e727 do not include c99 flag in compiler wrappers
This commit was SVN r27625.
2012-11-20 19:33:14 +00:00
Ralph Castain
fdf7633cff Per Jeff's suggestion, set the default answer when asking for IP aliases in case we don't find any
This commit was SVN r27620.
2012-11-16 14:28:30 +00:00
Ralph Castain
a52071a17d Add a function to return the aliases (based on IP addrs) for the current node
This commit was SVN r27618.
2012-11-16 04:02:29 +00:00
Ralph Castain
ed05185ade Ensure we set the flag indicating that ltdl_advise was found for external installations. Thanks to opoplawski for pointing it out!
This commit was SVN r27609.
2012-11-14 22:11:10 +00:00
Nathan Hjelm
aebd1ea432 Per discussion we will now require a C99 compiant compiler.
This change will enable the use of C99 features in Open MPI; subobject naming, restricted pointers, etc.

cmr:v1.7

This commit was SVN r27604.
2012-11-14 04:52:39 +00:00
Nathan Hjelm
87e5f97400 add missing #include of opal/util/output.h
This commit was SVN r27599.
2012-11-13 07:14:41 +00:00
Ralph Castain
b9609203b7 Pack the buffer object from the beginning
This commit was SVN r27592.
2012-11-12 02:52:37 +00:00
Ralph Castain
de486d3000 Silence compiler warnings
This commit was SVN r27589.
2012-11-12 02:51:05 +00:00
Ralph Castain
ddbbc0fb7c Add the ability to pack the contents of one buffer into another as a block, thus allowing the transfer of blocks of info as a unit without messing with load/unload
This commit was SVN r27584.
2012-11-10 14:02:23 +00:00
Ralph Castain
f9f07e9535 Add a function to test if a string is in the form of an IP address - doesn't test for validity of the address
This commit was SVN r27583.
2012-11-10 14:01:12 +00:00
Nathan Hjelm
e0f5137e46 add prototypes for lex destroy functions
This commit was SVN r27580.
2012-11-09 22:00:27 +00:00
Nathan Hjelm
a754674fd7 Per the specification for putenv (http://pubs.opengroup.org/onlinepubs/009604599/functions/putenv.html) the string given to putenv becomes part of the environment. The string must not be changed or freed.
cmr:v1.7

This commit was SVN r27578.
2012-11-09 16:33:14 +00:00
Nathan Hjelm
8658bbc902 instead of relying on yyterminate to clean up the lex context call the destroy functions directly (after closing the file)
This commit was SVN r27577.
2012-11-09 16:10:55 +00:00
Nathan Hjelm
7fb5caea92 Remove the finish_parsing function from various .l files. The function is incomplete (doesn't clean up the lex state) and should be replaced by *_yylex_destroy which correctly cleans up the state.
Checked with the flex 2.5.35. Verified with valgrind that this fixes several "still reachable" leaks.

cmr:v1.7

This commit was SVN r27571.
2012-11-06 19:26:14 +00:00
Nathan Hjelm
bdedd8b0d3 Per RFC modify the behavior of mca_base_components_close to NOT close the output. Modify frameworks to always close their output and set to -1.
Reasoning: The old behavior was a little confusing. mca_base_components_open does not open an output stream so it is a little unexpected that mca_base_components_close does. To add to this several frameworks (that don't use mca_base_components_close) failed to close their output in the framework close function and others closed their output a second time. This change is an improvement to the symantics of mca_base_components_open/close as they are now symetric in their functionality.

This commit was SVN r27570.
2012-11-06 19:09:26 +00:00
Nathan Hjelm
f3ce12e71a Per RFC fix several leaks in opal and ompi. Details below.
pml/v:
  - If vprotocol is not being used vprotocol_include_list is leaked. Assume vprotocol never takes ownership (see below) and always free the string.

coll/ml:
  - (patch verified) calling mca_base_param_lookup_string after mca_base_param_reg_string is unnecessary. The call to mca_base_param_lookup_string causes the value returned by mca_base_param_reg_string to be leaked.
  - Need to free mca_coll_ml_component.config_file_name on component close.

btl/openib:
  - calling mca_base_param_lookup_string after mca_base_param_reg_string is unnecessary. The call to mca_base_param_lookup_string causes the value returned by mca_base_param_reg_string to be leaked.

vprotocol/base:
  - There was no way for pml/v to determine if vprotocol took ownership of vprotocol_include_list. Fix by always never ownership (use strdup).

mca/base:
  - param_lookup will result in storage->stringval to be a newly allocated string if the mca parameter has a string value. ensure this string is always freed.

cmr:v1.7

This commit was SVN r27569.
2012-11-06 18:57:46 +00:00
Nathan Hjelm
906e29ed96 Fix leaks in the opal if posix code. Error paths were not calling OBJ_RELEASE on an opal_if_t created with OBJ_NEW.
This affects both trunk and 1.7 and might affect 1.6.

cmr:v1.7

This commit was SVN r27562.
2012-11-05 20:51:10 +00:00
Jeff Squyres
3d05c5cca3 There's no point in having a separate opal_hotel_finalize() function
-- just move that functionality into the hotel destructor.

This commit was SVN r27555.
2012-11-02 14:00:54 +00:00
Ralph Castain
bc54976f13 Silence warnings when threads are enabled
This commit was SVN r27550.
2012-11-01 03:34:51 +00:00
Ralph Castain
a1c51dc1d6 Wow - fix an error that has been around for a long time. opal_path_access requires a NULL pointer, not an empty string, to correctly operate.
Thanks to Marco Atzeri for chasing this down!

cmr:v1.6,v1.7

This commit was SVN r27539.
2012-10-31 14:10:51 +00:00
Nathan Hjelm
2acd0f83de Revert "Revert r27451 and r27456 - the cmd line parser is incorrectly marking the application as an MCA parameter".
It appears the problem was not with the command line parser but the rsh plm. I don't know why this problem was not occuring before the command line parser changes but it appears to be resolved now.

This commit was SVN r27527.

The following SVN revision numbers were found above:
  r27451 --> open-mpi/ompi@d59034e6ef
  r27456 --> open-mpi/ompi@ecdbf34937
2012-10-30 19:45:18 +00:00
Ralph Castain
6aac54b02e Revert r27510, r27509, and r27508.
Not sure what happened here, but the resulting trunk wouldn't even configure. After spending time fixing that problem, I found it wouldn't compile due to multiple syntax errors that had been introduced in both the OPAL and OMPI layer. This raised questions as to the completeness of the work.

Given that the author is departing, I pinged Jeff about it and we agreed to revert this for now. Hopefully, it can either be fixed by the author prior to actual departure, or someone else can pick it up (now that it is in the history) and fix it.

This commit was SVN r27511.

The following SVN revision numbers were found above:
  r27508 --> open-mpi/ompi@12c3c743de
  r27509 --> open-mpi/ompi@79e4a8ca38
  r27510 --> open-mpi/ompi@1ad5ff625a
2012-10-27 16:43:45 +00:00
Shiqing Fan
12c3c743de Per the MemPin RFC, submit the component source files, and update the memchecker macros.
This commit was SVN r27508.
2012-10-27 02:48:20 +00:00
Ralph Castain
094d6f3143 Add a new "distributed file system" capability to support file access operations across nodes that do not have a network file system attached to them.
Add a set of URI create/parse utilities

This commit was SVN r27483.
2012-10-25 17:15:17 +00:00
Ralph Castain
e6014bf2e1 Revert r27451 and r27456 - the cmd line parser is incorrectly marking the application as an MCA parameter
This commit was SVN r27477.

The following SVN revision numbers were found above:
  r27451 --> open-mpi/ompi@d59034e6ef
  r27456 --> open-mpi/ompi@ecdbf34937
2012-10-24 18:38:44 +00:00
Jeff Squyres
e72c74a549 Fix backwards asserts in the OPAL hotel code.
This commit was SVN r27462.
2012-10-22 18:05:39 +00:00
Ralph Castain
ecdbf34937 Remove unused variable
This commit was SVN r27456.
2012-10-18 15:42:57 +00:00
Nathan Hjelm
d59034e6ef MCA: remove deprecated mca_base_param functions (mca_base_param_register_int, mca_base_param_register_string, mca_base_param_environ_variable). Remove all uses of deprecated functions.
cmr:v1.7

This commit was SVN r27451.
2012-10-17 20:17:37 +00:00
Nathan Hjelm
47fff80a56 remove unused, deprecated function opal_cmd_line_make_opt
This commit was SVN r27437.
2012-10-11 18:50:11 +00:00
Samuel Gutierrez
1f24f1d305 Update the data types used in opaldf to minimize the chance of overflow when
determining the amount of available space. Thanks to Eugene for pointing out the
issue.

This commit was SVN r27436.
2012-10-11 16:11:23 +00:00
Samuel Gutierrez
21be553e21 Add Windows support to opaldf and shmem/windows -- thanks Shiqing. Next commit
will fix issues found by Eugene.

This commit was SVN r27435.
2012-10-11 14:49:41 +00:00
Samuel Gutierrez
dcd4493f54 Properly report the amount of free space on failure.
This commit was SVN r27434.
2012-10-10 19:49:52 +00:00
Samuel Gutierrez
0461826a4b Fix bus errors caused by an inadequate amount of space during
opal_shmem_segment_create by testing whether or not the target mount has enough
space to accommodate the shared-memory backing store. Fixes trac:2827. Will work
with Shiqing to add Windows support (if required).

This commit was SVN r27433.

The following Trac tickets were found above:
  Ticket 2827 --> https://svn.open-mpi.org/trac/ompi/ticket/2827
2012-10-09 20:48:04 +00:00
Jeff Squyres
6af6809dc2 * Fix some comments.
* Use the hwloc logical index, not the os_index.  Fixes problems with
   opal_hwloc_base_cset2str() output (e.g., --report-bindings output)
   on machines where the os_index is not tightly packed in the range
   ![0, n-1]

This commit was SVN r27394.
2012-10-03 09:33:40 +00:00
Ralph Castain
36679e19df Add a convenient macro for debugging process binding that shows the current binding pattern - helps when trying to figure out when a process got bound, and to where
This commit was SVN r27387.
2012-10-01 15:06:15 +00:00
Ralph Castain
5639d1617f Move missing piece to required visibility
This commit was SVN r27380.
2012-09-27 01:43:54 +00:00
Ralph Castain
54db4c35eb Get the trunk to build again when --without-hwloc is specified. Move a couple of key type definitions and utilities out from under the HAVE_HWLOC test so they are always available as they don't really depend on hwloc's presence. Tell two compnents not to build if hwloc is disabled:
ompi/mca/sbgp/basesmsocket
orte/mca/rmaps/lama

Remove stale configure.params files from the sbgp framework as the OMPI build system no longer looks at those files.

This commit was SVN r27377.
2012-09-26 23:24:27 +00:00
Pavel Shamis
4935b9d930 Fixing compilation error. Adding missing output.h.
This commit was SVN r27375.
2012-09-26 15:52:09 +00:00
Brian Barrett
cb6831830a Remove the TSD_HACKS macro. The TSD hack is only for non-glibc libraries
and we only build the linux memory component on glibc, so this shouldn't
be needed.

This commit was SVN r27371.
2012-09-26 07:42:43 +00:00
Ralph Castain
01504239e6 Enable some debugging in the if discovery code
This commit was SVN r27367.
2012-09-25 20:23:37 +00:00
Ralph Castain
662bc05aa6 Refs trac:3322
Cannot start the data clearing at the root object level as the root object has a different struct attached to userdata.

This commit was SVN r27357.

The following Trac tickets were found above:
  Ticket 3322 --> https://svn.open-mpi.org/trac/ompi/ticket/3322
2012-09-20 23:30:32 +00:00
Ralph Castain
d95025f53a Ensure we clear the usage numbers when binding on multiple nodes so we don't "carry over" info from one node to the next. Use the same tracking mechanism for binding upwards and in-place to avoid doing a bunch of mallocs.
Refs trac:3322

This commit was SVN r27356.

The following Trac tickets were found above:
  Ticket 3322 --> https://svn.open-mpi.org/trac/ompi/ticket/3322
2012-09-20 15:16:06 +00:00
Ralph Castain
a3060cdd15 Fix the bind_downward code - it was incorrectly looking across the entire node instead of only looking below the locale to which the proc had been assigned. In other words, if the proc was mapped to a core, then the only hwthreads that should be considered for binding are those directly below that core. The binding algo was incorrectly looking at ALL hwthreads in that scenario, causing the proc to be bound to an HT outside of the mapped location.
This now results in the procs being bound within their assigned location. It also causes us to use only the 0th HT on a core unless --use-hwthread-cpus has been specified (in which case, we use all the HTs in a core). Bind to core binds you to all HTs regardless - the --use-hwthread-cpus only impacts the oversubscribed determination and when binding to HT.

cmr:v1.7

This commit was SVN r27342.
2012-09-14 22:01:19 +00:00
Jeff Squyres
fb2e543a57 Refs trac:3275.
We ran into a case where the OMPI SVN trunk grew a new acceptable MCA
parameter value, but this new value was not accepted on the v1.6
branch (hwloc_base_mem_bind_failure_action -- on the trunk it accepts
the value "silent", but on the older v1.6 branch, it doesn't).  If you
set "hwloc_base_mem_bind_failure_action=silent" in the default MCA
params file and then accidentally ran with the v1.6 branch, every OMPI
executable (including ompi_info) just failed because hwloc_base_open()
would say "hey, 'silent' is not a valid value for
hwloc_base_mem_bind_failure_action!".  Kaboom.

The only problem is that it didn't give you any indication of where
this value was being set.  Quite maddening, from a user perspective.

So we changed the ompi_info handles this case.  If any framework open
function return OMPI_ERR_BAD_PARAM (either because its base MCA params
got a bad value or because one of its component register/open
functions return OMPI_ERR_BAD_PARAM), ompi_info will stop, print out
a warning that it received and error, and then dump out the parameters
that it has received so far in the framework that had a problem.

At a minimum, this will show the user the MCA param that had an error
(it's usually the last one), and ''where it was set from'' (so that
they can go fix it).  

We updated ompi_info to check for O???_ERR_BAD_PARAM from each from
the framework opens.  Also updated the doxygen docs in mca.h for this
O???_BAD_PARAM behavior.  And we noticed that mca.h had MCA_SUCCESS
and MCA_ERR_??? codes.  Why?  I think we used them in exactly one
place in the code base (mca_base_components_open.c).  So we deleted
those and just used the normal OPAL_* codes instead.

While we were doing this, we also cleaned up a little memory
management during ompi_info/orte-info/opal-info finalization.
Valgrind still reports a truckload of memory still in use at ompi_info
termination, but they mostly look to be components not freeing
memory/resources properly (and outside the scope of this fix).

This commit was SVN r27306.

The following Trac tickets were found above:
  Ticket 3275 --> https://svn.open-mpi.org/trac/ompi/ticket/3275
2012-09-11 20:47:24 +00:00
Jeff Squyres
8076cf8089 Abort configure if --enable-memchecker was specified, but then no
memchecker components were able to configure successfully.

This commit was SVN r27267.
2012-09-07 16:08:43 +00:00
Ralph Castain
67f34c3be6 Record the bind_level recvd by the daemon for each job so it can be correctly sent to the procs. Add test in get_relative_locality to avoid descending into an infinite loop if the level is NODE (==0).
This commit was SVN r27252.
2012-09-06 20:50:07 +00:00
Ralph Castain
ee6c7702d2 Ensure the cma.h file is included in the tarball
This commit was SVN r27235.
2012-09-04 19:34:09 +00:00
Shiqing Fan
0326e88c51 As opal_hwloc_topo_data_t has to create a class instance in orte, its definition has to be exported. Otherwise, there will be unresolved variable error on Windows.
This commit was SVN r27227.
2012-09-04 13:52:29 +00:00
Shiqing Fan
ddbd542732 Remove one .windows file.
Add a macro definition for isblank function.

This commit was SVN r27217.
2012-09-03 09:51:44 +00:00
Jeff Squyres
287e47a04d Fixes, improvements, and enhancements to the opal_tree class (used by
the LAMA RMAPS component, to be committed shortly).

This commit was SVN r27204.
2012-08-31 16:35:49 +00:00
Ralph Castain
38ce23db43 Add some protection to allow NULL bytes in byte objects and NULL strings to be handled cleanly in nidmaps and modex entries. Ensure there is a valid nidmap available for the HNP to pass down to any local procs when it is operating alone.
This commit was SVN r27188.
2012-08-31 01:07:36 +00:00
Ralph Castain
7c96c5498a Don't just leave the byte object uninitialized if size is 0
This commit was SVN r27185.
2012-08-30 14:01:21 +00:00
Jeff Squyres
dd5bd99942 Clean up the error message names from the hwloc base, and add a
missing error message.

This commit was SVN r27180.
2012-08-29 16:40:46 +00:00
Ralph Castain
ab39d81691 Protect copy of an opal_byte_object_t - it is okay to copy a zero-byte object
This commit was SVN r27164.
2012-08-28 21:15:25 +00:00
Jeff Squyres
e497894c4d Gah!! We inlined some of the functionality, so we need these structs
to be defined.  Put comments in there indicating that they're private
and should not be used by public consumers.

This commit was SVN r27075.
2012-08-16 18:42:23 +00:00
Jeff Squyres
01256c36c6 Gah -- meant to make these changes before committing to SVN. :-\
Hide some struct declarations in the .c file to emphasize that they
are not part of the public opal_hotel interface.

This commit was SVN r27068.
2012-08-16 17:37:57 +00:00
Jeff Squyres
96f640a762 Add new "opal_hotel" class. Abstractly speaking, this class does the
following:

 * Provides a fixed number of resource slots (i.e., "hotel rooms").
 * Allows one thing to occupy a resource slot at a time (i.e., each
   hotel room can have an occupant check in to that room).
 * Resource slots can be vacated at any time (i.e., occupants can
   voluntarily check out of their hotel room).
 * Resource slots can be occupied for a specific maximum amount of
   time.  If that time expires, the occupant is forcibly evicted and
   the upper layer is notified via (libevent) callback (i.e., the maid
   will kick an occupant of out of their room when their reservation
   is over).

This class can be to be used for things like retransmission schemes
for unreliable transports.  For example, a message sent on an
unreliable transport can be checked in to a hotel room.  If an ACK for
that message is received, the message can be checked out.  But if the
ACK is never received, the message will eventually be evicted from its
room and the upper layer will be notified that the message failed to
check out in time (i.e., that an ACK for that message was not received
in time).

Code using this class is currently being developed off-trunk, but will
be coming to SVN soon.

This commit was SVN r27067.
2012-08-16 17:29:55 +00:00
Ralph Castain
cb48fd52d4 Implement the MPI_Info part of MPI-3 Ticket 313. Add an MPI_info object MPI_INFO_GET_ENV that contains a number of run-time related pieces of info. This includes all the required ones in the ticket, plus a few that specifically address recent user questions:
"num_app_ctx" - the number of app_contexts in the job
"first_rank" - the MPI rank of the first process in each app_context
"np" - the number of procs in each app_context

Still need clarification on the MPI_Init portion of the ticket. Specifically, does the ticket call for returning an error is someone calls MPI_Init more than once in a program? We set a flag to tell us that we have been initialized, but currently never check it.

This commit was SVN r27005.
2012-08-12 01:28:23 +00:00
Ralph Castain
ad4cdd1a64 Sigh - add a continuation character so we don't lose required files
This commit was SVN r27004.
2012-08-11 16:19:29 +00:00
Ralph Castain
85af056090 GARRR...Remove the stupid dot <sigh>
This commit was SVN r27003.
2012-08-11 15:49:31 +00:00
Ralph Castain
acaaadb7a1 Correct file names for Windows events
This commit was SVN r27002.
2012-08-11 15:28:28 +00:00
Samuel Gutierrez
6188d97e1a Getting out of bed this morning was a bad idea... Reverting the sm update once more because it breaks direct launch. Will address this issue and commit the update once it has all been tested. Sorry everyone!
This commit was SVN r27001.
2012-08-10 22:20:38 +00:00
Jeff Squyres
65b4657159 Shqing needs a few more files from libevent for the Windows build.
This commit was SVN r26998.
2012-08-10 21:22:03 +00:00
Samuel Gutierrez
159bd2e62e Let's try this again: sm BTL initialization via modex.
This commit was SVN r26989.
2012-08-10 20:12:36 +00:00
George Bosilca
f7528bb404 Remove unused variables.
This commit was SVN r26966.
2012-08-08 12:43:13 +00:00
George Bosilca
2303cd0bdb Remove initialized but unused variables.
This commit was SVN r26959.
2012-08-07 12:05:25 +00:00
Shiqing Fan
2f442799f8 fix several typecasts
This commit was SVN r26957.
2012-08-07 10:41:53 +00:00
Jeff Squyres
7fe860c8a2 Per http://www.open-mpi.org/community/lists/devel/2012/08/11362.php,
there actually are some cases where we don't want to uniq-ize all
flags.  Thanks to P. Martin for raising the issue.

This commit was SVN r26955.
2012-08-06 21:05:11 +00:00
Jeff Squyres
0b7b3feba9 Minor fix for the command line parser: we didn't previously
distinguish between unknown ''options'' (i.e., command line options
that are registered and have some meaning) and unknown ''tokens''
(i.e., strings that do not begin with "-").  Hence, if you did:

   mpirun --fo my_mpi_program

(when perhaps you meant to type "--foo", mpirun would complain that no
such executable "--fo" existed.  That is ''correct,'' but perhaps not
completely useful.  It is more accurate for mpirun to report that
there is no such "--fo" option.

This change to cmd_line.c makes it so that we will ''always'' report
errors regarding tokens that begin with "-".

This commit was SVN r26953.
2012-08-06 17:13:08 +00:00
Jeff Squyres
91ccba9643 Minor enhancements to the hwloc base:
* NULL's out the hwloc_obj_t->userdata in
   hwloc_base_util.c:free_object() and
   hwloc_base_util.c:opal_hwloc_base_free_topology() after it has been
   OBJ_RELEASE'd.
 * Adds a userdata field to opal_hwloc_topo_data_t.  This field will
   be used in an upcoming rmaps component ("lama") to cache some
   associated data during hardware tree traversals.

This commit was SVN r26938.
2012-08-02 16:29:44 +00:00
Jeff Squyres
92adbc5d72 Fix some problems with the Fortran wrapper compiler flags,
particularly with respect to threading flags.  

Before this change, the following scenario would fail (e.g., on Linux
with pthreads):

{{{
$ ./configure --disable-shared --enable-static ...
$ make clean install
$ cd examples
$ make clean all
}}}

Linking the Fortran examples would fail with missing pthread symbols.

This commit was SVN r26927.
2012-07-31 18:12:24 +00:00
Shiqing Fan
42dfbc7d2f Another CMake scripts update for:
correctly generate hwloc library
automatically define OMPI/OPAL/ORTE_OMPORTS for user applications
update the f77 bindings

This commit was SVN r26893.
2012-07-27 11:49:09 +00:00
Jeff Squyres
46591b0b1a Clarify a configure warning: we're ''not'' adding to DYLD_LIBRARY_PATH.
This commit was SVN r26880.
2012-07-26 21:47:00 +00:00
Jeff Squyres
ce85596bc9 Also show the memcpy framework (if there are any components, which
there probably won't be, but...).

This commit was SVN r26879.
2012-07-26 21:28:41 +00:00
Jeff Squyres
a8a5f26bc2 Fix typo in comment.
This commit was SVN r26874.
2012-07-26 18:09:33 +00:00
Jeff Squyres
89a4258dfc Shorten the help message, per
http://www.open-mpi.org/community/lists/devel/2012/07/11314.php.  

This commit was SVN r26853.
2012-07-24 12:48:12 +00:00
Shiqing Fan
5d81c27282 Update the CMake files for Fortran 77 bindings, get ready for F90 bindings.
Change several variable names and update the macros.

This commit was SVN r26851.
2012-07-24 08:49:34 +00:00
Shiqing Fan
8c4a3e1269 correct the symbol dllexports for windows build
This commit was SVN r26827.
2012-07-22 08:54:50 +00:00
Jeff Squyres
11feeb61f3 Clarify the comment: we ''do'' apply the memory policy before main()
starts... unless you direct launch MPI applications, in which case the
policy isn't in effect until MPI_INIT completes.

This commit was SVN r26823.
2012-07-20 22:46:34 +00:00
Shiqing Fan
12d99a9ebb Update the hwloc build on Windows and related files.
This commit was SVN r26818.
2012-07-20 12:14:28 +00:00
Shiqing Fan
0f6184985d correct a few typecasts
This commit was SVN r26816.
2012-07-20 12:10:00 +00:00
Jeff Squyres
7a4f6a6a1a Assign the "ret" variable before it is used.
This commit was SVN r26781.
2012-07-11 12:09:00 +00:00
George Bosilca
7d6006a5a6 Fix various compiler warnings.
This commit was SVN r26774.
2012-07-10 15:57:15 +00:00
Shiqing Fan
bd6cb5decd change "#ifndef WIN32" to "#ifdef HAVE_DIRENT_H"
This commit was SVN r26755.
2012-07-05 16:37:30 +00:00
Shiqing Fan
1244f1f93a Use HAVE_STRINGS_H instead of WIN32 in r26728.
This commit was SVN r26729.

The following SVN revision numbers were found above:
  r26728 --> open-mpi/ompi@c97f46bcc7
2012-07-03 14:45:24 +00:00
Shiqing Fan
c97f46bcc7 minor changes on hwloc source files to support windows build.
This commit was SVN r26728.
2012-07-03 12:57:39 +00:00
Ralph Castain
e335de3564 Refactor ompi_info, splitting it into parts according to the layer involved. Thus, we call down to the opal layer to get those frameworks and components, and down to the orte layer to get those. Still some abstraction breaks, but they mostly involve renaming of OMPI_foo labels that have been around since before we split the build system by layer.
This commit was SVN r26695.
2012-06-28 18:23:34 +00:00
Ralph Castain
0dfe29b1a6 Roll in the rest of the modex change. Eliminate all non-modex API access of RTE info from the MPI layer - in some cases, the info was already present (either in the ompi_proc_t or in the orte_process_info struct) and no call was necessary. This removes all calls to orte_ess from the MPI layer. Calls to orte_grpcomm remain required.
Update all the orte ess components to remove their associated APIs for retrieving proc data. Update the grpcomm API to reflect transfer of set/get modex info to the db framework.

Note that this doesn't recreate the old GPR. This is strictly a local db storage that may (at some point) obtain any missing data from the local daemon as part of an async methodology. The framework allows us to experiment with such methods without perturbing the default one.

This commit was SVN r26678.
2012-06-27 14:53:55 +00:00
Josh Hursey
28681deffa Backout the ORCA commit. :(
There is a linking issue on Mac OSX that needs to be addressed before this is able to come back into the trunk.

This commit was SVN r26676.
2012-06-27 01:28:28 +00:00
Josh Hursey
542330e3a7 Commit of ORCA: Open MPI Runtime Collaborative Abstraction
This is a runtime interposition project that sits between the OMPI and ORTE layers in Open MPI.

The project is described on the wiki:
  https://svn.open-mpi.org/trac/ompi/wiki/Runtime_Interposition

And on this email thread:
  http://www.open-mpi.org/community/lists/devel/2012/06/11109.php

This commit was SVN r26670.
2012-06-26 21:42:16 +00:00
Nathan Hjelm
780a25945c do not compile unused libevent code
This commit was SVN r26657.
2012-06-25 22:24:51 +00:00
Jeff Squyres
11808852be Fixes trac:2913. Ensure that OMPI_VERSION is tolerant of having spaces in
it (i.e., so that it doesn't screw up the ident version string).

This commit was SVN r26651.

The following Trac tickets were found above:
  Ticket 2913 --> https://svn.open-mpi.org/trac/ompi/ticket/2913
2012-06-25 17:04:25 +00:00
Ralph Castain
ceeb8f001f Delete unused file
This commit was SVN r26647.
2012-06-25 13:55:10 +00:00
Ralph Castain
b990c65a53 Remove another antiquated dss function - the 'size' API isn't used anywhere since the GPR went away
This commit was SVN r26646.
2012-06-25 13:33:45 +00:00
Ralph Castain
abe7dd8274 Cleanup the dss by removing unused functions
This commit was SVN r26644.
2012-06-23 21:20:09 +00:00
Jeff Squyres
148ae6d6e3 This commit unifies the configury of some verbs-lovin' components.
* Add new configure command line options and deprecate some old ones:
   * --with-verbs replaces --with-openib
   * --with-verbs-libdir replaces --with-openib-libdir
 * If you specify --with-openib[-libdir] without
   --with-verbs[-libdir], you'll get a "these options have been
   deprecated!" warning, but then they'll act just like
   --with-verbs[--libdir]. 

  '''Sidenote:''' Note that we are not renaming any components at this
  time, nor are we renaming the top-level OMPI_CHECK_OPENIB m4 macro
  (which is pretty strongly tied to the openib BTL and is bastaridzed
  by the ofud BTL).  Note that there will likely be more changes in
  this area coming soon (next week?) when some long-standing changes
  move to the SVN trunk: some openib BTL infrastructure will move to
  ompi/mca/common, and its configury gets split up / refactored.

We extend our philosophy of other --with-<foo> configure options of
--with-verbs to ''all'' verbs-lovin components:

 * If you specify --with-verbs, then all verbs-lovin' components must
   configure successfully (or abort).  This currently means: OOB ud,
   BTL ofud, BTL openib.
 * If you specify --with-verbs=DIR, then all verbs-lovin' component
   must configure successfully (or abort), and will use DIR to find
   verbs headers and libraries.
 * If you specify --without-verbs, then all verbs-lovin' components
   will be ignored.

This commit also fixes a problem where the --with-openib=DIR form
would not use DIR for ''all'' verbs-lovin' components (I think only
BTL openib and BTL ofud used that DIR).  Now all of them do, as does
hwloc (because hwloc has some !OpenFabrics helper functions that
require ibv types from verbs.h).

There's a little new m4 infrastructure worth mentioning:

 * If you create a new verbs-lovin' component (i.e., a component that
   need verbs), your configure.m4 should
   AC_REQUIRE([OPAL_CHECK_VERBS_DIR]). 
 * You can then use three global shell variables: $opal_want_verbs,
   $opal_verbs_dir, $opal_verbs_libdir, which will be set as follows:
   * opal_want_verbs will be "yes" and opal_verbs_dir and
     opal_verbs_libdir will both be set to directory values, '''OR'''
   * opal_want_verbs will be "no" and opal_verbs_dir and
     opal_verbs_libdir will both be set empty

This commit was SVN r26640.
2012-06-22 19:53:56 +00:00
Brian Barrett
9af72072a3 Use MKDIR_P instead of mkdir_p in Makefiles, as MKDIR_P is the only one
defined in recent versions of AC/AM.

This commit was SVN r26625.
2012-06-21 16:52:37 +00:00
Christopher Yeoh
9c353cf9d8 This adds a file that was missed in r26134 (adds support for Cross Memory
Attach). It is a header file that contains syscall defs for process_vm_readv
and process_vm_writev. It is only used on systems where glibc does not yet
have support for the new syscalls and where --with-cma has been passed 
to configure. The syscall numbers are hardcoded but have been in a released
kernel and so will not change in the future

Once all linux distros have the new glibc this file can be removed.

This commit was SVN r26615.

The following SVN revision numbers were found above:
  r26134 --> open-mpi/ompi@524de80eaa
2012-06-18 05:01:09 +00:00
Jeff Squyres
06c4317dd4 Ensure to include external.h in the tarball.
This commit was SVN r26610.
2012-06-15 16:29:21 +00:00
Jeff Squyres
6760840ebb Fix builds with the external hwloc component when we use the
hwloc/openfabrics-verbs.h helper header file.

This commit was SVN r26603.
2012-06-14 19:00:57 +00:00
Terry Dontje
6d7cf4a0e5 corrected picl dependency checking to occur in the hwloc.m4 instead of Makefile.am
This commit was SVN r26595.
2012-06-12 14:47:05 +00:00
Ralph Castain
cee5a75d19 Revert the default configuration to no orte progress thread and no libevent thread support until we can get more of the kinks ironed out.
This commit was SVN r26593.
2012-06-11 20:52:28 +00:00
Nathan Hjelm
ceee4bcb0d libevent2019: libevent_pthreads.la is never built. don't include it
This commit was SVN r26570.
2012-06-07 19:22:45 +00:00
Jeff Squyres
ba040e3a42 Upgrade hwloc from 1.3.2+patches to 1.4.2+patches.
This commit was SVN r26566.
2012-06-07 16:24:46 +00:00
Ralph Castain
5876496f4c Enable orte progress threads and libevent thread support by default
This commit was SVN r26565.
2012-06-07 04:25:00 +00:00
Shiqing Fan
aa6cde9886 Change f77 to fortran for the rest of windows build files.
This commit was SVN r26558.
2012-06-06 14:09:51 +00:00
Jeff Squyres
8d161af059 Move hwloc_cpuset_t prettyprint routines down into the hwloc base:
* opal_hwloc_base_cset2str(): Make a human-readable string of a
   hwloc_cpuset_t (e.g., socket 2[core 3[hwt 1]])
 * opal_hwloc_base_cset2mapstr(): Make a map-like string of a
   hwloc_cpuset_t (e.g., [B./..])

This commit was SVN r26532.
2012-06-01 16:02:18 +00:00
Jeff Squyres
05807ef19a Record the upstream SVN commit
This commit was SVN r26525.
2012-05-29 23:44:25 +00:00
Jeff Squyres
b3fbb0a2d5 Ensure to actually exit the non-voice function, even in non-debug
builds (i.e., where assert() is preprocessed away).

This commit was SVN r26524.
2012-05-29 23:41:23 +00:00
Jeff Squyres
99c5afb397 Remove clang compiler warnings.
This commit was SVN r26523.
2012-05-29 23:36:06 +00:00
Ralph Castain
9bedb25dda Cleanup some compiler warnings, some of which are actual logic errors
This commit was SVN r26519.
2012-05-29 20:11:51 +00:00
Jeff Squyres
551b53dd89 Keep the help string less than 509 characters so that compilers don't complain.
This commit was SVN r26514.
2012-05-29 18:43:04 +00:00
Jeff Squyres
96901d9503 Slightly change the wording in the help message for the
hwloc_base_mem_alloc_policy MCA parameter to be more explicit.

This commit was SVN r26512.
2012-05-29 18:08:39 +00:00
Jeff Squyres
5391e98d56 Remove some generated files.
This commit was SVN r26495.
2012-05-25 00:45:38 +00:00
Ralph Castain
32337d0d5c Well, we dont need the -levent any more
This commit was SVN r26487.
2012-05-24 01:54:37 +00:00
Ralph Castain
63a23f1f90 Get the event lib built correctly - missing the OMPI-specific pieces from Makefile.am!
Copmlete the symbol renaming to cover new symbols in the updated version.

This commit was SVN r26486.
2012-05-24 00:27:41 +00:00
Ralph Castain
1d41c6bb80 Some more libevent configure fixes. Ensure -levent gets added to the wrapper flags as the LANL machines can't seem to find it otherwise. Remove the duplicate evaluation of --visibility in the libevent area as it was already done by opal for everyone. Set some more ignores.
This commit was SVN r26485.
2012-05-23 21:26:54 +00:00
Ralph Castain
5821e63e27 Remove built files and update ignores
This commit was SVN r26482.
2012-05-23 18:17:29 +00:00
Ralph Castain
8a2ca3d96f Delete built files
This commit was SVN r26481.
2012-05-23 18:14:26 +00:00
Ralph Castain
11d5a31b1e Update ignores, remove build result file from repo
This commit was SVN r26460.
2012-05-21 00:59:04 +00:00
Ralph Castain
40317c0290 Remove the old event lib version - new one seems to be working just fine.
This commit was SVN r26459.
2012-05-21 00:57:47 +00:00
Ralph Castain
83d69b6c95 Enable the ORTE progress thread for apps (not needed in the tools as they already continuously loop in the event lib). This appears to be working, at least for MPI apps that only use shared memory (a simple "hello"). More testing is required to identify where problems will occur - this is only intended to allow further development.
In order to use the progress thread, you must configure with:

--enable-orte-progress-threads --enable-event-thread-support

This commit was SVN r26457.
2012-05-20 15:14:43 +00:00
Ralph Castain
1ce59d08b5 Continue cleaning up the libevent ignores
This commit was SVN r26455.
2012-05-19 16:11:24 +00:00
Ralph Castain
1826b513ee Add missing windows file
This commit was SVN r26433.
2012-05-12 02:05:25 +00:00
Ralph Castain
09f413025b As per the RFC, upgrade OMPI to libevent 2.0.19. Leave the 2.0.13 release in the system, but inactive, for now in case we discover a need to rollback.
This commit was SVN r26431.
2012-05-11 01:05:36 +00:00
Jeff Squyres
1d7fef001c Record the upstream hwloc commit that we've committed here in the OMPI
tree

This commit was SVN r26422.
2012-05-10 12:15:23 +00:00
Jeff Squyres
9c9d7e77df Commit a fix for hwloc -- still checking with upstream to see if this
will be the final solution.  But I'm committing it now so that
Oracle's Solaris Studio builds can resume.

The issue is that the C++ bindings are now (eventually) including
<hwloc.h>.  We use !__hwloc_inline__ and #define it to an appropriate
value at compile-time.  The issue is that when we're compiling C++
code, we should just set !__hwloc_inline__ to "inline", because that's
a keyword in the C++ language (as opposed to !__inline__, or
somesuch).

This commit was SVN r26418.
2012-05-09 21:03:45 +00:00
Jeff Squyres
de4bbacd13 It turns out that we can't always include the hwloc OpenFabrics verbs
helper file, even if we find that the system has <infiniband/verbs.h>.
The reason is because there are some inline functions in that verbs
helper file that invoke ibv_* functions.  Some linkers (e.g., Solaris
Studio Compilers) will instantiate those static inline functions --
even if we don't use them -- and therefore we need to be able to
resolve the ibv_* symbols at link time.

But since -libverbs is only specified in places where we use other
ibv_* functions (e.g., the OpenFabrics-based BTLs), that means that
linking random executables can/will fail (e.g., orterun).

So instead, introduce a new #define: OPAL_HWLOC_WANT_VERBS_HELPER.  If
this macro is set to 1 before including opal/mca/hwloc/hwloc.h, then
you'll also get the hwloc OpenFabrics verbs helper header file (*if*
hwloc found <infiniband/verbs.h> -- otherwise, it'll #error).

This commit was SVN r26417.
2012-05-09 20:18:31 +00:00
Jeff Squyres
2ba10c37fe Per RFC, bring in the following changes:
* Remove paffinity, maffinity, and carto frameworks -- they've been
   wholly replaced by hwloc.
 * Move ompi_mpi_init() affinity-setting/checking code down to ORTE.
 * Update sm, smcuda, wv, and openib components to no longer use carto.
   Instead, use hwloc data.  There are still optimizations possible in
   the sm/smcuda BTLs (i.e., making multiple mpools).  Also, the old
   carto-based code found out how many NUMA nodes were ''available''
   -- not how many were used ''in this job''.  The new hwloc-using
   code computes the same value -- it was not updated to calculate how
   many NUMA nodes are used ''by this job.''
   * Note that I cannot compile the smcuda and wv BTLs -- I ''think''
     they're right, but they need to be verified by their owners.
 * The openib component now does a bunch of stuff to figure out where
   "near" OpenFabrics devices are.  '''THIS IS A CHANGE IN DEFAULT
   BEHAVIOR!!''' and still needs to be verified by OpenFabrics vendors
   (I do not have a NUMA machine with an OpenFabrics device that is a
   non-uniform distance from multiple different NUMA nodes).
 * Completely rewrite the OMPI_Affinity_str() routine from the
   "affinity" mpiext extension.  This extension now understands
   hyperthreads; the output format of it has changed a bit to reflect
   this new information.
 * Bunches of minor changes around the code base to update names/types
   from maffinity/paffinity-based names to hwloc-based names.
 * Add some helper functions into the hwloc base, mainly having to do
   with the fact that we have the hwloc data reporting ''all''
   topology information, but sometimes you really only want the
   (online | available) data.

This commit was SVN r26391.
2012-05-07 14:52:54 +00:00
Jeff Squyres
c30d1ef0df Patch from Evan Clinton, reviewed by Leif Lindholm, for supporting
ARM5 and ARM6.

This commit was SVN r26361.
2012-04-30 20:49:55 +00:00
Nathan Hjelm
e84f9ec8c3 don't define OPAL_HAVE_ATOMIC_SWAP_64/32 in amd/atomic.h unless we have inlined assembly. fixes pgi complilation on XE/XK-6
This commit was SVN r26343.
2012-04-26 20:43:30 +00:00
Jeff Squyres
0652e2e913 Fix/clarify a comment from r26322.
This commit was SVN r26323.

The following SVN revision numbers were found above:
  r26322 --> open-mpi/ompi@aba398ce09
2012-04-24 17:35:19 +00:00
Jeff Squyres
aba398ce09 Per RFC
(http://www.open-mpi.org/community/lists/devel/2012/04/10905.php), set
opal_cache_line_size via hwloc data, if we have it.
opal_cache_line_size will be set to an hwloc-inspired value by the end
of orte_init(), but will always have a safe value to use (i.e., a
default value 128) -- even before opal_init() has completed.

Default to the same value of 128 that Open MPI has used for several
years if a) we have no hwloc data, or b) we weren't able to find L2
objects in the hwloc data.

This commit was SVN r26322.
2012-04-24 17:31:06 +00:00
Ralph Castain
bd8b4f7f1e Sorry for mid-day commit, but I had promised on the call to do this upon my return.
Roll in the ORTE state machine. Remove last traces of opal_sos. Remove UTK epoch code.

Please see the various emails about the state machine change for details. I'll send something out later with more info on the new arch.

This commit was SVN r26242.
2012-04-06 14:23:13 +00:00
Josh Hursey
1941f6b3b1 Cleanup some compiler warnings when doing an optimized/non-debug build.
This commit was SVN r26236.
2012-04-04 20:40:16 +00:00
Mike Dubman
ff1c84c53f revert previous commit
This commit was SVN r26206.
2012-03-29 14:07:13 +00:00
Mike Dubman
43a5775e8a performance fix: set alignment for openib internal buffers
This commit was SVN r26205.
2012-03-29 14:00:08 +00:00
Jeff Squyres
028f471a20 Using the right env variable name helps!
This commit was SVN r26204.
2012-03-28 17:59:21 +00:00
Jeff Squyres
8a2df3311d Fixes trac:2812: check for env. markers indicating that we're in a
fakeroot.  If so, exit out of the pre-main hook immediately (without
calling functions such as stat, which will be replaced by fakeroot to
things that are not safe to call in a pre-main environment).

This commit was SVN r26203.

The following Trac tickets were found above:
  Ticket 2812 --> https://svn.open-mpi.org/trac/ompi/ticket/2812
2012-03-28 16:41:29 +00:00
Pavel Shamis
39a55df333 Adding exported libevent globabl variables to the opal_rename file.
Otherwise the varables case name conflicts.

This commit was SVN r26201.
2012-03-27 17:26:21 +00:00
Ralph Castain
811413e9bc Correctly handle multiple cpu-set ranges. Correctly support optional binding directives combined with cpu-set.
This commit was SVN r26187.
2012-03-23 14:50:41 +00:00
Ralph Castain
ce0caf7567 Support -cpu-set by binding to the specified cpus in the absence of any other binding directive. Allows users to subdivide nodes for multiple parallel mpirun invocations.
This commit was SVN r26186.
2012-03-23 14:05:52 +00:00
Ralph Castain
6f6930eb66 Resolve infinite loop when -cpu-set is specified
This commit was SVN r26184.
2012-03-23 07:18:58 +00:00
Jeff Squyres
95148f3310 Don't force the use of libpci support in hwloc in the default case --
just let hwloc decide for itself.

This commit was SVN r26178.
2012-03-22 15:28:35 +00:00
Jeff Squyres
3bf038bb1c Per RFC from long ago:
http://www.open-mpi.org/community/lists/devel/2011/10/9784.php

Bring support for a DMTCP CRS module into the trunk.  See
http://dmtcp.sourceforge.net/ for a description of DMTCP.  Thanks to
the contribution from Alex Brick at Northeastern University, and all
the others up there who helped shepherd this into being ready to
submit.

This commit was SVN r26176.
2012-03-22 12:01:46 +00:00
Jeff Squyres
d30bbc2ef9 Fix an old issue: enable hwloc PCI detection except on SuSE 10 64 bit.
Worked with Oracle to verify that hwloc PCI detection is correctly
disabled on the Suse 10/64 bit platform and is enabled by default on
all other platforms.  The --[en|dis]able-hwloc-pci switch is also
available for manual override of the configure decision about hwloc
PCI support.

This commit was SVN r26175.
2012-03-22 11:30:57 +00:00
Josh Hursey
f7c66f54b6 Add a pair of functions to count the number of set/unset bits in an opal_bitmap.
I found this useful on a branch, and thought others might as well.

This commit was SVN r26171.
2012-03-21 14:16:45 +00:00
Jeff Squyres
ab543fce58 We have no common components in opal any more, so we can remove this directory.
This commit was SVN r26169.
2012-03-20 21:21:49 +00:00
Jeff Squyres
0322db7cde Bring over r4402 from hwloc trunk.
This commit was SVN r26165.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r4402
2012-03-19 16:39:54 +00:00
Jeff Squyres
aeca190744 Refs trac:3046: feedback from Brian -- don't set DYLD_LIBRARY_PATH.
This commit was SVN r26108.

The following Trac tickets were found above:
  Ticket 3046 --> https://svn.open-mpi.org/trac/ompi/ticket/3046
2012-03-07 13:12:22 +00:00
Ralph Castain
366f9d1518 Add some missing localities to the hwloc pretty-print, fix pmi modex
This commit was SVN r26105.
2012-03-06 06:21:10 +00:00
George Bosilca
24bfb3e77b Promote from int32_t to size_t.
This commit was SVN r26096.
2012-03-05 14:08:47 +00:00
Jeff Squyres
d4a05ec26e Disable inline assembly for the PGI C++ compiler per patch from Larry
Baker.  We already had it disabled for the PGI C compiler; I'm not
sure why it was left enabled for the C++ compiler.

This commit was SVN r26083.
2012-03-02 20:45:51 +00:00
Jeff Squyres
f84c16bb65 Fixes trac:3043. Looks like some of the improvements to the hwloc132
hwloc component weren't reverse applied to the external hwloc
component.  Additionaly, if we add stuff to LDFLAGS/LIBS, we also may
need to append (DY)LD_LIBRARY_PATH (here in this configure process
only), otherwise future configure tests may fail because they can't
find libhwloc.so (e.g., if you --with-hwloc=/some/path, we need to add
/some/path/lib to (DY)LD_LIBRARY_PATH).

This commit was SVN r26082.

The following Trac tickets were found above:
  Ticket 3043 --> https://svn.open-mpi.org/trac/ompi/ticket/3043
2012-03-02 20:15:07 +00:00
Jeff Squyres
97b3603036 A bunch of fixes and improvements to Open MPI's various command line tools.
* fixed some bugs where "unknown" tokens were allowed on the command
   line (which should really only be used for ortertun).
 * if an unknown token is encountered, print a short error to stderr
   and quit with a nonzero exit status
 * if we don't find the right number of parameters to an option, print
   a short error to stderr and quit with a nonzero exit status
 * when --help is given, print the help message to stdout (not stderr)
   and quit with a zero exit status
 * added --showme:help option to the wrapper compilers
 * updated docs in opal/util/cmd_line.h
 * other small/miscellaneous CLI parsing bugs in various tools

I won't bore you with what we did before.  :-)  Here's some examples
of what the new behavior looks like:

{{{
% ompi_info --bogus
ompi_info: Error: unknown option "--bogus"
Type 'ompi_info --help' for usage.
% ompi_info --param bogus
ompi_info: Error: option "--param" did not have enough parameters (2)
Type 'ompi_info --help' for usage.
%
}}}

This commit was SVN r26072.
2012-02-29 17:52:38 +00:00