1
1
Граф коммитов

1241 Коммитов

Автор SHA1 Сообщение Дата
Ralph Castain
ad4cdd1a64 Sigh - add a continuation character so we don't lose required files
This commit was SVN r27004.
2012-08-11 16:19:29 +00:00
Ralph Castain
85af056090 GARRR...Remove the stupid dot <sigh>
This commit was SVN r27003.
2012-08-11 15:49:31 +00:00
Ralph Castain
acaaadb7a1 Correct file names for Windows events
This commit was SVN r27002.
2012-08-11 15:28:28 +00:00
Samuel Gutierrez
6188d97e1a Getting out of bed this morning was a bad idea... Reverting the sm update once more because it breaks direct launch. Will address this issue and commit the update once it has all been tested. Sorry everyone!
This commit was SVN r27001.
2012-08-10 22:20:38 +00:00
Jeff Squyres
65b4657159 Shqing needs a few more files from libevent for the Windows build.
This commit was SVN r26998.
2012-08-10 21:22:03 +00:00
Samuel Gutierrez
159bd2e62e Let's try this again: sm BTL initialization via modex.
This commit was SVN r26989.
2012-08-10 20:12:36 +00:00
George Bosilca
f7528bb404 Remove unused variables.
This commit was SVN r26966.
2012-08-08 12:43:13 +00:00
George Bosilca
2303cd0bdb Remove initialized but unused variables.
This commit was SVN r26959.
2012-08-07 12:05:25 +00:00
Shiqing Fan
2f442799f8 fix several typecasts
This commit was SVN r26957.
2012-08-07 10:41:53 +00:00
Jeff Squyres
91ccba9643 Minor enhancements to the hwloc base:
* NULL's out the hwloc_obj_t->userdata in
   hwloc_base_util.c:free_object() and
   hwloc_base_util.c:opal_hwloc_base_free_topology() after it has been
   OBJ_RELEASE'd.
 * Adds a userdata field to opal_hwloc_topo_data_t.  This field will
   be used in an upcoming rmaps component ("lama") to cache some
   associated data during hardware tree traversals.

This commit was SVN r26938.
2012-08-02 16:29:44 +00:00
Jeff Squyres
46591b0b1a Clarify a configure warning: we're ''not'' adding to DYLD_LIBRARY_PATH.
This commit was SVN r26880.
2012-07-26 21:47:00 +00:00
Jeff Squyres
a8a5f26bc2 Fix typo in comment.
This commit was SVN r26874.
2012-07-26 18:09:33 +00:00
Jeff Squyres
89a4258dfc Shorten the help message, per
http://www.open-mpi.org/community/lists/devel/2012/07/11314.php.  

This commit was SVN r26853.
2012-07-24 12:48:12 +00:00
Jeff Squyres
11feeb61f3 Clarify the comment: we ''do'' apply the memory policy before main()
starts... unless you direct launch MPI applications, in which case the
policy isn't in effect until MPI_INIT completes.

This commit was SVN r26823.
2012-07-20 22:46:34 +00:00
Shiqing Fan
12d99a9ebb Update the hwloc build on Windows and related files.
This commit was SVN r26818.
2012-07-20 12:14:28 +00:00
Shiqing Fan
0f6184985d correct a few typecasts
This commit was SVN r26816.
2012-07-20 12:10:00 +00:00
Shiqing Fan
bd6cb5decd change "#ifndef WIN32" to "#ifdef HAVE_DIRENT_H"
This commit was SVN r26755.
2012-07-05 16:37:30 +00:00
Shiqing Fan
1244f1f93a Use HAVE_STRINGS_H instead of WIN32 in r26728.
This commit was SVN r26729.

The following SVN revision numbers were found above:
  r26728 --> open-mpi/ompi@c97f46bcc7
2012-07-03 14:45:24 +00:00
Shiqing Fan
c97f46bcc7 minor changes on hwloc source files to support windows build.
This commit was SVN r26728.
2012-07-03 12:57:39 +00:00
Ralph Castain
0dfe29b1a6 Roll in the rest of the modex change. Eliminate all non-modex API access of RTE info from the MPI layer - in some cases, the info was already present (either in the ompi_proc_t or in the orte_process_info struct) and no call was necessary. This removes all calls to orte_ess from the MPI layer. Calls to orte_grpcomm remain required.
Update all the orte ess components to remove their associated APIs for retrieving proc data. Update the grpcomm API to reflect transfer of set/get modex info to the db framework.

Note that this doesn't recreate the old GPR. This is strictly a local db storage that may (at some point) obtain any missing data from the local daemon as part of an async methodology. The framework allows us to experiment with such methods without perturbing the default one.

This commit was SVN r26678.
2012-06-27 14:53:55 +00:00
Nathan Hjelm
780a25945c do not compile unused libevent code
This commit was SVN r26657.
2012-06-25 22:24:51 +00:00
Ralph Castain
b990c65a53 Remove another antiquated dss function - the 'size' API isn't used anywhere since the GPR went away
This commit was SVN r26646.
2012-06-25 13:33:45 +00:00
Ralph Castain
abe7dd8274 Cleanup the dss by removing unused functions
This commit was SVN r26644.
2012-06-23 21:20:09 +00:00
Jeff Squyres
148ae6d6e3 This commit unifies the configury of some verbs-lovin' components.
* Add new configure command line options and deprecate some old ones:
   * --with-verbs replaces --with-openib
   * --with-verbs-libdir replaces --with-openib-libdir
 * If you specify --with-openib[-libdir] without
   --with-verbs[-libdir], you'll get a "these options have been
   deprecated!" warning, but then they'll act just like
   --with-verbs[--libdir]. 

  '''Sidenote:''' Note that we are not renaming any components at this
  time, nor are we renaming the top-level OMPI_CHECK_OPENIB m4 macro
  (which is pretty strongly tied to the openib BTL and is bastaridzed
  by the ofud BTL).  Note that there will likely be more changes in
  this area coming soon (next week?) when some long-standing changes
  move to the SVN trunk: some openib BTL infrastructure will move to
  ompi/mca/common, and its configury gets split up / refactored.

We extend our philosophy of other --with-<foo> configure options of
--with-verbs to ''all'' verbs-lovin components:

 * If you specify --with-verbs, then all verbs-lovin' components must
   configure successfully (or abort).  This currently means: OOB ud,
   BTL ofud, BTL openib.
 * If you specify --with-verbs=DIR, then all verbs-lovin' component
   must configure successfully (or abort), and will use DIR to find
   verbs headers and libraries.
 * If you specify --without-verbs, then all verbs-lovin' components
   will be ignored.

This commit also fixes a problem where the --with-openib=DIR form
would not use DIR for ''all'' verbs-lovin' components (I think only
BTL openib and BTL ofud used that DIR).  Now all of them do, as does
hwloc (because hwloc has some !OpenFabrics helper functions that
require ibv types from verbs.h).

There's a little new m4 infrastructure worth mentioning:

 * If you create a new verbs-lovin' component (i.e., a component that
   need verbs), your configure.m4 should
   AC_REQUIRE([OPAL_CHECK_VERBS_DIR]). 
 * You can then use three global shell variables: $opal_want_verbs,
   $opal_verbs_dir, $opal_verbs_libdir, which will be set as follows:
   * opal_want_verbs will be "yes" and opal_verbs_dir and
     opal_verbs_libdir will both be set to directory values, '''OR'''
   * opal_want_verbs will be "no" and opal_verbs_dir and
     opal_verbs_libdir will both be set empty

This commit was SVN r26640.
2012-06-22 19:53:56 +00:00
Jeff Squyres
06c4317dd4 Ensure to include external.h in the tarball.
This commit was SVN r26610.
2012-06-15 16:29:21 +00:00
Jeff Squyres
6760840ebb Fix builds with the external hwloc component when we use the
hwloc/openfabrics-verbs.h helper header file.

This commit was SVN r26603.
2012-06-14 19:00:57 +00:00
Terry Dontje
6d7cf4a0e5 corrected picl dependency checking to occur in the hwloc.m4 instead of Makefile.am
This commit was SVN r26595.
2012-06-12 14:47:05 +00:00
Ralph Castain
cee5a75d19 Revert the default configuration to no orte progress thread and no libevent thread support until we can get more of the kinks ironed out.
This commit was SVN r26593.
2012-06-11 20:52:28 +00:00
Nathan Hjelm
ceee4bcb0d libevent2019: libevent_pthreads.la is never built. don't include it
This commit was SVN r26570.
2012-06-07 19:22:45 +00:00
Jeff Squyres
ba040e3a42 Upgrade hwloc from 1.3.2+patches to 1.4.2+patches.
This commit was SVN r26566.
2012-06-07 16:24:46 +00:00
Ralph Castain
5876496f4c Enable orte progress threads and libevent thread support by default
This commit was SVN r26565.
2012-06-07 04:25:00 +00:00
Jeff Squyres
8d161af059 Move hwloc_cpuset_t prettyprint routines down into the hwloc base:
* opal_hwloc_base_cset2str(): Make a human-readable string of a
   hwloc_cpuset_t (e.g., socket 2[core 3[hwt 1]])
 * opal_hwloc_base_cset2mapstr(): Make a map-like string of a
   hwloc_cpuset_t (e.g., [B./..])

This commit was SVN r26532.
2012-06-01 16:02:18 +00:00
Jeff Squyres
05807ef19a Record the upstream SVN commit
This commit was SVN r26525.
2012-05-29 23:44:25 +00:00
Jeff Squyres
b3fbb0a2d5 Ensure to actually exit the non-voice function, even in non-debug
builds (i.e., where assert() is preprocessed away).

This commit was SVN r26524.
2012-05-29 23:41:23 +00:00
Ralph Castain
9bedb25dda Cleanup some compiler warnings, some of which are actual logic errors
This commit was SVN r26519.
2012-05-29 20:11:51 +00:00
Jeff Squyres
551b53dd89 Keep the help string less than 509 characters so that compilers don't complain.
This commit was SVN r26514.
2012-05-29 18:43:04 +00:00
Jeff Squyres
96901d9503 Slightly change the wording in the help message for the
hwloc_base_mem_alloc_policy MCA parameter to be more explicit.

This commit was SVN r26512.
2012-05-29 18:08:39 +00:00
Jeff Squyres
5391e98d56 Remove some generated files.
This commit was SVN r26495.
2012-05-25 00:45:38 +00:00
Ralph Castain
32337d0d5c Well, we dont need the -levent any more
This commit was SVN r26487.
2012-05-24 01:54:37 +00:00
Ralph Castain
63a23f1f90 Get the event lib built correctly - missing the OMPI-specific pieces from Makefile.am!
Copmlete the symbol renaming to cover new symbols in the updated version.

This commit was SVN r26486.
2012-05-24 00:27:41 +00:00
Ralph Castain
1d41c6bb80 Some more libevent configure fixes. Ensure -levent gets added to the wrapper flags as the LANL machines can't seem to find it otherwise. Remove the duplicate evaluation of --visibility in the libevent area as it was already done by opal for everyone. Set some more ignores.
This commit was SVN r26485.
2012-05-23 21:26:54 +00:00
Ralph Castain
5821e63e27 Remove built files and update ignores
This commit was SVN r26482.
2012-05-23 18:17:29 +00:00
Ralph Castain
8a2ca3d96f Delete built files
This commit was SVN r26481.
2012-05-23 18:14:26 +00:00
Ralph Castain
11d5a31b1e Update ignores, remove build result file from repo
This commit was SVN r26460.
2012-05-21 00:59:04 +00:00
Ralph Castain
40317c0290 Remove the old event lib version - new one seems to be working just fine.
This commit was SVN r26459.
2012-05-21 00:57:47 +00:00
Ralph Castain
83d69b6c95 Enable the ORTE progress thread for apps (not needed in the tools as they already continuously loop in the event lib). This appears to be working, at least for MPI apps that only use shared memory (a simple "hello"). More testing is required to identify where problems will occur - this is only intended to allow further development.
In order to use the progress thread, you must configure with:

--enable-orte-progress-threads --enable-event-thread-support

This commit was SVN r26457.
2012-05-20 15:14:43 +00:00
Ralph Castain
1ce59d08b5 Continue cleaning up the libevent ignores
This commit was SVN r26455.
2012-05-19 16:11:24 +00:00
Ralph Castain
1826b513ee Add missing windows file
This commit was SVN r26433.
2012-05-12 02:05:25 +00:00
Ralph Castain
09f413025b As per the RFC, upgrade OMPI to libevent 2.0.19. Leave the 2.0.13 release in the system, but inactive, for now in case we discover a need to rollback.
This commit was SVN r26431.
2012-05-11 01:05:36 +00:00
Jeff Squyres
1d7fef001c Record the upstream hwloc commit that we've committed here in the OMPI
tree

This commit was SVN r26422.
2012-05-10 12:15:23 +00:00
Jeff Squyres
9c9d7e77df Commit a fix for hwloc -- still checking with upstream to see if this
will be the final solution.  But I'm committing it now so that
Oracle's Solaris Studio builds can resume.

The issue is that the C++ bindings are now (eventually) including
<hwloc.h>.  We use !__hwloc_inline__ and #define it to an appropriate
value at compile-time.  The issue is that when we're compiling C++
code, we should just set !__hwloc_inline__ to "inline", because that's
a keyword in the C++ language (as opposed to !__inline__, or
somesuch).

This commit was SVN r26418.
2012-05-09 21:03:45 +00:00
Jeff Squyres
de4bbacd13 It turns out that we can't always include the hwloc OpenFabrics verbs
helper file, even if we find that the system has <infiniband/verbs.h>.
The reason is because there are some inline functions in that verbs
helper file that invoke ibv_* functions.  Some linkers (e.g., Solaris
Studio Compilers) will instantiate those static inline functions --
even if we don't use them -- and therefore we need to be able to
resolve the ibv_* symbols at link time.

But since -libverbs is only specified in places where we use other
ibv_* functions (e.g., the OpenFabrics-based BTLs), that means that
linking random executables can/will fail (e.g., orterun).

So instead, introduce a new #define: OPAL_HWLOC_WANT_VERBS_HELPER.  If
this macro is set to 1 before including opal/mca/hwloc/hwloc.h, then
you'll also get the hwloc OpenFabrics verbs helper header file (*if*
hwloc found <infiniband/verbs.h> -- otherwise, it'll #error).

This commit was SVN r26417.
2012-05-09 20:18:31 +00:00
Jeff Squyres
2ba10c37fe Per RFC, bring in the following changes:
* Remove paffinity, maffinity, and carto frameworks -- they've been
   wholly replaced by hwloc.
 * Move ompi_mpi_init() affinity-setting/checking code down to ORTE.
 * Update sm, smcuda, wv, and openib components to no longer use carto.
   Instead, use hwloc data.  There are still optimizations possible in
   the sm/smcuda BTLs (i.e., making multiple mpools).  Also, the old
   carto-based code found out how many NUMA nodes were ''available''
   -- not how many were used ''in this job''.  The new hwloc-using
   code computes the same value -- it was not updated to calculate how
   many NUMA nodes are used ''by this job.''
   * Note that I cannot compile the smcuda and wv BTLs -- I ''think''
     they're right, but they need to be verified by their owners.
 * The openib component now does a bunch of stuff to figure out where
   "near" OpenFabrics devices are.  '''THIS IS A CHANGE IN DEFAULT
   BEHAVIOR!!''' and still needs to be verified by OpenFabrics vendors
   (I do not have a NUMA machine with an OpenFabrics device that is a
   non-uniform distance from multiple different NUMA nodes).
 * Completely rewrite the OMPI_Affinity_str() routine from the
   "affinity" mpiext extension.  This extension now understands
   hyperthreads; the output format of it has changed a bit to reflect
   this new information.
 * Bunches of minor changes around the code base to update names/types
   from maffinity/paffinity-based names to hwloc-based names.
 * Add some helper functions into the hwloc base, mainly having to do
   with the fact that we have the hwloc data reporting ''all''
   topology information, but sometimes you really only want the
   (online | available) data.

This commit was SVN r26391.
2012-05-07 14:52:54 +00:00
Jeff Squyres
aba398ce09 Per RFC
(http://www.open-mpi.org/community/lists/devel/2012/04/10905.php), set
opal_cache_line_size via hwloc data, if we have it.
opal_cache_line_size will be set to an hwloc-inspired value by the end
of orte_init(), but will always have a safe value to use (i.e., a
default value 128) -- even before opal_init() has completed.

Default to the same value of 128 that Open MPI has used for several
years if a) we have no hwloc data, or b) we weren't able to find L2
objects in the hwloc data.

This commit was SVN r26322.
2012-04-24 17:31:06 +00:00
Ralph Castain
bd8b4f7f1e Sorry for mid-day commit, but I had promised on the call to do this upon my return.
Roll in the ORTE state machine. Remove last traces of opal_sos. Remove UTK epoch code.

Please see the various emails about the state machine change for details. I'll send something out later with more info on the new arch.

This commit was SVN r26242.
2012-04-06 14:23:13 +00:00
Mike Dubman
ff1c84c53f revert previous commit
This commit was SVN r26206.
2012-03-29 14:07:13 +00:00
Mike Dubman
43a5775e8a performance fix: set alignment for openib internal buffers
This commit was SVN r26205.
2012-03-29 14:00:08 +00:00
Jeff Squyres
028f471a20 Using the right env variable name helps!
This commit was SVN r26204.
2012-03-28 17:59:21 +00:00
Jeff Squyres
8a2df3311d Fixes trac:2812: check for env. markers indicating that we're in a
fakeroot.  If so, exit out of the pre-main hook immediately (without
calling functions such as stat, which will be replaced by fakeroot to
things that are not safe to call in a pre-main environment).

This commit was SVN r26203.

The following Trac tickets were found above:
  Ticket 2812 --> https://svn.open-mpi.org/trac/ompi/ticket/2812
2012-03-28 16:41:29 +00:00
Pavel Shamis
39a55df333 Adding exported libevent globabl variables to the opal_rename file.
Otherwise the varables case name conflicts.

This commit was SVN r26201.
2012-03-27 17:26:21 +00:00
Ralph Castain
811413e9bc Correctly handle multiple cpu-set ranges. Correctly support optional binding directives combined with cpu-set.
This commit was SVN r26187.
2012-03-23 14:50:41 +00:00
Ralph Castain
ce0caf7567 Support -cpu-set by binding to the specified cpus in the absence of any other binding directive. Allows users to subdivide nodes for multiple parallel mpirun invocations.
This commit was SVN r26186.
2012-03-23 14:05:52 +00:00
Ralph Castain
6f6930eb66 Resolve infinite loop when -cpu-set is specified
This commit was SVN r26184.
2012-03-23 07:18:58 +00:00
Jeff Squyres
95148f3310 Don't force the use of libpci support in hwloc in the default case --
just let hwloc decide for itself.

This commit was SVN r26178.
2012-03-22 15:28:35 +00:00
Jeff Squyres
3bf038bb1c Per RFC from long ago:
http://www.open-mpi.org/community/lists/devel/2011/10/9784.php

Bring support for a DMTCP CRS module into the trunk.  See
http://dmtcp.sourceforge.net/ for a description of DMTCP.  Thanks to
the contribution from Alex Brick at Northeastern University, and all
the others up there who helped shepherd this into being ready to
submit.

This commit was SVN r26176.
2012-03-22 12:01:46 +00:00
Jeff Squyres
d30bbc2ef9 Fix an old issue: enable hwloc PCI detection except on SuSE 10 64 bit.
Worked with Oracle to verify that hwloc PCI detection is correctly
disabled on the Suse 10/64 bit platform and is enabled by default on
all other platforms.  The --[en|dis]able-hwloc-pci switch is also
available for manual override of the configure decision about hwloc
PCI support.

This commit was SVN r26175.
2012-03-22 11:30:57 +00:00
Jeff Squyres
ab543fce58 We have no common components in opal any more, so we can remove this directory.
This commit was SVN r26169.
2012-03-20 21:21:49 +00:00
Jeff Squyres
0322db7cde Bring over r4402 from hwloc trunk.
This commit was SVN r26165.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r4402
2012-03-19 16:39:54 +00:00
Jeff Squyres
aeca190744 Refs trac:3046: feedback from Brian -- don't set DYLD_LIBRARY_PATH.
This commit was SVN r26108.

The following Trac tickets were found above:
  Ticket 3046 --> https://svn.open-mpi.org/trac/ompi/ticket/3046
2012-03-07 13:12:22 +00:00
Ralph Castain
366f9d1518 Add some missing localities to the hwloc pretty-print, fix pmi modex
This commit was SVN r26105.
2012-03-06 06:21:10 +00:00
Jeff Squyres
f84c16bb65 Fixes trac:3043. Looks like some of the improvements to the hwloc132
hwloc component weren't reverse applied to the external hwloc
component.  Additionaly, if we add stuff to LDFLAGS/LIBS, we also may
need to append (DY)LD_LIBRARY_PATH (here in this configure process
only), otherwise future configure tests may fail because they can't
find libhwloc.so (e.g., if you --with-hwloc=/some/path, we need to add
/some/path/lib to (DY)LD_LIBRARY_PATH).

This commit was SVN r26082.

The following Trac tickets were found above:
  Ticket 3043 --> https://svn.open-mpi.org/trac/ompi/ticket/3043
2012-03-02 20:15:07 +00:00
Jeff Squyres
e77653511b Bring in upstream hwloc v1.3 branch SVN commit r4345
This commit was SVN r26048.

The following SVN revision numbers were found above:
  r4345 --> open-mpi/ompi@b6c2a5b602
2012-02-24 13:57:18 +00:00
Jeff Squyres
f8f7f6b3ef Bring over upstream hwloc fix r4340
This commit was SVN r26037.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r4340
2012-02-23 20:44:21 +00:00
Jeff Squyres
d0df08c953 Bring in upstream hwloc SVN r4319.
This commit was SVN r25987.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r4319
2012-02-21 15:39:21 +00:00
Jeff Squyres
9f7b1d76cd Apply upstream hwloc fix; hwloc SVN r4314
This commit was SVN r25986.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r4314
2012-02-21 15:10:40 +00:00
Brian Barrett
2d4bbfb083 Need to make sure that only the winning component sets the include file.
Easiest solution is to set the include in a POST_CONFIG macro based on
whether the configure system says the component was selected or not.

This commit was SVN r25968.
2012-02-20 16:45:54 +00:00
Brian Barrett
628aa0d84d Altix check needs to occur before linux check.
This commit was SVN r25967.
2012-02-20 16:07:41 +00:00
Ralph Castain
534d70025f Cleanup the detection of process binding during mpi_init. There are several cases that need to be checked:
1. no binding support - indicated by a negative return code from get_cpubind

2. binding supported, but not bound - the bitset returned by get_cpubind is the same as the available cpuset

3. binding supported and bound - bitset from get_cpubind is a subset of available cpuset

4. only one cpu is available - in this case, get_cpubind matches the available cpuset, but we are effectively bound

This commit was SVN r25957.
2012-02-17 21:18:53 +00:00
Jeff Squyres
72e44cfefe Fixes trac:2951: make .../hwloc/include/autogen/config.h not be included
in the tarball.  Thanks to Paul Hargrove for the fix.

This commit was SVN r25952.

The following Trac tickets were found above:
  Ticket 2951 --> https://svn.open-mpi.org/trac/ompi/ticket/2951
2012-02-17 14:27:27 +00:00
Jeff Squyres
eb47a97025 After a bunch more back-n-forth with Paul Hargrove, hopefully this
visibility stuff will now be fixed!

This commit was SVN r25944.
2012-02-17 00:09:32 +00:00
Jeff Squyres
a055c5662c This is already ompi_ignore'd -- let's remove it.
This commit was SVN r25943.
2012-02-16 22:58:58 +00:00
Ralph Castain
7fd3ee6662 Ensure we see the correct config.h, and silence the warnings caused by duplicate defines
This commit was SVN r25938.
2012-02-16 02:50:57 +00:00
Jeff Squyres
14457accd7 Add hwloc 1.3.2 and ompi_ignore hwloc 1.3.1 (with the intent of
removing 1.3.1 in the near future).

This commit was SVN r25927.
2012-02-14 21:01:36 +00:00
Jeff Squyres
63a96e92b5 In a recent v1.5 branch issue, it took a while to figure out that
paffinity hwloc was returning "NOT_SUPPORTED" when the real problem
was that the underlying hwloc simply hadn't been initialized yet.  So
let's clearly delineate this case: return OPAL_ERR_NOT_INITIALIZED if
the underlying hwloc is not initialized.

This commit was SVN r25902.
2012-02-10 18:29:52 +00:00
Jeff Squyres
8d0bc199df hwloc131_module.c isn't necessary -- there's no module.
This commit was SVN r25901.
2012-02-10 18:09:19 +00:00
Jeff Squyres
6557d74e01 Make sure we get the entire hwloc tree, including IO devices.
This commit was SVN r25887.
2012-02-09 16:59:38 +00:00
Jeff Squyres
6dde3b6d86 Remove the old hwloc component; we bumped up to 1.3.1 a long time ago.
This commit was SVN r25885.
2012-02-09 12:27:00 +00:00
Ralph Castain
a3ab70c53f Correctly parse socket:core syntax in rankfile
This commit was SVN r25848.
2012-02-01 01:50:05 +00:00
Jeff Squyres
ba1b02dea0 Don't install this extra libevent file.
This commit was SVN r25808.
2012-01-27 20:38:00 +00:00
Jeff Squyres
9e9b06d9f7 Fixes trac:2844: ensure to take the value of --with(out)-memory-manager
into account when configuring the components of the faramework.  If
--without-memory-manager was given, then we really don't want any
memory managers to be used.

This commit was SVN r25807.

The following Trac tickets were found above:
  Ticket 2844 --> https://svn.open-mpi.org/trac/ompi/ticket/2844
2012-01-27 18:05:48 +00:00
Ralph Castain
3f31feee6f Handle the case where a user's rankfile specifies only cpus, and not socket:cpu pairs.
This commit was SVN r25803.
2012-01-27 12:21:45 +00:00
Shiqing Fan
debe91aefa Change the syntax to be compatible with C++ compiler, as this has to be compiled as C++ on Windows. Thanks Ralph.
This commit was SVN r25785.
2012-01-26 14:53:45 +00:00
Jeff Squyres
e162945090 This script is generated and should not be in SVN.
This commit was SVN r25778.
2012-01-25 16:38:25 +00:00
Jeff Squyres
40e23e3979 Refs trac:2952: temporarily turn off hwloc PCI support because it causes a
problem on SuSE 10 (which might be related to Oracle's dual-bitness
builds, but we aren't completely sure yet).

So just turn it off for now, and bring this over to v1.5.  Find a
proper fix (that enables pci support properly) for trunk/v1.7 later.

This commit was SVN r25769.

The following Trac tickets were found above:
  Ticket 2952 --> https://svn.open-mpi.org/trac/ompi/ticket/2952
2012-01-24 15:07:41 +00:00
Jeff Squyres
6c6d19f5f5 We always want to add HWLOC_EMBEDDED_LIBS to the wrapper flags. It'll
either be empty or have meaningful stuff in it.

This commit was SVN r25761.
2012-01-21 02:57:17 +00:00
Jeff Squyres
6cad1f34e0 Bring r4182 from the hwloc v1.3 branch: fix static linking issues with
libhwloc_embedded.la.

This commit was SVN r25760.

The following SVN revision numbers were found above:
  r4182 --> open-mpi/ompi@b240395d9a
2012-01-21 02:56:42 +00:00
Jeff Squyres
878a0365be Bring over r4094 from the hwloc v1.3 branch: add missing HWLOC_PCI_LIBS
This commit was SVN r25759.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r4094
2012-01-21 02:17:07 +00:00
Jeff Squyres
1d15d39fb8 Remove the libnuma component; the hwloc maffinity component does everything that this compnent used to do. Good riddance!
This commit was SVN r25749.
2012-01-19 23:53:05 +00:00
Jeff Squyres
45636b0558 Make hwloc 1.3.1 the default. Will likel remove 1.2.2ompi shortly.
This commit was SVN r25748.
2012-01-19 23:18:40 +00:00
Jeff Squyres
1a73ba6ce8 Note the upstream patches that we have in addition to stock hwloc 1.3.1.
This commit was SVN r25708.
2012-01-11 00:22:34 +00:00
Jeff Squyres
4243cb7af0 Bring over hwloc r4102 and r4104 for some upstream patches.
This commit was SVN r25707.

The following SVN revision numbers were found above:
  r4102 --> open-mpi/ompi@8961ca568d

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r4104
2012-01-11 00:21:47 +00:00
Jeff Squyres
50e5b0937c Add hwloc 1.3.1, but it is not yet the default -- it is currently
.ompi_ignored to allow other developers to test with it.  It is
expected that we'll remove the .ompi_ignore here soon, and
simultaneously remove the hwloc 1.2.2ompi component.

There was one very minor patch added to stock hwloc 1.3.1 in
hwloc/config/hwloc.m4:

{{{
--- hwloc-1.3.1/config/hwloc.m4	2011-12-14 
+++ ompi3/opal/mca/hwloc/hwloc131/hwloc/config/hwloc.m4
@@ -583,6 +583,7 @@
         ])
     fi
     AC_SUBST(HWLOC_PCI_LIBS)
+    HWLOC_LIBS="$HWLOC_LIBS $HWLOC_PCI_LIBS"
     # If we asked for pci support but couldn't deliver, fail
     AS_IF([test "$enable_pci" = "yes" -a "$hwloc_pci_happy" = "no"],
           [AC_MSG_WARN([Specified --enable-pci switch, but could
	   not])
}}}

This will be pushed upstream to hwloc.

This commit was SVN r25706.
2012-01-10 23:38:14 +00:00
Samuel Gutierrez
0ca6603fa0 remove some unused cruft in shmem. minor common sm cleanup.
This commit was SVN r25665.
2011-12-16 22:43:55 +00:00
Jeff Squyres
1d3dc0af28 Gah! opal_shmem_base_register_params() ''wasn't'' added for the mmap
on NFS warning -- it was already there!  So put it back so that it can
register base_verbose and RUNTIME_QUERY_hint.

This commit was SVN r25663.
2011-12-15 21:14:34 +00:00
Jeff Squyres
9cef715194 Updates to r25652 -- put this MCA param in the shmem/mmap component.
No need for it to be in the base (we mistakenly thought it was used in
multiple shmem components).

This commit was SVN r25662.

The following SVN revision numbers were found above:
  r25652 --> open-mpi/ompi@7e223b5799
2011-12-15 20:41:14 +00:00
Ralph Castain
7e223b5799 Okay, okay...stop the whining! Put the mca param registration in the shmem base.
This commit was SVN r25652.
2011-12-14 22:25:32 +00:00
Ralph Castain
4303958968 Allow users to silence warning
This commit was SVN r25650.
2011-12-14 21:50:34 +00:00
Shiqing Fan
a58e4ae809 Add a missing .windows file into the tarball.
This commit was SVN r25638.
2011-12-14 13:29:26 +00:00
Jeff Squyres
efd8106d0a Fix typo in help message
This commit was SVN r25628.
2011-12-13 21:55:48 +00:00
Samuel Gutierrez
8b9bb66b1c fixes shared memory in windows.
This commit was SVN r25623.
2011-12-12 22:16:31 +00:00
Nathan Hjelm
239e9c8740 clean up tabs
This commit was SVN r25622.
2011-12-12 20:54:14 +00:00
Nathan Hjelm
885d5cbcf8 enable ptmalloc with using uGNI
This commit was SVN r25621.
2011-12-12 20:52:51 +00:00
Samuel Gutierrez
de8d3a4f79 more windows shmem updates. maybe this is closer..?
This commit was SVN r25607.
2011-12-09 06:34:06 +00:00
Samuel Gutierrez
989d75bfec some more windows update... this may really break things :-(.
This commit was SVN r25592.
2011-12-08 00:10:09 +00:00
Samuel Gutierrez
dcb965e60e Some shmem Windows updates. I'm not sure if this makes a difference. Shiqing - can you please test?.
This commit was SVN r25591.
2011-12-07 21:56:27 +00:00
Josh Hursey
076db435cd * Fixes trac:2807 : Improve the BLCR configure option so that it checks if the {{{--with-blcr}}} option is specified, but not {{{--with-ft=cr}}}, then configure returns an error (since this is clearly not what the user intended).
This commit was SVN r25590.

The following Trac tickets were found above:
  Ticket 2807 --> https://svn.open-mpi.org/trac/ompi/ticket/2807
2011-12-07 21:36:03 +00:00
Josh Hursey
58938b2f50 * Clarified show help when CRS component cannot be loaded.
* Fixes trac:2329 : Improves the error message, and ensures opal-restart will not segv in opal_finalize.

This commit was SVN r25586.

The following Trac tickets were found above:
  Ticket 2329 --> https://svn.open-mpi.org/trac/ompi/ticket/2329
2011-12-07 14:58:08 +00:00
Ralph Castain
6fefe236a4 Warn users if they set opal_paffinity_alone, either to true or false, that this parameter is no longer functional - they must use the --bind-to option and its corresponding mca param.
This commit was SVN r25567.
2011-12-03 01:10:52 +00:00
Jeff Squyres
93a797cc33 r25511 didn't fully fix the OPAL_CHECK_VISIBILITY issue; we also need
to -I the opal/config directory so that autoregen can find our .m4
files. 

This commit was SVN r25564.

The following SVN revision numbers were found above:
  r25511 --> open-mpi/ompi@5751c45916
2011-12-02 19:45:39 +00:00
Jeff Squyres
ecf6ba910c Silence a few icc warnings and about mixing enums with other types.
This commit was SVN r25560.
2011-12-02 13:18:54 +00:00
Jeff Squyres
6fbbfd0f7a Gah! r25545 acidentally included ''waaaay'' more stuff than it was
supposed to.  I.e., half-baked/not complete stuff.

This commit backs out all of r25545.  Sorry folks!

This commit was SVN r25546.

The following SVN revision numbers were found above:
  r25545 --> open-mpi/ompi@7f9ae11faf
2011-11-29 23:24:52 +00:00
Jeff Squyres
7f9ae11faf Per http://www.open-mpi.org/community/lists/users/2011/11/17862.php,
to make MPI_IN_PLACE (and other sentinel Fortran constants) work on OS
X, we need to use the following compiler (linker) flag:

    -Wl,-commons,use_dylibs 

So if we're compiling on OS X, test to see if that flag works with the
compiler.  If so, add it to the wrapper FFLAGS and FCFLAGS (note that
per a future update, we'll only have one Fortran compiler anyway).

Fixes trac:1982.  

This commit was SVN r25545.

The following Trac tickets were found above:
  Ticket 1982 --> https://svn.open-mpi.org/trac/ompi/ticket/1982
2011-11-29 23:05:54 +00:00
Samuel Gutierrez
375162c693 this commit fixes a few things. 1. silence warning in common sm. 2. remove unneeded config code in common sm. 3. move opal_shmem_base_close to a better place in opal_finalize. 4. fix opal_path_nfs output.
This commit was SVN r25518.
2011-11-28 23:41:19 +00:00
Samuel Gutierrez
b4edf0ff5c getting ready for 1.5 port of the shared memory enhancements. remove some unused/unneeded stuff and minor style update.
This commit was SVN r25513.
2011-11-28 16:08:32 +00:00
George Bosilca
5751c45916 Actually ... OMPI is lacking visibility, as it was all moved down in OPAL.
This commit was SVN r25511.
2011-11-28 04:27:06 +00:00
Terry Dontje
1f53b32216 This commit fixes trac:2917. By using the cleaned up version of check_visibility that is in the hwloc trunk repo.
This commit was SVN r25495.

The following Trac tickets were found above:
  Ticket 2917 --> https://svn.open-mpi.org/trac/ompi/ticket/2917
2011-11-22 00:01:09 +00:00
George Bosilca
61f273b987 Do not tolerate uninitialized variables.
This commit was SVN r25489.
2011-11-18 10:19:24 +00:00
Samuel Gutierrez
15249dfc01 rename some variables following ompi's naming convention.
This commit was SVN r25487.
2011-11-17 19:22:58 +00:00
Samuel Gutierrez
47499d1d3d fixes CID 2256.
This commit was SVN r25486.
2011-11-17 18:06:28 +00:00
Samuel Gutierrez
c1012f502f added backing file relocation capability in shmem mmap. two new mca
parameters control its behavior (shmem_mmap_relocate_backing_file and
shmem_mmap_backing_file_base_dir).

This commit was SVN r25480.
2011-11-16 01:38:23 +00:00
Ralph Castain
6310361532 At long last, the fabled revision to the affinity system has arrived. A more detailed explanation of how this all works will be presented here:
https://svn.open-mpi.org/trac/ompi/wiki/ProcessPlacement

The wiki page is incomplete at the moment, but I hope to complete it over the next few days. I will provide updates on the devel list. As the wiki page states, the default and most commonly used options remain unchanged (except as noted below). New, esoteric and complex options have been added, but unless you are a true masochist, you are unlikely to use many of them beyond perhaps an initial curiosity-motivated experimentation.

In a nutshell, this commit revamps the map/rank/bind procedure to take into account topology info on the compute nodes. I have, for the most part, preserved the default behaviors, with three notable exceptions:

1. I have at long last bowed my head in submission to the system admin's of managed clusters. For years, they have complained about our default of allowing users to oversubscribe nodes - i.e., to run more processes on a node than allocated slots. Accordingly, I have modified the default behavior: if you are running off of hostfile/dash-host allocated nodes, then the default is to allow oversubscription. If you are running off of RM-allocated nodes, then the default is to NOT allow oversubscription. Flags to override these behaviors are provided, so this only affects the default behavior.

2. both cpus/rank and stride have been removed. The latter was demanded by those who didn't understand the purpose behind it - and I agreed as the users who requested it are no longer using it. The former was removed temporarily pending implementation.

3. vm launch is now the sole method for starting OMPI. It was just too darned hard to maintain multiple launch procedures - maybe someday, provided someone can demonstrate a reason to do so.

As Jeff stated, it is impossible to fully test a change of this size. I have tested it on Linux and Mac, covering all the default and simple options, singletons, and comm_spawn. That said, I'm sure others will find problems, so I'll be watching MTT results until this stabilizes.

This commit was SVN r25476.
2011-11-15 03:40:11 +00:00
Shiqing Fan
fc46ed6438 Use the new libevent On Windows.
This commit was SVN r25462.
2011-11-09 14:05:35 +00:00
George Bosilca
bd55a19db4 Disable vector optimization for ICC v12.1.0 release 2011.6.233.
This commit was SVN r25461.
2011-11-08 21:23:30 +00:00
Jeff Squyres
78538b701d Turns out that we're not even using that $includedir properly, so just
remove it.

This commit was SVN r25458.
2011-11-08 03:18:09 +00:00
Ralph Castain
a931c5b1eb Redo a patch from late last night that replaces libevent 2.0.7 with libevent 2.0.13 as our default event library. Cleanup the libevent renames to correctly state 2013 as our new version.
This commit was SVN r25457.
2011-11-08 01:36:15 +00:00
Ralph Castain
a3ce355a60 Revert r25453 and r25450 until we can fix the libevent2013 configure code - still not getting the includedir to eval correctly.
This commit was SVN r25454.

The following SVN revision numbers were found above:
  r25450 --> open-mpi/ompi@7f7d5c4f1f
  r25453 --> open-mpi/ompi@c9fe8c32e2
2011-11-07 16:23:44 +00:00
Ralph Castain
c9fe8c32e2 Fix a bug in the configure.m4 to ensure we disable unused libevent modes. Edit their Makefile.am to remove bufferevent support since we don't use it either.
Sorry for mid-day correction.

This commit was SVN r25453.
2011-11-07 15:21:29 +00:00
Nathan Hjelm
7f7d5c4f1f RFC: upgrade to libevent 2.0.13 (removing 2.0.7) timeout. Removed libevent 2.0.7
This commit was SVN r25450.
2011-11-07 04:32:36 +00:00
Jeff Squyres
f08b8bf2d4 Per this thread:
http://www.open-mpi.org/community/lists/devel/2011/10/9878.php

I am making a final decision to decide the behavior of what happens
when an MCA parameter is re-registered and changes types.  In
developer builds (i.e., OPAL_ENABLE_DEBUG==1), a show_help message
will be displayed.  In all builds, an error status will be returned.
Specifically, the logic looks like this:

{{{
    if (detect_re-registration_with_type_change) {
#if OPAL_ENABLE_DEBUG
        opal_show_help(...);
#endif
        return OPAL_ERR_VALUE_OUT_OF_BOUNDS;
    }
}}}

If someone would like to change this behavior, they are welcome to do
so.  :-) I am committing this so that ''some'' action occurs (rather
than talking about the issue and then nothing happens).

This commit was SVN r25432.
2011-11-04 14:16:49 +00:00
Jeff Squyres
886a9d589b Custom patch from Brice for the hwloc-1.2.2ompi distro, per an issue
that Chris Yeoh/IBM found.  See the thread below for more info:

  http://www.open-mpi.org/community/lists/hwloc-devel/2011/11/2521.php

This commit was SVN r25429.
2011-11-03 14:53:22 +00:00
Jeff Squyres
4fe26b0392 Fix some minor memory leaks
This commit was SVN r25410.
2011-11-01 20:22:26 +00:00
Ralph Castain
4368199c86 Missing include
This commit was SVN r25402.
2011-10-31 13:39:57 +00:00
Ralph Castain
96332a2859 Fix typo
This commit was SVN r25400.
2011-10-30 13:23:42 +00:00
Ralph Castain
71ed8e3cd3 Bring back the local node's binding capabilities along with its topology. Clean up indentation.
This commit was SVN r25399.
2011-10-30 13:20:16 +00:00
Ralph Castain
7ba4675adf Bring over some useful utilities and definitions for working with hwloc inside ORTE/OMPI. Cache frequently computed info to save processing time when handling multiple nodes with the same topology. Deal with available cpus as defined by online vs allowed vs user-specified limits. Help deal with hwloc's unfortunate decision to lump all caches in the same object type.
This commit was SVN r25393.
2011-10-29 14:58:58 +00:00
Jeff Squyres
6092b50ebb Fix the cases where the default values of MCA params were not always
handled properly when MCA parameters are re-registered and their types
change.  Specifically, this case was broken:

 1. Register an int MCA param with a non-zero default value
 1. Re-register the same MCA param as a string with a NULL default value

The 2nd step would cause a segv because the first int default value
wasn't being reset properly.  Here's sample code that shows the issue:

{{{
{
    int ibogus;
    char *sbogus;
    opal_init(&argc, &argv);
    mca_base_param_reg_int_name("type", "name", "help", false, false, 3, &ibogus);
    printf("Ibogus: %d\n", ibogus);
    mca_base_param_reg_string_name("type", "name", "help", false, false, NULL, &sbogus);
    printf("Sbogus: %s\n", (NULL == sbogus) ? "NULL" : sbogus);
    exit(0);
}
}}}

This commit fixes the problem from the sample code above as well as
the a similar issue for file-set MCA params and override values.  It
also resets default values for MCA params initially registered as a
string but then re-registered as an int.

This commit was SVN r25392.
2011-10-29 12:29:31 +00:00
Ralph Castain
21d45b0807 Just some cleanup in case of error
This commit was SVN r25387.
2011-10-29 01:55:19 +00:00
Ralph Castain
a7cbc25658 Minor cleanups - check hwloc returns everywhere. Thanks to Chris Yeoh for pointing this out.
This commit was SVN r25360.
2011-10-24 14:05:26 +00:00
Shiqing Fan
5711414eb7 Fix Windows build
This commit was SVN r25351.
2011-10-21 14:46:58 +00:00
Jeff Squyres
cbafea8f69 Add a DEPENDENCIES line so that if you edit something down in the
hwloc tree, it'll get picked up by the component (and therefore by
libopen-pal).

Thanks to Terry for finding the problem.

This commit was SVN r25349.
2011-10-21 11:39:52 +00:00
Ralph Castain
b44f8d4b28 Complete implementation of the ess.proc_get_locality API. Up to this point, the API was only capable of telling if the specified proc was sharing a node with you. However, the returned value was capable of telling you much more detailed info - e.g., if the proc shares a socket, a cache, or numa node. We just didn't have the data to provide that detail.
Use hwloc to obtain the cpuset for each process during mpi_init, and share that info in the modex. As it arrives, use a new opal_hwloc_base utility function to parse the value against the local proc's cpuset and determine where they overlap. Cache the value in the pmap object as it may be referenced multiple times.

Thus, the return value from orte_ess.proc_get_locality is a 16-bit bitmask that describes the resources being shared with you. This bitmask can be tested using the macros in opal/mca/paffinity/paffinity.h

Locality is available for all procs, whether launched via mpirun or directly with an external launcher such as slurm or aprun.

This commit was SVN r25331.
2011-10-19 20:18:14 +00:00
Rainer Keller
ec6ac33b75 - On Linux x86-64 with intel compiler v12.1, any ompi-app fails before
calling main():
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
xxx:~/openmpi-1.5.4/COMPILE-intel-12.1.0> which ompi_info
~/openmpi-1.5.4/COMPILE-intel-12.1.0/usr/bin/ompi_info
xxx:~/openmpi-1.5.4/COMPILE-intel-12.1.0> ompi_info
Segmentation fault
xxx:~/openmpi-1.5.4/COMPILE-intel-12.1.0> gdb usr/bin/ompi_info
...
(gdb) run
Starting program:
...
Program received signal SIGSEGV, Segmentation fault.
opal_memory_ptmalloc2_int_malloc (av=0x7ffff7fe83d8, bytes=4102) at
../../../../../opal/mca/memory/linux/malloc.c:4080
4080          /* remove from unsorted list */
(gdb) where
#0  opal_memory_ptmalloc2_int_malloc (av=0x7ffff7fe83d8, bytes=4102) at
../../../../../opal/mca/memory/linux/malloc.c:4080
#1  0x00007ffff7c232b9 in opal_memory_linux_malloc_hook
(sz=140737354040280, caller=0x1006) at
../../../../../opal/mca/memory/linux/hooks.c:687
#2  0x0000003dd96a6871 in __alloc_dir () from /lib64/libc.so.6
#3  0x0000003ddfa053cd in ?? () from /usr/lib64/libnuma.so.1
#4  0x0000003dd8e0e445 in _dl_init_internal () from /lib64/ld-linux-x86-64.so.2
...
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    A lot of combinations and trials have been done, yet to no avail.
    Intel v11.0 worked...

    Thanks to Hubert Haberstock (Intel) providing the hint in:
    http://software.intel.com/en-us/forums/showthread.php?t=87132

    This was tested on openmpi-1.5.4 and therefore should
    cmr:v1.5

This commit was SVN r25290.
2011-10-14 20:47:08 +00:00
Ralph Castain
69a0882207 Correctly setup hwloc when passing a topology from an external source
This commit was SVN r25277.
2011-10-12 21:34:46 +00:00
Jeff Squyres
ff97b57c90 Change the names to be slightly more descriptive.
This commit was SVN r25271.
2011-10-12 16:07:09 +00:00
Jeff Squyres
951c745590 We always have hwloc xml support (now that it's built into to hwloc
without needing libxml2).  So OPAL_HAVE_HWLOC_XML is no longer
necessary.  

This commit was SVN r25263.
2011-10-11 20:20:59 +00:00
Jeff Squyres
e4f8b662a1 Remove this component; it was wholly superceded by hwloc122ompi a
little while ago.

This commit was SVN r25261.
2011-10-11 20:13:33 +00:00
Jeff Squyres
7fd3a7f696 Remove no-longer-true comment. Process-wide memory affinity policy is
set by calling opal_hwloc_base_set_process_membind_policy().

This commit was SVN r25260.
2011-10-11 20:00:45 +00:00
Ralph Castain
8c4512a994 Fix the verbose output for caches (again) so they are properly labeled, pending adoption of the upstream patch we supplied.
This commit was SVN r25251.
2011-10-11 05:54:26 +00:00
Swen Boehm
08b4322a1a patched the lex files to not issue the following compiler warning:
'yyunput' defined but not used

This commit was SVN r25246.
2011-10-10 18:13:04 +00:00
George Bosilca
649af6c925 Enumerated mixed with another type (int) is tolerated but
easily fixable.

This commit was SVN r25241.
2011-10-09 03:54:52 +00:00
George Bosilca
9d68d7c0c8 iFix a bunch of warnings.
This commit was SVN r25227.
2011-10-03 18:46:49 +00:00
George Bosilca
b4c076ad28 Remove an unused function.
This commit was SVN r25226.
2011-10-03 18:46:27 +00:00
Jeff Squyres
34deb0db97 Sync with final hwloc 1.2.2 release
This commit was SVN r25221.
2011-10-03 14:12:38 +00:00
Jeff Squyres
6a32aa4a04 Oops -- it looks like we ''do'' still use this variable in the
trunk... 

This commit was SVN r25203.
2011-09-28 12:12:37 +00:00
Jeff Squyres
bc3e213a69 After fixing an svn/hg kerfluffle, there's a few files left over from
last night's hwloc/paffinity/maffinity minor update.  Nothing huge;
just a little cleanup.

This commit was SVN r25202.
2011-09-28 11:46:28 +00:00
Jeff Squyres
9fa2130cfb Fix typo that prevents VPATH builds.
This commit was SVN r25201.
2011-09-28 11:29:12 +00:00
Jeff Squyres
970a75a7b6 Update to a custom OMPI roll of hwloc v1.2.2. Upgrade the configry to
match similar stuff in the event framework; only add CPPFLAGS /
LDFLAGS / LIBS / and WRAPPER_EXTRA_* of the same for the one, single,
winning component (because this framework is compile-time,
one-of-many).

This commit was SVN r25199.
2011-09-27 23:54:09 +00:00
Jeff Squyres
3d61d0f357 Fix up some long-latency bugs in the MCA even framework configury that
only became evident when there was more than one event component.

The libevent2013 component is still ompi_ignore'd for most developers.

This commit was SVN r25198.
2011-09-27 23:18:07 +00:00
Jeff Squyres
d4603f080d Refs trac:2854.
Since hwloc has a dynamic bitmap size, it could actually have bits set
that will not fit in the paffinity mask.  We already made sure that we
didn't overrun the paffinity mask; now also set the return value to
OPAL_ERR_VALUE_OUT_OF_BOUNDS (wow, we really thought of everything
with those error codes, eh?) if the hwloc bitmap has bits set higher
than what will fit into the paffinity bitmask.

This commit was SVN r25179.

The following Trac tickets were found above:
  Ticket 2854 --> https://svn.open-mpi.org/trac/ompi/ticket/2854
2011-09-24 13:52:27 +00:00
Jeff Squyres
57323570e3 These calls were mistakenly added in r25164.
This commit was SVN r25176.

The following SVN revision numbers were found above:
  r25164 --> open-mpi/ompi@51129cc2a8
2011-09-22 18:20:30 +00:00
Jeff Squyres
82c93611e6 Fix some problems with the libevent and hwloc frameworks:
* change components from setting <framework>_base_include to
   opal_<framework>_<component>_include; the framework m4 will figure
   out the winning component and pick the right "include" shell
   variable.  Ditto for the other shell variables (cppflags, ldflags,
   etc.). 
 * misc fixes to hwloc/external
 * add a bunch of missing "opal_" prefixes to shell variables
 * add a few more / update a few comments in framework m4's

This commit was SVN r25174.
2011-09-21 23:06:13 +00:00
Ralph Castain
a818101d19 Silence compiler warning
This commit was SVN r25172.
2011-09-21 16:39:59 +00:00
Shiqing Fan
3e0ee394ef Select only one libevent component.
This commit was SVN r25169.
2011-09-20 16:10:01 +00:00
Shiqing Fan
4caed984ed Need to exclude another file for windows build.
This commit was SVN r25168.
2011-09-20 16:09:03 +00:00
Ralph Castain
51129cc2a8 If built without hwloc xml support, we cannot currently pass the local topology from the daemon to an MPI app. This makes it impossible to set affinity, for example. In this case, have the app get its own copy of the topology at startup.
For safety sake, protect hwloc-based affinity modules from NULL topology

This commit was SVN r25164.
2011-09-20 14:46:55 +00:00
Ralph Castain
052ccd4b1e Set ignores
This commit was SVN r25163.
2011-09-20 13:47:05 +00:00
Ralph Castain
45396d8f9c Add missing files
This commit was SVN r25162.
2011-09-20 13:37:22 +00:00
Nathan Hjelm
7cd8f21b7f add libevent 2.0.13 module
This commit was SVN r25161.
2011-09-20 00:13:05 +00:00
Jeff Squyres
9db4542c2b Move maffinity_base_alloc_policy and
maffinity_base_bind_failure_action MCA params to the hwloc base
(hwloc_base_alloc_polocy and hwloc_base_bind_failure_action).  Since
these MCA parameters were never on a release branch, I'm just
moving/renaming them outright and not leaving aliases to the old
names.

Note that some upper layer needs to call
opal_hwloc_base_set_process_membind_policy() to set the
set-by-MCA-param process-wide memory affinity policy.  We can't do
this automatically during hwloc_base_open() because, for reasons
described elsewhere, opal_hwloc_topology is not automatically filled
during hwloc_base_open() (in short: potential scalability issues when
launching many MPI processes simultaneously on a single machine, for
example).

This commit was SVN r25156.
2011-09-19 16:10:37 +00:00
Jeff Squyres
dc70100cee In reviewing CMR #2866, it was noticed that the maffinity/hwloc and
paffinity/hwloc components were still calling hwloc_topology_init/load
themselves, and not using the opal_hwloc_topology.  Doh!

This commit fixes that -- these 2 components no longer have their own
copy of the topology tree; they just use opal_hwloc_topology.

This commit was SVN r25151.
2011-09-17 13:13:36 +00:00
Jeff Squyres
ecd603256a * Rename opal_hwloc_components to opal_hwloc_base_components
* Fix some comments

This commit was SVN r25150.
2011-09-17 11:54:36 +00:00
Jeff Squyres
4a2cf81c6f Fixes to ensure that dependent libraries are carried forward from the embedded hwloc
This commit was SVN r25140.
2011-09-13 22:43:39 +00:00
Jeff Squyres
d6682523f6 Put in proper basename so that "make dist" can find it.
This commit was SVN r25135.
2011-09-13 11:09:56 +00:00
Jeff Squyres
4771c36061 * With some m4 trickery, if no form of --with-hwloc is specified on
the command line, hwloc is just like any other external dependency
   in OMPI: if we find it, we'll use it. If we don't find it, we'll
   ignore it.  See comments in opal/mca/hwloc/configure.m4 for an
   explanation. 
 * Fix some copy-n-paste errors in opal/mca/hwloc/configure.m4
   w.r.t. flags coming in from the winning component.
 * Add another line in ompi_info's output about whether hwloc support
   is included or not.

This commit was SVN r25134.
2011-09-13 00:39:14 +00:00
Jeff Squyres
7dc352a328 Add some notes about maintaining the hwloc framework.
This commit was SVN r25132.
2011-09-12 19:40:18 +00:00
Jeff Squyres
c5bfa09574 Fixes from Brice Goglin, post hwloc v1.2.1 for AMD Magny-Cours. See
http://www.open-mpi.org/community/lists/users/2011/09/17164.php.

This commit was SVN r25131.
2011-09-12 19:03:48 +00:00
Shiqing Fan
b61eed801f Fix the problem of building hwloc on Windows. Temporarily not using it for Windows.
This commit was SVN r25128.
2011-09-12 13:55:34 +00:00
Ralph Castain
6460fe5480 Silence warning
This commit was SVN r25127.
2011-09-12 13:32:21 +00:00
Ralph Castain
92c7372e20 Per the RFC from Jeff, move hwloc from opal/mca/common to its own static framework ala libevent. Have ORTE daemons collect the topology info at startup and, if --enable-hwloc-xml is set, send that info back to the HNP for later use. The HNP only retains unique topology "templates" to reduce memory footprint. Have the daemon include the local topology info in the nidmap buffer sent to each app so the apps don't all hammer the local system to discover it for themselves.
Remove the sysinfo framework as hwloc replaces that functionality.

This commit was SVN r25124.
2011-09-11 19:02:24 +00:00
Ralph Castain
b2971df7df Ensure we loop over all cpu's.
Thanks to Nadia Derbey for spotting the error.

This commit was SVN r25102.
2011-08-29 14:24:34 +00:00
Eugene Loh
55a7b474dd Change a stray __volatile to __volatile__.
This commit was SVN r25092.
2011-08-26 15:36:10 +00:00
Jeff Squyres
495ceef60d Upgrade hwloc to v1.2.1.
This commit was SVN r25088.
2011-08-26 13:14:26 +00:00
Ralph Castain
df28c63164 If we are on a single processor, then we are effectively bound - so have the macro correctly report it.
Thanks to Pascal Deveze for the patch.

This commit was SVN r25068.
2011-08-22 16:28:40 +00:00
Shiqing Fan
3af7c9f7bb Complete the MinGW build support on Windows.
This commit was SVN r25048.
2011-08-15 09:47:23 +00:00
Jeff Squyres
1cbfb53801 r24976 wasn't quite right -- you now actually get a warning if you
specify btl_tcp_if_include because btl_tcp_if_exclude is defaulted to
the loopback devices.

This commit does a few things:

 * Introduce a new OPAL MCA base function:
   mca_base_param_check_exclusive_string().  It checks to see that the
   ''user'' does not set two MCA parameters that are mutually
   exclusive by checking the source of those MCS param values.
 * Use the above function in many BTLs (and the OOB TCP) to ensure
   that <foo>_if_include and <foo>_if_exclude are not both specified
   ''by the user''.
 * Re-arrange many of these BTLs to move their MCA registration code
   into a separate component_register() function (vs. the
   component_open() function).

This code has been nominally reviewed and checked by Ralph, George,
Terry, and Shiqing.

This commit was SVN r25043.

The following SVN revision numbers were found above:
  r24976 --> open-mpi/ompi@8f4ac54336
2011-08-10 17:24:36 +00:00
Samuel Gutierrez
bb791eaa23 change opal_output_verbose level to be consistent with shmem base.
This commit was SVN r25036.
2011-08-09 21:34:12 +00:00
Samuel Gutierrez
b144c8c343 silence warning in shmem posix run-time test when err is not equal to EEXIST.
This commit was SVN r25034.
2011-08-09 21:13:28 +00:00
Jeff Squyres
ba432393d4 Remove some really old (internal) kruft that never ended up getting
used. 

This commit was SVN r24988.
2011-08-04 15:24:37 +00:00
Samuel Gutierrez
adde221413 use memcpy in ds_copy.
This commit was SVN r24942.
2011-07-25 17:16:29 +00:00
Ralph Castain
8a7f9f8997 Hide libevent symbols when internal thread support enabled
This commit was SVN r24922.
2011-07-22 19:49:47 +00:00
Ralph Castain
3f0d13efe2 Fix libevent internal thread support
This commit was SVN r24920.
2011-07-22 19:18:49 +00:00
Shiqing Fan
edaa7b96e4 This should not be commented out.
This commit was SVN r24914.
2011-07-21 12:56:18 +00:00
Shiqing Fan
665d1284be Fix a bug that memcpy'ing a wrong temp string.
This commit was SVN r24912.
2011-07-21 12:53:03 +00:00
Ralph Castain
6201581544 Fix the symbol visibility issue for libevent by renaming all visible libevent symbols
This commit was SVN r24902.
2011-07-14 07:10:52 +00:00
Ralph Castain
5e99d45ae4 Remove unused variable
This commit was SVN r24887.
2011-07-13 03:42:20 +00:00
Nadia Derbey
0d0cead33a Fix a hang in carto_base_select() if carto_module_init() fails
This commit was SVN r24876.
2011-07-12 05:47:28 +00:00
Abhishek Kulkarni
7363938ba8 add a missing include.
This commit was SVN r24873.
2011-07-11 00:04:31 +00:00
Jeff Squyres
08a05a1e35 Minor additions to make OMPI trunk compatible with the latest GNU
Autotools:

 * Autoconf 2.68
 * Automake 1.11.1
 * Libtool 2.4
 * m4 1.4.16

This commit was SVN r24867.
2011-07-10 12:11:47 +00:00
Jeff Squyres
e2df4d4a8d Some platforms don't have <execinfo.h>, even if they have backtrace()
function (e.g., NetBSD).  Thanks to Aleksej Saushev for pointing out
the issue. 

This commit was SVN r24866.
2011-07-10 11:14:19 +00:00
Jeff Squyres
b2b781e537 Fix a few miscelaneous memory leaks.
This commit was SVN r24865.
2011-07-08 16:39:58 +00:00
Terry Dontje
86a80411f0 update changes from review comments of #2816
This commit was SVN r24856.
2011-07-05 22:51:39 +00:00
Terry Dontje
8c0af7838a add configure check for Solaris Legacy munmap prototype
This commit was SVN r24839.
2011-06-29 23:45:27 +00:00
Ralph Castain
4dc3ee369f If event threads are enabled, we don't need to wakeup the event lib to pickup new events - so help valgrind to quit whining about it.
This commit was SVN r24837.
2011-06-29 22:52:28 +00:00
Samuel Gutierrez
93110ce805 place a bandage on ds_copy plus minor cleanup. i need to rethink this part of the framework. thanks to Rolf for pointing out the issue.
This commit was SVN r24831.
2011-06-28 19:37:12 +00:00
Ralph Castain
cd6b8417ec Cleanup a set of warnings that appear to be caused by failure of PRIsize_t on Linux.
Set ignore properties

This commit was SVN r24812.
2011-06-23 15:07:58 +00:00
Samuel Gutierrez
61ff422562 fix a few more spots in posix.
This commit was SVN r24808.
2011-06-22 23:17:26 +00:00
Samuel Gutierrez
7fcf806dc9 fix posix builds on solaris. shmem still needs more cleanup on solaris, but at least shmem will stop breaking builds (i hope).
This commit was SVN r24807.
2011-06-22 23:08:58 +00:00
Samuel Gutierrez
5b5ce434fc fix shmem sysv build on solaris.
This commit was SVN r24806.
2011-06-22 18:05:08 +00:00
Samuel Gutierrez
867df203bc fix shmem mmap build on solaris. thanks terry.
This commit was SVN r24805.
2011-06-22 16:05:50 +00:00
Rolf vandeVaart
856a9c43b1 Add string.h. Needed when configuring with --enable-picky
This commit was SVN r24804.
2011-06-22 15:48:32 +00:00
Samuel Gutierrez
81f38b258a commit of new shared memory backing facility framework (shmem) and its components.
This commit was SVN r24795.
2011-06-21 15:41:57 +00:00
Josh Hursey
b223d355fc Add explicit number for opal_crs_state_type_t enum (for debugging). Also add a MAX so we can easily check for out of bounds states during debugging.
This commit was SVN r24766.
2011-06-09 14:27:24 +00:00
Ralph Castain
8f401a0563 Enable the ability to constrain applications to hosts on the basis of resources.
This commit was SVN r24736.
2011-05-28 22:18:19 +00:00
Ralph Castain
b47ec2ee87 Remove lingering references to opal_profile option
This commit was SVN r24709.
2011-05-18 18:27:29 +00:00
Ralph Castain
ddf4914094 Plug fd leak
This commit was SVN r24707.
2011-05-18 13:46:27 +00:00
Ralph Castain
4083e23073 Complete cleanup of pstat linux
This commit was SVN r24701.
2011-05-16 14:08:08 +00:00
Ralph Castain
08c3ecd608 Handle the case where memory stats are in different order, or don't exist on that platform
This commit was SVN r24700.
2011-05-16 13:32:42 +00:00
Ralph Castain
a3e43594a4 Extend node stats to include additional memory info. Change "darwin" pstat module to "test" as we don't really know how to get all the stat info for darwin.
Add a new OPAL_ERROR_LOG macro similar to the ORTE_ERROR_LOG one.

This commit was SVN r24692.
2011-05-08 14:45:16 +00:00
Jeff Squyres
0882d636a6 Oops -- need string.h, too (for strcasecmp).
This commit was SVN r24649.
2011-04-28 15:42:35 +00:00
Jeff Squyres
7362a0730a Change the default to "none". David Singleton raises a good point
that enabling "local_only" by default could cause excessive
by-NUMA-node paging and/or OOMs (rather than allowing memory
allocations to spill over to other NUMA nodes).

This brought home the very real-world example of people buying servers
with more processors/cores than they need, just to get more memory.
We wouldn't want Badness to occur in such scenarios by default.
Instead, let people turn on "only allow memory allocations on my local
NUMA node" if their application would benefit from it.

This commit was SVN r24648.
2011-04-28 15:16:39 +00:00
Jeff Squyres
7b48042ffd Commit patch from upstream hwloc: r3482. Fixes some compiler
warnings. 

This commit was SVN r24641.

The following SVN revision numbers were found above:
  r3482 --> open-mpi/ompi@2435be8d49
2011-04-27 17:08:15 +00:00
Jeff Squyres
d134ff9b4d Refs trac:2698
After a long period of development with many starts and stops, we
finally got this where we wanted it.

This commit introduces 2 new MCA params (note that the
"maffinity_libnuma_policy" MCA param introduced by r24290 was removed
when libnuma support was removed).  Remember that maffinity policies
are only in effect when paffinity is enaabled -- i.e., when processes
are bound to processors!

 * '''maffinity_base_alloc_policy:''' Policy that determines how
   general memory allocations are bound after MPI_INIT.  A value of
   "none" means that no memory policy is applied.  A value of
   "local_only" means that all memory allocations will be restricted
   to the local NUMA node where each process is placed.  Note that
   operating system paging policies are unaffected by this setting.
   For example, if "local_only" is used and local NUMA node memory is
   exhausted, a new memory allocation may cause paging.
 * '''maffinity_base_bind_failure_action:''' What Open MPI will do if
   it explicitly tries to bind memory to a specific NUMA location, and
   fails.  Note that this is a different case than the general
   allocation policy described by maffinity_base_alloc_policy.  A
   value of "warn" means that Open MPI will warn the first time this
   happens, but allow the job to continue (possibly with degraded
   performance).  A value of "error" means that Open MPI will abort
   the job if this happens.

This needs at least a little soak time on the trunk before going to
v1.5.

This commit was SVN r24639.

The following SVN revision numbers were found above:
  r24290 --> open-mpi/ompi@afa654746c

The following Trac tickets were found above:
  Ticket 2698 --> https://svn.open-mpi.org/trac/ompi/ticket/2698
2011-04-26 13:31:07 +00:00
Jeff Squyres
926af377fe Refs trac:2778.
Upgrade to hwloc 1.2 (from hwloc 1.1.2).  This should fix the problems
Nathan's seeing in #2778.

Let's let this soak on the trunk for a little while and see how LANL's
MTT's work out.  If that works, then we can CMR this to v1.5.

This commit was SVN r24635.

The following Trac tickets were found above:
  Ticket 2778 --> https://svn.open-mpi.org/trac/ompi/ticket/2778
2011-04-25 19:31:49 +00:00
Jeff Squyres
b8af3b7c4a New comment explains it all -- previous code was failing to find the
Nth core, so it fell over to try to find the Nth PU.

-----

hwloc isn't able to find cores on all platforms.  Example: PPC64
running RHEL 5.4 (linux kernel 2.6.18) only reports NUMA nodes and
PU's.  Fine.

However, note that hwloc_get_obj_by_type() will return NULL in 2
(effectively) different cases:

- no objects of the requested type were found
- the Nth object of the requested type was not found

So first we have to see if we can find *any* cores by looking for the
0th core.  If we find it, then try to find the Nth core.  Otherwise,
try to find the Nth PU.

This commit was SVN r24632.
2011-04-25 16:55:27 +00:00
Ralph Castain
9988b97b97 Extend/update how we handle process stats. Add the ability to collect node-level stats separate from the process stats. Update the process stat memory fields to report in MBytes instead of KBytes as I can't find any process that runs in KBytes nowadays.
Rename the memusage sensor plugin to "resusage" as it will soon be updated to include full process stat monitoring.

Extend the heartbeat sensor to report node and process stats in the heartbeat.

Store the process and node stats in their respective orte_xxx_t object.

This commit was SVN r24629.
2011-04-21 22:55:45 +00:00
Jeff Squyres
2fe94b929a Manually add hwloc v1.1 branch r3418 commit (went in after v1.1.2
released): 

backport hwloc r 3416 from trunk: Add cache info entry _after_ checking
that we need one, thanks Andriy Gapon for the fix

This commit was SVN r24612.

The following SVN revision numbers were found above:
  r3418 --> open-mpi/ompi@9972663a12
2011-04-12 14:41:46 +00:00
Jeff Squyres
9dc3a1aa54 Upgrade to hwloc 1.1.2; most likely the last release of the hwloc
1.1.x series

This commit was SVN r24611.
2011-04-12 14:35:26 +00:00
Jeff Squyres
38d3cdd4a6 Update hwloc to 1.1.1. Next stop: 1.1.2.
This commit was SVN r24610.
2011-04-12 14:16:37 +00:00
Shiqing Fan
4b3b713bfc Update the windows installdir component.
Don't use the old env component for windows, so remove the .windows file.

This commit was SVN r24597.
2011-04-05 12:15:41 +00:00
Ralph Castain
f40edd6b4f Add the stupid test word
This commit was SVN r24578.
2011-03-26 03:38:59 +00:00
Ralph Castain
5bfb01c6c8 Only build the linux component of sysinfo if linux is the operating system.
Thanks to Paul Hargrove for the suggestion.

This commit was SVN r24576.
2011-03-25 20:55:57 +00:00
Jeff Squyres
cf6c5e8d48 Fix a bug noted by Gus Correa on the user's list: mpi_paffinity_alone
appeared multiple times in ompi_info output (so did others, but this
is the one that was noticed).  Ensure that we don't repeat
opal_paffinity_base_register_params() multiple times.

This commit was SVN r24569.
2011-03-24 00:58:25 +00:00
Jeff Squyres
324b90142f Fix CID 1583: hwloc bitmap leak.
This commit was SVN r24496.
2011-03-08 16:47:26 +00:00
Josh Hursey
7c737b9274 Some string and state cleanup. Thanks to George Bosilca for the initial patch.
This commit was SVN r24471.
2011-03-01 20:12:23 +00:00
Jeff Squyres
ad985260d3 Ensure to disable XML and Cairo support in hwloc; OMPI doesn't use it. Additionally, ensure that the right flags are passed back to the wrappers in the case of static builds. We probably won't need these (especially since XML has been disabled), but it's the Right Thing to do.
This commit was SVN r24451.
2011-02-23 23:11:45 +00:00
Jeff Squyres
e8ba72258e Patch for PPC64 platforms with smt=off, issue raised by Brad. This
fix will be included in hwloc 1.1.2.

Brad -- can you verify that this fixes the issue for you?

Fixes trac:2732.

This commit was SVN r24450.

The following Trac tickets were found above:
  Ticket 2732 --> https://svn.open-mpi.org/trac/ompi/ticket/2732
2011-02-23 22:43:58 +00:00
Jeff Squyres
8143b201a9 Custom patch for hwloc (that will be included in hwloc 1.1.2) so that
we don't barf on Linux non-NUMA (NNUMA, aka UMA ;-) ) platforms.

This commit was SVN r24448.
2011-02-23 21:02:02 +00:00
Jeff Squyres
2368410eff * Ensure to follow standard filename conventions for output MCA DSO
filenames -- don't include the project name ("opal")
 * Don't link maffinity/hwloc and paffinity/hwloc against the common
   hwloc in the static build case (because this will result in
   duplicate symbols)

This commit was SVN r24447.
2011-02-23 21:00:20 +00:00
Jeff Squyres
5e082d68f6 Fix the compile error for libnuma 0.9.x introduced in r24442;
hopefully, this now compiles for libnuma 0.9.x and libnuma 2.0.x.

Fixes for the strategy discussed in the commit message for r24442
(i.e., check against numa_get_mems_allowed(), which only exists in
libnuma 2.0.x) and the new new new plan on #2698 coming in a separate
commit.

This commit was SVN r24443.

The following SVN revision numbers were found above:
  r24442 --> open-mpi/ompi@90a8fe4aad
2011-02-23 13:44:46 +00:00
Rainer Keller
90a8fe4aad - Addendum to r24421: get mca_maffinity_libnuma to compile on linux
(with libnuma-2.0.4 / LIBNUMA_API_VERSION 2): numa_get_run_node_mask
   returns a struct bitmask *.

   Whether it's a good idea to blindly pass that on to
   numa_set_membind() is another matter: one might want to match against
   the list returned by numa_get_mems_allowed(), which may be set by the
   outside environment.

   Refs trac:2698.

This commit was SVN r24442.

The following SVN revision numbers were found above:
  r24421 --> open-mpi/ompi@31510e683b

The following Trac tickets were found above:
  Ticket 2698 --> https://svn.open-mpi.org/trac/ompi/ticket/2698
2011-02-23 12:59:49 +00:00
Jeff Squyres
c1b26005d7 Create new opal/mca/common area, similar to ompi/mca/common. Move hwloc into this new opal MCA common area, and link the hwloc paffinity component against it. Also add a new hwloc maffinity component, and also link it against the opal MCA common hwloc. More development coming soon regarding this common hwloc instance (i.e., an OPAL-ized version of the hwloc API via a new framework so that we can safely use hwloc's services throughout the rest of the OPAL/ORTE/OMPI code bases.
This commit was SVN r24440.
2011-02-22 23:21:48 +00:00
Jeff Squyres
31510e683b Replace r24290 with something more meaningful. In this case, find out
what memory node the process is running on (which is guaranteed to be
a good answer because maffinity won't be invoked unless the process is
already bound to a specific processor), and then bind our memory to
that. 

Refs trac:2698.

This commit was SVN r24421.

The following SVN revision numbers were found above:
  r24290 --> open-mpi/ompi@afa654746c

The following Trac tickets were found above:
  Ticket 2698 --> https://svn.open-mpi.org/trac/ompi/ticket/2698
2011-02-21 20:07:11 +00:00
Shiqing Fan
ddc05e05d7 Avoid blocking select on Windows.
This commit was SVN r24396.
2011-02-16 08:48:21 +00:00
Ralph Castain
bf1cff3711 Plug a couple of additional memory leaks - try to highlight a little better that strings returned from reg_string_name must be freed by caller
This commit was SVN r24383.
2011-02-14 20:58:22 +00:00
Ralph Castain
d85916c1c2 Plug memory leak
This commit was SVN r24380.
2011-02-14 19:51:33 +00:00
Jeff Squyres
b0ce9bae8e Oops. Also need to remove myriexpress.h from the Makefile.am.
This commit was SVN r24357.
2011-02-04 03:29:49 +00:00
Jeff Squyres
6421abecc7 Fixes trac:2690.
Temporarily remove hwloc's internal version of myriexpress.h.  It is
causing a problem when compiling Open MPI with MX support because
hwloc uses AC_CONFIG_HEADER in hwloc's hwloc.m4 to generate
opal/mca/paffinity/hwloc/hwloc/include/hwloc/config.h.
AC_CONFIG_HEADER apparently has the (undocumented) side effect of
adding -I$(top_builddir)/opal/mca/paffinity/hwloc/hwloc/include/hwloc
to OMPI's compilation flags.  Hence, when the OMPI MX components are
compiled and #include "myriexpress.h" (or <myriexpress.h>) they see
hwloc's myriexpress.h before the system one.  Badness ensures.

This removal is temporary because we need to figure out a better
solution.  But for now, OMPI is not using hwloc's myriexpress.h file --
so it's safe to remove.  I'll push this issue upstream to hwloc to
figure out a better solution...

This commit was SVN r24354.

The following Trac tickets were found above:
  Ticket 2690 --> https://svn.open-mpi.org/trac/ompi/ticket/2690
2011-02-03 14:24:32 +00:00
Jeff Squyres
4674e62929 These files are superflouos.
This commit was SVN r24331.
2011-02-01 21:31:35 +00:00
Jeff Squyres
c8badb79df Don't instantiate variables in for loops; we don't assume C99
compilers. 

This commit was SVN r24330.
2011-02-01 19:23:14 +00:00
Nysal Jan
42015cf30a Fix build failure on AIX
This commit was SVN r24321.
2011-01-28 08:09:45 +00:00
Nysal Jan
857c32784e Fix detection of fd_mask
This commit was SVN r24320.
2011-01-28 06:20:32 +00:00
Jeff Squyres
6c8de8fb76 Bump up to hwloc 1.1.1
This commit was SVN r24312.
2011-01-26 23:20:26 +00:00
Josh Hursey
66af515061 Fix C/R functionality with the new libtool. This fixes the case where the restarted process cannot be checkpointed or finalized.
Short Version:
--------------
Event engine needs to be flushed so it does not use old/stale file descriptors.

Long Version:
-------------
The problem was that the restarted process was waiting for the socket to the local daemon to finish establishing during the 'sync' operation. The core problem was that the daemon was sending a header of 36 bytes, but the restarted process only received 35 bytes of the message. So the restarted process became stuck waiting for the last byte to arrive.

After many hours of digging, I figured out that the event engine was using the same file descriptor for its evsig_cb functionality (to signal itself when a signal arrives). So when the daemon wrote in to the new fd the event engine was stealing the first byte (*shakes fist at event engine*) before the recv() could be posted.

The solution is to use the event_reinit() function on restart to re-establish the now-stale file descriptors in the event engine. This seems to have fixed the problem.


A few other minor things:
-------------------------
 * Add a check to make sure the event engine is balanced in its init/finalize
 * Add the opal_event_base_close() to the BLCR restart exec function (still not 100% sure it is needed, but there it is).

This commit was SVN r24296.
2011-01-25 22:43:47 +00:00
Jeff Squyres
afa654746c Somehow this has been sitting, uncommitted, in a local checkout since
last December.  :-(

Add new MCA param: maffinity_libnuma_policy.  Thanks to David
Singleton for the suggestion.  Here's the help text about it:

{{{
   MCA maffinity: parameter "maffinity_libnuma_policy" (current value:
                  <loose>, data source: default value)
                  Binding policy that determines what happens if memory
                  is unavailable on the local NUMA node.  A value of
                  "strict" means that the memory allocation will fail;
                  a value of "loose" means that the memory allocation
                  will spill over to another NUMA node.
}}}

This commit was SVN r24290.
2011-01-24 14:39:16 +00:00
Abhishek Kulkarni
3243b16bb3 Decode SOS error code before checking it with the native error code.
This commit was SVN r24281.
2011-01-20 23:21:38 +00:00
Jeff Squyres
189b541dbd Add a proper help message for the mca_verbose MCA param (and shuffle
the code to be slightly more efficient).

This commit was SVN r24256.
2011-01-14 20:18:06 +00:00
Terry Dontje
56c03a3853 removing a file I should not have added
This commit was SVN r24220.
2011-01-11 19:02:08 +00:00
Terry Dontje
a374661ead add configure.params to solaris sysinfo module to allow it to be built
This commit was SVN r24219.
2011-01-11 18:31:55 +00:00
Jeff Squyres
cd8f12d8e5 Remove a few useless files that were missed last night.
This commit was SVN r24218.
2011-01-11 14:15:31 +00:00
Jeff Squyres
54cb4eb2b5 Merge over new version of hwloc 1.1 from the vendor branch. Update
the module to use the new hwloc bitmap API (the cpuset API is both
klunkier and deprecated), which simplified a few things.

This commit was SVN r24217.
2011-01-11 01:41:10 +00:00
Josh Hursey
bbfdf04a81 Fix a couple of 'unused variable' warnings, and one return value warning.
{{{
base/paffinity_base_service.c: In function ‘opal_paffinity_base_cset2mapstr’:
base/paffinity_base_service.c:623: warning: unused variable ‘range_last’
base/paffinity_base_service.c:623: warning: unused variable ‘range_first’
base/paffinity_base_service.c:622: warning: unused variable ‘count’
base/paffinity_base_service.c:622: warning: unused variable ‘m’
}}}

{{{
connect/btl_openib_connect_oob.c: In function ‘init_ud_qp’:
connect/btl_openib_connect_oob.c:1111: warning: control reaches end of non-void function
connect/btl_openib_connect_oob.c: In function ‘init_device’:
connect/btl_openib_connect_oob.c:1235: warning: unused variable ‘i’
connect/btl_openib_connect_oob.c: In function ‘get_pathrecord_sl’:
connect/btl_openib_connect_oob.c:1323: warning: unused variable ‘i’
}}}

This commit was SVN r24196.
2010-12-30 15:37:50 +00:00
Terry Dontje
6da16ab0d7 add format parameter and layout format to OMPI_Affinity_str
This commit was SVN r24182.
2010-12-16 15:11:17 +00:00
Shiqing Fan
ec82e73bce use sockets instead of pipes on Windows.
This commit was SVN r24174.
2010-12-15 14:34:25 +00:00
Rolf vandeVaart
3f7dd84278 Fix libevent so it can compile in the few cases where sys/queue.h does not exist.
1. Remove it from libevent207.h because it is not needed.
2. Add compat to the include list so it can use queue.h when needed.

This commit was SVN r24144.
2010-12-02 23:05:02 +00:00
Shiqing Fan
f43862420c Convert the bad dos line endings to unix style for all windows related files.
This commit was SVN r24137.
2010-12-02 12:08:08 +00:00
Ralph Castain
c56185887b Change the event base "wakeup" support to enable the passing of events to the central thread for add/del. Add a macro OPAL_UPDATE_EVBASE for this purpose as it will likely be widely used.
Update the ORTE thread support to utilize this capability. Update the rmcast framework to track the change.

This commit was SVN r24121.
2010-12-01 04:26:43 +00:00
Ralph Castain
2523c9b2e8 Overload the event_base_t struct to include a (hopefully) temporary change to deal with cross-event-base synchronization. This is done transparently so no code changes are required within the rest of the code base. Comments explain what was changed and why.
This commit was SVN r24105.
2010-11-30 16:14:19 +00:00
Ralph Castain
71f116d21f Expose the event_active API
This commit was SVN r24090.
2010-11-24 23:30:13 +00:00
Ralph Castain
380835602c Add support for internal libevent threading support. Add configure logic to define an appropriate flag, and then use that flag to expose the required functions.
This commit was SVN r24088.
2010-11-24 23:24:53 +00:00
Shiqing Fan
39c9f7468e Add support for managing priorities of windows mca components.
Correct the generated strings in mpi.h.

This commit was SVN r24082.
2010-11-23 19:09:06 +00:00
Rolf vandeVaart
e7ff9375d7 Use pid_t to avoid warnings on some platforms.
This commit was SVN r24072.
2010-11-19 17:14:33 +00:00
Rolf vandeVaart
1735f98c78 Avoid potential warnings by using pid_t in all places.
This commit was SVN r24071.
2010-11-19 16:29:45 +00:00
Shiqing Fan
4fea0f021e Per r24062, this should also be removed.
This commit was SVN r24064.

The following SVN revision numbers were found above:
  r24062 --> open-mpi/ompi@3b0caf7dea
2010-11-17 17:14:55 +00:00
Rolf vandeVaart
3b0caf7dea Remove inclusion of stdbool.h where not needed.
Change OMPI code in libevent to not use bool.
Add some comments to indicate OMPI specific code.
This should fix compiles on Sun Studio Solaris.

This commit was SVN r24062.
2010-11-17 15:14:00 +00:00
Ralph Castain
32be69eaef Update the OMPI libevent interface module and the internal libevent event.c file to provide ability to disable specific event modes. Basically an issue between #define and checking to see if the value was defined to zero.
This commit was SVN r24056.
2010-11-16 16:06:32 +00:00
Ralph Castain
1b3421f16e Fix a bug spotted by Rolf - ensure that disable-event-xxx results in the corresponding have_event_xxx being undefined or defined to 0
This commit was SVN r24055.
2010-11-16 04:37:30 +00:00
Ralph Castain
b43a4509ac Remove stale mca param. Ensure that verbosity gets properly set for event framework debug
This commit was SVN r24050.
2010-11-13 15:37:17 +00:00
Ralph Castain
db014edb0b Initialize boolean
This commit was SVN r24048.
2010-11-13 15:31:55 +00:00
Jeff Squyres
e4744b4ed5 Per http://www.open-mpi.org/community/lists/devel/2010/11/8671.php,
change a bunch of OMPI_<foo> names to OPAL_<foo>.

This commit was SVN r24046.
2010-11-12 23:22:11 +00:00
Shiqing Fan
1f4eae2046 Type cast for compiling under VS 2010.
This commit was SVN r24044.
2010-11-12 08:31:23 +00:00
Jeff Squyres
dded8a9756 Ensure to always remove the .new file
This commit was SVN r24023.
2010-11-09 23:33:53 +00:00
Terry Dontje
8e0b24a45b add comment to r23998 code change to be able to track libevent code change better
This commit was SVN r24005.

The following SVN revision numbers were found above:
  r23998 --> open-mpi/ompi@e8aa8984a8
2010-11-08 14:36:28 +00:00
Terry Dontje
e8aa8984a8 corrected stdbool.h inclusion to allow Oracle C++ compilers to work with OMPI
This commit was SVN r23998.
2010-11-05 18:54:19 +00:00
Josh Hursey
676adfb7cc fix compile error, need to revisit this line later
This commit was SVN r23991.
2010-11-03 17:34:11 +00:00
Shiqing Fan
a7dc32afb0 Remove the OPAL_DECLSPEC for the event functions.
This commit was SVN r23987.
2010-11-03 09:10:12 +00:00
Shiqing Fan
505efbaa27 Update the CMake scripts, solve a few export symbols for Windows.
This commit was SVN r23976.
2010-11-02 16:39:27 +00:00
Jeff Squyres
9c15a30b75 Really fix the libevent make distcheck problem. The main issue is how
libevent creates its event-config.h during "make all" (vs. during
configure).  The prior method around this didn't work because it wrote
an event-config.h.in in the source tree -- a Bad Idea(tm).  The new
way uses AC_CONFIG_COMMAND to get stuff executed at the end of
config.status to create event-config.h.  This seems to work properly
during make distcheck.

This commit was SVN r23975.
2010-11-01 23:28:50 +00:00
Jeff Squyres
6bd41cf5d8 Fixes for vpath builds; this should enable 'make dist' again.
This commit was SVN r23973.
2010-10-29 22:07:52 +00:00
Ralph Castain
0171e05942 Only add include paths for event headers if --with-devel-headers was specified
This commit was SVN r23968.
2010-10-29 00:43:10 +00:00
Ralph Castain
838ed14401 Include the libevent headers when --with-devel-headers is specified. Ensure that the proper include paths are added to the wrapper compilers - thanks to Jeff for figuring out how to do it.
This commit was SVN r23967.
2010-10-28 21:26:07 +00:00