1
1
Граф коммитов

1524 Коммитов

Автор SHA1 Сообщение Дата
Shiqing Fan
23bb52ad05 Remove a few files from the CMakeList, that no long exist in the new libevent.
Add #ifdef for including _libevent_time.h.
Use the Windows version of function random().

This commit was SVN r22556.
2010-02-04 15:18:54 +00:00
Brian Barrett
b5e391251f Update libevent to 1.4.13
This commit was SVN r22548.
2010-02-04 05:38:30 +00:00
Ralph Castain
db1b07c02c In multiple places in the code base, we expect opal_bitmap_is_set_bit to return either true or false...not a negative error code. If the index is out of range, this is effectively a "false" condition as the bit was clearly not set by the program.
This revision is consistent with how the function was called in the OMPI code base. Will fix the test to match.

This commit was SVN r22544.
2010-02-03 19:45:22 +00:00
Shiqing Fan
7ad3d310b8 Define SIGPIPE for Windows, just for fixing the v1.4 Windows build.
cmr:v1.4.2:reviewer=jsquyres

This commit was SVN r22543.
2010-02-03 18:49:22 +00:00
Rainer Keller
0009d10c4d - This fixes the failing mpic++/mpiCC MTT tests, bailing due to not
finding symbol pthread_atfork, e.g. cxx-test-suite.

   Fixes trac:2088

   cmr:v1.5:reviewer=jsquyres

This commit was SVN r22542.

The following Trac tickets were found above:
  Ticket 2088 --> https://svn.open-mpi.org/trac/ompi/ticket/2088
2010-02-03 18:47:13 +00:00
Brian Barrett
8b4825ff37 Updates to make trunk run on Catamount again:
* Don't build the pstat component if all defines needed aren't there.
 * Update platform file to work better
 * Work around two places that depended on modex being operational

This commit was SVN r22536.
2010-02-03 05:07:40 +00:00
Jeff Squyres
007a6c7b99 Per #2201, move the user arguments up to be the first set of argv
after the compiler argv tokens.  

Not closing #2201 yet; there's still discussion on that ticket about
whether we want to do more or not.

Refs trac:2201
cmr:v1.4.2 
cmr:v1.5

This commit was SVN r22513.

The following Trac tickets were found above:
  Ticket 2201 --> https://svn.open-mpi.org/trac/ompi/ticket/2201
2010-01-29 22:51:35 +00:00
Jeff Squyres
93e930ae13 Fix minor typo
This commit was SVN r22497.
2010-01-26 23:21:00 +00:00
Josh Hursey
b749ecbab8 This commit fixes trac:2190.
Originally the patch was to improve the error message, but when digging into the code I found a subtle bug. If the daemon does not tell the HNP what CRS component it used, then the HNP tries to figure it out from the metadata (this is an uncommon case). The path the HNP used was not complete, so it was unable to find the metadata information. This patch fixes this by adding the 'snapshot_reference' to the 'snapshot_location' which completes the path for this search.

cmr:v1.4 (needs a custom patch)

cmr:v1.5

This commit was SVN r22479.

The following Trac tickets were found above:
  Ticket 2190 --> https://svn.open-mpi.org/trac/ompi/ticket/2190
2010-01-25 20:28:38 +00:00
Ralph Castain
2e2e49e46f Define a standard return value for when a thread exits.
Not sure what Windows or Solaris are looking for, so defined it all the same

This commit was SVN r22471.
2010-01-23 03:57:24 +00:00
Shiqing Fan
c29a668e37 Remove flex.exe and its license file from the tarball.
cmr:v1.4
cmr:v1.5

This commit was SVN r22469.
2010-01-22 16:40:13 +00:00
Shiqing Fan
4836e8878a Update a few more CMake scripts.
This commit was SVN r22454.
2010-01-19 17:34:55 +00:00
Jeff Squyres
596473e7ca Patch from Aleksej Saushev to properly only check for /proc/cpuinfo on Linux-based systems
This commit was SVN r22417.
2010-01-14 23:16:31 +00:00
Shiqing Fan
ad763c327d Restore several linked libraries that were deleted by mistake in r22405.
This commit was SVN r22415.

The following SVN revision numbers were found above:
  r22405 --> open-mpi/ompi@872a4047ba
2010-01-14 21:50:42 +00:00
Shiqing Fan
872a4047ba Fix the bug that caused by ADD_DEPENDENCIES() from different version of CMake.
In CMake 2.6 and earlier, this function add dependencies for targets and also link the target libraries automatically, but in CMake 2.8,this behavior has been changed, i.e. it will only add the dependencies but no link, which will cause linking errors at compilation time.

This commit was SVN r22405.
2010-01-14 18:10:20 +00:00
Ralph Castain
f0646b3603 Need separate flag for select when initializing sysinfo framework
This commit was SVN r22394.
2010-01-12 23:22:46 +00:00
Ralph Castain
b35486d945 The CM ess module needs to open the sysinfo framework and select modules prior to when others need it. Thus, setup a flag to avoid multiple open/select within that framework.
This commit was SVN r22393.
2010-01-12 22:03:49 +00:00
Jeff Squyres
d9fc4e0a9d Per http://www.open-mpi.org/community/lists/devel/2010/01/7283.php, allow MCA components to fail the component.register and component.open methods without the MCA base printing errors.
This commit was SVN r22391.
2010-01-12 19:29:12 +00:00
Brian Barrett
86d8356b13 Updates to allow OMPI to build on Cray XT platforms running Catamount
This commit was SVN r22381.
2010-01-07 18:14:03 +00:00
Jeff Squyres
dbb29663e8 Update the embedded PLPA version to v1.3.2. Since this is a 3rd
party/"vendor" import, the changes are actually far smaller than the
size of this changeset implies.  Here's a list of the changes:

 * Update the AMD license header in plpa_map.c to be less restrictive
   (see https://svn.open-mpi.org/trac/plpa/changeset/262 for details)
   -- '''this is the most/only important change of this update.'''  No
   code is changed by this; only removing a clase from a license
   header in plpa_map.c.
 * Changes to the generated {{{configure}}}, {{{config.guess}}}, and
   {{{config.sub}}} scripts (which aren't used by OMPI).
 * soname version tracking changes (which also aren't used by OMPI;
   they're only used when PLPA is built/installed in "standalone"
   mode).
 * Update the "get version" m4 (which was stolen from OMPI's m4 to
   begin with, and is only used during OMPI's autogen.sh step).
 * Update various PLPA version numbers to 1.3.2.
 * Bug fix in plpa-taskset (which is not built in the OMPI PLPA build).

This commit was SVN r22367.
2010-01-06 00:44:14 +00:00
Ralph Castain
fad1ba15b0 Move the test for case-sensitive file system from ompi to opal so that all layers can have that knowledge.
Use that for the orte wrapper compilers

This commit was SVN r22348.
2009-12-29 23:26:45 +00:00
Josh Hursey
313acba4ce Move the mca_base_is_component_required() functionality to mca/base per suggestion so that it can be reused in other components.
This commit was SVN r22327.
2009-12-17 15:12:26 +00:00
George Bosilca
7ba371cd92 Correct the atomics on x86 and x86_64. Thanks to Iain for the catch,
to Eugene, Jeff, and Briand for the help. This patch is supposed to
fix several outstanding issues, notably the one on tickets #2043.

This commit was SVN r22324.
2009-12-16 22:34:56 +00:00
George Bosilca
9fa5f1d7a8 Spaces instead of tabs.
This commit was SVN r22314.
2009-12-15 18:07:33 +00:00
Josh Hursey
6e584c151f We need to check the value of {{{opal_crs_base_metadata_read_token}}} since it may segv if we have a malformed metadata file.
Bug found by Sergio Diaz Montes:
  http://www.open-mpi.org/community/lists/users/2009/11/11176.php

This commit was SVN r22290.
2009-12-09 18:41:56 +00:00
Josh Hursey
e8de64d5a0 Make sure that we release the components that do not qualify for selection. These components are never open'ed really so we never need to close them.
This will need to be applied to v1.4 and v1.5, CMRs to follow.

This commit was SVN r22288.
2009-12-09 15:45:53 +00:00
Rainer Keller
787538ae38 Correct the spelling, and try cmr:v1.5 This should succeed
This commit was SVN r22280.
2009-12-08 18:46:46 +00:00
Ralph Castain
70e385bcab Picky, picky, picky...the a-retentive amongst us wants the default value to show in ompi_info! Of all the nerve...
:-)

Okay, cleanup the prior commit so that the default component search path shows in ompi_info, and remains available in component_find.

This commit was SVN r22278.
2009-12-08 17:32:22 +00:00
George Bosilca
e55f89dda7 Add a format to stop the complaining.
This commit was SVN r22276.
2009-12-08 17:26:42 +00:00
Ralph Castain
703ec3d6ce Some minor cleanups to the handling of multi-path component find
This commit was SVN r22275.
2009-12-08 09:34:49 +00:00
Ralph Castain
0b654ba4dc Extend the mca_component_path param usage by allowing a user to add paths to the default system and user ones defined in the program. Thus, the user can specify a param value of:
"my_perfect_path":SYSTEM_DEFAULT:USER_DEFAULT

and OPAL will substitute its internally derived values for the defaults (instead of forcing the user to figure them out).

This commit was SVN r22272.
2009-12-07 20:29:28 +00:00
Jeff Squyres
16b100219d A patch from UTK to allow orte_init(), opal_init(), and associated
friends also receive &argc and &argv (George asked Jeff to Ralph to
review before committing).  The thought is that passing argv and argc
to opal/orte_init be useful to other projects outside of OMPI that are
using OPAL and/or ORTE (especially in conjunction with some other
bootstrapping code where it is helpful to modify argv).  It's such a
small thing that it's easy to apply here to make others' lives a
little easier.

Ask George for more details; I'm just the messenger.  :-)

Judging by the copyrights on this patch, it's been around for a
while.  :-)

This commit was SVN r22260.
2009-12-04 00:51:15 +00:00
Ralph Castain
a0d5c80ce0 Add a new framework for discovering local resource information such as cpu type/model, #cpus, available physical memory, etc. Two initial components (darwin and linux) are provided. This is needed to support bootstrap operations where daemons are started at node boot, and applications where initial knowledge of cpu identification is needed to guide framework component selection.
Add orte configuration option to control the use of the framework in the system. Although the code will build, it will not be active unless configured with --enable-bootstrap.

If bootstrap is enabled and the new opal_sysinfo framework can successfully determine the cpu model, pass that info to the application as an MCA param to support some work at Sun.

Also, have daemons report back the resources they find to guide process mapping in bootstrap operations (i.e., where the daemon starts at node boot as opposed to being launched at application start).

Adjust some platform files to enable these capabilities.

This commit was SVN r22244.
2009-11-30 23:11:25 +00:00
Jeff Squyres
978fb43a26 Add a Big Hairy Warning if you --enable-progress-threads
This commit was SVN r22230.
2009-11-24 23:20:37 +00:00
Shiqing Fan
6f8d0a1ab8 Update a few CMake scripts.
Add Program Database (pdb) files for installation for debug build.

This commit was SVN r22188.
2009-11-03 10:40:58 +00:00
Jeff Squyres
5e6c494269 Remove the mistaken line (confirmed by Shiqing).
This commit was SVN r22175.
2009-10-30 12:45:05 +00:00
Jeff Squyres
16f42c45a6 Ensure to have a PARAM_CONFIG_FILES (I don't know if
PARAM_WINDOWS_FILES is a mistake or not).  Fixes trac:2079.

This commit was SVN r22171.

The following Trac tickets were found above:
  Ticket 2079 --> https://svn.open-mpi.org/trac/ompi/ticket/2079
2009-10-29 22:05:26 +00:00
Shiqing Fan
48dd7ff7d0 Get rid of the shadow file for mpi.h.in on Windows.
This commit was SVN r22154.
2009-10-28 15:49:01 +00:00
Rainer Keller
4ce710a147 - The internal function may fail make_opt (e.g. OPAL_ERR_OUT_OF_RESOURCE),
pass that on to callers of opal_cmd_line_make_opt_mca().
   Thanks to Thomas Naughton III.

 - Additionally, cmd-line parameters passed in table to opal_cmd_line_create()
   may be wrong (e.g. OPAL_ERR_BAD_PARAM), which may be missed in the
   loop.

This commit was SVN r22153.
2009-10-28 15:14:31 +00:00
Terry Dontje
c6ebc7c341 rename macros ompi_check_optflags and ompi_make_stripped_flags based on comments in #2072
This commit was SVN r22151.
2009-10-28 10:51:59 +00:00
Shiqing Fan
63cdfc0ab1 Get rid of several shadow files for windows build, use the same input file as on Linux.
This commit was SVN r22145.
2009-10-27 18:22:14 +00:00
Terry Dontje
6df802424d remove duplicate setting of CFLAGS_WITHOUT_OPTFLAGS and special case DEBUGGER_FLAGS for intel compiler
This commit was SVN r22143.
2009-10-26 18:41:53 +00:00
Shiqing Fan
af0830107c Generate the compiler wrappers more nicely on Windows.
This commit was SVN r22142.
2009-10-26 13:26:06 +00:00
Ralph Castain
13d86e100b Courtesy of Ralph and Jeff:
Continue the reorganization of the configure system. Move files from the main config directory to their appropriate level-specific config directories. Modify the configure system to correctly handle compiler detection, test, and setup so that all things pertaining to opal and orte are done at the lower level, with the ompi configure system only looking at mpi-specific options.

Ensure the wrapper compilers for orte and ompi only get built when appropriate. Add support for c++ to the orte wrapper compilers, both script and non-script versions.

This commit was SVN r22138.
2009-10-24 01:04:35 +00:00
Ralph Castain
214e26b539 Per Jeff (this work was done on a branch of mine, so I will do the commit):
Re-enable "./autogen.sh -no-ompi" again. If you -no-ompi, the entire OMPI
configury is skipped and the entire ompi/ subtree is not built. There's
some simple m4-isms that prune out the relevant parts.

I added ompi/config/, orte/config/, and opal/config/ directories. I moved a
bunch of m4 files from the top-level config/ dir into ompi/config/, and a few
into orte/config/.

Note that all 3 <project>/config directories have a config_files.m4 file. This
file contains the AC_CONFIG_FILES list for that project. The AC_CONFIG_FILES
call cannot be in an AC_DEFUN macro and conditionally called -- if it is
included at all, Autoconf will process it. Hence, these config_files.m4 files
don't AC_DEFUN -- they just have AC_CONFIG_FILES. m4_ifdef() is used to
conditionally include the files or not.

I moved a bunch of obvious OMPI-only m4 files from config/ to ompi/config/,
but I'm sure that there's more that could go. A ticket will be filed with
thoughts on future work in this area.

This commit was SVN r22113.
2009-10-20 23:44:20 +00:00
Ralph Castain
c991d155f4 Fix a minor omission in opal/util/path. If someone provides a relative path to the current working directory, without starting it with a
'.', we should still find the executable - it is in a directory beneath us.

In other words, if someone gives us "foo/bar" instead of "./foo/bar", we should still be able to find bar

This commit was SVN r22110.
2009-10-20 04:05:16 +00:00
Ralph Castain
c58a30ea10 Add two new functions:
1. check for loopback interface

2. convert tuple addresses to ip addrs + mask

This commit was SVN r22080.
2009-10-09 15:24:41 +00:00
Jeff Squyres
9afe50d886 Update Cisco copyrights for consistency
This commit was SVN r22072.
2009-10-07 22:02:32 +00:00
Jeff Squyres
d317ce0367 Fix CID 1381: don't bother checking for (NULL == p); it's overkill.
posix_memalign() will either return 0 or not, indicating success.  And
if posix_memalign() fails, it's not always going to be due to
out-of-memory -- just return ERR_IN_ERRNO.

This commit was SVN r22070.
2009-10-07 20:01:50 +00:00
Jeff Squyres
7900451e4e Fix CID 1326: for the (unlikely) case where
opal_paffinity_base_get_processor_info() returns failure.

This commit was SVN r22069.
2009-10-07 19:52:08 +00:00
Jeff Squyres
5c1af9c2ba Fix CID 1355: ensure that mca_base_param_reg_int() actually
succeeded.

This commit was SVN r22068.
2009-10-07 19:43:35 +00:00
Jeff Squyres
3b4f695009 MAP_FAILED is more POSIX-ly correct than ((void*)-1).
This commit was SVN r22063.
2009-10-07 14:20:18 +00:00
Jeff Squyres
d7db5f4c32 mmap(2) says that you must call mmap() with either MAP_SHARED or
MAP_PRIVATE.  We didn't catch this because we checked for a NULL
return, not a -1 return.  Doh!  Thanks again to Julian Seward for
continuing to track this down.

This commit was SVN r22062.
2009-10-07 12:39:01 +00:00
Jeff Squyres
977574bd45 Fix a problem noted by Julian Seward: MAKE_MEM_UNDEFINED is not the
opposite of MAKE_MEM_DEFINED. Also add in a call to NOACCESS to
(mostly) reverse the effects of MAKE_MEM_DEFINED (technically, page 0
was accessible before this, even though it's a Bad Idea to access it).

This commit was SVN r22056.
2009-10-06 17:55:49 +00:00
Jeff Squyres
932b43be04 Check to ensure that the mmap succeeded. Thanks to Julia Seward for
pointing out the problem and suggesting the fix.

This commit was SVN r22055.
2009-10-06 17:44:14 +00:00
George Bosilca
01bb4dafe0 Add a comment.
This commit was SVN r22052.
2009-10-05 17:36:11 +00:00
Jeff Squyres
0f8ac9223f Refs trac:2023, #2027.
This commit does a bunch of things:

 * Address all remaining code review items from CMR #2023:

   * Defer mmap setup to be lazy; only set it up the first time we
     invoke a collective.  In this way, we don't penalize apps that
     make lots of communicators but don't invoke collectives on them
     (per #2027).
   * Remove the extra assignments of mca_coll_sm_one (fixing a
     convertor count setup that was the real problem).
   * Remove another extra/unnecessary assignment.
   * Increase libevent polling frequency when using the RML to
     bootstrap mmap'ed memory.
   * Fix a minor procs-related memory leak in btl_sm.
 * Commit a datatype fix that George and I discovered along the way to
   fixing the coll sm.
 * Improve error messages when mmap fails, potentially trying to
   de-alloc any allocated memory when that happens.
 * Fix a previously-unnoticed confusion between extent and true_extent
   in coll sm reduce.

This commit was SVN r22049.

The following Trac tickets were found above:
  Ticket 2023 --> https://svn.open-mpi.org/trac/ompi/ticket/2023
2009-10-02 17:13:56 +00:00
Jeff Squyres
c8c3132605 Also check for posix_memalign.
This commit was SVN r22045.
2009-10-01 23:51:48 +00:00
George Bosilca
b04a42ba3b Add the format to the opal_output call.
This commit was SVN r22041.
2009-09-30 23:33:12 +00:00
Ralph Castain
84a45fea0a Add a convenience macro for assembling network addresses
This commit was SVN r22036.
2009-09-30 14:38:52 +00:00
Ralph Castain
176fdd3a83 Add a new API to the show_help system that allows external users (e.g., libraries built upon OMPI) to define their own locations for show_help files. This allows such users to exploit the rather nice features of the OPAL show_help system -without- interfering with the ability of the ORTE and OMPI layers to use show_help themselves.
Reviewed by Jeff to protect toes...and to get some good comments :-)

This commit was SVN r22026.
2009-09-29 02:07:46 +00:00
Jeff Squyres
1886d5a004 Remove the libopenmpi_malloc library; it is only necessary for
backwards compatibility in the v1.3 series.

This commit was SVN r22013.
2009-09-25 17:09:54 +00:00
Josh Hursey
5406fdfb80 Add support for sending SIGSTOP the MPI job after the checkpoint is taken (uses a BLCR feature for the option).
This commit looks larger than it really is since it includes a fair amount of code cleanup.

The SIGSTOP/SIGCONT+checkpointing work uses some of the functionality in r20391. Basic use case below (note that the checkpoint generated is useable as usual if the stopped application is terminated).
{{{
shell 1) mpirun -np 2 -am ft-enable-cr my-app
... running ...

shell 2) ompi-checkpoint --stop -v MPIRUN_PID
[localhost:001300] [  0.00 /   0.20]                 Requested - ...
[localhost:001300] [  0.00 /   0.20]                   Pending - ...
[localhost:001300] [  0.01 /   0.21]                   Running - ...
[localhost:001300] [  1.01 /   1.22]                   Stopped - ompi_global_snapshot_1234.ckpt
Snapshot Ref.: 0 ompi_global_snapshot_1234.ckpt

shell 2) killall -CONT mpirun

... Application Continues execution in shell 1 ...
}}}

Other items in this commit are mostly cleanup that has been sitting off-trunk for too long:
 * Add a new {{{opal_crs_base_ckpt_options_t}}} type that encapsulates the various options that could be passed to the CRS. Currently only TERM and STOP, but this makes adding others ''much'' easier.
 * Eliminate ORTE_SNAPC_CKPT_STATE_PENDING_TERM, since it served a redundant purpose with the new options type.
 * Lay some basic ground work for some future features.

This commit was SVN r21995.

The following SVN revision numbers were found above:
  r20391 --> open-mpi/ompi@0704b98668
2009-09-22 18:26:12 +00:00
Eugene Loh
67bac2fe31 Fix paffinity_linux_module.c. The set and get functions transferred cpu
masks between the mask argument and a local PLPA mask.  There were three
problems:
1) The "get" function computed the number of bits as sizeof(mask),
   which is the size of the pointer to the mask rather than the mask
   itself.  So, only 4 bits were copied with m32 and 8 bits with m64.
   There are actually 1024 bits.
2) The "get" and "set" functions both copied a number of bits computed
   from the sizeof() mask, but sizeof() reports the number of bytes.
   We have to multiply by 8 to get the number of bits.
3) These two functions check to make sure tha the mask argument is not
   bigger than the PLPA mask.  But, the set function copies a number
   of bits in the PLPA mask, which is conceivably greater than the
   number of bits in the mask argument.  So, accesses to the mask
   argument may overrun that argument.
Problems 1 and 2 meant that one would encounter errors when the number of
cores exceeded 4 (with -m32) or 8 (with -m64).  Problem 3 probably caused
no errors.

This commit was SVN r21993.
2009-09-22 16:00:37 +00:00
Ralph Castain
7765c71428 Add a macro for formatting IP addresses for printing
This commit was SVN r21985.
2009-09-22 00:53:54 +00:00
George Bosilca
b18ca686ae Correct the pointer math when we copy the opal_datatype_t object. In addition
don't set the ref count to 1, it has been already set by the call to OBJ_NEW
when the type was allocated. This fixes ticket #2014.

This commit was SVN r21976.
2009-09-18 20:05:22 +00:00
Ralph Castain
2028017554 Modify the paffinity system to handle binding directives that are "soft" - i.e., when someone directs that we bind if the system supports it. This allows community members to distribute OMPI with default MCA param files that direct general binding policies, without having the distributed software fail if the system cannot support those policies.
The new options work by adding an ":if-avail" qualifier to the "bind-to-socket" and "bind-to-core" MCA params. If the system does not support this capability, the job will launch anyway. Without the qualifier, the job will abort with an error message indicating that the required functionality is not supported on this system.

This commit was SVN r21975.
2009-09-18 19:48:42 +00:00
Rainer Keller
5983aeb753 - This fixes trac:2014:
As noted in http://www.open-mpi.org/community/lists/devel/2009/08/6741.php,
   we do not correctly free a dupped predefined datatype.
   The fix is a bit more involving. See ticket for details.
   Tested with ibm tests and mpi_test_suite (though there's two "old" failures
   zero5.c and zero6.c)

   Thanks to Lisandro Dalcin for bringing this up.

This commit was SVN r21929.

The following Trac tickets were found above:
  Ticket 2014 --> https://svn.open-mpi.org/trac/ompi/ticket/2014
2009-09-02 17:34:01 +00:00
Jeff Squyres
2fa048b0e0 Make the paffinity test component only build when you --enable-debug
(or have a developer build where that's enabled for you by default).

This commit was SVN r21928.
2009-09-02 11:23:54 +00:00
Ralph Castain
388c65fd80 Add missing include file
This commit was SVN r21924.
2009-09-01 13:31:54 +00:00
Ralph Castain
888f3c3afe Extend the paffinity test module to allow users to specify the number of sockets and cores - provides an extended ability to mimic archs.
This commit was SVN r21912.
2009-08-29 03:35:39 +00:00
Ralph Castain
3c4f28b22c Modify the paffinity test module to take a param indicating whether or not to mimic being externally bound
This commit was SVN r21908.
2009-08-28 02:31:01 +00:00
Shiqing Fan
4119497c5a Use select() for Windows events by default.
For some historical reasons, we didn't use select() for the Windows events. Now it could be merged back to have a better performance on Windows.

This commit was SVN r21899.
2009-08-27 08:11:56 +00:00
Shiqing Fan
1b6db85988 Complete the support for building on UNC path.
This commit was SVN r21897.
2009-08-27 07:57:26 +00:00
Ralph Castain
7cc045f9c5 Check return codes when init'ing the paffinity framework to avoid segfaulting
This commit was SVN r21884.
2009-08-26 01:58:15 +00:00
Ralph Castain
2178f995b9 Add a new "test" module to the paffinity framework that mimics a system that supports affinity when running on a Mac for development purposes. Only active if specifically called out.
This commit was SVN r21881.
2009-08-26 01:55:30 +00:00
Josh Hursey
1fc454103a MTT found a linking problem with the previous fix (r21860). This commit cleans up the configure.m4 a bit so that is works as expected. Tested with static and shared builds, but MTT should have a pass at it before it moved to v1.3 (I'll attach a custom patch to ticket #2004 for v1.3).
Additionally this commit fixes trac:1658
Refs trac:2004

This commit was SVN r21866.

The following SVN revision numbers were found above:
  r21860 --> open-mpi/ompi@7f31473bd7

The following Trac tickets were found above:
  Ticket 1658 --> https://svn.open-mpi.org/trac/ompi/ticket/1658
  Ticket 2004 --> https://svn.open-mpi.org/trac/ompi/ticket/2004
2009-08-21 20:07:18 +00:00
Josh Hursey
7f31473bd7 Fix the CFLAGS argument to the BLCR component.
Due to a typo (probably cut/paste error) the CFLAGS/CPPFLAGS/LDFLAGS/LIBS arguments were not propogated to the BLCR component.

This changes a few {{{crs_blcr_check_}}} to {{{crs_blcr_save}}}, which they should have been all along.

This needs to be applied to v1.3 as well :/

This commit was SVN r21860.
2009-08-20 21:28:19 +00:00
Rainer Keller
8e1b23779f - Replace combinations of
#if defined (c_plusplus)
          defined (__cplusplus)
   followed by
      extern "C" {
   and the closing counterpart by BEGIN_C_DECLS and END_C_DECLS.

   Notable exceptions are:
    - opal/include/opal_config_bottom.h:
      This is our generated code, that itself defines BEGIN_C_DECL and
      END_C_DECL
    - ompi/mpi/cxx/mpicxx.h:
      Here we do not include opal_config_bottom.h:                                 
    - Belongs to external code:                                                    
      opal/mca/backtrace/darwin/MoreBacktrace/MoreDebugging/MoreBacktrace.c        
      opal/mca/backtrace/darwin/MoreBacktrace/MoreDebugging/MoreBacktrace.h        
    - opal/include/opal/prefetch.h:
      Has C++ specific macros that are protected:                                  

    - Had #if ... } #endif  _and_ END_C_DECLS (aka end up with 2x
      END_C_DECLS)
      ompi/mca/btl/openib/btl_openib.h
    - opal/event/event.h has #ifdef __cplusplus as BEGIN_C_DECLS...
    - opal/win32/ompi_process.h: had extern "C"\n {...
      opal/win32/ompi_process.h: dito
    - ompi/mca/btl/pcie/btl_pcie_lex.l: needed to add *_C_DECLS
      ompi/mpi/f90/test/align_c.c: dito
    - ompi/debuggers/msgq_interface.h: used #ifdef __cplusplus
    - ompi/mpi/f90/xml/common-C.xsl: Amend

   Tested on linux using --with-openib and --with-mx

   The following do not contain either opal_config.h, orte_config.h or
   ompi_config.h
   (but possibly other header files, that include one of the above):
      ompi/mca/bml/r2/bml_r2_ft.h
      ompi/mca/btl/gm/btl_gm_endpoint.h
      ompi/mca/btl/gm/btl_gm_proc.h
      ompi/mca/btl/mx/btl_mx_endpoint.h
      ompi/mca/btl/ofud/btl_ofud_endpoint.h
      ompi/mca/btl/ofud/btl_ofud_frag.h
      ompi/mca/btl/ofud/btl_ofud_proc.h
      ompi/mca/btl/openib/btl_openib_mca.h
      ompi/mca/btl/portals/btl_portals_endpoint.h
      ompi/mca/btl/portals/btl_portals_frag.h
      ompi/mca/btl/sctp/btl_sctp_endpoint.h
      ompi/mca/btl/sctp/btl_sctp_proc.h
      ompi/mca/btl/tcp/btl_tcp_endpoint.h
      ompi/mca/btl/tcp/btl_tcp_ft.h
      ompi/mca/btl/tcp/btl_tcp_proc.h
      ompi/mca/btl/template/btl_template_endpoint.h
      ompi/mca/btl/template/btl_template_proc.h
      ompi/mca/btl/udapl/btl_udapl_eager_rdma.h
      ompi/mca/btl/udapl/btl_udapl_endpoint.h
      ompi/mca/btl/udapl/btl_udapl_mca.h
      ompi/mca/btl/udapl/btl_udapl_proc.h
      ompi/mca/mtl/mx/mtl_mx_endpoint.h
      ompi/mca/mtl/mx/mtl_mx.h
      ompi/mca/mtl/psm/mtl_psm_endpoint.h
      ompi/mca/mtl/psm/mtl_psm.h
      ompi/mca/pml/cm/pml_cm_component.h
      ompi/mca/pml/csum/pml_csum_comm.h
      ompi/mca/pml/dr/pml_dr_comm.h
      ompi/mca/pml/dr/pml_dr_component.h
      ompi/mca/pml/dr/pml_dr_endpoint.h
      ompi/mca/pml/dr/pml_dr_recvfrag.h
      ompi/mca/pml/example/pml_example.h
      ompi/mca/pml/ob1/pml_ob1_comm.h
      ompi/mca/pml/ob1/pml_ob1_component.h
      ompi/mca/pml/ob1/pml_ob1_endpoint.h
      ompi/mca/pml/ob1/pml_ob1_rdmafrag.h
      ompi/mca/pml/ob1/pml_ob1_recvfrag.h
      ompi/mca/pml/v/pml_v_output.h
      opal/include/opal/prefetch.h
      opal/mca/timer/aix/timer_aix.h
      opal/util/qsort.h
      test/support/components.h

This commit was SVN r21855.

The following SVN revision numbers were found above:
  r2 --> open-mpi/ompi@58fdc18855
2009-08-20 11:42:18 +00:00
Ralph Castain
aca3e71ccd Don't declare us "bound" if the cpu mask is completely zero
This commit was SVN r21839.
2009-08-19 18:55:06 +00:00
Shiqing Fan
2a05de0dda Update a few comments of how OMPI uses Windows registry.
This commit was SVN r21828.
2009-08-18 16:32:45 +00:00
Rainer Keller
c3226b36b7 - It's BUS_ADRERR, not BUSADRERR...
- Fix typos.

This commit was SVN r21808.
2009-08-12 13:07:04 +00:00
Shiqing Fan
bce2f44154 Update related .windows files with proper compiling properties, in order to have a successful DSO build.
This commit was SVN r21805.
2009-08-12 08:55:58 +00:00
Shiqing Fan
fb4be6fad7 First step to enable DSO build on Windows.
- add a cmake module for searching libltdl libraries and headers
  - a configure option to enable DSO build, default OFF.
  - update a few source files for including correct header
    and loading correct mca libraries path/suffix.

This commit was SVN r21804.
2009-08-12 08:52:48 +00:00
George Bosilca
52c722352d The COMPLEX are back. Due to some compilers flags right now the
support for _Complex is disabled until we figure out the correct
black magic. So instead of using this nice C99 feature, we use the
a strcture with a double type, the same approach that worked pretty
well for the last couple of years. 

Switching from one mode to the other is done using the 
OPAL_USE_[FLOAT|DOUBLE|LONG_DOUBLE]__COMPLEX macros defined in
opal_datatype_internal.h at line 442.

This commit was SVN r21800.
2009-08-11 18:44:06 +00:00
Ralph Castain
1dc12046f1 Modify the OMPI paffinity and mapping system to support socket-level mapping and binding. Mostly refactors existing code, with modifications to the odls_default module to support the new capabilities.
Adds several new mpirun options:

* -bysocket - assign ranks on a node by socket. Effectively load balances the procs assigned to a node across the available sockets. Note that ranks can still be bound to a specific core within the socket, or to the entire socket - the mapping is independent of the binding.

* -bind-to-socket - bind each rank to all the cores on the socket to which they are assigned.

* -bind-to-core - currently the default behavior (maintained from prior default)

* -npersocket N - launch N procs for every socket on a node. Note that this implies we know how many sockets are on a node. Mpirun will determine its local values. These can be overridden by provided values, either via MCA param or in a hostfile

Similar features/options are provided at the board level for multi-board nodes.

Documentation to follow...

This commit was SVN r21791.
2009-08-11 02:51:27 +00:00
Rainer Keller
76469ea64a - Change the property of a few files, that obviously
don't need to be svn:executable...

This commit was SVN r21786.
2009-08-11 01:40:00 +00:00
Josh Hursey
bd6426d3dd This is a minor cleanup of the configure.m4 (per suggestion from Jeff).
Refs trac:1987
The patch for v1.3 attached to Ticket #1987 already includes this change.

I did not have a chance to commit this last night, so sorry for the delay.

This commit was SVN r21777.

The following Trac tickets were found above:
  Ticket 1987 --> https://svn.open-mpi.org/trac/ompi/ticket/1987
2009-08-07 23:38:54 +00:00
Rainer Keller
3f727f0e61 - Until either r21763 is backed out, unconstify complex types.
Complex should either be a struct of float on the ompi-layer
   or as C99 float _complex in opal.
   unconstify to not break windows / the mpi* wrappers.

This commit was SVN r21770.

The following SVN revision numbers were found above:
  r21763 --> open-mpi/ompi@668f001351
2009-08-06 14:42:47 +00:00
Josh Hursey
063f5b2ff6 After talking through the patch with Jeff, we have a couple more fixes to r21766 that should also go over to v1.3 in Ticket #1987.
* Check for {{{dlfcn.h}}} in the self component's configure.m4 (also clean up the .m4 a bit.
 * Adjust the priority of the BLCR component so that the self component has a higher priority (if the application went to the trouble of writing the routines, why not use them.) The 'self' component checks for the appropriate functions during query, so it know if it -can- be used during component selection.
 * Adjust some copyrights that I missed before
 * Fix a warning when casing the result of dlsym() into a function pointer. There is a bit of pointer magic to make this happen (thanks to the following website, and RedHat EL 4 man pages for illustrating it:
  http://www.opengroup.org/onlinepubs/009695399/functions/dlsym.html

Passing to Jeff for a final review of the patch before moving to v1.3.

This commit was SVN r21768.

The following SVN revision numbers were found above:
  r21766 --> open-mpi/ompi@91e52d062b
2009-08-05 22:07:37 +00:00
Josh Hursey
91e52d062b Fix the 'self' CRS component.
Due to the visibility patch to libltdl in r21731, this module can no longer access or use the libltdl interfaces directly. Instead just use the dlopen/dlsym/dlclose functions directly. This is a portability implication here, but for the moment it does not seem to bite us.

Also in this patch, cleanup some of the 'self' specific code paths.
 * opal-restart need not special case the 'self' component since it can now interact with it as if it were a normal component.
 * Cleanup the initialization of the cmd line arguments in opal-restart.
 * Make sure to mark opal-restart as a 'tool', but do so by setting the global variable directly instead of setting the environment variable, which could be inherited by the application.
 * Most of the functions in the 'self' component should not be used by a command line tool (exception being 'restart'), so make sure that if we accidently call them then errors are returned.
 * Increase the priority of the 'none' component to be above that of 'self' when being selected in a command line tool. This allows for both mpirun and opal-restart to work correctly with the 'self' module.

This commit was SVN r21766.

The following SVN revision numbers were found above:
  r21731 --> open-mpi/ompi@0278b86456
2009-08-05 16:21:51 +00:00
George Bosilca
668f001351 Add support for complex (complex8, complex16 and complex32) in the
opal datatype engine. These types are created from two consecutive 
basic types (float, double and long double).

This commit was SVN r21763.
2009-08-04 23:42:32 +00:00
Jeff Squyres
90d6491737 Since ompi_info is no longer written in C++, we no longer require a
C++ compiler in configure.  If we have a C++ compiler, then the MPI
C++ bindings are built by default.  If we don't have a C++ compiler,
then the MPI C++ bindings are not built by default.

--enable-mpi-cxx will now force an error if there is no C++ compiler
available.  --disable-mpi-cxx (or the lack of a C++ compiler) will now
disable many of the C++ compiler checks in configure.

Note that there are a few items to clean up regarding the difference
between C's _Bool type and C++'s bool type.  Right now, we assume that
they are the same.  But they aren't, and they shouldn't be treated as
such.  This cleanup will be forced in MPI-2.2 with the introduction of
the MPI_C_BOOL MPI datatype.

This commit was SVN r21755.
2009-08-04 11:54:01 +00:00
George Bosilca
5155eaf945 The opal datatype engine should _ALWAYS_ be initialized. Therefore move the
call to opal_datatype_init in the opal_util_init.

This commit was SVN r21754.
2009-08-03 16:46:33 +00:00
Jeff Squyres
98b0a7af3d Per http://www.open-mpi.org/community/lists/devel/2009/08/6555.php,
add an lt_dlerror() in the error output one of the error cases.

This commit was SVN r21751.
2009-08-03 16:29:52 +00:00
Jeff Squyres
9455eb804f Oops -- fix the type.
This commit was SVN r21750.
2009-08-03 16:26:15 +00:00
Jeff Squyres
7b1f65095b Update to match the current code and be a bit more explicit (since
others are currently looking at this code).

This commit was SVN r21746.
2009-07-30 12:45:59 +00:00
Jeff Squyres
cb653bc4e8 Change the test memalign() call to use an alignment of 4 so that some
debuggers stop complaining. :-)

This commit was SVN r21744.
2009-07-29 20:33:38 +00:00
Jeff Squyres
69139e4171 Print a warning if someone tries to set opal_ptmalloc2_disable via an
MCA parameter file.

This commit was SVN r21743.
2009-07-29 20:05:56 +00:00
Jeff Squyres
d12db20089 This function actually returns an int, not a bool (OPAL_SUCCESS or
OPAL_ERROR).  Also add a line to the docs describing that it's ok to
pass in NULL for the source_file.

This commit was SVN r21742.
2009-07-29 19:52:18 +00:00
Jeff Squyres
c7376ae053 Start using Libtool's shared library versioning scheme. See lengthy
note in VERSION file.

NOTE: the versions will ''always'' be 0:0:0 on the SVN trunk and
developer branches.  They will only have meaningful values (starting
with 0:0:0 in 1.3.4) on release branches.  Only RM's will modify these
values immediately preceeding a release.

This commit was SVN r21729.
2009-07-23 21:35:17 +00:00
Ralph Castain
c459615f8f When someone specifies a rank-file slot-list of N:*, stop the loop at the proper place (we were going through the loop one too many times).
Thanks to Eugene for spotting it.

This commit was SVN r21728.
2009-07-23 17:51:15 +00:00
Jeff Squyres
bfd689f0ef Per discussion on the mailing list, backing out r21723.
This commit was SVN r21725.

The following SVN revision numbers were found above:
  r21723 --> open-mpi/ompi@2250be582d
2009-07-22 00:02:00 +00:00
Jeff Squyres
38faae6eab Ignore this component until it can be fixed properly.
This commit was SVN r21724.
2009-07-21 22:35:45 +00:00
Iain Bason
2250be582d Added autodetect installdirs component. Currently supports Solaris and Linux.
* Installation directories will be inferred from the actual location
  of the shared library that contains the component.

* OPAL_PREFIX and other environment variables allow users to override
  the inferred directories.  They should no longer be necessary in
  most cases, though.

* Any directories that cannot be inferred will fall back to whatever
  is provided by the config installdirs component.

This commit was SVN r21723.
2009-07-21 20:19:38 +00:00
Shiqing Fan
4c3ea7a0d2 Add a missing header for defining O_BINARY on windows.
This commit was SVN r21719.
2009-07-20 09:00:58 +00:00
Aurelien Bouteiller
9cc2557f9c fixed a typo. oapl instead of opal
This commit was SVN r21711.
2009-07-17 22:20:53 +00:00
George Bosilca
03a0b04ab8 Get rid of all unused functions.
This commit was SVN r21704.
2009-07-16 19:53:07 +00:00
George Bosilca
3e971e61f3 The system headers are supposed to be protected by #ifdef and not by #if.
This commit was SVN r21700.
2009-07-16 18:27:33 +00:00
Shiqing Fan
957cdceb20 Another OMPI->OPAL renaming.
This commit was SVN r21659.
2009-07-14 06:54:17 +00:00
Shiqing Fan
f991c87c6a A few structures that need to be exported for Windows.
This commit was SVN r21658.
2009-07-14 06:53:42 +00:00
George Bosilca
a934b9d975 Add the Open MPI specific part based on a patch from Manuel. Add the
sparc and alpha. A manpage patch is also included. This partially fixes
ticket #1973.

This commit was SVN r21654.
2009-07-13 20:01:12 +00:00
George Bosilca
81c16ac317 Add missing header.
This commit was SVN r21653.
2009-07-13 19:40:31 +00:00
Shiqing Fan
503f2817b3 Corresponding changes to r21641 and r21642 for Windows.
- Add a CMake macro for checking OPAL_MAX_XXX values, re-written from OPAL_WITH_OPTION_MIN_MAX_VALUE m4 function. 
- Definition prefix changes and additional datatype alignments checking.
- Finish the datatype splitting on Windows too. :-)

This commit was SVN r21649.

The following SVN revision numbers were found above:
  r21641 --> open-mpi/ompi@6c5532072a
  r21642 --> open-mpi/ompi@c971c09eb6
2009-07-13 17:39:41 +00:00
Shiqing Fan
2f552eb8c1 Missing C_DECLS.
This commit was SVN r21648.
2009-07-13 17:31:42 +00:00
Rainer Keller
6c5532072a - Split the datatype engine into two parts: an MPI specific part in
OMPI
   and a language agnostic part in OPAL. The convertor is completely
   moved into OPAL.  This offers several benefits as described in RFC
   http://www.open-mpi.org/community/lists/devel/2009/07/6387.php
   namely:
    - Fewer basic types (int* and float* types, boolean and wchar
    - Fixing naming scheme to ompi-nomenclature.
    - Usability outside of the ompi-layer.
 - Due to the fixed nature of simple opal types, their information is
   completely
   known at compile time and therefore constified
 - With fewer datatypes (22), the actual sizes of bit-field types may be
   reduced
   from 64 to 32 bits, allowing reorganizing the opal_datatype
   structure, eliminating holes and keeping data required in convertor
   (upon send/recv) in one cacheline...
   This has implications to the convertor-datastructure and other parts
   of the code.
 - Several performance tests have been run, the netpipe latency does not
   change with
   this patch on Linux/x86-64 on the smoky cluster.
 - Extensive tests have been done to verify correctness (no new
   regressions) using:
   1. mpi_test_suite on linux/x86-64 using clean ompi-trunk and
    ompi-ddt:
    a. running both trunk and ompi-ddt resulted in no differences
       (except for MPI_SHORT_INT and MPI_TYPE_MIX_LB_UB do now run
       correctly).
    b. with --enable-memchecker and running under valgrind (one buglet
       when run with static found in test-suite, commited)
   2. ibm testsuite on linux/x86-64 using clean ompi-trunk and ompi-ddt:
      all passed (except for the dynamic/ tests failed!! as trunk/MTT)
   3. compilation and usage of HDF5 tests on Jaguar using PGI and
      PathScale compilers.
   4. compilation and usage on Scicortex.
 - Please note, that for the heterogeneous case, (-m32 compiled
   binaries/ompi), neither
   ompi-trunk, nor ompi-ddt branch would successfully launch.

This commit was SVN r21641.
2009-07-13 04:56:31 +00:00
Jeff Squyres
d7d07e0720 Improve the help messages from r20706.
This commit was SVN r21616.

The following SVN revision numbers were found above:
  r20706 --> open-mpi/ompi@248bbb8a2f
2009-07-09 11:58:31 +00:00
George Bosilca
2570a15651 Add a TODO bullet for later processing ...
This commit was SVN r21611.
2009-07-07 17:27:47 +00:00
George Bosilca
b85e3636f3 Cope with the case where IPv6 headers are not available.
This commit was SVN r21593.
2009-07-02 18:00:26 +00:00
Ralph Castain
d3fb39073f Initialize a variable to ensure we get the correct number of bound processors
This commit was SVN r21590.
2009-07-02 17:48:04 +00:00
Shiqing Fan
0e09cb650e The kernel index of the network interface wasn't set on Windows, it really caused a lot of problems.
This commit was SVN r21587.
2009-07-02 14:44:41 +00:00
Shiqing Fan
22666721a5 Fix a typo.
This commit was SVN r21584.
2009-07-02 08:49:22 +00:00
Shiqing Fan
c8284907c4 Another header for using _getch on Windows.
This commit was SVN r21537.
2009-06-26 13:07:08 +00:00
Shiqing Fan
77f33d182b Add another definition for deprecated UNIX function on Windows, silent the warnings.
This commit was SVN r21507.
2009-06-24 18:14:23 +00:00
Jeff Squyres
246caafe06 Correct the logic of the check for the env variable
OMPI_MCA_memory_ptmalloc2_disable and also add an explicit check for
FAKEROOTKEY (see http://bugs.debian.org/531522).

This commit was SVN r21489.
2009-06-20 11:22:06 +00:00
Jeff Squyres
f42727707b Per http://bugs.debian.org/531522, add an MCA param/environment
variable to allow the disabling of the ptmalloc2 component at init
time.

This commit was SVN r21479.
2009-06-19 10:50:23 +00:00
Jeff Squyres
6777f01380 Also look for /dev/ipath
This commit was SVN r21410.
2009-06-11 00:35:21 +00:00
Ralph Castain
f966d9f972 Fix visibility issues with opal_graph functions.
Fix the carto test so it can compile - need to update input file so it can run

This commit was SVN r21403.
2009-06-09 15:02:57 +00:00
Ralph Castain
c3c1ab1337 Correct a comment in paffinity.h about what paffinity_get returns - it was inaccurate.
Revamp the affinity detection/set procedure in mpi_init to correctly detect when we have already been bound to processors, given the revised understanding of paffinity_get. Add a new paffinity macro to make checking for already bound a little nicer.

This commit was SVN r21402.
2009-06-09 14:33:35 +00:00
Josh Hursey
70333b9441 Some components were still using OMPI_*_VERSION instead of OPAL_*_VERSION, so convert them over (Jeff is taking care of PLPA, so that is not included here).
This commit was SVN r21384.
2009-06-05 15:34:59 +00:00
Shiqing Fan
e46bf10efd Correctly include win32 util header.
This commit was SVN r21343.
2009-06-01 19:16:00 +00:00
Rainer Keller
b572dc3591 - As discussed revert r21330, Fortran-configure info should
not end up in OPAL
 - Will post an updated patch for the OMPI_ALIGNMENT_ parts (within C).

This commit was SVN r21342.

The following SVN revision numbers were found above:
  r21330 --> open-mpi/ompi@95596d1814
2009-06-01 19:02:34 +00:00
Rainer Keller
95596d1814 - Move alignment and size output generated by configure-tests
into the OPAL namespace, eliminating cases like opal/util/arch.c
   testing for ompi_fortran_logical_t.
   As this is processor- and compiler-related information
   (e.g. does the compiler/architecture support REAL*16)
   this should have been on the OPAL layer.
 - Unifies f77 code using MPI_Flogical instead of opal_fortran_logical_t

 - Tested locally (Linux/x86-64) with mpich and intel testsuite
   but would like to get this week-ends MTT output


 - PLEASE NOTE: configure-internal macro-names and
   ompi_cv_ variables have not been changed, so that
   external platform (not in contrib/) files still work.

This commit was SVN r21330.
2009-05-30 15:54:29 +00:00
Shiqing Fan
06841bf721 Fix a typo.
This commit was SVN r21305.
2009-05-27 19:08:46 +00:00
Shiqing Fan
ed61198423 Rewrite inet_ntop and inet_pton with WSAAddressToString and WSAStringToString for Windows. Decrease the minimum requirement for Windows Version.
This commit was SVN r21304.
2009-05-27 19:04:59 +00:00
Shiqing Fan
b3c077e112 Microsoft doesn't provide inet_pton and inet_ntop APIs on Windows XP, but only on Windows Vista and 2008. So add a stand alone version of inet_pton and inet_ntop functions from ISC.
This commit was SVN r21295.
2009-05-27 14:32:30 +00:00
Jeff Squyres
510ec093b6 Remove some dead code
This commit was SVN r21275.
2009-05-26 21:17:00 +00:00
Iain Bason
e7ff2368d6 This fixes trac:1930.
Emit a more informative error message when the file descriptor limit is
reached during an accept() call.  Also, abort when the accept fails to
avoid an infinite loop.

Emit a more informative error message when the help file can't be opened.

This commit was SVN r21271.

The following Trac tickets were found above:
  Ticket 1930 --> https://svn.open-mpi.org/trac/ompi/ticket/1930
2009-05-26 20:03:21 +00:00
Rainer Keller
36ee105d6a - Fix Coverity CID #1207:
set the tmp_str to NULL, so we don't have any double-free...
   Additionally, we should check for malloc returning NULL...

This commit was SVN r21228.
2009-05-14 00:21:15 +00:00
Rainer Keller
3f7f2b6f0f - Multiple functions, that allocate and return new
strings, aka should have __opal_attribute_malloc__
   update comment of opal_path_access -- new string returned.

This commit was SVN r21227.
2009-05-14 00:20:07 +00:00
Rainer Keller
73fd329cbd - Add the proper __opal_attribute_format__(__printf__...) to
declarations.

This commit was SVN r21226.
2009-05-14 00:10:59 +00:00
Shiqing Fan
3137001772 Read from the correct registry entry on Windows Vista and Server 2008.
This commit was SVN r21224.
2009-05-13 15:56:37 +00:00
Ralph Castain
aa25a51c92 Do not mark the mpi_paffinity_alone param as deprecated so we don't scare Jeff...er...users.
This commit was SVN r21218.
2009-05-12 15:41:11 +00:00
Josh Hursey
ec6c5bf5e9 Make sure that when we destruct the pointer array that we set the address to NULL and size to 0. This will help to flag accidental usage of a destructed pointer array object.
This commit was SVN r21216.
2009-05-12 14:13:07 +00:00
Jeff Squyres
05d87ee7b4 Because this error comes up over and over and over and over and ...
Libltdl erroneously returns an error string of "file not found" for
lots of reasons, even if the file really *is* there, but just failed
to dlopen() for some reason.  So if lt_dlerror() returns "file not
found", do some simple hueristics and if we *do* find a file, print a
slightly better error message.

This commit was SVN r21214.
2009-05-12 12:41:42 +00:00
Ralph Castain
d396f0a6fc Per the discussion on the devel list, move the binding of processes to processors from MPI_Init to process start. This involves:
1. replacing mpi_paffinity_alone with opal_paffinity_alone - for back-compatibility, I have aliased mpi_paffinity_alone to the new param name. This caus
es a mild abstraction break in the opal/mca/paffinity framework - per the devel discussion...live with it. :-) I also moved the ompi_xxx global variable
 that tracked maffinity setup so it could be properly closed in MPI_Finalize to the opal/mca/maffinity framework to avoid an abstraction break.

2. Added code to the odls/default module to perform paffinity binding and maffinity init between process fork and exec. This has been tested on IU's odi
n cluster and works for both MPI and non-MPI apps.

3. Revise MPI_Init to detect if affinity has already been set, and to attempt to set it if not already done. I have *not* tested this as I haven't yet f
igured out a way to do so - I couldn't get slurm to perform cpu bindings, even though it supposedly does do so.

This has only been lightly tested and would definitely benefit from a wider range of evaluation...

This commit was SVN r21209.
2009-05-12 02:18:35 +00:00
Josh Hursey
d920a302f3 Some more C/R related commits that have been sitting off-trunk for a while.
* Pass the sequence number of the checkpoint along with reference from the global to the local coordinator.
 * 'orte-restart --apponly' now just generates the app context file, and does not run with it. This provides the user the ability to edit the file before launching. 
 * Add a OPAL_CRS_NONE state
 * Split the INC into three distinct parts.
 * Implement a restart mechanism for the 'none' component. If given a context it simply execvp()'s it.

This commit was SVN r21195.
2009-05-08 20:51:13 +00:00
Josh Hursey
5d0607395d A couple of C/R related commits that have been sitting off-trunk for a while.
* Add 'orte-checkpoint -l' option that lists all checkpoints currently available on the system.
 * Add 'orte-restart -i' which prints information regarding the checkpoint targeted for restart.
 * Add ability to extract the timing metadata.
 * Fix show_help() in the orte-checkpoint and orte-restart tools. They should be using the opal versions instead of the orte versions (otherwise nothing is printed).

This commit was SVN r21194.
2009-05-08 19:41:11 +00:00
Shiqing Fan
537b8cd8b1 Get rid of improper use of SET_SOURCE_FILES_PROPERTIES. When using the latest CMake (2.6 patch 4), we get many errors, which didn't show in previous version.
This commit was SVN r21188.
2009-05-07 17:41:05 +00:00
Rainer Keller
b0754071b7 - For compilation with BLCR and --with-ft=cr, #include <string.h>
This commit was SVN r21185.
2009-05-07 16:14:59 +00:00
Greg Koenig
60485ff95f This is a very large change to rename several #define values from
OMPI_* to OPAL_*.  This allows opal layer to be used more independent
from the whole of ompi.

NOTE: 9 "svn mv" operations immediately follow this commit.

This commit was SVN r21180.
2009-05-06 20:11:28 +00:00
Rainer Keller
b8bb7865bc - Fix Coverity CID 9
Check for error in fcntl, as we depend on close-on-exec,
   F_SETFD will result in -1 in case of error (stored in errno).
   To not have a follow-up warning about not freeing filename, move up.

This commit was SVN r21171.
2009-05-05 19:04:58 +00:00
Shiqing Fan
cd565923d3 Completely remove ltdl support for Windows build.
This commit was SVN r21170.
2009-05-05 18:59:13 +00:00
Josh Hursey
8b8bee04d6 It seems that some of the patches were missed in r21131. :(
This patch contains the following items:
 * Fix the flag passed to open() for the read side of the named pipe between the local and app coordinator. There is a race condition when using O_RDWR on a named pipe (not sure how that bug got in there in the first place).
 * Adjust control in the C/R thread timing
 * Clarify return code in BLCR component
 * Allow the user to adjust the max wait time for the named pipes in the FileM local coordinator by using the MCA parameter "snapc_full_max_wait_time" (Default: 20 seconds)
 * If the application terminates while there are active FileM operations, force mpirun to wait on these operations to complete.
 * Allow the user to set the local copy command (Default: cp) via MCA parameter "filem_rsh_cp"
 * Implement the ability to throttle the number of outgoing connections in FileM. At larger scales this type of explicit throttling helps prevent overwhelming the HNP machine. Default: 10, set via MCA parameter: {{{filem_rsh_max_outgoing}}}

This commit was SVN r21167.

The following SVN revision numbers were found above:
  r21131 --> open-mpi/ompi@0deb009225
2009-05-05 16:45:49 +00:00
Shiqing Fan
c3380e9df2 put all generated files in the binary directory.
This commit was SVN r21160.
2009-05-05 13:50:48 +00:00
Shiqing Fan
8db5c3c002 Add missing quotation marks to the variables, in order to keep the semicolons in the output c file.
This commit was SVN r21151.
2009-05-05 08:29:19 +00:00
Ralph Castain
468800996b Make it possible to no-build the carto framework
Could swear we had done this before...but I guess not!

This commit was SVN r21150.
2009-05-05 03:54:58 +00:00
Josh Hursey
1327c57e9d add back a missing header
This commit was SVN r21148.
2009-05-04 21:30:11 +00:00
Shiqing Fan
5856cedc2b Remove libltdl related files and folders.
Add a find module for libltdl, so that user can still enable dlopen support (default off), and use natively installed libtool.

This commit was SVN r21146.
2009-05-04 17:35:48 +00:00
Ralph Castain
e1673778be Replace missing headers
This commit was SVN r21136.
2009-05-01 15:09:10 +00:00
Josh Hursey
38aca518bd Properly initialize this variable
This commit was SVN r21130.
2009-04-30 16:43:05 +00:00
Josh Hursey
ab63ab6568 forgot to update the copyright
This commit was SVN r21128.
2009-04-30 16:39:54 +00:00
Josh Hursey
76812318bb Fix a potential NULL reference
This commit was SVN r21127.
2009-04-30 16:38:43 +00:00
Josh Hursey
759c2b5596 Add a 'crs_blcr_dev_null' MCA parameter. This causes BLCR to checkpoint directly to /dev/null instead of to a file.
Though this is not useful in checkpointing an application, it can be a useful diagnostic.

This commit was SVN r21125.
2009-04-30 16:32:55 +00:00
Shiqing Fan
ff0e51f686 Include a missing header.
This commit was SVN r21121.
2009-04-30 09:03:21 +00:00
Shiqing Fan
fbaa30bf61 Add a few log file definitions for Windows.
This commit was SVN r21119.
2009-04-30 08:59:46 +00:00
Shiqing Fan
e7b6445b32 Add a missed .windows file into the tarball.
This commit was SVN r21105.
2009-04-29 10:31:10 +00:00
Rainer Keller
221fb9dbca ... Delayed due to notifier commits earlier this day ...
- Delete unnecessary header files using
   contrib/check_unnecessary_headers.sh after applying
   patches, that include headers, being "lost" due to
   inclusion in one of the now deleted headers...

   In total 817 files are touched.
   In ompi/mpi/c/ header files are moved up into the actual c-file,
   where necessary (these are the only additional #include),
   otherwise it is only deletions of #include (apart from the above
   additions required due to notifier...)

 - To get different MCAs (OpenIB, TM, ALPS), an earlier version was
   successfully compiled (yesterday) on:
   Linux locally using intel-11, gcc-4.3.2 and gcc-SVN + warnings enabled
   Smoky cluster (x86-64 running Linux) using PGI-8.0.2 + warnings enabled
   Lens cluster (x86-64 running Linux) using Pathscale-3.2 + warnings enabled

This commit was SVN r21096.
2009-04-29 01:32:14 +00:00
Rainer Keller
6c1cce8761 - For the upcoming header cleanup commit,
several header files (previously included by header-files)
   now have to be moved "upward".
   This is mainly system headers such as string.h, stdio.h and for
   networking, but also some orte headers.

This commit was SVN r21095.
2009-04-29 00:49:23 +00:00
Josh Hursey
84471e4bd4 Protect the free of base->sig.sh_old . This was raising a warning under some finalization circumstances.
This should be moved to v1.3

Thanks to Yaakoub El Khamra for the patch. 

This commit was SVN r21079.
2009-04-27 17:05:10 +00:00
Shiqing Fan
3d4e0472d6 Add windows support files into the tarball, including .windows, CMakeLists.txt files, and CMake modules. Thanks to Jeff for testing it on Linux.
This commit was SVN r21069.
2009-04-24 16:39:33 +00:00
Jeff Squyres
0bd9ef0bb9 Some valgrind-clean fixes. Thanks to "Number Cruncher" on the devel
list for pointing these out.

This commit was SVN r21060.
2009-04-23 18:50:46 +00:00
Jeff Squyres
9b5e194d9b Fix opal_basename. Wow; how long has that been broken?
This commit was SVN r21057.
2009-04-22 20:48:24 +00:00
Brian Barrett
2ca0b7fe44 remove some checks which are not needed after the recent ptmalloc2 changes
This commit was SVN r21042.
2009-04-19 18:17:05 +00:00
Jeff Squyres
e90ecb6020 Fix a compiler warning. Put in a good comment explaining why it is
declared the way it is.  Sigh.

This commit was SVN r21040.
2009-04-17 21:59:31 +00:00
Ralph Castain
afe1950da5 Make the error message clearer - this error only is used when two buffer types don't match, thus preventing an operation from being executed
This commit was SVN r21033.
2009-04-16 16:23:28 +00:00
Jeff Squyres
35fc9fedd2 MTT is your friend: Cisco tests --enable-static --disable-shared, but
we had already tested this scenario manually to know that it seemed to
be working.  What we ''didn't'' test was --enable-static
--disable-shared --disable-dlopen -- but my MTT '''did.'''  Yay!

This commit fixes that scenario.  Essentially we need to call a dummy
function in hooks.c to ensure that the linker pulls in all those
symbols into the final executable (and therefore pulls in the
malloc_initialize_hook, etc.).  Thanks for the heads-up from Brian in
fixing this one!

This commit was SVN r21022.
2009-04-15 19:09:10 +00:00
Ralph Castain
9c39a3edd7 Enable the passing of MCA params to dynamically spawned jobs. This creates a new info_key "ompi_param" that allows a user to specify MCA params for a dynamically spawned job.
We currently apply all of the MCA params in the parent job to the child. This commit allows a user to specify additional params for the child job, and to override any pre-existing params with the new value so they can better control behavior of the child job.

This commit was SVN r20989.
2009-04-14 14:15:49 +00:00
Shiqing Fan
0ea6d48320 Add a missed .windows file for timer component, which should be built always statically.
This commit was SVN r20987.
2009-04-14 12:19:21 +00:00
Jeff Squyres
3cfa8f55c4 Gaah; I meant to include a better comment in the last commit but had
forgotten to save before the commit was sent.

This comment explains why we're doing a cache check here rather than a
real check.

This commit was SVN r20975.
2009-04-10 21:16:23 +00:00
Jeff Squyres
9fcd01035d Fix a problem reported by Steve Kagl on the user's list; the posix
component (which we probably don't test regularly because we probably
only test environments where the other paffinity components are used)
was not getting built because it had a bad configure test.

This commit was SVN r20974.
2009-04-10 21:15:20 +00:00
Jeff Squyres
8fac195a3a Fixes trac:1871. Take a slightly different approach than before:
1. Probe the signal number that we want
 1. If a handler is already installed there:
    1. if the opal_signal MCA param entry for this signal is suffixed
       with ":complain", then output a show_help message
    1. do not install our signal handler
 1. otherwise, install our signal handler

Hence, we've shifted to a policy of only complaining if the user asks
us to complain.

This commit was SVN r20969.

The following Trac tickets were found above:
  Ticket 1871 --> https://svn.open-mpi.org/trac/ompi/ticket/1871
2009-04-10 15:32:33 +00:00
Jeff Squyres
500750b542 Oops; the MCA param name is "opal_signal", not "opal_signals". Thanks
for noticing, Ralph!

This commit was SVN r20968.
2009-04-09 16:13:07 +00:00
Nysal Jan
2500d88380 Optimize the computation of 16-bit checksum
This commit was SVN r20965.
2009-04-09 11:04:38 +00:00
Shiqing Fan
6e04a4de08 On Windows, define a equivalent type for in_addr_t, and correctly include unistd.h.
This commit was SVN r20951.
2009-04-07 16:07:05 +00:00
Jeff Squyres
a13dfb2140 Add in a proper test for munmap.
This commit was SVN r20936.
2009-04-04 00:43:17 +00:00
Jeff Squyres
52a0e5fe69 Add some checks for more network driver types.
This commit was SVN r20934.
2009-04-02 19:17:21 +00:00
Nysal Jan
ab18a3629f Change the return type to handle the case where an invalid interface name is passed to this function.
This commit was SVN r20933.
2009-04-02 18:35:09 +00:00
Jeff Squyres
3bf8c7025a Remove compiler warning about function not being prototyped.
This commit was SVN r20929.
2009-04-02 13:06:47 +00:00
Jeff Squyres
0f517c3d3f Gah; some non-final code got merged in by accident. Remove debugging
printf and put in the final test code for malloc.

This commit was SVN r20924.
2009-04-01 18:20:23 +00:00
Jeff Squyres
bf17ce1d3f Doh; forgot to add the OPAL_DECLSPEC to munmap().
This commit was SVN r20923.
2009-04-01 18:09:25 +00:00
Jeff Squyres
7aa431882c Remove the mallopt component (accidentally missed in r20921); refs
#1853.

This commit was SVN r20922.

The following SVN revision numbers were found above:
  r20921 --> open-mpi/ompi@0d52271cd6
2009-04-01 18:02:08 +00:00
Jeff Squyres
0d52271cd6 Per http://www.open-mpi.org/community/lists/announce/2009/03/0029.php
and https://svn.open-mpi.org/trac/ompi/ticket/1853, mallopt() hints do
not always work -- it is possible for memory to be returned to the OS
and therefore OMPI's registration cache becomes invalid.

This commit removes all use of mallopt() and uses a different way to
integrate ptmalloc2 than we have done in the past.  In particular, we
use almost exactly the same technique as MX:

 * Remove all uses of mallopt, to include the opal/memory mallopt
   component.
 * Name-shift all of OMPI's internal ptmalloc2 public symbols (e.g.,
   malloc -> opal_memory_ptmalloc2_malloc).
 * At run-time, use the existing glibc allocator malloc hook function
   pointers to fully hijack the glibc allocator with our own
   name-shifted ptmalloc2.
 * Make the decision whether to hijack the glibc allocator ''at run
   time'' (vs. at link time, as previous ptmalloc2 integration
   attempts have done).  Look at the OMPI_MCA_mpi_leave_pinned
   and OMPI_MCA_mpi_leave_pinned_pipeline environment variables and
   the existence of /sys/class/infiniband to determine if we should
   install the hooks or not.
 * As an added bonus, we can now tell if libopen-pal is linked
   statically or dynamically, and if we're linked statically, we
   assume that munmap intercept support doesn't work.

See the opal/mca/memory/ptmalloc2/README-open-mpi.txt file for all the
gory details about the implementation.

Fixes trac:1853.

This commit was SVN r20921.

The following Trac tickets were found above:
  Ticket 1853 --> https://svn.open-mpi.org/trac/ompi/ticket/1853
2009-04-01 17:52:16 +00:00
Jeff Squyres
b7a052a81d Fix r20917 -- some systems require <unistd.h> for !_SC_PAGESIZE.
This commit was SVN r20920.

The following SVN revision numbers were found above:
  r20917 --> open-mpi/ompi@d10393a925
2009-04-01 16:41:12 +00:00
George Bosilca
b7c1ae4f76 Nothing important, just an identation.
This commit was SVN r20919.
2009-04-01 15:27:16 +00:00
George Bosilca
d10393a925 A more optimized version of the set function. It only touch the first byte on
each page. Anyway, this function is _NEVER_ called as we use bind instead of set.
So please don't rely on the first touch memory affinity to do the right thing.

This commit was SVN r20917.
2009-04-01 15:24:03 +00:00
George Bosilca
6ca6cfaafc Mark the address with the correct type.
This commit was SVN r20916.
2009-04-01 15:18:08 +00:00
Terry Dontje
4b43911c6a Remove superfluous spaces in manpages that were causing catman to
generate mangled windex files.  Made ompi-top.1 and ompi-iof.1 build
by default.  Also, added the orte-top synonym to the ompi-top manpage.

This commit was SVN r20915.
2009-04-01 14:40:27 +00:00
Shiqing Fan
36a813415d When build from a tarball, there will be Linux-generated files that could not be used on Windows, so exclude them, and use the ones generated by CMake.
This commit was SVN r20858.
2009-03-24 18:10:57 +00:00
Ralph Castain
17f51a0389 Add a new PML module that acts as a "mini-dr" - when requested, it performs a dr-like checksum on messages for BTL's that require it, as specified by MCA params.
Add two new configure options that specify:

1. when to add padding to the openib control header - this *only* happens when the configure option is specified

2. when to use the dr-like checksum as opposed to the memcpy checksum. Not selectable at runtime - to eliminate performance impacts, this is a configure-only option

Also removed an unused checksum version from opal/util/crc.h.

The new component still needs a little cleanup and some sync with recent ob1 bug fixes. It was created as a separate module to avoid performance hits in ob1 itself, though most of the code is duplicative. The component is only selectable by either specifying it directly, or configuring with the dr-like checksum -and- setting -mca pml_csum_enable_checksum 1.

Modify the LANL platform files to take advantage of the new module.

This commit was SVN r20846.
2009-03-23 23:52:05 +00:00