1356 Commits

Author SHA1 Message Date
Brian Barrett
86d8356b13 Updates to allow OMPI to build on Cray XT platforms running Catamount
This commit was SVN r22381.
2010-01-07 18:14:03 +00:00
Jeff Squyres
dbb29663e8 Update the embedded PLPA version to v1.3.2. Since this is a 3rd
party/"vendor" import, the changes are actually far smaller than the
size of this changeset implies.  Here's a list of the changes:

 * Update the AMD license header in plpa_map.c to be less restrictive
   (see https://svn.open-mpi.org/trac/plpa/changeset/262 for details)
   -- '''this is the most/only important change of this update.'''  No
   code is changed by this; only removing a clause from a license
   header in plpa_map.c.
 * Changes to the generated {{{configure}}}, {{{config.guess}}}, and
   {{{config.sub}}} scripts (which aren't used by OMPI).
 * soname version tracking changes (which also aren't used by OMPI;
   they're only used when PLPA is built/installed in "standalone"
   mode).
 * Update the "get version" m4 (which was stolen from OMPI's m4 to
   begin with, and is only used during OMPI's autogen.sh step).
 * Update various PLPA version numbers to 1.3.2.
 * Bug fix in plpa-taskset (which is not built in the OMPI PLPA build).

This commit was SVN r22367.
2010-01-06 00:44:14 +00:00
Ralph Castain
fad1ba15b0 Move the test for case-sensitive file system from ompi to opal so that all layers can have that knowledge.
Use that for the orte wrapper compilers

This commit was SVN r22348.
2009-12-29 23:26:45 +00:00
Josh Hursey
313acba4ce Move the mca_base_is_component_required() functionality to mca/base per suggestion so that it can be reused in other components.
This commit was SVN r22327.
2009-12-17 15:12:26 +00:00
George Bosilca
7ba371cd92 Correct the atomics on x86 and x86_64. Thanks to Iain for the catch,
and to Eugene, Jeff, and Brian for the help. This patch is supposed to
fix several outstanding issues, notably the one in ticket #2043.

This commit was SVN r22324.
2009-12-16 22:34:56 +00:00
George Bosilca
9fa5f1d7a8 Spaces instead of tabs.
This commit was SVN r22314.
2009-12-15 18:07:33 +00:00
Josh Hursey
6e584c151f We need to check the value of {{{opal_crs_base_metadata_read_token}}} since it may segv if we have a malformed metadata file.
Bug found by Sergio Diaz Montes:
  http://www.open-mpi.org/community/lists/users/2009/11/11176.php

This commit was SVN r22290.
2009-12-09 18:41:56 +00:00
Josh Hursey
e8de64d5a0 Make sure that we release the components that do not qualify for selection. These components are never really opened, so we never need to close them.
This will need to be applied to v1.4 and v1.5, CMRs to follow.

This commit was SVN r22288.
2009-12-09 15:45:53 +00:00
Rainer Keller
787538ae38 Correct the spelling, and try cmr:v1.5 This should succeed
This commit was SVN r22280.
2009-12-08 18:46:46 +00:00
Ralph Castain
70e385bcab Picky, picky, picky...the a-retentive amongst us wants the default value to show in ompi_info! Of all the nerve...
:-)

Okay, cleanup the prior commit so that the default component search path shows in ompi_info, and remains available in component_find.

This commit was SVN r22278.
2009-12-08 17:32:22 +00:00
George Bosilca
e55f89dda7 Add a format to stop the complaining.
This commit was SVN r22276.
2009-12-08 17:26:42 +00:00
Ralph Castain
703ec3d6ce Some minor cleanups to the handling of multi-path component find
This commit was SVN r22275.
2009-12-08 09:34:49 +00:00
Ralph Castain
0b654ba4dc Extend the mca_component_path param usage by allowing a user to add paths to the default system and user ones defined in the program. Thus, the user can specify a param value of:
"my_perfect_path":SYSTEM_DEFAULT:USER_DEFAULT

and OPAL will substitute its internally derived values for the defaults (instead of forcing the user to figure them out).

This commit was SVN r22272.
2009-12-07 20:29:28 +00:00
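
Note on r22272: the token substitution described in that message can be sketched roughly as follows. This is a minimal illustration, not OPAL's code. The function names, the buffer-based interface, and both default directories are hypothetical stand-ins; OPAL derives the real defaults internally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: OPAL derives the real defaults internally. */
#define SKETCH_SYSTEM_DEFAULT "/opt/openmpi/lib/openmpi"
#define SKETCH_USER_DEFAULT   "/home/user/.openmpi/components"

/* Return the replacement for one path element, or NULL to keep it as-is. */
const char *expand_token_sketch(const char *elem, size_t len)
{
    if (len == strlen("SYSTEM_DEFAULT") &&
        strncmp(elem, "SYSTEM_DEFAULT", len) == 0) {
        return SKETCH_SYSTEM_DEFAULT;
    }
    if (len == strlen("USER_DEFAULT") &&
        strncmp(elem, "USER_DEFAULT", len) == 0) {
        return SKETCH_USER_DEFAULT;
    }
    return NULL;
}

/*
 * Expand SYSTEM_DEFAULT and USER_DEFAULT inside a colon-separated list.
 * Writes the result into out (capacity out_len) and returns 0 on success,
 * or -1 if the expanded list does not fit.
 */
int expand_component_path_sketch(const char *param, char *out, size_t out_len)
{
    size_t used = 0;
    int first = 1;
    const char *p = param;

    assert(param != NULL && out != NULL && out_len > 0);
    out[0] = '\0';

    for (;;) {
        size_t len = strcspn(p, ":");            /* length of this element */
        const char *piece = expand_token_sketch(p, len);
        size_t piece_len = (piece != NULL) ? strlen(piece) : len;

        if (piece == NULL) {
            piece = p;                           /* ordinary path: copy verbatim */
        }

        /* Room for an optional ':' separator, the piece, and the NUL. */
        if (used + (first ? 0 : 1) + piece_len + 1 > out_len) {
            return -1;
        }
        if (!first) {
            out[used++] = ':';
        }
        memcpy(out + used, piece, piece_len);
        used += piece_len;
        out[used] = '\0';
        first = 0;

        if (p[len] == '\0') {
            break;                               /* last element handled */
        }
        p += len + 1;                            /* skip past the ':' */
    }
    return 0;
}
```

With the stand-in defaults above, the value my_perfect_path:SYSTEM_DEFAULT:USER_DEFAULT expands to the user's path followed by the two default directories, in the order given.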
Jeff Squyres
16b100219d A patch from UTK to allow orte_init(), opal_init(), and associated
friends to also receive &argc and &argv (George asked Jeff and Ralph to
review before committing).  The thought is that passing argv and argc
to opal/orte_init could be useful to other projects outside of OMPI that are
using OPAL and/or ORTE (especially in conjunction with some other
bootstrapping code where it is helpful to modify argv).  It's such a
small thing that it's easy to apply here to make others' lives a
little easier.

Ask George for more details; I'm just the messenger.  :-)

Judging by the copyrights on this patch, it's been around for a
while.  :-)

This commit was SVN r22260.
2009-12-04 00:51:15 +00:00
Ralph Castain
a0d5c80ce0 Add a new framework for discovering local resource information such as cpu type/model, #cpus, available physical memory, etc. Two initial components (darwin and linux) are provided. This is needed to support bootstrap operations where daemons are started at node boot, and applications where initial knowledge of cpu identification is needed to guide framework component selection.
Add orte configuration option to control the use of the framework in the system. Although the code will build, it will not be active unless configured with --enable-bootstrap.

If bootstrap is enabled and the new opal_sysinfo framework can successfully determine the cpu model, pass that info to the application as an MCA param to support some work at Sun.

Also, have daemons report back the resources they find to guide process mapping in bootstrap operations (i.e., where the daemon starts at node boot as opposed to being launched at application start).

Adjust some platform files to enable these capabilities.

This commit was SVN r22244.
2009-11-30 23:11:25 +00:00
Jeff Squyres
978fb43a26 Add a Big Hairy Warning if you --enable-progress-threads
This commit was SVN r22230.
2009-11-24 23:20:37 +00:00
Shiqing Fan
6f8d0a1ab8 Update a few CMake scripts.
Add Program Database (pdb) files for installation for debug build.

This commit was SVN r22188.
2009-11-03 10:40:58 +00:00
Jeff Squyres
5e6c494269 Remove the mistaken line (confirmed by Shiqing).
This commit was SVN r22175.
2009-10-30 12:45:05 +00:00
Jeff Squyres
16f42c45a6 Ensure to have a PARAM_CONFIG_FILES (I don't know if
PARAM_WINDOWS_FILES is a mistake or not).  Fixes trac:2079.

This commit was SVN r22171.

The following Trac tickets were found above:
  Ticket 2079 --> https://svn.open-mpi.org/trac/ompi/ticket/2079
2009-10-29 22:05:26 +00:00
Shiqing Fan
48dd7ff7d0 Get rid of the shadow file for mpi.h.in on Windows.
This commit was SVN r22154.
2009-10-28 15:49:01 +00:00
Rainer Keller
4ce710a147 - The internal function make_opt may fail (e.g. OPAL_ERR_OUT_OF_RESOURCE);
   pass that on to callers of opal_cmd_line_make_opt_mca().
   Thanks to Thomas Naughton III.

 - Additionally, cmd-line parameters passed in table to opal_cmd_line_create()
   may be wrong (e.g. OPAL_ERR_BAD_PARAM), which may be missed in the
   loop.

This commit was SVN r22153.
2009-10-28 15:14:31 +00:00
Terry Dontje
c6ebc7c341 rename macros ompi_check_optflags and ompi_make_stripped_flags based on comments in #2072
This commit was SVN r22151.
2009-10-28 10:51:59 +00:00
Shiqing Fan
63cdfc0ab1 Get rid of several shadow files for windows build, use the same input file as on Linux.
This commit was SVN r22145.
2009-10-27 18:22:14 +00:00
Terry Dontje
6df802424d remove duplicate setting of CFLAGS_WITHOUT_OPTFLAGS and special case DEBUGGER_FLAGS for intel compiler
This commit was SVN r22143.
2009-10-26 18:41:53 +00:00
Shiqing Fan
af0830107c Generate the compiler wrappers more nicely on Windows.
This commit was SVN r22142.
2009-10-26 13:26:06 +00:00
Ralph Castain
13d86e100b Courtesy of Ralph and Jeff:
Continue the reorganization of the configure system. Move files from the main config directory to their appropriate level-specific config directories. Modify the configure system to correctly handle compiler detection, test, and setup so that all things pertaining to opal and orte are done at the lower level, with the ompi configure system only looking at mpi-specific options.

Ensure the wrapper compilers for orte and ompi only get built when appropriate. Add support for c++ to the orte wrapper compilers, both script and non-script versions.

This commit was SVN r22138.
2009-10-24 01:04:35 +00:00
Ralph Castain
214e26b539 Per Jeff (this work was done on a branch of mine, so I will do the commit):
Re-enable "./autogen.sh -no-ompi" again. If you -no-ompi, the entire OMPI
configury is skipped and the entire ompi/ subtree is not built. There's
some simple m4-isms that prune out the relevant parts.

I added ompi/config/, orte/config/, and opal/config/ directories. I moved a
bunch of m4 files from the top-level config/ dir into ompi/config/, and a few
into orte/config/.

Note that all 3 <project>/config directories have a config_files.m4 file. This
file contains the AC_CONFIG_FILES list for that project. The AC_CONFIG_FILES
call cannot be in an AC_DEFUN macro and conditionally called -- if it is
included at all, Autoconf will process it. Hence, these config_files.m4 files
don't AC_DEFUN -- they just have AC_CONFIG_FILES. m4_ifdef() is used to
conditionally include the files or not.

I moved a bunch of obvious OMPI-only m4 files from config/ to ompi/config/,
but I'm sure that there's more that could go. A ticket will be filed with
thoughts on future work in this area.

This commit was SVN r22113.
2009-10-20 23:44:20 +00:00
Ralph Castain
c991d155f4 Fix a minor omission in opal/util/path. If someone provides a relative path to the current working directory, without starting it with a
'.', we should still find the executable - it is in a directory beneath us.

In other words, if someone gives us "foo/bar" instead of "./foo/bar", we should still be able to find bar

This commit was SVN r22110.
2009-10-20 04:05:16 +00:00
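
Note on r22110: the behavior described there (treating "foo/bar" the same as "./foo/bar") can be illustrated with a small path-joining helper. This is only a sketch under assumptions. The function name and interface are hypothetical and do not come from opal/util/path, and real code would go on to check that the resulting file exists and is executable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build an absolute candidate path for a relative name by prefixing the
 * current working directory. "foo/bar" and "./foo/bar" yield the same
 * candidate. Returns 0 on success, or -1 if the name is absolute, empty,
 * or the result does not fit in out.
 */
int join_relative_sketch(const char *cwd, const char *name,
                         char *out, size_t out_len)
{
    int n;

    assert(cwd != NULL && name != NULL && out != NULL);

    if (name[0] == '/') {
        return -1;                      /* absolute paths need no joining */
    }
    if (strncmp(name, "./", 2) == 0) {
        name += 2;                      /* "./foo/bar" is the same as "foo/bar" */
    }
    if (name[0] == '\0') {
        return -1;
    }

    n = snprintf(out, out_len, "%s/%s", cwd, name);
    return (n >= 0 && (size_t)n < out_len) ? 0 : -1;
}
```

A caller would typically pass the result of getcwd() as cwd and then test the candidate with access(candidate, X_OK).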
Ralph Castain
c58a30ea10 Add two new functions:
1. check for loopback interface

2. convert tuple addresses to ip addrs + mask

This commit was SVN r22080.
2009-10-09 15:24:41 +00:00
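
Note on r22080: the message does not say what a "tuple" looks like, so the sketch below assumes a CIDR-style string such as 10.1.0.0/16 and works on host-byte-order IPv4 values. It also checks a loopback address rather than an interface. Both function names are hypothetical; they are not the functions added by the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parse "a.b.c.d/len" (an assumed tuple format) into a host-order IPv4
 * address and netmask. Returns 0 on success, -1 on malformed input.
 */
int tuple_to_addr_sketch(const char *tuple, uint32_t *addr, uint32_t *mask)
{
    unsigned a, b, c, d, len;
    char extra;

    assert(tuple != NULL && addr != NULL && mask != NULL);

    /* Exactly five fields must match; trailing junk makes it six. */
    if (sscanf(tuple, "%u.%u.%u.%u/%u%c", &a, &b, &c, &d, &len, &extra) != 5) {
        return -1;
    }
    if (a > 255 || b > 255 || c > 255 || d > 255 || len > 32) {
        return -1;
    }

    *addr = ((uint32_t)a << 24) | ((uint32_t)b << 16) |
            ((uint32_t)c << 8)  |  (uint32_t)d;
    /* A shift by 32 is undefined in C, so handle the /0 prefix separately. */
    *mask = (len == 0) ? 0u : (uint32_t)(0xFFFFFFFFu << (32 - len));
    return 0;
}

/* True if a host-order IPv4 address lies in 127.0.0.0/8. */
bool is_loopback_sketch(uint32_t addr)
{
    return (addr >> 24) == 127;
}
```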
Jeff Squyres
9afe50d886 Update Cisco copyrights for consistency
This commit was SVN r22072.
2009-10-07 22:02:32 +00:00
Jeff Squyres
d317ce0367 Fix CID 1381: don't bother checking for (NULL == p); it's overkill.
posix_memalign() will either return 0 or not, indicating success.  And
if posix_memalign() fails, it's not always going to be due to
out-of-memory -- just return ERR_IN_ERRNO.

This commit was SVN r22070.
2009-10-07 20:01:50 +00:00
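
Note on r22070: the pattern described there (trust the return code of posix_memalign() instead of testing the pointer, and do not assume out-of-memory) might look like the sketch below. The function name and the two result codes are hypothetical stand-ins for OPAL's own constants.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical result codes standing in for OPAL's constants. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_IN_ERRNO = -1 };

/*
 * Allocate size bytes aligned to alignment. On success stores the block
 * in *out; on failure leaves *out untouched and records the cause in errno.
 */
int aligned_alloc_sketch(void **out, size_t alignment, size_t size)
{
    void *p = NULL;
    int rc;

    assert(out != NULL);

    /* posix_memalign() returns 0 on success or an error number directly. */
    rc = posix_memalign(&p, alignment, size);
    if (rc != 0) {
        /* The cause may be EINVAL (bad alignment) as well as ENOMEM. */
        errno = rc;
        return SKETCH_ERR_IN_ERRNO;
    }

    *out = p;       /* no extra NULL check needed once rc == 0 */
    return SKETCH_SUCCESS;
}
```

The returned block is released with free(), as with malloc().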
Jeff Squyres
7900451e4e Fix CID 1326: for the (unlikely) case where
opal_paffinity_base_get_processor_info() returns failure.

This commit was SVN r22069.
2009-10-07 19:52:08 +00:00
Jeff Squyres
5c1af9c2ba Fix CID 1355: ensure that mca_base_param_reg_int() actually
succeeded.

This commit was SVN r22068.
2009-10-07 19:43:35 +00:00
Jeff Squyres
3b4f695009 MAP_FAILED is more POSIX-ly correct than ((void*)-1).
This commit was SVN r22063.
2009-10-07 14:20:18 +00:00
Jeff Squyres
d7db5f4c32 mmap(2) says that you must call mmap() with either MAP_SHARED or
MAP_PRIVATE.  We didn't catch this because we checked for a NULL
return, not a -1 return.  Doh!  Thanks again to Julian Seward for
continuing to track this down.

This commit was SVN r22062.
2009-10-07 12:39:01 +00:00
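
Note on r22062 and r22063: the two points made there (pass MAP_SHARED or MAP_PRIVATE, and compare the result against MAP_FAILED rather than NULL) are shown in the sketch below. The function name is hypothetical, and mapping /dev/zero is an assumption chosen for the example, not necessarily what the patched code maps.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map len bytes of zero-filled, private memory backed by /dev/zero.
 * Returns the mapping, or NULL if anything fails.
 */
void *map_zeroed_sketch(size_t len)
{
    void *addr;
    int fd;

    assert(len > 0);

    fd = open("/dev/zero", O_RDWR);
    if (fd < 0) {
        return NULL;
    }

    /* The flags must include MAP_SHARED or MAP_PRIVATE. */
    addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */

    /* mmap() signals failure with MAP_FAILED, not with NULL. */
    if (addr == MAP_FAILED) {
        return NULL;
    }
    return addr;
}
```

The mapping is released with munmap(addr, len).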
Jeff Squyres
977574bd45 Fix a problem noted by Julian Seward: MAKE_MEM_UNDEFINED is not the
opposite of MAKE_MEM_DEFINED. Also add in a call to NOACCESS to
(mostly) reverse the effects of MAKE_MEM_DEFINED (technically, page 0
was accessible before this, even though it's a Bad Idea to access it).

This commit was SVN r22056.
2009-10-06 17:55:49 +00:00
Jeff Squyres
932b43be04 Check to ensure that the mmap succeeded. Thanks to Julian Seward for
pointing out the problem and suggesting the fix.

This commit was SVN r22055.
2009-10-06 17:44:14 +00:00
George Bosilca
01bb4dafe0 Add a comment.
This commit was SVN r22052.
2009-10-05 17:36:11 +00:00
Jeff Squyres
0f8ac9223f Refs trac:2023, #2027.
This commit does a bunch of things:

 * Address all remaining code review items from CMR #2023:

   * Defer mmap setup to be lazy; only set it up the first time we
     invoke a collective.  In this way, we don't penalize apps that
     make lots of communicators but don't invoke collectives on them
     (per #2027).
   * Remove the extra assignments of mca_coll_sm_one (fixing a
     convertor count setup that was the real problem).
   * Remove another extra/unnecessary assignment.
   * Increase libevent polling frequency when using the RML to
     bootstrap mmap'ed memory.
   * Fix a minor procs-related memory leak in btl_sm.
 * Commit a datatype fix that George and I discovered along the way to
   fixing the coll sm.
 * Improve error messages when mmap fails, potentially trying to
   de-alloc any allocated memory when that happens.
 * Fix a previously-unnoticed confusion between extent and true_extent
   in coll sm reduce.

This commit was SVN r22049.

The following Trac tickets were found above:
  Ticket 2023 --> https://svn.open-mpi.org/trac/ompi/ticket/2023
2009-10-02 17:13:56 +00:00
Jeff Squyres
c8c3132605 Also check for posix_memalign.
This commit was SVN r22045.
2009-10-01 23:51:48 +00:00
George Bosilca
b04a42ba3b Add the format to the opal_output call.
This commit was SVN r22041.
2009-09-30 23:33:12 +00:00
Ralph Castain
84a45fea0a Add a convenience macro for assembling network addresses
This commit was SVN r22036.
2009-09-30 14:38:52 +00:00
Ralph Castain
176fdd3a83 Add a new API to the show_help system that allows external users (e.g., libraries built upon OMPI) to define their own locations for show_help files. This allows such users to exploit the rather nice features of the OPAL show_help system -without- interfering with the ability of the ORTE and OMPI layers to use show_help themselves.
Reviewed by Jeff to protect toes...and to get some good comments :-)

This commit was SVN r22026.
2009-09-29 02:07:46 +00:00
Jeff Squyres
1886d5a004 Remove the libopenmpi_malloc library; it is only necessary for
backwards compatibility in the v1.3 series.

This commit was SVN r22013.
2009-09-25 17:09:54 +00:00
Josh Hursey
5406fdfb80 Add support for sending SIGSTOP to the MPI job after the checkpoint is taken (uses a BLCR feature for the option).
This commit looks larger than it really is since it includes a fair amount of code cleanup.

The SIGSTOP/SIGCONT+checkpointing work uses some of the functionality in r20391. Basic use case below (note that the checkpoint generated is useable as usual if the stopped application is terminated).
{{{
shell 1) mpirun -np 2 -am ft-enable-cr my-app
... running ...

shell 2) ompi-checkpoint --stop -v MPIRUN_PID
[localhost:001300] [  0.00 /   0.20]                 Requested - ...
[localhost:001300] [  0.00 /   0.20]                   Pending - ...
[localhost:001300] [  0.01 /   0.21]                   Running - ...
[localhost:001300] [  1.01 /   1.22]                   Stopped - ompi_global_snapshot_1234.ckpt
Snapshot Ref.: 0 ompi_global_snapshot_1234.ckpt

shell 2) killall -CONT mpirun

... Application Continues execution in shell 1 ...
}}}

Other items in this commit are mostly cleanup that has been sitting off-trunk for too long:
 * Add a new {{{opal_crs_base_ckpt_options_t}}} type that encapsulates the various options that could be passed to the CRS. Currently only TERM and STOP, but this makes adding others ''much'' easier.
 * Eliminate ORTE_SNAPC_CKPT_STATE_PENDING_TERM, since it served a redundant purpose with the new options type.
 * Lay some basic ground work for some future features.

This commit was SVN r21995.

The following SVN revision numbers were found above:
  r20391 --> open-mpi/ompi@0704b98668
2009-09-22 18:26:12 +00:00
Eugene Loh
67bac2fe31 Fix paffinity_linux_module.c. The set and get functions transferred cpu
masks between the mask argument and a local PLPA mask.  There were three
problems:
1) The "get" function computed the number of bits as sizeof(mask),
   which is the size of the pointer to the mask rather than the mask
   itself.  So, only 4 bits were copied with m32 and 8 bits with m64.
   There are actually 1024 bits.
2) The "get" and "set" functions both copied a number of bits computed
   from the sizeof() mask, but sizeof() reports the number of bytes.
   We have to multiply by 8 to get the number of bits.
3) These two functions check to make sure that the mask argument is not
   bigger than the PLPA mask.  But, the set function copies a number
   of bits in the PLPA mask, which is conceivably greater than the
   number of bits in the mask argument.  So, accesses to the mask
   argument may overrun that argument.
Problems 1 and 2 meant that one would encounter errors when the number of
cores exceeded 4 (with -m32) or 8 (with -m64).  Problem 3 probably caused
no errors.

This commit was SVN r21993.
2009-09-22 16:00:37 +00:00
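
Note on r21993: the three problems listed there come down to how sizeof is used. The sketch below reproduces them with a hypothetical 1024-bit mask type; the type and function names are illustrative and are not taken from paffinity_linux_module.c or PLPA.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_MASK_BITS 1024   /* the message says the mask holds 1024 bits */

/* Hypothetical mask type: a fixed-size bit array. */
typedef struct {
    unsigned long bits[SKETCH_MASK_BITS / (CHAR_BIT * sizeof(unsigned long))];
} sketch_cpu_mask_t;

/* Problems 1 and 2: sizeof on the pointer yields the pointer size in bytes. */
size_t mask_bits_buggy(const sketch_cpu_mask_t *mask)
{
    return sizeof(mask);
}

/* Fixed: size of the pointed-to object, converted from bytes to bits. */
size_t mask_bits_fixed(const sketch_cpu_mask_t *mask)
{
    return sizeof(*mask) * CHAR_BIT;
}

/* Problem 3: never copy more bits than the smaller of the two masks holds. */
size_t bits_to_copy(size_t src_bits, size_t dst_bits)
{
    return (src_bits < dst_bits) ? src_bits : dst_bits;
}
```

On common platforms the buggy version returns 4 or 8 (the pointer size), which matches the 4-core and 8-core limits mentioned in the message, while the fixed version returns 1024.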
Ralph Castain
7765c71428 Add a macro for formatting IP addresses for printing
This commit was SVN r21985.
2009-09-22 00:53:54 +00:00
George Bosilca
b18ca686ae Correct the pointer math when we copy the opal_datatype_t object. In addition
don't set the ref count to 1, it has been already set by the call to OBJ_NEW
when the type was allocated. This fixes ticket #2014.

This commit was SVN r21976.
2009-09-18 20:05:22 +00:00
Ralph Castain
2028017554 Modify the paffinity system to handle binding directives that are "soft" - i.e., when someone directs that we bind if the system supports it. This allows community members to distribute OMPI with default MCA param files that direct general binding policies, without having the distributed software fail if the system cannot support those policies.
The new options work by adding an ":if-avail" qualifier to the "bind-to-socket" and "bind-to-core" MCA params. If the system does not support this capability, the job will launch anyway. Without the qualifier, the job will abort with an error message indicating that the required functionality is not supported on this system.

This commit was SVN r21975.
2009-09-18 19:48:42 +00:00
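
Note on r21975: the "soft" binding decision described there can be summarized in a few lines. This is a sketch under assumptions: the message does not show exactly where the ":if-avail" qualifier is parsed, so the directive is treated here as a plain string with the qualifier as a suffix, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    SKETCH_LAUNCH_BOUND,     /* binding supported: launch with binding */
    SKETCH_LAUNCH_UNBOUND,   /* unsupported but soft: launch anyway */
    SKETCH_LAUNCH_ABORT      /* unsupported and required: abort with an error */
} sketch_launch_t;

/* True if the directive ends with the ":if-avail" qualifier. */
bool is_soft_directive(const char *directive)
{
    static const char suffix[] = ":if-avail";
    size_t dlen, slen = sizeof(suffix) - 1;

    assert(directive != NULL);
    dlen = strlen(directive);
    return dlen >= slen && strcmp(directive + dlen - slen, suffix) == 0;
}

/* Decide what to do with a binding directive on this system. */
sketch_launch_t resolve_binding(const char *directive, bool system_supports_binding)
{
    if (system_supports_binding) {
        return SKETCH_LAUNCH_BOUND;
    }
    return is_soft_directive(directive) ? SKETCH_LAUNCH_UNBOUND
                                        : SKETCH_LAUNCH_ABORT;
}
```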
Rainer Keller
5983aeb753 - This fixes trac:2014:
As noted in http://www.open-mpi.org/community/lists/devel/2009/08/6741.php,
   we do not correctly free a dupped predefined datatype.
   The fix is a bit more involved. See ticket for details.
   Tested with ibm tests and mpi_test_suite (though there are two "old" failures,
   zero5.c and zero6.c)

   Thanks to Lisandro Dalcin for bringing this up.

This commit was SVN r21929.

The following Trac tickets were found above:
  Ticket 2014 --> https://svn.open-mpi.org/trac/ompi/ticket/2014
2009-09-02 17:34:01 +00:00