Commit graph

1538 commits

Author SHA1 Message Date
Jeff Squyres
16f42c45a6 Ensure to have a PARAM_CONFIG_FILES (I don't know if
PARAM_WINDOWS_FILES is a mistake or not).  Fixes trac:2079.

This commit was SVN r22171.

The following Trac tickets were found above:
  Ticket 2079 --> https://svn.open-mpi.org/trac/ompi/ticket/2079
2009-10-29 22:05:26 +00:00
Shiqing Fan
48dd7ff7d0 Get rid of the shadow file for mpi.h.in on Windows.
This commit was SVN r22154.
2009-10-28 15:49:01 +00:00
Rainer Keller
4ce710a147 - The internal function make_opt may fail (e.g. OPAL_ERR_OUT_OF_RESOURCE);
pass that on to callers of opal_cmd_line_make_opt_mca().
   Thanks to Thomas Naughton III.

 - Additionally, cmd-line parameters passed in table to opal_cmd_line_create()
   may be wrong (e.g. OPAL_ERR_BAD_PARAM), which may be missed in the
   loop.

This commit was SVN r22153.
2009-10-28 15:14:31 +00:00
Terry Dontje
c6ebc7c341 rename macros ompi_check_optflags and ompi_make_stripped_flags based on comments in #2072
This commit was SVN r22151.
2009-10-28 10:51:59 +00:00
Shiqing Fan
63cdfc0ab1 Get rid of several shadow files for windows build, use the same input file as on Linux.
This commit was SVN r22145.
2009-10-27 18:22:14 +00:00
Terry Dontje
6df802424d remove duplicate setting of CFLAGS_WITHOUT_OPTFLAGS and special case DEBUGGER_FLAGS for intel compiler
This commit was SVN r22143.
2009-10-26 18:41:53 +00:00
Shiqing Fan
af0830107c Generate the compiler wrappers more nicely on Windows.
This commit was SVN r22142.
2009-10-26 13:26:06 +00:00
Ralph Castain
13d86e100b Courtesy of Ralph and Jeff:
Continue the reorganization of the configure system. Move files from the main config directory to their appropriate level-specific config directories. Modify the configure system to correctly handle compiler detection, test, and setup so that all things pertaining to opal and orte are done at the lower level, with the ompi configure system only looking at mpi-specific options.

Ensure the wrapper compilers for orte and ompi only get built when appropriate. Add support for c++ to the orte wrapper compilers, both script and non-script versions.

This commit was SVN r22138.
2009-10-24 01:04:35 +00:00
Ralph Castain
214e26b539 Per Jeff (this work was done on a branch of mine, so I will do the commit):
Re-enable "./autogen.sh -no-ompi" again. If you -no-ompi, the entire OMPI
configury is skipped and the entire ompi/ subtree is not built. There's
some simple m4-isms that prune out the relevant parts.

I added ompi/config/, orte/config/, and opal/config/ directories. I moved a
bunch of m4 files from the top-level config/ dir into ompi/config/, and a few
into orte/config/.

Note that all 3 <project>/config directories have a config_files.m4 file. This
file contains the AC_CONFIG_FILES list for that project. The AC_CONFIG_FILES
call cannot be in an AC_DEFUN macro and conditionally called -- if it is
included at all, Autoconf will process it. Hence, these config_files.m4 files
don't AC_DEFUN -- they just have AC_CONFIG_FILES. m4_ifdef() is used to
conditionally include the files or not.

I moved a bunch of obvious OMPI-only m4 files from config/ to ompi/config/,
but I'm sure that there's more that could go. A ticket will be filed with
thoughts on future work in this area.

This commit was SVN r22113.
2009-10-20 23:44:20 +00:00
Ralph Castain
c991d155f4 Fix a minor omission in opal/util/path. If someone provides a relative path to the current working directory, without starting it with a
'.', we should still find the executable - it is in a directory beneath us.

In other words, if someone gives us "foo/bar" instead of "./foo/bar", we should still be able to find bar.

This commit was SVN r22110.
2009-10-20 04:05:16 +00:00
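Note on c991d155f4: the behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in opal/util/path: the helper name `resolve_relative` and its return convention are hypothetical, and the real function's handling of bare names (PATH search) is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build an absolute candidate for `name` under `cwd`.
 * Returns 0 on success, -1 if the name has no directory part or the
 * result does not fit in `out`. */
int resolve_relative(const char *cwd, const char *name,
                     char *out, size_t outlen)
{
    int n;

    assert(NULL != cwd && NULL != name && NULL != out);

    if ('/' == name[0]) {
        /* Already absolute: use it unchanged. */
        n = snprintf(out, outlen, "%s", name);
    } else if (0 == strncmp(name, "./", 2)) {
        /* Explicitly relative to the current directory. */
        n = snprintf(out, outlen, "%s/%s", cwd, name + 2);
    } else if (NULL != strchr(name, '/')) {
        /* "foo/bar": relative without a leading '.', still beneath cwd. */
        n = snprintf(out, outlen, "%s/%s", cwd, name);
    } else {
        /* Bare name: left to a PATH search, which is not shown here. */
        return -1;
    }
    return (n >= 0 && (size_t) n < outlen) ? 0 : -1;
}
```

With this sketch, "foo/bar" and "./foo/bar" resolve to the same candidate path.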
Ralph Castain
c58a30ea10 Add two new functions:
1. check for loopback interface

2. convert tuple addresses to ip addrs + mask

This commit was SVN r22080.
2009-10-09 15:24:41 +00:00
Jeff Squyres
9afe50d886 Update Cisco copyrights for consistency
This commit was SVN r22072.
2009-10-07 22:02:32 +00:00
Jeff Squyres
d317ce0367 Fix CID 1381: don't bother checking for (NULL == p); it's overkill.
posix_memalign() will either return 0 or not, indicating success.  And
if posix_memalign() fails, it's not always going to be due to
out-of-memory -- just return ERR_IN_ERRNO.

This commit was SVN r22070.
2009-10-07 20:01:50 +00:00
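Note on d317ce0367: a minimal sketch of the pattern the message describes, where only the return value of posix_memalign() is checked. The wrapper name `alloc_aligned_checked` is hypothetical, and copying the error number into errno is an assumption made here to mimic an "error is in errno" convention; the actual Open MPI code may differ.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: returns NULL and sets errno on failure. */
void *alloc_aligned_checked(size_t alignment, size_t size)
{
    void *p = NULL;
    int rc;

    assert(size > 0);

    /* posix_memalign() returns 0 on success or an error number such as
     * EINVAL or ENOMEM. Its return value is the only thing to check;
     * a separate (NULL == p) test adds nothing. */
    rc = posix_memalign(&p, alignment, size);
    if (0 != rc) {
        errno = rc;   /* assumption: caller reads the cause from errno */
        return NULL;
    }
    return p;
}
```

Not every failure is out-of-memory: an alignment that is not a power of two multiple of sizeof(void *) is rejected with EINVAL.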
Jeff Squyres
7900451e4e Fix CID 1326: for the (unlikely) case where
opal_paffinity_base_get_processor_info() returns failure.

This commit was SVN r22069.
2009-10-07 19:52:08 +00:00
Jeff Squyres
5c1af9c2ba Fix CID 1355: ensure that mca_base_param_reg_int() actually
succeeded.

This commit was SVN r22068.
2009-10-07 19:43:35 +00:00
Jeff Squyres
3b4f695009 MAP_FAILED is more POSIX-ly correct than ((void*)-1).
This commit was SVN r22063.
2009-10-07 14:20:18 +00:00
Jeff Squyres
d7db5f4c32 mmap(2) says that you must call mmap() with either MAP_SHARED or
MAP_PRIVATE.  We didn't catch this because we checked for a NULL
return, not a -1 return.  Doh!  Thanks again to Julian Seward for
continuing to track this down.

This commit was SVN r22062.
2009-10-07 12:39:01 +00:00
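Note on d7db5f4c32 and 3b4f695009: the two points made above (pass MAP_SHARED or MAP_PRIVATE, and compare against MAP_FAILED rather than NULL) can be sketched as below. The helper `map_scratch` and its use of a temporary file are hypothetical and exist only to make the example self-contained; they are not taken from the Open MPI sources.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map `len` bytes backed by an unnamed temporary
 * file. Returns NULL on failure so callers see one error convention. */
void *map_scratch(size_t len)
{
    FILE *fp;
    void *p;

    assert(len > 0);

    fp = tmpfile();
    if (NULL == fp) {
        return NULL;
    }
    /* Grow the file so the whole mapping is backed by storage. */
    if (0 != ftruncate(fileno(fp), (off_t) len)) {
        fclose(fp);
        return NULL;
    }
    /* Either MAP_SHARED or MAP_PRIVATE must be specified. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fileno(fp), 0);
    /* The mapping remains valid after the descriptor is closed. */
    fclose(fp);
    /* mmap() signals failure with MAP_FAILED, not with NULL. */
    return (MAP_FAILED == p) ? NULL : p;
}
```

The caller releases the region with munmap(p, len).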
Jeff Squyres
977574bd45 Fix a problem noted by Julian Seward: MAKE_MEM_UNDEFINED is not the
opposite of MAKE_MEM_DEFINED. Also add in a call to NOACCESS to
(mostly) reverse the effects of MAKE_MEM_DEFINED (technically, page 0
was accessible before this, even though it's a Bad Idea to access it).

This commit was SVN r22056.
2009-10-06 17:55:49 +00:00
Jeff Squyres
932b43be04 Check to ensure that the mmap succeeded. Thanks to Julian Seward for
pointing out the problem and suggesting the fix.

This commit was SVN r22055.
2009-10-06 17:44:14 +00:00
George Bosilca
01bb4dafe0 Add a comment.
This commit was SVN r22052.
2009-10-05 17:36:11 +00:00
Jeff Squyres
0f8ac9223f Refs trac:2023, #2027.
This commit does a bunch of things:

 * Address all remaining code review items from CMR #2023:

   * Defer mmap setup to be lazy; only set it up the first time we
     invoke a collective.  In this way, we don't penalize apps that
     make lots of communicators but don't invoke collectives on them
     (per #2027).
   * Remove the extra assignments of mca_coll_sm_one (fixing a
     convertor count setup that was the real problem).
   * Remove another extra/unnecessary assignment.
   * Increase libevent polling frequency when using the RML to
     bootstrap mmap'ed memory.
   * Fix a minor procs-related memory leak in btl_sm.
 * Commit a datatype fix that George and I discovered along the way to
   fixing the coll sm.
 * Improve error messages when mmap fails, potentially trying to
   de-alloc any allocated memory when that happens.
 * Fix a previously-unnoticed confusion between extent and true_extent
   in coll sm reduce.

This commit was SVN r22049.

The following Trac tickets were found above:
  Ticket 2023 --> https://svn.open-mpi.org/trac/ompi/ticket/2023
2009-10-02 17:13:56 +00:00
Jeff Squyres
c8c3132605 Also check for posix_memalign.
This commit was SVN r22045.
2009-10-01 23:51:48 +00:00
George Bosilca
b04a42ba3b Add the format to the opal_output call.
This commit was SVN r22041.
2009-09-30 23:33:12 +00:00
Ralph Castain
84a45fea0a Add a convenience macro for assembling network addresses
This commit was SVN r22036.
2009-09-30 14:38:52 +00:00
Ralph Castain
176fdd3a83 Add a new API to the show_help system that allows external users (e.g., libraries built upon OMPI) to define their own locations for show_help files. This allows such users to exploit the rather nice features of the OPAL show_help system -without- interfering with the ability of the ORTE and OMPI layers to use show_help themselves.
Reviewed by Jeff to protect toes...and to get some good comments :-)

This commit was SVN r22026.
2009-09-29 02:07:46 +00:00
Jeff Squyres
1886d5a004 Remove the libopenmpi_malloc library; it is only necessary for
backwards compatibility in the v1.3 series.

This commit was SVN r22013.
2009-09-25 17:09:54 +00:00
Josh Hursey
5406fdfb80 Add support for sending SIGSTOP to the MPI job after the checkpoint is taken (uses a BLCR feature for the option).
This commit looks larger than it really is since it includes a fair amount of code cleanup.

The SIGSTOP/SIGCONT+checkpointing work uses some of the functionality in r20391. Basic use case below (note that the checkpoint generated is useable as usual if the stopped application is terminated).
{{{
shell 1) mpirun -np 2 -am ft-enable-cr my-app
... running ...

shell 2) ompi-checkpoint --stop -v MPIRUN_PID
[localhost:001300] [  0.00 /   0.20]                 Requested - ...
[localhost:001300] [  0.00 /   0.20]                   Pending - ...
[localhost:001300] [  0.01 /   0.21]                   Running - ...
[localhost:001300] [  1.01 /   1.22]                   Stopped - ompi_global_snapshot_1234.ckpt
Snapshot Ref.: 0 ompi_global_snapshot_1234.ckpt

shell 2) killall -CONT mpirun

... Application Continues execution in shell 1 ...
}}}

Other items in this commit are mostly cleanup that has been sitting off-trunk for too long:
 * Add a new {{{opal_crs_base_ckpt_options_t}}} type that encapsulates the various options that could be passed to the CRS. Currently only TERM and STOP, but this makes adding others ''much'' easier.
 * Eliminate ORTE_SNAPC_CKPT_STATE_PENDING_TERM, since it served a redundant purpose with the new options type.
 * Lay some basic ground work for some future features.

This commit was SVN r21995.

The following SVN revision numbers were found above:
  r20391 --> open-mpi/ompi@0704b98668
2009-09-22 18:26:12 +00:00
Eugene Loh
67bac2fe31 Fix paffinity_linux_module.c. The set and get functions transferred cpu
masks between the mask argument and a local PLPA mask.  There were three
problems:
1) The "get" function computed the number of bits as sizeof(mask),
   which is the size of the pointer to the mask rather than the mask
   itself.  So, only 4 bits were copied with m32 and 8 bits with m64.
   There are actually 1024 bits.
2) The "get" and "set" functions both copied a number of bits computed
   from the sizeof() mask, but sizeof() reports the number of bytes.
   We have to multiply by 8 to get the number of bits.
3) These two functions check to make sure that the mask argument is not
   bigger than the PLPA mask.  But, the set function copies a number
   of bits in the PLPA mask, which is conceivably greater than the
   number of bits in the mask argument.  So, accesses to the mask
   argument may overrun that argument.
Problems 1 and 2 meant that one would encounter errors when the number of
cores exceeded 4 (with -m32) or 8 (with -m64).  Problem 3 probably caused
no errors.

This commit was SVN r21993.
2009-09-22 16:00:37 +00:00
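Note on 67bac2fe31: problems 1 and 2 above come down to what sizeof() measures. The sketch below uses a hypothetical 1024-bit mask type, `cpu_mask_t`, as a stand-in; it is not the PLPA or paffinity type, and the function names are illustrative only.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for a 1024-bit CPU mask. */
typedef struct {
    unsigned long bits[1024 / (CHAR_BIT * sizeof(unsigned long))];
} cpu_mask_t;

/* Buggy: sizeof(mask) is the size of the pointer, and in bytes. */
size_t mask_bits_buggy(const cpu_mask_t *mask)
{
    assert(NULL != mask);
    return sizeof(mask);
}

/* Fixed: size of the pointed-to object, converted from bytes to bits. */
size_t mask_bits_fixed(const cpu_mask_t *mask)
{
    assert(NULL != mask);
    return sizeof(*mask) * CHAR_BIT;
}
```

For problem 3, a copy between two masks of different sizes would use the smaller of the two bit counts as its bound.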
Ralph Castain
7765c71428 Add a macro for formatting IP addresses for printing
This commit was SVN r21985.
2009-09-22 00:53:54 +00:00
George Bosilca
b18ca686ae Correct the pointer math when we copy the opal_datatype_t object. In addition
don't set the ref count to 1, it has been already set by the call to OBJ_NEW
when the type was allocated. This fixes ticket #2014.

This commit was SVN r21976.
2009-09-18 20:05:22 +00:00
Ralph Castain
2028017554 Modify the paffinity system to handle binding directives that are "soft" - i.e., when someone directs that we bind if the system supports it. This allows community members to distribute OMPI with default MCA param files that direct general binding policies, without having the distributed software fail if the system cannot support those policies.
The new options work by adding an ":if-avail" qualifier to the "bind-to-socket" and "bind-to-core" MCA params. If the system does not support this capability, the job will launch anyway. Without the qualifier, the job will abort with an error message indicating that the required functionality is not supported on this system.

This commit was SVN r21975.
2009-09-18 19:48:42 +00:00
Rainer Keller
5983aeb753 - This fixes trac:2014:
As noted in http://www.open-mpi.org/community/lists/devel/2009/08/6741.php,
   we do not correctly free a dupped predefined datatype.
   The fix is a bit more involved. See ticket for details.
   Tested with ibm tests and mpi_test_suite (though there's two "old" failures
   zero5.c and zero6.c)

   Thanks to Lisandro Dalcin for bringing this up.

This commit was SVN r21929.

The following Trac tickets were found above:
  Ticket 2014 --> https://svn.open-mpi.org/trac/ompi/ticket/2014
2009-09-02 17:34:01 +00:00
Jeff Squyres
2fa048b0e0 Make the paffinity test component only build when you --enable-debug
(or have a developer build where that's enabled for you by default).

This commit was SVN r21928.
2009-09-02 11:23:54 +00:00
Ralph Castain
388c65fd80 Add missing include file
This commit was SVN r21924.
2009-09-01 13:31:54 +00:00
Ralph Castain
888f3c3afe Extend the paffinity test module to allow users to specify the number of sockets and cores - provides an extended ability to mimic archs.
This commit was SVN r21912.
2009-08-29 03:35:39 +00:00
Ralph Castain
3c4f28b22c Modify the paffinity test module to take a param indicating whether or not to mimic being externally bound
This commit was SVN r21908.
2009-08-28 02:31:01 +00:00
Shiqing Fan
4119497c5a Use select() for Windows events by default.
For historical reasons, we didn't use select() for Windows events. It can now be merged back for better performance on Windows.

This commit was SVN r21899.
2009-08-27 08:11:56 +00:00
Shiqing Fan
1b6db85988 Complete the support for building on UNC path.
This commit was SVN r21897.
2009-08-27 07:57:26 +00:00
Ralph Castain
7cc045f9c5 Check return codes when init'ing the paffinity framework to avoid segfaulting
This commit was SVN r21884.
2009-08-26 01:58:15 +00:00
Ralph Castain
2178f995b9 Add a new "test" module to the paffinity framework that mimics a system that supports affinity when running on a Mac for development purposes. Only active if specifically called out.
This commit was SVN r21881.
2009-08-26 01:55:30 +00:00
Josh Hursey
1fc454103a MTT found a linking problem with the previous fix (r21860). This commit cleans up the configure.m4 a bit so that it works as expected. Tested with static and shared builds, but MTT should have a pass at it before it is moved to v1.3 (I'll attach a custom patch to ticket #2004 for v1.3).
Additionally this commit fixes trac:1658
Refs trac:2004

This commit was SVN r21866.

The following SVN revision numbers were found above:
  r21860 --> open-mpi/ompi@7f31473bd7

The following Trac tickets were found above:
  Ticket 1658 --> https://svn.open-mpi.org/trac/ompi/ticket/1658
  Ticket 2004 --> https://svn.open-mpi.org/trac/ompi/ticket/2004
2009-08-21 20:07:18 +00:00
Josh Hursey
7f31473bd7 Fix the CFLAGS argument to the BLCR component.
Due to a typo (probably a cut/paste error) the CFLAGS/CPPFLAGS/LDFLAGS/LIBS arguments were not propagated to the BLCR component.

This changes a few {{{crs_blcr_check_}}} to {{{crs_blcr_save}}}, which they should have been all along.

This needs to be applied to v1.3 as well :/

This commit was SVN r21860.
2009-08-20 21:28:19 +00:00
Rainer Keller
8e1b23779f - Replace combinations of
#if defined (c_plusplus)
          defined (__cplusplus)
   followed by
      extern "C" {
   and the closing counterpart by BEGIN_C_DECLS and END_C_DECLS.

   Notable exceptions are:
    - opal/include/opal_config_bottom.h:
      This is our generated code, that itself defines BEGIN_C_DECLS and
      END_C_DECLS
    - ompi/mpi/cxx/mpicxx.h:
      Here we do not include opal_config_bottom.h:                                 
    - Belongs to external code:                                                    
      opal/mca/backtrace/darwin/MoreBacktrace/MoreDebugging/MoreBacktrace.c        
      opal/mca/backtrace/darwin/MoreBacktrace/MoreDebugging/MoreBacktrace.h        
    - opal/include/opal/prefetch.h:
      Has C++ specific macros that are protected:                                  

    - Had #if ... } #endif  _and_ END_C_DECLS (aka end up with 2x
      END_C_DECLS)
      ompi/mca/btl/openib/btl_openib.h
    - opal/event/event.h has #ifdef __cplusplus as BEGIN_C_DECLS...
    - opal/win32/ompi_process.h: had extern "C"\n {...
      opal/win32/ompi_process.h: ditto
    - ompi/mca/btl/pcie/btl_pcie_lex.l: needed to add *_C_DECLS
      ompi/mpi/f90/test/align_c.c: ditto
    - ompi/debuggers/msgq_interface.h: used #ifdef __cplusplus
    - ompi/mpi/f90/xml/common-C.xsl: Amend

   Tested on linux using --with-openib and --with-mx

   The following do not contain either opal_config.h, orte_config.h or
   ompi_config.h
   (but possibly other header files, that include one of the above):
      ompi/mca/bml/r2/bml_r2_ft.h
      ompi/mca/btl/gm/btl_gm_endpoint.h
      ompi/mca/btl/gm/btl_gm_proc.h
      ompi/mca/btl/mx/btl_mx_endpoint.h
      ompi/mca/btl/ofud/btl_ofud_endpoint.h
      ompi/mca/btl/ofud/btl_ofud_frag.h
      ompi/mca/btl/ofud/btl_ofud_proc.h
      ompi/mca/btl/openib/btl_openib_mca.h
      ompi/mca/btl/portals/btl_portals_endpoint.h
      ompi/mca/btl/portals/btl_portals_frag.h
      ompi/mca/btl/sctp/btl_sctp_endpoint.h
      ompi/mca/btl/sctp/btl_sctp_proc.h
      ompi/mca/btl/tcp/btl_tcp_endpoint.h
      ompi/mca/btl/tcp/btl_tcp_ft.h
      ompi/mca/btl/tcp/btl_tcp_proc.h
      ompi/mca/btl/template/btl_template_endpoint.h
      ompi/mca/btl/template/btl_template_proc.h
      ompi/mca/btl/udapl/btl_udapl_eager_rdma.h
      ompi/mca/btl/udapl/btl_udapl_endpoint.h
      ompi/mca/btl/udapl/btl_udapl_mca.h
      ompi/mca/btl/udapl/btl_udapl_proc.h
      ompi/mca/mtl/mx/mtl_mx_endpoint.h
      ompi/mca/mtl/mx/mtl_mx.h
      ompi/mca/mtl/psm/mtl_psm_endpoint.h
      ompi/mca/mtl/psm/mtl_psm.h
      ompi/mca/pml/cm/pml_cm_component.h
      ompi/mca/pml/csum/pml_csum_comm.h
      ompi/mca/pml/dr/pml_dr_comm.h
      ompi/mca/pml/dr/pml_dr_component.h
      ompi/mca/pml/dr/pml_dr_endpoint.h
      ompi/mca/pml/dr/pml_dr_recvfrag.h
      ompi/mca/pml/example/pml_example.h
      ompi/mca/pml/ob1/pml_ob1_comm.h
      ompi/mca/pml/ob1/pml_ob1_component.h
      ompi/mca/pml/ob1/pml_ob1_endpoint.h
      ompi/mca/pml/ob1/pml_ob1_rdmafrag.h
      ompi/mca/pml/ob1/pml_ob1_recvfrag.h
      ompi/mca/pml/v/pml_v_output.h
      opal/include/opal/prefetch.h
      opal/mca/timer/aix/timer_aix.h
      opal/util/qsort.h
      test/support/components.h

This commit was SVN r21855.

The following SVN revision numbers were found above:
  r2 --> open-mpi/ompi@58fdc18855
2009-08-20 11:42:18 +00:00
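Note on 8e1b23779f: the replacement described above relies on a pair of macros that expand to an extern "C" block under C++ and to nothing under C. The definitions below are an assumed, commonly used form of that idiom; the real ones are in opal_config_bottom.h and may differ. `example_div` is a hypothetical function for illustration only.

```c
#include <assert.h>

/* Assumed definitions of the macro pair. */
#if defined(c_plusplus) || defined(__cplusplus)
#  define BEGIN_C_DECLS extern "C" {
#  define END_C_DECLS   }
#else
#  define BEGIN_C_DECLS
#  define END_C_DECLS
#endif

/* A header wraps its declarations once, instead of repeating the
 * #if defined(c_plusplus) || defined(__cplusplus) / extern "C" { pair. */
BEGIN_C_DECLS

int example_div(int num, int den);

END_C_DECLS

int example_div(int num, int den)
{
    assert(0 != den);
    return num / den;
}
```

Using the macros in matched pairs avoids the doubled closing brace noted for ompi/mca/btl/openib/btl_openib.h.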
Ralph Castain
aca3e71ccd Don't declare us "bound" if the cpu mask is completely zero
This commit was SVN r21839.
2009-08-19 18:55:06 +00:00
Shiqing Fan
2a05de0dda Update a few comments of how OMPI uses Windows registry.
This commit was SVN r21828.
2009-08-18 16:32:45 +00:00
Rainer Keller
c3226b36b7 - It's BUS_ADRERR, not BUSADRERR...
- Fix typos.

This commit was SVN r21808.
2009-08-12 13:07:04 +00:00
Shiqing Fan
bce2f44154 Update related .windows files with proper compiling properties, in order to have a successful DSO build.
This commit was SVN r21805.
2009-08-12 08:55:58 +00:00
Shiqing Fan
fb4be6fad7 First step to enable DSO build on Windows.
- add a cmake module for searching libltdl libraries and headers
  - a configure option to enable DSO build, default OFF.
  - update a few source files for including correct header
    and loading correct mca libraries path/suffix.

This commit was SVN r21804.
2009-08-12 08:52:48 +00:00
George Bosilca
52c722352d The COMPLEX are back. Due to some compiler flags, right now the
support for _Complex is disabled until we figure out the correct
black magic. So instead of using this nice C99 feature, we use
a structure with a double type, the same approach that worked pretty
well for the last couple of years.

Switching from one mode to the other is done using the 
OPAL_USE_[FLOAT|DOUBLE|LONG_DOUBLE]__COMPLEX macros defined in
opal_datatype_internal.h at line 442.

This commit was SVN r21800.
2009-08-11 18:44:06 +00:00
Ralph Castain
1dc12046f1 Modify the OMPI paffinity and mapping system to support socket-level mapping and binding. Mostly refactors existing code, with modifications to the odls_default module to support the new capabilities.
Adds several new mpirun options:

* -bysocket - assign ranks on a node by socket. Effectively load balances the procs assigned to a node across the available sockets. Note that ranks can still be bound to a specific core within the socket, or to the entire socket - the mapping is independent of the binding.

* -bind-to-socket - bind each rank to all the cores on the socket to which they are assigned.

* -bind-to-core - currently the default behavior (maintained from prior default)

* -npersocket N - launch N procs for every socket on a node. Note that this implies we know how many sockets are on a node. Mpirun will determine its local values. These can be overridden by provided values, either via MCA param or in a hostfile

Similar features/options are provided at the board level for multi-board nodes.

Documentation to follow...

This commit was SVN r21791.
2009-08-11 02:51:27 +00:00
Rainer Keller
76469ea64a - Change the property of a few files, that obviously
don't need to be svn:executable...

This commit was SVN r21786.
2009-08-11 01:40:00 +00:00
Josh Hursey
bd6426d3dd This is a minor cleanup of the configure.m4 (per suggestion from Jeff).
Refs trac:1987
The patch for v1.3 attached to Ticket #1987 already includes this change.

I did not have a chance to commit this last night, so sorry for the delay.

This commit was SVN r21777.

The following Trac tickets were found above:
  Ticket 1987 --> https://svn.open-mpi.org/trac/ompi/ticket/1987
2009-08-07 23:38:54 +00:00
Rainer Keller
3f727f0e61 - Until either r21763 is backed out, unconstify complex types.
Complex should either be a struct of float on the ompi-layer
   or as C99 float _complex in opal.
   unconstify to not break windows / the mpi* wrappers.

This commit was SVN r21770.

The following SVN revision numbers were found above:
  r21763 --> open-mpi/ompi@668f001351
2009-08-06 14:42:47 +00:00
Josh Hursey
063f5b2ff6 After talking through the patch with Jeff, we have a couple more fixes to r21766 that should also go over to v1.3 in Ticket #1987.
 * Check for {{{dlfcn.h}}} in the self component's configure.m4 (also clean up the .m4 a bit).
 * Adjust the priority of the BLCR component so that the self component has a higher priority (if the application went to the trouble of writing the routines, why not use them?). The 'self' component checks for the appropriate functions during query, so it knows whether it -can- be used during component selection.
 * Adjust some copyrights that I missed before
 * Fix a warning when casting the result of dlsym() into a function pointer. There is a bit of pointer magic to make this happen (thanks to the following website, and the RedHat EL 4 man pages, for illustrating it):
  http://www.opengroup.org/onlinepubs/009695399/functions/dlsym.html

Passing to Jeff for a final review of the patch before moving to v1.3.

This commit was SVN r21768.

The following SVN revision numbers were found above:
  r21766 --> open-mpi/ompi@91e52d062b
2009-08-05 22:07:37 +00:00
Josh Hursey
91e52d062b Fix the 'self' CRS component.
Due to the visibility patch to libltdl in r21731, this module can no longer access or use the libltdl interfaces directly. Instead, just use the dlopen/dlsym/dlclose functions directly. There is a portability implication here, but for the moment it does not seem to bite us.

Also in this patch, cleanup some of the 'self' specific code paths.
 * opal-restart need not special case the 'self' component since it can now interact with it as if it were a normal component.
 * Cleanup the initialization of the cmd line arguments in opal-restart.
 * Make sure to mark opal-restart as a 'tool', but do so by setting the global variable directly instead of setting the environment variable, which could be inherited by the application.
 * Most of the functions in the 'self' component should not be used by a command line tool (exception being 'restart'), so make sure that if we accidentally call them then errors are returned.
 * Increase the priority of the 'none' component to be above that of 'self' when being selected in a command line tool. This allows for both mpirun and opal-restart to work correctly with the 'self' module.

This commit was SVN r21766.

The following SVN revision numbers were found above:
  r21731 --> open-mpi/ompi@0278b86456
2009-08-05 16:21:51 +00:00
George Bosilca
668f001351 Add support for complex (complex8, complex16 and complex32) in the
opal datatype engine. These types are created from two consecutive 
basic types (float, double and long double).

This commit was SVN r21763.
2009-08-04 23:42:32 +00:00
Jeff Squyres
90d6491737 Since ompi_info is no longer written in C++, we no longer require a
C++ compiler in configure.  If we have a C++ compiler, then the MPI
C++ bindings are built by default.  If we don't have a C++ compiler,
then the MPI C++ bindings are not built by default.

--enable-mpi-cxx will now force an error if there is no C++ compiler
available.  --disable-mpi-cxx (or the lack of a C++ compiler) will now
disable many of the C++ compiler checks in configure.

Note that there are a few items to clean up regarding the difference
between C's _Bool type and C++'s bool type.  Right now, we assume that
they are the same.  But they aren't, and they shouldn't be treated as
such.  This cleanup will be forced in MPI-2.2 with the introduction of
the MPI_C_BOOL MPI datatype.

This commit was SVN r21755.
2009-08-04 11:54:01 +00:00
George Bosilca
5155eaf945 The opal datatype engine should _ALWAYS_ be initialized. Therefore move the
call to opal_datatype_init in the opal_util_init.

This commit was SVN r21754.
2009-08-03 16:46:33 +00:00
Jeff Squyres
98b0a7af3d Per http://www.open-mpi.org/community/lists/devel/2009/08/6555.php,
add an lt_dlerror() to the error output of one of the error cases.

This commit was SVN r21751.
2009-08-03 16:29:52 +00:00
Jeff Squyres
9455eb804f Oops -- fix the type.
This commit was SVN r21750.
2009-08-03 16:26:15 +00:00
Jeff Squyres
7b1f65095b Update to match the current code and be a bit more explicit (since
others are currently looking at this code).

This commit was SVN r21746.
2009-07-30 12:45:59 +00:00
Jeff Squyres
cb653bc4e8 Change the test memalign() call to use an alignment of 4 so that some
debuggers stop complaining. :-)

This commit was SVN r21744.
2009-07-29 20:33:38 +00:00
Jeff Squyres
69139e4171 Print a warning if someone tries to set opal_ptmalloc2_disable via an
MCA parameter file.

This commit was SVN r21743.
2009-07-29 20:05:56 +00:00
Jeff Squyres
d12db20089 This function actually returns an int, not a bool (OPAL_SUCCESS or
OPAL_ERROR).  Also add a line to the docs describing that it's ok to
pass in NULL for the source_file.

This commit was SVN r21742.
2009-07-29 19:52:18 +00:00
Jeff Squyres
c7376ae053 Start using Libtool's shared library versioning scheme. See lengthy
note in VERSION file.

NOTE: the versions will ''always'' be 0:0:0 on the SVN trunk and
developer branches.  They will only have meaningful values (starting
with 0:0:0 in 1.3.4) on release branches.  Only RMs will modify these
values immediately preceding a release.

This commit was SVN r21729.
2009-07-23 21:35:17 +00:00
Ralph Castain
c459615f8f When someone specifies a rank-file slot-list of N:*, stop the loop at the proper place (we were going through the loop one too many times).
Thanks to Eugene for spotting it.

This commit was SVN r21728.
2009-07-23 17:51:15 +00:00
Jeff Squyres
bfd689f0ef Per discussion on the mailing list, backing out r21723.
This commit was SVN r21725.

The following SVN revision numbers were found above:
  r21723 --> open-mpi/ompi@2250be582d
2009-07-22 00:02:00 +00:00
Jeff Squyres
38faae6eab Ignore this component until it can be fixed properly.
This commit was SVN r21724.
2009-07-21 22:35:45 +00:00
Iain Bason
2250be582d Added autodetect installdirs component. Currently supports Solaris and Linux.
* Installation directories will be inferred from the actual location
  of the shared library that contains the component.

* OPAL_PREFIX and other environment variables allow users to override
  the inferred directories.  They should no longer be necessary in
  most cases, though.

* Any directories that cannot be inferred will fall back to whatever
  is provided by the config installdirs component.

This commit was SVN r21723.
2009-07-21 20:19:38 +00:00
Shiqing Fan
4c3ea7a0d2 Add a missing header for defining O_BINARY on windows.
This commit was SVN r21719.
2009-07-20 09:00:58 +00:00
Aurelien Bouteiller
9cc2557f9c fixed a typo. oapl instead of opal
This commit was SVN r21711.
2009-07-17 22:20:53 +00:00
George Bosilca
03a0b04ab8 Get rid of all unused functions.
This commit was SVN r21704.
2009-07-16 19:53:07 +00:00
George Bosilca
3e971e61f3 The system headers are supposed to be protected by #ifdef and not by #if.
This commit was SVN r21700.
2009-07-16 18:27:33 +00:00
Shiqing Fan
957cdceb20 Another OMPI->OPAL renaming.
This commit was SVN r21659.
2009-07-14 06:54:17 +00:00
Shiqing Fan
f991c87c6a A few structures that need to be exported for Windows.
This commit was SVN r21658.
2009-07-14 06:53:42 +00:00
George Bosilca
a934b9d975 Add the Open MPI specific part based on a patch from Manuel. Add the
sparc and alpha. A manpage patch is also included. This partially fixes
ticket #1973.

This commit was SVN r21654.
2009-07-13 20:01:12 +00:00
George Bosilca
81c16ac317 Add missing header.
This commit was SVN r21653.
2009-07-13 19:40:31 +00:00
Shiqing Fan
503f2817b3 Corresponding changes to r21641 and r21642 for Windows.
- Add a CMake macro for checking OPAL_MAX_XXX values, re-written from OPAL_WITH_OPTION_MIN_MAX_VALUE m4 function. 
- Definition prefix changes and additional datatype alignments checking.
- Finish the datatype splitting on Windows too. :-)

This commit was SVN r21649.

The following SVN revision numbers were found above:
  r21641 --> open-mpi/ompi@6c5532072a
  r21642 --> open-mpi/ompi@c971c09eb6
2009-07-13 17:39:41 +00:00
Shiqing Fan
2f552eb8c1 Missing C_DECLS.
This commit was SVN r21648.
2009-07-13 17:31:42 +00:00
Rainer Keller
6c5532072a - Split the datatype engine into two parts: an MPI specific part in
OMPI
   and a language agnostic part in OPAL. The convertor is completely
   moved into OPAL.  This offers several benefits as described in RFC
   http://www.open-mpi.org/community/lists/devel/2009/07/6387.php
   namely:
    - Fewer basic types (int* and float* types, boolean and wchar)
    - Fixing naming scheme to ompi-nomenclature.
    - Usability outside of the ompi-layer.
 - Due to the fixed nature of simple opal types, their information is
   completely
   known at compile time and therefore constified
 - With fewer datatypes (22), the actual sizes of bit-field types may be
   reduced
   from 64 to 32 bits, allowing reorganizing the opal_datatype
   structure, eliminating holes and keeping data required in convertor
   (upon send/recv) in one cacheline...
   This has implications to the convertor-datastructure and other parts
   of the code.
 - Several performance tests have been run, the netpipe latency does not
   change with
   this patch on Linux/x86-64 on the smoky cluster.
 - Extensive tests have been done to verify correctness (no new
   regressions) using:
   1. mpi_test_suite on linux/x86-64 using clean ompi-trunk and
    ompi-ddt:
    a. running both trunk and ompi-ddt resulted in no differences
       (except for MPI_SHORT_INT and MPI_TYPE_MIX_LB_UB do now run
       correctly).
    b. with --enable-memchecker and running under valgrind (one buglet
       when run with static found in test-suite, committed)
   2. ibm testsuite on linux/x86-64 using clean ompi-trunk and ompi-ddt:
      all passed (except for the dynamic/ tests failed!! as trunk/MTT)
   3. compilation and usage of HDF5 tests on Jaguar using PGI and
      PathScale compilers.
   4. compilation and usage on Scicortex.
 - Please note, that for the heterogeneous case, (-m32 compiled
   binaries/ompi), neither
   ompi-trunk, nor ompi-ddt branch would successfully launch.

This commit was SVN r21641.
2009-07-13 04:56:31 +00:00
Jeff Squyres
d7d07e0720 Improve the help messages from r20706.
This commit was SVN r21616.

The following SVN revision numbers were found above:
  r20706 --> open-mpi/ompi@248bbb8a2f
2009-07-09 11:58:31 +00:00
George Bosilca
2570a15651 Add a TODO bullet for later processing ...
This commit was SVN r21611.
2009-07-07 17:27:47 +00:00
George Bosilca
b85e3636f3 Cope with the case where IPv6 headers are not available.
This commit was SVN r21593.
2009-07-02 18:00:26 +00:00
Ralph Castain
d3fb39073f Initialize a variable to ensure we get the correct number of bound processors
This commit was SVN r21590.
2009-07-02 17:48:04 +00:00
Shiqing Fan
0e09cb650e The kernel index of the network interface wasn't set on Windows, which caused a lot of problems.
This commit was SVN r21587.
2009-07-02 14:44:41 +00:00
Shiqing Fan
22666721a5 Fix a typo.
This commit was SVN r21584.
2009-07-02 08:49:22 +00:00
Shiqing Fan
c8284907c4 Another header for using _getch on Windows.
This commit was SVN r21537.
2009-06-26 13:07:08 +00:00
Shiqing Fan
77f33d182b Add another definition for a deprecated UNIX function on Windows to silence the warnings.
This commit was SVN r21507.
2009-06-24 18:14:23 +00:00
Jeff Squyres
246caafe06 Correct the logic of the check for the env variable
OMPI_MCA_memory_ptmalloc2_disable and also add an explicit check for
FAKEROOTKEY (see http://bugs.debian.org/531522).

This commit was SVN r21489.
2009-06-20 11:22:06 +00:00
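The environment check described in commits 246caafe06 and f42727707b can be sketched roughly as follows. This is a minimal illustration, not the code from the ptmalloc2 component: the function name `ptmalloc2_disabled_by_env` is hypothetical, and it assumes that a value of exactly "1" means "disable" and that the mere presence of FAKEROOTKEY is enough to skip the hooks.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the ptmalloc2 hooks should be
 * skipped at init time, 0 otherwise. */
static int ptmalloc2_disabled_by_env(void)
{
    /* Assumption: the user disables the component by setting the
     * variable to exactly "1". */
    const char *val = getenv("OMPI_MCA_memory_ptmalloc2_disable");
    if (NULL != val && 0 == strcmp(val, "1")) {
        return 1;
    }

    /* Assumption: running under fakeroot (FAKEROOTKEY is set) is
     * treated as a reason to leave the allocator alone. */
    if (NULL != getenv("FAKEROOTKEY")) {
        return 1;
    }

    return 0;
}
```

A caller would consult this once during initialization and skip hook installation when it returns 1.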
Jeff Squyres
f42727707b Per http://bugs.debian.org/531522, add an MCA param/environment
variable to allow the disabling of the ptmalloc2 component at init
time.

This commit was SVN r21479.
2009-06-19 10:50:23 +00:00
Jeff Squyres
6777f01380 Also look for /dev/ipath
This commit was SVN r21410.
2009-06-11 00:35:21 +00:00
Ralph Castain
f966d9f972 Fix visibility issues with opal_graph functions.
Fix the carto test so it can compile - need to update input file so it can run

This commit was SVN r21403.
2009-06-09 15:02:57 +00:00
Ralph Castain
c3c1ab1337 Correct a comment in paffinity.h about what paffinity_get returns - it was inaccurate.
Revamp the affinity detection/set procedure in mpi_init to correctly detect when we have already been bound to processors, given the revised understanding of paffinity_get. Add a new paffinity macro to make checking for already bound a little nicer.

This commit was SVN r21402.
2009-06-09 14:33:35 +00:00
Josh Hursey
70333b9441 Some components were still using OMPI_*_VERSION instead of OPAL_*_VERSION, so convert them over (Jeff is taking care of PLPA, so that is not included here).
This commit was SVN r21384.
2009-06-05 15:34:59 +00:00
Shiqing Fan
e46bf10efd Correctly include win32 util header.
This commit was SVN r21343.
2009-06-01 19:16:00 +00:00
Rainer Keller
b572dc3591 - As discussed, revert r21330: Fortran-configure info should
not end up in OPAL
 - Will post an updated patch for the OMPI_ALIGNMENT_ parts (within C).

This commit was SVN r21342.

The following SVN revision numbers were found above:
  r21330 --> open-mpi/ompi@95596d1814
2009-06-01 19:02:34 +00:00
Rainer Keller
95596d1814 - Move alignment and size output generated by configure-tests
into the OPAL namespace, eliminating cases like opal/util/arch.c
   testing for ompi_fortran_logical_t.
   As this is processor- and compiler-related information
   (e.g. does the compiler/architecture support REAL*16)
   this should have been on the OPAL layer.
 - Unifies f77 code using MPI_Flogical instead of opal_fortran_logical_t

 - Tested locally (Linux/x86-64) with mpich and intel testsuite
   but would like to get this weekend's MTT output


 - PLEASE NOTE: configure-internal macro-names and
   ompi_cv_ variables have not been changed, so that
   external platform (not in contrib/) files still work.

This commit was SVN r21330.
2009-05-30 15:54:29 +00:00
Shiqing Fan
06841bf721 Fix a typo.
This commit was SVN r21305.
2009-05-27 19:08:46 +00:00
Shiqing Fan
ed61198423 Rewrite inet_ntop and inet_pton with WSAAddressToString and WSAStringToAddress for Windows. Lower the minimum required Windows version.
This commit was SVN r21304.
2009-05-27 19:04:59 +00:00
Shiqing Fan
b3c077e112 Microsoft doesn't provide the inet_pton and inet_ntop APIs on Windows XP, only on Windows Vista and 2008. So add a standalone version of the inet_pton and inet_ntop functions from ISC.
This commit was SVN r21295.
2009-05-27 14:32:30 +00:00
Jeff Squyres
510ec093b6 Remove some dead code
This commit was SVN r21275.
2009-05-26 21:17:00 +00:00
Iain Bason
e7ff2368d6 This fixes trac:1930.
Emit a more informative error message when the file descriptor limit is
reached during an accept() call.  Also, abort when the accept fails to
avoid an infinite loop.

Emit a more informative error message when the help file can't be opened.

This commit was SVN r21271.

The following Trac tickets were found above:
  Ticket 1930 --> https://svn.open-mpi.org/trac/ompi/ticket/1930
2009-05-26 20:03:21 +00:00
Rainer Keller
36ee105d6a - Fix Coverity CID #1207:
set the tmp_str to NULL, so we don't have any double-free...
   Additionally, we should check for malloc returning NULL...

This commit was SVN r21228.
2009-05-14 00:21:15 +00:00
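The two points in commit 36ee105d6a (reset the pointer after freeing it, and check the result of malloc) can be illustrated with a small sketch. The helper names `dup_string` and `release_string` are hypothetical and are not taken from the Open MPI sources.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate `src` into a freshly allocated buffer.
 * Returns NULL if malloc fails, so callers must check the result. */
static char *dup_string(const char *src)
{
    size_t len = strlen(src) + 1;
    char *copy = malloc(len);
    if (NULL == copy) {
        return NULL;            /* out of memory: report it, don't dereference */
    }
    memcpy(copy, src, len);
    return copy;
}

/* Hypothetical helper: free the string and reset the caller's pointer,
 * so that a second call becomes a harmless free(NULL) instead of a
 * double free. */
static void release_string(char **str)
{
    free(*str);
    *str = NULL;
}
```

Calling `release_string(&tmp_str)` twice in a row is safe, because the second call sees a NULL pointer.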
Rainer Keller
3f7f2b6f0f - Multiple functions that allocate and return new
strings should have __opal_attribute_malloc__;
   update the comment of opal_path_access -- a new string is returned.

This commit was SVN r21227.
2009-05-14 00:20:07 +00:00
Rainer Keller
73fd329cbd - Add the proper __opal_attribute_format__(__printf__...) to
declarations.

This commit was SVN r21226.
2009-05-14 00:10:59 +00:00
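For readers unfamiliar with the printf format attribute mentioned in commit 73fd329cbd, here is a minimal sketch of how such an annotation is typically wired up with GCC-compatible compilers. The macro `EXAMPLE_ATTR_PRINTF` and the function `format_message` are hypothetical stand-ins; the actual definition of `__opal_attribute_format__` is not shown in this log.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical macro: expands to the GCC format attribute when available,
 * and to nothing on other compilers. */
#if defined(__GNUC__)
#  define EXAMPLE_ATTR_PRINTF(fmt_idx, first_arg) \
       __attribute__((__format__(__printf__, fmt_idx, first_arg)))
#else
#  define EXAMPLE_ATTR_PRINTF(fmt_idx, first_arg)
#endif

/* The attribute goes on the declaration: argument 3 is the format string,
 * and the variadic arguments start at position 4. */
static int format_message(char *buf, size_t len, const char *fmt, ...)
    EXAMPLE_ATTR_PRINTF(3, 4);

static int format_message(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, len, fmt, ap);   /* returns the length that was needed */
    va_end(ap);
    return n;
}
```

With the attribute in place, the compiler can warn when a call passes arguments that do not match the format string.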
Shiqing Fan
3137001772 Read from the correct registry entry on Windows Vista and Server 2008.
This commit was SVN r21224.
2009-05-13 15:56:37 +00:00
Ralph Castain
aa25a51c92 Do not mark the mpi_paffinity_alone param as deprecated so we don't scare Jeff...er...users.
This commit was SVN r21218.
2009-05-12 15:41:11 +00:00
Josh Hursey
ec6c5bf5e9 Make sure that when we destruct the pointer array that we set the address to NULL and size to 0. This will help to flag accidental usage of a destructed pointer array object.
This commit was SVN r21216.
2009-05-12 14:13:07 +00:00
Jeff Squyres
05d87ee7b4 Because this error comes up over and over and over and over and ...
Libltdl erroneously returns an error string of "file not found" for
lots of reasons, even if the file really *is* there, but just failed
to dlopen() for some reason.  So if lt_dlerror() returns "file not
found", do some simple heuristics and if we *do* find a file, print a
slightly better error message.

This commit was SVN r21214.
2009-05-12 12:41:42 +00:00
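The heuristic described in commit 05d87ee7b4 might look roughly like the sketch below. It is an assumption-laden illustration: the function name `clarify_dlopen_error` and the replacement message are invented here, and the real code may use different checks than a plain stat().

```c
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: given the path we tried to open and the error
 * string reported by the loader, return a message to show the user.
 * If the loader claims "file not found" but the file is actually there,
 * substitute a less misleading (illustrative) message. */
static const char *clarify_dlopen_error(const char *path, const char *err)
{
    struct stat st;

    if (NULL != err && 0 == strcmp(err, "file not found") &&
        0 == stat(path, &st)) {
        return "file exists but could not be opened "
               "(perhaps a missing symbol or dependency)";
    }
    return err;   /* nothing better to say: pass the original through */
}
```

Any other error string, or a path that really does not exist, is passed through unchanged.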
Ralph Castain
d396f0a6fc Per the discussion on the devel list, move the binding of processes to processors from MPI_Init to process start. This involves:
1. replacing mpi_paffinity_alone with opal_paffinity_alone - for back-compatibility, I have aliased mpi_paffinity_alone to the new param name. This causes a mild abstraction break in the opal/mca/paffinity framework - per the devel discussion...live with it. :-) I also moved the ompi_xxx global variable that tracked maffinity setup so it could be properly closed in MPI_Finalize to the opal/mca/maffinity framework to avoid an abstraction break.

2. Added code to the odls/default module to perform paffinity binding and maffinity init between process fork and exec. This has been tested on IU's odin cluster and works for both MPI and non-MPI apps.

3. Revise MPI_Init to detect if affinity has already been set, and to attempt to set it if not already done. I have *not* tested this as I haven't yet figured out a way to do so - I couldn't get slurm to perform cpu bindings, even though it supposedly does do so.

This has only been lightly tested and would definitely benefit from a wider range of evaluation...

This commit was SVN r21209.
2009-05-12 02:18:35 +00:00
Josh Hursey
d920a302f3 Some more C/R related commits that have been sitting off-trunk for a while.
* Pass the sequence number of the checkpoint along with reference from the global to the local coordinator.
 * 'orte-restart --apponly' now just generates the app context file, and does not run with it. This provides the user the ability to edit the file before launching. 
 * Add a OPAL_CRS_NONE state
 * Split the INC into three distinct parts.
 * Implement a restart mechanism for the 'none' component. If given a context it simply execvp()'s it.

This commit was SVN r21195.
2009-05-08 20:51:13 +00:00
Josh Hursey
5d0607395d A couple of C/R related commits that have been sitting off-trunk for a while.
* Add 'orte-checkpoint -l' option that lists all checkpoints currently available on the system.
 * Add 'orte-restart -i' which prints information regarding the checkpoint targeted for restart.
 * Add ability to extract the timing metadata.
 * Fix show_help() in the orte-checkpoint and orte-restart tools. They should be using the opal versions instead of the orte versions (otherwise nothing is printed).

This commit was SVN r21194.
2009-05-08 19:41:11 +00:00
Shiqing Fan
537b8cd8b1 Get rid of improper use of SET_SOURCE_FILES_PROPERTIES. With the latest CMake (2.6 patch 4), we get many errors that did not appear in previous versions.
This commit was SVN r21188.
2009-05-07 17:41:05 +00:00
Rainer Keller
b0754071b7 - For compilation with BLCR and --with-ft=cr, #include <string.h>
This commit was SVN r21185.
2009-05-07 16:14:59 +00:00
Greg Koenig
60485ff95f This is a very large change to rename several #define values from
OMPI_* to OPAL_*.  This allows the opal layer to be used more independently
of the rest of ompi.

NOTE: 9 "svn mv" operations immediately follow this commit.

This commit was SVN r21180.
2009-05-06 20:11:28 +00:00
Rainer Keller
b8bb7865bc - Fix Coverity CID 9
Check for errors in fcntl, as we depend on close-on-exec;
   F_SETFD returns -1 in case of error (with the cause stored in errno).
   To avoid a follow-up warning about not freeing filename, move the check up.

This commit was SVN r21171.
2009-05-05 19:04:58 +00:00
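The error check described in commit b8bb7865bc can be sketched with standard POSIX calls. The helper name `set_close_on_exec` is hypothetical; only `fcntl`, `F_GETFD`, `F_SETFD` and `FD_CLOEXEC` are real API.

```c
#include <errno.h>   /* fcntl() reports the failure cause through errno */
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: mark `fd` close-on-exec.
 * Returns 0 on success, or -1 with errno set by fcntl(). */
static int set_close_on_exec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0) {
        return -1;                      /* e.g. EBADF for an invalid fd */
    }
    if (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0) {
        return -1;                      /* F_SETFD returns -1 on error */
    }
    return 0;
}
```

Checking the return value matters here because code that relies on close-on-exec would otherwise leak the descriptor into child processes without noticing.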
Shiqing Fan
cd565923d3 Completely remove ltdl support for Windows build.
This commit was SVN r21170.
2009-05-05 18:59:13 +00:00
Josh Hursey
8b8bee04d6 It seems that some of the patches were missed in r21131. :(
This patch contains the following items:
 * Fix the flag passed to open() for the read side of the named pipe between the local and app coordinator. There is a race condition when using O_RDWR on a named pipe (not sure how that bug got in there in the first place).
 * Adjust control in the C/R thread timing
 * Clarify return code in BLCR component
 * Allow the user to adjust the max wait time for the named pipes in the FileM local coordinator by using the MCA parameter "snapc_full_max_wait_time" (Default: 20 seconds)
 * If the application terminates while there are active FileM operations, force mpirun to wait on these operations to complete.
 * Allow the user to set the local copy command (Default: cp) via MCA parameter "filem_rsh_cp"
 * Implement the ability to throttle the number of outgoing connections in FileM. At larger scales this type of explicit throttling helps prevent overwhelming the HNP machine. Default: 10, set via MCA parameter "filem_rsh_max_outgoing"

This commit was SVN r21167.

The following SVN revision numbers were found above:
  r21131 --> open-mpi/ompi@0deb009225
2009-05-05 16:45:49 +00:00
Shiqing Fan
c3380e9df2 put all generated files in the binary directory.
This commit was SVN r21160.
2009-05-05 13:50:48 +00:00
Shiqing Fan
8db5c3c002 Add missing quotation marks to the variables, in order to keep the semicolons in the output c file.
This commit was SVN r21151.
2009-05-05 08:29:19 +00:00
Ralph Castain
468800996b Make it possible to no-build the carto framework
Could swear we had done this before...but I guess not!

This commit was SVN r21150.
2009-05-05 03:54:58 +00:00
Josh Hursey
1327c57e9d add back a missing header
This commit was SVN r21148.
2009-05-04 21:30:11 +00:00
Shiqing Fan
5856cedc2b Remove libltdl related files and folders.
Add a find module for libltdl, so that users can still enable dlopen support (default off) and use a natively installed libtool.

This commit was SVN r21146.
2009-05-04 17:35:48 +00:00
Ralph Castain
e1673778be Replace missing headers
This commit was SVN r21136.
2009-05-01 15:09:10 +00:00
Josh Hursey
38aca518bd Properly initialize this variable
This commit was SVN r21130.
2009-04-30 16:43:05 +00:00
Josh Hursey
ab63ab6568 forgot to update the copyright
This commit was SVN r21128.
2009-04-30 16:39:54 +00:00
Josh Hursey
76812318bb Fix a potential NULL reference
This commit was SVN r21127.
2009-04-30 16:38:43 +00:00
Josh Hursey
759c2b5596 Add a 'crs_blcr_dev_null' MCA parameter. This causes BLCR to checkpoint directly to /dev/null instead of to a file.
Though this is not useful in checkpointing an application, it can be a useful diagnostic.

This commit was SVN r21125.
2009-04-30 16:32:55 +00:00
Shiqing Fan
ff0e51f686 Include a missing header.
This commit was SVN r21121.
2009-04-30 09:03:21 +00:00
Shiqing Fan
fbaa30bf61 Add a few log file definitions for Windows.
This commit was SVN r21119.
2009-04-30 08:59:46 +00:00
Shiqing Fan
e7b6445b32 Add a missed .windows file into the tarball.
This commit was SVN r21105.
2009-04-29 10:31:10 +00:00
Rainer Keller
221fb9dbca ... Delayed due to notifier commits earlier this day ...
- Delete unnecessary header files using
   contrib/check_unnecessary_headers.sh after applying
   patches, that include headers, being "lost" due to
   inclusion in one of the now deleted headers...

   In total 817 files are touched.
   In ompi/mpi/c/ header files are moved up into the actual c-file,
   where necessary (these are the only additional #include),
   otherwise it is only deletions of #include (apart from the above
   additions required due to notifier...)

 - To get different MCAs (OpenIB, TM, ALPS), an earlier version was
   successfully compiled (yesterday) on:
   Linux locally using intel-11, gcc-4.3.2 and gcc-SVN + warnings enabled
   Smoky cluster (x86-64 running Linux) using PGI-8.0.2 + warnings enabled
   Lens cluster (x86-64 running Linux) using Pathscale-3.2 + warnings enabled

This commit was SVN r21096.
2009-04-29 01:32:14 +00:00
Rainer Keller
6c1cce8761 - For the upcoming header cleanup commit,
several header files (previously included by header-files)
   now have to be moved "upward".
   This is mainly system headers such as string.h, stdio.h and for
   networking, but also some orte headers.

This commit was SVN r21095.
2009-04-29 00:49:23 +00:00
Josh Hursey
84471e4bd4 Protect the free of base->sig.sh_old . This was raising a warning under some finalization circumstances.
This should be moved to v1.3

Thanks to Yaakoub El Khamra for the patch. 

This commit was SVN r21079.
2009-04-27 17:05:10 +00:00
Shiqing Fan
3d4e0472d6 Add windows support files into the tarball, including .windows, CMakeLists.txt files, and CMake modules. Thanks to Jeff for testing it on Linux.
This commit was SVN r21069.
2009-04-24 16:39:33 +00:00
Jeff Squyres
0bd9ef0bb9 Some valgrind-clean fixes. Thanks to "Number Cruncher" on the devel
list for pointing these out.

This commit was SVN r21060.
2009-04-23 18:50:46 +00:00
Jeff Squyres
9b5e194d9b Fix opal_basename. Wow; how long has that been broken?
This commit was SVN r21057.
2009-04-22 20:48:24 +00:00
Brian Barrett
2ca0b7fe44 remove some checks which are not needed after the recent ptmalloc2 changes
This commit was SVN r21042.
2009-04-19 18:17:05 +00:00
Jeff Squyres
e90ecb6020 Fix a compiler warning. Put in a good comment explaining why it is
declared the way it is.  Sigh.

This commit was SVN r21040.
2009-04-17 21:59:31 +00:00
Ralph Castain
afe1950da5 Make the error message clearer - this error is used only when two buffer types don't match, thus preventing an operation from being executed
This commit was SVN r21033.
2009-04-16 16:23:28 +00:00
Jeff Squyres
35fc9fedd2 MTT is your friend: Cisco tests --enable-static --disable-shared, but
we had already tested this scenario manually to know that it seemed to
be working.  What we ''didn't'' test was --enable-static
--disable-shared --disable-dlopen -- but my MTT '''did.'''  Yay!

This commit fixes that scenario.  Essentially we need to call a dummy
function in hooks.c to ensure that the linker pulls in all those
symbols into the final executable (and therefore pulls in the
malloc_initialize_hook, etc.).  Thanks for the heads-up from Brian in
fixing this one!

This commit was SVN r21022.
2009-04-15 19:09:10 +00:00
Ralph Castain
9c39a3edd7 Enable the passing of MCA params to dynamically spawned jobs. This creates a new info_key "ompi_param" that allows a user to specify MCA params for a dynamically spawned job.
We currently apply all of the MCA params in the parent job to the child. This commit allows a user to specify additional params for the child job, and to override any pre-existing params with the new value so they can better control behavior of the child job.

This commit was SVN r20989.
2009-04-14 14:15:49 +00:00
Shiqing Fan
0ea6d48320 Add a missed .windows file for the timer component, which should always be built statically.
This commit was SVN r20987.
2009-04-14 12:19:21 +00:00
Jeff Squyres
3cfa8f55c4 Gaah; I meant to include a better comment in the last commit but had
forgotten to save before the commit was sent.

This comment explains why we're doing a cache check here rather than a
real check.

This commit was SVN r20975.
2009-04-10 21:16:23 +00:00
Jeff Squyres
9fcd01035d Fix a problem reported by Steve Kagl on the user's list; the posix
component (which we probably don't test regularly because we probably
only test environments where the other paffinity components are used)
was not getting built because it had a bad configure test.

This commit was SVN r20974.
2009-04-10 21:15:20 +00:00
Jeff Squyres
8fac195a3a Fixes trac:1871. Take a slightly different approach than before:
1. Probe the signal number that we want
 1. If a handler is already installed there:
    1. if the opal_signal MCA param entry for this signal is suffixed
       with ":complain", then output a show_help message
    1. do not install our signal handler
 1. otherwise, install our signal handler

Hence, we've shifted to a policy of only complaining if the user asks
us to complain.

This commit was SVN r20969.

The following Trac tickets were found above:
  Ticket 1871 --> https://svn.open-mpi.org/trac/ompi/ticket/1871
2009-04-10 15:32:33 +00:00
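The probe-then-install policy from commit 8fac195a3a can be sketched with sigaction(). This is only an illustration under stated assumptions: the names `install_if_unclaimed` and `noop_handler` are hypothetical, the `complain` flag stands in for the ":complain" suffix of the opal_signal MCA parameter, and a plain fprintf replaces the show_help machinery.

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction() under strict C modes */
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Example handler used only for illustration. */
static void noop_handler(int signum)
{
    (void)signum;
}

/* Hypothetical helper. Returns 1 if our handler was installed, 0 if a
 * non-default disposition was already present (left untouched), and
 * -1 on error. */
static int install_if_unclaimed(int signum, void (*handler)(int), int complain)
{
    struct sigaction current, ours;

    /* Probe: passing NULL as the new action only reads the current one. */
    if (sigaction(signum, NULL, &current) != 0) {
        return -1;
    }

    /* Anything other than SIG_DFL (including SIG_IGN) counts as claimed. */
    if (current.sa_handler != SIG_DFL) {
        if (complain) {
            fprintf(stderr, "signal %d already has a handler; "
                            "not installing ours\n", signum);
        }
        return 0;
    }

    memset(&ours, 0, sizeof(ours));
    ours.sa_handler = handler;
    sigemptyset(&ours.sa_mask);
    return (sigaction(signum, &ours, NULL) == 0) ? 1 : -1;
}
```

The second call for the same signal returns 0, because the first call has already installed a non-default handler.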
Jeff Squyres
500750b542 Oops; the MCA param name is "opal_signal", not "opal_signals". Thanks
for noticing, Ralph!

This commit was SVN r20968.
2009-04-09 16:13:07 +00:00
Nysal Jan
2500d88380 Optimize the computation of 16-bit checksum
This commit was SVN r20965.
2009-04-09 11:04:38 +00:00
Shiqing Fan
6e04a4de08 On Windows, define an equivalent type for in_addr_t, and correctly include unistd.h.
This commit was SVN r20951.
2009-04-07 16:07:05 +00:00
Jeff Squyres
a13dfb2140 Add in a proper test for munmap.
This commit was SVN r20936.
2009-04-04 00:43:17 +00:00
Jeff Squyres
52a0e5fe69 Add some checks for more network driver types.
This commit was SVN r20934.
2009-04-02 19:17:21 +00:00
Nysal Jan
ab18a3629f Change the return type to handle the case where an invalid interface name is passed to this function.
This commit was SVN r20933.
2009-04-02 18:35:09 +00:00
Jeff Squyres
3bf8c7025a Remove compiler warning about function not being prototyped.
This commit was SVN r20929.
2009-04-02 13:06:47 +00:00
Jeff Squyres
0f517c3d3f Gah; some non-final code got merged in by accident. Remove debugging
printf and put in the final test code for malloc.

This commit was SVN r20924.
2009-04-01 18:20:23 +00:00
Jeff Squyres
bf17ce1d3f Doh; forgot to add the OPAL_DECLSPEC to munmap().
This commit was SVN r20923.
2009-04-01 18:09:25 +00:00
Jeff Squyres
7aa431882c Remove the mallopt component (accidentally missed in r20921); refs
#1853.

This commit was SVN r20922.

The following SVN revision numbers were found above:
  r20921 --> open-mpi/ompi@0d52271cd6
2009-04-01 18:02:08 +00:00
Jeff Squyres
0d52271cd6 Per http://www.open-mpi.org/community/lists/announce/2009/03/0029.php
and https://svn.open-mpi.org/trac/ompi/ticket/1853, mallopt() hints do
not always work -- it is possible for memory to be returned to the OS
and therefore OMPI's registration cache becomes invalid.

This commit removes all use of mallopt() and uses a different way to
integrate ptmalloc2 than we have done in the past.  In particular, we
use almost exactly the same technique as MX:

 * Remove all uses of mallopt, to include the opal/memory mallopt
   component.
 * Name-shift all of OMPI's internal ptmalloc2 public symbols (e.g.,
   malloc -> opal_memory_ptmalloc2_malloc).
 * At run-time, use the existing glibc allocator malloc hook function
   pointers to fully hijack the glibc allocator with our own
   name-shifted ptmalloc2.
 * Make the decision whether to hijack the glibc allocator ''at run
   time'' (vs. at link time, as previous ptmalloc2 integration
   attempts have done).  Look at the OMPI_MCA_mpi_leave_pinned
   and OMPI_MCA_mpi_leave_pinned_pipeline environment variables and
   the existence of /sys/class/infiniband to determine if we should
   install the hooks or not.
 * As an added bonus, we can now tell if libopen-pal is linked
   statically or dynamically, and if we're linked statically, we
   assume that munmap intercept support doesn't work.

See the opal/mca/memory/ptmalloc2/README-open-mpi.txt file for all the
gory details about the implementation.

Fixes trac:1853.

This commit was SVN r20921.

The following Trac tickets were found above:
  Ticket 1853 --> https://svn.open-mpi.org/trac/ompi/ticket/1853
2009-04-01 17:52:16 +00:00
Jeff Squyres
b7a052a81d Fix r20917 -- some systems require <unistd.h> for _SC_PAGESIZE.
This commit was SVN r20920.

The following SVN revision numbers were found above:
  r20917 --> open-mpi/ompi@d10393a925
2009-04-01 16:41:12 +00:00
George Bosilca
b7c1ae4f76 Nothing important, just an indentation.
This commit was SVN r20919.
2009-04-01 15:27:16 +00:00
George Bosilca
d10393a925 A more optimized version of the set function. It only touches the first byte on
each page. Anyway, this function is _NEVER_ called as we use bind instead of set.
So please don't rely on the first touch memory affinity to do the right thing.

This commit was SVN r20917.
2009-04-01 15:24:03 +00:00
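The "touch only the first byte of each page" idea from commit d10393a925 can be sketched as below. The function name `touch_first_byte_per_page` is hypothetical, and the choice to rewrite each byte with its own value (rather than, say, zeroing it) is an assumption made so that the buffer contents are preserved.

```c
#include <stddef.h>
#include <unistd.h>   /* sysconf() and _SC_PAGESIZE */

/* Hypothetical helper: touch one byte per page in [base, base + len).
 * Offsets are counted from `base`, which need not be page aligned.
 * Returns the number of bytes touched, or 0 if the page size is unknown. */
static size_t touch_first_byte_per_page(void *base, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    volatile unsigned char *p = base;
    size_t touched = 0;

    if (page <= 0) {
        return 0;
    }
    for (size_t off = 0; off < len; off += (size_t)page) {
        p[off] = p[off];   /* volatile forces a real read and write */
        touched++;
    }
    return touched;
}
```

Compared with writing every byte, this performs one write per page, which is enough to fault the page in under a first-touch policy.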
George Bosilca
6ca6cfaafc Mark the address with the correct type.
This commit was SVN r20916.
2009-04-01 15:18:08 +00:00
Terry Dontje
4b43911c6a Remove superfluous spaces in manpages that were causing catman to
generate mangled windex files.  Made ompi-top.1 and ompi-iof.1 build
by default.  Also, added the orte-top synonym to the ompi-top manpage.

This commit was SVN r20915.
2009-04-01 14:40:27 +00:00
Shiqing Fan
36a813415d When building from a tarball, there will be Linux-generated files that cannot be used on Windows, so exclude them and use the ones generated by CMake.
This commit was SVN r20858.
2009-03-24 18:10:57 +00:00
Ralph Castain
17f51a0389 Add a new PML module that acts as a "mini-dr" - when requested, it performs a dr-like checksum on messages for BTL's that require it, as specified by MCA params.
Add two new configure options that specify:

1. when to add padding to the openib control header - this *only* happens when the configure option is specified

2. when to use the dr-like checksum as opposed to the memcpy checksum. Not selectable at runtime - to eliminate performance impacts, this is a configure-only option

Also removed an unused checksum version from opal/util/crc.h.

The new component still needs a little cleanup and some sync with recent ob1 bug fixes. It was created as a separate module to avoid performance hits in ob1 itself, though most of the code is duplicative. The component is only selectable by either specifying it directly, or configuring with the dr-like checksum -and- setting -mca pml_csum_enable_checksum 1.

Modify the LANL platform files to take advantage of the new module.

This commit was SVN r20846.
2009-03-23 23:52:05 +00:00
Ralph Castain
8d313b55ef Correct a couple of minor typos
This commit was SVN r20843.
2009-03-23 18:05:34 +00:00
Rainer Keller
a3c3babe01 - Ewww, r20817 messed up PGI on Jaguar big time!
Now, while #include "ompi_config.h" is good and fine in order
   to have OMPI_DECLSPEC,
   here it led to stdint.h (with the uint8_t) being included early
   but INSIDE a namespace "MPI" {}.
   Of course it was not included anymore (think #define _STDINT_H) when
   it was required in opal/class/opal_hash_list.h.
   NOT good.

 - opal/class/opal_object.h: Yeah, one can have nested extern "C" {}
   but it's not necessary. Instead just have the outer *_C_DECLS.

This commit was SVN r20837.

The following SVN revision numbers were found above:
  r20817 --> open-mpi/ompi@6f808d9b05
2009-03-21 01:37:38 +00:00
Rainer Keller
be66cc2279 - We're using uint16_t, uint32_t, and friends,
so #include <stdint.h> if we have it...

This commit was SVN r20835.
2009-03-21 01:26:27 +00:00
Terry Dontje
d521a7bb71 Add visibility feature for Sun Studio compilers.
This commit was SVN r20833.
2009-03-20 10:13:01 +00:00
No Author
4711d6111f Per a comment on the users list, don't try to install our own signal
handlers if there are already non-default handlers installed.  Print a
warning if that situation arises.

'''NOTE:''' This is a definite target for OPAL_SOS conversion -- as it
is right now, this message will be displayed for ''every'' MPI
process.  We want this to be OPAL_SOS'ed when that becomes available
so that the error message can be aggregated nicely.

This commit was SVN r20831.
2009-03-20 01:05:30 +00:00
Shiqing Fan
696416057d Put the debug libraries under 'debug' sub-directory, and set the correct path to find them.
This commit was SVN r20830.
2009-03-19 17:11:47 +00:00
Shiqing Fan
0065d7f0c9 Enable two variables for CPack, for packaging binary tarball and adding a page in the installer.
Enable the debug library suffix, which is essential on Windows. If users debug their own programs in Visual Studio but link them against the release-version libraries of Open MPI, i.e. mix debug and release DLLs, errors will result. We therefore provide both debug and release versions of the libraries, distinguished by the suffix 'd', e.g. libmpid.dll for the debug version.

This commit was SVN r20828.
2009-03-18 17:46:24 +00:00
Jeff Squyres
2815cb88b4 Fixes trac:1836: no reason to constrain the latter numbers to 2 hex
digits.  They likely shouldn't be more than 2 digits anyway, but let's
be social just in case they are (e.g.,
https://bugs.openfabrics.org/show_bug.cgi?id=1544).

This commit was SVN r20824.

The following Trac tickets were found above:
  Ticket 1836 --> https://svn.open-mpi.org/trac/ompi/ticket/1836
2009-03-18 14:43:00 +00:00
Jeff Squyres
730a1b80b2 Roll in 1.3rc4, which includes a fix for a problem discovered by LANL that the test for Valgrind's macro was not strong enough and may therefore try to compile in valgrind support into PLPA even if your version of Valgrind was too old.
This commit was SVN r20819.
2009-03-17 22:18:26 +00:00
Rainer Keller
6f808d9b05 Preparation work for another commit (after RFC):
- This patch solely _adds_ required headers and is rather localized
   The next patch (after RFC) heavily removes headers (based on script)
 - ompi/communicator/communicator.h: For sources that use
   ompi_mpi_comm_world, don't require them to include "mpi.h"
 - ompi/debuggers/ompi_common_dll.c: mca_topo_base_comm_1_0_0_t needs
   #include "ompi/mca/topo/topo.h"
 - ompi/errhandler/errhandler_predefined.h:
   ompi/communicator/communicator.h depends on this header file!
   To prevent recursion just have fwd declarations.
   #include "ompi/types.h" for fwd declarations of the main structs.
 - ompi/mca/btl/btl.h: #include "opal/types.h" for ompi_ptr_t 
 - ompi/mca/mpool/base/mpool_base_tree.c: We use ompi_free_list_t and
   ompi_rb_tree_t, so have the proper classes
 - ompi/mca/op/op.h:
   Op is pretty self-contained: Nobody up to now has done
   #include "opal/class/opal_object.h"
 - ompi/mca/osc/pt2pt/osc_pt2pt_replyreq.h:
   #include "opal/types.h" for ompi_ptr_t 
 - ompi/mca/pml/base/base.h:
   We use opal_lists  
 - ompi/mca/pml/dr/pml_dr_vfrag.h:
   #include "opal/types.h" for ompi_ptr_t
 - ompi/mca/pml/ob1/pml_ob1_hdr.h:
   #include "ompi/mca/btl/btl.h" for mca_btl_base_segment_t
 - opal/dss/dss_unpack.c:
   #include "opal/types.h"
 - opal/mca/base/base.h:
   #include "opal/util/cmd_line.h" for opal_cmd_line_t
 - orte/mca/oob/tcp/oob_tcp.c:
   #include "opal/types.h" for opal_socklen_t
 - orte/mca/oob/tcp/oob_tcp.h:
   #include "opal/threads/threads.h" for opal_thread_t
 - orte/mca/oob/tcp/oob_tcp_msg.c:
   #include "opal/types.h" 
 - orte/mca/oob/tcp/oob_tcp_peer.c:
   #include "opal/types.h"  for opal_socklen_t
 - orte/mca/oob/tcp/oob_tcp_send.c:
   #include "opal/types.h" 
 - orte/mca/plm/base/plm_base_proxy.c:
   #include "orte/util/name_fns.h" for ORTE_NAME_PRINT
 - orte/mca/rml/base/rml_base_receive.c:
   #include "opal/util/output.h" for OPAL_OUTPUT_VERBOSE
 - orte/mca/rml/oob/rml_oob_recv.c:
   #include "opal/types.h" for ompi_iov_base_ptr_t
 - orte/mca/rml/oob/rml_oob_send.c:
   #include "opal/types.h" for ompi_iov_base_ptr_t
 - orte/runtime/orte_data_server.c
   #include "opal/util/output.h" for OPAL_OUTPUT_VERBOSE
 - orte/runtime/orte_globals.h:
   #include "orte/util/name_fns.h" for ORTE_NAME_PRINT

 Tested on Linux/x86-64

This commit was SVN r20817.
2009-03-17 21:34:30 +00:00
Rainer Keller
481b801720 - In opal/class/opal_object.h we don't have the extern "C" {
Use BEGIN_C_DECLS/END_C_DECLS
 - Adapt the other headers as well

This commit was SVN r20802.
2009-03-17 15:11:48 +00:00
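For context on commit 481b801720, the BEGIN_C_DECLS/END_C_DECLS pair is a common idiom for wrapping declarations in extern "C" only when compiling as C++. The definitions below are the usual form of that idiom and are an assumption about how Open MPI defines them; `example_add` is a hypothetical function.

```c
/* Usual form of the idiom: expand to an extern "C" block for C++
 * compilers and to nothing for C compilers. */
#if defined(__cplusplus)
#  define BEGIN_C_DECLS extern "C" {
#  define END_C_DECLS   }
#else
#  define BEGIN_C_DECLS
#  define END_C_DECLS
#endif

/* A header wraps its declarations once, at the outermost level. */
BEGIN_C_DECLS

int example_add(int a, int b);

END_C_DECLS

/* The definition lives in a .c file as usual. */
int example_add(int a, int b)
{
    return a + b;
}
```

Using the macros once per header avoids nested extern "C" blocks, which are legal but unnecessary.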
Rainer Keller
6a72c0f4d1 - As long as a header declares _DECLSPEC functionality
it should include the corresponding _config.h header file.

   Tested on Linux/x86-64

This commit was SVN r20795.
2009-03-17 01:45:19 +00:00
Jeff Squyres
7ec52bc5a4 Per the lengthy discussion on this thread:
http://www.open-mpi.org/community/lists/users/2009/03/8402.php

Just #define away restrict in C++ because we don't have an
AC_CXX_RESTRICT test to see what the C++ compiler needs to support
"restrict".

This commit was SVN r20792.
2009-03-16 21:09:54 +00:00
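The approach in commit 7ec52bc5a4 amounts to making the keyword vanish when a header is seen by a C++ compiler. A minimal sketch, assuming a plain preprocessor guard (the exact guard used in the Open MPI headers is not shown in this log, and `add_arrays` is a hypothetical function):

```c
#include <stddef.h>

/* C++ has no standard "restrict" keyword, so define it away there.
 * C compilers (C99 and later) keep the real keyword. */
#if defined(__cplusplus)
#  undef restrict
#  define restrict
#endif

/* Hypothetical function whose prototype uses restrict; with the guard
 * above, the same declaration is acceptable to both C and C++. */
static void add_arrays(size_t n, const int *restrict a,
                       const int *restrict b, int *restrict out)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

The cost is that C++ translation units lose the aliasing hint, which only affects optimization, not correctness.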
Rainer Keller
d8cf4c0fec - Get pgcc on XT to complain less:
In case we use memcmp, strlen, strdup and friends include <string.h>
   Also several constants.h are not included directly
 - Let's have mca_topo_base_cart_create  return ompi-errors in
   ompi/mca/topo/base/topo_base_cart_create.c

This commit was SVN r20773.
2009-03-13 02:10:32 +00:00
Rainer Keller
296a6fb275 - So much fun along the way:
we normally don't do opal/include/opal/...
   Just use the std. opal/...

This commit was SVN r20766.
2009-03-12 19:21:11 +00:00
Jeff Squyres
02c4f384b8 Convert libnuma to use the new OMPI_SETUP_COMPONENT_PACKAGE macro
This commit was SVN r20747.
2009-03-06 21:49:00 +00:00
George Bosilca
f3cd687c44 We want to allow users to call opal_thread_join(**, NULL) so what we really have
to test against NULL is the void** pointer. This makes this function behave
like pthread_join.

This commit was SVN r20724.
2009-03-04 17:02:17 +00:00
Shiqing Fan
05d9f0b933 Fix the error C2100 on Windows, i.e. an illegal indirection.
This commit was SVN r20723.
2009-03-04 16:43:51 +00:00
Rainer Keller
fd28b392bf - An intrusive commit yet again (sorry): with the separation we
get bitten by header depending on having already included
   the corresponding [opal|orte|ompi]_config.h header.
   When separating, things like [OPAL|ORTE|OMPI]_DECLSPEC
   are missed.

   Script to add the corresponding header in front of all following
   (taking care of possible #ifdef HAVE_...)

 - Including some minor cleanups to
   - ompi/group/group.h -- include _after_ #ifndef OMPI_GROUP_H
   - ompi/mca/btl/btl.h -- include _after_ #ifndef MCA_BTL_H
   - ompi/mca/crcp/bkmrk/crcp_bkmrk_btl.c -- still no need for
     orte/util/output.h
   - ompi/mca/pml/dr/pml_dr_recvreq.c -- no need for mpool.h
   - ompi/mca/btl/btl.h -- reorder to fit
   - ompi/mca/bml/bml.h -- reorder to fit
   - ompi/runtime/ompi_mpi_finalize.c -- reorder to fit
   - ompi/request/request.h -- additionally need ompi/constants.h

 - Tested on linux/x86-64

This commit was SVN r20720.
2009-03-04 15:35:54 +00:00
Rainer Keller
d68a8a1904 - Now that we don't need it anymore, blast away
ompi/class/ompi_bitmap.[ch] -- may always be restored from svn
   again...

This commit was SVN r20710.
2009-03-04 00:28:58 +00:00
Rainer Keller
811f2bd9b4 - As discussed on RFC, move the ompi_bitmap to the
opal layer.
   Add a check against a maximum (actually get rid of ifs internally to
   opal_bitmap.c) -- the functionality to set the current maximum size
   opal_bitmap_set_max_size() is currently only used in attribute.c
   to set the maximum OMPI_FORTRAN_HANDLE_MAX...

   Tested on linux/x86-64 with intel-tests with all_tests_no_perf_f
   run with 6 procs.
   Let's look into MTT as well...

This commit was SVN r20708.
2009-03-03 22:25:13 +00:00
George Bosilca
5f6896ce5b No memory leaks (this is an improvement for r20706).
This commit was SVN r20707.

The following SVN revision numbers were found above:
  r20706 --> open-mpi/ompi@248bbb8a2f
2009-03-03 22:14:05 +00:00
George Bosilca
248bbb8a2f Give a small chance to those with an "IP guru" admin-sys to define what exactly is
a private IPv4 address. By default we abide by RFC1918 and RFC3330, but we have
the opportunity to change them.

Based on a patch from Camille Coti.

This commit was SVN r20706.
2009-03-03 22:06:09 +00:00
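Note on r20706: for reference, the three private ranges from RFC 1918 can be tested with simple mask comparisons. This is a sketch only: the function name is hypothetical, the address is assumed to be in host byte order, the ranges are hard-coded rather than configurable as the commit describes, and the RFC 3330 special-use ranges are not covered.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: return true if addr (host byte order) falls in
 * one of the RFC 1918 private IPv4 ranges. */
static bool is_rfc1918_private(uint32_t addr)
{
    return (addr & 0xff000000u) == 0x0a000000u    /* 10.0.0.0/8     */
        || (addr & 0xfff00000u) == 0xac100000u    /* 172.16.0.0/12  */
        || (addr & 0xffff0000u) == 0xc0a80000u;   /* 192.168.0.0/16 */
}
```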
Jeff Squyres
cfcca7d80e Fix some typos in comments submitted by Bert Wesarg.
This commit was SVN r20695.
2009-03-03 12:50:46 +00:00
George Bosilca
826f3319b4 Don't segfault on Windows because of a NULL value.
This commit was SVN r20682.
2009-03-02 21:57:16 +00:00
Shiqing Fan
4d3f801dbd Try to find the installed flex on the current Windows system first; if it's not there, just use the one that comes along with the source.
This commit was SVN r20642.
2009-02-26 13:03:53 +00:00
Shiqing Fan
2326f14be5 Remove the unnecessary PROJECT command, I somehow misunderstood how it should be used on Windows....
This commit was SVN r20634.
2009-02-25 16:07:43 +00:00
Eugene Loh
463f11f993 Improve shared-memory allocation:
* compute mmap-file size more wisely and pass requested size to allocator
* change MCA parameters:
  - get rid of mpool_sm_per_peer_size
  - get rid of mpool_sm_max_size
  - set default mpool_sm_min_size to 0
* no longer pad sm allocations to page boundaries
* have sm_btl_first_time_init check return codes on free-list creations

Have mca_btl_sm_prepare_src() check to see if it can allocate an EAGER fragment
rather than a MAX fragment if the smaller size works.

Remove ompi/class/ompi_[circular_buffer_]fifo.h and references thereto.

Remove opal/util/pow2.[c|h] and references thereto.

This commit was SVN r20614.
2009-02-20 19:51:57 +00:00
Jeff Squyres
1a7556d2c9 Refs trac:1805: temporarily disable some assert()s in event_base_free().
This commit was SVN r20609.

The following Trac tickets were found above:
  Ticket 1805 --> https://svn.open-mpi.org/trac/ompi/ticket/1805
2009-02-20 15:03:36 +00:00
Jeff Squyres
ed22f9744e Bring in PLPA v1.3rc3 (add a missing comma, which should fix compiling
issues at Sun).  

Sorry for the middle of the day configure change, but this should fix
a compile break at Sun...

This commit was SVN r20594.
2009-02-19 17:42:43 +00:00
Jeff Squyres
558fc2836d Bump PLPA version to 1.3rc2, which should fix the "make dist" error
from last night's nightly tarball.

This commit was SVN r20576.
2009-02-17 12:54:57 +00:00
Jeff Squyres
4590582807 Bump PLPA to v1.3rc1, which includes a valgrind API fix for a
known-bad memory access pattern.  Specifically, a NULL pointer is
passed in a system call as part of a probe to figure out which
affinity API this system has.  We know it's a NULL and we did it on
purpose, so don't have Valgrind yell about it.

This commit was SVN r20572.
2009-02-17 02:01:30 +00:00
Jeff Squyres
8b9601e35e Put a check in the destructor to ensure that we don't try to free a
NULL pointer.

This commit was SVN r20569.
2009-02-17 01:11:10 +00:00
George Bosilca
918d94f449 Put back the commit r20562 as it had a reason to be there: clean
a memory leak.

This commit was SVN r20566.

The following SVN revision numbers were found above:
  r20562 --> open-mpi/ompi@62c913f851
2009-02-16 20:03:48 +00:00
Josh Hursey
350d9c94ab Backout r20562 since it breaks finalization in the tools (per email to devel).
This commit was SVN r20565.

The following SVN revision numbers were found above:
  r20562 --> open-mpi/ompi@62c913f851
2009-02-16 19:18:43 +00:00
George Bosilca
62c913f851 Release the default base on finalize.
This commit was SVN r20562.
2009-02-14 21:51:09 +00:00
Jeff Squyres
f9043edd39 Ensure to free this string when we're done.
This commit was SVN r20536.
2009-02-12 22:54:43 +00:00
Jeff Squyres
17d9c2c240 Important clarification about the ownership of strings returned by the
current_value parameter to mca_base_param_reg_string() function.

This commit was SVN r20535.
2009-02-12 22:54:29 +00:00
Jeff Squyres
e3ae1468d3 Don't strdup here; there's already a strdup down in
param_set_override().

This commit was SVN r20533.
2009-02-12 22:36:45 +00:00
Ralph Castain
883a0972bc Add missing include file
This commit was SVN r20529.
2009-02-12 16:41:48 +00:00
George Bosilca
84d3ca0c9e Release the memory.
This commit was SVN r20524.
2009-02-11 21:34:27 +00:00
Shiqing Fan
2f1461419c Add a new feature for checking mca subdirectories, i.e. detecting if there is an exclude file list which indicates the files that shouldn't be added to the source list. By default, the CMake build system will simply add all source files in the required sub folders, without knowing which files have to be excluded. The first use of it is in plm/base/.windows.
And clean up the nested variable names, in order to make it readable.

This commit was SVN r20498.
2009-02-10 17:20:13 +00:00
Ralph Castain
4cdf91a8d4 Per the RFC, extend the current use of the ompi_proc_t flags field (without changing the field itself).
The prior ompi_proc_t structure had a uint8_t flag field in it, where only one
bit was used to flag that a proc was "local". In that context, "local" was
constrained to mean "local to this node".

This commit provides a greater degree of granularity on the term "local", to include tests
to see if the proc is on the same socket, PC board, node, switch, CU (computing
unit), and cluster.

Add #define's to designate which bits stand for which local condition. This
was added to the OPAL layer to avoid conflicting with the proposed movement of
the BTLs. To make it easier to use, a set of macros have been defined - e.g.,
OPAL_PROC_ON_LOCAL_SOCKET - that test the specific bit. These can be used in
the code base to clearly indicate which sense of locality is being considered.

All locations in the code base that looked at the current proc_t field have
been changed to use the new macros.

Also modify the orte_ess modules so that each returns a uint8_t (to match the
ompi_proc_t field) that contains a complete description of the locality of this
proc. Obviously, not all environments will be capable of providing such detailed
info. Thus, getting a "false" from a test for "on_local_socket" may simply
indicate a lack of knowledge.

This commit was SVN r20496.
2009-02-10 02:20:16 +00:00
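Note on r20496: a minimal sketch of how locality bits in a uint8_t can be defined and tested with macros. All SKETCH_* names and the bit values are assumptions made for illustration; the real definitions (such as OPAL_PROC_ON_LOCAL_SOCKET, named in the commit message) live in the OPAL layer and may differ.

```c
#include <stdint.h>

/* Assumed type and bit assignments, one bit per level of locality. */
typedef uint8_t sketch_locality_t;

#define SKETCH_ON_CLUSTER 0x01u
#define SKETCH_ON_CU      0x02u
#define SKETCH_ON_SWITCH  0x04u
#define SKETCH_ON_NODE    0x08u
#define SKETCH_ON_BOARD   0x10u
#define SKETCH_ON_SOCKET  0x20u

/* Test macros in the style described by the commit message. */
#define SKETCH_PROC_ON_LOCAL_NODE(f)   (((f) & SKETCH_ON_NODE) != 0)
#define SKETCH_PROC_ON_LOCAL_SOCKET(f) (((f) & SKETCH_ON_SOCKET) != 0)
```

As the commit message points out, a cleared bit may only mean that the environment could not determine that level of locality.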
Rainer Keller
64aa384745 - Fix comment to reflect current code.
This commit was SVN r20479.
2009-02-09 00:52:57 +00:00
Shiqing Fan
8086bb1a1b SIGSTOP and SIGTSTP are not supported on Windows. But they have to be defined anyway, although they are not used for Windows.
This commit was SVN r20453.
2009-02-05 17:02:34 +00:00
Shiqing Fan
ff7ca43dd1 Update two configuration files for windows build.
This commit was SVN r20450.
2009-02-05 16:39:40 +00:00
Shiqing Fan
7d2d6b16b1 A fix for windows mainly, adding BEGIN/END_C_DECLS pairs.
This commit was SVN r20448.
2009-02-05 16:35:58 +00:00
Jeff Squyres
67a5374a61 Re CID 1180: Actually, it would be better to also print something in
the case of an error, too...

This commit was SVN r20443.
2009-02-05 15:26:44 +00:00
Jeff Squyres
598e530de9 Fix CID 1180: ensure to check the output from snprintf, since we pass
it to write().

This commit was SVN r20442.
2009-02-05 15:24:48 +00:00
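Note on r20442: the pattern behind this kind of fix is to validate the snprintf() return value before using it as a length for write(). The sketch below is not the patched Open MPI code; write_message, the buffer size, and the message format are all hypothetical.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: format a message and write it to fd.
 * Returns 0 on success, -1 on a formatting or write failure. */
static int write_message(int fd, const char *name, int code)
{
    char buf[128];
    int len = snprintf(buf, sizeof(buf), "%s: error %d\n", name, code);

    if (len < 0) {
        return -1;                      /* snprintf reported an error */
    }
    if ((size_t)len >= sizeof(buf)) {
        len = (int)sizeof(buf) - 1;     /* output was truncated to fit buf */
    }
    return write(fd, buf, (size_t)len) == (ssize_t)len ? 0 : -1;
}
```

Without the two checks, a negative or oversized return value would be passed straight to write() as a byte count.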
Jeff Squyres
9c2a6da128 Remove errant '>'. How on earth did that work at all?
This commit was SVN r20416.
2009-02-03 23:21:34 +00:00
Jeff Squyres
35c5e28a8e Up to SVN r20383
This commit was SVN r20384.

The following SVN revision numbers were found above:
  r20383 --> open-mpi/ompi@e0638c84c8
2009-01-29 17:59:04 +00:00
Jeff Squyres
bb3d258562 Round up a few places where PATH_MAX was used instead of
OMPI_PATH_MAX.  Thanks to Andrea Iob for the bug report.

This commit was SVN r20360.
2009-01-27 22:57:50 +00:00
Ralph Castain
c92f906d7c Move the daemon collectives out of the ODLS and into the GRPCOMM framework. This removes the inherent assumption that the OOB topology is a tree, thus allowing different grpcomm/routed combinations to implement collectives appropriate to their topology.
This commit was SVN r20357.
2009-01-27 19:13:56 +00:00
Rolf vandeVaart
1872a7b75d This change allows the trunk to be compiled with Sun
Studio compilers again. It has been broken since
1/14/2009 when some changes exposed a bug in autoconf
and how it handles support for the restrict keyword.
Basically, Sun Studio C supports the restrict keyword
but Sun Studio C++ does not.

I am also pursuing a fix with the autoconf folks, but
this change was needed to get things building again.

This commit was SVN r20351.
2009-01-26 20:13:44 +00:00
Jeff Squyres
f3b1432260 Fixes trac:1618: ensure to check to see if the symbol RTLD_NEXT exists
before trying to use it (e.g., it doesn't seem to exist on Cygwin).

This commit was SVN r20343.

The following Trac tickets were found above:
  Ticket 1618 --> https://svn.open-mpi.org/trac/ompi/ticket/1618
2009-01-25 16:38:00 +00:00
Jeff Squyres
6d805eb0dd Ensure to not do the found_files stuff if --disable-dlopen is selected.
This commit was SVN r20320.
2009-01-22 16:46:02 +00:00
Jeff Squyres
58a25cae69 Fixes trac:1271: make the OPAL MCA base read the list of MCA DSO filenames
''once'' and keep the names in an argv-style array.  Each time we go
to open a framework, we just scan that array rather than re-reading
all the filenames from the filesystem.

This commit was SVN r20309.

The following Trac tickets were found above:
  Ticket 1271 --> https://svn.open-mpi.org/trac/ompi/ticket/1271
2009-01-21 22:27:05 +00:00
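Note on r20309: a rough sketch of scanning a cached, NULL-terminated (argv-style) list of filenames instead of re-reading the directory for each framework. The function name and the "mca_<framework>_" filename prefix used for matching are assumptions for illustration, not the actual MCA base code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count cached filenames that belong to a framework,
 * assuming names of the form "mca_<framework>_<component>...". */
static size_t count_framework_files(char *const *files, const char *framework)
{
    size_t count = 0;
    size_t flen = strlen(framework);

    for (size_t i = 0; files != NULL && files[i] != NULL; ++i) {
        const char *name = files[i];
        if (strncmp(name, "mca_", 4) == 0 &&
            strncmp(name + 4, framework, flen) == 0 &&
            name[4 + flen] == '_') {
            ++count;
        }
    }
    return count;
}
```

The list is built once; every later framework open is then a pass over memory rather than a filesystem scan.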
Josh Hursey
fca3c6e571 Fix the BLCR configuration when explicitly disabling it.
It happened that if we supplied:
 --with-ft=cr --without-blcr
then BLCR would be loaded, due to a logic break in the old m4.

Now this works appropriately. This should be moved to v1.3.1

This commit was SVN r20296.
2009-01-19 20:21:58 +00:00
Jeff Squyres
4520b00547 Fixes trac:1587: also check the mca component struct framework and
component name against the filename and ensure that they match.
Ignore the component if they do not.

This commit was SVN r20291.

The following Trac tickets were found above:
  Ticket 1587 --> https://svn.open-mpi.org/trac/ompi/ticket/1587
2009-01-17 12:53:21 +00:00
Ralph Castain
88a0af9726 Revise the way we output resolved hostnames to make life easier for the Eclipse folks. Store aliases for individual nodes (only when requested to show resolved hostnames) and then report them out as part of the display-map option.
This commit was SVN r20284.
2009-01-15 18:11:50 +00:00
Jeff Squyres
d1c6f3f89a * Fix a truckload of Cisco copyrights to be the same as the rest of
the code base.
 * Fix a few misspellings in other copyrights.

This commit was SVN r20241.
2009-01-11 02:30:00 +00:00
Ralph Castain
17e1911afa Remove unneeded include file
This commit was SVN r20204.
2009-01-05 19:20:02 +00:00
Tim Mattox
a5efe3ed77 Refs trac:868, #869
The fix for #868, r14358, introduced an (unneeded?) inconsistency...
For Mac OS X systems, inttypes.h will always be included with opal_config.h,
and NOT included for non-Mac OS X systems.  For developers using Mac OS X,
this masks the need to include inttypes.h or more properly opal_stdint.h.

This changeset corrects one of these oopses.  However, the underlying problem
still exists.  Moving the equivalent of r14358 into opal_stdint.h from
opal_config_bottom.h might be the "right" solution, but AFAIK, we would then
need to replace each direct inclusion of inttypes.h with opal_stdint.h to
properly address tickets #868 and #869.

This commit was SVN r20196.

The following SVN revision numbers were found above:
  r14358 --> open-mpi/ompi@dce72aab70

The following Trac tickets were found above:
  Ticket 868 --> https://svn.open-mpi.org/trac/ompi/ticket/868
2009-01-04 05:09:18 +00:00
Jeff Squyres
ad7cfe63a3 Fix CID 1180: check for negative return from snprintf.
This commit was SVN r20192.
2009-01-03 15:33:54 +00:00
Jeff Squyres
df3a304447 Fix CID 1182: ensure to check return of read() for failure.
This commit was SVN r20191.
2009-01-03 15:30:56 +00:00
Jeff Squyres
ae7dfdd0e0 Fix CID 1136: fix a small memory leak.
This commit was SVN r20188.
2009-01-03 15:12:16 +00:00
Jeff Squyres
c23f8e3981 Fix CIDs 1183-1186 (same as r20186 -- just missed the fact that there
were several more CIDs on the same source line before I committed).

This commit was SVN r20187.

The following SVN revision numbers were found above:
  r20186 --> open-mpi/ompi@423cce4b0a
2009-01-03 14:58:07 +00:00
Jeff Squyres
423cce4b0a Fix CID 1187: use PRIu64 instead of %lu for printing a uint64_t.
This commit was SVN r20186.
2009-01-03 14:55:08 +00:00
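Note on r20186: "%lu" is only correct for a uint64_t on platforms where unsigned long is 64 bits wide, whereas the PRIu64 macro from <inttypes.h> expands to the right conversion everywhere. A minimal sketch, with format_u64 as a hypothetical helper:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a uint64_t portably into buf.
 * Returns the snprintf() result (characters needed, or negative on error). */
static int format_u64(char *buf, size_t len, uint64_t value)
{
    return snprintf(buf, len, "%" PRIu64, value);
}
```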
Tim Mattox
f911b1a63d Fix a few code comments in the new ompi-top functionality.
This commit was SVN r20166.
2008-12-22 22:36:38 +00:00
Ralph Castain
7787f84540 Per the earlier RFC and some discussion at the Dec ORTE design meeting, add the ompi-top tool and all its supporting infrastructure. This includes a new OPAL pstat framework and data type, currently with rather weak support for Mac OSX and pretty complete support for Linux. The Sun team promised to add Solaris support as well.
Also, per chat with Jeff, modified the Makefile.am's of a few orte tools so that they were consistent in the way we generate the ompi-equivalent cmds.

This commit was SVN r20165.
2008-12-22 20:23:05 +00:00
George Bosilca
80fd24c948 Small cleanups: remove an unused dependency to signal.h and include
output.h.

This commit was SVN r20155.
2008-12-18 22:39:49 +00:00
George Bosilca
24e191a076 Update the MIPS atomics. We can now compile with gcc and Pathscale.
This commit was SVN r20154.
2008-12-18 22:38:31 +00:00
Josh Hursey
ce8d18bfda This commit changes the use of the deprecated cr_request_file() to use the cr_request_checkpoint() interface to BLCR. Additional configure checks are added to use the best available checkpointing interface available for the BLCR installed on the system (default: cr_request_checkpoint()).
This commit fixes trac:1691

Thanks to Matthias Hovestadt for identifying this issue.

This commit was SVN r20114.

The following Trac tickets were found above:
  Ticket 1691 --> https://svn.open-mpi.org/trac/ompi/ticket/1691
2008-12-11 00:08:34 +00:00
Tim Mattox
4fa13a1a4d Fix two typos inside of comments.
This commit was SVN r20112.
2008-12-10 21:18:13 +00:00
Shiqing Fan
5ae5f0e173 - 4/4 commit for Windows Visual Studio and CCP support:
unnecessary clean up to non windows related files (within ifdef __WINDOWS__).

This commit was SVN r20111.
2008-12-10 21:13:27 +00:00
Shiqing Fan
20cea164db - 3/4 commit for Windows Visual Studio and CCP support:
corrections to non-windows files (but within ifdef __WINDOWS__)
  type casts, event library for windows use win32. 
  in orte runtime, add windows sockets handling and object construction.

This commit was SVN r20110.
2008-12-10 21:13:10 +00:00
Shiqing Fan
8673f19f50 - 2/4 commit for Windows Visual Studio and CCP support:
changes to the already existing ccp components
  event/win32.c: merge old FD handling into new
  opal_installdirs_windows.c:fix the registry handling

This commit was SVN r20109.
2008-12-10 21:01:54 +00:00
Shiqing Fan
a5281f0434 - 1/4 commit for Windows Visual Studio and CCP support:
CMakeLists and .windows files.
  In contribs preconfigured and precompiled parts.

This commit was SVN r20108.
2008-12-10 20:59:20 +00:00
Ralph Castain
728a24c8ec After considerable patience and help with debugging/testing from Tim M and Jeff S, return a completed and pretty well tested patch of the IOF to the trunk. This commit includes the previously reverted r20074, r20068, and r20064, as well as changes to fix those commits.
Basically, the remaining problem turned out to be:

1. closing stdout/stderr during orte_finalize of mpirun

2. inadvertently setting up a write event on fd = -1

3. devising a scheme to more accurately track when the stdin write event was active vs closed so it only got released once

This passed prelim MTT testing by Jeff and Tim, but should soak for awhile before migrating to 1.3.

This commit was SVN r20106.

The following SVN revision numbers were found above:
  r20064 --> open-mpi/ompi@a07660aea8
  r20068 --> open-mpi/ompi@ec930d14a9
  r20074 --> open-mpi/ompi@2940309613
2008-12-10 20:40:47 +00:00
Ralph Castain
1ace83c470 Enable modex-less launch. Consists of:
1. minor modification to include two new opal MCA params:
   (a) opal_profile: outputs what components were selected by each framework
       currently enabled for most, but not all, frameworks
   (b) opal_profile_file: name of file that contains profile info required
       for modex

2. introduction of two new tools:
   (a) ompi-probe: MPI process that simply calls MPI_Init/Finalize with
       opal_profile set. Also reports back the rml IP address for all
       interfaces on the node
   (b) ompi-profiler: uses ompi-probe to create the profile_file, also
       reports out a summary of what framework components are actually
       being used to help with configuration options

3. modification of the grpcomm basic component to utilize the
   profile file in place of the modex where possible

4. modification of orterun so it properly sees opal mca params and
   handles opal_profile correctly to ensure we don't get its profile

5. similar mod to orted as for orterun

6. addition of new test that calls orte_init followed by calls to
   grpcomm.barrier

This is all completely benign unless actively selected. At the moment, it only supports modex-less launch for openib-based systems. Minor mod to the TCP btl would be required to enable it as well, if people are interested. Similarly, anyone interested in enabling other BTL's for modex-less operation should let me know and I'll give you the magic details.

This seems to significantly improve scalability provided the file can be locally located on the nodes. I'm looking at an alternative means of disseminating the info (perhaps in launch message) as an option for removing that constraint.

This commit was SVN r20098.
2008-12-09 23:49:02 +00:00
Brian Barrett
8a8cf96b6c Provide configure parameter to allow the disabling of reading parameters
and components from the home directory for platforms that are bad at
reading in files from home directory at scale (like Red Storm)

This commit was SVN r20069.
2008-12-04 01:51:44 +00:00
Jeff Squyres
06097db928 Fixes trac:1667. Ensure to fill in the source_file if it was requested.
This commit was SVN r20067.

The following Trac tickets were found above:
  Ticket 1667 --> https://svn.open-mpi.org/trac/ompi/ticket/1667
2008-12-03 22:17:50 +00:00
Shiqing Fan
abd21b6d17 - An update for memchecker :
1. fix a bug in pml_ob1_recvreq/sendreq.c, buffer was made defined where the request has already been released.
2. complete memchecker support for collective functions.
3. change the wrongly spelled function name of memchecker, i.e. '*_isaddressible' should be '*_isaddressable'

This commit was SVN r20043.
2008-11-27 16:34:02 +00:00
Jeff Squyres
d7f3dd2230 Add a comment explaining exactly what is returned by this function
because we wasted a good amount of time today assuming that it was
returning the actual netmask.  Specifically, we were confused why it
returned 0x18 instead of 0xffffff00 for a class C subnet (the
head-smacking moment wasn't until [much] later when we converted 0x18
to decimal, which is 24.  Then the Clue Light(tm) went on).

This commit was SVN r20002.
2008-11-14 22:59:41 +00:00
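Note on r20002: the value 0x18 (decimal 24) is a prefix length, and 0xffffff00 is the netmask with that many leading one bits. The conversion can be sketched as follows; prefix_to_netmask is a hypothetical helper, not a function from the Open MPI tree, and the result is in host byte order.

```c
#include <stdint.h>

/* Hypothetical helper: convert a CIDR prefix length (0..32) into an
 * IPv4 netmask in host byte order. */
static uint32_t prefix_to_netmask(unsigned int prefix_len)
{
    if (prefix_len == 0) {
        return 0;               /* shifting a 32-bit value by 32 is undefined */
    }
    if (prefix_len >= 32) {
        return UINT32_MAX;
    }
    return (uint32_t)(UINT32_MAX << (32 - prefix_len));
}
```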
Josh Hursey
bf96a8dea0 Fixes a bug that may occur with really long environment variables on job restart.
This happens with really long paths as part of the variable name.

Found in MTT testing (where the paths are long). This will need to be moved to v1.3

This commit was SVN r19989.
2008-11-12 21:43:34 +00:00
George Bosilca
6344b8dffe Force an explicit cast to keep the compilers quiet.
This commit was SVN r19975.
2008-11-11 14:58:53 +00:00