Commit Graph

15378 Commits

Author SHA1 Message Date
Jeff Squyres
1ea62f3bf6 Add svn:ignore
This commit was SVN r24288.
2011-01-24 14:15:07 +00:00
Jeff Squyres
700d601dfc Also need to check the "flag" value, because if flag!=true, then the
value of "local_spawn" (and "non_mpi") is not set by ompi_info_get*().

This commit was SVN r24286.
2011-01-22 16:27:58 +00:00
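The check described in commit 700d601dfc can be sketched roughly as follows. This is a minimal illustration, not the Open MPI code: `demo_info_get_bool` is a hypothetical stand-in for the `ompi_info_get*()` family, assumed here to write the output value only when the key is found and to report that through `flag`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an info lookup such as ompi_info_get*():
 * *flag reports whether the key was found, and *value is written
 * only when it was. */
static void demo_info_get_bool(const char *key, bool *value, bool *flag)
{
    if (key != NULL && 0 == strcmp(key, "local_spawn")) {
        *value = true;
        *flag = true;
    } else {
        *flag = false;   /* *value is deliberately left untouched */
    }
}

/* Returns the looked-up setting, or fallback when the key is absent. */
static bool demo_get_setting(const char *key, bool fallback)
{
    bool value;          /* not set until the lookup succeeds */
    bool flag = false;

    demo_info_get_bool(key, &value, &flag);

    /* Test flag first: when flag is false, value was never set and
     * must not be read. */
    return flag ? value : fallback;
}
```

With this stand-in, `demo_get_setting("local_spawn", false)` yields the stored value, while any other key falls back to the caller's default instead of reading an unset variable.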
Jeff Squyres
89fb26eb1c Add missing line continuation character to prevent a Makefile syntax
error

This commit was SVN r24285.
2011-01-22 11:13:28 +00:00
Rolf vandeVaart
8171370287 Fix typo which broke builds when configured with hetero and
debug.

This commit was SVN r24283.
2011-01-21 17:10:09 +00:00
Abhishek Kulkarni
a1090575c2 Nitpick: Get rid of a redundant OPAL_SOS_GET_ERROR_CODE.
This commit was SVN r24282.
2011-01-20 23:48:11 +00:00
Abhishek Kulkarni
3243b16bb3 Decode SOS error code before checking it with the native error code.
This commit was SVN r24281.
2011-01-20 23:21:38 +00:00
Abhishek Kulkarni
45a53b4f7a Add a missing call to opal_sos_finalize in opal_finalize_util.
This commit was SVN r24280.
2011-01-20 23:18:02 +00:00
George Bosilca
fc9133cc7f Correctly initialize the convertor to be used.
Don't forget to initialize the OPAL datatype module.

This commit was SVN r24279.
2011-01-20 20:05:21 +00:00
Samuel Gutierrez
2574e18de4 update LANL's tlcc and rr-class platform files
This commit was SVN r24278.
2011-01-20 18:59:37 +00:00
Rolf vandeVaart
6a5ad29c36 Update configure command since it changed.
This commit was SVN r24275.
2011-01-20 14:42:12 +00:00
Sylvain Jeaugey
46b711e164 Fixes trac:1888 introduced by r24264 : make Romio autogen.sh executable.
This commit was SVN r24272.

The following SVN revision numbers were found above:
  r24264 --> open-mpi/ompi@0e921bba7f

The following Trac tickets were found above:
  Ticket 1888 --> https://svn.open-mpi.org/trac/ompi/ticket/1888
2011-01-20 09:20:34 +00:00
Nathan Hjelm
e2126512a9 test c99 struct initialization with mtt. remove on jan 20, 2011
This commit was SVN r24271.
2011-01-19 22:21:21 +00:00
Rolf vandeVaart
acd38ff746 Final changes from jsquyres review. Moved configure
code from upper level into btl configure.m4.  Changed
prefix from "OMPI" to "BTL" in preprocessor macro.  Add
an mca param that shows it has been configured in.

This commit was SVN r24270.
2011-01-19 20:58:22 +00:00
Brian Barrett
4859bb82e2 * Update to support direct call
* Add missing cancel (not that it does anything useful)
* Fix bug in opal_output call

This commit was SVN r24269.
2011-01-19 20:49:28 +00:00
Brian Barrett
8f6a19b0fc export component/module interface so that direct call works again
This commit was SVN r24268.
2011-01-19 20:47:17 +00:00
Brian Barrett
b98afd298b update to remove unneeded fields
This commit was SVN r24267.
2011-01-19 20:46:06 +00:00
Rolf vandeVaart
f22f76a6ff Add byte swapping macro for failover control message per jsquyres review.
This commit was SVN r24266.
2011-01-19 19:58:35 +00:00
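The log does not show the macro added in commit f22f76a6ff. As a rough sketch of what a byte-swapping macro for a control message can look like, the example below uses hypothetical names (`DEMO_SWAP16`, `DEMO_SWAP32`, `demo_ctl_hdr_t`) and an assumed two-field header layout; none of these come from the Open MPI source.

```c
#include <stdint.h>

/* Hypothetical macros: reverse the byte order of 16- and 32-bit values. */
#define DEMO_SWAP16(x)                                   \
    ((uint16_t)((((uint16_t)(x) & 0x00ffu) << 8) |       \
                (((uint16_t)(x) & 0xff00u) >> 8)))

#define DEMO_SWAP32(x)                                   \
    ((uint32_t)((((uint32_t)(x) & 0x000000ffu) << 24) |  \
                (((uint32_t)(x) & 0x0000ff00u) << 8)  |  \
                (((uint32_t)(x) & 0x00ff0000u) >> 8)  |  \
                (((uint32_t)(x) & 0xff000000u) >> 24)))

/* Assumed control-message header layout, for illustration only. */
typedef struct {
    uint32_t seq;    /* sequence number */
    uint16_t type;   /* message type */
} demo_ctl_hdr_t;

/* Swap every multi-byte field in place, as a receiver would do when
 * the sender uses the opposite endianness. */
static void demo_ctl_hdr_swap(demo_ctl_hdr_t *hdr)
{
    hdr->seq  = DEMO_SWAP32(hdr->seq);
    hdr->type = DEMO_SWAP16(hdr->type);
}
```

Applying the swap twice restores the original header, which makes it easy to sanity-check in a heterogeneous build.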
Rolf vandeVaart
e75b86d3ab Fix some issues from jsquyres review.
1. Use asprintf instead of snprintf
2. Return remote_proc where possible.
3. Remove dead code.
4. Fix two comment typos.

This commit was SVN r24265.
2011-01-19 16:09:17 +00:00
Sylvain Jeaugey
0e921bba7f Romio Refresh from mpich2-1.3.1. Work by Pascal Deveze, tested through bitbucket by Jeff Squyres (https://bitbucket.org/devezep/new-romio-for-openmpi).
This commit was SVN r24264.
2011-01-19 15:55:10 +00:00
Shiqing Fan
b2f3a5b7c2 Correctly check system specific datatypes on Windows.
This commit was SVN r24257.
2011-01-18 09:40:58 +00:00
Jeff Squyres
189b541dbd Add a proper help message for the mca_verbose MCA param (and shuffle
the code to be slightly more efficient).

This commit was SVN r24256.
2011-01-14 20:18:06 +00:00
Abhishek Kulkarni
fd7ef7a1f1 Fixes broken trunk compile: call process status notify
only when ft-enable-cr is selected.

This commit was SVN r24255.
2011-01-14 18:37:07 +00:00
Jeff Squyres
32e722e4d9 Add first bullets about 1.5.2.
This commit was SVN r24254.
2011-01-14 15:15:23 +00:00
George Bosilca
29c7f2fba5 Update the tests to match the new datatype engine.
This commit was SVN r24252.
2011-01-14 07:58:50 +00:00
Abhishek Kulkarni
87d2c9b31d Few fault tolerance updates related to the CIFTS project (http://www.mcs.anl.gov/research/cifts/)
* Improve the FTB notifier to publish (C/R, process/communication failure) events to the FTB with the
   OMPI jobid as the associated payload.
 * Add notifier calls for C/R events and process status events in SnapC and ErrMgr components.
 * Fix a bug where the SnapC states and process states collide before being thrown out over the notifier.

This commit was SVN r24251.
2011-01-13 20:13:49 +00:00
George Bosilca
5390fd6f33 Reshape the datatype engine. The basic types are built down in OPAL. MPI types are
either direct link to these basic predefined types, or a combination of them.
Anyway, the first items in the datatype list belong to OPAL, the second round
are MPI datatypes created by composing basic OPAL datatypes, and the last
batch are mapped datatypes (direct correspondence between an OMPI datatype and
an OPAL one such as int -> int32_t).

Modify the op to fit this new scheme.

This commit was SVN r24247.
2011-01-13 06:08:54 +00:00
Ralph Castain
b09f57b03d Update the multicast subsystem - ported from Cisco branch
This commit was SVN r24246.
2011-01-13 01:54:05 +00:00
Terry Dontje
f3aaa885a3 corrected a couple places in orte where it said cpu_model when it should have been cpu_type.
This commit was SVN r24221.
2011-01-11 19:56:26 +00:00
Terry Dontje
56c03a3853 removing a file I should not have added
This commit was SVN r24220.
2011-01-11 19:02:08 +00:00
Terry Dontje
a374661ead add configure.params to solaris sysinfo module to allow it to be built
This commit was SVN r24219.
2011-01-11 18:31:55 +00:00
Jeff Squyres
cd8f12d8e5 Remove a few useless files that were missed last night.
This commit was SVN r24218.
2011-01-11 14:15:31 +00:00
Jeff Squyres
54cb4eb2b5 Merge over new version of hwloc 1.1 from the vendor branch. Update
the module to use the new hwloc bitmap API (the cpuset API is both
klunkier and deprecated), which simplified a few things.

This commit was SVN r24217.
2011-01-11 01:41:10 +00:00
Jeff Squyres
f08433c1e1 Fixes trac:2669.
Apparently, gcc 4.4.x and 4.5.x complain about the ''possibility'' of
us calling free() on a non-heap variable.  We know that this case can
never happen because the refcount will absolutely not go to zero
here.  We think it may be gcc being a bit too aggressive on the
warnings.

However, since this happens with gcc 4.4.x and 4.5.x, and since gcc
4.5.x ship in RHEL6 and Fedora 14 (and others), someone '''will'''
complain about this in the future, so we might as well code around it
so that we don't have to keep explaining "despite the warning, it's
really ok."

The workaround is pretty simple: just OBJ_RELEASE the values from
ompi_mpi_comm_parent before it is re-assigned to the new
intercommunicator.  Then the compiler's static code analysis can't
possibly tell that it's not a heap variable, and we're ok.

So yes, we are still calling OBJ_RELEASE on a non-heap variable.  But
free() '''will never be called''' on it because of the refcount.

This commit was SVN r24214.

The following Trac tickets were found above:
  Ticket 2669 --> https://svn.open-mpi.org/trac/ompi/ticket/2669
2011-01-10 21:12:27 +00:00
Abhishek Kulkarni
11ffa854ff Update the FTB notifier
* fix indentation issues
 * update the name of one of the fault events published to the FTB (per the FTB MPI standard)

This commit was SVN r24213.
2011-01-10 18:58:31 +00:00
Ralph Castain
ac1853b5d8 Took me a couple of days, but finally tracked this one down. Some compilers/glibc's don't like composite test statements in a return and just randomly pick one of the two options.
So....don't do that!!!

This commit was SVN r24212.
2011-01-10 16:29:42 +00:00
Nathan Hjelm
02e60d326e removed csum from rr p1 architecture conf files
This commit was SVN r24211.
2011-01-10 16:04:48 +00:00
Ethan Mallove
82054cb02c Include <stdlib.h> instead of <malloc.h>. This avoids a compiler error
on some systems caused by the definition of malloc in
opal_config_bottom.h getting expanded in the system malloc.h when
OPAL_ENABLE_MEM_DEBUG is set to 1.

This commit was SVN r24210.
2011-01-06 18:16:36 +00:00
Jeff Squyres
58445f3775 After being hit by "why is openib not working?" ''again'', add a
verbose statement that shows up when you --mca btl_base_verbose 100.
It clearly states that the openib BTL disqualifies itself when
MPI_THREAD_MULTIPLE is used.

This commit was SVN r24209.
2011-01-05 22:01:15 +00:00
Eugene Loh
9bbcd51c5a Properly initialize ep->btl_max_send_size, ep->btl_pipeline_send_length, and
ep->btl_send_limit in mca_bml_r2_del_proc_btl() so that the loops will correctly
compute new endpoint max/min after the BTL has been removed.  See
http://www.open-mpi.org/community/lists/devel/2011/01/8829.php

This commit was SVN r24202.
2011-01-04 20:35:33 +00:00
Jeff Squyres
7648c36023 Correct an error in MPI_FILE_SET_VIEW man page: the fh does ''not''
have to have an identical ''value'' (it must be the same file handle,
but that means nothing about its actual value).  But the datarep must
have an identical value.  Additionally, the etype does not need to be
identical, but the extent of all the etypes supplied must be
identical.

See MPI-2.2 p401-402 for further details.

This commit was SVN r24201.
2011-01-04 18:10:52 +00:00
Josh Hursey
2bdff63e6f move the INIT to after the error handler, so it matches MPI_INIT. Thanks to Jeff for catching this
This commit was SVN r24200.
2011-01-03 18:16:53 +00:00
Mike Dubman
4a2e29eb32 updated Makefile with a new file
This commit was SVN r24199.
2011-01-01 14:11:49 +00:00
Mike Dubman
c56e3141cb fca: fix segmentation fault when no underlying collective implementation is found
This commit was SVN r24198.
2010-12-31 12:03:49 +00:00
Ralph Castain
80ef1af8ba Add psm key generator program
This commit was SVN r24197.
2010-12-30 20:54:58 +00:00
Josh Hursey
bbfdf04a81 Fix a couple of 'unused variable' warnings, and one return value warning.
{{{
base/paffinity_base_service.c: In function ‘opal_paffinity_base_cset2mapstr’:
base/paffinity_base_service.c:623: warning: unused variable ‘range_last’
base/paffinity_base_service.c:623: warning: unused variable ‘range_first’
base/paffinity_base_service.c:622: warning: unused variable ‘count’
base/paffinity_base_service.c:622: warning: unused variable ‘m’
}}}

{{{
connect/btl_openib_connect_oob.c: In function ‘init_ud_qp’:
connect/btl_openib_connect_oob.c:1111: warning: control reaches end of non-void function
connect/btl_openib_connect_oob.c: In function ‘init_device’:
connect/btl_openib_connect_oob.c:1235: warning: unused variable ‘i’
connect/btl_openib_connect_oob.c: In function ‘get_pathrecord_sl’:
connect/btl_openib_connect_oob.c:1323: warning: unused variable ‘i’
}}}

This commit was SVN r24196.
2010-12-30 15:37:50 +00:00
Doron Shoham
834625cc51 Currently the service level is passed as a static parameter (ib_service_level), but the service level is possibly dynamic and, if so, the only way to get a proper value is to ask the SA.
New mca parameter is added (ib_path_rec_service_level) - positive value means that we should get the SL from the SA.

This is useful for torus topologies where a different SL value is used for different endpoints.

A cache is kept of ib queue pairs used to communicate with the SA for a particular device and port and path record SL values retrieved from that SA.

The interaction with the cache assumes that there are no recursive calls to these routines. This must be solved either by code flow, by using higher level locks, or by adding a locking mechanism to these routines along with some method for avoiding deadlock.

This code uses a UD queue pair to talk to the SA, so there is no need to chmod /dev/infiniband/umad* for use by normal users.

The request to the SA is a SubnAdmGet(), not a SubnAdmGetTable().
In the future we might add support for SubnAdmGetTable(), but it will require implementing RMPP (Reliable Multi-Packet Transaction Protocol) and I'm not sure we want to do that.

This patch is based on the work of David McMillen <davem@systemfabricworks.com>.

This commit was SVN r24195.
2010-12-30 08:20:24 +00:00
Josh Hursey
0b514e234b MPI_Init_thread is used in place of MPI_Init, so for the checkpoint/restart functionality it must correctly init the C/R functionality instead of simply making a critical section. This allows the C/R thread to be started properly.
Thanks to Takayuki Seki for finding this bug.

This commit was SVN r24194.
2010-12-29 15:37:30 +00:00
Mike Dubman
3d517c0285 ABI cleanups
This commit was SVN r24193.
2010-12-28 07:11:46 +00:00
Mike Dubman
b339a7a07b Add FCA 1.2/2.0 backward compatibility, depending on OMPI_FCA_VERSION_xx macro definition.
This commit was SVN r24192.
2010-12-27 21:32:34 +00:00
Doron Shoham
bfe611d3bd This patch fixes bugs #2627 (1.5.2) and #2623 (1.4.2) - Sending large messages over RDMA fails.
The patch includes the following:
 *  Add new mca parameter - btl_openib_max_hw_msg_size - Maximum size (in bytes) of a single fragment of a long message when using the RDMA protocols (must be > 0 and <= hw capabilities).
 *  If btl_openib_max_hw_msg_size is larger than the maximum hw limitation, print an error message.
 *  Change the default openib flags to include only PUT and not GET.
 *  Print an error message if the user manually chooses the GET flag in the openib btl.
 *  In prepare_dst: limit the message size to be the minimum of both endpoint's hw_limitation and the user limitation (if requested).

This commit was SVN r24191.
2010-12-23 11:48:43 +00:00
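The size limiting described for prepare_dst in commit bfe611d3bd amounts to taking a minimum over several bounds. The sketch below is an illustration under stated assumptions, not the openib BTL code: the function names are hypothetical, and a `user_max` of 0 is assumed to mean that no user limit was requested.

```c
#include <stdint.h>

/* Smaller of two unsigned 64-bit values. */
static uint64_t demo_min_u64(uint64_t a, uint64_t b)
{
    return (a < b) ? a : b;
}

/* Clamp a requested fragment size to the hardware limits of both
 * endpoints and, if one was requested, to the user limit.
 * Assumption: user_max == 0 means "no user limit requested". */
static uint64_t demo_limit_msg_size(uint64_t requested,
                                    uint64_t local_hw_max,
                                    uint64_t remote_hw_max,
                                    uint64_t user_max)
{
    uint64_t limit = demo_min_u64(local_hw_max, remote_hw_max);

    if (user_max > 0) {
        limit = demo_min_u64(limit, user_max);
    }
    return demo_min_u64(requested, limit);
}
```

A request that already fits under every bound passes through unchanged; otherwise the tightest of the applicable limits wins.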