Commit graph

15486 commits

Author SHA1 Message Date
Mike Dubman
c56e3141cb fca: fix segmentation fault when no underlying collective implementation is found
This commit was SVN r24198.
2010-12-31 12:03:49 +00:00
Ralph Castain
80ef1af8ba Add psm key generator program
This commit was SVN r24197.
2010-12-30 20:54:58 +00:00
Josh Hursey
bbfdf04a81 Fix a couple of 'unused variable' warnings, and one return value warning.
{{{
base/paffinity_base_service.c: In function ‘opal_paffinity_base_cset2mapstr’:
base/paffinity_base_service.c:623: warning: unused variable ‘range_last’
base/paffinity_base_service.c:623: warning: unused variable ‘range_first’
base/paffinity_base_service.c:622: warning: unused variable ‘count’
base/paffinity_base_service.c:622: warning: unused variable ‘m’
}}}

{{{
connect/btl_openib_connect_oob.c: In function ‘init_ud_qp’:
connect/btl_openib_connect_oob.c:1111: warning: control reaches end of non-void function
connect/btl_openib_connect_oob.c: In function ‘init_device’:
connect/btl_openib_connect_oob.c:1235: warning: unused variable ‘i’
connect/btl_openib_connect_oob.c: In function ‘get_pathrecord_sl’:
connect/btl_openib_connect_oob.c:1323: warning: unused variable ‘i’
}}}

This commit was SVN r24196.
2010-12-30 15:37:50 +00:00
Doron Shoham
834625cc51 Currently the service level is passed as a static parameter (ib_service_level), but the service level may be dynamic, in which case the only way to get a proper value is to ask the SA.
A new MCA parameter (ib_path_rec_service_level) is added; a positive value means that we should get the SL from the SA.

This is useful for torus topologies, where a different SL value is used for different endpoints.

A cache is kept of ib queue pairs used to communicate with the SA for a particular device and port and path record SL values retrieved from that SA.

The interaction with the cache assumes that there are no recursive calls to these routines. This must be solved either by code flow, by using higher level locks, or by adding a locking mechanism to these routines along with some method for avoiding deadlock.

This code uses a UD queue pair to talk to the SA, and does not need a chmod of /dev/infiniband/umad* for use by normal users.

The request to the SA is a SubnAdmGet(), not a SubnAdmGetTable().
In the future we might add support for SubnAdmGetTable(), but that would require implementing RMPP (Reliable Multi-Packet Transaction Protocol) and I'm not sure we want to do that.

This patch is based on the work of David McMillen <davem@systemfabricworks.com>.

This commit was SVN r24195.
2010-12-30 08:20:24 +00:00
Josh Hursey
0b514e234b MPI_Init_thread is used in place of MPI_Init, so for the checkpoint/restart functionality it must correctly init the C/R functionality instead of simply making a critical section. This allows the C/R thread to be started properly.
Thanks to Takayuki Seki for finding this bug.

This commit was SVN r24194.
2010-12-29 15:37:30 +00:00
Mike Dubman
3d517c0285 ABI cleanups
This commit was SVN r24193.
2010-12-28 07:11:46 +00:00
Mike Dubman
b339a7a07b Add FCA 1.2/2.0 backward compatibility, depending on OMPI_FCA_VERSION_xx macro definition.
This commit was SVN r24192.
2010-12-27 21:32:34 +00:00
Doron Shoham
bfe611d3bd This patch fixes bugs #2627 (1.5.2) and #2623 (1.4.2) - Sending large messages over RDMA fails.
The patch includes the following:
 *  Add new mca parameter - btl_openib_max_hw_msg_size - Maximum size (in bytes) of a single fragment of a long message when using the RDMA protocols (must be > 0 and <= hw capabilities).
 *  If btl_openib_max_hw_msg_size is larger than the maximum hw limitation, print an error message.
 *  Change the default openib flags to include only PUT and not GET.
 *  Print an error message if the user manually chooses the GET flag in the openib btl.
 *  In prepare_dst: limit the message size to the minimum of both endpoints' hw limitations and the user limitation (if requested).
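
The size limit described in the last item can be sketched roughly as follows. This is an illustrative sketch only, not code from the openib BTL: the function names and parameters are hypothetical, and treating a user limit of 0 as "not requested" is an assumption made for the example.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the openib BTL source.
 * hw_limit_local / hw_limit_remote: largest message each endpoint's
 * hardware can handle, in bytes.
 * user_limit: the user-requested maximum (for example from
 * btl_openib_max_hw_msg_size); 0 is assumed here to mean "not requested". */
uint64_t effective_max_msg_size(uint64_t hw_limit_local,
                                uint64_t hw_limit_remote,
                                uint64_t user_limit)
{
    /* Start from the smaller of the two hardware limits. */
    uint64_t limit = hw_limit_local < hw_limit_remote ? hw_limit_local
                                                      : hw_limit_remote;

    /* Apply the user limit only if one was requested and it is tighter. */
    if (user_limit > 0 && user_limit < limit) {
        limit = user_limit;
    }
    return limit;
}

/* Clamp a requested fragment size to the effective limit. */
uint64_t clamp_fragment_size(uint64_t requested, uint64_t limit)
{
    return requested < limit ? requested : limit;
}
```

With hardware limits of 1000 and 2000 bytes and a user limit of 500, the effective limit in this sketch is 500; without a user limit it is 1000.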

This commit was SVN r24191.
2010-12-23 11:48:43 +00:00
Ethan Mallove
9251785161 Emit an error (instead of a SEGV) if the "compiler" parameter is not set
in the wrapper data file.

This commit was SVN r24190.
2010-12-21 19:01:39 +00:00
Brian Barrett
621344cce4 Remove duplicate DT tests
This commit was SVN r24189.
2010-12-20 23:38:36 +00:00
Brian Barrett
9876e65137 Fix race condition in unlock code, as well as a small memory leak.
Somehow they got fixed in the pt2pt implementation, but not the RDMA
implementation.  Thanks to Guillaume Thouvenin for finding this issue.

This commit was SVN r24188.
2010-12-20 22:15:29 +00:00
Nathan Hjelm
c082d05ecb Reset the timer on MPIR_being_debugged only if MPIR_being_debugged is not set. Fix typo in return code.
This commit was SVN r24187.
2010-12-20 21:00:49 +00:00
Rolf vandeVaart
a57e5587f6 Brief description of bfo functionality.
This commit was SVN r24186.
2010-12-17 19:12:00 +00:00
Jeff Squyres
a525e70f46 Convert "opal_show_help" to be a global variable pointer.
It is statically initialized to the real back-end OPAL show_help
function.  During orte_show_help_init(), the variable is re-assigned
with the value of the back-end ORTE show_help function (the one that
does error message aggregation).  

Therefore, anything that calls opal_show_help() after a certain point
in orte_init() will have their show_help messages be aggregated.
w00t!  Even code down in OPAL -- that has no knowledge of ORTE -- will
have their messages aggregated.  '''Double w00t!'''

During orte_show_help_finalize(), we restore the original pointer
value so that if something calls opal_show_help() after
orte_finalize(), it'll still work properly (but it won't be
aggregated).  
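
The pointer swap described above can be illustrated with a minimal sketch. All names here are hypothetical stand-ins rather than the real OPAL/ORTE symbols, and the "aggregating" back end merely counts messages instead of doing real aggregation.

```c
#include <stdio.h>

/* Hypothetical stand-ins; these are not the real OPAL/ORTE symbols. */
typedef int (*show_help_fn_t)(const char *topic);

int aggregated_count = 0;

/* Stand-in for the basic back end: print the message immediately. */
int basic_show_help(const char *topic)
{
    printf("help: %s\n", topic);
    return 0;
}

/* Stand-in for the aggregating back end: here it only counts messages. */
int aggregating_show_help(const char *topic)
{
    (void)topic;
    aggregated_count++;
    return 1;
}

/* Global pointer, statically initialized to the basic back end. */
show_help_fn_t show_help = basic_show_help;

/* Saved original value, restored at finalize time. */
static show_help_fn_t saved_show_help = NULL;

/* Upper layer init: re-point the global at the aggregating back end. */
void upper_layer_init(void)
{
    saved_show_help = show_help;
    show_help = aggregating_show_help;
}

/* Upper layer finalize: restore the original pointer so later calls still work. */
void upper_layer_finalize(void)
{
    if (saved_show_help != NULL) {
        show_help = saved_show_help;
        saved_show_help = NULL;
    }
}
```

Callers always go through `show_help(...)`, so code in the lower layer picks up the aggregating behavior without knowing the upper layer exists.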

This commit was SVN r24185.
2010-12-16 23:00:25 +00:00
Terry Dontje
80c1e9acac added the format parameter to OMPI_Affinity_str call in README
This commit was SVN r24184.
2010-12-16 19:22:59 +00:00
Rolf vandeVaart
b9f11a874d Bump revision in bfo utility script.
This commit was SVN r24183.
2010-12-16 18:14:18 +00:00
Terry Dontje
6da16ab0d7 add format parameter and layout format to OMPI_Affinity_str
This commit was SVN r24182.
2010-12-16 15:11:17 +00:00
Jeff Squyres
b113b1a382 Add the btl_tcp_if_seq MCA parameter. From the help string:
If specified, a comma-delimited list of TCP interfaces.  Interfaces
  will be assigned, one to each MPI process, in a round-robin fashion
  on each server.  For example, if the list is "eth0,eth1" and four
  MPI processes are run on a single server, then local ranks 0 and 2
  will use eth0 and local ranks 1 and 3 will use eth1.

This feature is only useful for environments with virtual ethernet
interfaces on the same network.  For example, if eth0 and eth1 are
virtual interfaces to the same NIC on the same subnet, and if the NIC
provides different hardware resources to eth0 and eth1 (not just
different kernel resources), some HOL blocking and congestion issues
can be eased in a modest fashion.
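
The round-robin assignment in the help string can be sketched as below. This is a hypothetical helper written for illustration, not the TCP BTL's implementation; the function name, signature, and error convention are assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the interface selected for `local_rank` from a
 * comma-delimited list (for example "eth0,eth1") into `out`.
 * Returns 0 on success, -1 on bad input or if `out` is too small. */
int pick_interface(const char *if_list, int local_rank,
                   char *out, size_t out_len)
{
    if (if_list == NULL || *if_list == '\0' || out == NULL ||
        out_len == 0 || local_rank < 0) {
        return -1;
    }

    /* The number of entries is one more than the number of commas. */
    int count = 1;
    for (const char *p = if_list; *p != '\0'; p++) {
        if (*p == ',') {
            count++;
        }
    }

    /* Round-robin: local rank modulo the number of interfaces. */
    int target = local_rank % count;

    /* Skip `target` commas to reach the start of the chosen entry. */
    const char *start = if_list;
    for (int i = 0; i < target; i++) {
        start = strchr(start, ',') + 1;
    }

    /* Copy the entry up to the next comma or the end of the string. */
    size_t len = strcspn(start, ",");
    if (len == 0 || len >= out_len) {
        return -1;
    }
    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```

For the list "eth0,eth1", this sketch selects eth0 for local ranks 0 and 2 and eth1 for local ranks 1 and 3, matching the example in the help string.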

This commit was SVN r24181.
2010-12-16 00:54:32 +00:00
Jeff Squyres
741ba6518b Install ompi_config.h (and friends) into /openmpi, not /openmpi/ompi/include. This is consistent with prior releases and the OPAL and ORTE <foo>_config.h file installation locations.
This commit was SVN r24180.
2010-12-15 20:54:22 +00:00
Jeff Squyres
b46e0291c2 Add 1.5.1 section. Update 1.5 section to reflect the actual text
that was shipped in 1.5.

This commit was SVN r24176.
2010-12-15 18:35:59 +00:00
Shiqing Fan
883aeedd26 Add support for the nanosleep function using Sleep on Windows. The accuracy of the sleep function on Windows is 1 millisecond, as mentioned in the MSDN documentation.
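
A minimal sketch of how such a wrapper might look, assuming the request is given as separate seconds and nanoseconds values. The function names are hypothetical and this is not the code from the commit; rounding up to the next millisecond is a design choice made here so that a short non-zero request does not become a zero-length sleep.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* Convert a (seconds, nanoseconds) sleep request to whole milliseconds,
 * rounding any fractional millisecond up. Hypothetical helper. */
unsigned long sleep_request_to_ms(long sec, long nsec)
{
    unsigned long ms = (unsigned long)sec * 1000UL;
    ms += (unsigned long)((nsec + 999999L) / 1000000L);
    return ms;
}

#ifdef _WIN32
/* Hypothetical nanosleep-style wrapper built on the Win32 Sleep() call,
 * which takes its argument in milliseconds. Returns 0 on success and
 * -1 for an invalid request. */
int sketch_nanosleep(long sec, long nsec)
{
    if (sec < 0 || nsec < 0 || nsec > 999999999L) {
        return -1;
    }
    Sleep((DWORD)sleep_request_to_ms(sec, nsec));
    return 0;
}
#endif
```

Because Sleep() works in milliseconds, any request finer than 1 ms is necessarily coarsened by a wrapper of this kind.
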
This commit was SVN r24175.
2010-12-15 15:43:25 +00:00
Shiqing Fan
ec82e73bce use sockets instead of pipes on Windows.
This commit was SVN r24174.
2010-12-15 14:34:25 +00:00
Jeff Squyres
298a50ff3b Clarification that calling MPI_INIT_THREAD with MPI_THREAD_SINGLE is
the same as calling MPI_INIT.

This commit was SVN r24173.
2010-12-14 22:49:36 +00:00
Matthias Jurenz
2f52ac6d7d Updated svn:ignore to hide ltmain.sh.orig
This commit was SVN r24172.
2010-12-14 19:15:17 +00:00
Matthias Jurenz
679317f632 Patch autotools output in all sub directories where autoreconf will be invoked. Fixes trac:2599.
This commit was SVN r24171.

The following Trac tickets were found above:
  Ticket 2599 --> https://svn.open-mpi.org/trac/ompi/ticket/2599
2010-12-14 19:14:35 +00:00
Jeff Squyres
de97962aac Fixes trac:2651.
Fix off-by-one error when /dev/urandom doesn't exist.  Thanks to "pth"
for the patch.

This commit was SVN r24170.

The following Trac tickets were found above:
  Ticket 2651 --> https://svn.open-mpi.org/trac/ompi/ticket/2651
2010-12-14 14:52:51 +00:00
Shiqing Fan
f4cb293f7c Rename README.WINDOWS
This commit was SVN r24168.
2010-12-14 14:20:42 +00:00
Samuel Gutierrez
7d8d7769ee remove unneeded include in sm btl.
This commit was SVN r24165.
2010-12-13 17:30:47 +00:00
Ralph Castain
b251a59cdf Cleanup nidmap finalize
This commit was SVN r24164.
2010-12-11 16:42:06 +00:00
Brian Barrett
6cf74eeb03 Fix bug in looking at convertor_unpack return code. Always print debug
on error message for now.

This commit was SVN r24163.
2010-12-10 22:36:47 +00:00
Brian Barrett
a26fadb26e Bring Portals4 updates back into the trunk
This commit was SVN r24154.
2010-12-07 20:11:25 +00:00
Ralph Castain
2dc5cbb483 Remove stale code and API from the RML/OOB frameworks. Stopped using this code years ago.
This commit was SVN r24153.
2010-12-05 15:58:21 +00:00
George Bosilca
b4355408f5 Fix the Sparc and Sparcv9 atomics based on Nicolai Stange's
patch.

CMR:v1.5
CMR:v1.4

This commit was SVN r24150.
2010-12-03 19:16:53 +00:00
Ethan Mallove
297630368a More explanation of why we need to use gpatch on Solaris
This commit was SVN r24149.
2010-12-03 19:14:05 +00:00
George Bosilca
bb412a5ff7 Indentation.
This commit was SVN r24148.
2010-12-03 19:13:57 +00:00
Ethan Mallove
42ffa6fda9 On Solaris, use gpatch in autogen.pl.
This commit was SVN r24147.
2010-12-03 17:35:00 +00:00
Rolf vandeVaart
b67d3398da It is convention to have orte_config.h included at top of file.
This commit was SVN r24146.
2010-12-03 16:13:31 +00:00
Rolf vandeVaart
3f7dd84278 Fix libevent so it can compile in the few cases where sys/queue.h does not exist.
1. Remove it from libevent207.h because it is not needed.
2. Add compat to the include list so it can use queue.h when needed.

This commit was SVN r24144.
2010-12-02 23:05:02 +00:00
Nathan Hjelm
effc240dbf Updated authors file
This commit was SVN r24138.
2010-12-02 15:34:59 +00:00
Shiqing Fan
f43862420c Convert the bad DOS line endings to Unix style for all Windows-related files.
This commit was SVN r24137.
2010-12-02 12:08:08 +00:00
Ralph Castain
aaad8ae891 Remove unused var
This commit was SVN r24136.
2010-12-02 02:38:13 +00:00
Ralph Castain
f9ffff59f8 Ensure clean termination of threads and tcp multicast
This commit was SVN r24134.
2010-12-02 00:23:42 +00:00
Nathan Hjelm
94d4aa7253 fixed wrong include
This commit was SVN r24133.
2010-12-01 23:10:12 +00:00
Nathan Hjelm
75605faa75 added support for reattaching a debugger using the MPIR_attach_fifo
This commit was SVN r24132.
2010-12-01 20:13:58 +00:00
Eugene Loh
c310eacc58 Change Fortran interface MPI_SCAN to MPI_EXSCAN in MPI_Exscan man page.
This commit was SVN r24131.
2010-12-01 17:29:48 +00:00
Ralph Castain
ad814f26cd One more time, into the breach!
Restore the use of override_oversubscribe to indicate that the data source for resources on the backend nodes used in mapping is unreliable. In this situation (e.g., data came from hostfile, or we are just using localhost because nothing was provided), we don't trust the oversubscribe condition passed by the mapper. Instead, we check locally to ensure we set sched_yield correctly.

This commit was SVN r24130.
2010-12-01 15:15:26 +00:00
Ralph Castain
eba65e97f3 Extend the rmcast APIs to allow enable/disable of comm, required for clean termination by upper layer users.
Point the recv thread event base to the right place so it can wakeup when required.

Add a new error code for "comm disabled" when attempting to communicate after disabling comm.

This commit was SVN r24129.
2010-12-01 13:41:19 +00:00
Ralph Castain
9224302c10 Remove debug
This commit was SVN r24128.
2010-12-01 13:12:24 +00:00
Ralph Castain
4f5625d699 Not totally necessary, but good form - init the oversubscribed field in the orte_nid_t object
This commit was SVN r24127.
2010-12-01 12:58:37 +00:00
Ralph Castain
30c37ea536 Ensure that the oversubscribed condition of nodes is accurately reported by the mapper, and that the results are communicated and used by the backend orteds when setting sched_yield on local procs. Restores prior behavior that was somehow lost along the way.
Includes a patch from Damien Guinier to fix vpid assignments when cpus-per-task is specified.

This commit was SVN r24126.
2010-12-01 12:51:39 +00:00