1
1
Граф коммитов

4171 Коммитов

Автор SHA1 Сообщение Дата
Donald Kerr
213daa58da support for solaris relaxed ordering
This commit was SVN r20167.
2008-12-24 15:05:12 +00:00
Ralph Castain
7787f84540 Per the earlier RFC and some discussion at the Dec ORTE design meeting, add the ompi-top tool and all its supporting infrastructure. This includes a new OPAL pstat framework and data type, currently with rather weak support for Mac OSX and pretty complete support for Linux. The Sun team promised to add Solaris support as well.
Also, per chat with Jeff, modified the Makefile.am's of a few orte tools so that they were consistent in the way we generate the ompi-equivalent cmds.

This commit was SVN r20165.
2008-12-22 20:23:05 +00:00
George Bosilca
4d5fbc5955 Remove unused lock from the ompi_proc_t. This reduce the size of the ompi_proc_t
by 64 bytes.
Remove the useless pml_proc from the PML layer.

This commit was SVN r20157.
2008-12-19 19:56:27 +00:00
George Bosilca
7fc48ae11e Update the template BTL to fulfill the requirements for #1713.
This commit was SVN r20153.
2008-12-17 22:15:43 +00:00
George Bosilca
209b844017 Update the self BTL to fulfill the requirements for #1713.
This commit was SVN r20152.
2008-12-17 22:15:27 +00:00
George Bosilca
7d404d238d Update the tcp BTL to fulfill the requirements for #1713.
This commit was SVN r20151.
2008-12-17 22:15:12 +00:00
George Bosilca
341ee1389c Update the sm BTL to fulfill the requirements for #1713.
This commit was SVN r20150.
2008-12-17 22:14:59 +00:00
George Bosilca
7cec018149 Update the Elan BTL to fulfill the requirements for #1713.
This commit was SVN r20149.
2008-12-17 22:14:45 +00:00
Josh Hursey
c954045989 Add a patch to address a deadlock in the CRCP BKMRK component.
The problem was that we doubly decremented the active count on blocking receives that we stall to complete. This moved the active count into the negative. With a negative count for 'active' a message that should have been accounted for would be over looked. This then causes the bookmark exchange to post a drain for a message that was never posted, thus locking the protocol. By eliminating the decrement on the 'active' count when we attempt to post the drain message, we only the decrement this counter when the outstanding blocking recv completes during the stall operation.

Refs trac:1619
Does not close this ticket since there is an outstanding potential problem with ANY_SOURCE and ANY_TAG, as referenced in the ticket.

This should be moved to v1.3

This commit was SVN r20147.

The following Trac tickets were found above:
  Ticket 1619 --> https://svn.open-mpi.org/trac/ompi/ticket/1619
2008-12-17 17:23:39 +00:00
Jeff Squyres
2f94a151e1 Fix a const cast.
This commit was SVN r20146.
2008-12-17 15:29:02 +00:00
Brian Barrett
f8537c0059 Following ticket #1725, when a free list item can not be allocated, return
the error to the upper layer and let it deal with the problem

This commit was SVN r20143.
2008-12-16 22:38:02 +00:00
Brian Barrett
dbccb250f0 TYPE shouldn't be surrounded by parens because it causes issues for some versions of gcc when the construct a = ((int)) sizeof(b) comes up...
This commit was SVN r20140.
2008-12-16 19:49:54 +00:00
Ethan Mallove
9003e4d722 Add missing #include <errno.h> line (for SunStudio Solaris).
This commit was SVN r20138.
2008-12-16 15:30:02 +00:00
George Bosilca
f9a1700f55 Some 64 bits architectures support pointer aligned on 4 bytes
(where the sizeof(long) is 8). On such architectures dont
assert if the datatype representation is not aligned on 64 bits.

This commit was SVN r20134.
2008-12-16 09:21:21 +00:00
George Bosilca
db424282f7 Fix an issue where the datatype description introduce a buffer misalignment. Because some
architectures (read SPARC64) require aligned accesses, we increase the storage space
when we pack a datatype description to keep the fields aligned. This has to be done
on both sided in order to be consistent.

This commit was SVN r20133.
2008-12-16 09:06:27 +00:00
Avneesh Pant
c1e508750b Check the active port MTU against the MTU statically configured for the HCA. QLogic HCA's capable of MTU had an issue when connected to switches running at 2K.
This commit was SVN r20131.
2008-12-15 21:17:58 +00:00
George Bosilca
fe87e28fee This is a temporary fix for the deadlock problem over MX. The real
problem seems to come from the free list, but due to lack of time to
understand it completely, I provide this fix. Basically, there is no
waiting in the MX BTL anymore, if we cannot allocate a fragment we
rely on the PML to take the corrective actions.

This commit was SVN r20124.
2008-12-15 03:45:34 +00:00
George Bosilca
aa4e9da26d Correct the disp array when creating a data based on the
MPI_COMBINER_INDEXED_BLOCK combiner.

This commit was SVN r20123.
2008-12-13 01:57:27 +00:00
George Bosilca
fec8692074 Get rid of all elan3 references.
This commit was SVN r20122.
2008-12-12 23:59:21 +00:00
Nysal Jan
ee8ec6f6b5 Remove dead/redundant code. Minimize number of calloc invocations
This commit was SVN r20121.
2008-12-12 10:55:50 +00:00
George Bosilca
7631eb8eed A fix for http://www.open-mpi.org/community/lists/users/2008/12/7502.php.
The solution is not to compute the OVERLAP flag, as the best we can do
is an approximative answer. Without this flag the unpack can leads to
unexpected answers if the data-type contain any overlapping regions.
As such datatypes are illegal in MPI, this became a user responsability.

This commit was SVN r20120.
2008-12-12 00:25:40 +00:00
Nysal Jan
6a5454b76a Fixes crash in openib BTL on a heterogeneous cluster Refs trac:1700
This commit was SVN r20113.

The following Trac tickets were found above:
  Ticket 1700 --> https://svn.open-mpi.org/trac/ompi/ticket/1700
2008-12-10 22:07:48 +00:00
Shiqing Fan
20cea164db - 3/4 commit for Windows Visual Studio and CCP support:
corrections to non-windows files (but within ifdef __WINDOWS__)
  type casts, event library for windows use win32. 
  in orte runtime, add windows sockets handling and object construction.

This commit was SVN r20110.
2008-12-10 21:13:10 +00:00
Shiqing Fan
a5281f0434 - 1/4 commit for Windows Visual Studio and CCP support:
CMakeLists and .windows files.
  In contribs preconfigured and precompiled parts.

This commit was SVN r20108.
2008-12-10 20:59:20 +00:00
Josh Hursey
df75abd6b2 Fix a warning. Thanks to Jeff for noticing.
This should be moved to v1.3 as well.

This commit was SVN r20101.
2008-12-10 15:38:12 +00:00
Ralph Castain
1ace83c470 Enable modex-less launch. Consists of:
1. minor modification to include two new opal MCA params:
   (a) opal_profile: outputs what components were selected by each framework
       currently enabled for most, but not all, frameworks
   (b) opal_profile_file: name of file that contains profile info required
       for modex

2. introduction of two new tools:
   (a) ompi-probe: MPI process that simply calls MPI_Init/Finalize with
       opal_profile set. Also reports back the rml IP address for all
       interfaces on the node
   (b) ompi-profiler: uses ompi-probe to create the profile_file, also
       reports out a summary of what framework components are actually
       being used to help with configuration options

3. modification of the grpcomm basic component to utilize the
   profile file in place of the modex where possible

4. modification of orterun so it properly sees opal mca params and
   handles opal_profile correctly to ensure we don't get its profile

5. similar mod to orted as for orterun

6. addition of new test that calls orte_init followed by calls to
   grpcomm.barrier

This is all completely benign unless actively selected. At the moment, it only supports modex-less launch for openib-based systems. Minor mod to the TCP btl would be required to enable it as well, if people are interested. Similarly, anyone interested in enabling other BTL's for modex-less operation should let me know and I'll give you the magic details.

This seems to significantly improve scalability provided the file can be locally located on the nodes. I'm looking at an alternative means of disseminating the info (perhaps in launch message) as an option for removing that constraint.

This commit was SVN r20098.
2008-12-09 23:49:02 +00:00
Pavel Shamis
068054132a Temporary work around for #1693 (osu_bibw segfault in xrc + coalescing mode)
This commit was SVN r20093.
2008-12-09 19:21:54 +00:00
George Bosilca
df13c2810d Undo the last commit related to the Fortran profiling. After spending few hours
pondering about this problem, we came to the conclusion that the best approach
is to keep what we had before (i.e. the original approach).

The main reason for this is being nice with tool developers. In the current
incarnation, they can either catch the Fortran calls or the C calls. If they
provide both, then they will have to figure out how to cope with the double
calls (as your example highlight).

Here is the behavior Open MPI will stick too:
Fortran MPI  -> C MPI
Fortran PMPI -> C MPI

However, the is another possible approach. This might avoid the double calls
while preserving the tool writers friendliness. This possible approach will do:
   Fortran MPI  -> C MPI
   Fortran PMPI -> C PMPI
                     ^
Unfortunately, we will have to heavily modify all files in the Fortran
interface layer in order to support this approach.

This commit was SVN r20079.
2008-12-06 00:35:32 +00:00
George Bosilca
54d9df317f Fix the Fortran profiling layer to insure that we call the C PMPI_ functions instead of
their MPI_ counterpart. This allow the profiling layer to catch each MPI function only once,
from C and Fortran.

This commit was SVN r20076.
2008-12-05 16:52:25 +00:00
Jeff Squyres
01feac443e Add a missing header file; we won't find mca_topo_base_comm_1_0_0_t
in optimized builds without this.

This commit was SVN r20075.
2008-12-05 14:32:50 +00:00
Shiqing Fan
d06604c258 Get rid of the compiler warning message when --enable-picky is used.
Do the checks according to inter/intracommunicator flags.

This commit was SVN r20063.
2008-12-03 17:44:21 +00:00
Brad Benton
0b83ab39e5 Added the part number for IBM's 2nd rev of the eHCA to the eHCA param stanza
This commit was SVN r20057.
2008-12-03 14:35:43 +00:00
Jon Mason
54b4e9901c Gracefully handle NULL strings when calling orte_show_help for preventing usage
of more than one of the btl_openib_if_include, btl_openib_if_exclude,
btl_openib_ipaddr_include, or btl_openib_ipaddr_exclude MCA parameters.

This commit was SVN r20053.
2008-12-02 23:17:46 +00:00
Jon Mason
91b26eba67 This commit adds comments regarding IP Aliases and the default behavior when
determining which IP address to use when transmitting data.  Also it adds logic
to prevent usage of more than one of the btl_openib_if_include,
btl_openib_if_exclude, btl_openib_ipaddr_include, or btl_openib_ipaddr_exclude
MCA parameters.

This should complete the code modifications needed for ticket 1665.

This commit was SVN r20052.
2008-12-02 22:42:01 +00:00
Edgar Gabriel
b05393b363 now that we have the unexpected message queue for unknown communicators, there is no need to have this additional synchronization operation for multi-threaded communicator creations.
This commit was SVN r20046.
2008-12-02 16:30:15 +00:00
Shiqing Fan
abd21b6d17 - An update for memchecker :
1. fix a bug in pml_ob1_recvreq/sendreq.c, buffer was made defined where the request has already been released.
2. complete memchecker support for collective functions.
3. change the wrongly spelled function name of memchecker, i.e. '*_isaddressible' should be '*_isaddressable'

This commit was SVN r20043.
2008-11-27 16:34:02 +00:00
Donald Kerr
668eafec88 Support to handle platforms which do not define RLIMIT_MEMLOCK
This commit was SVN r20035.
2008-11-25 03:13:09 +00:00
George Bosilca
16598d7d39 When copying data using the same datatype don't ignore the gap in
the begining.

This commit was SVN r20034.
2008-11-25 00:26:12 +00:00
George Bosilca
3927c03e18 Correctly use the snprintf. Don't zap the last char.
This commit was SVN r20033.
2008-11-25 00:25:28 +00:00
George Bosilca
69afbc084a Don't forget to cast or the compiler will do the division
as a double and then convert.

This commit was SVN r20029.
2008-11-24 15:53:56 +00:00
Jon Mason
4757970438 This patch consists of two parts. Part one is the fixing of a bug in the
determing of the IP subnet.  The netmask was being used improperly when
determining which subnet each connection is on.  Part two is the ability to
include/exclude specific subnets.

This patch fixes ticket #1665

This commit was SVN r20016.
2008-11-17 20:20:24 +00:00
Shiqing Fan
4d2c118d3b - fix a type cast. The whole libmpi library has to be compiled as CXX on Windows, and MS compiler recognizes this as an error.
This commit was SVN r20012.
2008-11-17 12:18:01 +00:00
Patrick Geoffray
0f331b4c13 Define a "fake" mpool to provide a memory release callback for the
memory hooks (munmap) and initialize the mallopt component, and 
nothing else.
Use this mpool in the MX common initialization, supporting both BTL 
and MTL. Automatically set the MX_RCACHE environment variable to 
enable registration cache in MX.

Tested with success for munmap() and large free().

This commit was SVN r20003.
2008-11-15 04:17:58 +00:00
Nysal Jan
e4bdaac6d8 Fixed the case where a device does not support inline data. Redefined the interpretation of max_inline_data MCA parameter.
* If max_inline_data == -1 perform runtime detection 
* If max_inline_data >=0 use the value provided 
* If the user does not explicitly set this via command line, use the value from INI file

This commit fixes trac:1662

This commit was SVN r19995.

The following Trac tickets were found above:
  Ticket 1662 --> https://svn.open-mpi.org/trac/ompi/ticket/1662
2008-11-14 12:15:35 +00:00
Rolf vandeVaart
76f8ce01cf Need to add sppp to list of default excluded interfaces
to support Sun M9000 server.

This commit was SVN r19988.
2008-11-12 20:30:14 +00:00
Ralph Castain
ce26e3a2fb Update the notifier framework in prep for move to v1.3. Add an API to handle the case where error messages have been expressed via "show_help" so they can look similar to what was presented to users. Add three key calls in the openib btl to drop messages into syslog.
This will sit in trunk for a few days - would like to actually see some errors reported to syslog before moving the code to 1.3

This commit was SVN r19986.
2008-11-12 18:03:51 +00:00
Jeff Squyres
a48b2d45be Fix wonky copyright year.
This commit was SVN r19985.
2008-11-12 17:51:54 +00:00
Jeff Squyres
bb0b5b04bd Remove duplicate copyright notice (found by script).
This commit was SVN r19984.
2008-11-12 17:42:40 +00:00
Kenneth Matney
07f7f00c91 This disables sendi, since it may do 0-byte requests and it still has
another bug.  This also causes 0-byte requests to be treated as a buffer
error, causing the base request to be requeued.  On Cray XT, it may be
temporarily impossible to make allocations for buffer requests, as the
default stack size is small (8 MB) and there is no true swap device.
Even with the stack size increased, there will be cases in which this
condition recurs.

One possibility is to make the buffer allocations off of the heap; but,
this does not change the fact that eventually an out-of-memory condition
will occur and we need to support multiple receives in transit, a
condition for which the available buffer space may change.  On the other
hand, if we switch to allocating the buffer space from the heap, we will
need to return an error when the allocation fails and there are no other
buffers in transit.

This commit was SVN r19981.
2008-11-12 16:04:14 +00:00
George Bosilca
e84af7920e Move __counter outside the #ifdef section. Cleanup the usage of __counter.
This commit was SVN r19979.
2008-11-11 16:46:11 +00:00