George Bosilca
4fd78c4683
Keep track of the last probe on each communicator, so we can probe all
...
peers in a round-robin fashion. A little bit more fair ...
This commit was SVN r25264.
2011-10-11 20:24:54 +00:00
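A minimal sketch of the round-robin idea described in this commit, not the actual Open MPI code. The struct, the callback type, and every name below are hypothetical; the only point illustrated is that remembering the last probed peer lets the next probe start just after it, so low-ranked peers are not always favored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-communicator state (not the real Open MPI structure). */
typedef struct {
    int num_peers;   /* number of peers on this communicator */
    int last_probed; /* peer matched by the previous probe, -1 if none yet */
} comm_probe_state_t;

/* Hypothetical callback: true if a message from `peer` is pending. */
typedef bool (*peer_has_message_fn)(int peer, void *ctx);

/* Example callback: ctx points to a bitmask of peers with pending messages. */
bool mask_has_message(int peer, void *ctx)
{
    return ((*(unsigned *)ctx) >> peer) & 1u;
}

/* Visit every peer exactly once, starting just after the last probed one.
 * Returns the first peer with a pending message, or -1 if there is none. */
int probe_round_robin(comm_probe_state_t *st, peer_has_message_fn has_msg, void *ctx)
{
    assert(st != NULL && st->num_peers > 0);
    for (int i = 1; i <= st->num_peers; i++) {
        int peer = (st->last_probed + i) % st->num_peers;
        if (has_msg(peer, ctx)) {
            st->last_probed = peer; /* next scan starts after this peer */
            return peer;
        }
    }
    return -1;
}
```

With peers 0 and 2 both pending, successive calls alternate between them instead of returning peer 0 every time.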
George Bosilca
2fefd3a928
Don't forget to move the pointer back by the true_lb.
...
This commit was SVN r25262.
2011-10-11 20:15:49 +00:00
George Bosilca
649af6c925
Enumerated mixed with another type (int) is tolerated but
...
easily fixable.
This commit was SVN r25241.
2011-10-09 03:54:52 +00:00
George Bosilca
07f6ce235f
Return an OMPI_ error not an ORTE_.
...
This commit was SVN r25232.
2011-10-04 14:57:24 +00:00
George Bosilca
ce7935c8fa
Obviously these were not needed.
...
This commit was SVN r25231.
2011-10-04 14:56:34 +00:00
George Bosilca
80c02647c8
Each level (OPAL/ORTE/OMPI) should only return its own constants,
...
instead of the current mismatch.
This commit was SVN r25230.
2011-10-04 14:50:31 +00:00
Mike Dubman
7a9ae43276
added support for shared memory transport in mxm
...
This commit was SVN r25220.
2011-10-03 12:59:55 +00:00
Brian Barrett
fc29ffebdb
* remove two aborts that aren't necessary
...
This commit was SVN r25214.
2011-09-29 22:27:23 +00:00
Brian Barrett
14f32a1a54
* Clean up progress function
...
* Only print returnable errors when verbose=1. Still print errors when
we're going to abort, since those obviously aren't returnable
This commit was SVN r25213.
2011-09-29 22:26:33 +00:00
Brian Barrett
758f8a4d87
* More debugging output
...
* Make recv short block events use the callback mechanism so that we can
add overflow debugging
This commit was SVN r25212.
2011-09-29 21:59:48 +00:00
Brian Barrett
c08ea5c0f5
Set options correctly for the two pts
...
This commit was SVN r25211.
2011-09-29 21:56:37 +00:00
Brian Barrett
05f800abae
Properly unpack data for long unexpected
...
This commit was SVN r25210.
2011-09-29 17:25:45 +00:00
Rolf vandeVaart
3d8c6b83a9
Make some error messages more helpful
...
This commit was SVN r25209.
2011-09-29 16:32:46 +00:00
Brian Barrett
bb9e73232a
* Leverage hdr_data and opcount to improve debugging
...
* Clean up handling of short synchronous messages
This commit was SVN r25208.
2011-09-28 21:18:47 +00:00
Brian Barrett
71d8300607
* Fix name clash with macros in mtl_portals4.h
...
* hdr_data now includes opcount and length for all messages, which is the match
bits for long and rndv messages
* Re-add probe implementation
This commit was SVN r25207.
2011-09-28 16:53:01 +00:00
Brian Barrett
2fb8045fad
clean up printfs
...
This commit was SVN r25206.
2011-09-28 15:28:46 +00:00
Brian Barrett
26e781f002
* Remove triggered code for now
...
* Move from per-endpoint send/recv count to just send side op count
This commit was SVN r25205.
2011-09-28 15:25:39 +00:00
Brian Barrett
592c1ab6db
* revert probe and size information changes, since it seems to break everything
...
This commit was SVN r25204.
2011-09-28 14:57:19 +00:00
Brian Barrett
211b5c7824
* Make triggered protocol only work for non-wildcard receives
...
* Always encode length in header data to make probe work
* General send/receive cleanups
* Implement iprobe
This commit was SVN r25197.
2011-09-27 22:45:00 +00:00
Brian Barrett
77c560be42
updates to match new api changes
...
This commit was SVN r25196.
2011-09-27 20:38:22 +00:00
Brad Benton
0f2475c554
Modified set_remote_info() to use memmove() instead of memcpy() when
...
copying rem_qp info. This avoids potential errors when src & dest overlap.
This is a workaround for the issue in #2871
This commit was SVN r25180.
2011-09-26 20:07:36 +00:00
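A short sketch of why the switch to memmove() matters. The struct and function names here are illustrative stand-ins, not the real openib BTL definitions: memcpy() has undefined behavior when source and destination overlap, while memmove() is specified to behave as if the bytes were first copied to a temporary buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the remote QP info; field names are illustrative. */
typedef struct {
    unsigned int qp_num;
    unsigned int psn;
} rem_qp_info_t;

/* Copy `count` entries from src to dst. Safe even if the ranges overlap,
 * which memcpy() does not guarantee. */
void copy_rem_qp_info(rem_qp_info_t *dst, const rem_qp_info_t *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    memmove(dst, src, count * sizeof(*dst));
}
```

Shifting entries by one slot within the same array, for example, is an overlapping copy that this helper handles correctly.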
Vishwanath Venkatesan
2ee2b478d8
Modifying selection logic to select dynamic for cases where two_phase
...
fails.
This commit was SVN r25171.
2011-09-20 21:57:23 +00:00
Pavel Shamis
29c4981caa
Removing unused include from openib/ofud btls.
...
This include causes compilation failure on macos platform.
This commit was SVN r25170.
2011-09-20 19:25:59 +00:00
Rolf vandeVaart
0749a220e8
Add support for MPI_IN_PLACE to MPI_Exscan. Required for MPI 2.2 compliance.
...
Reviewed by Jeff Squyres. This fixes trac:2221.
This commit was SVN r25165.
The following Trac tickets were found above:
Ticket 2221 --> https://svn.open-mpi.org/trac/ompi/ticket/2221
2011-09-20 14:54:41 +00:00
Nathan Hjelm
98b56108c4
add unreliable datagram connection manager (udcm)
...
This commit was SVN r25160.
2011-09-19 21:24:58 +00:00
Nathan Hjelm
8cd550f49f
fixed error in last commit
...
This commit was SVN r25159.
2011-09-19 21:13:59 +00:00
Nathan Hjelm
de950959ee
print a more meaningful error message when ibv_create_qp fails
...
This commit was SVN r25158.
2011-09-19 21:12:22 +00:00
Josh Hursey
2d25d70a1c
Missing header for opal_timer_base_get_cycles
...
This commit was SVN r25157.
2011-09-19 19:52:58 +00:00
George Bosilca
9687e7f38e
This commit fixes trac:2679 and should be added to cmr:v1.4:reviewer=jsquyres
...
and cmr:v1.5:reviewer=jsquyres
This commit was SVN r25155.
The following Trac tickets were found above:
Ticket 2679 --> https://svn.open-mpi.org/trac/ompi/ticket/2679
2011-09-18 00:58:26 +00:00
Steve Wise
e4629259f0
Update T4 openib btl defaults.
...
- add 2 new device ids.
- default rq depth to 64, which proved good for large runs.
This commit should be added to cmr:v1.4:reviewer=jsquyres and
cmr:v1.5:reviewer=jsquyres
This commit was SVN r25145.
2011-09-14 22:12:25 +00:00
Steve Wise
e5bba38434
Increase the rdmacm cpc address resolution timeout to 30 seconds.
...
Global rdmacm_resolve_timeout defaults to 1000 (1000 ms), which is way
too small for even a 16 node x 12 core iwarp cluster in the presence
of drops. Bump up the default to 30000ms.
This commit fixes trac:2860 and should be added to cmr:v1.4:reviewer=jsquyres
and cmr:v1.5:reviewer=jsquyres
This commit was SVN r25144.
The following Trac tickets were found above:
Ticket 2860 --> https://svn.open-mpi.org/trac/ompi/ticket/2860
2011-09-14 21:52:58 +00:00
Nathan Hjelm
3048ce043d
permanently disable ibcm
...
This commit was SVN r25137.
2011-09-13 15:10:51 +00:00
Shiqing Fan
8bf5a61265
Fix another compile error for Windows.
...
This commit was SVN r25129.
2011-09-12 14:19:42 +00:00
Ralph Castain
b11f93a039
Check add_procs return value
...
This commit was SVN r25122.
2011-09-11 14:53:26 +00:00
Sylvain Jeaugey
002d39f345
Restored Bull vendor id for ConnectX card
...
This commit was SVN r25121.
2011-09-07 15:58:42 +00:00
Edgar Gabriel
4bc2e9b023
fix a typo and add an actual pvfs function in the configure link-test.
...
This commit was SVN r25120.
2011-09-07 15:46:41 +00:00
Edgar Gabriel
196c3819e2
- revamp the configure logic to detect pvfs2 and lustre
...
- slight change in the selection logic of the fs module, which makes
the ompio independent of the file system type (otherwise ompio
would also have required a configure script).
This commit was SVN r25118.
2011-09-07 10:39:47 +00:00
Mike Dubman
29b63fee81
better support of pml/cm for mxm
...
This commit was SVN r25113.
2011-09-06 06:38:57 +00:00
Shiqing Fan
16193771ba
Add one missing header file. Fix the MTT build for Windows.
...
This commit was SVN r25112.
2011-08-31 13:15:05 +00:00
Rainer Keller
9d5afc58c6
- Fix breakage of the epoch changes with PGI:
...
Don't just include pre-processor macros between two strings ("s1" #if 0 ... "s2")...
Rather print out the epoch as 0 always...
This commit was SVN r25110.
2011-08-31 08:40:31 +00:00
Wesley Bland
4e7ff0bd5e
By popular demand the epoch code is now disabled by default.
...
To enable the epochs and the resilient orte code, use the configure flag:
--enable-resilient-orte
This will define both:
ORTE_ENABLE_EPOCH
ORTE_RESIL_ORTE
This commit was SVN r25093.
2011-08-26 22:16:14 +00:00
Edgar Gabriel
9b6cf80074
- update the svn:ignore properties for some generated files
...
- add ompi_unignore files for the pvfs2 and lustre components to debug the
configure problem
This commit was SVN r25090.
2011-08-26 13:23:29 +00:00
Edgar Gabriel
f46ef05c6e
ompi_ignore some components that depend on the configure logic, since some
...
libs don't seem to propagate correctly under certain circumstances. This
hopefully makes the nightly tests pass.
also, remove the files that should not have been committed in the first place
:-)
This commit was SVN r25085.
2011-08-26 00:49:32 +00:00
Ralph Castain
71e74990de
Add missing includes so this compiles under Mac OSX
...
This commit was SVN r25084.
2011-08-25 23:04:24 +00:00
Edgar Gabriel
4d23ea19cb
add a missing header file.
...
This commit was SVN r25081.
2011-08-25 21:28:28 +00:00
Edgar Gabriel
61ac1dbcf3
silence some warnings.
...
This commit was SVN r25080.
2011-08-25 21:22:34 +00:00
Edgar Gabriel
52063267df
commit of the OMPIO modules and frameworks.
...
This commit was SVN r25079.
2011-08-25 20:08:17 +00:00
Mike Dubman
98f382ba0e
fixes in mxm mtl
...
This commit was SVN r25066.
2011-08-19 22:18:17 +00:00
Shiqing Fan
627f1dd351
Correct several export declarations.
...
This commit was SVN r25047.
2011-08-15 09:45:51 +00:00
Jeff Squyres
1cbfb53801
r24976 wasn't quite right -- you now actually get a warning if you
...
specify btl_tcp_if_include because btl_tcp_if_exclude is defaulted to
the loopback devices.
This commit does a few things:
* Introduce a new OPAL MCA base function:
mca_base_param_check_exclusive_string(). It checks to see that the
''user'' does not set two MCA parameters that are mutually
exclusive by checking the source of those MCA param values.
* Use the above function in many BTLs (and the OOB TCP) to ensure
that <foo>_if_include and <foo>_if_exclude are not both specified
''by the user''.
* Re-arrange many of these BTLs to move their MCA registration code
into a separate component_register() function (vs. the
component_open() function).
This code has been nominally reviewed and checked by Ralph, George,
Terry, and Shiqing.
This commit was SVN r25043.
The following SVN revision numbers were found above:
r24976 --> open-mpi/ompi@8f4ac54336
2011-08-10 17:24:36 +00:00
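A rough sketch of the "check the source of the value" idea from this commit. It does not reproduce the real signature of mca_base_param_check_exclusive_string(); the types, enum values, and function names below are assumptions made for illustration. The key point is that a default value (such as the loopback default for an exclude list) must not count as a conflict, only values the user actually set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of where a parameter value came from. */
typedef enum {
    PARAM_SOURCE_DEFAULT, /* built-in default, not set by the user */
    PARAM_SOURCE_ENV,     /* environment variable or command line */
    PARAM_SOURCE_FILE     /* parameter file */
} param_source_t;

/* Hypothetical parameter descriptor. */
typedef struct {
    const char *name;
    const char *value;
    param_source_t source;
} param_t;

/* A parameter counts as user-set only if it has a non-default source. */
static bool set_by_user(const param_t *p)
{
    return p->value != NULL && p->source != PARAM_SOURCE_DEFAULT;
}

/* Returns true when both mutually exclusive parameters were set by the user. */
bool params_conflict(const param_t *a, const param_t *b)
{
    assert(a != NULL && b != NULL);
    return set_by_user(a) && set_by_user(b);
}
```

Under this scheme, a user-supplied include list combined with the defaulted exclude list produces no warning, while setting both explicitly is reported as a conflict.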
Mike Dubman
e3c869d83b
fix double free
...
This commit was SVN r25041.
2011-08-10 05:47:55 +00:00
Mike Dubman
a751cd93d3
improve debug macro availability
...
This commit was SVN r25022.
2011-08-09 10:54:08 +00:00
Mike Dubman
bfd75de6f9
fix selection logic: if no suitable device found - disqualify mxm w/o complaints.
...
This commit was SVN r25021.
2011-08-09 07:09:37 +00:00
Wesley Bland
09274cd047
Make sure that the epoch is initialized everywhere so we don't get weird output
...
during valgrind. This shouldn't have caused any problems with any actual
execution. Just extra warnings in valgrind.
This commit was SVN r25015.
2011-08-08 15:11:55 +00:00
Mike Dubman
1d3f5e1314
better mxm selection mechanism, some refactoring
...
This commit was SVN r25005.
2011-08-07 12:06:49 +00:00
Yevgeny Kliteynik
7068dc64eb
Dynamic SL rework:
...
- Added dynamic SL support to xoob
- Fixed seg fault in finalization
- All the code has been moved to separate files: connect/btl_openib_connect_sl.{c,h}
- The new files compilation is conditionalized
This commit was SVN r24991.
2011-08-04 20:26:08 +00:00
Rolf vandeVaart
3d3b3d4dad
Add support for CUDA registering sm and openib buffers. Feature is disabled by default.
...
This commit was SVN r24987.
2011-08-04 10:15:45 +00:00
Mike Dubman
7b18ab2fa9
remove unused includes
...
This commit was SVN r24980.
2011-08-03 07:07:29 +00:00
Mike Dubman
45ea375531
code and readme updates, some refactoring
...
This commit was SVN r24977.
2011-08-02 14:30:11 +00:00
Jeff Squyres
8f4ac54336
Fixes trac:2838: add a warning message and disqualify the TCP BTL if both
...
btl_tcp_if_include and btl_tcp_if_exclude are specified.
This commit was SVN r24976.
The following Trac tickets were found above:
Ticket 2838 --> https://svn.open-mpi.org/trac/ompi/ticket/2838
2011-08-01 23:30:33 +00:00
Yevgeny Kliteynik
c1ab24c687
openib: added Mellanox ConnectX3 device ID to the device parameters ini file
...
This commit was SVN r24947.
2011-07-26 12:06:43 +00:00
Mike Dubman
aefffa073d
initial implementation of MXM MTL layer
...
This commit was SVN r24946.
2011-07-26 04:36:21 +00:00
Mike Dubman
96ef2fc0e4
fix handling datatypes which have a gap in the beginning
...
This commit was SVN r24936.
2011-07-25 06:30:09 +00:00
Terry Dontje
fbda6aaf89
Fixes trac:2532 issues with 32-bit binaries
...
This commit was SVN r24891.
The following Trac tickets were found above:
Ticket 2532 --> https://svn.open-mpi.org/trac/ompi/ticket/2532
2011-07-13 16:38:03 +00:00
Jeff Squyres
51ac69b05f
Remove a now-nonexistent file
...
This commit was SVN r24874.
2011-07-11 23:51:41 +00:00
Abhishek Kulkarni
5501f83fb5
shmem fixes to make the trunk build with C/R flags on.
...
This commit was SVN r24871.
2011-07-10 23:32:23 +00:00
Jeff Squyres
b2b781e537
Fix a few miscellaneous memory leaks.
...
This commit was SVN r24865.
2011-07-08 16:39:58 +00:00
Mike Dubman
fd17f20ed5
Currently MTLs do not handle communicator contexts in any special way,
...
they only add the context id to the tag selection of the underlying
messaging mechanism.
We would like to enable an MTL to maintain its own context data
per-communicator. This way an MTL will be able to queue incoming eager
messages and rendezvous requests on a per-communicator basis.
The MTL will be allowed to override comm->c_pml_comm member,
since it's unused in pml_cm anyway.
This commit was SVN r24858.
2011-07-06 18:25:49 +00:00
Shiqing Fan
1ed0f40d35
Fix a few type casts on Windows.
...
This commit was SVN r24857.
2011-07-06 08:08:53 +00:00
Yevgeny Kliteynik
4fbe68dd86
Removing trailing white spaces in all the openib btl code.
...
This commit was SVN r24855.
2011-07-04 14:00:41 +00:00
Yevgeny Kliteynik
5cae33503d
Replacing the weird non-ASCII sign with '*'
...
This commit was SVN r24854.
2011-07-04 13:39:38 +00:00
Brian Barrett
a4b2bd903b
* Implement long-ago discussed RFC to add a callback data pointer in the
...
request completion callback
* Use the completion callback pointer to remove all need for opal_progress
calls in the one-sided layer
This commit was SVN r24848.
2011-06-30 20:05:16 +00:00
Rolf vandeVaart
e6295159ae
Fix compilation of file due to some changes in btl structure.
...
This commit was SVN r24847.
2011-06-30 19:22:41 +00:00
Yevgeny Kliteynik
b05211148d
Supporting dynamic SL ( #2674 )
...
- Added enable/disable configuration parameter for dynamic SL
- All the dynamic SL code is conditionalized
- Removed libibmad dependency
- Using only one include - ib_types.h (part of opensm-devel package)
- Removed all the macro and data types definitions, using the
existing definitions from ib_types.h instead
- general cleaning here and there
The async mode is not implemented yet - stay tuned...
This commit was SVN r24830.
2011-06-28 14:28:29 +00:00
Wesley Bland
84be81df95
Standardize the initialization of the EPOCHs.
...
Everyone will be starting at MIN anyway (until we implement restart of course)
so there's no reason to set the epoch to INVALID and then immediately reset them
to MIN. This way there's less room to make mistakes later.
This commit was SVN r24829.
2011-06-28 14:20:33 +00:00
Wesley Bland
e1ba09ad51
Add resilience to ORTE. Allows the runtime to continue after a process (or
...
ORTED) failure. Note that more work will be necessary to allow the MPI layer to
take advantage of this.
Per RFC:
http://www.open-mpi.org/community/lists/devel/2011/06/9299.php
This commit was SVN r24815.
2011-06-23 20:38:02 +00:00
Brian Barrett
e8817f3f63
* Don't send acks for expected triggered messages; still need to get the rest of the data
...
* Don't ask for UNLINK events for persistent long unexpected ME or the get MEs.
This commit was SVN r24814.
2011-06-23 16:21:10 +00:00
Samuel Gutierrez
81f38b258a
commit of new shared memory backing facility framework (shmem) and its components.
...
This commit was SVN r24795.
2011-06-21 15:41:57 +00:00
Jeff Squyres
3d8ef08912
Minor updates.
...
This commit was SVN r24791.
2011-06-20 17:59:37 +00:00
Jeff Squyres
c4f9debe21
Fix some names -- PTLs died a long time ago!
...
This commit was SVN r24787.
2011-06-20 17:28:27 +00:00
George Bosilca
65661a3cb4
Don't use a temporary string.
...
This commit was SVN r24786.
2011-06-20 09:29:19 +00:00
Brian Barrett
09d89242d6
Crank up the number of short receive blocks so that we're unlikely to hit the flow
...
control case. Uses about the same amount of memory as the Portals 3.3 implementations
This commit was SVN r24782.
2011-06-16 21:58:53 +00:00
Brian Barrett
4fec0c198d
update short recv blocks to be properly set up for triggered operations (where
...
they also store the triggered start message)
This commit was SVN r24777.
2011-06-16 16:51:59 +00:00
Brian Barrett
83154af74d
Check return codes a bit more closely
...
Fix broken debug output in any_source recv case
Other minor code cleanups
This commit was SVN r24774.
2011-06-13 15:18:55 +00:00
Edgar Gabriel
0173a00f6b
replace the switch-case statement on the basic datatypes by a series of
...
if-elseif statements to make it compile with Open MPI again.
Fixes trac:2808
This commit was SVN r24768.
The following Trac tickets were found above:
Ticket 2808 --> https://svn.open-mpi.org/trac/ompi/ticket/2808
2011-06-09 15:35:35 +00:00
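A sketch of why a switch-case on basic datatypes may fail to compile, presumably the problem this commit works around. In Open MPI, predefined datatype handles such as MPI_INT are pointers to objects rather than integer constants, and C case labels require integer constant expressions. The handle type and names below are hypothetical stand-ins so the example builds without MPI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pointer-based datatype handles, mimicking an implementation
 * where predefined datatypes are addresses of global objects. */
typedef struct {
    const char *name;
    size_t size;
} dtype_t;

static const dtype_t dtype_int_obj    = { "int",    sizeof(int) };
static const dtype_t dtype_float_obj  = { "float",  sizeof(float) };
static const dtype_t dtype_double_obj = { "double", sizeof(double) };

#define DTYPE_INT    (&dtype_int_obj)
#define DTYPE_FLOAT  (&dtype_float_obj)
#define DTYPE_DOUBLE (&dtype_double_obj)

typedef enum { KIND_INT, KIND_FLOAT, KIND_DOUBLE, KIND_UNKNOWN } basic_kind_t;

/* `switch (t) { case DTYPE_INT: ... }` would not compile here because the
 * handles are pointers; a chain of pointer comparisons works with any
 * handle representation. */
basic_kind_t classify_dtype(const dtype_t *t)
{
    assert(t != NULL);
    if (t == DTYPE_INT) {
        return KIND_INT;
    } else if (t == DTYPE_FLOAT) {
        return KIND_FLOAT;
    } else if (t == DTYPE_DOUBLE) {
        return KIND_DOUBLE;
    }
    return KIND_UNKNOWN;
}
```

The if/else-if chain is portable across MPI implementations regardless of whether handles are integers or pointers, which is the design reason to prefer it over a switch.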
Rolf vandeVaart
610421a0da
Fix registration of common parameters in sm btl. This was broken by earlier checkin. Now we can adjust them via MCA parameters again and see the right values from ompi_info.
...
This commit was SVN r24763.
2011-06-09 13:57:46 +00:00
Brian Barrett
a7c682cdb0
Fix starting buffer point for triggered get. Should be after the eager part of the
...
message
This commit was SVN r24752.
2011-06-06 17:08:13 +00:00
Rolf vandeVaart
d1fdbadc91
Fix broken basic allocator. Not sure how this ever worked.
...
This commit was SVN r24746.
2011-06-03 14:43:54 +00:00
Brian Barrett
b778d785fb
Add some debugging output and fix some places where the output id and
...
verbosity level were swapped
This commit was SVN r24740.
2011-06-01 17:20:18 +00:00
Brian Barrett
37d5c7e2ca
* Add ability to set long protocol with MCA parameter
...
* Instead of static arrays of send/recv counts, put them in the endpoint
This commit was SVN r24735.
2011-05-26 21:53:39 +00:00
Brian Barrett
beb1bc70b2
* Add support for using modex to exchange NID/PID pairs when using Portals4.
...
Rather than try to support a bunch of lightweight environments like I did
with the Portals3 code, always use the "modex" and hack the grpcomm for
the SHMEM implementation to return the right nid/pid for a remote
process by "magic".
This commit was SVN r24733.
2011-05-25 22:10:27 +00:00
Ralph Castain
b47ec2ee87
Remove lingering references to opal_profile option
...
This commit was SVN r24709.
2011-05-18 18:27:29 +00:00
Ralph Castain
502cc0747f
My my... clean up a disconnect between the man pages and how we implemented comm_spawn_multiple. We allow an info key per executable. Also fix the -host and -add-host info keys - they are supposed to accept comma-separated lists.
...
This commit was SVN r24706.
2011-05-17 20:12:31 +00:00
Mike Dubman
36db9c6233
* updated copyrights
...
* added support for non-contig data layout in FCA
This commit was SVN r24702.
2011-05-16 14:43:11 +00:00
Brian Barrett
d8b7ea315e
First take at implementing rndv and triggered protocols
...
This commit was SVN r24699.
2011-05-13 05:57:16 +00:00
Brian Barrett
43902221cc
* Fix bad argument to PtlGet in long receive
...
* Fix bad params when configuring ME for long unexpected
This commit was SVN r24698.
2011-05-13 03:56:03 +00:00
Brian Barrett
8376e0e507
Use free list get instead of wait; this is a constrained resource that will never come back, as it scales with the number of windows and not some more dynamic resources...
...
This commit was SVN r24685.
2011-05-05 17:19:59 +00:00
Jeff Squyres
d1d2cd0a87
Make the description of mca_btl_openib_cq_size more accurately reflect
...
what it really is/does.
cmr:v1.5.4:kliteyn cmr:v1.4.4:reviewer=kliteyn
This commit was SVN r24684.
2011-05-05 13:10:11 +00:00
Brian Barrett
3d4b7ecbaf
updates for API changes
...
This commit was SVN r24628.
2011-04-20 16:48:27 +00:00
Jeff Squyres
25a8944e09
Fixes trac:2776. Let the openib BTL auto-detect its bandwidth.
...
cmr:v1.5.4
This commit was SVN r24621.
The following Trac tickets were found above:
Ticket 2776 --> https://svn.open-mpi.org/trac/ompi/ticket/2776
2011-04-19 16:31:36 +00:00