Commit graph

3678 commits

Author SHA1 Message Date
Mike Dubman
7595a80a63 fix self pid
This commit was SVN r25424.
2011-11-03 06:46:20 +00:00
Nathan Hjelm
211e2dbdf3 clean up tab characters
This commit was SVN r25413.
2011-11-02 15:07:57 +00:00
Ralph Castain
14966e0f8f Cleanup PMI startup - if a component isn't selected, it should finalize PMI IFF it started it. Otherwise, components that aren't selected can finalize PMI when it is in use by other parts of the system.
This commit was SVN r25407.
2011-11-01 16:25:12 +00:00
Mike Dubman
3edd77ea25 update mxm plugin to mxm api change: pass synchronous request as an opcode instead of a flag
This commit was SVN r25403.
2011-10-31 22:36:15 +00:00
Mike Dubman
6b50ba22a6 select mxm ptl based on user preferences
This commit was SVN r25401.
2011-10-31 10:17:43 +00:00
Samuel Gutierrez
0ba13e2f8e fix typo. use PMI_Initialized for init status instead of PMI_Init.
This commit was SVN r25378.
2011-10-27 22:41:50 +00:00
Nathan Hjelm
ee087de073 added fast boxes to vader
This commit was SVN r25376.
2011-10-27 20:22:46 +00:00
Mike Dubman
f96ae43e23 pass jobid to mxm/sm module
This commit was SVN r25375.
2011-10-27 13:14:52 +00:00
Nathan Hjelm
82efe131dc made btl_vader_max_inline_send a configurable parameter and updated and enabled sendi
This commit was SVN r25374.
2011-10-26 22:15:42 +00:00
Nathan Hjelm
033179d6ac fixed bug in frag initialization
This commit was SVN r25373.
2011-10-26 19:29:37 +00:00
George Bosilca
6fdb040eef ORTE_ERROR to OPAL_ERROR.
This commit was SVN r25372.
2011-10-26 15:59:43 +00:00
George Bosilca
9d8e84142f Survivor!!!
This commit was SVN r25371.
2011-10-26 00:58:55 +00:00
Nathan Hjelm
05114ffb51 fixed off by one error
This commit was SVN r25369.
2011-10-25 22:07:47 +00:00
George Bosilca
72f731f25f The SM2 collective component has not been updated in a long
time. Rich, the original developer, agrees with this removal.

This commit was SVN r25368.
2011-10-25 22:07:09 +00:00
Nathan Hjelm
e887d595c7 fix potential bug with non-contiguous sends
This commit was SVN r25367.
2011-10-25 19:21:45 +00:00
Nathan Hjelm
433cfa3665 use single copy for some sends
This commit was SVN r25365.
2011-10-25 18:38:42 +00:00
Mike Dubman
9ffeeb69d9 fix help message
This commit was SVN r25364.
2011-10-25 14:02:43 +00:00
Samuel Gutierrez
663f4546f5 fix define typo in psm mtl.
This commit was SVN r25362.
2011-10-24 18:38:12 +00:00
Ralph Castain
955d8e7d46 Allow apps to use pmi when launched by mpirun, if desired, without affecting daemons
This commit was SVN r25359.
2011-10-23 15:57:13 +00:00
Nathan Hjelm
fb19f56965 Cray doesn't define PMI2_SUCCESS
This commit was SVN r25354.
2011-10-21 16:34:22 +00:00
Nathan Hjelm
cd68dbe2b8 only try to build vader if xpmem is installed. unignore vader
This commit was SVN r25352.
2011-10-21 15:45:05 +00:00
Ralph Castain
3e72fccacf Cray's PMI implementation is quite different from slurm's - they extended PMI-1 by adding some, but not all, of the PMI-2 APIs. So you can't just switch to using PMI-2 functions as it isn't a complete implementation. Instead, you have to selectively figure out which ones they have in PMI-2, and use any missing ones from PMI-1. What fun.
Modify the configure logic and the PMI components to accommodate Cray's approach. Refactor the PMI error reporting code so it resides in only one place. Cray actually decided -not- to define the PMI-2 error codes, so we have to use the PMI-1 codes instead. More fun.

This commit was SVN r25348.
2011-10-21 04:54:38 +00:00
Ralph Castain
e2adc8fa3a Ignore until Nathan can fix - probably configure problem
This commit was SVN r25347.
2011-10-21 03:43:01 +00:00
Ralph Castain
5947f61b86 Remove windows reference for now
This commit was SVN r25346.
2011-10-21 01:19:03 +00:00
Nathan Hjelm
414677a082 default to no xpmem support
This commit was SVN r25345.
2011-10-20 22:13:45 +00:00
Nathan Hjelm
808a73a5c5 removed erroneous add of .deps
This commit was SVN r25343.
2011-10-20 21:41:51 +00:00
Nathan Hjelm
3dbaaf6879 initial commit of vader (xpmem) btl
This commit was SVN r25342.
2011-10-20 21:39:44 +00:00
Nathan Hjelm
586403f052 more pmi return code wtf
This commit was SVN r25337.
2011-10-20 17:53:04 +00:00
Nathan Hjelm
e1e8837992 add a uintptr_t to the seg_key union
This commit was SVN r25334.
2011-10-19 21:48:52 +00:00
George Bosilca
1bc5da0911 These are supposed to be OPAL level errors.
This commit was SVN r25326.
2011-10-19 14:22:09 +00:00
Ralph Castain
72a4b0bd8a Fix constants
This commit was SVN r25325.
2011-10-19 14:14:58 +00:00
George Bosilca
efd88e10d7 Cleanup the error codes. Get rid of all the useless ones, and
mark the distinction between ORTE and OMPI errors.

This commit was SVN r25323.
2011-10-19 03:51:53 +00:00
Ralph Castain
0bf4f48aa3 Don't need priority in this framework
This commit was SVN r25308.
2011-10-17 22:39:15 +00:00
Ralph Castain
8f0ef54130 Complete implementation of pmi support. Ensure we support both mpirun and direct launch within same configuration to avoid requiring separate builds. Add support for generic pmi, not just under slurm. Add publish/subscribe support, although slurm's pmi implementation will just return an error as it hasn't been done yet.
This commit was SVN r25303.
2011-10-17 20:51:22 +00:00
Ralph Castain
e7f6be5385 Unused variable
This commit was SVN r25301.
2011-10-17 18:59:22 +00:00
Ralph Castain
2eaadcfab9 Remove unused variable
This commit was SVN r25284.
2011-10-14 15:32:18 +00:00
Vishwanath Venkatesan
8dd07bdceb Removing .ompi_ignore and .ompi_unignore from fs/pvfs2 and fbtl/pvfs2
This commit was SVN r25283.
2011-10-14 00:40:11 +00:00
Vishwanath Venkatesan
8f6b29e95b Fixing the default file view issue and merging contiguous lengths and offsets
for explicit offset case.

This commit was SVN r25281.
2011-10-13 19:50:45 +00:00
Jeff Squyres
2c6254b70d Second change from Intel.
This commit was SVN r25279.
2011-10-12 23:26:34 +00:00
Jeff Squyres
28118d0611 Update the parameters for the Intel iWARP devices, per request from
Faisal Latif <faisal.latif@intel.com>.

This commit was SVN r25278.
2011-10-12 22:58:30 +00:00
Brian Barrett
d8b5b544ad Update list name to match change in spec
This commit was SVN r25273.
2011-10-12 20:09:39 +00:00
Rainer Keller
4e6a6fc146 - Check whether the compiler supports __builtin_clz (count leading
   zeroes); if so, use it for bit operations like opal_cube_dim and opal_hibit.
   Implement two versions of power-of-two.
   In case of opal_next_poweroftwo, this reduces the average execution
   time from 83 cycles to 4 cycles (Intel Nehalem, icc, -O2, inlining,
   measured rdtsc, with loop over 2^27 values).
   Numbers for other functions are similar (but of course heavily depend
   on the usage, e.g. opal_hibit() with a start of 4 does not save
   much).  The bsr instruction on AMD Opteron is also not as fast.

 - Replace various places where the next power-of-two is computed.
   
   Tested on Intel Nehalem Cluster with openib, compilers GNU-4.6.1 and
   Intel-12.0.4 using mpi_testsuite -t "Collective" with 128 processes.

This commit was SVN r25270.
2011-10-11 22:49:01 +00:00
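The speed-up described in this commit comes from replacing a shift loop with a single count-leading-zeros instruction. A minimal sketch of the idea follows. The function names `next_pow2_loop` and `next_pow2_clz` are hypothetical and are not the Open MPI implementations of opal_next_poweroftwo; the sketch also assumes "next power of two" means the smallest power of two greater than or equal to the argument, which may differ from the semantics in the tree.

```c
#include <assert.h>
#include <limits.h>

/* Portable fallback: double the candidate until it reaches v. */
static unsigned next_pow2_loop(unsigned v)
{
    unsigned p = 1;
    /* Precondition: the result must be representable in an unsigned. */
    assert(v >= 1 && v <= (UINT_MAX >> 1) + 1);
    while (p < v) {
        p <<= 1;
    }
    return p;
}

/* Builtin-based variant: one clz instead of a loop (GCC/Clang only). */
static unsigned next_pow2_clz(unsigned v)
{
    assert(v >= 1 && v <= (UINT_MAX >> 1) + 1);
#if defined(__GNUC__) || defined(__clang__)
    if (v == 1) {
        return 1; /* avoid __builtin_clz(0), which is undefined */
    }
    /* Bits needed to represent v - 1 give the exponent of the result. */
    return 1u << (sizeof(unsigned) * CHAR_BIT - (unsigned)__builtin_clz(v - 1));
#else
    return next_pow2_loop(v); /* no builtin available: use the loop */
#endif
}
```

Both variants should agree on every valid input, so the loop version doubles as a reference when testing the builtin path on a new compiler.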
George Bosilca
74c88a9e48 This was never used (sm_ctl_header).
This commit was SVN r25267.
2011-10-11 20:37:00 +00:00
George Bosilca
ca6c282f23 Small cleanups in the SM BTL.
This commit was SVN r25266.
2011-10-11 20:32:10 +00:00
George Bosilca
3241bea696 Apply a patch provided by Sébastien Boisvert fixing an issue
with the probe fairness.

This commit was SVN r25265.
2011-10-11 20:28:33 +00:00
George Bosilca
4fd78c4683 Keep track of the last probe on each communicator, so we can probe all
peers in a round-robin fashion. A little bit more fair ...

This commit was SVN r25264.
2011-10-11 20:24:54 +00:00
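The round-robin idea in this commit can be sketched as follows: remember which peer matched last and start the next scan just after it, so that a busy low-ranked peer cannot starve the others. The struct `probe_state_t`, the function `probe_round_robin`, and the `pending` array are hypothetical illustrations, not the data structures used in the PML.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-communicator probe state. */
typedef struct {
    size_t num_peers;   /* number of peers in the communicator */
    size_t last_probed; /* index of the peer that matched most recently */
} probe_state_t;

/*
 * Scan all peers once, starting with the one after last_probed.
 * pending[i] != 0 means peer i has a message waiting.
 * Returns the matching peer index, or -1 if nothing is pending.
 */
static int probe_round_robin(probe_state_t *st, const int *pending)
{
    assert(st->num_peers > 0 && st->last_probed < st->num_peers);
    for (size_t i = 1; i <= st->num_peers; i++) {
        size_t peer = (st->last_probed + i) % st->num_peers;
        if (pending[peer]) {
            st->last_probed = peer; /* next scan starts after this peer */
            return (int)peer;
        }
    }
    return -1;
}
```

With every peer pending, repeated calls visit the peers in turn instead of always returning the lowest index.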
George Bosilca
2fefd3a928 Don't forget to move the pointer back by the true_lb.
This commit was SVN r25262.
2011-10-11 20:15:49 +00:00
George Bosilca
649af6c925 Enumerated mixed with another type (int) is tolerated but
easily fixable.

This commit was SVN r25241.
2011-10-09 03:54:52 +00:00
George Bosilca
07f6ce235f Return an OMPI_ error not an ORTE_.
This commit was SVN r25232.
2011-10-04 14:57:24 +00:00
George Bosilca
ce7935c8fa Obviously these were not needed.
This commit was SVN r25231.
2011-10-04 14:56:34 +00:00
George Bosilca
80c02647c8 Each level (OPAL/ORTE/OMPI) should only return its own constants,
instead of the current mismatch.

This commit was SVN r25230.
2011-10-04 14:50:31 +00:00
Mike Dubman
7a9ae43276 added support for shared memory transport in mxm
This commit was SVN r25220.
2011-10-03 12:59:55 +00:00
Brian Barrett
fc29ffebdb * remove two aborts that aren't necessary
This commit was SVN r25214.
2011-09-29 22:27:23 +00:00
Brian Barrett
14f32a1a54 * Clean up progress function
* Only print returnable errors when verbose=1.  Still print errors when
  we're going to abort, since those obviously aren't returnable

This commit was SVN r25213.
2011-09-29 22:26:33 +00:00
Brian Barrett
758f8a4d87 * More debugging output
* Make recv short block events use the callback mechanism so that can
  add overflow debugging

This commit was SVN r25212.
2011-09-29 21:59:48 +00:00
Brian Barrett
c08ea5c0f5 Set options correctly for the two pts
This commit was SVN r25211.
2011-09-29 21:56:37 +00:00
Brian Barrett
05f800abae Properly unpack data for long unexpected
This commit was SVN r25210.
2011-09-29 17:25:45 +00:00
Rolf vandeVaart
3d8c6b83a9 Make some error messages more helpful
This commit was SVN r25209.
2011-09-29 16:32:46 +00:00
Brian Barrett
bb9e73232a * Leverage hdr_data and opcount to improve debugging
* Clean up handling of short synchronous messages

This commit was SVN r25208.
2011-09-28 21:18:47 +00:00
Brian Barrett
71d8300607 * Fix name clash with macros in mtl_portals4.h
* hdr_data now includes opcount and length for all messages, which is the match
  bits for long and rndv messages
* Re-add probe implementation 

This commit was SVN r25207.
2011-09-28 16:53:01 +00:00
Brian Barrett
2fb8045fad clean up printfs
This commit was SVN r25206.
2011-09-28 15:28:46 +00:00
Brian Barrett
26e781f002 * Remove triggered code for now
* Move from per-endpoint send/recv count to just send side op count

This commit was SVN r25205.
2011-09-28 15:25:39 +00:00
Brian Barrett
592c1ab6db * revert probe and size information changes, since it seems to break everything
This commit was SVN r25204.
2011-09-28 14:57:19 +00:00
Brian Barrett
211b5c7824 * Make triggered protocol only work for non-wildcard receives
* Always encode length in header data to make probe work
* General send/receive cleanups
* Implement iprobe

This commit was SVN r25197.
2011-09-27 22:45:00 +00:00
Brian Barrett
77c560be42 updates to match new api changes
This commit was SVN r25196.
2011-09-27 20:38:22 +00:00
Brad Benton
0f2475c554 Modified set_remote_info() to use memmove() instead of memcpy() when
copying rem_qp info.  This avoids potential errors when src & dest overlap.
This is a workaround for the issue in #2871

This commit was SVN r25180.
2011-09-26 20:07:36 +00:00
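The reason for the switch is that memcpy() has undefined behavior when the source and destination regions overlap, while memmove() is defined for that case. A small illustration, using a hypothetical helper `shift_left` that is unrelated to set_remote_info():

```c
#include <assert.h>
#include <string.h>

/*
 * Drop the first `shift` bytes of buf by moving the remainder to the front.
 * Source and destination overlap whenever shift < len - shift, so memmove()
 * is required here; memcpy() would be undefined behavior.
 */
static void shift_left(char *buf, size_t len, size_t shift)
{
    assert(shift <= len);
    memmove(buf, buf + shift, len - shift);
}
```

When the two regions are known never to overlap, memcpy() remains valid; memmove() is the safe choice when that cannot be guaranteed.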
Vishwanath Venkatesan
2ee2b478d8 Modifying selection logic to select dynamic for cases where two_phase
fails.

This commit was SVN r25171.
2011-09-20 21:57:23 +00:00
Pavel Shamis
29c4981caa Removing unused include from openib/ofud btls.
This include causes compilation failure on macos platform.

This commit was SVN r25170.
2011-09-20 19:25:59 +00:00
Rolf vandeVaart
0749a220e8 Add support for MPI_IN_PLACE to MPI_Exscan. Required for MPI 2.2 compliance.
Reviewed by Jeff Squyres.  This fixes trac:2221.

This commit was SVN r25165.

The following Trac tickets were found above:
  Ticket 2221 --> https://svn.open-mpi.org/trac/ompi/ticket/2221
2011-09-20 14:54:41 +00:00
Nathan Hjelm
98b56108c4 add unconnect datagram connection manager (udcm)
This commit was SVN r25160.
2011-09-19 21:24:58 +00:00
Nathan Hjelm
8cd550f49f fixed error in last commit
This commit was SVN r25159.
2011-09-19 21:13:59 +00:00
Nathan Hjelm
de950959ee print a more meaningful error message when ibv_create_qp fails
This commit was SVN r25158.
2011-09-19 21:12:22 +00:00
Josh Hursey
2d25d70a1c Missing header for opal_timer_base_get_cycles
This commit was SVN r25157.
2011-09-19 19:52:58 +00:00
George Bosilca
9687e7f38e This commit fixes trac:2679 and should be added to cmr:v1.4:reviewer=jsquyres
and cmr:v1.5:reviewer=jsquyres

This commit was SVN r25155.

The following Trac tickets were found above:
  Ticket 2679 --> https://svn.open-mpi.org/trac/ompi/ticket/2679
2011-09-18 00:58:26 +00:00
Steve Wise
e4629259f0 Update T4 openib btl defaults.
- add 2 new device ids.
- default rq depth to 64, which proved good for large runs.

This commit should be added to cmr:v1.4:reviewer=jsquyres and
cmr:v1.5:reviewer=jsquyres

This commit was SVN r25145.
2011-09-14 22:12:25 +00:00
Steve Wise
e5bba38434 Increase the rdmacm cpc address resolution timeout to 30 seconds.
Global rdmacm_resolve_timeout defaults to 1000 (1000 ms), which is way
too small for even a 16 node x 12 core iwarp cluster in the presence
of drops.  Bump up the default to 30000ms.

This commit fixes trac:2860 and should be added to cmr:v1.4:reviewer=jsquyres
and cmr:v1.5:reviewer=jsquyres

This commit was SVN r25144.

The following Trac tickets were found above:
  Ticket 2860 --> https://svn.open-mpi.org/trac/ompi/ticket/2860
2011-09-14 21:52:58 +00:00
Nathan Hjelm
3048ce043d permanently disable ibcm
This commit was SVN r25137.
2011-09-13 15:10:51 +00:00
Shiqing Fan
8bf5a61265 Fix another compile error for Windows.
This commit was SVN r25129.
2011-09-12 14:19:42 +00:00
Ralph Castain
b11f93a039 Check add_procs return value
This commit was SVN r25122.
2011-09-11 14:53:26 +00:00
Sylvain Jeaugey
002d39f345 Restored Bull vendor id for ConnectX card
This commit was SVN r25121.
2011-09-07 15:58:42 +00:00
Edgar Gabriel
4bc2e9b023 fix a typo and add an actual pvfs function in the configure link-test.
This commit was SVN r25120.
2011-09-07 15:46:41 +00:00
Edgar Gabriel
196c3819e2 - revamp the configure logic to detect pvfs2 and lustre
- slight change in the selection logic of the fs module, which makes
   the ompio independent of the file system type (otherwise ompio 
   would also have required a configure script).

This commit was SVN r25118.
2011-09-07 10:39:47 +00:00
Mike Dubman
29b63fee81 better support of pml/cm for mxm
This commit was SVN r25113.
2011-09-06 06:38:57 +00:00
Shiqing Fan
16193771ba Add one missing header file. Fix the MTT build for Windows.
This commit was SVN r25112.
2011-08-31 13:15:05 +00:00
Rainer Keller
9d5afc58c6 - Fix breakage of the epoch changes with PGI:
   Don't just include pre-processor macros between two strings ("s1" #if 0 ... "s2")...
   Rather print out the epoch as 0 always...

This commit was SVN r25110.
2011-08-31 08:40:31 +00:00
Wesley Bland
4e7ff0bd5e By popular demand the epoch code is now disabled by default.
To enable the epochs and the resilient orte code, use the configure flag:

--enable-resilient-orte

This will define both:

ORTE_ENABLE_EPOCH
ORTE_RESIL_ORTE

This commit was SVN r25093.
2011-08-26 22:16:14 +00:00
Edgar Gabriel
9b6cf80074 - update the svn:ignore properties for some generated files
- add ompi_unignore files for the pvfs2 and lustre components to debug the
  configure problem

This commit was SVN r25090.
2011-08-26 13:23:29 +00:00
Edgar Gabriel
f46ef05c6e ompi_ignore some components that depend on the configure logic, since some
libs don't seem to propagate correctly under certain circumstances. Hopefully
this makes the nightly tests pass.

also, remove the files that should not have been committed in the first place
:-)

This commit was SVN r25085.
2011-08-26 00:49:32 +00:00
Ralph Castain
71e74990de Add missing includes so this compiles under Mac OSX
This commit was SVN r25084.
2011-08-25 23:04:24 +00:00
Edgar Gabriel
4d23ea19cb add a missing header file.
This commit was SVN r25081.
2011-08-25 21:28:28 +00:00
Edgar Gabriel
61ac1dbcf3 silence some warnings.
This commit was SVN r25080.
2011-08-25 21:22:34 +00:00
Edgar Gabriel
52063267df commit of the OMPIO modules and frameworks.
This commit was SVN r25079.
2011-08-25 20:08:17 +00:00
Mike Dubman
98f382ba0e fixes in mxm mtl
This commit was SVN r25066.
2011-08-19 22:18:17 +00:00
Shiqing Fan
627f1dd351 Correct several export declarations.
This commit was SVN r25047.
2011-08-15 09:45:51 +00:00
Jeff Squyres
1cbfb53801 r24976 wasn't quite right -- you now actually get a warning if you
specify btl_tcp_if_include because btl_tcp_if_exclude is defaulted to
the loopback devices.

This commit does a few things:

 * Introduce a new OPAL MCA base function:
   mca_base_param_check_exclusive_string().  It checks to see that the
   ''user'' does not set two MCA parameters that are mutually
   exclusive by checking the source of those MCA param values.
 * Use the above function in many BTLs (and the OOB TCP) to ensure
   that <foo>_if_include and <foo>_if_exclude are not both specified
   ''by the user''.
 * Re-arrange many of these BTLs to move their MCA registration code
   into a separate component_register() function (vs. the
   component_open() function).

This code has been nominally reviewed and checked by Ralph, George,
Terry, and Shiqing.

This commit was SVN r25043.

The following SVN revision numbers were found above:
  r24976 --> open-mpi/ompi@8f4ac54336
2011-08-10 17:24:36 +00:00
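The source-based check described in this commit can be sketched as below: two parameters only conflict when both values were set by the user, so a built-in default for the exclude list does not trigger the warning. The types `param_source_t` and `param_t` and the function `params_conflict` are hypothetical; this is not the signature or implementation of mca_base_param_check_exclusive_string().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of where a parameter value came from. */
typedef enum { SOURCE_DEFAULT, SOURCE_USER } param_source_t;

typedef struct {
    const char *value;     /* current value of the parameter */
    param_source_t source; /* who set it: built-in default or the user */
} param_t;

/*
 * Two mutually exclusive parameters conflict only if the user set both.
 * A defaulted value (e.g. an exclude list defaulting to loopback) is fine.
 */
static bool params_conflict(const param_t *a, const param_t *b)
{
    assert(a != NULL && b != NULL);
    return a->source == SOURCE_USER && b->source == SOURCE_USER;
}
```

Under this rule, a user-supplied include list combined with a defaulted exclude list passes the check, which is the case the earlier revision got wrong.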
Mike Dubman
e3c869d83b fix double free
This commit was SVN r25041.
2011-08-10 05:47:55 +00:00
Mike Dubman
a751cd93d3 improve debug macro availability
This commit was SVN r25022.
2011-08-09 10:54:08 +00:00
Mike Dubman
bfd75de6f9 fix selection logic: if no suitable device is found, disqualify mxm without complaints.
This commit was SVN r25021.
2011-08-09 07:09:37 +00:00
Wesley Bland
09274cd047 Make sure that the epoch is initialized everywhere so we don't get weird output
during valgrind. This shouldn't have caused any problems with any actual
execution. Just extra warnings in valgrind.

This commit was SVN r25015.
2011-08-08 15:11:55 +00:00
Mike Dubman
1d3f5e1314 better mxm selection mechanism, some refactoring
This commit was SVN r25005.
2011-08-07 12:06:49 +00:00
Yevgeny Kliteynik
7068dc64eb Dynamic SL rework:
 - Added dynamic SL support to xoob
 - Fixed seg fault in finalization
 - All the code has been moved to separate files: connect/btl_openib_connect_sl.{c,h}
 - Compilation of the new files is conditional

This commit was SVN r24991.
2011-08-04 20:26:08 +00:00
Rolf vandeVaart
3d3b3d4dad Add support for CUDA registering sm and openib buffers. Feature is disabled by default.
This commit was SVN r24987.
2011-08-04 10:15:45 +00:00
Mike Dubman
7b18ab2fa9 remove unused includes
This commit was SVN r24980.
2011-08-03 07:07:29 +00:00
Mike Dubman
45ea375531 code and readme updates, some refactoring
This commit was SVN r24977.
2011-08-02 14:30:11 +00:00
Jeff Squyres
8f4ac54336 Fixes trac:2838: add a warning message and disqualify the TCP BTL if both
btl_tcp_if_include and btl_tcp_if_exclude are specified. 

This commit was SVN r24976.

The following Trac tickets were found above:
  Ticket 2838 --> https://svn.open-mpi.org/trac/ompi/ticket/2838
2011-08-01 23:30:33 +00:00
Yevgeny Kliteynik
c1ab24c687 openib: added Mellanox ConnectX3 device ID to the device parameters ini file
This commit was SVN r24947.
2011-07-26 12:06:43 +00:00
Mike Dubman
aefffa073d initial implementation of MXM MTL layer
This commit was SVN r24946.
2011-07-26 04:36:21 +00:00
Mike Dubman
96ef2fc0e4 fix handling datatypes which have a gap in the beginning
This commit was SVN r24936.
2011-07-25 06:30:09 +00:00
Terry Dontje
fbda6aaf89 Fixes trac:2532 issues with 32-bit binaries
This commit was SVN r24891.

The following Trac tickets were found above:
  Ticket 2532 --> https://svn.open-mpi.org/trac/ompi/ticket/2532
2011-07-13 16:38:03 +00:00
Jeff Squyres
51ac69b05f Remove a now-nonexistent file
This commit was SVN r24874.
2011-07-11 23:51:41 +00:00
Abhishek Kulkarni
5501f83fb5 shmem fixes to make the trunk build with C/R flags on.
This commit was SVN r24871.
2011-07-10 23:32:23 +00:00
Jeff Squyres
b2b781e537 Fix a few miscellaneous memory leaks.
This commit was SVN r24865.
2011-07-08 16:39:58 +00:00
Mike Dubman
fd17f20ed5 Currently MTLs do not handle communicator contexts in any special way;
they only add the context id to the tag selection of the underlying
messaging mechanism.
 
 We would like to enable an MTL to maintain its own context data
per-communicator. This way an MTL will be able to queue incoming eager 
messages and rendezvous requests per-communicator basis.

 The MTL will be allowed to override comm->c_pml_comm member, 
since it's unused in pml_cm anyway. 

This commit was SVN r24858.
2011-07-06 18:25:49 +00:00
Shiqing Fan
1ed0f40d35 Fix a few type casts on Windows.
This commit was SVN r24857.
2011-07-06 08:08:53 +00:00
Yevgeny Kliteynik
4fbe68dd86 Removing trailing white spaces in all the openib btl code.
This commit was SVN r24855.
2011-07-04 14:00:41 +00:00
Yevgeny Kliteynik
5cae33503d Changing the weird non-ASCII sign with '*'
This commit was SVN r24854.
2011-07-04 13:39:38 +00:00
Brian Barrett
a4b2bd903b * Implement long-ago discussed RFC to add a callback data pointer in the
request completion callback
* Use the completion callback pointer to remove all need for opal_progress
  calls in the one-sided layer

This commit was SVN r24848.
2011-06-30 20:05:16 +00:00
Rolf vandeVaart
e6295159ae Fix compilation of file due to some changes in btl structure.
This commit was SVN r24847.
2011-06-30 19:22:41 +00:00
Yevgeny Kliteynik
b05211148d Supporting dynamic SL (#2674)
- Added enable/disable configuration parameter for dynamic SL
 - All the dynamic SL code is conditionalized
 - Removed libibmad dependency
 - Using only one include - ib_types.h (part of opensm-devel package)
 - Removed all the macro and data types definitions, using the
   existing definitions from ib_types.h instead
 - general cleaning here and there

The async mode is not implemented yet - stay tuned...

This commit was SVN r24830.
2011-06-28 14:28:29 +00:00
Wesley Bland
84be81df95 Standardize the initialization of the EPOCH's.
Everyone will be starting at MIN anyway (until we implement restart of course)
so there's no reason to set the epoch to INVALID and then immediately reset them
to MIN. This way there's less room to make mistakes later.

This commit was SVN r24829.
2011-06-28 14:20:33 +00:00
Wesley Bland
e1ba09ad51 Add resilience to ORTE. Allows the runtime to continue after a process (or
ORTED) failure. Note that more work will be necessary to allow the MPI layer to
take advantage of this.

Per RFC:
http://www.open-mpi.org/community/lists/devel/2011/06/9299.php

This commit was SVN r24815.
2011-06-23 20:38:02 +00:00
Brian Barrett
e8817f3f63 * Don't send acks for expected triggered messages; still need to get the rest of the data
* Don't ask for UNLINK events for persistent long unexpected ME or the get MEs.

This commit was SVN r24814.
2011-06-23 16:21:10 +00:00
Samuel Gutierrez
81f38b258a commit of new shared memory backing facility framework (shmem) and its components.
This commit was SVN r24795.
2011-06-21 15:41:57 +00:00
Jeff Squyres
3d8ef08912 Minor updates.
This commit was SVN r24791.
2011-06-20 17:59:37 +00:00
Jeff Squyres
c4f9debe21 Fix some names -- PTLs died a long time ago!
This commit was SVN r24787.
2011-06-20 17:28:27 +00:00
George Bosilca
65661a3cb4 Don't use a temporary string.
This commit was SVN r24786.
2011-06-20 09:29:19 +00:00
Brian Barrett
09d89242d6 Crank up the number of short receive blocks so that we're unlikely to hit the flow
control case.  Uses about same amount of memory as the Portals 3.3 implementations

This commit was SVN r24782.
2011-06-16 21:58:53 +00:00
Brian Barrett
4fec0c198d update short recv blocks to properly set up for triggered operations (where
they also store the triggered start message)

This commit was SVN r24777.
2011-06-16 16:51:59 +00:00
Brian Barrett
83154af74d Check return codes a bit more closely
Fix broken debug output in any_source recv case
Other minor code cleanups

This commit was SVN r24774.
2011-06-13 15:18:55 +00:00
Edgar Gabriel
0173a00f6b replace the switch-case statement on the basic datatypes by a series of
if-elseif statements to make it compile with Open MPI again.

Fixes trac:2808

This commit was SVN r24768.

The following Trac tickets were found above:
  Ticket 2808 --> https://svn.open-mpi.org/trac/ompi/ticket/2808
2011-06-09 15:35:35 +00:00
Rolf vandeVaart
610421a0da Fix registration of common parameters in sm btl. This was broken by earlier checkin. Now we can adjust them via MCA parameters again and see the right values from ompi_info.
This commit was SVN r24763.
2011-06-09 13:57:46 +00:00
Brian Barrett
a7c682cdb0 Fix starting buffer point for triggered get. Should be after the eager part of the
message

This commit was SVN r24752.
2011-06-06 17:08:13 +00:00
Rolf vandeVaart
d1fdbadc91 Fix broken basic allocator. Not sure how this ever worked.
This commit was SVN r24746.
2011-06-03 14:43:54 +00:00
Brian Barrett
b778d785fb Add some debugging output and fix some places where the output id and
verbosity level were swapped

This commit was SVN r24740.
2011-06-01 17:20:18 +00:00
Brian Barrett
37d5c7e2ca * Add ability to set long protocol with MCA parameter
* Instead of static arrays of send/recv counts, put them in the endpoint

This commit was SVN r24735.
2011-05-26 21:53:39 +00:00
Brian Barrett
beb1bc70b2 * Add support for using modex to exchange NID/PID pairs when using Portals4.
Rather than try to support a bunch of lightweight environments like I did
  with the Portals3 code, always use the "modex" and hack the grpcomm for
  the SHMEM implementation to return the right nid/pid for a remote
  process by "magic".

This commit was SVN r24733.
2011-05-25 22:10:27 +00:00
Ralph Castain
b47ec2ee87 Remove lingering references to opal_profile option
This commit was SVN r24709.
2011-05-18 18:27:29 +00:00
Ralph Castain
502cc0747f My my...cleanup a disconnect between the man pages and how we implemented comm_spawn_multiple. We allow an info key per executable. Also fix the -host and -add-host info keys - they are supposed to accept comma-separated lists.
This commit was SVN r24706.
2011-05-17 20:12:31 +00:00
Mike Dubman
36db9c6233 * updated copyrights
* added support for non-contig data layout in FCA

This commit was SVN r24702.
2011-05-16 14:43:11 +00:00
Brian Barrett
d8b7ea315e First take at implementing rndv and triggered protocols
This commit was SVN r24699.
2011-05-13 05:57:16 +00:00
Brian Barrett
43902221cc * Fix bad argument to PtlGet in long receive
* Fix bad params when configuring ME for long unexpected

This commit was SVN r24698.
2011-05-13 03:56:03 +00:00
Brian Barrett
8376e0e507 Use free list get instead of wait; this is a constrained resource that will never come back, as it scales with the number of windows and not some more dynamic resources...
This commit was SVN r24685.
2011-05-05 17:19:59 +00:00
Jeff Squyres
d1d2cd0a87 Make the description of mca_btl_openib_cq_size be more accurate of
what it really is/does.

cmr:v1.5.4:kliteyn cmr:v1.4.4:reviewer=kliteyn

This commit was SVN r24684.
2011-05-05 13:10:11 +00:00
Brian Barrett
3d4b7ecbaf updates for API changes
This commit was SVN r24628.
2011-04-20 16:48:27 +00:00
Jeff Squyres
25a8944e09 Fixes trac:2776. Let the openib BTL auto-detect its bandwidth.
cmr:v1.5.4

This commit was SVN r24621.

The following Trac tickets were found above:
  Ticket 2776 --> https://svn.open-mpi.org/trac/ompi/ticket/2776
2011-04-19 16:31:36 +00:00
Josh Hursey
045035963a Fix return code from MPI_Probe and MPI_Iprobe.
Instead of returning MPI_SUCCESS every time they are called regardless of the status of the call, they should return a value representative of the action. So similar to MPI_Wait/MPI_Test they will return MPI_SUCCESS if the action was successful, or the value that matches status.MPI_ERROR for the operation if it is unsuccessful.

This was discussed on the [http://www.open-mpi.org/community/lists/devel/2011/03/9109.php ompi-devel list]

This commit was SVN r24551.
2011-03-22 13:29:29 +00:00
Eugene Loh
2770a12beb Continue clean up of thread options started in r22841, 22842, and 22849.
No need for any CMRs to 1.5... that was already done in CMR 2728.

This commit was SVN r24545.

The following SVN revision numbers were found above:
  r22841 --> open-mpi/ompi@b400b84162
2011-03-18 21:36:35 +00:00
Jeff Squyres
82f9474fec Revert r24533 and r24507 until the compile errors can be fixed.
This commit was SVN r24541.

The following SVN revision numbers were found above:
  r24507 --> open-mpi/ompi@4ce1936fed
  r24533 --> open-mpi/ompi@3204af2d36
2011-03-18 13:33:02 +00:00
George Bosilca
13d2998d54 When the BTL TCP is trying to connect to a peer, output its process name
in addition to all the information.

This commit was SVN r24534.
2011-03-16 20:20:14 +00:00
Mike Dubman
3204af2d36 * temporary fix for ib btl compilation with old ofed versions 1.3.x.
This commit was SVN r24533.
2011-03-16 17:53:51 +00:00