
2934 Commits

Author SHA1 Message Date
Brian Barrett
c9e3654a85 Allow OMPI components to modify the link options for libmpi.so. This
functionality used to exist, but I removed it like a year ago because
it wasn't being used.  Well, now I need it :).

This commit was SVN r15901.
2007-08-17 03:52:53 +00:00
George Bosilca
b1082c95ff Remove all output. This final commit should [hopefully] solve all
problems with the integration with parallel debuggers.

This commit was SVN r15898.
2007-08-17 02:35:14 +00:00
George Bosilca
4dacd163cc Don't grab the information from the MPI_Status for the send requests.
This commit was SVN r15897.
2007-08-17 02:31:50 +00:00
George Bosilca
2086b7b445 Optimize the group creation. Don't create a new group if there is
already one containing the same nodes (useful for MPI_Comm_dup).

This commit was SVN r15896.
2007-08-17 02:19:34 +00:00
George Bosilca
7efffdb1da Be smart about parsing the communicators list. Based on the values of
lowest_free and number_free detect if the communicator list has changed.
If not, there is no reason to rebuild it, just use the old one.

This commit was SVN r15895.
2007-08-16 22:51:55 +00:00
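The change-detection idea described in the commit above can be sketched roughly as follows. This is only an illustration, not the Open MPI debugger code: the struct and function names are hypothetical, and only the `lowest_free` and `number_free` counters come from the commit message.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the two counters named in the commit message. */
typedef struct {
    int lowest_free;
    int number_free;
    bool valid;          /* false until the list has been built once */
} comm_list_cache_t;

/* Returns true when the cached communicator list can be reused,
 * i.e. neither counter has changed since the last rebuild. */
bool comm_list_is_current(const comm_list_cache_t *cache,
                          int lowest_free, int number_free)
{
    return cache->valid &&
           cache->lowest_free == lowest_free &&
           cache->number_free == number_free;
}

/* Records the counters observed at the time of a rebuild. */
void comm_list_mark_rebuilt(comm_list_cache_t *cache,
                            int lowest_free, int number_free)
{
    cache->lowest_free = lowest_free;
    cache->number_free = number_free;
    cache->valid = true;
}
```

A caller would check `comm_list_is_current()` first and rebuild the list, followed by `comm_list_mark_rebuilt()`, only when it returns false.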
George Bosilca
9cce0eb4bc Deal with int/size_t differences. This introduces some small problems
with the reported length of the receive requests, but I'll fix
it soon.

This commit was SVN r15893.
2007-08-16 21:02:24 +00:00
Brad Benton
c254645383 Fixes trac:1134.
Fixed a condition test while checking that all segments are empty.
Without this fix, a NULL segment pointer could make it past the
test, resulting in a SegV when dereferenced.

This commit was SVN r15891.

The following Trac tickets were found above:
  Ticket 1134 --> https://svn.open-mpi.org/trac/ompi/ticket/1134
2007-08-16 19:39:52 +00:00
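The kind of condition fix described in the commit above might look like the sketch below. The actual code is not shown in the log, so `segment_t` and `all_segments_empty` are hypothetical names; the point is only that the NULL test must come before the dereference.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical segment descriptor, for illustration only. */
typedef struct {
    size_t length;       /* bytes still held by this segment */
} segment_t;

/* True when every segment is either absent (NULL) or holds no data.
 * The NULL test is evaluated first, so && short-circuits and a NULL
 * pointer is never dereferenced. */
bool all_segments_empty(segment_t *const *segments, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        if (segments[i] != NULL && segments[i]->length != 0) {
            return false;
        }
    }
    return true;
}
```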
Brad Benton
1ddba9ec65 Lock the endpoint before doing endpoint_state processing. This ensures
that the subsequent unlock is valid.

This commit was SVN r15890.
2007-08-16 18:11:29 +00:00
Aurelien Bouteiller
3a83c61c40 Fixed a bug with available space in sender based.
This commit was SVN r15889.
2007-08-16 17:54:26 +00:00
George Bosilca
51e726ee8c Remove some old [and unused] code.
This commit was SVN r15887.
2007-08-16 17:06:17 +00:00
Mohamad Chaarawi
b18129c260 Fix the invalid process lookup caused by group_intersection
This commit was SVN r15886.
2007-08-16 17:01:01 +00:00
George Bosilca
1ae49b7143 Don't use a fprintf. Instead a plain print will do the job.
This commit was SVN r15883.
2007-08-16 14:55:44 +00:00
Tim Prins
5a795128af Change it so that different components in orte use unique rml tags
This commit was SVN r15881.
2007-08-16 14:02:35 +00:00
Jeff Squyres
4b7a6cd922 Fix the following compile error:
{{{
ompi/debuggers/ompi_dll.c:102: error: initializer element is not
constant
}}}

The fix is stupid and I suspect that we'll want to ''not'' print out
all this debugging information all the time.  But I'll leave that to
George to fix...  :-)

This commit was SVN r15880.
2007-08-16 11:51:06 +00:00
George Bosilca
33a73a88c6 Add a lot more information about the requests (pending, matched,
completed). Correctly detect the tag if a receive was matched.

This commit was SVN r15879.
2007-08-16 07:11:56 +00:00
Aurelien Bouteiller
77565d60d9 Heavy modification of the pml_v framework.
* Code cleanup and rationalization
* Fixed: mca_pml_base_send/recv_request are now allocated before recreation by the PML-V
* Fixed: pointer arithmetic bug in sender based that crashed 
* Changed: directory structure. This is one step toward using autogen.sh to build static-components.h (it needs to have the directory structure of a mca framework for this).

This commit was SVN r15878.
2007-08-16 05:52:30 +00:00
Aurelien Bouteiller
ee708d702d Slight modification to register the name of the selected pml (from the pml framework) instead of the generic mca name. This might be a different name when enabling FT features. This name modification in the modex allows the PMLs to detect an FT protocol mismatch among hosts.
This commit was SVN r15877.
2007-08-16 05:46:11 +00:00
George Bosilca
de4813359a The message queue is now back online. It depends heavily on
opal_list_t and ompi_free_list_t, so every time there is a modification
in one of these files (such as changing the way we allocate the
elements in the free list) the debugger interface has to
reflect these changes.

This commit was SVN r15876.
2007-08-16 04:33:04 +00:00
Aurelien Bouteiller
fa7f6f6722 Improved error detection of request types
This commit was SVN r15857.
2007-08-14 17:24:46 +00:00
Aurelien Bouteiller
67399e7c31 Added debug type checking for request types (to make sure request size is correctly computed).
This commit was SVN r15856.
2007-08-14 17:18:15 +00:00
Aurelien Bouteiller
1d97c183e7 Better argument checking for output function and added a routine for error printing.
This commit was SVN r15855.
2007-08-14 17:17:12 +00:00
Jeff Squyres
d7c5fea096 * Fix problem caused by r15848: the text parser was looking for
   semicolons but the new specification string used colons.  The text
   parser now looks for colons.
 * Changed all opal_output() error messages to
   much-more-helpful/descriptive opal_show_help() messages.
 * A few minor style/indenting fixes

This commit was SVN r15850.

The following SVN revision numbers were found above:
  r15848 --> open-mpi/ompi@dd30597f39
2007-08-14 14:46:13 +00:00
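As a rough illustration of the parser fix in the commit above, the sketch below splits a specification string on colons. The function name and the sample string in the usage note are hypothetical; the real format of the receive_queues value is not shown in this log.

```c
#include <stddef.h>
#include <string.h>

/* Splits `spec` in place on ':' and stores up to `max_fields` field
 * pointers in `fields`. Returns the number of fields stored; any text
 * beyond the last stored field is dropped. */
size_t split_on_colons(char *spec, char **fields, size_t max_fields)
{
    size_t n = 0;
    char *cursor = spec;

    while (cursor != NULL && n < max_fields) {
        char *sep = strchr(cursor, ':');
        if (sep != NULL) {
            *sep = '\0';     /* terminate the current field */
            ++sep;           /* the next field starts after the colon */
        }
        fields[n++] = cursor;
        cursor = sep;
    }
    return n;
}
```

Called on a writable buffer holding `"a,1:b,2:c"`, this yields three fields: `a,1`, `b,2`, and `c`. A parser that split on semicolons instead would see the whole string as a single field, which matches the symptom the commit describes.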
Jeff Squyres
dd30597f39 Change the default receive_queues value per
http://www.open-mpi.org/community/lists/devel/2007/08/2100.php.

This commit was SVN r15848.
2007-08-13 21:51:05 +00:00
Jelena Pjesivac-Grbovic
9bd9c92dbd Making sure that the decision function for scatter and gather correctly
computes everything for MPI_IN_PLACE case.

This commit was SVN r15841.
2007-08-13 17:35:50 +00:00
Brian Barrett
d166a2bb6d Change requested by Ralph -- Remove the dependency on GPR triggers for filling
in the OMPI proc structures.  For now, use an extension of the modex that is
keyed on strings.  Eventually, this should use the attribute put/get that is
part of the RSL interface.

This commit was SVN r15820.
2007-08-09 18:53:28 +00:00
Shiqing Fan
f84e919dc2 - one more to fix.
This commit was SVN r15814.
2007-08-09 13:53:10 +00:00
Shiqing Fan
7dc5dbd8ea - fix a few function export declarations.
This commit was SVN r15813.
2007-08-09 13:52:23 +00:00
Jelena Pjesivac-Grbovic
b558e820cb removing compiler warning
This commit was SVN r15803.
2007-08-08 15:22:01 +00:00
Jelena Pjesivac-Grbovic
daa10b277e modifying scatter decision function to use binomial algorithm for
small message sizes.

This commit was SVN r15798.
2007-08-07 22:16:13 +00:00
Mohamad Chaarawi
8c458b0ee7 removing unused variables that cause warnings.
This commit was SVN r15791.
2007-08-07 15:13:46 +00:00
Pak Lui
0790c4cc40 * Update the comment for the previous fix. Thanks to Gleb for pointing it out.
This commit was SVN r15790.
2007-08-07 14:40:13 +00:00
Jeff Squyres
50bae9c603 Bring in the modular-wireup stuff for the openib BTL (from
/tmp/jms-modular-wireup branch):

 * This commit moves all the openib BTL connection code out of
   btl_openib_endpoint.c and into a connect "pseudo-component" area,
   meaning that different schemes for making OFA connections can
   be chosen via function pointer (i.e., MCA parameter) at run-time.
 * The connect/connect.h file includes comments describing the
   specific interface for the connect pseudo-component.
 * Two pseudo-components are in this commit (more can certainly be
   added).
   * oob: use the same old oob/rml scheme for creating OFA connections
     that we've had forever; this now just puts the logic into this
     self-contained pseudo-component.
   * rdma_cm: a currently-empty set of functions (that currently
     return NOT_IMPLEMENTED) that will someday use the RDMA connection
     manager to make OFA connections.

This commit was SVN r15786.
2007-08-06 23:40:35 +00:00
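The "pseudo-component" idea in the commit above, where a connection scheme is chosen by name and reached through a function pointer, can be sketched as below. This is not the interface from connect/connect.h: every type, function, and constant name here is an assumption made for illustration, and only the scheme names `oob` and `rdma_cm` come from the commit message.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical return codes. */
enum { CONNECT_SUCCESS = 0, CONNECT_NOT_IMPLEMENTED = -1 };

/* Hypothetical scheme descriptor: a name plus one entry point. */
typedef struct {
    const char *name;
    int (*start_connect)(int endpoint_id);
} connect_scheme_t;

/* Stand-in for the existing oob/rml wire-up logic. */
int oob_start_connect(int endpoint_id)
{
    (void)endpoint_id;
    return CONNECT_SUCCESS;
}

/* Placeholder that is not implemented yet, as in the commit. */
int rdma_cm_start_connect(int endpoint_id)
{
    (void)endpoint_id;
    return CONNECT_NOT_IMPLEMENTED;
}

static const connect_scheme_t schemes[] = {
    { "oob",     oob_start_connect },
    { "rdma_cm", rdma_cm_start_connect },
};

/* Looks up a scheme by the name given in a run-time parameter.
 * Returns NULL when no scheme with that name exists. */
const connect_scheme_t *select_connect_scheme(const char *name)
{
    for (size_t i = 0; i < sizeof(schemes) / sizeof(schemes[0]); ++i) {
        if (strcmp(schemes[i].name, name) == 0) {
            return &schemes[i];
        }
    }
    return NULL;
}
```

With a table like this, adding another scheme means adding one entry and one function, and the endpoint code only ever calls through the selected pointer.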
Mohamad Chaarawi
96e132b11d typo in comment of the group struct...
This commit was SVN r15785.
2007-08-06 23:09:37 +00:00
Aurelien Bouteiller
ca69915b1e Code cleanup
This commit was SVN r15783.
2007-08-06 22:20:44 +00:00
Brian Barrett
69952d9603 Fix abort caused by calling PtlEQGet on an invalid eq, which could occur
if add_procs was never called.

This commit was SVN r15779.
2007-08-06 17:28:11 +00:00
Brian Barrett
1fb78a35f9 Back out part of r15756. The common_portals_utcp.c file is only used with
the Sandia reference implementation of Portals, and doesn't have the cnos
functions.  This file should never be compiled (and wasn't being compiled)
on the Cray machines, so doesn't need to be updated to support CNL.

This commit was SVN r15778.

The following SVN revision numbers were found above:
  r15756 --> open-mpi/ompi@755658694e
2007-08-06 17:21:00 +00:00
Sven Stork
9e2263f29f - fix a small memory leak
This commit was SVN r15768.
2007-08-06 13:35:32 +00:00
Mohamad Chaarawi
59a7bf8a9f Merging in the Sparse Groups..
This commit includes config changes..

This commit was SVN r15764.
2007-08-04 00:41:26 +00:00
George Bosilca
8baeadb761 The PTLs are now long gone ...
This commit was SVN r15763.
2007-08-04 00:37:52 +00:00
George Bosilca
78e2d3523b Remove some old and unused code. Update some of the comments.
This commit was SVN r15761.
2007-08-04 00:34:42 +00:00
George Bosilca
e41ee17ca5 Add a small comment that hopefully will enforce the correct ordering of
the fields between CM and the other PML in the requests structure.

This commit was SVN r15760.
2007-08-03 23:59:29 +00:00
Josh Hursey
755658694e Bring in changes to support Cray's Compute Node Linux (CNL) and
Application Level Placement Scheduler (ALPS).

This commit was tested under two Cray machines at ORNL: Jaguar (Catamount)
and Rizzo (CNL Test cage). Both machines performed as they should across
the commit.

It is likely that more changes will follow as the work and environment
stabilize.

Most of the infrastructure works the same for Catamount and CNL
except for a few bits. Below are the highlights:

Default IFACE Change:
 On Catamount we can use PTL_IFACE_DEFAULT, but the CNL system we have access
 to fails on this interface, so it should be set to:
    IFACE_FROM_BRIDGE_AND_NALID(PTL_BRIDGE_UK,PTL_IFACE_SS).
 So if we detect that we are running with YOD then use the former interface
 and if we detect that we are running with ALPS then use the latter.
 We will want to pursue a more elegant solution if this interface continues to 
 change across machines.

PtlGetId and cnos_register_ptlid:
 The header suggests that these should never be called when launching with YOD.
 But in the ALPS environment the cnos_barrier() will hang forever if these 
 functions are not called after PtlNIInit(). Since these functions only need to
 be called once, and the orte rmgr/cnos component is loaded before the ompi 
 common/portals component, just call these functions once in the rmgr/cnos
 component.

cnos_barrier_init():
 This is a noop for YOD, but critical for ALPS. So be sure to call it before
 calling the first barrier in the rmgr/cnos component.

cnos_barrier vs cnos_pm_barrier:
 It is suggested the cnos_pm_barrier only be used during finalization 
 as it will indicate to the launcher (yod or aprun) that the app is about
 to complete. It was suggested that we use the regular cnos_barrier() instead.
 I want to look into this a bit more to make sure there are not adverse
 side effects. A note has been placed in the code to indicate this reasoning.

This commit was SVN r15756.
2007-08-03 19:46:38 +00:00
Pak Lui
010d216db9 * Prevent users of 32-bit apps from specifying an sm_size between
2GB and 4GB-1 by using long instead of size_t for the sm size.
   * This keeps users from hitting a problem with ftruncate() in the
     common sm component (and possibly others): ftruncate() takes an
     off_t, which is a signed long integer. If we use an unsigned
     long, it fails with an invalid argument error (errno=22).
   * See trac #1117

This commit was SVN r15752.
2007-08-03 15:43:02 +00:00
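The range problem described in the commit above can be illustrated with a small check. This sketch assumes a 32-bit build where off_t is a signed 32-bit integer, as the commit message describes; the function name is hypothetical and this is not the check used in the sm component.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when `requested` bytes can be represented in a signed
 * 32-bit off_t, i.e. the value is at most 2GB-1. Sizes from 2GB up to
 * 4GB-1 fit in an unsigned 32-bit integer but not in a signed one, so
 * they would turn negative when passed to ftruncate(). */
bool sm_size_fits_32bit_off_t(uint64_t requested)
{
    return requested <= (uint64_t)INT32_MAX;
}
```

Storing the size in a signed long on such a build has the same effect as this check: values of 2GB and above cannot be represented, so they never reach ftruncate().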
Aurelien Bouteiller
1d160ca583 Needed change for vampir pml to work
This commit was SVN r15750.
2007-08-03 02:23:24 +00:00
Jeff Squyres
d3f008492f Introduce a new debugging MCA parameter:
mpi_show_mpi_alloc_mem_leaks

When activated, MPI_FINALIZE displays a list of memory allocations
from MPI_ALLOC_MEM that were not freed by MPI_FREE_MEM (in each MPI
process).

 * If set to a positive integer, display only that many leaks.
 * If set to a negative integer, display all leaks.
 * If set to 0, do not show any leaks.

This commit was SVN r15736.
2007-08-01 21:33:25 +00:00
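The three cases listed in the commit above amount to a small rule for how many leak records to print. The sketch below encodes just that rule; the function name is hypothetical and this is not the code from the commit.

```c
#include <stddef.h>

/* Number of leak records to display for a given value of the
 * mpi_show_mpi_alloc_mem_leaks parameter:
 *   0        -> show nothing
 *   negative -> show every leak
 *   positive -> show at most that many leaks */
size_t leaks_to_show(int param_value, size_t total_leaks)
{
    if (param_value == 0) {
        return 0;
    }
    if (param_value < 0) {
        return total_leaks;
    }
    return (size_t)param_value < total_leaks ? (size_t)param_value
                                             : total_leaks;
}
```

Assuming the usual `mpirun --mca <name> <value>` syntax for setting MCA parameters, a run with `--mca mpi_show_mpi_alloc_mem_leaks 5` would then list at most five unfreed allocations per process at MPI_FINALIZE.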
Jeff Squyres
0fb8cf65a8 If you have an HCA with no active ports, we still create an mpool.
This mpool will have no btl module owner because there was no btl created for
the HCA with no ports, but it will still be tracked in the mpool
framework (i.e., it's available).

If MPI_ALLOC_MEM is called by the app, one of two things will happen:

 1. if there's an HCA on the host with some active ports, the openib
    btl component will still be in the process space, and therefore
    the "mpool with no btl" (MWNB) module will still be able to call
    the reg/dereg functions, and all will be fine.  However, if
    MPI_FREE_MEM is never invoked to free the memory, bad things will
    happen during MPI_FINALIZE.  The pml is finalized, which finalizes
    all the btls.  The btls finalize all their mpools and all is fine.
    But later we close down the mpool framework which then finalizes
    any left over mpool modules, such as MWNB.  However, the openib
    BTL module functions that the MWNB was registered with are no
    longer in the process space, and it segv's while trying to deregister
    the memory.
 2. if there are *no* HCA's on the host with active ports, then the
    openib btl will have been unloaded, and when the MWNB tries to
    register the memory, the functions it tries to call (in the openib
    btl) are no longer there, and we segv.

This commit was SVN r15735.
2007-08-01 20:53:34 +00:00
Gleb Natapov
627d9bc8ed Delay freeing of a send request if the scheduling function is being run by another
thread.

This commit was SVN r15722.
2007-08-01 12:19:16 +00:00
Gleb Natapov
758f932aa6 Handle credit in a thread safe manner. I am sure more work will have to be done
in this area.

This commit was SVN r15721.
2007-08-01 12:15:43 +00:00
Gleb Natapov
9c20d67301 1) Return IB header to its previous size by using char for cm_seen field.
2) Allow the user to specify rd_win/rd_rsv parameters, but make them optional.

This commit was SVN r15719.
2007-08-01 12:10:56 +00:00
Aurelien Bouteiller
a403fed18a More checks (assert) on the output system so that a malformed format string does not crash the application at a later random time.
Changed various debug messages to retain the most useful messages

This commit was SVN r15715.
2007-07-31 19:33:39 +00:00