
10248 Commits

Author SHA1 Message Date
George Bosilca
629bacbb07 Don't include the atomic header file if we're building a non-threaded version.
This commit was SVN r15766.
2007-08-04 00:43:15 +00:00
George Bosilca
e2f6d69669 Only use one va_list, as it seems that only one is allowed.
This commit was SVN r15765.
2007-08-04 00:41:26 +00:00
Mohamad Chaarawi
59a7bf8a9f Merging in the Sparse Groups..
This commit includes config changes..

This commit was SVN r15764.
2007-08-04 00:41:26 +00:00
George Bosilca
8baeadb761 The PTLs are now long gone ...
This commit was SVN r15763.
2007-08-04 00:37:52 +00:00
George Bosilca
d658a477af Update the help file to match the real name of the required argument.
This commit was SVN r15762.
2007-08-04 00:35:55 +00:00
George Bosilca
78e2d3523b Remove some old and unused code. Update some of the comments.
This commit was SVN r15761.
2007-08-04 00:34:42 +00:00
George Bosilca
e41ee17ca5 Add a small comment that hopefully will enforce the correct ordering of
the fields between CM and the other PML in the requests structure.

This commit was SVN r15760.
2007-08-03 23:59:29 +00:00
Josh Hursey
f9a3f36f32 Add a note about CNL Support (r15756)
This commit was SVN r15758.

The following SVN revision numbers were found above:
  r15756 --> open-mpi/ompi@755658694e
2007-08-03 20:20:03 +00:00
Josh Hursey
755658694e Bring in changes to support Cray's Compute Node Linux (CNL) and
Application Level Placement Scheduler (ALPS).

This commit was tested under two Cray machines at ORNL: Jaguar (Catamount)
and Rizzo (CNL Test cage). Both machines performed as they should across
the commit.

It is likely that more changes will follow as the work and environment
stabilize.

Most of the infrastructure works the same for Catamount and CNL
except for a few bits. Below are the highlights:

Default IFACE Change:
 On Catamount we can use PTL_IFACE_DEFAULT, but the CNL system we have access
 to will fail on this interface, so it should be set to:
    IFACE_FROM_BRIDGE_AND_NALID(PTL_BRIDGE_UK,PTL_IFACE_SS).
 So if we detect that we are running with YOD then use the former interface
 and if we detect that we are running with ALPS then use the latter.
 We will want to pursue a more elegant solution if this interface continues to 
 change across machines.

PtlGetId and cnos_register_ptlid:
 The header suggests that these should never be called when launching with YOD.
 But in the ALPS environment the cnos_barrier() will hang forever if these 
 functions are not called after PtlNIInit(). Since these functions only need to
 be called once, and the orte rmgr/cnos component is loaded before the ompi
 common/portals component, just call these functions once in the rmgr/cnos
 component.

cnos_barrier_init():
 This is a noop for YOD, but critical for ALPS. So be sure to call it before
 calling the first barrier in the rmgr/cnos component.

cnos_barrier vs cnos_pm_barrier:
 It is suggested the cnos_pm_barrier only be used during finalization 
 as it will indicate to the launcher (yod or aprun) that the app is about
 to complete. It was suggested that we use the regular cnos_barrier() instead.
 I want to look into this a bit more to make sure there are not adverse
 side effects. A note has been placed in the code to indicate this reasoning.

This commit was SVN r15756.
2007-08-03 19:46:38 +00:00
Josh Hursey
6248b2bb51 Whoops. Make sure to include the opal_output header.
This commit was SVN r15755.
2007-08-03 19:20:23 +00:00
Josh Hursey
dc9644a2c2 Add a bit of error output so the user can figure out what
went wrong when we cannot create a directory.

This commit was SVN r15754.
2007-08-03 19:08:48 +00:00
Pak Lui
010d216db9 * Prevent users of 32-bit apps from specifying an sm_size between 2GB and
4GB-1 by using long instead of size_t for the sm size.
   * This keeps users from hitting a problem with ftruncate() in the common
     sm component (and possibly others): ftruncate() takes an off_t, which
     is a signed long integer, so passing an unsigned long value in that
     range fails with an invalid argument error (errno=22).
   * See trac #1117

This commit was SVN r15752.
2007-08-03 15:43:02 +00:00
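The signed-versus-unsigned size problem described in this commit can be illustrated with a small range check. This is only a sketch under stated assumptions: size_fits_signed() is a hypothetical helper, not part of Open MPI, and INT32_MAX stands in for the largest value of a 32-bit signed off_t or long.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not Open MPI code: returns 1 when 'requested' can be
 * stored in a signed length type whose largest value is 'signed_max'
 * (for example a 32-bit off_t or long), and 0 otherwise. */
int size_fits_signed(uint64_t requested, int64_t signed_max)
{
    assert(signed_max >= 0);
    return requested <= (uint64_t) signed_max;
}
```

With signed_max set to INT32_MAX, any request of 2GB or more is rejected before it could reach a call such as ftruncate(), which is the range the commit message is concerned with.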
Shiqing Fan
7e7e555ee8 - It's not a good idea to put them here right now.
- will be added into a tar ball later on, or we try to find another way to generate header files for windows.

This commit was SVN r15751.
2007-08-03 11:53:09 +00:00
Aurelien Bouteiller
1d160ca583 Needed change for vampir pml to work
This commit was SVN r15750.
2007-08-03 02:23:24 +00:00
Tim Mattox
ab4ff90dff Updated the 1.2.4 section of the NEWS file.
This commit was SVN r15746.
2007-08-02 19:22:34 +00:00
Brian Barrett
951755f9fb no need to call gethostname twice to determine if a process is local
This commit was SVN r15742.
2007-08-02 16:25:25 +00:00
Jeff Squyres
d3f008492f Introduce a new debugging MCA parameter:
mpi_show_mpi_alloc_mem_leaks

When activated, MPI_FINALIZE displays a list of memory allocations
from MPI_ALLOC_MEM that were not freed by MPI_FREE_MEM (in each MPI
process).

 * If set to a positive integer, display only that many leaks.
 * If set to a negative integer, display all leaks.
 * If set to 0, do not show any leaks.

This commit was SVN r15736.
2007-08-01 21:33:25 +00:00
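The three cases listed for mpi_show_mpi_alloc_mem_leaks can be sketched as a single function. The function name and signature below are assumptions made for illustration; the actual Open MPI code that reads the parameter is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: 'param' is the value of the MCA parameter and
 * 'num_leaks' is the number of allocations that were never freed.
 * Returns how many leaks should be displayed. */
size_t leaks_to_display(int param, size_t num_leaks)
{
    if (0 == param) {
        return 0;                     /* 0: do not show any leaks */
    }
    if (param < 0) {
        return num_leaks;             /* negative: show all leaks */
    }
    /* positive: show at most 'param' leaks */
    return ((size_t) param < num_leaks) ? (size_t) param : num_leaks;
}

/* Small usage check covering the three documented cases. */
void leaks_to_display_demo(void)
{
    assert(leaks_to_display(0, 5) == 0);
    assert(leaks_to_display(-1, 5) == 5);
    assert(leaks_to_display(3, 5) == 3);
}
```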
Jeff Squyres
0fb8cf65a8 If you have an HCA with no active ports, we still create an mpool.
This mpool will have no btl module owner, since no btl was created for
the HCA with no ports, but it will still be tracked in the mpool
framework (i.e., it's available).

If MPI_ALLOC_MEM is called by the app, one of two things will happen:

 1. if there's an HCA on the host with some active ports, the openib
    btl component will still be in the process space, and therefore
    the "mpool with no btl" (MWNB) module will still be able to call
    the reg/dereg functions, and all will be fine.  However, if
    MPI_FREE_MEM is never invoked to free the memory, bad things will
    happen during MPI_FINALIZE.  The pml is finalized, which finalizes
    all the btls.  The btls finalize all their mpools and all is fine.
    But later we close down the mpool framework which then finalizes
    any left over mpool modules, such as MWNB.  However, the openib
    BTL module functions that the MWNB was registered with are no
    longer in the process space, and it segv's while trying to deregister
    the memory.
 2. if there are *no* HCA's on the host with active ports, then the
    openib btl will have been unloaded, and when the MWNB tries to
    register the memory, the functions it tries to call (in the openib
    btl) are no longer there, and we segv.

This commit was SVN r15735.
2007-08-01 20:53:34 +00:00
Jeff Squyres
106beff744 Ahem. Apparently we should be checking for ORTE_EQUAL upon return
from orte_ns.compare_fields(), not 0 (yes, they're the same [today],
but it is much better to check for symbolic names...).

This commit was SVN r15731.
2007-08-01 18:59:37 +00:00
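The point about comparing against a symbolic name rather than a literal 0 can be shown with a minimal sketch. The enum and functions below are hypothetical stand-ins; they are not the ORTE definitions, and the real values of ORTE_EQUAL and its siblings may differ.

```c
#include <assert.h>

/* Hypothetical result codes standing in for ORTE's; the real names and
 * values live in the ORTE headers and may differ. */
enum cmp_result { CMP_LOWER = -1, CMP_EQUAL = 0, CMP_HIGHER = 1 };

/* Hypothetical comparison routine returning a symbolic result. */
enum cmp_result compare_ids(unsigned long a, unsigned long b)
{
    if (a < b) {
        return CMP_LOWER;
    }
    if (a > b) {
        return CMP_HIGHER;
    }
    return CMP_EQUAL;
}

/* Test against the symbolic name, so the check keeps working even if
 * the numeric value behind CMP_EQUAL ever changes. */
int ids_match(unsigned long a, unsigned long b)
{
    return CMP_EQUAL == compare_ids(a, b);
}

void ids_match_demo(void)
{
    assert(ids_match(42UL, 42UL));
    assert(!ids_match(1UL, 2UL));
}
```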
Jeff Squyres
8d4b6c7b0d The HNP changing into an orted brought a bug in the iof svc component
to light: we weren't ack'ing properly for streams that originated (or
originated via proxy) and terminated within the HNP.  This commit
fixes that.

It also fixes a few style issues, and added some more opal_outputs for
debugging.  Also, fixed a bug where the fact that we forwarded (and
therefore might need to update the ack) was not correctly reported if
there were multiple forwards (which there are not as the system is
currently using IOF, but there could be).

Refs trac:1098 -- want to get another pair of eyes to look at this before
I close the ticket.

This commit was SVN r15730.

The following Trac tickets were found above:
  Ticket 1098 --> https://svn.open-mpi.org/trac/ompi/ticket/1098
2007-08-01 18:38:03 +00:00
Tim Mattox
8185de6383 Updated the 1.2.4 section of the NEWS file.
This commit was SVN r15725.
2007-08-01 16:00:32 +00:00
Gleb Natapov
627d9bc8ed Delay freeing of a send request if the scheduling function is being run by
another thread.

This commit was SVN r15722.
2007-08-01 12:19:16 +00:00
Gleb Natapov
758f932aa6 Handle credit in a thread safe manner. I am sure more work will have to be done
in this area.

This commit was SVN r15721.
2007-08-01 12:15:43 +00:00
Gleb Natapov
dd8b0c925f Add OPAL_ATOMIC_CMPSET macros that become non-atomic when only one thread is used.
This commit was SVN r15720.
2007-08-01 12:13:34 +00:00
Gleb Natapov
9c20d67301 1) Return IB header to its previous size by using char for cm_seen field.
2) Allow the user to specify rd_win/rd_rsv parameters, but make them optional.

This commit was SVN r15719.
2007-08-01 12:10:56 +00:00
Gleb Natapov
072ebf0fb3 Add new opal_argv_split_with_empty() function. opal_argv_split() function
doesn't include empty strings in the argv array if there are two delimiters
in a row in an input string.

This commit was SVN r15718.
2007-08-01 12:08:11 +00:00
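The difference this commit describes, keeping the empty string between two adjacent delimiters, can be sketched as follows. This is not the Open MPI implementation of opal_argv_split_with_empty(); split_keep_empty() and free_tokens() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Release an array returned by split_keep_empty(). */
void free_tokens(char **argv)
{
    size_t i;

    if (NULL == argv) {
        return;
    }
    for (i = 0; NULL != argv[i]; ++i) {
        free(argv[i]);
    }
    free(argv);
}

/* Hypothetical sketch: split 'src' on 'delim' and keep empty tokens.
 * Returns a NULL-terminated array, or NULL on allocation failure. */
char **split_keep_empty(const char *src, char delim)
{
    size_t count = 1;            /* n delimiters always yield n + 1 tokens */
    size_t i;
    const char *p;
    char **argv;

    assert(NULL != src && '\0' != delim);
    for (p = src; '\0' != *p; ++p) {
        if (delim == *p) {
            ++count;
        }
    }

    argv = malloc((count + 1) * sizeof(*argv));
    if (NULL == argv) {
        return NULL;
    }
    for (i = 0; i <= count; ++i) {
        argv[i] = NULL;          /* keeps the array NULL-terminated */
    }

    i = 0;
    p = src;
    for (;;) {
        const char *end = strchr(p, delim);
        size_t len = (NULL != end) ? (size_t) (end - p) : strlen(p);

        argv[i] = malloc(len + 1);
        if (NULL == argv[i]) {
            free_tokens(argv);
            return NULL;
        }
        memcpy(argv[i], p, len);
        argv[i][len] = '\0';     /* len == 0 produces an empty token */
        ++i;

        if (NULL == end) {
            break;
        }
        p = end + 1;
    }
    return argv;
}
```

Splitting "a,,b" on ',' with this sketch yields three tokens, the middle one being the empty string, whereas a splitter that skips empty tokens would yield only two.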
Pak Lui
9af43da1dc * Remove the logic for Solaris to always use the FreeBSD version of qsort.
* Give user the option to configure with the broken qsort fix instead
    of using the native qsort.

This commit was SVN r15716.
2007-07-31 22:43:06 +00:00
Aurelien Bouteiller
a403fed18a More checks (asserts) in the output system so that a malformed format string does not crash the application at a later random time.
Changed various debug messages to retain the most useful messages

This commit was SVN r15715.
2007-07-31 19:33:39 +00:00
Ralph Castain
066ff38d42 Ensure we read all the reported URI contact info when we fork an HNP for singleton support
This commit was SVN r15714.
2007-07-31 18:55:08 +00:00
Tim Mattox
a13123b5fc Updated the 1.2.4 section of the NEWS file.
This commit was SVN r15712.
2007-07-31 18:49:51 +00:00
George Bosilca
d52d21fae8 Don't forget to include the header file in the sources list.
This commit was SVN r15711.
2007-07-31 18:40:31 +00:00
Aurelien Bouteiller
cec9ce8106 Fixed: various warnings with printf(%x, uint64_t) on 32 bit architectures + some left (long) cast for size_t printf.
This commit was SVN r15706.
2007-07-31 17:12:21 +00:00
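Warnings of the kind this commit mentions typically arise because %x expects an unsigned int while uint64_t is wider on 32-bit targets. A minimal sketch of two portable alternatives follows: the PRIx64 macro from <inttypes.h> for 64-bit values, and an explicit cast for size_t. The function names are hypothetical, and the cast variant is only one possible reading of the "(long) cast" the commit refers to.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Format a 64-bit value as hex using the width-correct PRIx64 macro. */
int format_hex64(char *buf, size_t len, uint64_t value)
{
    assert(NULL != buf);
    return snprintf(buf, len, "0x%" PRIx64, value);
}

/* Format a size_t by casting to unsigned long, which matches %lu on
 * every platform regardless of how wide size_t happens to be. */
int format_size(char *buf, size_t len, size_t value)
{
    assert(NULL != buf);
    return snprintf(buf, len, "%lu", (unsigned long) value);
}
```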
Shiqing Fan
0f468f3668 - Remove the solution and project files, will commit them later.
This commit was SVN r15705.
2007-07-31 17:07:02 +00:00
George Bosilca
2e2bf472ff Mark the orte_abort function as noreturn and change the return value from
int to void. This function calls exit at the end, so there is no way to
return from there. Apply the same thing to the errmsg_abort function and
update all components.

This commit was SVN r15704.
2007-07-31 16:09:52 +00:00
Aurelien Bouteiller
a5d0e53bb3 Moved replay macros to functions. The performance improvement in process recovery is not worth the debugging hassle.
This commit was SVN r15703.
2007-07-31 16:01:32 +00:00
Aurelien Bouteiller
5a792a3fad (hopefully) fixed various pedantic warnings about casts on 32-bit machines. Not tested; only 64-bit machines are available.
This commit was SVN r15702.
2007-07-31 15:58:19 +00:00
Sven Stork
fd778a5539 - put the label to the right place
This commit was SVN r15699.
2007-07-31 09:34:41 +00:00
Sven Stork
a13d2dcb96 - fix possible memory leak found by coverity
This commit was SVN r15698.
2007-07-31 09:32:49 +00:00
Sven Stork
27422e05ac - add parameter check for NULL pointer
This commit was SVN r15697.
2007-07-31 09:01:39 +00:00
George Bosilca
d6a676b29e Remove unused variable.
This commit was SVN r15693.
2007-07-30 19:38:26 +00:00
George Bosilca
cf9bccf2e6 prefer snprintf to sprintf.
This commit was SVN r15692.
2007-07-30 19:37:34 +00:00
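The preference for snprintf over sprintf comes down to bounded writes and the ability to detect truncation. A minimal sketch, assuming a hypothetical build_path() helper that is not taken from the Open MPI sources:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: join 'dir' and 'file' into 'buf'.
 * Returns 0 on success, -1 if the result did not fit or formatting failed. */
int build_path(char *buf, size_t len, const char *dir, const char *file)
{
    int n;

    assert(NULL != buf && NULL != dir && NULL != file);

    /* snprintf never writes more than 'len' bytes, unlike sprintf. */
    n = snprintf(buf, len, "%s/%s", dir, file);

    /* The return value is the length the full string would have had,
     * so a value >= len means the output was truncated. */
    if (n < 0 || (size_t) n >= len) {
        return -1;
    }
    return 0;
}
```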
Aurelien Bouteiller
3559fd5d1a Fixed issues with "verbose" output being too silent.
This commit was SVN r15691.
2007-07-30 19:11:15 +00:00
Sven Stork
4c5836c2ee - add missing va_end found by coverity
This commit was SVN r15689.
2007-07-30 16:08:18 +00:00
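A missing va_end is the kind of defect static analysis tends to flag, because every va_start must be matched by a va_end in the same function. A minimal sketch of the pairing, using a hypothetical format_message() helper that does not come from the Open MPI sources:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: format into 'buf' and return what vsnprintf
 * returns (the length the full string would have, or a negative value). */
int format_message(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(NULL != buf && NULL != fmt);

    va_start(ap, fmt);
    n = vsnprintf(buf, len, fmt, ap);
    va_end(ap);   /* ended before any branching, so no return path can skip it */

    return n;
}
```

Calling va_end immediately after the last use of the list, before any error checks, is one simple way to make sure no early return skips it.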
Sven Stork
d830318bbb - fix typo
This commit was SVN r15687.
2007-07-30 15:47:37 +00:00
Sven Stork
6c8d921a76 - coverity found dead code, but it's a typo
This commit was SVN r15686.
2007-07-30 15:41:41 +00:00
Sven Stork
80cdafb8f4 - remove dead code found by coverity
This commit was SVN r15685.
2007-07-30 15:36:00 +00:00
Sven Stork
71915f269c - more coverity fixes
- use strncpy
- comparing NULL against an array which is statically inside
  the structure will always be true

This commit was SVN r15684.
2007-07-30 15:19:54 +00:00
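Both points in this commit can be sketched with a small structure. The struct and function names are hypothetical and are not taken from the Open MPI sources; the sketch only shows a bounded copy with explicit termination, and why a NULL test on an embedded array is meaningless.

```c
#include <assert.h>
#include <string.h>

#define REC_NAME_LEN 8

/* Hypothetical record with a name stored in an array inside the struct. */
struct record {
    char name[REC_NAME_LEN];
};

/* Bounded copy: strncpy does not add a terminator when the source is
 * too long, so the last byte is set explicitly. */
void record_set_name(struct record *rec, const char *src)
{
    assert(NULL != rec && NULL != src);
    strncpy(rec->name, src, sizeof(rec->name) - 1);
    rec->name[sizeof(rec->name) - 1] = '\0';
}

/* rec->name is an array inside the struct, so its address is never NULL
 * and "NULL != rec->name" would always be true. Test the content instead. */
int record_has_name(const struct record *rec)
{
    assert(NULL != rec);
    return '\0' != rec->name[0];
}
```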
Sven Stork
855434de59 - fixes several Coverity issues
- add missing initialisation for variables
- use strncpy instead of strcpy

This commit was SVN r15683.
2007-07-30 14:44:37 +00:00
Jeff Squyres
327576b2a3 Fix incorrect behavior noted by Lisandro Dalcini: when MPI_COMM_SELF is
passed to MPI_COMM_FREE, it invoked the error handler on
MPI_COMM_WORLD, not on MPI_COMM_SELF.  This commit changes the
behavior: if MPI_COMM_SELF is passed to MPI_COMM_FREE, we invoke the
error handler on MPI_COMM_SELF (not MPI_COMM_WORLD).  Fixes trac:1109.

This commit was SVN r15682.

The following Trac tickets were found above:
  Ticket 1109 --> https://svn.open-mpi.org/trac/ompi/ticket/1109
2007-07-30 13:01:33 +00:00
Gleb Natapov
afac5eb93f Guard recv request with lock against simultaneous access from different
threads.

This commit was SVN r15681.
2007-07-30 12:50:38 +00:00