Commit graph

9426 commits

Author SHA1 Message Date
Tim Prins
2ffc02870d Reduce the memory usage of the GPR:
- Make it so that all the GPR pointer arrays are allocated initially at 16 elements instead of 512. This saves (on a 64-bit machine) approximately 4*(# procs + # nodes) KB.
- Fix up the segment prealloc function so that preallocating an existing segment is not an error, and make the areas where we do large inserts use it.

Fix the orte_pointer_array to efficiently implement setting its size. Before, we just realloced the array one block at a time until the desired size was reached. Now we resize it all in one realloc.

This commit was SVN r14264.
2007-04-09 00:40:15 +00:00
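
The orte_pointer_array change described in 2ffc02870d can be illustrated with a minimal sketch. This is not the ORTE code: `ptr_array_t`, its fields, and `ptr_array_set_size` are hypothetical names, and the only idea taken from the commit message is "compute the final size, then call realloc once" instead of growing block by block.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for orte_pointer_array_t; the field names are assumptions. */
typedef struct {
    void **addr;   /* slot storage */
    size_t size;   /* number of allocated slots */
    size_t block;  /* growth granularity, in slots */
} ptr_array_t;

/*
 * Grow the array to hold at least new_size slots using a single realloc.
 * The target is rounded up to a multiple of the block size.
 * Returns 0 on success, -1 if the allocation fails (array left unchanged).
 * Overflow of the size computation is not checked in this sketch.
 */
static int ptr_array_set_size(ptr_array_t *a, size_t new_size)
{
    size_t target, i;
    void **p;

    assert(a != NULL && a->block > 0);
    if (new_size <= a->size) {
        return 0;               /* never shrink */
    }
    target = ((new_size + a->block - 1) / a->block) * a->block;
    p = realloc(a->addr, target * sizeof *p);
    if (p == NULL) {
        return -1;
    }
    for (i = a->size; i < target; i++) {
        p[i] = NULL;            /* new slots start out empty */
    }
    a->addr = p;
    a->size = target;
    return 0;
}
```

With a block size of 16, requesting 100 slots results in one realloc to 112 slots rather than seven separate reallocs.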
Brian Barrett
13a4bba13f Yet another dumb thing that shouldn't have been in r14261.
This commit was SVN r14263.

The following SVN revision numbers were found above:
  r14261 --> open-mpi/ompi@8a55c84d0b
2007-04-07 23:23:23 +00:00
Brian Barrett
32f0090f81 fix dumb variable scope mistake
This commit was SVN r14262.
2007-04-07 23:00:57 +00:00
Brian Barrett
8a55c84d0b Fix a number of OOB issues:
* Remove the connect() timeout code, as it had some nasty race conditions
    when connections were established as the trigger was firing.  A better
    solution has been found for the cluster where this was needed, so just
    removing it was easiest.
  * When a fatal error (too many connection failures) occurs, set an error
    on messages in the queue even if there isn't an active message.  The
    first message to any peer will be queued without being active (and
    so will all subsequent messages until the connection is established),
    and the orteds will hang until that first message completes.  So if
    an orted can never contact its peer, it will never exit and just sit
    waiting for that message to complete.
  * Cover an interesting RST condition in the connect code.  A connection
    can complete the three-way handshake, the connector can even send
    some data, but the server side will drop the connection because it
    can't move it from the half-connected to fully-connected state because
    of space shortage in the listen backlog queue.  This causes a RST to
    be received the first time that recv() is called, which will be when waiting
    for the remote side of the OOB ack.  In this case, transition the
    connection back into a CLOSED state and try to connect again.
  * Add levels of debugging, rather than all or nothing, each building on
    the previous level.  0 (default) is hard errors.  1 is connection 
    error debugging info.  2 is all connection info.  3 is more state
    info.  4 includes all message info.
  * Add some hopefully useful comments

This commit was SVN r14261.
2007-04-07 22:33:30 +00:00
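
The cumulative debug levels introduced in 8a55c84d0b can be sketched as a simple threshold check. The enum names and the functions `oob_debug_enabled` and `oob_debug` are hypothetical and are not taken from the OOB source. Only the meaning of levels 0 to 4 comes from the commit message.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Levels as listed in the commit message; the identifiers are assumptions. */
enum oob_debug_level {
    OOB_DBG_ERROR        = 0,  /* hard errors (default) */
    OOB_DBG_CONNECT_FAIL = 1,  /* connection error debugging info */
    OOB_DBG_CONNECT      = 2,  /* all connection info */
    OOB_DBG_STATE        = 3,  /* more state info */
    OOB_DBG_MESSAGE      = 4   /* all message info */
};

/* Each level builds on the previous one, so a message is shown when
 * its level is at or below the configured level. */
static int oob_debug_enabled(int configured, int msg_level)
{
    return msg_level <= configured;
}

/* Print to stderr if enabled. Returns the number of characters written,
 * or 0 when the message is suppressed. */
static int oob_debug(int configured, int msg_level, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(fmt != NULL);
    if (!oob_debug_enabled(configured, msg_level)) {
        return 0;
    }
    va_start(ap, fmt);
    n = vfprintf(stderr, fmt, ap);
    va_end(ap);
    return n;
}
```

At the default level 0 only hard errors are printed. Raising the configured level to 2 also enables everything tagged with level 1.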
Tim Prins
df4c468bb4 fix some more minor memory leaks
This commit was SVN r14260.
2007-04-07 18:41:16 +00:00
Tim Prins
e09a154266 fix a buglet..
This commit was SVN r14259.
2007-04-07 18:27:39 +00:00
Rich Graham
f481722bdf move the code that sets the thread level information before the BTLs are
initialized, so that the BTLs have this information for correct setup.

This commit was SVN r14258.
2007-04-07 05:06:47 +00:00
Tim Prins
8e7765e456 Fix a gigantic memory leak. We were copying a message to send into a buffer, then never freeing the copy we made. But we were mistakenly allocating the buffer on the stack, so the memory checking tools never caught the leak. On 96 nodes, 384 processes, mpirun memory usage went from about 12M to 3M for me after this minor change...
This commit was SVN r14257.
2007-04-07 02:25:48 +00:00
Tim Prins
e058266c96 Change the ORTE datatype service in 2 ways:
1. Remove an unneeded field, bytes_avail, from orte_buffer_t. It is a calculated value, and updating it everywhere is worse than just calculating it in the one place it is actually used.
2. Change it so the default size of an orte_buffer is 128 bytes instead of 1024 bytes. We then double the size of the buffer up to 1024 bytes, then we additively increase the size by 1024 bytes at a time as was done before.

This commit was SVN r14252.
2007-04-06 19:40:29 +00:00
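
The buffer growth policy described in e058266c96 (start at 128 bytes, double up to 1024, then grow by 1024 at a time) can be sketched as below. `buffer_next_capacity` and the two constants are hypothetical names chosen for illustration. The real orte_buffer code may compute this differently.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_INITIAL   ((size_t)128)   /* default size from the commit message */
#define BUF_THRESHOLD ((size_t)1024)  /* doubling stops here; also the additive step */

/*
 * Return the capacity a buffer should grow to so that at least `needed`
 * bytes fit, given its `current` capacity (0 means not yet allocated).
 * Overflow is not checked in this sketch.
 */
static size_t buffer_next_capacity(size_t current, size_t needed)
{
    size_t cap = (current != 0) ? current : BUF_INITIAL;

    while (cap < needed) {
        if (cap < BUF_THRESHOLD) {
            cap *= 2;                 /* geometric phase: 128, 256, 512, 1024 */
        } else {
            cap += BUF_THRESHOLD;     /* additive phase: 2048, 3072, ... */
        }
    }
    return cap;
}
```

Small buffers therefore stay small, while large ones grow in the same 1024-byte steps as before.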
Tim Prins
f0e6a28a1f pedantic indentation...
This commit was SVN r14251.
2007-04-06 19:18:31 +00:00
Tim Mattox
b304ae5fba Updated the NEWS file for another 1.2.1 change.
This commit was SVN r14249.
2007-04-06 17:55:44 +00:00
George Bosilca
33bf6c6e54 Move the comment to the right place.
This commit was SVN r14237.
2007-04-05 20:36:33 +00:00
George Bosilca
5c355d0bea Always return an initialized variable. More output if we fail to read
from the shell detection child. Don't spawn orted, instead spawn what's
inside the mca_pls_rsh_component.orted.

This commit was SVN r14236.
2007-04-05 20:17:10 +00:00
George Bosilca
ef4baeb6ab Don't reset the pid, as at this point it is already set.
This commit was SVN r14235.
2007-04-05 20:13:50 +00:00
George Bosilca
8fb8363868 Correctly detect the remote shell, and the local one. Big clean-up on how we
deal with the PLS RSH. Remove support for unknown user (i.e. if the user is
not known by the system, then it shouldn't be allowed to spawn anything).

This commit was SVN r14232.
2007-04-05 19:22:26 +00:00
Josh Hursey
8fd6d4ba09 add a newline so output is cleaner/clearer
This commit was SVN r14229.
2007-04-05 17:45:03 +00:00
Josh Hursey
38547459ae Improve the cleanup process in ob1
Remove a redundant statement in the r2 BML.

This commit was SVN r14228.

The following SVN revision numbers were found above:
  r2 --> open-mpi/ompi@58fdc18855
2007-04-05 17:37:29 +00:00
Josh Hursey
98fb9f26ef Some cleanup.
- Remove an old comment from crcp_base_fns.c
- Let ob1 have its very own ft_event function (which I'll fill in shortly)
- Make sure ob1 finalizes the bsend stuff so we don't leave a bunch of memory sitting around
- PML base - destruct the array upon finalize. Shrink the include search so it stops after finding a match

This commit was SVN r14222.
2007-04-05 13:52:05 +00:00
Tim Mattox
1705e370d3 Add a NEWS entry for yet another 1.2.1 change.
This commit was SVN r14220.
2007-04-05 00:56:05 +00:00
Ralph Castain
e95539a16a Add two new test codes - orte_loop_spawn/child - to help debug issues surrounding multiple calls to comm_spawn
This commit was SVN r14217.
2007-04-04 21:02:18 +00:00
Jeff Squyres
2cbcb4abf1 Remove the French and strip the tests down to essentials (no need for
buffer attaching/detaching, for example).

This commit was SVN r14216.
2007-04-04 15:38:23 +00:00
Josh Hursey
a8918fe3d5 pedantic cleanup. Switch loop to lowest rank sends first
This commit was SVN r14215.
2007-04-04 14:23:45 +00:00
Ralph Castain
d5b5cd2d3c Add test code for multiple comm_spawn calls.
Add ERROR_LOG calls to more clearly document failures in the rsh launcher.

This commit was SVN r14214.
2007-04-04 13:24:39 +00:00
Edgar Gabriel
4d2b3e859d fix the indenting from tabs to spaces :-)
This commit was SVN r14211.
2007-04-03 21:33:44 +00:00
Edgar Gabriel
188f770d94 ok, increase the reference count on ompi_mpi_group_null twice when
creating ompi_mpi_comm_null, since the destructor of ompi_mpi_comm_null will
decrease the reference counter of ompi_mpi_group_null twice according to the
last fix of Mohamad.

Also added a lengthy comment in ompi_comm_finalize about why we do
not decrease the reference counters for ompi_mpi_comm_null,
ompi_mpi_group_null etc. for the parent communicator, although we
do increase them in ompi_comm_init.

This commit was SVN r14210.
2007-04-03 21:16:26 +00:00
Jeff Squyres
fe58753a23 Add a little documentation to iof.h.
This commit was SVN r14208.
2007-04-03 18:17:35 +00:00
Li-Ta Lo
ec8a859a44 fixed typo
This commit was SVN r14207.
2007-04-03 17:21:54 +00:00
George Bosilca
667bda0fef Rework the code a little bit to make things simpler.
This commit was SVN r14203.
2007-04-03 16:05:51 +00:00
George Bosilca
cb1b976486 Big update. Correct the behavior for true_lb and true_ub computation
when the size of the data is zero. Now they are not updated, which leaves
us with the correct memory layout in all situations (so far). Update all
the comments to reflect exactly the supported behavior of the DDT engine.

This commit was SVN r14202.
2007-04-03 16:05:15 +00:00
Josh Hursey
51daa15f9c play a bit nicer with references.
This commit was SVN r14201.
2007-04-02 22:27:52 +00:00
Tim Prins
b1bed8375c A few more small leaks...
This commit was SVN r14200.
2007-04-02 21:12:16 +00:00
Josh Hursey
5ff1c10e70 minor cleanup
This commit was SVN r14199.
2007-04-02 20:39:36 +00:00
Josh Hursey
b0b91a5fde A couple more fixes for async case.
Mostly working again, 1 small bug I'm still tracking.

This commit was SVN r14198.
2007-04-02 20:00:58 +00:00
Josh Hursey
71937c3eaf A bit of cleanup for async case... Still one bug in there.
This commit was SVN r14197.
2007-04-02 19:25:22 +00:00
George Bosilca
120cf76ad8 Remove some warnings.
This commit was SVN r14196.
2007-04-02 19:11:06 +00:00
Mohamad Chaarawi
0e98bf2ac6 quick fix for the cart create problem caused by the previous memory leak
fix

This commit was SVN r14195.
2007-04-02 19:06:52 +00:00
George Bosilca
8273c5eeba Correct an error introduced by commit r14180.
This commit was SVN r14191.

The following SVN revision numbers were found above:
  r14180 --> open-mpi/ompi@1cb26e3b9c
2007-04-02 02:59:23 +00:00
George Bosilca
1dcf2aaedf We need libgen.h for dirname.
This commit was SVN r14190.
2007-04-01 16:26:54 +00:00
George Bosilca
cd5a4c6416 Save the ghost pointer once the element is initialized.
This commit was SVN r14189.
2007-04-01 16:18:48 +00:00
George Bosilca
f2a6b9394f Deal with the include spree. Protect "environ" on Windows.
Some other minor modifications in order to make it
compile [again] on Windows.

This commit was SVN r14188.
2007-04-01 16:16:54 +00:00
George Bosilca
6ddd250a87 OPAL layer should include opal_config.h not ompi_config.h
This commit was SVN r14187.
2007-04-01 16:10:05 +00:00
George Bosilca
01a4f56369 Mostly DECLSPEC cleanups and some include corrections.
This commit was SVN r14186.
2007-04-01 16:08:27 +00:00
George Bosilca
50f2695fe2 Be more generic with the Windows support.
This commit was SVN r14185.
2007-04-01 15:56:59 +00:00
George Bosilca
da4d993176 Add some missing Windows defines & macros.
This commit was SVN r14184.
2007-04-01 15:55:34 +00:00
Tim Prins
80e047b843 make the mx btl compile again...
This commit was SVN r14183.
2007-04-01 02:49:23 +00:00
George Bosilca
f518a9c1f6 Remove some warnings from the data-type engine.
This commit was SVN r14181.
2007-03-31 04:14:47 +00:00
George Bosilca
1cb26e3b9c Finally, the convertor exports a convenience function to allow a consistent
computation of the current location in the pack/unpack process. This can
be used both for retrieving the pointer to the first byte (in the special
case of the cached RDMA protocol) and for getting the current
position (for the pipelined protocol).

I modified all BTLs, but most of them are still untested.

This commit was SVN r14180.
2007-03-30 22:02:45 +00:00
Mohamad Chaarawi
8f4f992bfc fixed the memory leak problem by decrementing the ref count on the
remote group in case of intra-communicators. This needs to go in v1.2.
We will file a move request on Monday.

This commit was SVN r14179.
2007-03-30 19:30:40 +00:00
Tim Mattox
2aa27325bb Add news item for mpi_leave_pinned fix.
This commit was SVN r14177.
2007-03-30 16:19:03 +00:00
Tim Prins
2f74160a37 Fix some more memory leaks
This commit was SVN r14175.
2007-03-30 13:43:50 +00:00