1
1
Граф коммитов

7363 Коммитов

Автор SHA1 Сообщение Дата
Gilles Gouaillardet
5e9347bfb4 Fix deallocation error in mca_common_sm_rml_info_bcast()
buffer is sent (num_local_procs-1) time, so it should be
OBJ_RETAIN'ed (num_local_procs-2) time

cmr=v1.8.2:reviewer=rhc

This commit was SVN r31874.
2014-05-22 04:53:42 +00:00
Jeff Squyres
af5399ab5b Gah; we need to have a unique barrier ID for the ompi_rte_barrier()
operation.  Ralph will fix shortly.

For the time being, put back the original code...

Refs trac:4669

This commit was SVN r31872.

The following Trac tickets were found above:
  Ticket 4669 --> https://svn.open-mpi.org/trac/ompi/ticket/4669
2014-05-22 00:31:39 +00:00
Jeff Squyres
c860f289ab usnic: re-order teardown sequence now that del_procs() is actually called
Now that the infrastructure is calling BTL del_procs() before the BTL
finalize(), the usnic BTL had to re-order some of its teardown
sequence to avoid assert() failing.

This is part of a larger conversation involving #4669.  Since
MPI_FINALIZE and MPI_COMM_DISCONNECT currently use an
oob/grpcomm-based barrier, the usnic BTL can ''absolutely know'' that
these endpoints and procs will no longer be used.  If the ORTE DPM
goes back to a PML-based barrier, the usnic BTL will need to grow more
complex teardown semantics (a la TCP socket FIN/ACK/FIN_WAIT states).

Refs trac:4669

This commit was SVN r31871.

The following Trac tickets were found above:
  Ticket 4669 --> https://svn.open-mpi.org/trac/ompi/ticket/4669
2014-05-21 23:59:17 +00:00
Jeff Squyres
5aa05f2f1d Potentially temporarily change the barrier in the ORTE DPM component
to be based on grpcomm (i.e., an out-of-band based barrier) rather
than the simplistic PML-based barrier that it currently uses.

This is pending a larger discussion with Nathan and George, but it
will allow the usnic BTL to stop assert()-failing in light of the
recent del_procs() change.

This commit was SVN r31870.
2014-05-21 23:09:01 +00:00
Mike Dubman
fad1063980 MXM: fix warning
reviewed by Yossi

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r31855.
2014-05-21 07:50:05 +00:00
George Bosilca
cc0239d52f Remove unused variables.
This commit was SVN r31846.
2014-05-21 01:34:26 +00:00
Jeff Squyres
b0a6e42f45 pml ob1: use the pre-computed size from the free lists
Based on a suggestion from George on #31806, use the pre-computed
sizes rather than duplicating the computation math (which may change
someday in the future).

cmr=v1.8.2:ticket=trac:4647

This commit was SVN r31841.

The following Trac tickets were found above:
  Ticket 4647 --> https://svn.open-mpi.org/trac/ompi/ticket/4647
2014-05-20 20:32:25 +00:00
George Bosilca
db9660264e Update the error message to pinpoint the right location.
Thanks Tim.

This commit was SVN r31839.
2014-05-20 20:08:42 +00:00
George Bosilca
685f051557 Move the allocator initialization from open to init. This clean
a memory leak. Similar changes shuld be applied to all the 
other PML that are copies of OB1. This patch is related to
#4653.

This commit was SVN r31838.
2014-05-20 19:34:18 +00:00
Gilles Gouaillardet
baf532087a Fix mca_coll_basic_alltoallw_inter()
Avoid sending/receiving zero size messages in order to be compliant
with the top-level modification

cmr=v1.8.2:ticket=4651:reviewer=bosilca

This commit was SVN r31836.

The following Trac tickets were found above:
  Ticket 4651 --> https://svn.open-mpi.org/trac/ompi/ticket/4651
2014-05-20 09:22:57 +00:00
George Bosilca
137874ec4d In fact btl_eager and btl_dma will just vanish upon destruction of the
bml_endpoint. No need to clean them bfore.

This commit was SVN r31835.
2014-05-20 08:53:22 +00:00
George Bosilca
85e3caaa17 Handle the del_procs correctly. The btl_send is the complete list
of existing BTL fo an endpoint, all the others are just partial list.
Thus, all the cleaning should first be done in the btl_send array,
and them in the other arrays (btl_eager and btl_rdma).

This commit was SVN r31834.
2014-05-20 08:46:57 +00:00
George Bosilca
1647664c43 Show the unreacheable message for the first unreacheable proc
and then stop.

This commit was SVN r31833.
2014-05-20 08:40:32 +00:00
Gilles Gouaillardet
c496131eef Fixes *alltoall* collectives at top
- fix bugs
 - silent warnings

cmr=v1.8.2:ticket=4651:reviewer=bosilca

This commit was SVN r31831.

The following Trac tickets were found above:
  Ticket 4651 --> https://svn.open-mpi.org/trac/ompi/ticket/4651
2014-05-20 05:32:57 +00:00
Gilles Gouaillardet
ef4548a215 bml/r2 : fix mca_bml_r2_del_procs()
cmr=v1.8.2:reviewer=hjelmn:ticket=trac:4645

This commit was SVN r31830.

The following SVN revision numbers were found above:
  r2 --> open-mpi/ompi@58fdc18855

The following Trac tickets were found above:
  Ticket 4645 --> https://svn.open-mpi.org/trac/ompi/ticket/4645
2014-05-20 02:55:47 +00:00
Nathan Hjelm
27d3e1ca25 bml/r2: fix a problem identified by Gilles
This commit fixes two issues:

 - This intent of the code @ bml_r2.c:486 is to prevent calling the
   btl_del_procs more than once for a given proc. Gilles correctly
   identified there was a problem in this code but r31786 we not the
   correct fix.

 - Fix a segmentation fault in r2 finalize revealed by the fact we
   actually call del_procs now.

cmr=v1.8.2:reviewer=ggouaillardet:ticket=trac:4645

This commit was SVN r31829.

The following SVN revision numbers were found above:
  r2 --> open-mpi/ompi@58fdc18855
  r31786 --> open-mpi/ompi@fc96b0a7b8

The following Trac tickets were found above:
  Ticket 4645 --> https://svn.open-mpi.org/trac/ompi/ticket/4645
2014-05-19 20:22:34 +00:00
Nathan Hjelm
a1d5ce0893 pml/ob1: as per past RFC bring the inline send optimization to
MPI_Isend.

I filed an RFC for this optimization some time back. It is a
relatively simple optimization. If the data associated with an
MPI_Isend can be put on the wire without allocating an MPI_Request
then do so. In this case we can legally return omp_request_empty
which will correctly indicate that the request is complete and that is
was not cancelled (these are the only requirements on send requests).

cmr=v1.8.3:reviewer=bosilca

This commit was SVN r31828.
2014-05-19 19:34:59 +00:00
Nathan Hjelm
22e59b056a bcol/basesmuma: fix leak in basesmuma code
Basesmuma was vallocing space for control data then mmapping over that
data. Nothing in the code suggests any need for mmapping a specific
address so I did the following to remove the leak:

 - Removed the valloc of the buffer space

 - ftruncate the mmaped file to ensure there is sufficient memory to
   allocate space for the control data.

Ideally this code should be using opal/shmem but that is a larger
change. Keeping it simple for now.

cmr=v1.8.2:reviewer=manjugv

This commit was SVN r31822.
2014-05-19 15:21:58 +00:00
Nathan Hjelm
dedf6b377e btl/vader: don't leak registration cache items
Make sure to cleanup the registation cache when removing an
endpoint. This leak only affects systems with XPMEM installed.

Since this is in code specific to XPMEM not sure who could review so
sending it directly to the RM.

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r31821.
2014-05-19 15:16:32 +00:00
Gilles Gouaillardet
2b89aac15b Fix a typo in MCA_PML_OB1_RECV_REQUEST_UNPACK
cmr=v1.8.2:reviewer=rhc

This commit was SVN r31817.
2014-05-19 11:00:13 +00:00
Gilles Gouaillardet
c82a6f5063 Fix a memory leak in mca/pml/bfo
Allocate the allocator in init rather than open

cmr=v1.8.2:reviewer=rhc

This commit was SVN r31816.
2014-05-19 10:40:18 +00:00
Gilles Gouaillardet
8bafe06c57 Fixes *alltoall* collectives at top level
This commit :
 - Correctly retrieve the communicator size when
   checking memory and parameters
 - Ensure (sendtype,sendcount) and (recvtype,recvcount)
   matches and return with MPI_ERR_TRUNCATE otherwise
 - Return with MPI_SUCCESS without invoking the low level
   if no data is going to be transferred
 - Fixes trac:4506

cmr=v1.8.2:reviewer=bosilca

This commit was SVN r31815.

The following Trac tickets were found above:
  Ticket 4506 --> https://svn.open-mpi.org/trac/ompi/ticket/4506
2014-05-19 07:46:07 +00:00
Oscar Vega-Gisbert
83bdebbf81 Java bindings for OSHMEM.
This commit was SVN r31810.
2014-05-18 21:48:09 +00:00
Mike Dubman
cadc1485ff HCOLL: register memory release hook to avoid races
fixed by Devender, reviewed by Miked

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r31809.
2014-05-17 19:49:43 +00:00
Ralph Castain
61fba04a0c Remove undeclared variable
This commit was SVN r31807.
2014-05-17 04:09:05 +00:00
Jeff Squyres
025e4a852b pml_ob1: ensure to have enough space for send/recvreq on stack
r30343 introduced the optimization of putting the OB1 sendreq and
recvreq on the stack for blocking sends and receives.  However, the
requests did not contain enough storage for the data that is normally
immediately ''after'' the request (e.g., BTL data).

This commit changes these requests to be pointers and to use alloca()
to get enough total space for the OB1 request and all the associated
data.

The change is smaller than it looks; most of it is just changing from
"foo.bar" to "foo->bar" notation (etc.).

Submitted by Jeff, reviewed by Nathan.  But we want George to look at
this (and get a little soak time on the trunk) before moving to v1.8.

cmr=v1.8.2:reviewer=bosilca

This commit was SVN r31806.

The following SVN revision numbers were found above:
  r30343 --> open-mpi/ompi@2b57f4227e
2014-05-17 01:05:59 +00:00
George Bosilca
750c6c7861 Update the UTK copyright on the topology related files.
This commit was SVN r31805.
2014-05-16 22:23:52 +00:00
Gilles Gouaillardet
96ae38823d Fix a memory leak in ompi_osc_base_finalize()
cmr=v1.8.2:reviewer=rhc

This commit was SVN r31788.
2014-05-16 05:25:24 +00:00
Gilles Gouaillardet
fc96b0a7b8 Fix a typo in mca_bml_r2_del_procs()
Use bml_endpoint->btl_eager instead of bml_endpoint->btl_send.

cmr=v1.8.2:reviewer=rhc

This commit was SVN r31786.
2014-05-16 04:43:18 +00:00
Nathan Hjelm
e97e4cf924 Add missing include.
cmr=v1.8.2:ticket=trac:4639

This commit was SVN r31784.

The following Trac tickets were found above:
  Ticket 4639 --> https://svn.open-mpi.org/trac/ompi/ticket/4639
2014-05-15 19:52:06 +00:00
Nathan Hjelm
572507f451 btl/vader: fix del_procs bug exposed by the fact we now actually call
del_procs

cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r31783.
2014-05-15 19:26:49 +00:00
Nathan Hjelm
faf008f527 Fix bugs that were causing leaks in finalize.
This commit fixes leaks of bml endpoints in finalize. A summary of the
bugs/fixes is below.

 1) ompi_mpi_finalize used ompi_proc_all to get the list of procs but
    never released the reference to them (ompi_proc_all called
    OBJ_RETAIN on all the procs returned). When calling del_procs at
    finalize it should suffice to call ompi_proc_world which does not
    increment the reference count.

 2) del_procs is called BEFORE ompi_comm_finalize. This leaves the
    references to the procs from calling the pml_add_comm
    function. The fix is to reorder the calls to do omp_comm_finalize,
    del_procs, pml_finalize instead of del_procs, pml_finalize,
    ompi_comm_finalize.

 3) The check in del_procs in r2 checked for a reference count of
    1. This is incorrect. At this point there should be 2 references:
    1 from ompi_proc, and another from the add_procs. The fix is to
    change this check to look for a reference count of 22. This check
    makes me extremely uncomforable as nothing will call del_procs if
    the reference count of a procs is not 2 when del_procs is
    called. Maybe there should be an assert since this is a developer
    error IMHO.

cmr=v1.8.2:reviewer=bosilca

This commit was SVN r31782.

The following SVN revision numbers were found above:
  r2 --> open-mpi/ompi@58fdc18855
2014-05-15 18:28:03 +00:00
Nathan Hjelm
55f0dcb81a Add netpatterns_cleanup_narray_knomial_tree function to cleanup after
netpatterns_setup_narray_knomial_tree.

Fix a bug in ptpcoll that caused memory allocated by
netpatterns_setup_narray_knomial_tree to leak.

cmr=v1.8.2:reviewer=manjugv

This commit was SVN r31781.
2014-05-15 17:36:26 +00:00
Nathan Hjelm
d3dc2c9b0b btl/ugni: reorder teardown to avoid using freed memory
vagrind correctly indicated that the mpool is needed when tearing down
the free lists. Reordered the teardown code to correct this.

My code so no review required.

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r31780.
2014-05-15 16:58:28 +00:00
Nathan Hjelm
73bfecd650 More leak fixes.
Two leaks are fixed in this commit:

 - Do not leak btl component list items.

 - Do not leak the nodename when decoding the pidmap.

cmr=v1.8.2:reviewer=rhc

This commit was SVN r31779.
2014-05-15 16:38:13 +00:00
Nathan Hjelm
e4db2c3ebb ompi: fix various small leaks
This commit fixes three leaks:

 - bml/r2: fix leak of del_procs in mca_bml_r2_del_procs

 - Release the modex data in btl/scif, btl/ugni, and btl/vader

 - ompi_mpi_finalize: close the allocator framework

cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r31778.

The following SVN revision numbers were found above:
  r2 --> open-mpi/ompi@58fdc18855
2014-05-15 15:59:51 +00:00
Gilles Gouaillardet
e836abfc51 btl/scif: fix hang at shutdown
a brand new dummy connection has to be used to properly
trigger the main thread termination and avoid timeout in
the main thread

cmr=v1.8.2:reviewer=hjelmn

This commit was SVN r31772.
2014-05-15 09:55:50 +00:00
Gilles Gouaillardet
3e19041425 Fix memory leak in mca_common_sm_rml_info_bcast(...)
The local buffer object is only used by the root of the group,
so move down the buffer declaration and initialization.

cmr=v1.8.2:reviewer=rhc

This commit was SVN r31771.
2014-05-15 08:41:14 +00:00
Nathan Hjelm
4113cfa03a pml/ob1: add missing OBJ_DESTRUCT
An OBJ_DESTRUCT was missing for mca_pml_ob1.send_ranges causing a
memory leak. Identified by valgrind.

cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r31768.
2014-05-14 21:15:45 +00:00
Nathan Hjelm
32a85c6d7d allocator/bucket: free all memory associated with a bucket allocator
when it is finalized.

Fixes a leak identified by valgrind.

cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r31767.
2014-05-14 21:15:39 +00:00
Nathan Hjelm
279c0a3ca7 comm: fix communicator subsystem leaks
This commit fixes two leaks:

 - We never destructed the attributes on MPI_COMM_WORLD. All other
   communicators that have attributes are released through
   ompi_comm_free which does the attribute destruction. For
   MPI_COMM_WORLD this is now done before the destructor is called.

 - Add missing OBJ_RELEASE for ompi_comm_f_to_c_table.

cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r31766.
2014-05-14 21:15:33 +00:00
Ralph Castain
a202139656 Looks like someone missed a '|' in this comparison, so correct it here
This commit was SVN r31760.
2014-05-14 16:53:22 +00:00
Nathan Hjelm
82934b173c btl/scif: make the exiting member volatile
cmr=v1.8.2:ticket=trac:4626

This commit was SVN r31757.

The following Trac tickets were found above:
  Ticket 4626 --> https://svn.open-mpi.org/trac/ompi/ticket/4626
2014-05-14 16:22:24 +00:00
Nathan Hjelm
97605a4002 btl/scif: fix hang at shutdown
scif_close is not causing scif_poll in the listening thread to return
as expected. To ensure the thread exits attempt to make a local
connection to wake up the thread before calling pthread_join.

cmr=v1.8.2:reviewer=ggouaillardet

This commit was SVN r31756.
2014-05-14 16:14:00 +00:00
George Bosilca
f27123a20d Fix the add_proc issue identified by Jeff: the TCP BTL now discard a
peer proc without TCP support instead of completely dropping TCP support for the entire job.

cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r31753.
2014-05-14 13:47:57 +00:00
Nathan Hjelm
518f188ad4 bml/base: ensure all components are closed when the framework is
closed

We were leaving the selected component open. This commit should
eliminate a leak detected by valgrind.

cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r31749.
2014-05-13 23:04:40 +00:00
Nathan Hjelm
c15c3dee4c common/ugni: fix bugs in r31557
This commit was SVN r31748.

The following SVN revision numbers were found above:
  r31557 --> open-mpi/ompi@c4c9bc1573
2014-05-13 22:37:01 +00:00
Nathan Hjelm
dd8de4d6eb btl/ugni: fix memory leaks and silence some warnings
The smsg_mboxes free list was not getting destructed. The construct
has been moved to module initialization and a matching destruct is now
in the module destruct.

This commit was SVN r31746.
2014-05-13 21:22:33 +00:00
Nathan Hjelm
9c45e4152d sbgp/base: fix memory leaks
This commit fixes memory leaks discovered in the sbgp setup code. We
were leaking an opal_argv as well as some list items. I took the
opportunity to clean up the code a little which includes making use of
the opal_argv_free function.

cmr=v1.8.2:reviewer=manjugv

This commit was SVN r31745.
2014-05-13 21:22:25 +00:00
Nathan Hjelm
ddd501c0d9 bcol/base: cleanup code and fix memory leak
The items in the available bcol list were getting leaked. This commit
fixes this leak. I also cleaned up the code a bit. This includes
making use of the opal_argv_free function.

cmr=v1.8.2:reviewer=manjugv

This commit was SVN r31744.
2014-05-13 21:22:18 +00:00