
Fix bugs that were causing leaks in finalize.

This commit fixes leaks of bml endpoints in finalize. A summary of the
bugs/fixes is below.

 1) ompi_mpi_finalize used ompi_proc_all to get the list of procs but
    never released the reference to them (ompi_proc_all called
    OBJ_RETAIN on all the procs returned). When calling del_procs at
    finalize it should suffice to call ompi_proc_world which does not
    increment the reference count.

 2) del_procs is called BEFORE ompi_comm_finalize. This leaves the
    references to the procs from calling the pml_add_comm
    function. The fix is to reorder the calls to do ompi_comm_finalize,
    del_procs, pml_finalize instead of del_procs, pml_finalize,
    ompi_comm_finalize.

 3) The check in del_procs in r2 checked for a reference count of
    1. This is incorrect. At this point there should be 2 references:
    1 from ompi_proc, and another from the add_procs. The fix is to
    change this check to look for a reference count of 2. This check
    makes me extremely uncomfortable as nothing will call del_procs if
    the reference count of a proc is not 2 when del_procs is
    called. Maybe there should be an assert since this is a developer
    error IMHO.

cmr=v1.8.2:reviewer=bosilca

This commit was SVN r31782.

The following SVN revision numbers were found above:
  r2 --> open-mpi/ompi@58fdc18855
This commit is contained in:
Nathan Hjelm 2014-05-15 18:28:03 +00:00
parent 55f0dcb81a
commit faf008f527
2 changed files with 18 additions and 12 deletions


@@ -449,7 +449,11 @@ static int mca_bml_r2_del_procs(size_t nprocs,
     for(p = 0; p < nprocs; p++) {
         ompi_proc_t *proc = procs[p];
-        if(((opal_object_t*)proc)->obj_reference_count == 1) {
+        /* We must check that there are 2 references to the proc (not 1). The
+         * first reference belongs to ompi/proc the second belongs to the bml
+         * since we retained it. We will release that reference at the end of
+         * the loop below. */
+        if(((opal_object_t*)proc)->obj_reference_count == 2) {
             del_procs[n_del_procs++] = proc;
         }
     }


@@ -1,4 +1,4 @@
-/* -*- Mode: C; c-basic-offset:4 ; -*- */
+/* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
 /*
  * Copyright (c) 2004-2010 The Trustees of Indiana University and Indiana
  * University Research and Technology
@@ -11,7 +11,7 @@
  * Copyright (c) 2004-2005 The Regents of the University of California.
  * All rights reserved.
  * Copyright (c) 2006-2013 Cisco Systems, Inc. All rights reserved.
- * Copyright (c) 2006-2012 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2006-2014 Los Alamos National Security, LLC. All rights
  * reserved.
  * Copyright (c) 2006 University of Houston. All rights reserved.
  * Copyright (c) 2009 Sun Microsystems, Inc. All rights reserved.
@@ -138,11 +138,6 @@ int ompi_mpi_finalize(void)
      */
     (void)mca_pml_base_bsend_detach(NULL, NULL);
-    nprocs = 0;
-    procs = ompi_proc_all(&nprocs);
-    MCA_PML_CALL(del_procs(procs, nprocs));
-    free(procs);
 #if OMPI_ENABLE_PROGRESS_THREADS == 0
     opal_progress_set_event_flag(OPAL_EVLOOP_ONCE | OPAL_EVLOOP_NONBLOCK);
 #endif
@@ -280,14 +275,21 @@ int ompi_mpi_finalize(void)
         return ret;
     }
+    /* free communicator resources. this MUST come before finalizing the PML
+     * as this will call into the pml */
+    if (OMPI_SUCCESS != (ret = ompi_comm_finalize())) {
+        return ret;
+    }
+    nprocs = 0;
+    procs = ompi_proc_world(&nprocs);
+    MCA_PML_CALL(del_procs(procs, nprocs));
+    free(procs);
     /* free pml resource */
     if(OMPI_SUCCESS != (ret = mca_pml_base_finalize())) {
         return ret;
     }
-    /* free communicator resources */
-    if (OMPI_SUCCESS != (ret = ompi_comm_finalize())) {
-        return ret;
-    }
     /* free requests */
     if (OMPI_SUCCESS != (ret = ompi_request_finalize())) {