openmpi/opal/mca/rcache/base/rcache_base_mem_cb.c
Ralph Castain 33ab928e1b ompi_proc_t size reduction: part 1
We currently save the hostname of a proc when we create the ompi_proc_t for it. This was originally done because the only method we had for discovering the host of a proc was to include that info in the modex, and we had to therefore store it somewhere proc-local. Obviously, this carried a memory penalty for storing all those strings, and so we added a "cutoff" parameter so that we wouldn't collect hostnames above a certain number of procs.

Unfortunately, this still results in an 8-byte/proc memory cost as we have a char* pointer in the opal_proc_t that is contained in the ompi_proc_t so that we can store the hostname of the other procs if we fall below the cutoff. At scale, this can consume a fair amount of memory.

With the switch to relying on PMIx, there is no longer a need to cache the proc hostnames. Using the "optional" feature of PMIx_Get, we restrict the retrieval to be purely proc-local - i.e., we retrieve the info either via shared memory or from within the proc-internal hash storage (depending upon the active PMIx components). Thus, the retrieval of a hostname is purely a local operation involving no communication.

All RMs are required to provide a complete hostname map of all procs at startup. Thus, we have full access to all hostnames without including them in a modex or having to cache them on each proc. This allows us to remove the char* pointer from the opal_proc_t, saving us 8 bytes/proc.

Unfortunately, PMIx_Get does not currently support the return of a static pointer to memory. Thus, even though PMIx has the hostname in its memory, it can only return a malloc'd version of it. I have therefore ensured that the return from opal_get_proc_hostname is consistently malloc'd and free'd wherever used. This shouldn't be a burden as the hostname is only used in one of two circumstances:

(a) in an error message
(b) in a verbose output for debugging purposes

Thus, there should be no performance penalty associated with the malloc/free requirement. PMIx will eventually be returning static pointers, and so we can eventually simplify this method and return a "const char*" - but as noted, this really isn't an issue even today.
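The ownership rule described above (fetch a malloc'd hostname, use it once, free it) can be sketched as follows. This is only an illustration: `lookup_hostname_copy` and `format_peer_error` are hypothetical names, and the lookup is a stand-in for `opal_get_proc_hostname` with made-up host data so that the snippet compiles without Open MPI.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for opal_get_proc_hostname(): returns a
 * heap-allocated copy of the hostname (or NULL) that the caller owns. */
char *lookup_hostname_copy(int rank)
{
    static const char *hosts[] = { "node0", "node1" };  /* made-up data */
    if (rank < 0 || rank >= 2) {
        return NULL;
    }
    size_t n = strlen(hosts[rank]) + 1;
    char *copy = malloc(n);
    if (copy != NULL) {
        memcpy(copy, hosts[rank], n);
    }
    return copy;
}

/* Error-message path: fetch the hostname, use it, free it right away. */
int format_peer_error(char *out, size_t outlen, int rank)
{
    char *host = lookup_hostname_copy(rank);
    int n = snprintf(out, outlen, "peer %d on host %s is unreachable",
                     rank, (host != NULL) ? host : "unknown");
    free(host);  /* free(NULL) is a no-op, so no extra check is needed */
    return n;
}
```

Because the string is only needed while the message is being formatted, the copy never outlives the call, which is why the extra malloc/free should not matter outside of error and debug paths.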

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-23 12:49:44 -07:00

94 lines
3.6 KiB
C

/* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
/*
* Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
* University Research and Technology
* Corporation. All rights reserved.
* Copyright (c) 2004-2007 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
* University of Stuttgart. All rights reserved.
* Copyright (c) 2004-2005 The Regents of the University of California.
* All rights reserved.
* Copyright (c) 2009 Cisco Systems, Inc. All rights reserved.
* Copyright (c) 2012-2015 Los Alamos National Security, LLC.
* All rights reserved.
* Copyright (c) 2020 Intel, Inc. All rights reserved.
* $COPYRIGHT$
*
* Additional copyrights may follow
*
* $HEADER$
*/
/**
* @file
*/
#include "opal_config.h"
#ifdef HAVE_UNISTD_H
#include <unistd.h>
#endif
#include "opal/util/show_help.h"
#include "opal/util/proc.h"
#include "opal/runtime/opal_params.h"
#include "opal/mca/rcache/base/rcache_base_mem_cb.h"
#include "opal/mca/rcache/base/base.h"
#include "opal/mca/mca.h"
#include "opal/memoryhooks/memory.h"
static char msg[512];
/*
* memory hook callback, called when memory is free'd out from under
* us. Be wary of the from_alloc flag -- if you're called with
* from_alloc==true, then you cannot call malloc (or any of its
* friends)!
*/
void mca_rcache_base_mem_cb (void* base, size_t size, void* cbdata, bool from_alloc)
{
    mca_rcache_base_selected_module_t* current;
    int rc;

    /* Only do anything meaningful if the OPAL layer is up and running
       and size != 0 */
    if ((from_alloc && (!opal_initialized)) || size == 0) {
        return;
    }

    OPAL_LIST_FOREACH(current, &mca_rcache_base_modules, mca_rcache_base_selected_module_t) {
        if (current->rcache_module->rcache_invalidate_range != NULL) {
            rc = current->rcache_module->rcache_invalidate_range (current->rcache_module,
                                                                  base, size);
            if (rc != OPAL_SUCCESS) {
                if (from_alloc) {
                    /* We cannot call malloc here, so format into the
                       static buffer and write it to stderr directly */
                    int len;
                    len = snprintf(msg, sizeof(msg), "[%s:%05d] Attempt to free memory that is still in "
                                   "use by an ongoing MPI communication (buffer %p, size %lu). MPI job "
                                   "will now abort.\n", opal_process_info.nodename,
                                   getpid(), base, (unsigned long) size);
                    /* snprintf returns the untruncated length (or a
                       negative value on error); clamp it so that
                       write() never reads past the end of msg */
                    if (len < 0) {
                        len = 0;
                    } else if ((size_t) len >= sizeof(msg)) {
                        len = (int) (sizeof(msg) - 1);
                    }
                    msg[sizeof(msg) - 1] = '\0';
                    write(2, msg, len);
                } else {
                    opal_show_help("help-rcache-base.txt",
                                   "cannot deregister in-use memory", true,
                                   current->rcache_component->rcache_version.mca_component_name,
                                   opal_process_info.nodename,
                                   base, (unsigned long) size);
                }
                /* We're in a callback from somewhere; we can't do
                   anything meaningful to pass an error back up. :-(
                   So just exit. Call _exit() so that we don't try to
                   call anything on the way out -- just exit!
                   (remember that we're in a callback, and state may
                   be very undefined at this point...) */
                _exit(1);
            }
        }
    }
}
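
The from_alloc branch above reports the error without allocating: it formats into a static buffer and hands the bytes to write(2). A minimal, standalone sketch of that pattern is shown below. The names `abort_msg`, `format_abort_msg`, and `report_and_exit` are hypothetical and not part of Open MPI, and the message text is shortened. The helper returns the number of bytes that are actually in the buffer, since snprintf reports the length the full message would have had.

```c
#include <stdio.h>
#include <unistd.h>

/* Static storage, so nothing has to be allocated when the error is reported. */
char abort_msg[128];

/* Format the message into buf and return how many bytes may safely be
 * passed to write(): the snprintf result clamped to what fits in buf. */
size_t format_abort_msg(char *buf, size_t buflen, const char *host,
                        long pid, void *base, unsigned long size)
{
    int len = snprintf(buf, buflen, "[%s:%05ld] buffer %p, size %lu still in use\n",
                       host, pid, base, size);
    if (len < 0) {
        return 0;                               /* formatting error */
    }
    if ((size_t) len >= buflen) {
        return (buflen > 0) ? buflen - 1 : 0;   /* output was truncated */
    }
    return (size_t) len;
}

/* Report on stderr and leave immediately, skipping atexit handlers. */
void report_and_exit(const char *host, void *base, unsigned long size)
{
    size_t len = format_abort_msg(abort_msg, sizeof(abort_msg), host,
                                  (long) getpid(), base, size);
    (void) write(STDERR_FILENO, abort_msg, len);
    _exit(1);
}
```

Keeping the formatting in its own function makes the truncation case easy to check with a deliberately small buffer, without having to trigger the exit path.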