#
# Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
# University Research and Technology
# Corporation. All rights reserved.
# Copyright (c) 2004-2005 The University of Tennessee and The University
# of Tennessee Research Foundation. All rights
# reserved.
# Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
# University of Stuttgart. All rights reserved.
# Copyright (c) 2004-2005 The Regents of the University of California.
# All rights reserved.
# Copyright (c) 2008 Sun Microsystems, Inc. All rights reserved.
# Copyright (c) 2014 Cisco Systems, Inc. All rights reserved.
Move from the use of regex to compression
We've been fighting the battle of trying to create a regex generator and
parser that can handle arbitrary hostname schemes - without long-term
success. The worst of it is that there is no way of checking to see if
the computed regex is correct short of parsing it and doing a
character-by-character comparison with the original string. Ugh...there
has to be a better solution.
One option is to investigate using 3rd-party regex libraries as
those are coming from communities whose sole focus is resolving that
problem. However, someone would need to spend the time to investigate
it, and we'd have to find a license-friendly implementation.
Another option is to quit beating our heads against the wall and just
compress the information. It won't be as much of a reduction, but we
also won't keep hitting scenarios where things break. In this case, it
seems that "perfection" is definitely the enemy of "good enough".
This PR implements the compression option while retaining the
possibility of people adding regex-generating components. The
compression code used in ORTE is consolidated into the opal/compress
framework. That framework previously held bzip and gzip components for
use in compressing checkpoint files - since we no longer support C/R
(checkpoint/restart), I have .opal_ignore'd those components.
However, I have left the original framework APIs alone in case someone
ever decides to redo C/R. The APIs of interest here are added to the
framework - specifically, the "compress_block" and "decompress_block"
functions. I then moved the ORTE zlib compression code into a new
component in this framework.
Unfortunately, the framework currently is a single-select one - i.e.,
only one active component at a time. Since I .opal_ignore'd the other
two and made the priority of zlib high, this isn't a problem. However,
if someone wants to re-enable bzip/gzip or add another component, they
might need to transition opal/compress to a multi-select framework.
Included changes:
* Consolidate the compression code into the opal/compress framework
* Move the ORTE zlib compression code into a new opal/compress/zlib
component
* Ignore the bzip and gzip components in opal/compress framework
* Add a "compress_base_limit" MCA param to set the threshold above which
we compress data - defaults to 4096 bytes
* Delete stale brucks and rcd components from orte/grpcomm framework
* Delete the orte/regx framework
* Update the launch system to use opal/compress instead of string regex
* Provide a default module if no zlib is available
* Fix some misc multi-node issues
* Properly generate the nidmap in response to a "connection warmup"
message so the remote daemon knows the children it needs to launch.
* Remove stale references to orte_node_regex
* opal_byte_object_t's are not OPAL objects - properly release allocated
memory.
* Set the topology
* Currently only handling homogeneous case
* Update the compress framework files to conform
* Consolidate open/close into one "frame" file. Ensure we open/close the
framework
Signed-off-by: Ralph Castain <rhc@pmix.org>
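The threshold behavior described above ("compress_base_limit", default 4096 bytes, with "compress_block" and "decompress_block" as the entry points) can be sketched roughly as follows. This is only an illustration in Python using the standard zlib module, not the opal/compress code: the function signatures, the returned flag, and the fallback to the raw bytes when compression does not shrink the data are all assumptions made for the example.

```python
import zlib

# Assumed stand-in for the "compress_base_limit" MCA param (default 4096 bytes).
COMPRESS_BASE_LIMIT = 4096


def compress_block(data: bytes, limit: int = COMPRESS_BASE_LIMIT):
    """Return (was_compressed, payload) for a block of bytes.

    Blocks at or below the limit are passed through untouched. The
    "keep the original if zlib did not help" check is an assumption of
    this sketch, not something stated in the commit message.
    """
    if len(data) <= limit:
        return False, data
    packed = zlib.compress(data)
    if len(packed) >= len(data):
        return False, data
    return True, packed


def decompress_block(was_compressed: bool, payload: bytes) -> bytes:
    """Undo compress_block(); pass-through blocks are returned as-is."""
    return zlib.decompress(payload) if was_compressed else payload


if __name__ == "__main__":
    # A long, repetitive host list is the kind of data the launch system sends.
    hosts = ",".join(f"node{i:05d}" for i in range(2000)).encode()
    flag, payload = compress_block(hosts)
    print(flag, len(hosts), len(payload))
    assert decompress_block(flag, payload) == hosts
```

Carrying an explicit "was compressed" flag alongside the payload lets the receiver handle both cases without guessing, which is also what a default module would need when no zlib is available.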
2019-01-30 03:02:21 +03:00
# Copyright (c) 2014-2019 Intel, Inc. All rights reserved.
# Copyright (c) 2016 Research Organization for Information Science
# and Technology (RIST). All rights reserved.
# $COPYRIGHT$
#
# Additional copyrights may follow
#
# $HEADER$
#

# This makefile.am does not stand on its own - it is included from orte/Makefile.am

include $(top_srcdir)/Makefile.ompi-rules

dist_ortedata_DATA += util/hostfile/help-hostfile.txt \
        util/dash_host/help-dash-host.txt \
        util/help-regex.txt

nodist_man_MANS = util/hostfile/orte_hosts.7

# We use $(am__dirstamp) instead of creating our own dirstamp because
# util/hostfile contains source code, so automake already creates that
# directory and its $(am__dirstamp). We found this usage in the generated
# util/Makefile.
$(nodist_man_MANS): util/hostfile/$(am__dirstamp) $(top_builddir)/opal/include/opal_config.h

EXTRA_DIST += $(nodist_man_MANS:.7=.7in)

AM_LFLAGS = -Porte_util_hostfile_
LEX_OUTPUT_ROOT = lex.orte_util_hostfile_

headers += \
        util/name_fns.h \
        util/proc_info.h \
        util/session_dir.h \
        util/show_help.h \
        util/error_strings.h \
        util/context_fns.h \
        util/parse_options.h \
        util/pre_condition_transports.h \
        util/hnp_contact.h \
        util/hostfile/hostfile.h \
        util/hostfile/hostfile_lex.h \
        util/dash_host/dash_host.h \
        util/comm/comm.h \
Fat SMPs (i.e., systems with nodes containing large numbers of cpus) were failing to start due to connection failures of the opal/pmix support. The root cause was that (a) we were setting the client socket to non-blocking before calling connect, and (b) the server was using the event library to harvest the accepts, and also did the handshake while in that event. So the server would back up beyond the connection backlog limit, and we would fail.
Changing the client to leave its socket as blocking during the connect doesn't solve the problem by itself - you also have to introduce a sleep delay once the backlog is hit to avoid simply machine-gunning your way through retries. This delay is somewhat difficult to tune, as you don't want to unnecessarily prolong startup time.
We've solved this before by adding a listening thread that simply reaps accepts and shoves them into the event library for subsequent processing. This would resolve the problem, but meant yet another daemon-level thread. So I centralized the listening thread support and let multiple elements register listeners on it. Thus, each daemon now has a single listening thread that reaps accepts from multiple sources - for now, the orte/pmix server and the oob/usock support are using it. I'll add in the oob/tcp component later.
This still didn't fully resolve the SMP problem, especially on coprocessor cards (e.g., KNC). Removing the shared memory dstore support helped further improve the behavior - it looks like there is some kind of memory paging issue there that needs further understanding. Given that the shared memory support was about to be lost when I bring over the PMIx integration (until it is restored in that library), it seemed like a reasonable thing to just remove it at this point.
2015-05-30 00:28:26 +03:00
        util/attr.h \
        util/listener.h \
        util/threads.h \
        util/nidmap.h

configury: new OPAL_SET_LIB_PREFIX/ORTE_SET_LIB_PREFIX macros
These two macros set the prefix for the OPAL and ORTE libraries,
respectively. Specifically, the OPAL library will be named
libPREFIXopen-pal.la and the ORTE library will be named
libPREFIXopen-rte.la.
These macros must be called, even if the prefix argument is empty.
The intent is that Open MPI will call these macros with an empty
prefix, but other projects (such as ORCM) will call these macros with
a non-empty prefix. For example, ORCM libraries can be named
liborcm-open-pal.la and liborcm-open-rte.la.
This scheme is necessary to allow running Open MPI applications under
systems that use their own versions of ORTE and OPAL. For example,
when running MPI applications under ORTE, if the ORTE and OPAL
libraries between OMPI and ORCM are not identical (which, because they
are released at different times, are likely to be different), we need
to ensure that the OMPI applications link against their ORTE and OPAL
libraries, but the ORCM executables link against their ORTE and OPAL
libraries.
2014-10-22 16:49:58 +04:00
lib@ORTE_LIB_PREFIX@open_rte_la_SOURCES += \
        util/error_strings.c \
        util/name_fns.c \
        util/proc_info.c \
        util/session_dir.c \
        util/show_help.c \
        util/context_fns.c \
        util/parse_options.c \
        util/pre_condition_transports.c \
        util/hnp_contact.c \
        util/hostfile/hostfile_lex.l \
        util/hostfile/hostfile.c \
        util/dash_host/dash_host.c \
        util/comm/comm.c \
        util/attr.c \
        util/listener.c \
        util/nidmap.c

# Remove the generated man pages
distclean-local:
	rm -f $(nodist_man_MANS)

maintainer-clean-local:
	rm -f util/hostfile/hostfile_lex.c