* do not add -I/.../include/hcoll -I /.../include/hcoll/api to CPPFLAGS
* allow configure --with-hcoll
* search hcoll libs in both DIR/lib and DIR/lib64
* fix the description of the --with-hcoll option
This commit adds support for project_framework_component_* parameter
matching. This is the first step in allowing the same framework name
in multiple projects. This change also bumps the MCA component version
to 2.1.0.
All master frameworks have been updated to use the new component
versioning macro. An mca.h has been added to each project to add a
project specific versioning macro of the form
PROJECT_MCA_VERSION_2_1_0.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
Use of the old ompi_free_list_t and ompi_free_list_item_t is
deprecated. These classes will be removed in a future commit.
This commit updates the entire code base to use opal_free_list_t and
opal_free_list_item_t.
Notes:
OMPI_FREE_LIST_*_MT -> opal_free_list_* (uses opal_using_threads ())
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds an owner file in each of the component directories
for each framework. This allows for a simple script to parse
the contents of the files and generate, among other things, tables
to be used on the project's wiki page. Currently there are two
"fields" in the file, an owner and a status. A tool to parse
the files and generate tables for the wiki page will be added
in a subsequent commit.
new flag to ompi_info that allows a user to print all MCA variables of a specific type.
--type version_string
This command will print all MCA variables of type version_string.
This feature was developed by Elena Shipunova and was reviewed by Josh Ladd.
This commit was SVN r32166.
- -check-shmem-params is OFF by default. It checks OSHMEM API params and will abort on bad input
- hcoll do not save fallback coll pointers for unsupported collectives.
fixed by Val, Roman, reviewed by Miked/Igor
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30995.
Adds coll_hcoll_np mca parameter similar to that of fca component (defaults to 32). Those who use hcoll be aware that from now on the communicators less than 32 procs will run w/o hcoll by default. - Resolves fallback issue in case libhcoll runs out of allowed contexts. The solution is moving hcoll_context_create from comm_enable to comm_query. Shortly, comm_enable should never return OMPI_ERROR in the coll component with highest priority (hcoll). Otherwise the ompi coll_base_select will unselect the coll funtion pointers and module references leaving the communicator w/o coll pointer. This will cause the fail. Same behavior can be reproduced even with tuned if one would hardcore some "return OMPI_ERROR" into it's module_enable funtion. - Additionally, removed all the dead code under #if 0; removed unused variables (path for library, active_modules list) and classes (module list wrapper)
Fixed by Val, Reviewed by Devendar/Josh/Miked
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30341.
Set comm attribute with keyval.
Wait for pending hcoll module tasks in comm delete callback where PML
still valid on the communicator. safely destroy hcoll context during
hcoll module destructor.
Author: Devendar Bureddy
reviewed by miked
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30175.
- HCOLL close without init
- Call hcoll progress after comm finalize
- mpirun default for coll_hcoll_enable is 1
fixed by Igor, reviewed by miked
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30156.
pkg{data,lib,includedir}, use our own ompi{data,lib,includedir}, which is
always set to {datadir,libdir,includedir}/openmpi. This will keep us from
having help files in prefix/share/open-rte when building without Open MPI,
but in prefix/share/openmpi when building with Open MPI.
This commit was SVN r30140.
- Modifications to coll/hcoll component related to the changes in the libhcoll API.
Now, hcoll_destroy_context accepts one more parameter that indicates if the context was
really destroyed as a result of the call.
This new "non-blocking" context destruction fixes hang discovered in IMB with mcast enabled.
- Clean up all the left contexts (if any) on the comm_world destruction.
fixed by Val, reviewed by miked
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30055.
1. Change in rte api implementation: now comm_world used to do p2p.
This allows to not worry about other comms being destroyed.
2. added a notification mechanism with a help of which runtime can say libhcoll that RTE api can not be used any longer.
pass a pointer to a flag, and its size to libhcoll.
The flag changes when the RTE is no longer available.
Currently this flag is just ompi_mpi_finalized global bool value.
cmr=v1.7.3:reviewer=jladd
This commit was SVN r29331.
many builds. I am temporarily .ompi_ignore'ing this component until
it can be fixed by its owner.
* It calls AC_MSG_ERROR, which configure.m4 scripts are ''never''
supposed to do. If you don't want to build, then call $2.
* All static and --disable-dlopen builds are broken; they fall afoul
of whatever test configure.m4 is doing and therefore error out of
configure entirely (vs. simply disabling the hcoll component).
* There appear to be multiple shell scripting errors in the
configure.m4. Here's the output of "./configure --disable-dlopen":
{{{
--- MCA component coll:hcoll (m4 configuration macro)
checking for MCA component coll:hcoll compile mode... static
checking --with-hcoll value... simple ok (unspecified)
./configure: line 421: test: basic: integer expression expected
configure: error: Can not use coll/hcoll and coll/ml (static build)
simultaneously. You have two options:
1. Use static build & disable ml with:
--enable-mpi-no-build=coll-ml
2. Use dso build for ML & disable ml at runtime: -mca
coll self
./configure: line 310: return: basic: numeric argument required
./configure: line 320: exit: basic: numeric argument required
}}}
Finally, all of these configure.m4 errors aside, I don't understand
why there is a ''compile-time'' exclusion between the hcoll and ml
components. Why isn't this a ''run-time'' decision? Having what
seems to be an unnecessary compile-time exclusion goes against the
general Open MPI philosophy.
Note: Open MPI 1.7 is also broken in all the same ways. I suggest
that the RM's .ompi_ignore hcoll over there, too.
Mellanox: please fix.
This commit was SVN r28748.
value to signal that the operation of retrieving the element from the free list
failed. However in this case the returned pointer was set to NULL as well, so the
error code was redundant. Moreover, this was a continuous source of warnings when
the picky mode is on.
The attached parch remove the rc argument from the OMPI_FREE_LIST_GET and
OMPI_FREE_LIST_WAIT macros, and change to check if the item is NULL instead of
using the return code.
This commit was SVN r28722.