openmpi

Author	SHA1	Message	Date
Jeff Squyres	c1b43b6753	libfabric: the LIBADD should be unconditional The LIBADD for the common libfabric library does not belong down in the providers; it needs to be set when the libfabric core itself decides to build.	2014-12-17 14:02:08 -08:00
Jeff Squyres	f1a5d3a90d	configury: propagate a libtool shared lib version for libfabric	2014-12-17 13:36:01 -08:00
Jeff Squyres	6edc19d78d	libfabric: ensure that shell variables are initialized Ensure that the <provider>_happy shell variables are initialized to 0. Without this, the --without-libfabric case would leave them uninitialized, resulting in "test: -eq operator expecting a value" kinds of errors.	2014-12-17 13:36:01 -08:00
Jeff Squyres	e4b3c6f1c4	libfabric psm: fix (void*) dereference Committed upstream to libfabric as well.	2014-12-11 20:12:13 -08:00
Jeff Squyres	0f28233b35	libfabric: don't use __thread There's no real reason that this routine should use thread local storage. Plus, __thread appears to be a GCC extension.	2014-12-11 14:10:48 -08:00
Jeff Squyres	4551cab6f1	help messages: fix obvious typos	2014-12-11 12:23:33 -08:00
rolfv	f471b09ae9	Add support for CUDA Unified memory. Basically, add a new flag and disable some optimizations when that flag is detected. Lightly reviewed by bosilca.	2014-12-10 05:46:00 -08:00
Jeff Squyres	e6c8bfc201	libfabric: Gah -- also remove the "pragma pop" line Thanks to Nathan for pointing out that I missed snipping one line in `2f9c69f016` (I removed the trailing comment, but not the trailing pragma -- oops!).	2014-12-09 14:03:39 -08:00
Jeff Squyres	2f9c69f016	libfabric: use correct C99 notation for var-length array Nathan pointed out the correct C99 way to notate a variable-length array in a struct. This change has now been accepted upstream in libfabric.	2014-12-09 13:33:15 -08:00
Jeff Squyres	c40fd09d2a	libfabric: fix providers to conditionally add libs/flags Only allow the usnic and PSM providers to add CPPFLAGS and LIBADD flags when they are going to be built.	2014-12-09 07:15:25 -08:00
Jeff Squyres	45d6f29a27	Merge pull request #310 from yburette/master libfabric: add optional PSM provider.	2014-12-09 06:39:34 -08:00
Jeff Squyres	f5a07f651c	libfabric: Open MPI addition to stem a flood of warnings Add a pragma to not warn about zero-length arrays. This needs to be addressed upstream, but for now, do it here.	2014-12-09 06:04:37 -08:00
Jeff Squyres	f331f48796	libfabric: update embedded libfabric to 934a714 Update the embedded copy of libfabric to the github ofiwg/libfabric repo hash 934a714ca85f1a30a1e384a7d5f714ee962dc253.	2014-12-09 06:03:51 -08:00
Jeff Squyres	09d03a154b	libfabric: fix some typos in the usnic configury	2014-12-09 05:52:24 -08:00
Howard Pritchard	3a14c8eeff	fix build for cray xc The recent addition of the embedded libfabric broke the build on Cray XC/XE. This commit fixes this problem.	2014-12-08 22:21:13 -08:00
Yohann Burette	f90a7b51d2	libfabric: add optional PSM provider.	2014-12-08 16:49:41 -08:00
Yohann Burette	f33a9afd22	libfabric: fix typo in Makefile.am	2014-12-08 13:19:43 -08:00
Jeff Squyres	ac8e9d103c	libfabric: need to make AM_CONDITIONALs always be run Ensure that the usnic-specific AM_CONDITIONAL for the embedded libfabric is always run.	2014-12-08 11:51:26 -08:00
Jeff Squyres	d64881f040	psm_am.h: add missing file from libfabric snapshot This is just about to be fixed upstream, but "make dist" was not including this file in the libfabric tarball.	2014-12-08 11:39:08 -08:00
Jeff Squyres	d02756cdbb	libfabric: various configury updates 1. Ensure that CFLAGS is overridden properly. Move the setting of CFLAGS outside the AM_CONDITIONAL so that Automake doesn't get confused (because CFLAGS is already set inside an AM_CONDITIONAL -- moving it outside the conditional ensures that this local CFLAGS override trumps all other CFLAGS overrides). 2. Only build libfabric on Linux. Add a little more configury to ensure that we only try to build libfabric on Linux. 3. Remove a dead/unused file 4. Fix typo in condition check 5. Use "false", not "/bin/false"	2014-12-08 11:39:07 -08:00
Jeff Squyres	92818d1fa5	usnic: remove SVN-style $Id$ tokens (and #idents) This commit is also upstream in libfabric.	2014-12-08 11:39:07 -08:00
Jeff Squyres	c4e8d67515	libfabric: sync to upstream libfabric github Bring down the latest from the libfabric github, as of 9d051567c8eb7adc2af89516f94c7d0539152948.	2014-12-08 11:37:37 -08:00
Jeff Squyres	7a96b58882	common verbs: remove usnic-specific code Now that the usnic BTL uses libfabric, we can remove the usnic-specific code from opal/mca/common/verbs.	2014-12-08 11:37:37 -08:00
Jeff Squyres	984982790a	usnic: convert from verbs to libfabric (yay!) This commit represents the conversion of the usnic BTL from verbs to libfabric. For the moment, libfabric is embedded in Open MPI (currently in the usnic BTL). This is because the libfabric API is still changing, and also has not yet been released. Ultimately, this embedded copy of libfabric will likely disappear and the usnic BTL will rely on an external installation of libfabric. New configure options: * --with-libfabric: will cause configure to fail if libfabric support cannot be built * --without-libfabric: will prevent libfabric support from being built * --with-libfabric=DIR: use an external libfabric installation * --with-libfabric-libdir=LIBDIR: when paired with --with-libfabric=DIR, use LIBDIR for the libfabric installation library dir The --with-libnl3[-libdir] arguments are now gone.	2014-12-08 11:37:37 -08:00
George Bosilca	324e43909d	Enable CUDA support on Mac OS X.	2014-11-20 13:51:10 -06:00
Ralph Castain	780c93ee57	Per the PR and discussion on today's telecon, extend the process name definition as a two-field struct of uint32_t's down to the OPAL layer. This resolves issues created by prior commits that impacted both heterogeneous and SPARC support. This also simplifies the OMPI code base by removing the need for frequent memcpy's when transitioning between the OMPI/ORTE layers and OPAL. We recognize that this means other users of OPAL will need to "wrap" the opal_process_name_t if they desire to abstract it in some fashion. This is regrettable, and we are looking at possible alternatives that might mitigate that requirement. Meantime, however, we have to put the needs of the OMPI community first, and are taking this step to restore hetero and SPARC support.	2014-11-11 17:00:42 -08:00
Howard Pritchard	6c8c9cb4a3	another fix for --enable-dlopen for ugni btl missed a change to create libmca_common_ugni.la file correctly.	2014-11-10 13:40:59 -07:00
Howard Pritchard	5c08aa8552	enable ugni btl to work without disable-dlopen There were mistakes in the Makefiles for the ugni btl and mca/common/ugni that prevented the ugni btl from being used unless one happened to set the --disable-dlopen option on the config line. This commit fixes this problem.	2014-11-09 15:19:47 -07:00
Howard Pritchard	59f8d0a92d	cleanup ugni compiler warnings	2014-11-06 12:25:10 -07:00
Jeff Squyres	9334abc474	Makefile: fix problems with static linking Avoid a problem with double-dereference of a variable macro name (i.e., a macro with part of its name from an AC_SUBST, such as ```$(foo@BAR@baz)```). In what might be a bug in Automake 1.14.1, if you do a pattern like this: ```makefile lib_LTLIBRARIES = lib@A_PREFIX@a_lib.la noinst_LTLIBRARIES = lib@A_PREFIX@a_noinst.la lib@A_PREFIX@a_lib_la_SOURCES = a.c lib@A_PREFIX@a_noinst_la_SOURCES = $(lib@A_PREFIX@a_lib_la_SOURCES) ``` Then in the resulting Makefile, the value of ```$(lib@A_PREFIX@a_lib_la_OBJECTS)``` will be blank (when it really should be ```a.o```). To work around this potential bug, I've simply avoided doing double-dereferences like this, and effectively set the second ```_SOURCES``` line equal to ```a.c``` (just like the first ```_SOURCES``` line). Fixes #250.	2014-10-24 16:27:54 -07:00
Jeff Squyres	c22e1ae33b	configury: new OPAL_SET_LIB_PREFIX/ORTE_SET_LIB_PREFIX macros These two macros set the prefix for the OPAL and ORTE libraries, respectively. Specifically, the OPAL library will be named libPREFIXopen-pal.la and the ORTE library will be named libPREFIXopen-rte.la. These macros must be called, even if the prefix argument is empty. The intent is that Open MPI will call these macros with an empty prefix, but other projects (such as ORCM) will call these macros with a non-empty prefix. For example, ORCM libraries can be named liborcm-open-pal.la and liborcm-open-rte.la. This scheme is necessary to allow running Open MPI applications under systems that use their own versions of ORTE and OPAL. For example, when running MPI applications under ORTE, if the ORTE and OPAL libraries between OMPI and ORCM are not identical (which, because they are released at different times, are likely to be different), we need to ensure that the OMPI applications link against their ORTE and OPAL libraries, but the ORCM executables link against their ORTE and OPAL libraries.	2014-10-22 10:32:19 -07:00
Howard Pritchard	ebc368d26b	remove GNI_RDMAMODE_FENCE bit in GNI_PostRdma The GNI_RDMAMODE_FENCE bit was a left over from async progress work that is not needed at this point in the gni BTL. Removing the bit also allows for the removal of the GNI_CDM_MODE_BTE_SINGLE_CHANNEL bit from the GNI_CdmCreate call.	2014-10-09 12:41:19 -06:00
Howard Pritchard	9947758d98	initial thread safety for ugni btl This commit adds initial ugni thread safety support. With this commit, sun thread tests (excepting MPI-2 RMA) pass with various process counts and threads/process. Also osu_latency_mt passes.	2014-10-08 10:13:22 -06:00
rolfv	697b18db63	Making async copy the default	2014-10-03 06:42:18 -07:00
Rolf vandeVaart	399dc3db43	Code to check for managed memory. Configure support also. This commit was SVN r32801.	2014-09-26 16:24:45 +00:00
Rolf vandeVaart	35858f837a	Revert r32713. Have different code for this. This commit was SVN r32800. The following SVN revision numbers were found above: r32713 --> open-mpi/ompi@9a2bab0e27	2014-09-26 14:56:18 +00:00
Rolf vandeVaart	9a2bab0e27	Add support for detecting CUDA managed memory. Disabled for now. This commit was SVN r32713.	2014-09-11 21:07:17 +00:00
Ralph Castain	f1a33b6476	Use the accessor function to get the jobid and vpid This commit was SVN r32672.	2014-09-06 19:18:21 +00:00
Howard Pritchard	2a12fd833d	Fix compile problem from pmix merge This commit was SVN r32626.	2014-08-28 22:14:12 +00:00
Ralph Castain	aec5cd08bd	Per the PMIx RFC: WHAT: Merge the PMIx branch into the devel repo, creating a new OPAL “pmix” framework to abstract PMI support for all RTEs. Replace the ORTE daemon-level collectives with a new PMIx server and update the ORTE grpcomm framework to support server-to-server collectives WHY: We’ve had problems dealing with variations in PMI implementations, and need to extend the existing PMI definitions to meet exascale requirements. WHEN: Mon, Aug 25 WHERE: https://github.com/rhc54/ompi-svn-mirror.git Several community members have been working on a refactoring of the current PMI support within OMPI. Although the APIs are common, Slurm and Cray implement a different range of capabilities, and package them differently. For example, Cray provides an integrated PMI-1/2 library, while Slurm separates the two and requires the user to specify the one to be used at runtime. In addition, several bugs in the Slurm implementations have caused problems requiring extra coding. All this has led to a slew of #if’s in the PMI code and bugs when the corner-case logic for one implementation accidentally traps the other. Extending this support to other implementations would have increased this complexity to an unacceptable level. Accordingly, we have: * created a new OPAL “pmix” framework to abstract the PMI support, with separate components for Cray, Slurm PMI-1, and Slurm PMI-2 implementations. * Replaced the current ORTE grpcomm daemon-based collective operation with an integrated PMIx server, and updated the grpcomm APIs to provide more flexible, multi-algorithm support for collective operations. At this time, only the xcast and allgather operations are supported. * Replaced the current global collective id with a signature based on the names of the participating procs. This allows an unlimited number of collectives to be executed by any group of processes, subject to the requirement that only one collective can be active at a time for a unique combination of procs. Note that a proc can be involved in any number of simultaneous collectives - it is the specific combination of procs that is subject to the constraint * removed the prior OMPI/OPAL modex code * added new macros for executing modex send/recv to simplify use of the new APIs. The send macros allow the caller to specify whether or not the BTL supports async modex operations - if so, then the non-blocking “fence” operation is used, if the active PMIx component supports it. Otherwise, the default is a full blocking modex exchange as we currently perform. * retained the current flag that directs us to use a blocking fence operation, but only to retrieve data upon demand This commit was SVN r32570.	2014-08-21 18:56:47 +00:00
Rolf vandeVaart	c53c981506	Fix initialization and cleanup code for CUDA-aware code. Eliminates all resource leaks. This commit was SVN r32512.	2014-08-12 19:41:46 +00:00
Gilles Gouaillardet	b565e69b86	check-help-strings cleanup This commit was SVN r32491.	2014-08-11 03:19:57 +00:00
Rolf vandeVaart	876232e6d0	Fix previous checkin. This commit was SVN r32468.	2014-08-08 18:58:25 +00:00
George Bosilca	5df7451429	Fix the #include for the common sm. This commit was SVN r32467.	2014-08-08 18:30:27 +00:00
Howard Pritchard	a1f6ecf1e6	initial fixes for ugni btl move to opal This commit was SVN r32466.	2014-08-08 18:02:46 +00:00
Rolf vandeVaart	ac16af0bff	Small fix for case where no libcuda.so.1 is found. This commit was SVN r32461.	2014-08-08 16:05:19 +00:00
Ralph Castain	db89071dc2	Cleanup the moved component's Makefile.am to use the opal instead of ompi directories This commit was SVN r32370.	2014-07-31 04:41:04 +00:00
Ralph Castain	d674e22433	Remove stale include as header no longer exists. Add missing header This commit was SVN r32336.	2014-07-29 01:24:27 +00:00
Nathan Hjelm	0e47441333	Remove unused files This commit was SVN r32335.	2014-07-28 22:01:16 +00:00
Nathan Hjelm	1407c1f501	Remove RML code from common/sm The only user of this code was coll/sm. I implemented a basic replacement for the removed code. This gets the trunk compiling again with --disable-dlopen. This commit was SVN r32333.	2014-07-28 22:00:12 +00:00

80 commits