openmpi

Author	SHA1	Message	Date
Nathan Hjelm	afae924e29	coll/ml: fix some warnings and the spelling of indices This commit fixes one warning that should have caused coll/ml to segfault on reduce. The fix should be correct but we will continue to investigate. cmr=v1.7.5:ticket=trac:4158 This commit was SVN r30477. The following Trac tickets were found above: Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158	2014-01-29 18:44:21 +00:00
Ralph Castain	b32556e6dc	Fixes trac:4143 After IM with Nathan, apply patch from ticket after verification by Paul Hargrove that it fixes the problem on non-x86 32-bit platforms Verified by Paul, RM-approved cmr=v1.7.4:reviewer=ompi-gk1.7 This commit was SVN r30411. The following Trac tickets were found above: Ticket 4143 --> https://svn.open-mpi.org/trac/ompi/ticket/4143	2014-01-24 17:56:52 +00:00
Mike Dubman	071838bb0a	HCOLL: call hcoll_finalize and hcoll progress unregister in case of hcoll module query failures fixed by Elena, reviewed by Val/Miked cmr=v1.7.4:reviewer=ompi-rm1.7 This commit was SVN r30390.	2014-01-23 07:29:23 +00:00
Nathan Hjelm	7ba8bd81fa	coll/ml: remove debug fprintfs cmr=v1.7.5:ticket=trac:4158 This commit was SVN r30367. The following Trac tickets were found above: Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158	2014-01-22 17:21:05 +00:00
Nathan Hjelm	82d996fb76	coll/ml: cleanup some merge related errors cmr=v1.7.5:ticket=trac:4158 This commit was SVN r30366. The following Trac tickets were found above: Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158	2014-01-22 16:48:09 +00:00
Nathan Hjelm	1a021b8f2d	coll/ml: add support for blocking and non-blocking allreduce, reduce, and allgather. The new collectives provide a significant performance increase over tuned for small and medium messages. We are initially setting the priority lower than tuned until this has had some time to soak in the trunk. Please set coll_ml_priority to 90 for MTT runs. Credit for this work goes to Manjunath Gorentla Venkata (ORNL), Pavel Shamis (ORNL), and Nathan Hjelm (LANL). Commit details (for reference): Import ORNL's collectives for MPI_Allreduce, MPI_Reduce, and MPI_Allgather. We need to take the basesmuma header into account when calculating the ptpcoll small message thresholds. Add a define to bcol.h indicating the maximum header size so we can take the header into account while not making ptpcoll dependent on information from basesmuma. This resolves an issue with allreduce where ptpcoll overwrites the header of the next buffer in the basesmuma bank. Fix reduce and make a sequential collective launcher in coll_ml_inlines.h The root calculation for reduce was wrong for any root != 0. There are four possibilities for the root: - The root is not the current process but is in the current hierarchy. In this case the root is the index of the global root as specified in the root vector. - The root is not the current process and is not in the next level of the hierarchy. In this case 0 must be the local root since this process will never communicate with the real root. - The root is not the current process but will be in the next level of the hierarchy. In this case the current process must be the root. - I am the root. The root is my index. Tested with IMB which rotates the root on every call to MPI_Reduce. Consider IMB the reproducer for the issue this commit solves. Make the bcast algorithm decision an enumerated variable Resolve various assert failures when destructing coll ml requests. 
Two issues: - Always reset the request to be invalid before returning it to the free list. This will avoid an assert in ompi_request_t's destructor. OMPI_REQUEST_FINI does this (and also releases the fortran handle index). - Never explicitly construct or destruct the superclass of an opal object. This screws up the class function tables and will cause either an assert failure or a segmentation fault when destructing coll ml requests. Cleanup allgather. I removed the duplicate non-blocking and blocking functions and modeled the cleanup after what I found in allreduce. Also cleaned up the code somewhat. Don't bother copying from the send to the receive buffer in bcol_basesmuma_allreduce_intra_fanin_fanout if the pointers are the same. This eliminates a warning about memcpy and aliasing and avoids an unnecessary call to memcpy. Always call CHECK_AND_RELEASE on memsync collectives. There was a call to OBJ_RELEASE on the collective communicator but because CHECK_AND_RECYLCE was never called there was no matching call to OBJ_RELEASE. This caused coll ml to leak communicators. Make allreduce use the sequential collective launcher in coll_ml_inlines.h Just launch the next collective in the component progress. I am a little unsure about this patch. There appears to be some sort of race between collectives that causes buffer exhaustion in some cases (IMB Allreduce is a reproducer). Changing progress to only launch the next bcol seems to resolve the issue but might not be the best fix. Note that I see little to no performance penalty for this change. Fix allreduce when there are extra sources. There was an issue with the buffer offset calculation when there are extra sources. In the case of extra sources == 1 the offset was set to buffer_size (just past the header of the next buffer). I adjusted the buffer size to take into account the maximum header size (see the earlier commit that added this) and simplified the offset calculation. Make reduce/allreduce non-blocking. 
This is required for MPI_Comm_idup to work correctly. This has been tested with various layouts using the ibm testsuite and imb and appears to have the same performance as the old blocking version. Fix allgather for non-contiguous layouts and simplify parsing the topology. Some things in this patch: - There were several comments to the effect that level 0 of the hierarchy MUST contain all of the ranks. At least one function made this assumption but it was not true. I changed the sbgp components and the coll ml initialization code to enforce this requirement. - Ensure that hierarchy level 0 has the ranks in the correct scatter gather order. This removes the need for a separate sort list and fixes the offset calculation for allgather. - There were several passes over the hierarchy to determine properties of the hierarchy. I eliminated these extra passes and the memory allocation associated with them and calculate the tree properties on the fly. The same DFS recursion also handles the re-order of level 0. All these changes have been verified with MPI_Allreduce, MPI_Reduce, and MPI_Allgather. All functions now pass all IBM/Open MPI, and IMB tests. coll/ml: correct pointer usage for MPI_BOTTOM Since contiguous datatypes are copied via memcpy (bypassing the convertor) we need to adjust for the lb of the datatype. This corrects problems found testing code that uses MPI_BOTTOM (NULL) as the send pointer. Add fallback collectives for allreduce and reduce. cmr=v1.7.5:reviewer=pasha This commit was SVN r30363.	2014-01-22 15:39:19 +00:00
Mike Dubman	b8550a55a7	HCOLL: many fixes Adds coll_hcoll_np mca parameter similar to that of fca component (defaults to 32). Those who use hcoll should be aware that from now on communicators with fewer than 32 procs will run w/o hcoll by default. - Resolves fallback issue in case libhcoll runs out of allowed contexts. The solution is moving hcoll_context_create from comm_enable to comm_query. In short, comm_enable should never return OMPI_ERROR in the coll component with highest priority (hcoll). Otherwise the ompi coll_base_select will unselect the coll function pointers and module references, leaving the communicator w/o coll pointer. This will cause the failure. The same behavior can be reproduced even with tuned if one were to hardcode some "return OMPI_ERROR" into its module_enable function. - Additionally, removed all the dead code under #if 0; removed unused variables (path for library, active_modules list) and classes (module list wrapper) Fixed by Val, Reviewed by Devendar/Josh/Miked cmr=v1.7.4:reviewer=ompi-rm1.7 This commit was SVN r30341.	2014-01-21 12:19:47 +00:00
Ralph Castain	9566650458	Per Marco, don't define a "min" function if one is already defined to avoid conflict with cygwin reserved word This commit was SVN r30241.	2014-01-10 18:03:25 +00:00
Ralph Castain	c7a94a57d7	Per Marco, rename ERROR tags to exit_ERROR to avoid cygwin reserved name issues. Refs trac:4085 This commit was SVN r30239. The following Trac tickets were found above: Ticket 4085 --> https://svn.open-mpi.org/trac/ompi/ticket/4085	2014-01-10 18:00:49 +00:00
Mike Dubman	110c99af4f	sharing negative tag space between libNBC and HCOLL fixed by devendar, reviewed by miked cmr=v1.7.4:reviewer=ompi-rm1.7 This commit was SVN r30224.	2014-01-10 12:51:34 +00:00
Nathan Hjelm	bb01fc2938	Add missing MCA variable enumerator sentinel. cmr=v1.7.4:reviewer=rhc This commit was SVN r30178.	2014-01-09 15:28:42 +00:00
Mike Dubman	0fae2caef3	Create a comm keyval for hcoll component with delete callback function. Set comm attribute with keyval. Wait for pending hcoll module tasks in comm delete callback where PML still valid on the communicator. safely destroy hcoll context during hcoll module destructor. Author: Devendar Bureddy reviewed by miked cmr=v1.7.4:reviewer=ompi-rm1.7 This commit was SVN r30175.	2014-01-09 11:27:24 +00:00
Mike Dubman	43d6a30693	Fix problems of: - HCOLL close without init - Call hcoll progress after comm finalize - mpirun default for coll_hcoll_enable is 1 fixed by Igor, reviewed by miked cmr=v1.7.4:reviewer=ompi-rm1.7 This commit was SVN r30156.	2014-01-08 10:55:25 +00:00
Jeff Squyres	13b29cff2c	This commit complements/completes r30140. r30140 made all the configury/Makefile.am changes; this commit renames the internal installdirs.h framework struct field names to match the configury macro names: * pkgdatdir -> ompidatadir * pkglibdir -> ompilibdir * pkgincludedir -> ompiincludedir This commit was SVN r30145. The following SVN revision numbers were found above: r30140 --> open-mpi/ompi@8b778903d8	2014-01-07 23:36:33 +00:00
Brian Barrett	8b778903d8	Fix longstanding issue with our multi-project support. Rather than using pkg{data,lib,includedir}, use our own ompi{data,lib,includedir}, which is always set to {datadir,libdir,includedir}/openmpi. This will keep us from having help files in prefix/share/open-rte when building without Open MPI, but in prefix/share/openmpi when building with Open MPI. This commit was SVN r30140.	2014-01-07 22:11:15 +00:00
Brian Barrett	e811a8a9cb	Make the Portals 4 collective component disable itself when there's not a Portals 4 point-to-point (MTL or BTL) component in use This commit was SVN r30109.	2014-01-02 22:35:37 +00:00
Mike Dubman	80f4e02e0a	Several changes: - Modifications to coll/hcoll component related to the changes in the libhcoll API. Now, hcoll_destroy_context accepts one more parameter that indicates if the context was really destroyed as a result of the call. This new "non-blocking" context destruction fixes hang discovered in IMB with mcast enabled. - Clean up all the left contexts (if any) on the comm_world destruction. fixed by Val, reviewed by miked cmr=v1.7.4:reviewer=ompi-rm1.7 This commit was SVN r30055.	2013-12-23 06:57:12 +00:00
Jeff Squyres	71ec6c1617	Remove unnecessary "mpi.h"; move opal headers to the top. This commit was SVN r30053.	2013-12-22 20:38:43 +00:00
Jeff Squyres	0ab48ad0d2	Fix some annoying flex warnings that have been there for years. Many thanks to Tom Fogal for the initial patch. cmr=v1.7.4:reviewer=rhc:subject=Fix annoying flex warnings This commit was SVN r29904.	2013-12-14 00:36:12 +00:00
Mike Dubman	9a65e0d8c6	cosmetic fixes for hcol autotools Refs: #3694 This commit was SVN r29841.	2013-12-08 09:45:13 +00:00
Mike Dubman	2e124454b4	cosmetic fix to remove redundant -lfca use CPP extra flags var which propagated to coll/fca and scoll/fca Refs: #3694 This commit was SVN r29832.	2013-12-07 15:00:54 +00:00
Devendar Bureddy	4554770ee4	hcol fixes cmr=v1.7.4:reviewer=jladd This commit was SVN r29787.	2013-12-03 20:21:40 +00:00
George Bosilca	cb24277737	Restrict the usage of MPI_Type_extent only to receiving processes (aka the root). This commit is based on a patch provided by Pierre Jolivet. Fix all the output to match the failing MPI call. This commit was SVN r29761.	2013-11-27 12:09:31 +00:00
George Bosilca	68268377af	Fix an error message for the igather and the usage of the extent on non-root processes for the iscatter. Thanks to Pierre Jolivet for the bug report and the patch. This commit was SVN r29736.	2013-11-23 00:59:22 +00:00
Nathan Hjelm	24a7e7aa34	Add support for the udreg registration cache and dynamics on XE/XK/XC. To support the new mpool two changes were made to the mpool infrastructure: 1) Added an mpool flag to indicate that an mpool does not need the memory hooks to use the leave pinned protocols. This flag is checked in the mpool lookup. 2) Add a mpool context to the base registration. This new member is used by the udreg mpool to store the udreg context associated with the particular registration. The new member will not break the ABI compatibility as the new member is only currently used by the udreg mpool. Dynamics support for Cray systems makes use of the global rank provided by orte to give the ugni library a unique rank for each process. Dynamics support is not available under direct-launch (srun.) cmr=v1.7.4 This commit was SVN r29719.	2013-11-18 04:58:37 +00:00
Brian Barrett	cf8de1ef0f	Minor indent cleanup in init_query() Only use Portals on communicators with more than one rank Fix computation of number of children when using the hypercube tree This commit was SVN r29616.	2013-11-06 15:21:09 +00:00
Nathan Hjelm	c71125acfd	Using MPI_* functions in iallreduce can cause comm-spawned processes to crash. Update libnbc's iallreduce function to use ompi_* functions instead. cmr=v1.7.4:reviewer=brbarret This commit was SVN r29582.	2013-11-01 16:45:54 +00:00
Nathan Hjelm	a31e617d17	Remove outdated comments in coll_basic_reduce_scatter.c. Refs trac:1559 This commit was SVN r29566. The following Trac tickets were found above: Ticket 1559 --> https://svn.open-mpi.org/trac/ompi/ticket/1559	2013-10-30 16:20:20 +00:00
Nathan Hjelm	167d5613db	Do not do arithmetic with void * in basic neighborhood alltoall[vw]. cmr=v1.7.4:reviewer=jsquyres This commit was SVN r29558.	2013-10-29 20:02:13 +00:00
Nathan Hjelm	b202bb0d63	Fix the recursive halving algorithms for reduce scatter in both basic and tuned to correctly handle 0 recvcounts. Tested with the reproducer from #1550. Refs trac:1559 This commit was SVN r29542. The following Trac tickets were found above: Ticket 1559 --> https://svn.open-mpi.org/trac/ompi/ticket/1559	2013-10-28 19:06:38 +00:00
Mike Dubman	d27cffedb9	expand tabs to 4 spaces cd ompi/mca/coll/fca for i in *.[ch]; do expand -t 4 $i > koko && mv koko $i; done Refs: #3799 This commit was SVN r29472.	2013-10-22 17:05:55 +00:00
Jeff Squyres	6714890244	paffinity.h is gone and won't be coming back. This commit was SVN r29467.	2013-10-22 15:59:00 +00:00
Mike Dubman	5a7dff2d15	fix icc warning fixed by Dinar, reviewed by miked cmr=v1.7.4:reviewer=ompi-gk1.7 This commit was SVN r29428.	2013-10-12 18:04:28 +00:00
Nathan Hjelm	6232ef3bfb	At coll_select time we can not check whether the communicator has a virtual topology. Remove code checking for a virtual topology until this flag is set before coll_select. This commit was SVN r29344.	2013-10-03 03:37:46 +00:00
Nathan Hjelm	7bedf62dd8	Add basic algorithms for the remaining non-blocking collectives. The algorithms are intended for MPI-3.0 compliance and are not optimized. We should aim to add better algorithms in the future through cheetah. MPI_Iallreduce and MPI_Igatherv on intercommunicators are required for MPI_Comm_idup support. cmr=v1.7.4:reviewer=brbarret:ticket=trac:2715 This commit was SVN r29333. The following Trac tickets were found above: Ticket 2715 --> https://svn.open-mpi.org/trac/ompi/ticket/2715	2013-10-02 14:26:23 +00:00
Mike Dubman	19748e6957	fix race condition which can happen on finalize 1. Change in rte api implementation: now comm_world used to do p2p. This allows to not worry about other comms being destroyed. 2. added a notification mechanism with a help of which runtime can say libhcoll that RTE api can not be used any longer. pass a pointer to a flag, and its size to libhcoll. The flag changes when the RTE is no longer available. Currently this flag is just ompi_mpi_finalized global bool value. cmr=v1.7.3:reviewer=jladd This commit was SVN r29331.	2013-10-02 13:38:47 +00:00
Nathan Hjelm	4f12406436	Don't check for neighborhood collective routines on non-virtual topology communicators This commit was SVN r29319.	2013-10-01 19:59:18 +00:00
Mike Dubman	9bf7578ff2	fix memory corruption cmr:v1.7.3:reviewer=ompi-rm1.7 This commit was SVN r29293.	2013-09-30 06:18:12 +00:00
Nathan Hjelm	c5596548b2	MPI-3: Add support for neighborhood collectives Blocking versions are simple linear algorithms implemented in coll/basic. Non- blocking versions are from libnbc 1.1.1. All algorithms have been tested with simple test cases. cmr=v1.7.4:reviewer=jsquyres This commit was SVN r29265.	2013-09-26 21:55:08 +00:00
Mike Dubman	7c6ff00da5	Add caching of FCA communicators developed by Dinar, reviewed by miked/yossi. cmr:v1.7.3:reviewer=jsquyres:subject=add caching of FCA communicators. This commit was SVN r29256.	2013-09-26 17:48:07 +00:00
Joshua Ladd	82e092db1b	Adding interface changes in hcoll component to support non-blocking collectives in libhcoll. This was added by Elena Elkina and reviewed by Josh Ladd. cmr:v1.7.3:reviewer=jladd:subject=Add support for non-blocking collectives in hcoll This commit was SVN r29244.	2013-09-25 16:14:59 +00:00
Mike Dubman	2a5c342587	Modifications that are necessary in order to meet latest libhcoll API. cmr:v1.7.3:reviewer=jladd This commit was SVN r29202.	2013-09-18 12:22:02 +00:00
George Bosilca	5f686a90d0	Fix several issues regarding MPI_IN_PLACE and different flavors of MPI_Alltoall. - add support for MPI_IN_PLACE in the self collective component. - fix the extent usage in the tuned collective component. - correctly use the peer counts instead of local. Thanks to Fujitsu for the patch. This commit was SVN r29187.	2013-09-17 11:35:18 +00:00
Brian Barrett	16a1166884	Remove the proc_pml and proc_bml fields from ompi_proc_t and replace with a configure-time dynamic allocation of flags. The net result for platforms which only support BTL-based communication is a reduction of 8*nprocs bytes per process. Platforms which support both MTLs and BTLs will not see a space reduction, but will now be able to safely run both the MTL and BTL side-by-side, which will prove useful. This commit was SVN r29100.	2013-08-30 16:54:55 +00:00
Nathan Hjelm	f5495ace48	coll/ml: update the coll_ml_enable_fragmentation variable to support the option to autodetect whether fragmentation should be enabled cmr=v1.7.3:ticket=trac:3717 This commit was SVN r29065. The following Trac tickets were found above: Ticket 3717 --> https://svn.open-mpi.org/trac/ompi/ticket/3717	2013-08-27 16:36:54 +00:00
Ralph Castain	c74c54e18d	Cleanup uninitialized warnings This commit was SVN r29033.	2013-08-16 21:23:09 +00:00
Nathan Hjelm	6c75699068	coll/ml: fix typo in assert that could cause an abort in debug builds. cmr=v1.7.3:reviewer=manjugv This commit was SVN r29024.	2013-08-12 14:31:44 +00:00
Nathan Hjelm	47320713bb	coll/ml: do not register variables in open and fix a bug in the coll/ml parser cmr=v1.7.3:reviewer=pasha This commit was SVN r29010.	2013-08-09 17:55:30 +00:00
Brian Barrett	2cc947513b	* Fix some compile errors * Need to subtract 1 off the size so that we stay in the bit length requirements This commit was SVN r28997.	2013-08-05 18:49:48 +00:00
Nathan Hjelm	e4f105ffb3	revert change that shouldn't have been part of r28952 This commit was SVN r28953. The following SVN revision numbers were found above: r28952 --> open-mpi/ompi@cb90a4a7fc	2013-07-25 20:23:55 +00:00


676 commits