openmpi/ompi
Rainer Keller 4e6a6fc146 - Check whether the compiler supports __builtin_clz (count leading
   zeros); if so, use it for bit operations such as opal_cube_dim and
   opal_hibit.
   Implement two versions of power-of-two.
   In the case of opal_next_poweroftwo, this reduces the average execution
   time from 83 cycles to 4 cycles (Intel Nehalem, icc, -O2, inlining,
   measured with rdtsc, looping over 2^27 values).
   Numbers for the other functions are similar, but they depend heavily
   on the usage (e.g. opal_hibit() with a start of 4 does not save
   much). The bsr instruction on AMD Opteron is also not as fast.

 - Replace various places where the next power-of-two is computed.
   
   Tested on Intel Nehalem Cluster with openib, compilers GNU-4.6.1 and
   Intel-12.0.4 using mpi_testsuite -t "Collective" with 128 processes.

This commit was SVN r25270.
2011-10-11 22:49:01 +00:00
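
The idea behind the commit above can be sketched in a few lines of C. This is only an illustration and not the Open MPI code: the names next_pow2_loop and next_pow2_clz are hypothetical, the "smallest power of two greater than or equal to value" semantics are an assumption (the commit mentions two power-of-two variants without spelling them out), and the compiler check is reduced to a preprocessor test instead of a configure-time probe.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not an Open MPI function): loop-based fallback.
 * Returns the smallest power of two >= value.
 * Precondition: value <= 2^(bits-1), otherwise the shift would overflow. */
static unsigned next_pow2_loop(unsigned value)
{
    unsigned p = 1u;

    assert(value <= (UINT_MAX >> 1) + 1u);
    while (p < value) {
        p <<= 1;
    }
    return p;
}

/* Hypothetical helper (not an Open MPI function): same result, but computed
 * from the number of leading zero bits when the compiler offers
 * __builtin_clz. The builtin is undefined for an argument of 0, so the
 * cases value == 0 and value == 1 are handled before calling it. */
static unsigned next_pow2_clz(unsigned value)
{
    assert(value <= (UINT_MAX >> 1) + 1u);
    if (value <= 1u) {
        return 1u;
    }
#if defined(__GNUC__) || defined(__clang__)
    /* Position of the highest set bit of (value - 1), plus one. */
    int bits = (int)(sizeof(unsigned) * CHAR_BIT) - __builtin_clz(value - 1u);
    return 1u << bits;
#else
    /* Assumed fallback for compilers without the builtin. */
    return next_pow2_loop(value);
#endif
}
```

Both helpers agree on every input within the stated precondition, for example 5 maps to 8 and 8 maps to 8. The builtin variant replaces a data-dependent loop with a single bit-scan instruction, which is consistent with the kind of speedup the commit message reports, although the cycle counts quoted there were measured on the author's setup and are not reproduced by this sketch.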
..
attribute Fixes trac:2767: Recursive locking when ROMIO used with THREAD_MULTIPLE 2011-05-04 06:31:42 +00:00
class Add support for CUDA registering sm and openib buffers. Feature is disabled by default. 2011-08-04 10:15:45 +00:00
communicator Each level (OPAL/ORTE/OMPI) should only return its own constants, 2011-10-04 14:50:31 +00:00
config Fixing the librdmacm dependency for build process 2011-10-11 09:10:06 +00:00
contrib Changes to VT: 2011-08-30 11:14:56 +00:00
datatype Unsigned datatypes should be redirected to their unsigned counterparts 2011-05-03 12:53:52 +00:00
debuggers Make the no-orte case compile again 2011-04-20 16:48:07 +00:00
errhandler Add resilience to ORTE. Allows the runtime to continue after a process (or 2011-06-23 20:38:02 +00:00
etc Many thanks to Ralf W. for finding a subtle bug in these Makefile.am's 2008-06-04 01:28:03 +00:00
file Clean up request handling in the I/O framework to be more consistent with 2009-11-26 05:13:43 +00:00
group Fix formatting in group and communicator code (- No functionality changes -) 2010-10-04 14:54:58 +00:00
include Each level (OPAL/ORTE/OMPI) should only return its own constants, 2011-10-04 14:50:31 +00:00
info Some relatively minor C/R related cleanup 2010-07-30 18:59:34 +00:00
mca - Check whether the compiler supports __builtin_clz (count leading 2011-10-11 22:49:01 +00:00
mpi Minor text fix suggested by Jeremiah Willcock. 2011-09-21 20:05:19 +00:00
mpiext Fix minor typo -- f90, not f77 2011-06-27 20:38:30 +00:00
op Reshape the datatype engine. The basic types are built down in OPAL. MPI types are 2011-01-13 06:08:54 +00:00
peruse - Sanity check initialization and finalization of PERUSE. 2010-01-12 16:36:24 +00:00
proc Each level (OPAL/ORTE/OMPI) should only return its own constants, 2011-10-04 14:50:31 +00:00
request * Implement long-ago discussed RFC to add a callback data pointer in the 2011-06-30 20:05:16 +00:00
runtime We always have hwloc xml support (now that it's built into hwloc 2011-10-11 20:20:59 +00:00
tools Each level (OPAL/ORTE/OMPI) should only return its own constants, 2011-10-04 14:50:31 +00:00
win Fixes trac:2767: Recursive locking when ROMIO used with THREAD_MULTIPLE 2011-05-04 06:31:42 +00:00
CMakeLists.txt Set the compiler flags in a better way. 2011-09-12 08:24:27 +00:00
Makefile.am Just use the LIB definition. 2011-02-25 00:39:05 +00:00