openmpi/ompi/mca/coll/basic
Rainer Keller 4e6a6fc146 - Check whether the compiler supports __builtin_clz (count leading
zeroes);
   if so, use it for bit operations like opal_cube_dim and opal_hibit.
   Implement two versions of power-of-two.
   In the case of opal_next_poweroftwo, this reduces the average execution
   time from 83 cycles to 4 cycles (Intel Nehalem, icc, -O2, inlining,
   measured with rdtsc, looping over 2^27 values).
   Numbers for other functions are similar (but of course depend heavily
   on the usage, e.g. opal_hibit() with a start of 4 does not save
   much).  The bsr instruction on AMD Opteron is also not as fast.

 - Replace various places where the next power-of-two is computed.
   
   Tested on Intel Nehalem Cluster with openib, compilers GNU-4.6.1 and
   Intel-12.0.4 using mpi_testsuite -t "Collective" with 128 processes.
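
   The idea behind the first item can be sketched as follows. This is a
   minimal illustration, not the code from this commit: the names
   sketch_next_pow2, sketch_next_pow2_loop, sketch_next_pow2_inclusive and
   the SKETCH_HAVE_BUILTIN_CLZ macro are hypothetical, and the reading of
   "two versions" as a strictly-greater variant and an inclusive variant is
   an assumption. Inputs are assumed to be non-negative and small enough
   that the result fits in an int.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical feature macro; a real build would set this from a
 * configure-time compiler check rather than from __GNUC__. */
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_HAVE_BUILTIN_CLZ 1
#else
#define SKETCH_HAVE_BUILTIN_CLZ 0
#endif

/* Portable fallback: shift until the value is exhausted.
 * Returns the smallest power of two strictly greater than value. */
static int sketch_next_pow2_loop(int value)
{
    int power2 = 1;
    for (; value > 0; value >>= 1) {
        power2 <<= 1;
    }
    return power2;
}

/* Same result, but computed from the count of leading zero bits
 * when the compiler provides __builtin_clz. */
static int sketch_next_pow2(int value)
{
#if SKETCH_HAVE_BUILTIN_CLZ
    if (value <= 0) {
        return 1;               /* __builtin_clz(0) is undefined */
    }
    return 1 << ((int)(sizeof(int) * CHAR_BIT)
                 - __builtin_clz((unsigned int)value));
#else
    return sketch_next_pow2_loop(value);
#endif
}

/* Assumed "inclusive" variant: smallest power of two >= value. */
static int sketch_next_pow2_inclusive(int value)
{
    if (value <= 1) {
        return 1;
    }
    return sketch_next_pow2(value - 1);
}

/* Checks that both implementations agree on 0..limit. */
static void sketch_check_agreement(int limit)
{
    for (int v = 0; v <= limit; ++v) {
        assert(sketch_next_pow2(v) == sketch_next_pow2_loop(v));
    }
}
```

   The zero check is needed because __builtin_clz has an undefined result
   for an argument of 0; the loop version handles that case naturally.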

This commit was SVN r25270.
2011-10-11 22:49:01 +00:00
.windows Convert the bad dos line endings to unix style for all windows related files. 2010-12-02 12:08:08 +00:00
coll_basic_allgather.c - Split the datatype engine into two parts: an MPI specific part in 2009-07-13 04:56:31 +00:00
coll_basic_allgatherv.c - Split the datatype engine into two parts: an MPI specific part in 2009-07-13 04:56:31 +00:00
coll_basic_allreduce.c - Split the datatype engine into two parts: an MPI specific part in 2009-07-13 04:56:31 +00:00
coll_basic_alltoall.c - Split the datatype engine into two parts: an MPI specific part in 2009-07-13 04:56:31 +00:00
coll_basic_alltoallv.c - Split the datatype engine into two parts: an MPI specific part in 2009-07-13 04:56:31 +00:00
coll_basic_alltoallw.c - Split the datatype engine into two parts: an MPI specific part in 2009-07-13 04:56:31 +00:00
coll_basic_barrier.c Fixes trac:1392, #1400 2008-07-28 22:40:57 +00:00
coll_basic_bcast.c - Split the datatype engine into two parts: an MPI specific part in 2009-07-13 04:56:31 +00:00
coll_basic_component.c Fixes trac:1392, #1400 2008-07-28 22:40:57 +00:00
coll_basic_exscan.c Add support for MPI_IN_PLACE to MPI_Exscan. Required for MPI 2.2 compliance. 2011-09-20 14:54:41 +00:00
coll_basic_gather.c - Split the datatype engine into two parts: an MPI specific part in 2009-07-13 04:56:31 +00:00
coll_basic_gatherv.c - Split the datatype engine into two parts: an MPI specific part in 2009-07-13 04:56:31 +00:00
coll_basic_module.c Fixes trac:1392, #1400 2008-07-28 22:40:57 +00:00
coll_basic_reduce_scatter.c - Check whether the compiler supports __builtin_clz (count leading 2011-10-11 22:49:01 +00:00
coll_basic_reduce.c - Split the datatype engine into two parts: an MPI specific part in 2009-07-13 04:56:31 +00:00
coll_basic_scan.c - Split the datatype engine into two parts: an MPI specific part in 2009-07-13 04:56:31 +00:00
coll_basic_scatter.c - Split the datatype engine into two parts: an MPI specific part in 2009-07-13 04:56:31 +00:00
coll_basic_scatterv.c - Split the datatype engine into two parts: an MPI specific part in 2009-07-13 04:56:31 +00:00
coll_basic.h - Last of intrusive commits (promised)... err for now. 2009-03-04 17:06:51 +00:00
Makefile.am WARNING: Work on the temp branch being merged here encountered problems with bugs in subversion. Considerable effort has gone into validating the branch. However, not all conditions can be checked, so users are cautioned that it may be advisable to not update from the trunk for a few days to allow MTT to identify platform-specific issues. 2010-09-17 23:04:06 +00:00