1
1
Граф коммитов

956 Коммитов

Автор SHA1 Сообщение Дата
Mike Dubman
34b3718042 revert, need rework.
This commit was SVN r30351.
2014-01-21 17:56:01 +00:00
Mike Dubman
37343574e0 OSHMEM: fix fortran binding
The check to enable shmem fortran was too early, MPI can disable fortran but SHMEM fortran check was already done.
Refs trac:3763

This commit was SVN r30340.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2014-01-21 09:20:51 +00:00
Ralph Castain
abb432aef6 The portable_platform file moved to opal at some point, but this .m4 didn't get updated to match.
Thanks to Paul Hargrove for spotting it and providing a patch!

cmr=v1.7.4:reviewer=jsquyres

This commit was SVN r30320.
2014-01-18 03:19:34 +00:00
Ralph Castain
42b7c4499a Correct a typo to get the right configure option to match the internal logic, thus making --enable-mpi-thread-multiple to actually work
cmr=v1.7.4:reviewer=jladd:subject=fix enable-mpi-thread-multiple

This commit was SVN r30296.
2014-01-15 03:56:29 +00:00
Mike Dubman
9f5abfb233 OSHMEM: fix fortran selection logic and refactoring
refactoring inspired by this thread: http://www.open-mpi.org/community/lists/devel/2014/01/13744.php
fix oshmem fortran selection logic, Thanks to Paul for info
Refs trac:3763

This commit was SVN r30286.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2014-01-14 13:23:44 +00:00
Jeff Squyres
5f17bc3c2c Make the use of PROTECTED in the mpi_f08 module be optional.
Add a configure test to see if the Fortran compiler supports the
PROTECTED keyword.  If it does, use in mpi-f08-types.F90 (via a macro
defined in configure-fortran-output-bottom.h).

This is needed to support the PGI 9 Fortran compiler, which does not
support the PROTECTED keyword.

Note that regardless of whether we want to support the PGI 9 Fortran
compiler + mpi_f08, we need to correctly detect whether PROTECTED
works or not, and then use that determination as a criteria for
building the mpi_f08 module.  Previously, mpi-f08-types.F90 used
PROTECTED unconditionally, and we didn't test for it in configure.  So
if a compiler (e.g., PGI 9) supported everything else but didn't
support PROTECTED, it would try to compile the mpi_f08 stuff and choke
on the use of PROTECTED.

Refs trac:4093

This commit was SVN r30273.

The following Trac tickets were found above:
  Ticket 4093 --> https://svn.open-mpi.org/trac/ompi/ticket/4093
2014-01-13 18:35:42 +00:00
Jeff Squyres
c6c6cbddb2 Missed this one in r30271.
Refs trac:4102

This commit was SVN r30272.

The following SVN revision numbers were found above:
  r30271 --> open-mpi/ompi@23a235d1ef

The following Trac tickets were found above:
  Ticket 4102 --> https://svn.open-mpi.org/trac/ompi/ticket/4102
2014-01-13 18:30:15 +00:00
Jeff Squyres
23a235d1ef Minor configure output consistency updates
Just some minor updates to make some Fortran test outputs be a bit
more consistent with each other.  This can definitely wait until
1.7.5, unless it causes conflicts with other changes that need to come
into 1.7.4.

cmr=v1.7.5:reviewer=dgoodell

This commit was SVN r30271.
2014-01-13 18:28:26 +00:00
Jeff Squyres
cc4440147f Fix typos in Fortran configury: ensure to call the right m4 macros!
Refs trac:4093

This commit was SVN r30258.

The following Trac tickets were found above:
  Ticket 4093 --> https://svn.open-mpi.org/trac/ompi/ticket/4093
2014-01-11 14:34:53 +00:00
Jeff Squyres
69ecf1670c Remove even more dead Fortran configury.
This configure option was only relevant when we were generating TKR
"use mpi" interfaces for MPI subroutines with choice buffers.  Now
that we aren't, the only interface that needs to accept a choice
buffer is MPI_SIZEOF (which we have to provide).  

And since there's now only several dozen interfaces in the "mpi" TKR
module, there's no reason to not generate ''all'' possible array rank
values (when there were thousands of interfaces, generating 4-vs-7
array ranks per interface per type was a big deal).  The default used
to be 4; now we can just hard-code it to 7, the max possible value for
Fortran 2003 (I think the max was raised ?to 11? in F2008, but let's
not go there for now).

cmr=v1.7.5:reviewer=dgoodell:subject=Remove even more dead Fortran configury

This commit was SVN r30257.
2014-01-11 14:06:59 +00:00
Jeff Squyres
b0ffdb3ae5 As noted by Paul Hargrove, older PGI compilers support ''some'' of
BIND(C), but not ''all'' of it.  So expand our configure checks to
look for multiple different forms of BIND(C):

 * ISO_C_BINDING
 * SUBROUTINE ... BIND(C)
 * TYPE, BIND(C)
 * TYPE(foo), BIND(C, name="bar")

If the compiler supports all of these, then declare that we support
BIND(C), and the rest of the mpi_f08 checks can continue.  If we miss
any one of those, don't bother continuing -- we won't build the
mpi_f08 module.

Also push the results of all of these tests down to ompi_info so that
they can be reported easily (e.g., "Hey, why doesn't my OMPI
installation have the mpi_f08 module?").

cmr=v1.7.4:reviewer=jsquyres:subject=Expand Fortran BIND(C) configure checks

This commit was SVN r30247.
2014-01-10 23:44:55 +00:00
Jeff Squyres
18cc164a3b Remove dead Fortran .m4 code.
This should have been part of r30151/#4057.  Oops.

cmr=v1.7.4:reviewer=dgoodell

This commit was SVN r30246.

The following SVN revision numbers were found above:
  r30151 --> open-mpi/ompi@52b5e17d97
2014-01-10 23:40:47 +00:00
Mike Dubman
d9f144d09f fix broken logic for oshmem:bindings:fort
Refs trac:3763

This commit was SVN r30226.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2014-01-10 15:20:19 +00:00
Jeff Squyres
36cca10042 Thanks to a reminder from Tobias Burunus, commit support for the
upcoming GCC/gfortran 4.9's ignore TKR interface.

This was originally committed in a side mercurial repo, but I sadly
completely forgot about it until Tobias reminded me.

cmr=v1.7.5:reviewer=dgoodell:subject=Add support for gfortran 4.9 Fortran ignore TKR

This commit was SVN r30152.
2014-01-08 03:46:27 +00:00
Jeff Squyres
52b5e17d97 Eliminate some dead Fortran configury (these values are not used any
more).

cmr=v1.7.5:reviewer=dgoodell

This commit was SVN r30151.
2014-01-08 03:19:58 +00:00
Nathan Hjelm
e627c91227 btl/vader: add support for traditional shared memory.
This commit adds support for placing the send memory segment in a
traditional shared memory segment when XPMEM is not available. The
current default is to reserve 4MB for shared memory on each process.
The latest benchmarks show vader performing better than sm on both
Intel and AMD CPUs.

For large messages vader will now use CMA if it is available (and
XPMEM is not).

cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r30123.
2014-01-06 19:51:44 +00:00
Ralph Castain
07503d8d6b Add some plumbing for ORCM support to keep the two code repos from diverging.
This commit was SVN r30115.
2014-01-03 22:35:20 +00:00
Ralph Castain
969d39d1af Fix enable-mpi-java - we can't push/pop the opal_java_happy variable in opal_setup_java.m4 as we need it in ompi_setup_java.m4 to determine that the java bindings can be built
cmr=v1.7.4:reviewer=jsquyres:subject=Fix enable-mpi-java

This commit was SVN r30097.
2013-12-28 17:03:59 +00:00
Jeff Squyres
6f6c3cc21c Per http://www.open-mpi.org/community/lists/devel/2013/12/13532.php,
r30016 was not enough to solve the issue.

So properly prefix all the shell variables used in opal_setup_java.m4
(one of them had an orte_ prefix -- oops!).  Now we won't get any
conflicts.

Refs trac:4015

This commit was SVN r30037.

The following SVN revision numbers were found above:
  r30016 --> open-mpi/ompi@35dfd26f9e

The following Trac tickets were found above:
  Ticket 4015 --> https://svn.open-mpi.org/trac/ompi/ticket/4015
2013-12-20 22:42:49 +00:00
Mike Dubman
92fdbbd7b1 Implementing comment #1 from http://www.open-mpi.org/community/lists/devel/2013/12/13523.php
Refs trac:4011

This commit was SVN r30024.

The following Trac tickets were found above:
  Ticket 4011 --> https://svn.open-mpi.org/trac/ompi/ticket/4011
2013-12-20 16:53:28 +00:00
Mike Dubman
d78a9cdd77 add rpath on mca_mtl_mxm.so to point to /path/to/mxm/lib/libmxm.so which was detected at configure time
This *should* fix following situation:

1 mxm.rpm puts /etc/ld.so.conf.d/mxm.conf file during rpm install with libpath to /opt/mellanox/mxm/lib
2 some1 can extract mxm.rpm into $HOME/mxm and compile OMPI with new mxm location
3 during runtime, OMPI from prev step will pick MXM from step (1) instead of from step (2)

cmr=v1.7.4:reviewer=ompi-rm1.7

This commit was SVN r30005.
2013-12-20 11:15:41 +00:00
Jeff Squyres
a7e65df6bc Update the --enable-wrapper-rpath help string to be correct.
Refs trac:3694

This commit was SVN r29903.

The following Trac tickets were found above:
  Ticket 3694 --> https://svn.open-mpi.org/trac/ompi/ticket/3694
2013-12-13 22:20:10 +00:00
Jeff Squyres
a25630c5e7 Fix rpath m4 typo that seeped in at the last minute.
Refs trac:3694

This commit was SVN r29901.

The following Trac tickets were found above:
  Ticket 3694 --> https://svn.open-mpi.org/trac/ompi/ticket/3694
2013-12-13 21:40:03 +00:00
Jeff Squyres
1bc8f41edb This commit combines 3 somewhat-unrelated things, which unfortunately
got linked together (work on one caused work in the other):

 * Clean up a bunch of VAR_SCOPE issues in configure.  This includes:
   * Using VAR_SCOPE_PUSH and VAR_SCOPE_POP in more places
   * Cleaning up the use of some shell variables (e.g., name them better)
 * Add support for external libevent via
   --with-libevent=<dir-to-libevent-install-tree>, as specifically
   asked for by downstream packagers.
 * Revamp how wrapper compiler RPATH (and RUNPATH) support is done.
   The external libevent work exposed weakenesses in how the original
   RPATH/RUNPATH work was done, so we had to re-do it to be a bit more
   robust.

This work has not yet been tested on Solaris.

Refs trac:3694

This commit was SVN r29899.

The following Trac tickets were found above:
  Ticket 3694 --> https://svn.open-mpi.org/trac/ompi/ticket/3694
2013-12-13 21:24:45 +00:00
Brian Barrett
121ca26c59 Per discussion at Develoepr's Meeting, remove Solaris threads support. Solaris
will just fall back to pthreads, which should be no problem.

This commit was SVN r29893.
2013-12-13 20:07:11 +00:00
Brian Barrett
6ef938de3f * Per the Developer's meeting today, restructure the threading in Open MPI a bit
more:
  - Remove OPAL_ENABLE_MULTI_THREADS, since it didn't really do anything
    correctly.  Opal always has threads enabled at this point.
  - Remove OMPI_ENABLE_PROGRESS_THREADS, since this hasn't worked in
    8 years and it has performance issues we'll never be able to
    overcome.  Note that we have plans for re-adding async progress, using
    a hybrid protocol of async and sync sends.
  - OMPI_ENABLE_THREAD_MULTIPLE now determines whether the thread lock
    macros do the check or not.
  - Condition variables are ALWAYS polling right now, which fixes the thread
    live-lock currently found when THREAD_MULTIPLE is turned on.

This commit was SVN r29891.
2013-12-13 19:40:12 +00:00
Jeff Squyres
cc7e121b1c Per thread starting here:
http://www.open-mpi.org/community/lists/users/2013/10/22882.php, fix
the value of MPI_STATUS_SIZE for the -i8 case.  Thanks to Jim Parker
for bringing up the issue and providing the patch.

Separate patches are required for v1.6 and v1.7 (and will be attached
to their respective tickets), because this breaks ABI, so we need a
non-default configure option to fix the issue but knowingly break ABI.

cmr=v1.7.4:reviewer=bosilca:subject=Fix MPI_STATUS_SIZE for -i8 case
cmr=v1.6.6:reviewer=bosilca:subject=Fix MPI_STATUS_SIZE for -i8 case

This commit was SVN r29858.
2013-12-11 17:47:54 +00:00
Jeff Squyres
e45412f5db Addendum to r29847: actually remove the old OPAL_VAR_SCOPE_POP
Refs trac:3694

This commit was SVN r29848.

The following SVN revision numbers were found above:
  r29847 --> open-mpi/ompi@c1ce04ad23

The following Trac tickets were found above:
  Ticket 3694 --> https://svn.open-mpi.org/trac/ompi/ticket/3694
2013-12-09 14:30:01 +00:00
Mike Dubman
c1ce04ad23 revert Jeff`s fix, now hcol can be compiled with it
Refs trac:3694

This commit was SVN r29847.

The following Trac tickets were found above:
  Ticket 3694 --> https://svn.open-mpi.org/trac/ompi/ticket/3694
2013-12-09 11:49:09 +00:00
Jeff Squyres
33fe77b874 Comment out the badness from the hcoll configury, originally from
r29830 -- Jeff will straighten this out with Alexander in person next
week (I can't test this myself because I have no access to libhcoll).
Sorry for the hassle...

Refs trac:3694

This commit was SVN r29842.

The following SVN revision numbers were found above:
  r29830 --> open-mpi/ompi@3bd9c603ff

The following Trac tickets were found above:
  Ticket 3694 --> https://svn.open-mpi.org/trac/ompi/ticket/3694
2013-12-08 14:36:55 +00:00
Mike Dubman
9a65e0d8c6 cosmetic fixed fpr hcol autotools
Refs: #3694

This commit was SVN r29841.
2013-12-08 09:45:13 +00:00
Jeff Squyres
a7f45f2675 Oops! r29830 added a variable into OPAL_VAR_SCOPE that is actually
needed in the global scope ($ompi_check_fca_dir).  This commit removes
it from the OPAL_VAR_SCOPE, so it should be fixed now.

Sorry about that, folks!  :-(

This commit was SVN r29838.

The following SVN revision numbers were found above:
  r29830 --> open-mpi/ompi@3bd9c603ff
2013-12-07 22:54:10 +00:00
Mike Dubman
2e124454b4 cosmitic fix to remove redundant -lfca
use CPP extra flags var which propagated to coll/fca and scoll/fca
Refs: #3694

This commit was SVN r29832.
2013-12-07 15:00:54 +00:00
Jeff Squyres
3bd9c603ff Clean up variables used in configure with OPAL_VAR_SCOPE.
This is helpful in the work for #3694: ensure that many places that
eventually end up in configure don't overly-pollute the global shell
variable space (because debugging accidental shell variable pollution
can be a real pain).

Refs trac:3694

This commit was SVN r29830.

The following Trac tickets were found above:
  Ticket 3694 --> https://svn.open-mpi.org/trac/ompi/ticket/3694
2013-12-06 23:40:34 +00:00
Rolf vandeVaart
d556b60b21 Chnage some CUDA configure code and macro names per review request by jsquyres in ticket #3880.
Functionally, nothing changes.

This commit was SVN r29815.
2013-12-06 14:35:10 +00:00
Mike Dubman
7b7b82ef35 check for mxm API version >= 1.5
Refs: #3947

This commit was SVN r29808.
2013-12-05 12:25:52 +00:00
Oscar Vega-Gisbert
78a65a30e6 Remove pinning check from ompi_setup_java.m4
This commit was SVN r29803.
2013-12-04 23:04:52 +00:00
Mike Dubman
753d630809 libshmem to liboshmem rename voices
Refs: 3763

This commit was SVN r29785.
2013-12-03 13:16:12 +00:00
Mike Dubman
5c7557f11a reverting r29771, duplicates Brian`s fix
This commit was SVN r29772.

The following SVN revision numbers were found above:
  r29771 --> open-mpi/ompi@bc25091b61
2013-11-29 14:28:02 +00:00
Mike Dubman
bd358cb180 Apply Jeff`s patch from ticket: 3145
Refs: 3763

This commit was SVN r29751.
2013-11-25 11:02:42 +00:00
Mike Dubman
826fa635a3 add support for --without-oshmem in configure
This commit was SVN r29733.
2013-11-22 13:51:46 +00:00
Brian Barrett
aa517b09a8 Add option to disable building the OSHMEM interface. This isn't perfect, but should work for the v1.7 timeframe
This commit was SVN r29727.
2013-11-21 17:13:28 +00:00
Jeff Squyres
05f4f62db1 Update -Wno-long-double test to support clang.
This prevents bazillions of warnings from clang tha -Wno-long-double
isn't supported.

The warning that clang issues when it see -Wno-long-double is does not
use any of the words we were looking for, so I added "unknown" to the
list of words to look for.  I also re-indented the two m4 tests so
that they're a bit more readable.

cmr=v1.7.4:reviewer=brbarret:subject=Update -Wno-long-double test to support clang

This commit was SVN r29681.
2013-11-13 15:08:01 +00:00
Mike Dubman
b10bb525f1 fix oshmem_CFLAGS to meet OMPI requirements
Refs: 3763

This commit was SVN r29662.
2013-11-12 07:25:14 +00:00
Mike Dubman
ab796052b4 Applying Jeff`s comments about proper SHMEM fortran organization of files.
Refs: 3870

This commit was SVN r29651.
2013-11-11 14:26:25 +00:00
Brian Barrett
9780043456 re-apply r29608 and fix the broken configure test that broke worse with the
patch.  See ticket #3885, comment 10 for an explination of why calling
_STRINGIFY on something that's not a numerical constant is always a bad idea.

This commit was SVN r29613.

The following SVN revision numbers were found above:
  r29608 --> open-mpi/ompi@b71bd51cdd
2013-11-05 22:41:10 +00:00
Jeff Squyres
337b2f7639 Output the directory where the java files were found.
Useful for debugging / understanding exactly where OMPI will be using
Java header files / libraries from.

This commit was SVN r29596.
2013-11-05 03:31:07 +00:00
Nathan Hjelm
b922cd1583 Add support for OSX builtin atomics.
OSX atomic support is disabled by default. Enable with --enable-osx-builtin-atomics.

Fixes trac:2120

This commit was SVN r29568.

The following Trac tickets were found above:
  Ticket 2120 --> https://svn.open-mpi.org/trac/ompi/ticket/2120
2013-10-30 17:48:15 +00:00
Ralph Castain
74c6b12d67 Don't force "treat warnings as errors" in the OpenSHMEM branch as this prevents building of tarballs on machines with compilers that generate warnings when the system is built with "tarball" settings. Instead, make this a configuration option so that OSHMEM developers can set it when they work on the code, but others don't have to use such restrictions.
Refs trac:3763

This commit was SVN r29538.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2013-10-27 15:35:14 +00:00
Ralph Castain
588e7ce974 Cleanup the Java setup m4's - orte doesn't require java, and we do all compiler checks in opal
This commit was SVN r29530.
2013-10-26 19:43:32 +00:00
Joshua Ladd
2f22329ea4 This commit (hopefully) fully addresses the enabling of oshmem fortran bindings given its dependency on the building of fortran bindings in OMPI. This commit closes trac:3851.
This commit was SVN r29521.

The following Trac tickets were found above:
  Ticket 3851 --> https://svn.open-mpi.org/trac/ompi/ticket/3851
2013-10-25 15:42:10 +00:00
Mike Dubman
5cc1f3803b shmem-fortran fix, naming conventions for shmem option, examples
This commit was SVN r29518.
2013-10-25 05:25:41 +00:00
Dave Goodell
620f40b6c7 fix compile error introduced by r29488
Apologies for the breakage, I did my test build in the wrong window...

No reviewer.

cmr=v1.7.4:ticket=3865

This commit was SVN r29492.

The following SVN revision numbers were found above:
  r29488 --> open-mpi/ompi@25dd719d4d

The following Trac tickets were found above:
  Ticket 3865 --> https://svn.open-mpi.org/trac/ompi/ticket/3865
2013-10-23 16:36:52 +00:00
Dave Goodell
25dd719d4d opal: support __attribute__((__noinline__))
First cut does not attempt any "cross-check".  As we discover compilers
which complain about __noinline__, we will add specific cross checks to
handle those cases.

Reviewed-by: Jeff Squyres <jsquyres@cisco.com>

This commit was SVN r29488.
2013-10-23 15:52:05 +00:00
Mike Dubman
2d76a9be45 add --enable-oshmem-fortran opt to configure
This commit was SVN r29448.
2013-10-17 05:42:43 +00:00
Mike Dubman
14304c299d add globalexit API support.
it is not fully functional yet, but initial version is good enough.
developed by Igor, reviewed by miked

This commit was SVN r29430.
2013-10-12 19:15:36 +00:00
Mike Dubman
2141e9e6b4 tools: Add oshmem_info utility
Reworked ompi_info tool to be close with orte_info implementation.
ompi_info_register_types(), ompi_info_close_components() and
ompi_info_show_ompi_version() are moved to runtime/ompi_info_support.c.

Added runtime/oshmem_info_support layer that exports following api to be
used into oshmem_info tool as
oshmem_info_register_types()
oshmem_info_register_framework_params()
oshmem_info_close_components()
oshmem_info_show_oshmem_version()
These functions call ompi_info_support related interfaces as long as
Oshmem supports Open MPI/SHMEM combination.

Now orte_info/ompi_info/oshmem_info have identical implementation approach.

Possible improvement:
OSHMEM processing of --config option is the same as OMPI`s (code is duplicated).
Probably list of info_support interfaces can be extended by xxx_info_do_config().
developed by Igor, reviewed by miked

This commit was SVN r29429.
2013-10-12 19:03:32 +00:00
Jeff Squyres
dc822de80f Fix typo/misspelling in variable name.
This commit was SVN r29410.
2013-10-09 14:04:25 +00:00
Jeff Squyres
3fb4401dee Remove an unused configure option, and comment that another
seemingly-unused configure option is used by downstream projects.

This commit was SVN r29367.
2013-10-04 14:16:09 +00:00
Mike Dubman
08efe5a338 Adopting shmem configure logic to trunk build system conventions
fixed by Dinar, reviewed by miked
cmr=v1.7.4:reviewer=ompi-rm1.7

This commit was SVN r29328.
2013-10-02 06:59:09 +00:00
Jeff Squyres
7941c81caa The TargetConditionals.h check is specific to Java -- move it to
ompi_setup_java.m4.

This commit was SVN r29261.
2013-09-26 21:34:00 +00:00
Rolf vandeVaart
d67e3077f5 Add a check for the CUDA 6.0 version of the cuda.h header file.
This commit was SVN r29250.
2013-09-26 12:46:06 +00:00
Jeff Squyres
df7654e8cf 1. Per my previous email
(http://www.open-mpi.org/community/lists/devel/2013/09/12889.php), I
renamed all "f77" and "f90" directory/file names to "fortran"
(including removing shmemf77 / shmemf90 wrapper compilers and
replacing them with "shmemfort").

2. Fixed several Fortran coding errors.

3. Removed lots of old/stale comments that were clearly the result of
copying from the OMPI layer and then not cleaning up afterwards (i.e.,
the comments were wholly inaccurate in the oshmem layer).

4. Removed both redundant and harmful code from oshmem_config.h.in.

5. Temporarily slave building the oshmem Fortran bindings to
--enable-mpi-fortran.  This doesn't seem like a good long-term
solution, but at least you can now build all Fortran bindings (MPI +
oshmem) or not. *** SEE MY NOTE IN config/oshmem_configure_options.m4
FOR WORK THAT STILL NEEDS TO BE DONE!

This commit was SVN r29165.
2013-09-15 09:32:07 +00:00
Joshua Ladd
b3f88c4a1d Per the RFC schedule, this commit adds Mellanox OpenSHMEM to the trunk. It does not yet run on OSX or with CM PML for an MTL other than MXM. Mellanox is aware of these issues and is in the process of resolving them. This should be added to \ncmr=v1.7.4:subject=Move OSHMEM to 1.7.4:reviewer=rhc
This commit was SVN r29153.
2013-09-10 15:34:09 +00:00
Brian Barrett
16a1166884 Remove the proc_pml and proc_bml fields from ompi_proc_t and replace with a
configure-time dynamic allocation of flags.  The net result for platforms
which only support BTL-based communication is a reduction of 8*nprocs bytes
per process.  Platforms which support both MTLs and BTLs will not see
a space reduction, but will now be able to safely run both the MTL and BTL
side-by-side, which will prove useful.

This commit was SVN r29100.
2013-08-30 16:54:55 +00:00
Ralph Castain
a200e4f865 As per the RFC, bring in the ORTE async progress code and the rewrite of OOB:
*** THIS RFC INCLUDES A MINOR CHANGE TO THE MPI-RTE INTERFACE ***

Note: during the course of this work, it was necessary to completely separate the MPI and RTE progress engines. There were multiple places in the MPI layer where ORTE_WAIT_FOR_COMPLETION was being used. A new OMPI_WAIT_FOR_COMPLETION macro was created (defined in ompi/mca/rte/rte.h) that simply cycles across opal_progress until the provided flag becomes false. Places where the MPI layer blocked waiting for RTE to complete an event have been modified to use this macro.

***************************************************************************************

I am reissuing this RFC because of the time that has passed since its original release. Since its initial release and review, I have debugged it further to ensure it fully supports tests like loop_spawn. It therefore seems ready for merge back to the trunk. Given its prior review, I have set the timeout for one week.

The code is in  https://bitbucket.org/rhc/ompi-oob2


WHAT:    Rewrite of ORTE OOB

WHY:       Support asynchronous progress and a host of other features

WHEN:    Wed, August 21

SYNOPSIS:
The current OOB has served us well, but a number of limitations have been identified over the years. Specifically:

* it is only progressed when called via opal_progress, which can lead to hangs or recursive calls into libevent (which is not supported by that code)

* we've had issues when multiple NICs are available as the code doesn't "shift" messages between transports - thus, all nodes had to be available via the same TCP interface.

* the OOB "unloads" incoming opal_buffer_t objects during the transmission, thus preventing use of OBJ_RETAIN in the code when repeatedly sending the same message to multiple recipients

* there is no failover mechanism across NICs - if the selected NIC (or its attached switch) fails, we are forced to abort

* only one transport (i.e., component) can be "active"


The revised OOB resolves these problems:

* async progress is used for all application processes, with the progress thread blocking in the event library

* each available TCP NIC is supported by its own TCP module. The ability to asynchronously progress each module independently is provided, but not enabled by default (a runtime MCA parameter turns it "on")

* multi-address TCP NICs (e.g., a NIC with both an IPv4 and IPv6 address, or with virtual interfaces) are supported - reachability is determined by comparing the contact info for a peer against all addresses within the range covered by the address/mask pairs for the NIC.

* a message that arrives on one TCP NIC is automatically shifted to whatever NIC that is connected to the next "hop" if that peer cannot be reached by the incoming NIC. If no TCP module will reach the peer, then the OOB attempts to send the message via all other available components - if none can reach the peer, then an "error" is reported back to the RML, which then calls the errmgr for instructions.

* opal_buffer_t now conforms to standard object rules re OBJ_RETAIN as we no longer "unload" the incoming object

* NIC failure is reported to the TCP component, which then tries to resend the message across any other available TCP NIC. If that doesn't work, then the message is given back to the OOB base to try using other components. If all that fails, then the error is reported to the RML, which reports to the errmgr for instructions

* obviously from the above, multiple OOB components (e.g., TCP and UD) can be active in parallel

* the matching code has been moved to the RML (and out of the OOB/TCP component) so it is independent of transport

* routing is done by the individual OOB modules (as opposed to the RML). Thus, both routed and non-routed transports can simultaneously be active

* all blocking send/recv APIs have been removed. Everything operates asynchronously.


KNOWN LIMITATIONS:

* although provision is made for component failover as described above, the code for doing so has not been fully implemented yet. At the moment, if all connections for a given peer fail, the errmgr is notified of a "lost connection", which by default results in termination of the job if it was a lifeline

* the IPv6 code is present and compiles, but is not complete. Since the current IPv6 support in the OOB doesn't work anyway, I don't consider this a blocker

* routing is performed at the individual module level, yet the active routed component is selected on a global basis. We probably should update that to reflect that different transports may need/choose to route in different ways

* obviously, not every error path has been tested nor necessarily covered

* determining abnormal termination is more challenging than in the old code as we now potentially have multiple ways of connecting to a process. Ideally, we would declare "connection failed" when *all* transports can no longer reach the process, but that requires some additional (possibly complex) code. For now, the code replicates the old behavior only somewhat modified - i.e., if a module sees its connection fail, it checks to see if it is a lifeline. If so, it notifies the errmgr that the lifeline is lost - otherwise, it notifies the errmgr that a non-lifeline connection was lost.

* reachability is determined solely on the basis of a shared subnet address/mask - more sophisticated algorithms (e.g., the one used in the tcp btl) are required to handle routing via gateways

* the RML needs to assign sequence numbers to each message on a per-peer basis. The receiving RML will then deliver messages in order, thus preventing out-of-order messaging in the case where messages travel across different transports or a message needs to be redirected/resent due to failure of a NIC

This commit was SVN r29058.
2013-08-22 16:37:40 +00:00
Jeff Squyres
31283aaffd Revert r29049 because it is incorrectly overriding the results of an
AC config macro.

This commit was SVN r29050.

The following SVN revision numbers were found above:
  r29049 --> open-mpi/ompi@b82f89e78b
2013-08-20 01:21:41 +00:00
Steve Wise
b82f89e78b Define HAVE_IBV_LINK_LAYER_ETHERNET if it is supported in libibverbs.
Commit r27211 missed a config file change which broke ompi over
iwarp transports.  

This fixes trac:3726 and should be added to cmr:v1.7.3:reviewer=jsquyres

This commit was SVN r29049.

The following SVN revision numbers were found above:
  r27211 --> open-mpi/ompi@b27862e5c7

The following Trac tickets were found above:
  Ticket 3726 --> https://svn.open-mpi.org/trac/ompi/ticket/3726
2013-08-19 22:27:51 +00:00
Ralph Castain
e0cfcf376f Okay, fix it so it works both --disable-mpi-profile and --enable-mpi-profile. I'm not sure why mpit's library has to be treated differently, but it seems that it needs some special care to work in both scenarios
Refs trac:3725

This commit was SVN r29043.

The following Trac tickets were found above:
  Ticket 3725 --> https://svn.open-mpi.org/trac/ompi/ticket/3725
2013-08-19 14:48:23 +00:00
Ralph Castain
e6199da2e7 Fixes trac:3486 - prevent opal_check_pmi from bleeding CPPFLAGS
This commit was SVN r28940.

The following Trac tickets were found above:
  Ticket 3486 --> https://svn.open-mpi.org/trac/ompi/ticket/3486
2013-07-24 03:53:23 +00:00
Nathan Hjelm
c6e586a81d MPI-3: fortran support for large counts using derived datatypes
Jeff:
 - Make sure not to go over 72 characters.  Love Fortran!
 - Ensure to include 'mpif-config.h' in Type_size_x.

This commit was SVN r28933.
2013-07-23 15:36:03 +00:00
Rolf vandeVaart
49663fb802 Move CUDA-aware configurary to its own file and other minor changes due to review.
This commit was SVN r28832.
2013-07-17 22:12:29 +00:00
Nathan Hjelm
e6e9f2c6fd Add profiling function definitions for MPI_T and add a missing type into mpi.h
This commit was SVN r28803.
2013-07-16 16:03:33 +00:00
Ralph Castain
5f520e241b Ensure we get both -lpmi and -lpmi2 when the libs are separate
This commit was SVN r28795.
2013-07-16 14:57:18 +00:00
Ralph Castain
10ca1c1b04 Turns out that there was exactly ONE place in all of the OMPI code base that still referred to OPAL_TRACE, though a few places retained the include file for no reason. So no point in letting this sit as it is clearly an unused "feature".
This commit was SVN r28789.
2013-07-14 18:57:20 +00:00
Joshua Ladd
16beaa3878 This fixes the nasty configure.m4 hack that was added long ago and not removed. My fault for not catching earlier. I've also removed the '.ompi_ignore' in coll/hcoll. Throwing this to Nathan for review. Upon successful review, this should be added to cmr:v1.7:reviewer=hjelmn
This commit was SVN r28753.
2013-07-11 09:55:46 +00:00
Jeff Squyres
04da831f04 The gm BTL was removed in June of 2012; there's no need for
ompi_check_gm.m4 any more.

This commit was SVN r28744.
2013-07-09 20:57:12 +00:00
Brian Barrett
ecbbf888d3 * Update Portals 4 MTL's multi-md code to be a bit cleaner (no if statements
in the path) and not create MDs due to boundary crossing
* Add the same logic to the Coll component

This commit was SVN r28733.
2013-07-08 21:27:37 +00:00
Joshua Ladd
e2b53dcf10 Adding the ompi_check_libhcoll.m4 file
This commit was SVN r28695.
2013-07-01 22:45:36 +00:00
Ralph Castain
7331dd9534 Apparently, the alps configury has not been checked since we added the RTE abstraction code. Fix it now.
This commit was SVN r28673.
2013-06-26 07:03:54 +00:00
Ralph Castain
e8340b6339 There is no convention out there as to how OEMs handle PMI2 functions. Some put them in their own -lpmi2 library, and some don't. Some have split the PMI2 definitions into a pmi2.h and keep the PMI-1 definitions in a separate pmi.h, and some don't.
Try to handle cases more generally so at least Slurm and Cray can co-exist in peace.

This commit was SVN r28672.
2013-06-26 00:43:26 +00:00
Ralph Castain
fa943dc6ff Cleanup a few things in the revised PMI configury - we know slurm has both pmi and pmi2 libs, so just auto-detect the presence of them if the user directed us to build with pmi support.
Also cleanup some changed names in the alps code

This commit was SVN r28670.
2013-06-24 02:41:40 +00:00
Joshua Ladd
0b5c1f2ea8 Add 'generic' support for PMI2 (previously, we checked for PMI2 only on Cray systems.) If your resource manager (e.g. SLURM) has support for PMI2, then the --with-pmi configure flag will enable its usage. If you don't have PMI2, then you will fallback to regular old PMI1. This patch was submitted by Ralph Castain and reviewed and pushed by Josh Ladd. This should be added to cmr:v1.7:reviewer=jladd
This commit was SVN r28666.
2013-06-21 15:28:14 +00:00
Rolf vandeVaart
7771857991 Adjust how cuda.h is found. It can be found in the with-cuda dir now.
This commit was SVN r28555.
2013-05-22 22:04:46 +00:00
Jeff Squyres
4d9da92e60 Fixes trac:376: bu default the wrappr compilers will enable rpath support
in generated executables on systems that support it.  Use
--disable-wrapper-rpath to disable this behavior.  See text in
README about --disable-wrapper-rpath for more details.

This commit was SVN r28479.

The following Trac tickets were found above:
  Ticket 376 --> https://svn.open-mpi.org/trac/ompi/ticket/376
2013-05-11 00:49:17 +00:00
Ralph Castain
353c77e659 Re-enable udcm for test purposes - appears to be working, but needs broader exposure to MTT
This commit was SVN r28468.
2013-05-09 16:10:29 +00:00
Ralph Castain
12969cec81 Update orte_progress_threads configure option - no longer need to test for --enable-event-threads
This commit was SVN r28449.
2013-05-05 14:48:35 +00:00
Nathan Hjelm
809db8f6a9 Check for .git directory when checking for developer build
This commit was SVN r28149.
2013-03-06 17:47:27 +00:00
Rolf vandeVaart
ebe63118ac Remove dependency on libcuda.so when building in CUDA-aware support. Dynamically load it if needed.
This commit was SVN r28140.
2013-03-01 13:21:52 +00:00
Ralph Castain
a4b6fb241f Remove all remaining vestiges of the Windows integration
This commit was SVN r28137.
2013-02-28 17:31:47 +00:00
Ralph Castain
cf9796accd Remove the old configure option for disabling full rte support - we now use the OMPI rte framework for such purposes
This commit was SVN r28134.
2013-02-28 01:35:55 +00:00
Ralph Castain
bd9265c560 Per the meeting on moving the BTLs to OPAL, move the ORTE database "db" framework to OPAL so the relocated BTLs can access it. Because the data is indexed by process, this requires that we define a new "opal_identifier_t" that corresponds to the orte_process_name_t struct. In order to support multiple run-times, this is defined in opal/mca/db/db_types.h as a uint64_t without identifying the meaning of any part of that data.
A few changes were required to support this move:

1. the PMI component used to identify rte-related data (e.g., host name, bind level) and package them as a unit to reduce the number of PMI keys. This code was moved up to the ORTE layer as the OPAL layer has no understanding of these concepts. In addition, the component locally stored data based on process jobid/vpid - this could no longer be supported (see below for the solution).

2. the hash component was updated to use the new opal_identifier_t instead of orte_process_name_t as its index for storing data in the hash tables. Previously, we did a hash on the vpid and stored the data in a 32-bit hash table. In the revised system, we don't see a separate "vpid" field - we only have a 64-bit opaque value. The orte_process_name_t hash turned out to do nothing useful, so we now store the data in a 64-bit hash table. Preliminary tests didn't show any identifiable change in behavior or performance, but we'll have to see if a move back to the 32-bit table is required at some later time.

3. the db framework was a "select one" system. However, since the PMI component could no longer use its internal storage system, the framework has now been changed to a "select many" mode of operation. This allows the hash component to handle all internal storage, while the PMI component only handles pushing/pulling things from the PMI system. This was something we had planned for some time - when fetching data, we first check internal storage to see if we already have it, and then automatically go to the global system to look for it if we don't. Accordingly, the framework was provided with a custom query function used during "select" that lets you seperately specify the "store" and "fetch" ordering.

4. the ORTE grpcomm and ess/pmi components, and the nidmap code,  were updated to work with the new db framework and to specify internal/global storage options.

No changes were made to the MPI layer, except for modifying the ORTE component of the OMPI/rte framework to support the new db framework.

This commit was SVN r28112.
2013-02-26 17:50:04 +00:00
Jeff Squyres
7136dbb5b3 Fixes trac:3523. Discussed on the OMPI weekely teleconf today: add a
configure test to ensure that the Fortran compiler supports BIND(C)
with LOGICAL parameters (per
http://lists.mpi-forum.org/mpi-comments/2013/02/0076.php).

This may become moot shortly -- Pathscale tells me that they intend
upgrade their compiler to support BIND(C) with default LOGICAL in the
very near term (this week?).  But we still want this configure test so
that Open MPI won't even try to build the F08 bindings with older
versions of the Pathscale compilers (or any compiler that doesn't
support BIND(C) and LOGICAL parameters).

This commit was SVN r28110.

The following Trac tickets were found above:
  Ticket 3523 --> https://svn.open-mpi.org/trac/ompi/ticket/3523
2013-02-26 17:11:11 +00:00
Brian Barrett
504a6d036f * Rather than use the extra_includes directive, add the extra includes (which is really just -I${includedir}/openmpi/ for devel headers) to CPPFLAGS, since all the other necessary -Is for devel headers (like libevent and hwloc) are added to CPPFLAGS.
* Clean up ${includedir} and ${libdir} for script wrapper compilers
* Update script wrapper compilers to work like the C wrapper compilers w.r.t static and dynamic linking
* Remove the ORTE script wrapper compilers since they didn't support the ${includedir} stuff and Ralph said they weren't used anymore.

This commit was SVN r28052.
2013-02-13 00:33:05 +00:00
Joshua Ladd
70ad711337 Backing out the Open SHMEM project
This commit was SVN r28050.
2013-02-12 17:45:27 +00:00
Mike Dubman
ff384daab4 Added new project: oshmem.
This commit was SVN r28048.
2013-02-12 15:33:21 +00:00
Nathan Hjelm
ab57e2ef38 update lustre configuration to use OMPI_CHECK_PACKAGE and only check for liblustreapi (not liblustre)
This commit was SVN r28036.
2013-02-06 17:22:58 +00:00
Brian Barrett
d80218996f Rather than setting up the direct call stuff in ompi_mca (which requires
modifying ompi_mca for every interface that is direct called), do it in
the framework's .m4 file.

This commit was SVN r28031.
2013-02-04 23:26:42 +00:00
Nathan Hjelm
c28879137d set $1_LIBS correctly on all calls to ORTE_CHECK_ALPS not just the first call
This commit was SVN r28006.
2013-01-31 23:42:28 +00:00
Ralph Castain
576a5ab6f0 Actually, those calls need to be removed as we superceded them with the new opal_setup_ft.m4
This commit was SVN r28000.
2013-01-31 20:46:35 +00:00