Commit graph

18871 commits

Author SHA1 Message Date
Ralph Castain
b12167abef Per a good suggestion from Jeff, make the coprocessor mapping more scalable by using a hash table to cache the coprocessor list, and then do a single pass thru the nodes at the end to assign hostid's.
Refs trac:3847

This commit was SVN r29439.

The following Trac tickets were found above:
  Ticket 3847 --> https://svn.open-mpi.org/trac/ompi/ticket/3847
2013-10-14 22:01:48 +00:00
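
The approach described in b12167abef (cache the coprocessor list in a hash table, then make one pass over the nodes to assign host IDs) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the commit: the `struct node` layout, the string coprocessor IDs, the fixed-size open-addressing table, and every function name here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 64   /* power of two; hypothetical capacity */
#define MAX_COPROCS 4   /* hypothetical per-host limit */

struct entry { const char *key; int hostid; };
struct table { struct entry slots[TABLE_SIZE]; int count; };

/* Simple string hash (djb2 style), reduced to a slot index. */
static unsigned hash_str(const char *s)
{
    unsigned h = 5381u;
    while (*s) h = h * 33u + (unsigned char)*s++;
    return h & (TABLE_SIZE - 1);
}

/* Insert or update key -> hostid using linear probing. */
static void table_put(struct table *t, const char *key, int hostid)
{
    unsigned i = hash_str(key);
    while (t->slots[i].key != NULL && strcmp(t->slots[i].key, key) != 0)
        i = (i + 1) & (TABLE_SIZE - 1);
    if (t->slots[i].key == NULL) {
        assert(t->count < TABLE_SIZE - 1);  /* always keep one empty slot */
        t->count++;
    }
    t->slots[i].key = key;
    t->slots[i].hostid = hostid;
}

/* Return the cached hostid for key, or -1 if the key is unknown. */
static int table_get(const struct table *t, const char *key)
{
    unsigned i = hash_str(key);
    while (t->slots[i].key != NULL) {
        if (strcmp(t->slots[i].key, key) == 0) return t->slots[i].hostid;
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    return -1;
}

/* Hypothetical node record: a host lists its cards, a card reports its own ID. */
struct node {
    const char *coprocessors[MAX_COPROCS]; /* NULL-terminated unless full */
    const char *my_id;                     /* NULL if not a coprocessor */
    int hostid;
};

static void assign_hostids(struct node *nodes, int n)
{
    struct table t = {0};

    /* Cache step: record which host reported each coprocessor ID. */
    for (int i = 0; i < n; i++)
        for (int c = 0; c < MAX_COPROCS && nodes[i].coprocessors[c] != NULL; c++)
            table_put(&t, nodes[i].coprocessors[c], i);

    /* Single pass: a coprocessor takes its host's ID, everything else its own. */
    for (int i = 0; i < n; i++) {
        int h = (nodes[i].my_id != NULL) ? table_get(&t, nodes[i].my_id) : -1;
        nodes[i].hostid = (h >= 0) ? h : i;
    }
}
```

With a table lookup per node, the assignment pass is roughly linear in the number of nodes, instead of comparing every coprocessor against every host.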
Ralph Castain
772a376d73 Correct location of elog file
Refs trac:3847

This commit was SVN r29438.

The following Trac tickets were found above:
  Ticket 3847 --> https://svn.open-mpi.org/trac/ompi/ticket/3847
2013-10-14 19:21:45 +00:00
Ralph Castain
de6177cfbf Set ignores and minor cleanup
Refs trac:3763

This commit was SVN r29437.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2013-10-14 16:58:53 +00:00
Ralph Castain
bd0b13221b Cleanup ompi_info change to silence compiler warning
This commit was SVN r29436.
2013-10-14 16:57:50 +00:00
Ralph Castain
24c811805f ****************************************************************
This change contains a non-mandatory modification
       of the MPI-RTE interface. Anyone wishing to support
       coprocessors such as the Xeon Phi may wish to add
       the required definition and underlying support
****************************************************************

Add locality support for coprocessors such as the Intel Xeon Phi.

Detecting that we are on a coprocessor inside of a host node isn't straightforward. There are no good "hooks" provided for programmatically detecting that "we are on a coprocessor running its own OS", and the ORTE daemon just thinks it is on another node. However, in order to properly use the Phi's public interface for MPI transport, it is necessary that the daemon detect that it is colocated with procs on the host.

So we have to split the locality to separately record "on the same host" vs "on the same board". We already have the board-level locality flag, but not quite enough flexibility to handle this use-case. Thus, do the following:

1. add OPAL_PROC_ON_HOST flag to indicate we share a host, but not necessarily the same board

2. modify OPAL_PROC_ON_NODE to indicate we share both a host AND the same board. Note that we have to modify the OPAL_PROC_ON_LOCAL_NODE macro to explicitly check both conditions

3. add support in opal/mca/hwloc/base/hwloc_base_util.c for the host to check for coprocessors, and for daemons to check to see if they are on a coprocessor. The former is done via hwloc, but support for the latter is not yet provided by hwloc. So the code for detecting we are on a coprocessor currently is Xeon Phi specific - hopefully, we will find more generic methods in the future.

4. modify the orted and the hnp startup so they check for coprocessors and to see if they are on a coprocessor, and have the orteds pass that info back in their callback message. Automatically detect that coprocessors have been found and identify which coprocessors are on which hosts. Note that this algo isn't scalable at the moment - this will hopefully be improved over time.

5. modify the ompi proc locality detection function to look for coprocessor host info IF the OMPI_RTE_HOST_ID database key has been defined. RTE's that choose not to provide this support do not have to do anything - the associated code will simply be ignored.

6. include some cleanup of the hwloc open/close code so it conforms to how we did things in other frameworks (e.g., having a single "frame" file instead of open/close). Also, fix the locality flags - e.g., being on the same node means you must also be on the same cluster/cu, so ensure those flags are also set.

cmr:v1.7.4:reviewer=hjelmn

This commit was SVN r29435.
2013-10-14 16:52:58 +00:00
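
Items 1 and 2 of 24c811805f split "same host" from "same node" and note that the node check must explicitly test both conditions. A minimal sketch of that idea follows. All `SKETCH_*` names and bit values are hypothetical stand-ins, not the real `OPAL_PROC_ON_*` definitions, and setting the cluster/CU bits for any process on the same host is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for illustration only. */
typedef uint16_t sketch_locality_t;

#define SKETCH_ON_CLUSTER 0x0001u
#define SKETCH_ON_CU      0x0002u
#define SKETCH_ON_HOST    0x0004u  /* same physical host, possibly another board */
#define SKETCH_ON_BOARD   0x0008u  /* same board within that host */
#define SKETCH_ON_NODE    (SKETCH_ON_HOST | SKETCH_ON_BOARD)

/* Sharing a host needs only the host bit. */
static bool sketch_on_local_host(sketch_locality_t loc)
{
    return (loc & SKETCH_ON_HOST) != 0;
}

/* Sharing a node needs BOTH bits, so compare against the full mask;
 * a plain "loc & SKETCH_ON_NODE" would also be true for host-only. */
static bool sketch_on_local_node(sketch_locality_t loc)
{
    return (loc & SKETCH_ON_NODE) == SKETCH_ON_NODE;
}

/* Build a locality value from two facts about a peer process. */
static sketch_locality_t sketch_locality(bool same_host, bool same_board)
{
    sketch_locality_t loc = 0;
    assert(same_host || !same_board);  /* same board implies same host */
    if (same_host) {
        /* Assumption: sharing a host implies sharing the cluster and CU. */
        loc |= SKETCH_ON_CLUSTER | SKETCH_ON_CU | SKETCH_ON_HOST;
        if (same_board) loc |= SKETCH_ON_BOARD;
    }
    return loc;
}
```

Under these assumptions, a process on a coprocessor card and a process on its host share a host but not a node, so `sketch_on_local_host()` is true for the pair while `sketch_on_local_node()` is false.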
Mike Dubman
48c2728b1d globalexit() should use common approach using orte_errmgr.abort()
This commit was SVN r29434.
2013-10-14 09:48:35 +00:00
Ralph Castain
62a0d315ce Set svn ignores
Refs trac:3763

This commit was SVN r29433.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2013-10-13 23:19:58 +00:00
Mike Dubman
2edf853e7b fix warnings when "--with-pmi" is used
This commit was SVN r29432.
2013-10-13 14:03:23 +00:00
Mike Dubman
a8d333fadf fix gcc warnings
This commit was SVN r29431.
2013-10-12 20:03:44 +00:00
Mike Dubman
14304c299d add globalexit API support.
it is not fully functional yet, but initial version is good enough.
developed by Igor, reviewed by miked

This commit was SVN r29430.
2013-10-12 19:15:36 +00:00
Mike Dubman
2141e9e6b4 tools: Add oshmem_info utility
Reworked ompi_info tool to be close to the orte_info implementation.
ompi_info_register_types(), ompi_info_close_components() and
ompi_info_show_ompi_version() are moved to runtime/ompi_info_support.c.

Added runtime/oshmem_info_support layer that exports the following API
for use in the oshmem_info tool:
oshmem_info_register_types()
oshmem_info_register_framework_params()
oshmem_info_close_components()
oshmem_info_show_oshmem_version()
These functions call ompi_info_support related interfaces as long as
Oshmem supports Open MPI/SHMEM combination.

Now orte_info/ompi_info/oshmem_info have identical implementation approach.

Possible improvement:
OSHMEM processing of --config option is the same as OMPI's (code is duplicated).
Probably list of info_support interfaces can be extended by xxx_info_do_config().
developed by Igor, reviewed by miked

This commit was SVN r29429.
2013-10-12 19:03:32 +00:00
Mike Dubman
5a7dff2d15 fix icc warning
fixed by Dinar, reviewed by miked
cmr=v1.7.4:reviewer=ompi-gk1.7

This commit was SVN r29428.
2013-10-12 18:04:28 +00:00
Jeff Squyres
eabbcd1157 Fix a pair of minor errors in the affinity MPI extension.
* Use the right length for memset/strncpy
* Return the set return value (vs. unconditionally returning
  OMPI_SUCCESS)

cmr=v1.7.4:reviewer=dgoodell:subject=Fix a pair of minor errors in the affinity MPI extension

This commit was SVN r29427.
2013-10-11 21:17:38 +00:00
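
The two classes of error fixed in eabbcd1157 (a wrong length passed to memset/strncpy, and a status that was computed but not returned) are easy to reproduce in general. The sketch below is a generic illustration, not the code from the affinity extension; `copy_description` and its return convention are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into a fixed-size buffer.
 * Returns 0 on success, -1 if src had to be truncated. */
static int copy_description(char *dst, size_t dst_len, const char *src)
{
    int ret = 0;

    assert(dst != NULL && src != NULL && dst_len > 0);

    /* Use the destination's size, not sizeof a pointer or the source length. */
    memset(dst, 0, dst_len);

    /* Leave the last byte untouched so the result stays NUL-terminated. */
    strncpy(dst, src, dst_len - 1);

    if (strlen(src) >= dst_len) {
        ret = -1;  /* truncated */
    }

    /* Return the status that was actually computed. */
    return ret;
}
```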
Jeff Squyres
b5e2ae86ad Remove all of our "to-do" items from the README.txt.
This commit was SVN r29424.
2013-10-11 16:43:56 +00:00
Rolf vandeVaart
fbf143f3b4 Move another function that was missed in r29347.
This commit was SVN r29422.

The following SVN revision numbers were found above:
  r29347 --> open-mpi/ompi@ce61985503
2013-10-10 14:48:56 +00:00
Jeff Squyres
d9be19f011 Added shared library versions to those that were missing them.
The following common shared libraries did not have versioning:

 * ompi/common/ofacm
 * ompi/common/verbs
 * ompi/common/ugni

Additionally, we still had shared library versions in VERSION for the
following libraries, which no longer exist:

 * ompi/common/portals
 * opal/common/hwloc

This commit was SVN r29421.
2013-10-10 13:25:57 +00:00
Nathan Hjelm
e11233cb65 Remove unnecessary member from the comm idup context.
Refs trac:3796

This commit was SVN r29419.

The following Trac tickets were found above:
  Ticket 3796 --> https://svn.open-mpi.org/trac/ompi/ticket/3796
2013-10-09 22:00:17 +00:00
Mike Dubman
f7aae9a814 remove duplicate (thanks Jeff)
add epoll
cmr=v1.7.3:reviewer=jsquyres

This commit was SVN r29415.
2013-10-09 16:42:49 +00:00
Nathan Hjelm
50b4b92758 hostname may not NULL-terminate the string if the buffer is too small.
Thanks to Kevin M. Hildebrand for catching this.

cmr=v1.7.3:reviewer=jsquyres

This commit was SVN r29412.
2013-10-09 15:49:18 +00:00
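
The issue behind 50b4b92758 is a general POSIX one: when the buffer is too small, `gethostname()` may truncate the name, and it is unspecified whether the result is NUL-terminated. A common defensive pattern is sketched below. The wrapper name `safe_gethostname` is hypothetical and is not taken from the commit.

```c
#define _POSIX_C_SOURCE 200112L  /* expose gethostname() under strict C modes */

#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical wrapper: always leaves buf NUL-terminated.
 * Returns 0 on success, -1 if gethostname() reported an error. */
static int safe_gethostname(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    if (gethostname(buf, len) != 0) {
        buf[0] = '\0';       /* do not expose partial contents on failure */
        return -1;
    }

    /* Terminate explicitly: a truncated name may lack the trailing NUL. */
    buf[len - 1] = '\0';
    return 0;
}
```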
Jeff Squyres
2d7d7ab731 Absoft caught that bind(c) functions can't have OPTIONAL dummy arguments.
Also, removed an MPI_Aint that snuck through (which is a C type, not a Fortran
type).  Oddly, the Intel compiler complained about neither of these
issues.  :-\

This commit was SVN r29411.
2013-10-09 14:29:34 +00:00
Jeff Squyres
dc822de80f Fix typo/misspelling in variable name.
This commit was SVN r29410.
2013-10-09 14:04:25 +00:00
Jeff Squyres
98d71f555c Use the correct macro: OMPI_HAVE_FORTRAN_REAL16
Using the "#if defined(ompi_fortran_real16_t)" band-aid that I applied
in r29165 wasn't correct because if the compiler doesn't have a
fortran REAL16 type, then OMPI may well #define ompi_fortran_real16_t
to be empty, which then expands to Badness in the macro.  Hence, use
OMPI_HAVE_FORTRAN_REAL16, which will also be 0 or 1.

This commit was SVN r29408.

The following SVN revision numbers were found above:
  r29165 --> open-mpi/ompi@df7654e8cf
2013-10-09 13:59:36 +00:00
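
The pitfall described in 98d71f555c can be shown in isolation: a macro that is defined but expands to nothing still passes `#if defined(...)`, whereas a 0/1 feature macro reports what is actually available. The `SKETCH_*` and `sketch_*` names below are hypothetical stand-ins for `OMPI_HAVE_FORTRAN_REAL16` and `ompi_fortran_real16_t`.

```c
#include <assert.h>

/* Simulate a compiler without a Fortran REAL16 type. */
#define SKETCH_HAVE_REAL16 0   /* feature macro: always 0 or 1 */
#define sketch_real16_t        /* defined, but expands to nothing */

/* Fragile test: only asks whether the name is defined. */
#if defined(sketch_real16_t)
#define SKETCH_DEFINED_CHECK 1
#else
#define SKETCH_DEFINED_CHECK 0
#endif

/* Robust test: asks for the value of the feature macro. */
#if SKETCH_HAVE_REAL16
#define SKETCH_VALUE_CHECK 1
#else
#define SKETCH_VALUE_CHECK 0
#endif

/* The defined() test claims support even though the type is empty. */
static_assert(SKETCH_DEFINED_CHECK == 1, "defined() passes for an empty macro");
static_assert(SKETCH_VALUE_CHECK == 0, "the 0/1 macro reports no REAL16");
```

Code guarded by the `defined()` test would go on to use the empty type name and fail to compile, which matches the problem the commit message describes.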
Alex Mikheev
68b8e562d6 Fixed the availability check for mxm atomics. There is no need to run the code; successful compilation is enough to detect availability.
reviewed by miked

This commit was SVN r29407.
2013-10-09 11:52:04 +00:00
Mike Dubman
a9521f3abd Add mellanox platform files
reviewed by Amir
cmr=v1.7.4:reviewer=ompi-rm1.7

This commit was SVN r29406.
2013-10-09 10:06:39 +00:00
Ralph Castain
9902748108 ***** THIS INCLUDES A SMALL CHANGE IN THE MPI-RTE INTERFACE *****
Fix two problems that surfaced when using direct launch under SLURM:

1. locally store our own data because some BTLs want to retrieve 
   it during add_procs rather than use what they have internally

2. cleanup MPI_Abort so it correctly passes the error status all
   the way down to the actual exit. When someone implemented the
   "abort_peers" API, they left out the error status. So we lost
   it at that point and *always* exited with a status of 1. This 
   forces a change to the API to include the status.

cmr:v1.7.3:reviewer=jsquyres:subject=Fix MPI_Abort and modex_recv for direct launch

This commit was SVN r29405.
2013-10-08 18:37:59 +00:00
Jeff Squyres
7de2179866 Remove Cisco platform files from Makefile.am so we don't break "make dist"
Thanks to Mellanox/Jenkins for catching this before the nightly build
tonight!

This commit was SVN r29402.
2013-10-08 16:24:00 +00:00
Jeff Squyres
0d093176e1 Add missing .so version numbers for libmpi_usempif08.so.
cmr=v1.7.3:reviewer=rhc:subject=Add missing libmpi_usempif08 shared lib

This commit was SVN r29401.
2013-10-08 16:20:12 +00:00
Jeff Squyres
66dadbe1e7 Per RFC, remove the udapl BTL.
This commit was SVN r29400.
2013-10-08 15:18:59 +00:00
Ralph Castain
9389592e05 Fix --without-hwloc build
This commit was SVN r29399.
2013-10-08 15:02:47 +00:00
Jeff Squyres
8b1c0432f9 Remove unused files.
This commit was SVN r29398.
2013-10-08 14:56:43 +00:00
Jeff Squyres
a3606c4029 Add weak symbol #defines.
Fixes compile errors if you compile --without-weak-symbols.

This commit was SVN r29396.
2013-10-08 14:54:40 +00:00
Jeff Squyres
ad4abdbf53 Avoid using /* in the comment (which starts a C comment)
Since this header is included in .F90 files (which are preprocessed,
vs. .f90 files, which are *not* preprocessed), we don't want to
accidentally start C-style comments, which are recognized by some
Fortran compiler preprocessors (e.g., Absoft).

This commit was SVN r29394.
2013-10-08 14:22:55 +00:00
Jeff Squyres
ceecc04bfe Some Fortran compilers warn about ! comments after #endif.
So just move the comment to the prior line, and it's all good.  This
is obviously not *necessary*, but it helps cut down on warning noise.

This commit was SVN r29393.
2013-10-08 13:47:44 +00:00
Ralph Castain
2f9374e2b4 Update README to reflect removal of support for native Windows and reduced support for Solaris
cmr:v1.7.3:reviewer=jsquyres

This commit was SVN r29391.
2013-10-07 19:20:26 +00:00
Ralph Castain
6951976bc4 Update struct member name - this is why we put such things in the trunk before moving them to a branch, especially when coming from outside :-)
Refs trac:3830

This commit was SVN r29390.

The following Trac tickets were found above:
  Ticket 3830 --> https://svn.open-mpi.org/trac/ompi/ticket/3830
2013-10-07 15:43:43 +00:00
Ralph Castain
3d1fdf5528 Ensure we add --no-undefined to libtool for cygwin
This commit was SVN r29389.
2013-10-06 23:56:54 +00:00
Ralph Castain
13cd112fb4 Avoid use of interface in struct because cygwin compilers apparently object (go figure)
This commit was SVN r29388.
2013-10-06 23:55:38 +00:00
Ralph Castain
2d2307b6eb Modify libevent to support cygwin - patch will be pushed upstream
This commit was SVN r29387.
2013-10-06 23:53:31 +00:00
Mike Dubman
d2d533cf6c enable ikrit for np>=0 with mxm2
reviewed by Amir
Refs: #3763

This commit was SVN r29386.
2013-10-06 12:43:47 +00:00
Jeff Squyres
9f9fb5ce38 Add weak symbols for the MPI-3.1 MPI_Alloc_mem_cptr fortran subroutine.
This subroutine was added in an errata to MPI-3.0; here's the MPI
Forum ticket about it:
    
    https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/390

This commit was SVN r29385.
2013-10-04 22:43:40 +00:00
Jeff Squyres
c08f97b030 Fix problems with calling back-end BIND(C) ompi_*_f() functions.
BIND(C) doesn't let us have LOGICAL parameters, so we have to be
creative in how we invoke back-end ompi_*_f() C functions.
Additionally, the mpi_f08 type for MPI_Status presented some
difficulties, too.
    
See the large comment in
ompi/mpi/fortran/use-mpi-f08/mpi-f-interfaces-bind.h that explains
this in much more detail.

This commit was SVN r29384.
2013-10-04 22:43:07 +00:00
Jeff Squyres
f23d3bca64 Remove BIND(C) from all mpi_f08 interfaces.
This commit was SVN r29383.
2013-10-04 22:41:59 +00:00
Jeff Squyres
e81e3ccee0 Add missing implementation of MPI_FREE_MEM in the TKR mpi module.
This commit was SVN r29382.
2013-10-04 22:38:57 +00:00
Jeff Squyres
886e2cbf0f Remove and eliminate this extra redundant phrase.
This commit was SVN r29381.
2013-10-04 22:12:04 +00:00
Jeff Squyres
4899c531d3 Add PMPI versions for all ignore-TKR Fortran mpi module interfaces.
This commit was SVN r29380.
2013-10-04 21:58:56 +00:00
Rolf vandeVaart
3bd02fbaf5 Add one more verbose debug output that prints when we are out of memory.
This commit was SVN r29378.
2013-10-04 18:56:06 +00:00
Nathan Hjelm
b025babfec Correctly set the state for communicator requests. Refs trac:3796
This commit was SVN r29377.

The following Trac tickets were found above:
  Ticket 3796 --> https://svn.open-mpi.org/trac/ompi/ticket/3796
2013-10-04 17:34:54 +00:00
Nathan Hjelm
ba4b0f9235 Remove meaningless checks. Refs trac:3828
This commit was SVN r29375.

The following Trac tickets were found above:
  Ticket 3828 --> https://svn.open-mpi.org/trac/ompi/ticket/3828
2013-10-04 17:24:59 +00:00
Nathan Hjelm
70878764d7 Silence compiler warnings on 32-bit platforms. Refs trac:3828
This commit was SVN r29374.

The following Trac tickets were found above:
  Ticket 3828 --> https://svn.open-mpi.org/trac/ompi/ticket/3828
2013-10-04 15:53:14 +00:00
Rolf vandeVaart
dfd3883416 Update some CUDA-aware NEWS items.
This commit was SVN r29372.
2013-10-04 15:32:44 +00:00