18898 Commits

Author SHA1 Message Date
Dave Goodell
a6ed232a10 usnic: fix several memory leaks
- some free lists simply were not being OBJ_DESTRUCTed, so they never
  freed their internal memory

- channel->recv_segs.ctx was being assigned in a way that got clobbered
  by ompi_free_list_init_new, so the cleanup code that relied on it
  being set never ran

- numerous other ".ctx" assignments were similarly ineffectual and were
  not being consumed, so I deleted them

Reviewed-by: Jeff Squyres <jsquyres@cisco.com>

This commit was SVN r29484.
2013-10-23 15:51:22 +00:00
Dave Goodell
c9b2343982 usnic: add ompi_btl_usnic_component_debug helper
This new routine can be called in exceptional situations, either
conditionally in BTL code or from a debugger, to help with debugging in
cases where MSGDEBUG1/2 or stats logging are impractical but more detail
is needed.

Reviewed-by: Jeff Squyres <jsquyres@cisco.com>

This commit was SVN r29483.
2013-10-23 15:51:11 +00:00
Dave Goodell
d0b7d125b2 usnic: refactor usnic_stats_callback
Pull the bulk of the functionality out into a new routine,
ompi_btl_usnic_print_stats, which can be used in other debugging
contexts.  This also lets us eliminate the module->final_stats state
tracking.

Reviewed-by: Jeff Squyres <jsquyres@cisco.com>

This commit was SVN r29482.
2013-10-23 15:50:57 +00:00
Nathan Hjelm
d34a4300b8 Fix various bugs in mca_base_pvar.
Fixes:

 - Segmentation fault when using watermark variables.

 - Segmentation fault when using a handle bound to a no longer valid
   performance variable.

 - Incorrect return codes from MPI_T_pvar_* functions.

cmr=v1.7.4:reviewer=jsquyres

This commit was SVN r29481.
2013-10-23 15:47:15 +00:00
Jeff Squyres
0fb8edd720 Trivial comment change
This commit was SVN r29480.
2013-10-23 10:15:18 +00:00
Mike Dubman
d6ead2a3a5 Add support for routable RoCE, where a different subnet_id is valid for proceeding with MPI routing.
(can happen in the same LAN)
developed by vasily, reviewed by miked
cmr=v1.7.4:reviewer=ompi-gk1.7

This commit was SVN r29479.
2013-10-23 06:08:54 +00:00
Ralph Castain
16b1ad052f Silence compiler warning
This commit was SVN r29478.
2013-10-23 04:13:51 +00:00
Ralph Castain
7c86a843c8 Silence compiler warning
This commit was SVN r29477.
2013-10-23 04:13:36 +00:00
Ralph Castain
960a255e7f Do some cleanup of the --without-hwloc build: no need to work on coprocessors since we can't detect them anyway. Also clean up some unused variables in the ppr mapper
This commit was SVN r29476.
2013-10-23 01:45:21 +00:00
Nathan Hjelm
5bf6555604 Fix locality in the case where the OMPI_RTE_HOST_ID is not found.
cmr=v1.7.4:ticket=3847

This commit was SVN r29475.

The following Trac tickets were found above:
  Ticket 3847 --> https://svn.open-mpi.org/trac/ompi/ticket/3847
2013-10-22 19:07:03 +00:00
Rolf vandeVaart
3c916d55c9 Fix two issues pointed out in review of ticket #3870.
This commit was SVN r29473.
2013-10-22 17:28:12 +00:00
Mike Dubman
d27cffedb9 expand tabs to 4 spaces
cd ompi/mca/coll/fca
for i in *.[ch]; do expand -t 4 $i > koko && mv koko $i; done
Refs: #3799

This commit was SVN r29472.
2013-10-22 17:05:55 +00:00
Jeff Squyres
6714890244 paffinity.h is gone and won't be coming back.
This commit was SVN r29467.
2013-10-22 15:59:00 +00:00
Nathan Hjelm
7bee047e5d Fix reentry check in communicator request progress.
cmr=v1.7.4:ticket=3796

This commit was SVN r29465.

The following Trac tickets were found above:
  Ticket 3796 --> https://svn.open-mpi.org/trac/ompi/ticket/3796
2013-10-22 15:33:39 +00:00
Nathan Hjelm
280a89448f Make btl/vader valgrind safe.
cmr=v1.7.4:reviewer=samuel

This commit was SVN r29464.
2013-10-22 15:33:32 +00:00
Jeff Squyres
506d0e96f4 Fix the IN_PLACE detection for Fortran SCATTER and SCATTERV.
Thanks to Charles Gerlach for identifying the issue.

Oddly, this issue exists in trunk and v1.7, but ''not'' in the v1.6
tree (!).

cmr=v1.7.4:reviewer=hjelmn

This commit was SVN r29463.
2013-10-22 14:55:17 +00:00
Jeff Squyres
09fae6e62b Prefix DSO filenames with "lib" so that Automake doesn't complain.
Follow the convention established by the ompi/mca/common/sm tree and
prefix both the "install" and "no install" versions of the build with
"lib" so that Automake doesn't complain.  Differentiate the two by
adding a "_noinst" suffix to the "no install" version.

This commit was SVN r29462.
2013-10-22 13:16:33 +00:00
Matthias Jurenz
582d5337ce Changes to VT's configure:
- removed potential double-'/' in CUPTIDIR which causes trouble with rpmbuild's debugedit program (fixes trac:3854)

This commit was SVN r29461.

The following Trac tickets were found above:
  Ticket 3854 --> https://svn.open-mpi.org/trac/ompi/ticket/3854
2013-10-22 09:02:11 +00:00
Mike Dubman
c33d5c0b59 issue with bml init/fin for yoda component
bml can be initialized by a component other than yoda, and in that case yoda
should not call bml finalization.

This commit was SVN r29458.
2013-10-22 06:13:00 +00:00
Ralph Castain
25a84c7f0a Fix build --without-hwloc
This commit was SVN r29453.
2013-10-19 23:12:33 +00:00
Mike Dubman
fb356ee523 Script to generate svn2git mirror on github
Please read comments to customize various locations if you intend to run it locally

This commit was SVN r29449.
2013-10-17 06:49:58 +00:00
Mike Dubman
2d76a9be45 add --enable-oshmem-fortran opt to configure
This commit was SVN r29448.
2013-10-17 05:42:43 +00:00
Mike Dubman
ae78eef749 enable fortran for shmem, add O3 by default
This commit was SVN r29447.
2013-10-17 05:39:20 +00:00
Rolf vandeVaart
0cd1e8dfd9 Add runtime support to turn off CUDA IPC support.
This commit was SVN r29444.
2013-10-16 16:48:18 +00:00
Rolf vandeVaart
9f83405c78 Fix one more corner case initialization issue.
This commit was SVN r29443.
2013-10-16 16:39:19 +00:00
Ralph Castain
f128b2093d Add missing file to tarball
This commit was SVN r29442.
2013-10-16 14:38:26 +00:00
Mike Dubman
30013f0339 remove --with-pmi, not all systems have JobScheduler
reviewed by yossi
cmr=v1.7.3:reviewer=ompi-gk1.7

This commit was SVN r29441.
2013-10-16 08:56:13 +00:00
Ralph Castain
b12167abef Per a good suggestion from Jeff, make the coprocessor mapping more scalable by using a hash table to cache the coprocessor list, and then do a single pass thru the nodes at the end to assign hostid's.
Refs trac:3847

This commit was SVN r29439.

The following Trac tickets were found above:
  Ticket 3847 --> https://svn.open-mpi.org/trac/ompi/ticket/3847
2013-10-14 22:01:48 +00:00
Ralph Castain
772a376d73 Correct location of elog file
Refs trac:3847

This commit was SVN r29438.

The following Trac tickets were found above:
  Ticket 3847 --> https://svn.open-mpi.org/trac/ompi/ticket/3847
2013-10-14 19:21:45 +00:00
Ralph Castain
de6177cfbf Set ignores and minor cleanup
Refs trac:3763

This commit was SVN r29437.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2013-10-14 16:58:53 +00:00
Ralph Castain
bd0b13221b Cleanup ompi_info change to silence compiler warning
This commit was SVN r29436.
2013-10-14 16:57:50 +00:00
Ralph Castain
24c811805f ****************************************************************
This change contains a non-mandatory modification
       of the MPI-RTE interface. Anyone wishing to support
       coprocessors such as the Xeon Phi may wish to add
       the required definition and underlying support
****************************************************************

Add locality support for coprocessors such as the Intel Xeon Phi.

Detecting that we are on a coprocessor inside of a host node isn't straightforward. There are no good "hooks" provided for programmatically detecting that "we are on a coprocessor running its own OS", and the ORTE daemon just thinks it is on another node. However, in order to properly use the Phi's public interface for MPI transport, it is necessary that the daemon detect that it is colocated with procs on the host.

So we have to split the locality to separately record "on the same host" vs "on the same board". We already have the board-level locality flag, but not quite enough flexibility to handle this use-case. Thus, do the following:

1. add OPAL_PROC_ON_HOST flag to indicate we share a host, but not necessarily the same board

2. modify OPAL_PROC_ON_NODE to indicate we share both a host AND the same board. Note that we have to modify the OPAL_PROC_ON_LOCAL_NODE macro to explicitly check both conditions

3. add support in opal/mca/hwloc/base/hwloc_base_util.c for the host to check for coprocessors, and for daemons to check to see if they are on a coprocessor. The former is done via hwloc, but support for the latter is not yet provided by hwloc. So the code for detecting we are on a coprocessor currently is Xeon Phi specific - hopefully, we will find more generic methods in the future.

4. modify the orted and the hnp startup so they check for coprocessors and to see if they are on a coprocessor, and have the orteds pass that info back in their callback message. Automatically detect that coprocessors have been found and identify which coprocessors are on which hosts. Note that this algo isn't scalable at the moment - this will hopefully be improved over time.

5. modify the ompi proc locality detection function to look for coprocessor host info IF the OMPI_RTE_HOST_ID database key has been defined. RTE's that choose not to provide this support do not have to do anything - the associated code will simply be ignored.

6. include some cleanup of the hwloc open/close code so it conforms to how we did things in other frameworks (e.g., having a single "frame" file instead of open/close). Also, fix the locality flags - e.g., being on the same node means you must also be on the same cluster/cu, so ensure those flags are also set.

cmr:v1.7.4:reviewer=hjelmn

This commit was SVN r29435.
2013-10-14 16:52:58 +00:00
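The host/board split described in commit 24c811805f can be pictured as two bits in a locality mask. The sketch below is illustrative only: the SKETCH_* names, the bit values, and the helper functions are assumptions made for this example, not the definitions in the OPAL headers. It shows why a "same node" test has to check both conditions once "same host" no longer implies "same board".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical locality bits; the real OPAL values are not shown in the log. */
typedef uint16_t sketch_locality_t;

#define SKETCH_ON_HOST   0x0001u  /* shares a host, possibly on another board */
#define SKETCH_ON_BOARD  0x0002u  /* shares the same board */
#define SKETCH_ON_NODE   (SKETCH_ON_HOST | SKETCH_ON_BOARD)

/* True when the peer shares our host (it may still be on a coprocessor). */
static inline int sketch_on_local_host(sketch_locality_t l)
{
    return (l & SKETCH_ON_HOST) != 0;
}

/* True only when BOTH conditions hold: same host and same board. */
static inline int sketch_on_local_node(sketch_locality_t l)
{
    return (l & SKETCH_ON_NODE) == SKETCH_ON_NODE;
}

/* Usage: a peer on a coprocessor inside our host vs. a peer on our board. */
void sketch_locality_examples(void)
{
    sketch_locality_t coproc_peer = SKETCH_ON_HOST;
    sketch_locality_t board_peer  = SKETCH_ON_HOST | SKETCH_ON_BOARD;

    assert(sketch_on_local_host(coproc_peer));
    assert(!sketch_on_local_node(coproc_peer));
    assert(sketch_on_local_node(board_peer));
}
```

Testing with `!= 0` against the combined mask would report a coprocessor peer as node-local, which is the case the explicit two-condition check avoids.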
Mike Dubman
48c2728b1d globalexit() should use common approach using orte_errmgr.abort()
This commit was SVN r29434.
2013-10-14 09:48:35 +00:00
Ralph Castain
62a0d315ce Set svn ignores
Refs trac:3763

This commit was SVN r29433.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2013-10-13 23:19:58 +00:00
Mike Dubman
2edf853e7b fix warnings when "--with-pmi" is used
This commit was SVN r29432.
2013-10-13 14:03:23 +00:00
Mike Dubman
a8d333fadf fix gcc warnings
This commit was SVN r29431.
2013-10-12 20:03:44 +00:00
Mike Dubman
14304c299d add globalexit API support.
it is not fully functional yet, but initial version is good enough.
developed by Igor, reviewed by miked

This commit was SVN r29430.
2013-10-12 19:15:36 +00:00
Mike Dubman
2141e9e6b4 tools: Add oshmem_info utility
Reworked the ompi_info tool to be close to the orte_info implementation.
ompi_info_register_types(), ompi_info_close_components() and
ompi_info_show_ompi_version() are moved to runtime/ompi_info_support.c.

Added a runtime/oshmem_info_support layer that exports the following API
for use in the oshmem_info tool:
oshmem_info_register_types()
oshmem_info_register_framework_params()
oshmem_info_close_components()
oshmem_info_show_oshmem_version()
These functions call ompi_info_support related interfaces as long as
Oshmem supports Open MPI/SHMEM combination.

Now orte_info/ompi_info/oshmem_info have identical implementation approach.

Possible improvement:
OSHMEM processing of the --config option is the same as OMPI's (the code is duplicated).
The list of info_support interfaces could probably be extended with xxx_info_do_config().
developed by Igor, reviewed by miked

This commit was SVN r29429.
2013-10-12 19:03:32 +00:00
Mike Dubman
5a7dff2d15 fix icc warning
fixed by Dinar, reviewed by miked
cmr=v1.7.4:reviewer=ompi-gk1.7

This commit was SVN r29428.
2013-10-12 18:04:28 +00:00
Jeff Squyres
eabbcd1157 Fix a pair of minor errors in the affinity MPI extension.
* Use the right length for memset/strncpy
* Return the set return value (vs. unconditionally returning
  OMPI_SUCCESS)

cmr=v1.7.4:reviewer=dgoodell:subject=Fix a pair of minor errors in the affinity MPI extension

This commit was SVN r29427.
2013-10-11 21:17:38 +00:00
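Commit eabbcd1157 does not say which length was wrong or where the return value was lost, so the following is only a generic sketch of the two patterns it names: size memset/strncpy by the destination buffer, and return the status that was actually set. The function name and the SKETCH_* status codes are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical status codes for this sketch only. */
#define SKETCH_SUCCESS        0
#define SKETCH_ERR_TRUNCATED  (-1)

/* Copy src into dst (capacity dst_len bytes), always NUL-terminated. */
int sketch_copy_string(char *dst, size_t dst_len, const char *src)
{
    int ret = SKETCH_SUCCESS;

    assert(dst != NULL && src != NULL && dst_len > 0);

    /* Clear the whole destination, using its real capacity. */
    memset(dst, 0, dst_len);

    if (strlen(src) >= dst_len) {
        ret = SKETCH_ERR_TRUNCATED;   /* remember that data was cut off */
    }

    /* Copy at most dst_len - 1 bytes so the final zero byte survives. */
    strncpy(dst, src, dst_len - 1);

    /* Return the status that was set, not an unconditional success. */
    return ret;
}
```

A caller would pass `sizeof buf` for a local array; passing `sizeof` of a pointer is one common way the wrong length ends up in such calls.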
Jeff Squyres
b5e2ae86ad Remove all of our "to-do" items from the README.txt.
This commit was SVN r29424.
2013-10-11 16:43:56 +00:00
Rolf vandeVaart
fbf143f3b4 Move another function that was missed in r29347.
This commit was SVN r29422.

The following SVN revision numbers were found above:
  r29347 --> open-mpi/ompi@ce61985503
2013-10-10 14:48:56 +00:00
Jeff Squyres
d9be19f011 Added shared library versions to those that were missing them.
The following common shared libraries did not have versioning:

 * ompi/common/ofacm
 * ompi/common/verbs
 * ompi/common/ugni

Additionally, we still had shared library versions in VERSION for the
following libraries, which no longer exist:

 * ompi/common/portals
 * opal/common/hwloc

This commit was SVN r29421.
2013-10-10 13:25:57 +00:00
Nathan Hjelm
e11233cb65 Remove unnecessary member from the comm idup context.
Refs trac:3796

This commit was SVN r29419.

The following Trac tickets were found above:
  Ticket 3796 --> https://svn.open-mpi.org/trac/ompi/ticket/3796
2013-10-09 22:00:17 +00:00
Mike Dubman
f7aae9a814 remove duplicate (thanks Jeff)
add epoll
cmr=v1.7.3:reviewer=jsquyres

This commit was SVN r29415.
2013-10-09 16:42:49 +00:00
Nathan Hjelm
50b4b92758 hostname may not NULL-terminate the string if the buffer is too small.
Thanks to Kevin M. Hildebrand for catching this.

cmr=v1.7.3:reviewer=jsquyres

This commit was SVN r29412.
2013-10-09 15:49:18 +00:00
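Assuming commit 50b4b92758 refers to the POSIX gethostname() call, the usual defensive pattern is to terminate the buffer yourself, because POSIX leaves it unspecified whether a truncated name includes the terminating null byte. The helper name below is hypothetical and is not taken from the Open MPI tree.

```c
#define _POSIX_C_SOURCE 200112L  /* ask for the gethostname() declaration */
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: fetch the host name and guarantee a valid C string.
 * Returns 0 on success, -1 if gethostname() reports an error. */
int sketch_safe_gethostname(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);   /* caller must supply a real buffer */

    if (gethostname(buf, len) != 0) {
        buf[0] = '\0';                /* leave an empty string on failure */
        return -1;
    }

    /* A truncated name may lack its terminator, so always write one. */
    buf[len - 1] = '\0';
    return 0;
}
```

With a buffer that is too small, some implementations truncate silently and others return an error; in both cases the caller ends up with a terminated string.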
Jeff Squyres
2d7d7ab731 Absoft caught that bind(c) functions can't have OPTIONAL dummy arguments.
Also, removed an MPI_Aint that snuck through (which is a C type, not a Fortran
type).  Oddly, the Intel compiler complained about neither of these
issues.  :-\

This commit was SVN r29411.
2013-10-09 14:29:34 +00:00
Jeff Squyres
dc822de80f Fix typo/misspelling in variable name.
This commit was SVN r29410.
2013-10-09 14:04:25 +00:00
Jeff Squyres
98d71f555c Use the correct macro: OMPI_HAVE_FORTRAN_REAL16
Using the "#if defined(ompi_fortran_real16_t)" band-aid that I applied
in r29165 wasn't correct because if the compiler doesn't have a
fortran REAL16 type, then OMPI may well #define ompi_fortran_real16_t
to be empty, which then expands to Badness in the macro.  Hence, use
OMPI_HAVE_FORTRAN_REAL16, which will always be 0 or 1.

This commit was SVN r29408.

The following SVN revision numbers were found above:
  r29165 --> open-mpi/ompi@df7654e8cf
2013-10-09 13:59:36 +00:00
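The difference between the two guards in commit 98d71f555c can be reproduced with the preprocessor alone. In this sketch the two macros are defined by hand as stand-ins for what configure might emit on a compiler without a Fortran REAL16 type (an assumption for illustration); the SKETCH_* names are hypothetical.

```c
#include <assert.h>

/* Stand-ins for configure output when REAL16 is unavailable: the type macro
 * exists but expands to nothing, while the feature macro is 0. */
#define ompi_fortran_real16_t
#define OMPI_HAVE_FORTRAN_REAL16 0

/* Guard 1: "is it defined?" is true even though the macro is empty. */
#if defined(ompi_fortran_real16_t)
#define SKETCH_GUARD_BY_DEFINED 1
#else
#define SKETCH_GUARD_BY_DEFINED 0
#endif

/* Guard 2: test the 0/1 feature macro by value. */
#if OMPI_HAVE_FORTRAN_REAL16
#define SKETCH_GUARD_BY_FEATURE 1
#else
#define SKETCH_GUARD_BY_FEATURE 0
#endif

void sketch_check_guards(void)
{
    assert(SKETCH_GUARD_BY_DEFINED == 1);  /* REAL16 code would be compiled */
    assert(SKETCH_GUARD_BY_FEATURE == 0);  /* REAL16 code is skipped */
}
```

Code behind the first guard would go on to use the empty type macro, which matches the failure the commit message describes; the second guard skips that code.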
Alex Mikheev
68b8e562d6 Fixed the availability check for mxm atomics. No need to run the code; successful compilation is enough to detect availability.
reviewed by miked

This commit was SVN r29407.
2013-10-09 11:52:04 +00:00