Commit graph

20606 commits

Author SHA1 Message Date
Jeff Squyres
8e52ba423f finalize/disconnect: add explicit comment about why we use an RTE barrier
Based on extensive discussions before/at the June 2014 developer's
meeting, put a lengthy comment explaining a second reason why we
''must'' use an RTE barrier during MPI_FINALIZE and
MPI_COMM_DISCONNECT (i.e., unreliable transports).  Also explain the
original reason why we do this in slightly more detail (BTLs can
lie/buffer a message without actually injecting it on the network).

This commit was SVN r32095.
2014-06-26 14:31:40 +00:00
Adrian Reber
47b118c0ae fix FT compilation
This commit was SVN r32094.
2014-06-26 03:40:07 +00:00
Adrian Reber
cabf1d4e68 use the orte attributes in the FT code to fix compile errors
This commit was SVN r32093.
2014-06-26 03:19:17 +00:00
Adrian Reber
10c1a50705 "handle" removal of opal_db.remove() in the FT code
This commit was SVN r32092.
2014-06-26 03:11:37 +00:00
Dave Goodell
f6bb853409 usnic: properly check src iface in route queries
rtnetlink doesn't check the source address when determining whether to
return route info for a query.  So we need to check that the OIF matches
the OIF of the source interface name.  Without this check, OMPI might
pair a local interface which does not have a route to a particular
remote interface.

Fixes Cisco bug CSCup55797.

Reviewed-by: Jeff Squyres <jsquyres@cisco.com>

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32090.
2014-06-25 22:39:02 +00:00
Ralph Castain
f3cb124e50 Revert r32082 and r32070 - the developer's conference has decided to go a different direction on the threaded progress effort. This will involve some degree of prototyping to understand the tradeoffs prior to making a final design decision, and so we'll hold off on the final change until that is completed.
This commit was SVN r32089.

The following SVN revision numbers were found above:
  r32070 --> open-mpi/ompi@12d92d0c22
  r32082 --> open-mpi/ompi@aa6438ef7a
2014-06-25 20:43:28 +00:00
Adrian Reber
9f73e79d91 also change the callback function prototype (to get the FT code to compile again)
This commit was SVN r32088.
2014-06-25 20:37:02 +00:00
Adrian Reber
4aca7095dc fix a syntax error in the FT code
This commit was SVN r32087.
2014-06-25 20:35:50 +00:00
Adrian Reber
4b25e92194 get the FT code to compile again by adding/removing #includes
This commit was SVN r32086.
2014-06-25 18:42:17 +00:00
Ralph Castain
8fca77c3d3 Protect the binding policy setting so it builds when --without-hwloc
Refs trac:4742

This commit was SVN r32085.

The following Trac tickets were found above:
  Ticket 4742 --> https://svn.open-mpi.org/trac/ompi/ticket/4742
2014-06-25 18:13:54 +00:00
Adrian Reber
72f1c7941f use a consistent naming scheme for the SNAPSHOT attributes
This commit was SVN r32083.
2014-06-25 15:26:24 +00:00
Rolf vandeVaart
aa6438ef7a Remove OMPI_USE_PROGRESS_THREADS that was missed.
This commit was SVN r32082.
2014-06-25 14:42:21 +00:00
Gilles Gouaillardet
53ae38cfb1 Handle error case in mca_spml_yoda_register
if source memory could not be registered, then return NULL
some cleanup might be needed, please refer to the FIXME in the code

cmr=v1.8.2:reviewer=miked

This commit was SVN r32081.
2014-06-25 08:58:45 +00:00
MPI Team
db35021a6c Update git/hg ignore files
This commit was SVN r32080.
2014-06-25 05:00:31 +00:00
Gilles Gouaillardet
fae7adf8ee Remove legacy FCA_IS_LOCAL_PROCESS macro
and use OPAL_PROC_ON_LOCAL_NODE instead

cmr=v1.8.2:reviewer=rhc

This commit was SVN r32079.
2014-06-25 02:37:53 +00:00
Ralph Castain
f70b4a33ec Per the developer conference, let's be a little nicer during MPI_Finalize and ease up on the cpu by inserting usleep into the loop over opal_progress while waiting for the RTE barrier to complete. This is a non-performant area of the code, and while most codes may call finalize at close-to-similar times, there are some that may choose to have one or more procs continue to perform some work prior to finalizing.
So save a little power while we are waiting.

cmr=v1.8.2:reviewer=jladd:subject=save power during finalize

This commit was SVN r32077.
2014-06-24 21:59:50 +00:00
Nathan Hjelm
563eaf0726 Fix support for Cray alps
The alps ras and plm components were broken by recent changes in ORTE. This
commit resolves those issues.

Changes:

 - Define PMI2_SUCCESS if it isn't defined. This fixes a problem with Cray's
   PMI implementation which does not define (for some reason) PMI2_SUCCESS. We
   had previously just used PMI_SUCCESS.

 - Add a missing definition and fix a typo in pml_alps_module.

 - launch_id is no longer available in the orte_node_t structure. Use the
   attribute lookup to get the value.

 - Do not use an O(n^2) sorting algorithm when putting alps nodes in order. Use
   opal_list_sort instead (O(nlogn)).

This commit was SVN r32076.
2014-06-24 21:29:04 +00:00
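Note on 563eaf0726: the last item in the change list replaces a quadratic ordering pass with a library sort. The sketch below shows the same idea using the standard C qsort as a stand-in, since opal_list_sort operates on Open MPI's own list type. The node_entry struct and both function names are hypothetical, and the C standard does not itself guarantee qsort's complexity, although common implementations are O(n log n) on average.

```c
#include <stdlib.h>

/* Hypothetical node record; the real code works on orte_node_t items. */
typedef struct {
    int launch_id;
    const char *name;
} node_entry;

/* Comparison callback: order nodes by ascending launch_id.
 * The subtraction-free form avoids integer overflow. */
static int compare_launch_id(const void *a, const void *b)
{
    const node_entry *x = a;
    const node_entry *y = b;
    return (x->launch_id > y->launch_id) - (x->launch_id < y->launch_id);
}

/* Sort the array in place with the library sort instead of a
 * hand-written nested-loop insertion pass. */
static void sort_nodes(node_entry *nodes, size_t count)
{
    qsort(nodes, count, sizeof nodes[0], compare_launch_id);
}
```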
Jeff Squyres
bce33635a7 sctp: remove from trunk
At the developer meeting today, the question was raised as to whether
the SCTP BTL was maintained any more.  I emailed Alan Wagner to see if
he had any interest/resources to continue to maintain the SCTP BTL.
He indicated that he unfortunately did not have any resources to maintain it;
it would be fine to remove the SCTP BTL from the trunk.

So long, SCTP BTL... fare thee well...

This commit was SVN r32075.
2014-06-24 21:23:09 +00:00
Jeff Squyres
69fa331cc2 openib/ugni: output verbose message when a BTL is ignored due to THREAD_MULTIPLE
usnic and portals4 already do this.

cmr=v1.8.2:reviewer=hjelmn

This commit was SVN r32074.
2014-06-24 21:13:17 +00:00
Jeff Squyres
d7a2d964f0 usnic: use the correct output stream name
cmr=v1.8.2:reviewer=dgoodell

This commit was SVN r32073.
2014-06-24 18:13:49 +00:00
Ralph Castain
5f6be06b54 Per request from Gilles and discussion at devel conference, have the --oversubscribe option automatically set both oversubscribe and overload-allowed properties as this is likely what the user intended.
cmr=v1.8.2:reviewer=rhc:subject=automatically set oversub/load

This commit was SVN r32072.
2014-06-24 18:11:39 +00:00
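Note on 5f6be06b54: the behavior described, one option implying two properties, can be sketched with bit flags. The flag names and the function below are hypothetical and do not reflect how ORTE actually stores its mapping and binding policies.

```c
/* Hypothetical policy bits; the real ORTE policy representation differs. */
enum {
    POLICY_OVERSUBSCRIBE    = 1u << 0,  /* more procs than slots allowed */
    POLICY_OVERLOAD_ALLOWED = 1u << 1   /* more procs than cpus allowed  */
};

/* Applying --oversubscribe sets both properties at once,
 * leaving any other bits in the policy untouched. */
static unsigned apply_oversubscribe_option(unsigned policy)
{
    return policy | POLICY_OVERSUBSCRIBE | POLICY_OVERLOAD_ALLOWED;
}
```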
Jeff Squyres
fb9d063be2 Fortran: include the type functions (eq/ne) in libmpi_usempif08
This file has to be pre-emptively compiled to generate the module, but
then it also has to be included in libmpi_usempif08.

cmr=v1.8.2:ticket=trac:4736

This commit was SVN r32071.

The following Trac tickets were found above:
  Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
2014-06-24 17:48:15 +00:00
Ralph Castain
12d92d0c22 Per the OMPI developer conference, remove the last vestiges of OMPI_USE_PROGRESS_THREADS
This commit was SVN r32070.
2014-06-24 17:05:11 +00:00
Ralph Castain
1949f485ac Update platform file
cmr=v1.8.2:reviewer=ompi-gk1.8

This commit was SVN r32069.
2014-06-24 13:53:05 +00:00
Ralph Castain
20535bca19 Reorder the var release so a debugger can still see the var name that caused a segfault, thus helping to identify the var in question
cmr=v1.8.2:reviewer=hjelmn

This commit was SVN r32068.
2014-06-24 13:51:31 +00:00
Gilles Gouaillardet
926e29c972 Fortran: add ompi/mpi/fortran/use-mpi-f08/mpi-f08-sizeof.F90 to the dist tarball.
cmr=v1.8.2:ticket=trac:4736

This commit was SVN r32065.

The following Trac tickets were found above:
  Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
2014-06-23 04:14:28 +00:00
Gilles Gouaillardet
d1f5d9f675 Fortran: fix OMPI_GENERATE_F77_BINDINGS macro invocation
Some parameters were omitted and compilation failed if
configured with --disable-weak-symbols

cmr=v1.8.2:ticket=trac:4736

This commit was SVN r32064.

The following Trac tickets were found above:
  Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
2014-06-23 02:10:35 +00:00
Ralph Castain
34e5573988 Resolve the MTT timeout problem. This appears to have largely been caused by missing sigchld notifications, thus causing the daemons to believe that not all procs had exited. Let comm failure also serve as notification of process termination, and add appropriate flags/attributes to avoid multiple reporting of proc termination.
This won't transition cleanly to the 1.8 series, and may represent too much change, so we'll have to (a) evaluate whether or not to bring it over (once it demonstrates that it does indeed solve the problem), and (b) develop a custom patch for that purpose.

Refs trac:4717

This commit was SVN r32063.

The following Trac tickets were found above:
  Ticket 4717 --> https://svn.open-mpi.org/trac/ompi/ticket/4717
2014-06-21 17:09:02 +00:00
Jeff Squyres
011db6974e usnic: refactor usnic_add_procs() into 2 distinct parts
1: find/create procs, and create associated endpoint for each
2: resolve peer addresses

The 2nd part is done as a separate loop so that the address lookups
can be parallelized.

The overall result is to split usnic_add_procs() into two smaller,
simpler parts.

cmr=v1.8.2:ticket=trac:4734

This commit was SVN r32062.

The following Trac tickets were found above:
  Ticket 4734 --> https://svn.open-mpi.org/trac/ompi/ticket/4734
2014-06-20 20:58:36 +00:00
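Note on 011db6974e: the two-part structure described in the message can be sketched as two separate loops. This is only an outline under assumed types; struct peer and all three functions are hypothetical, and the real second pass performs address lookups that may run concurrently rather than setting a flag.

```c
#include <stddef.h>

/* Hypothetical per-peer state. */
struct peer {
    int rank;
    int has_endpoint;    /* set by pass 1 */
    int addr_resolved;   /* set by pass 2 */
};

/* Pass 1: find/create each proc and its associated endpoint. */
static void create_endpoints(struct peer *peers, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        peers[i].has_endpoint = 1;
    }
}

/* Pass 2: resolve addresses for every peer that has an endpoint.
 * Keeping this as its own loop is what lets the lookups be issued
 * together instead of one at a time inside pass 1. */
static size_t resolve_addresses(struct peer *peers, size_t n)
{
    size_t resolved = 0;
    for (size_t i = 0; i < n; ++i) {
        if (peers[i].has_endpoint) {
            peers[i].addr_resolved = 1;
            ++resolved;
        }
    }
    return resolved;
}

/* The combined entry point is now just the two passes in order. */
static size_t add_peers(struct peer *peers, size_t n)
{
    create_endpoints(peers, n);
    return resolve_addresses(peers, n);
}
```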
Jeff Squyres
1ea7bad5a0 usnic: behave better when ibv_create_ah() fails
When ibv_create_ah() fails due to an address resolution failure, it
really only means that we can't reach that one peer -- so we should
just ignore that one peer.  If ibv_create_ah() fails for some other
reason, then give up on the entire usnic_X device.

Change the show_help() message that is displayed when ibv_create_ah()
fails due to address resolution failure; indicate that it's likely a
routing problem.  Also opal_output_verbose() the same info, since
show_help() is de-duplicated (and this particular show_help() message
can be squelched).

Fixes Cisco bugs CSCup35851 and CSCup35872.

cmr=v1.8.2:ticket=trac:4734

This commit was SVN r32061.

The following Trac tickets were found above:
  Ticket 4734 --> https://svn.open-mpi.org/trac/ompi/ticket/4734
2014-06-20 20:53:50 +00:00
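Note on 1ea7bad5a0: the policy in the first paragraph of the message (skip one peer on an address resolution failure, abandon the device on any other failure) can be sketched as follows. The ah_status enum and count_reachable_peers are hypothetical; the sketch does not call ibv_create_ah(), and how the real code distinguishes the two failure kinds is not stated in the message.

```c
/* Hypothetical outcome of trying to create an address handle for a peer. */
typedef enum {
    AH_OK,          /* address handle created */
    AH_UNRESOLVED,  /* address resolution failed: only this peer is unreachable */
    AH_FATAL        /* any other failure: the whole device is unusable */
} ah_status;

/* Walk the per-peer outcomes.
 * Returns the number of reachable peers, or -1 if the device
 * must be given up entirely. */
static int count_reachable_peers(const ah_status *results, int n)
{
    int reachable = 0;
    for (int i = 0; i < n; ++i) {
        switch (results[i]) {
        case AH_OK:
            ++reachable;
            break;
        case AH_UNRESOLVED:
            /* likely a routing problem; ignore just this one peer */
            break;
        case AH_FATAL:
            return -1;
        }
    }
    return reachable;
}
```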
Ralph Castain
9cfc408fd4 Little more debug - getting close to figuring this one out
Refs trac:4717

This commit was SVN r32060.

The following Trac tickets were found above:
  Ticket 4717 --> https://svn.open-mpi.org/trac/ompi/ticket/4717
2014-06-20 16:24:06 +00:00
Ralph Castain
f9da295682 Add some additional debug
Refs trac:4717

This commit was SVN r32059.

The following Trac tickets were found above:
  Ticket 4717 --> https://svn.open-mpi.org/trac/ompi/ticket/4717
2014-06-20 14:14:36 +00:00
Ralph Castain
645df5e823 Don't release the node_name field as it gets used in the slots parsing - will be released at newline detection
This commit was SVN r32058.
2014-06-20 13:18:46 +00:00
MPI Team
95b6a03884 Update git/hg ignore files
This commit was SVN r32057.
2014-06-20 05:00:23 +00:00
Ralph Castain
2aade28259 Protect against NULL return from malloc
This commit was SVN r32056.
2014-06-19 20:57:56 +00:00
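Note on 2aade28259: the commit message does not say which allocation was protected, so the following is only a generic sketch of the pattern. The function copy_string is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Duplicate a string, returning NULL instead of crashing
 * if the allocation fails. The caller frees the result. */
static char *copy_string(const char *src)
{
    size_t len = strlen(src) + 1;   /* include the terminating NUL */
    char *copy = malloc(len);
    if (NULL == copy) {
        return NULL;                /* let the caller handle out-of-memory */
    }
    memcpy(copy, src, len);
    return copy;
}
```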
Ralph Castain
9a47e45a09 <laugh> ensure we really compare the things we want to compare
This commit was SVN r32055.
2014-06-19 20:54:25 +00:00
Ralph Castain
e65538e91b Add some defensive programming, fix a typo
This commit was SVN r32054.
2014-06-19 20:52:13 +00:00
Ralph Castain
b43f760f93 If you don't specify all the rank-file mapping for all procs, then you'll segfault - which is probably a bad idea. I can't see an easy workaround, so just error out for now and let's see if anyone really cares.
cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r32053.
2014-06-19 20:30:06 +00:00
Jeff Squyres
395078da00 Fortran: fix two type mistakes
Use the appropriate modules, don't use mpif-config.h.

cmr=v1.8.2:ticket=trac:4736

This commit was SVN r32052.

The following Trac tickets were found above:
  Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
2014-06-19 20:25:09 +00:00
Ralph Castain
1a53d541ab Cleanup memory leak
cmr=v1.8.2:reviewer=hjelmn

This commit was SVN r32051.
2014-06-19 18:56:57 +00:00
Ralph Castain
b618b36a2f Fix potential issue if opal_hwloc_topology is NULL
cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r32050.
2014-06-19 18:52:41 +00:00
Ralph Castain
b5a2ceaa7c Minor cleanup to node_stat packing routine
This commit was SVN r32049.
2014-06-19 18:46:27 +00:00
Jeff Squyres
8935e0a5e0 Fortran use-mpi-tkr: remove real*16 and complex*32 (for now)
There is more comprehensive work regarding MPI_SIZEOF coming, but the
Fortran working group in the MPI Forum is debating this internally,
and I'm still doing more testing to get a final solution.  So for the
moment, just remove real*16 and complex*32 support so that it compiles
properly with older compilers (that do not support real*16 and
complex*32).

This commit was SVN r32048.
2014-06-19 18:12:53 +00:00
Ralph Castain
61fe4daa33 Add some further debug
Refs trac:4717

This commit was SVN r32047.

The following Trac tickets were found above:
  Ticket 4717 --> https://svn.open-mpi.org/trac/ompi/ticket/4717
2014-06-19 15:59:51 +00:00
Jeff Squyres
b375808928 Fortran: add files accidentally skipped in r32042
cmr=v1.8.2:ticket=trac:4736

This commit was SVN r32046.

The following SVN revision numbers were found above:
  r32042 --> open-mpi/ompi@fa764c1567

The following Trac tickets were found above:
  Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
2014-06-19 13:53:27 +00:00
Jeff Squyres
134c527f18 Fortran: Move all f08-related modules out of fortran/base
Move them all to fortran/use-mpi-f08, since that's the only directory
that uses them (the use-mpi-f08-desc directory has been disabled).

cmr=v1.8.2:ticket=trac:4736

This commit was SVN r32045.

The following Trac tickets were found above:
  Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
2014-06-19 13:44:08 +00:00
Jeff Squyres
f33cc84ac6 Fortran: disable the descriptor-based mpi_f08 prototype
It will come back someday.

cmr=v1.8.2:ticket=trac:4736

This commit was SVN r32044.

The following Trac tickets were found above:
  Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
2014-06-19 13:39:56 +00:00
Jeff Squyres
9ae8b44a44 ompi_fortran_check_bind_c.m4: Fix comment in existing .m4 file
cmr=v1.8.2:ticket=trac:4736

This commit was SVN r32043.

The following Trac tickets were found above:
  Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
2014-06-19 13:39:11 +00:00
Jeff Squyres
fa764c1567 Fortran: add missing implementation of win_allocate_shared and win_shared_query
Thanks to Michael Rachner for pointing out the issue.

cmr=v1.8.2:ticket=trac:4736

This commit was SVN r32042.

The following Trac tickets were found above:
  Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
2014-06-19 13:38:25 +00:00
Jeff Squyres
2cbda4fe6d Fortran: fix a few ierr->ierror mistakes that crept in
Thanks to Walter Spector for raising the issue on the users list.

Refs trac:3582

cmr=v1.8.2:ticket=trac:4736

This commit was SVN r32041.

The following Trac tickets were found above:
  Ticket 3582 --> https://svn.open-mpi.org/trac/ompi/ticket/3582
  Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
2014-06-19 13:37:22 +00:00