1
1
Граф коммитов

20461 Коммитов

Автор SHA1 Сообщение Дата
Mike Dubman
247da2819f OSHMEM: fix wrong btl/sm processing and typo
fixed by Igor reviewed by Alex,Mike,Yossi

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32100.
2014-06-28 18:40:28 +00:00
Mike Dubman
5a06f5dff5 OSHMEM: fix bss check
fixed by AlexM reviewed by Miked

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32099.
2014-06-28 18:37:45 +00:00
Dave Goodell
c104604387 common/verbs: update usnic transport probe
RHEL 7 has shipped with kernel support for the RDMA_TRANSPORT_USNIC
enum, but ''not'' the RDMA_TRANSPORT_USNIC_UDP enum.  This means that
when you install usNIC drivers from cisco.com, the kernel will report
IBV_TRANSPORT_USNIC, even though the transport is actually using UDP.

Therefore, we have to modify the logic in common/verbs to do the
additional magic probe if the device reports either an
IBV_TRANSPORT_IWARP or IBV_TRANSPORT_USNIC (because both of those might
be lies -- do the probe to figure out the real transport).

The code changed by this patch is fairly trivial; it simply moves the
logic of the magic probe to its own short function, and then calls that
short function in both the IBV_TRANSPORT_(IWARP|USNIC) cases.  It looks
longer because several lengthy comments were also updated.

Authored-by: Jeff Squyres <jsquyres@cisco.com>
Reviewed-by: Dave Goodell <dgoodell@cisco.com>

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32098.
2014-06-27 18:43:32 +00:00
Devendar Bureddy
228772ae81 hcoll gatherv support
cmr=v1.8.2:reviewer=jladd

This commit was SVN r32097.
2014-06-26 18:14:41 +00:00
George Bosilca
99561c5cc1 If the enable fails don't give up, but instead keep going with
the other collective modules. If we endup without some of the
collective the code will raise an error anyway.

cmr=v1.8.2:reviewer=hjelmn

This commit was SVN r32096.
2014-06-26 15:52:45 +00:00
Jeff Squyres
8e52ba423f finalize/disconnect: add explicit comment about why we use an RTE barrier
Based on extensive discussions before/at the June 2014 developer's
meeting, put a lengthy comment explaining a second reason why we
''must'' use an RTE barrier during MPI_FINALIZE and
MPI_COMM_DISCONNECT (i.e., unreliable transports).  Slightly explain
more the original reason why we do this, too (BTLs can lie/buffer a
message without actually injecting it on the network). 

This commit was SVN r32095.
2014-06-26 14:31:40 +00:00
Adrian Reber
47b118c0ae fix FT compilation
This commit was SVN r32094.
2014-06-26 03:40:07 +00:00
Adrian Reber
cabf1d4e68 use the orte attributes in the FT code to fix compile errors
This commit was SVN r32093.
2014-06-26 03:19:17 +00:00
Adrian Reber
10c1a50705 "handle" removal of opal_db.remove() in the FT code
This commit was SVN r32092.
2014-06-26 03:11:37 +00:00
Dave Goodell
f6bb853409 usnic: properly check src iface in route queries
rtnetlink doesn't check the source address when determining whether to
return route info for a query.  So we need to check that the OIF matches
the OIF of the source interface name.  Without this check, OMPI might
pair a local interface which does not have a route to a particular
remote interface.

Fixes Cisco bug CSCup55797.

Reviewed-by: Jeff Squyres <jsquyres@cisco.com>

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32090.
2014-06-25 22:39:02 +00:00
Ralph Castain
f3cb124e50 Revert r32082 and r32070 - the developer's conference has decided to go a different direction on the threaded progress effort. This will involve some degree of prototyping to understand the tradeoffs prior to making a final design decision, and so we'll hold off on the final change until that is completed.
This commit was SVN r32089.

The following SVN revision numbers were found above:
  r32070 --> open-mpi/ompi@12d92d0c22
  r32082 --> open-mpi/ompi@aa6438ef7a
2014-06-25 20:43:28 +00:00
Adrian Reber
9f73e79d91 also change the callback function prototype (to get the FT code to compile again)
This commit was SVN r32088.
2014-06-25 20:37:02 +00:00
Adrian Reber
4aca7095dc fix a syntax error in the FT code
This commit was SVN r32087.
2014-06-25 20:35:50 +00:00
Adrian Reber
4b25e92194 get the FT code to compile again by adding/removing #includes
This commit was SVN r32086.
2014-06-25 18:42:17 +00:00
Ralph Castain
8fca77c3d3 Protect the binding policy setting so it builds when --without-hwloc
Refs trac:4742

This commit was SVN r32085.

The following Trac tickets were found above:
  Ticket 4742 --> https://svn.open-mpi.org/trac/ompi/ticket/4742
2014-06-25 18:13:54 +00:00
Adrian Reber
72f1c7941f use a consistent naming scheme for the SNAPSHOT attributes
This commit was SVN r32083.
2014-06-25 15:26:24 +00:00
Rolf vandeVaart
aa6438ef7a Remove OMPI_USE_PROGRESS_THREADS that was missed.
This commit was SVN r32082.
2014-06-25 14:42:21 +00:00
Gilles Gouaillardet
53ae38cfb1 Handle error case in mca_spml_yoda_register
if source memory could not be registered, then return NULL
some cleanup might be needed, please refer to the FIXME in the code

cmr=v1.8.2:reviewer=miked

This commit was SVN r32081.
2014-06-25 08:58:45 +00:00
MPI Team
db35021a6c Update git/hg ignore files
This commit was SVN r32080.
2014-06-25 05:00:31 +00:00
Gilles Gouaillardet
fae7adf8ee Remove legacy FCA_IS_LOCAL_PROCESS macro
and use OPAL_PROC_ON_LOCAL_NODE instead

cmr=v1.8.2:reviewer=rhc

This commit was SVN r32079.
2014-06-25 02:37:53 +00:00
Ralph Castain
f70b4a33ec Per the developer conference, let's be a little nicer during MPI_Finalize and ease up on the cpu by inserting usleep into the loop over opal_progress while waiting for the RTE barrier to complete. This is a non-performant area of the code, and while most codes may call finalize at close-to-similar times, there are some that may choose to have one or more procs continue to perform some work prior to finalizing.
So save a little power while we are waiting.

cmr=v1.8.2:reviewer=jladd:subject=save power during finalize

This commit was SVN r32077.
2014-06-24 21:59:50 +00:00
Nathan Hjelm
563eaf0726 Fix support for Cray alps
The alps ras and plm components were broken by recent changes in ORTE. This
commit resolves those issues.

Changes:

 - Define PMI2_SUCCESS if it isn't defined. This fixes a problem with Cray's
   PMI implementation which does not define (for some reason) PMI2_SUCCESS. We
   had previously just used PMI_SUCCESS.

 - Add missing definition and a typo in pml_alps_module.

 - launch_id is no longer available in the orte_node_t structure. Use the
   attribute lookup to get the value.

 - Do not use an O(n^2) sorting algorithm when putting alps nodes in order. Use
   opal_list_sort instead (O(nlogn)).

This commit was SVN r32076.
2014-06-24 21:29:04 +00:00
Jeff Squyres
bce33635a7 sctp: remove from trunk
At the developer meeting today, the question was raised as to whether
the SCTP BTL was maintained any more.  I emailed Alan Wagner to see if
he had any interest/resources to continue to maintain the SCTP BTL.
He indicated that he unfortunately had any resources to maintain it;
it would be fine to remove the SCTP BTL from the trunk.

So long, SCTP BTL... fare thee well...

This commit was SVN r32075.
2014-06-24 21:23:09 +00:00
Jeff Squyres
69fa331cc2 openib/ugni: output verbose message when a BTL is ignored due to THREAD_MULTIPLE
usnic and portals4 already do this.

cmr=v1.8.2:reviewer=hjelmn

This commit was SVN r32074.
2014-06-24 21:13:17 +00:00
Jeff Squyres
d7a2d964f0 usnic: use the correct output stream name
cmr=v1.8.2:reviewer=dgoodell

This commit was SVN r32073.
2014-06-24 18:13:49 +00:00
Ralph Castain
5f6be06b54 Per request from Gilles and discussion at devel conference, have the --oversubscribe option automatically set both oversubscribe and overload-allowed properties as this is likely what the user intended.
cmr=v1.8.2:reviewer=rhc:subject=automatically set oversub/load

This commit was SVN r32072.
2014-06-24 18:11:39 +00:00
Jeff Squyres
fb9d063be2 Fortran: include the type functions (eq/ne) in libmpi_usempif08
This file has to be pre-emptively compiled to generate the module, but
then it also has to be included in libmpi_usempif08.

cmr=v1.8.2:ticket=trac:4736

This commit was SVN r32071.

The following Trac tickets were found above:
  Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
2014-06-24 17:48:15 +00:00
Ralph Castain
12d92d0c22 Per the OMPI developer conference, remove the last vestiges of OMPI_USE_PROGRESS_THREADS
This commit was SVN r32070.
2014-06-24 17:05:11 +00:00
Ralph Castain
1949f485ac Update platform file
cmr=v1.8.2:reviewer=ompi-gk1.8

This commit was SVN r32069.
2014-06-24 13:53:05 +00:00
Ralph Castain
20535bca19 Reorder the var release so a debugger can still see the var name that caused a segfault, thus helping to identify the var in question
cmr=v1.8.2:reviewer=hjelmn

This commit was SVN r32068.
2014-06-24 13:51:31 +00:00
Gilles Gouaillardet
926e29c972 Fortran: add ompi/mpi/fortran/use-mpi-f08/mpi-f08-sizeof.F90 to the dist tarball.
cmr=v1.8.2:ticket=trac:4736

This commit was SVN r32065.

The following Trac tickets were found above:
  Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
2014-06-23 04:14:28 +00:00
Gilles Gouaillardet
d1f5d9f675 Fortran: fix OMPI_GENERATE_F77_BINDINGS macro invokation
Some parameters were ommited and compilation failed if
configured with --disable-weak-symbols

cmr=v1.8.2:ticket=trac:4736

This commit was SVN r32064.

The following Trac tickets were found above:
  Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
2014-06-23 02:10:35 +00:00
Ralph Castain
34e5573988 Resolve the MTT timeout problem. This appears to have largely been caused by missing sigchld notifications, thus causing the daemons to believe that not all procs had exited. Let comm failure also serve as notification of process termination, and add appropriate flags/attributes to avoid multiple reporting of proc termination.
This won't transition cleanly to the 1.8 series, and may represent too much change, so we'll have to (a) evaluate whether or not to bring it over (once it demonstrates that it does indeed solve the problem), and (b) develop a custom patch for that purpose.

Refs trac:4717

This commit was SVN r32063.

The following Trac tickets were found above:
  Ticket 4717 --> https://svn.open-mpi.org/trac/ompi/ticket/4717
2014-06-21 17:09:02 +00:00
Jeff Squyres
011db6974e usnic: refactor usnic_add_procs() into 2 distinct parts
1: find/create procs, and create associated endpoint for each
2: resolve peer addresses

The 2nd part is done as a separate loop so that the address lookups
can be parallelized.

The overall result is to split usnic_add_procs() into two smaller,
simpler parts.

cmr=v1.8.2:ticket=trac:4734

This commit was SVN r32062.

The following Trac tickets were found above:
  Ticket 4734 --> https://svn.open-mpi.org/trac/ompi/ticket/4734
2014-06-20 20:58:36 +00:00
Jeff Squyres
1ea7bad5a0 usnic: behave better when ibv_create_ah() fails
When ibv_create_ah() fails due to an address resolution failure, it
really only means that we can't reach that one peer -- so we should
just ignore that one peer.  If ibv_create_ah() fails for some other
reason, then give up on the entire usnic_X device.

Change the show_help() message that is displayed when ibv_create_ah()
fails due to address resolution failure; indicate that it's likely a
routing problem.  Also opal_output_verbose() the same info, since
show_help() is de-duplicated (and this particular show_help() message
can be squelched).

Fixes Cisco bugs CSCup35851 and CSCup35872.

cmr=v1.8.2:ticket=trac:4734

This commit was SVN r32061.

The following Trac tickets were found above:
  Ticket 4734 --> https://svn.open-mpi.org/trac/ompi/ticket/4734
2014-06-20 20:53:50 +00:00
Ralph Castain
9cfc408fd4 Little more debug - getting close to figuring this one out
Refs trac:4717

This commit was SVN r32060.

The following Trac tickets were found above:
  Ticket 4717 --> https://svn.open-mpi.org/trac/ompi/ticket/4717
2014-06-20 16:24:06 +00:00
Ralph Castain
f9da295682 Add some additional debug
Refs trac:4717

This commit was SVN r32059.

The following Trac tickets were found above:
  Ticket 4717 --> https://svn.open-mpi.org/trac/ompi/ticket/4717
2014-06-20 14:14:36 +00:00
Ralph Castain
645df5e823 Don't release the node_name field as it gets used in the slots parsing - will be released at newline detection
This commit was SVN r32058.
2014-06-20 13:18:46 +00:00
MPI Team
95b6a03884 Update git/hg ignore files
This commit was SVN r32057.
2014-06-20 05:00:23 +00:00
Ralph Castain
2aade28259 Protect against NULL return from malloc
This commit was SVN r32056.
2014-06-19 20:57:56 +00:00
Ralph Castain
9a47e45a09 <laugh> ensure we really compare the things we want to compare
This commit was SVN r32055.
2014-06-19 20:54:25 +00:00
Ralph Castain
e65538e91b Add some defensive programming, fix a typo
This commit was SVN r32054.
2014-06-19 20:52:13 +00:00
Ralph Castain
b43f760f93 If you don't specify all the rank-file mapping for all procs, then you'll segfault - which is probably a bad idea. I can't see an easy workaround, so just error out for now and let's see if anyone really cares.
cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r32053.
2014-06-19 20:30:06 +00:00
Jeff Squyres
395078da00 Fortran: fix two type mistakes
Use the appropriate modules, don't use mpif-config.h.

cmr=v1.8.2:ticket=trac:4736

This commit was SVN r32052.

The following Trac tickets were found above:
  Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
2014-06-19 20:25:09 +00:00
Ralph Castain
1a53d541ab Cleanup memory leak
cmr=v1.8.2:reviewer=hjelmn

This commit was SVN r32051.
2014-06-19 18:56:57 +00:00
Ralph Castain
b618b36a2f Fix potential issue if opal_hwloc_topology is NULL
cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r32050.
2014-06-19 18:52:41 +00:00
Ralph Castain
b5a2ceaa7c Minor cleanup to node_stat packing routine
This commit was SVN r32049.
2014-06-19 18:46:27 +00:00
Jeff Squyres
8935e0a5e0 Fortran use-mpi-tkr: remove real*16 and complex*32 (for now)
There is more comprehensive work regarding MPI_SIZEOF coming, but the
Fortran working group in the MPI Forum is debating this internally,
and I'm still doing more testing to get a final solution.  So for the
moment, just remove real*16 and complex*32 support so that it compiles
porperly with older compilers (that do not support real*16 and
complex*32).

This commit was SVN r32048.
2014-06-19 18:12:53 +00:00
Ralph Castain
61fe4daa33 Add some further debug
Refs trac:4717

This commit was SVN r32047.

The following Trac tickets were found above:
  Ticket 4717 --> https://svn.open-mpi.org/trac/ompi/ticket/4717
2014-06-19 15:59:51 +00:00
Jeff Squyres
b375808928 Fortran: add files accidentally skipped in r32042
cmr=v1.8.2:ticket=trac:4736

This commit was SVN r32046.

The following SVN revision numbers were found above:
  r32042 --> open-mpi/ompi@fa764c1567

The following Trac tickets were found above:
  Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
2014-06-19 13:53:27 +00:00