Commit graph

18428 commits

Author SHA1 Message Date
Nathan Hjelm
90f5bd9424 Add missing f08 binding declarations for MPI_Count functions
This commit was SVN r28936.
2013-07-23 19:00:06 +00:00
Rolf vandeVaart
a3995f73d3 Update trunk to match OMPI 1.7.3 due to code reviews.
This commit was SVN r28934.
2013-07-23 17:58:21 +00:00
Nathan Hjelm
c6e586a81d MPI-3: fortran support for large counts using derived datatypes
Jeff:
 - Make sure not to go over 72 characters.  Love Fortran!
 - Ensure to include 'mpif-config.h' in Type_size_x.

This commit was SVN r28933.
2013-07-23 15:36:03 +00:00
Nathan Hjelm
c4c69b4ddf MPI-3: add support for large counts using derived datatypes
Add support for MPI_Count type and MPI_COUNT datatype and add the required
MPI-3 functions MPI_Get_elements_x, MPI_Status_set_elements_x,
MPI_Type_get_extent_x, MPI_Type_get_true_extent_x, and MPI_Type_size_x.
This commit adds only the C bindings. Fortran bindings will be added in
another commit. For now the MPI_Count type is defined to have the same size
as MPI_Offset. The type is required to be at least as large as MPI_Offset
and MPI_Aint. The type was initially intended to be a ssize_t (if it was
the same size as a long long) but there were issues compiling romio with
that definition (despite the inclusion of stddef.h).

I updated the datatype engine to use size_t instead of uint32_t to support
large datatypes. This will require some review to make sure that 1) the
changes are beneficial, 2) nothing was broken by the change (I doubt
anything was), and 3) there are no performance regressions due to this
change.

Increase the maximum number of predefined datatypes to support MPI_Count

Put common get_elements code to ompi/datatype/ompi_datatype_get_elements.c

Update MPI_Get_count to reflect changes in MPI-3 (return MPI_UNDEFINED when the count is too large for an int)
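
The MPI_Get_count change above can be modeled with a short sketch. This is not the Open MPI implementation: `get_count`, `UNDEFINED`, and `INT_MAX` are hypothetical names, a 32-bit C int is assumed, and treating a zero-size datatype as a count of zero is an assumption of the sketch.

```python
INT_MAX = 2**31 - 1      # assumption: C int is 32 bits wide
UNDEFINED = object()     # hypothetical stand-in for MPI_UNDEFINED


def get_count(received_bytes: int, type_size: int):
    """Model the element count an MPI_Get_count-style call would report."""
    if type_size == 0:
        # Assumption of this sketch: a zero-size datatype gives a count of zero.
        return 0
    count, remainder = divmod(received_bytes, type_size)
    if remainder != 0:
        # The data is not a whole number of elements.
        return UNDEFINED
    if count > INT_MAX:
        # The MPI-3 change from the commit message: too large for an int.
        return UNDEFINED
    return count
```

Callers that expect very large messages would check for the sentinel and fall back to a large-count query such as MPI_Get_elements_x, which this commit also adds.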

This commit was SVN r28932.
2013-07-23 15:35:14 +00:00
Jeff Squyres
c9cc0223a9 Sync 1.6.6 bullets from 1.6 branch.
This commit was SVN r28930.
2013-07-23 12:01:50 +00:00
Matthias Jurenz
c4a7dded5f Changes to VT:
- configure: Removed double slashes in path names, which cause trouble when building RPMs on Fedora (see #3688)

This commit was SVN r28924.
2013-07-23 08:12:18 +00:00
Nathan Hjelm
1349b825c2 MPI-2.2: Add C++ datatypes to mpi.h and fix support for MPI_C_*COMPLEX
This commit was SVN r28919.
2013-07-22 23:45:45 +00:00
Ralph Castain
59a71765cf Hmmm...these error outputs will never occur, which is probably not what the author intended. So do the output and THEN jump to the error exit.
This commit was SVN r28918.
2013-07-22 22:58:03 +00:00
Ralph Castain
6c1a140e99 Per request from Nathan, add a "commit" API to the opal db framework. This allows him to aggregate keys to work around the Cray's severe PMI limitations
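
The idea of a "commit" call that aggregates keys can be sketched as follows. The commit message does not describe the actual opal db interface, so `BufferedStore`, `store`, `commit`, and the `backend_put` callback are all hypothetical names chosen for illustration.

```python
class BufferedStore:
    """Buffer key/value pairs and push them to the backend in one call."""

    def __init__(self, backend_put):
        # backend_put is a hypothetical callable that publishes a dict
        # of keys in a single (expensive) backend operation.
        self._backend_put = backend_put
        self._pending = {}

    def store(self, key, value):
        # Nothing reaches the backend yet; the pair is only buffered.
        self._pending[key] = value

    def commit(self):
        # Publish everything buffered so far as one aggregated operation.
        if self._pending:
            self._backend_put(dict(self._pending))
            self._pending.clear()
```

With this shape, many `store` calls cost one backend operation, which is the kind of saving that matters when the backend limits the number or size of individual puts.
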
This commit was SVN r28917.
2013-07-22 22:57:16 +00:00
Tom Naughton
0d5b6a73a4 + fix typo in README
This commit was SVN r28916.
2013-07-22 22:16:05 +00:00
Nathan Hjelm
1b6ad3f002 fix a couple of typos in README
This commit was SVN r28915.
2013-07-22 22:02:40 +00:00
Edgar Gabriel
8ffc1aac89 update the _component.c files in ompio to use the explicit assignment of the
mca_register_component_params element of the structure.

This commit was SVN r28914.
2013-07-22 21:11:05 +00:00
Jeff Squyres
49b5342130 After talking with Nathan, update some comments/documentation about
the new MCA var and pvar systems.

This commit was SVN r28913.
2013-07-22 20:34:42 +00:00
Nathan Hjelm
562cfd9630 Update README with information about uGNI and vader BTLs. Also remove references to the csum pml.
cmr=v1.7.3:reviewer=jsquyres

This commit was SVN r28911.
2013-07-22 19:16:59 +00:00
Nathan Hjelm
b17cd13c09 sharedfp: ensure sharedfp components register their parameters in mca_register_component_params not mca_component_open
This commit was SVN r28910.
2013-07-22 17:53:58 +00:00
Nathan Hjelm
61d331d5b5 MCA/base: fix some warnings and an error in the MCA variable system
This commit was SVN r28909.
2013-07-22 17:52:39 +00:00
Jeff Squyres
b437041aeb Update one more comment.
This commit was SVN r28908.
2013-07-22 17:29:00 +00:00
Jeff Squyres
4b6006402d Use the RTE framework instead of calling ORTE directly.
Brian (rightfully) hit me on the head with the
don't-use-ORTE-use-the-rte-framework clue bat; the usnic BTL now
nicely plays with the RTE framework.

This commit was SVN r28907.
2013-07-22 17:28:23 +00:00
Jeff Squyres
ca9da8a554 Fix minor typo in the comments/docs.
This commit was SVN r28905.
2013-07-22 17:24:17 +00:00
Rolf vandeVaart
67badf384c Only search SONAME of library. Expand comments.
This commit was SVN r28904.
2013-07-22 15:54:45 +00:00
Brian Barrett
0d8b57211a add missing include
This commit was SVN r28900.
2013-07-21 20:18:17 +00:00
Brian Barrett
e1d72409cd add missing header
This commit was SVN r28897.
2013-07-21 19:40:31 +00:00
Brian Barrett
704f1ecc18 fix non-orte builds of PSM
This commit was SVN r28893.
2013-07-21 19:12:32 +00:00
Brian Barrett
05ab9cbaa6 Need to ship pmi_internal.h
This commit was SVN r28891.
2013-07-21 19:00:50 +00:00
Brian Barrett
495384d8b7 Update documentation in rte.h to match recent changes
This commit was SVN r28887.
2013-07-20 22:14:12 +00:00
Brian Barrett
414ba3dad8 Update PMI RTE to match error handling changes that were part of r28852.
Note that the PMI RTE still doesn't listen for asynchronous errors, so
the error handler still won't ever actually do anything :).

This commit was SVN r28886.

The following SVN revision numbers were found above:
  r28852 --> open-mpi/ompi@e4e678e234
2013-07-20 22:09:02 +00:00
Brian Barrett
5bfd980968 update PMI RTE component to adapt to ORTE changes
This commit was SVN r28885.
2013-07-20 22:06:47 +00:00
Brian Barrett
d984d25da3 Remove orte header file from sharedfp components (OMPI layer should not
include ORTE layer with the RTE framework).  Thankfully, nothing used
orte_show_help, so easy fix.

This commit was SVN r28884.
2013-07-20 22:03:44 +00:00
Ralph Castain
d64e45cfa3 Add utility for comparing two code trees
This commit was SVN r28883.
2013-07-20 21:48:23 +00:00
Jeff Squyres
7a63ee24fb Remove Elan and Windows Verbs from the list of supported networks.
This commit was SVN r28881.
2013-07-19 22:15:25 +00:00
Jeff Squyres
bcf40e075b Add some notes about the Cisco usNIC BTL.
This commit was SVN r28880.
2013-07-19 22:14:49 +00:00
Jeff Squyres
194b285447 First commit of the Cisco usNIC BTL.
This BTL accesses the Cisco usNIC Linux device via the Linux verbs
API via Unreliable Datagram queue pairs.  A few noteworthy points:

 * This BTL does most of its own fragmentation; it tells the PML that
   it has a very high max_send_size (much higher than the network
   MTU).
 * Since UD fragments are, by definition, unreliable, the usnic BTL
   handles all of its own reliability via a sliding window approach
   using the opal_hotel construct and many tricks stolen from the
   corpus of knowledge surrounding efficient TCP.
 * There is a fun PML latency-metric based optimization for NUMA
   awareness of short messages.
 * Note that this is ''not'' a generic UD verbs BTL; it is specific to
   the Cisco usNIC device.
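
The sliding-window reliability mentioned in the list above can be illustrated with a minimal sender-side sketch. It does not reproduce the usnic BTL or the opal_hotel construct: the class name, the cumulative-ack scheme, and the single per-fragment timeout are assumptions made for the example.

```python
from collections import OrderedDict


class SlidingWindowSender:
    """Track unacknowledged fragments and report which need resending."""

    def __init__(self, window_size, timeout):
        self.window_size = window_size
        self.timeout = timeout
        self.next_seq = 0
        self.in_flight = OrderedDict()  # seq -> (payload, time last sent)

    def can_send(self):
        # New fragments are allowed only while the window has room.
        return len(self.in_flight) < self.window_size

    def send(self, payload, now):
        if not self.can_send():
            raise RuntimeError("window full")
        seq = self.next_seq
        self.next_seq += 1
        self.in_flight[seq] = (payload, now)
        return seq

    def ack(self, seq):
        # Cumulative ack (an assumption): everything up to seq has arrived.
        for s in [s for s in self.in_flight if s <= seq]:
            del self.in_flight[s]

    def due_for_retransmit(self, now):
        # Fragments whose timer expired are resent and their timer restarted.
        due = [s for s, (_, sent) in self.in_flight.items()
               if now - sent >= self.timeout]
        for s in due:
            payload, _ = self.in_flight[s]
            self.in_flight[s] = (payload, now)
        return due
```

A caller would invoke `due_for_retransmit` from its progress loop and resend the payloads for the returned sequence numbers.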

This commit was SVN r28879.
2013-07-19 22:13:58 +00:00
Jeff Squyres
3546163c48 Devices that do not support RC QP's are also intentionally skipped;
don't warn about skipping them.

This commit was SVN r28874.
2013-07-19 19:05:18 +00:00
Ralph Castain
5d12ab3873 Ensure we always set num_local_peers for both PMI2 and PMI1
This commit was SVN r28860.
2013-07-19 04:34:58 +00:00
Ralph Castain
b033a6b6d6 One last Cray-inspired fix...
Refs trac:3685

This commit was SVN r28857.

The following Trac tickets were found above:
  Ticket 3685 --> https://svn.open-mpi.org/trac/ompi/ticket/3685
2013-07-19 03:04:00 +00:00
Nathan Hjelm
1e8ba2b8cf fix condition in common/pmi init that caused pmi to fail if PMI2_Init succeeds
This commit was SVN r28856.
2013-07-19 02:43:42 +00:00
Ralph Castain
92cb93b21e Remove set-but-unused variable
Refs trac:3685

This commit was SVN r28855.

The following Trac tickets were found above:
  Ticket 3685 --> https://svn.open-mpi.org/trac/ompi/ticket/3685
2013-07-19 01:42:35 +00:00
Ralph Castain
bc2586cf3c Refs trac:3685. Check error code returned by PMI2_Info_GetJobAttr.
This commit was SVN r28854.

The following Trac tickets were found above:
  Ticket 3685 --> https://svn.open-mpi.org/trac/ompi/ticket/3685
2013-07-19 01:24:51 +00:00
Ralph Castain
a10546d5c1 Cleanup and rename of platform files
This commit was SVN r28853.
2013-07-19 01:18:41 +00:00
Ralph Castain
e4e678e234 Per the RFC and discussion on the devel list, update the RTE-MPI error handling interface. There are a few differences in the code from the original RFC that came out of the discussion - I've captured those in the following writeup:
George and I were talking about ORTE's error handling the other day in regards to the right way to deal with errors in the updated OOB. Specifically, it seemed a bad idea for a library such as ORTE to be aborting the job on its own prerogative. If we lose a connection or cannot send a message, then we really should just report it upwards and let the application and/or upper layers decide what to do about it.

The current code base only allows a single error callback to exist, which seemed unduly limiting. So, based on the conversation, I've modified the errmgr interface to provide a mechanism for registering any number of error handlers (this replaces the current "set_fault_callback" API). When an error occurs, these handlers will be called in order until one responds that the error has been "resolved" - i.e., no further action is required - by returning OMPI_SUCCESS. The default MPI layer error handler is specified to go "last" and calls mpi_abort, so the current "abort" behavior is preserved unless other error handlers are registered.

In the register_callback function, I provide an "order" param so you can specify "this callback must come first" or "this callback must come last". Seemed to me that we will probably have different code areas registering callbacks, and one might require it go first (the default "abort" will always require it go last). So you can append and prepend, or go first. Note that only one registration can declare itself "first" or "last", and since the default "abort" callback automatically takes "last", that one isn't available. :-)

The errhandler callback function passes an opal_pointer_array of structs, each of which contains the name of the proc involved (which can be yourself for internal errors) and the error code. This is a change from the current fault callback which returned an opal_pointer_array of just process names. Rationale is that you might need to see the cause of the error to decide what action to take. I realize that isn't a requirement for remote procs, but remember that we will use the SAME interface to report RTE errors internal to the proc itself. In those cases, you really do need to see the error code. It is legal to pass a NULL for the pointer array (e.g., when reporting an internal failure without error code), so handlers must be prepared for that possibility. If people find that too burdensome, we can remove it.

Should we ever decide to create a separate callback path for internal errors vs remote process failures, or if we decide to do something different based on experience, then we can adjust this API.
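
The handler chain described in this writeup can be sketched as follows. This is an illustrative model, not the errmgr code: `ErrorHandlerRegistry`, `ProcError`, the string values of the order parameter, and the `ERROR` return value are hypothetical, and `SUCCESS` stands in for OMPI_SUCCESS.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

SUCCESS = 0   # stands in for OMPI_SUCCESS
ERROR = -1    # hypothetical "not resolved" return value


@dataclass
class ProcError:
    proc_name: str    # process involved (may be the caller itself)
    error_code: int


Handler = Callable[[Optional[List[ProcError]]], int]


class ErrorHandlerRegistry:
    def __init__(self, default_handler: Handler):
        self._first: Optional[Handler] = None
        self._middle: List[Handler] = []
        # The default "abort" handler always occupies the "last" slot.
        self._last: Handler = default_handler

    def register(self, handler: Handler, order: str = "append") -> None:
        if order == "first":
            if self._first is not None:
                raise ValueError("the 'first' slot is already taken")
            self._first = handler
        elif order == "last":
            raise ValueError("'last' is reserved for the default handler")
        elif order == "prepend":
            self._middle.insert(0, handler)
        elif order == "append":
            self._middle.append(handler)
        else:
            raise ValueError(f"unknown order: {order}")

    def report(self, errors: Optional[List[ProcError]]) -> bool:
        # errors may be None, so handlers must tolerate that.
        chain = ([self._first] if self._first else []) + self._middle
        chain.append(self._last)
        for handler in chain:
            if handler(errors) == SUCCESS:
                return True   # resolved; later handlers are skipped
        return False
```

Keeping "first" and "last" as single slots mirrors the rule that only one registration may claim each position, while prepend and append order the remaining handlers.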

This commit was SVN r28852.
2013-07-19 01:08:53 +00:00
Ralph Castain
6c50c8167c Fix pmi-1 compile when no pmi2 is present
This commit was SVN r28849.
2013-07-18 22:45:08 +00:00
Ralph Castain
351b1203a7 Set ignores to ignore mca_sharedfp_addproc_control, a generated file
This commit was SVN r28846.
2013-07-18 22:19:52 +00:00
Ralph Castain
8a8b4896be Need to protect libgen.h as some systems might not have it
This commit was SVN r28845.
2013-07-18 20:21:37 +00:00
Edgar Gabriel
185e365dad make the sm sharedfp component compile on Mac.
This commit was SVN r28844.
2013-07-18 20:17:14 +00:00
Ralph Castain
256034a3dc Sigh - fix a couple of spots I missed
Refs trac:3683

This commit was SVN r28843.

The following Trac tickets were found above:
  Ticket 3683 --> https://svn.open-mpi.org/trac/ompi/ticket/3683
2013-07-18 19:07:16 +00:00
Ralph Castain
4eb0dfa039 This has apparently been wrong for some time! Fix the common/pmi libraries so we build them dynamic so they can be properly linked into the components that use them. Define required library version numbers and do some other cuteness to make it all work.
cmr:v1.7.3:reviewer=jsquyres

This commit was SVN r28842.
2013-07-18 18:42:42 +00:00
Ralph Castain
fc3b777ef5 Cleanup a variable that isn't used if pmi2 support is available
Refs trac:3683

This commit was SVN r28841.

The following Trac tickets were found above:
  Ticket 3683 --> https://svn.open-mpi.org/trac/ompi/ticket/3683
2013-07-18 17:19:13 +00:00
Edgar Gabriel
93cef82873 remove the ylib component from the fcoll framework. It is not used, there are
no plans to use it. We can always recover it from svn if we would ever change
our minds.

This commit was SVN r28840.
2013-07-18 16:18:06 +00:00
Ralph Castain
92c6b806b9 Based on a patch submitted by Piotr Lesnicki of Bull, cleanup the PMI2 support. This has not been tested yet on multiple environments (e.g., Cray), so it needs more evaluation prior to moving to the 1.7 branch.
cmr:v1.7.3:reviewer=rhc

This commit was SVN r28837.
2013-07-18 14:46:07 +00:00
Pavel Shamis
68969ba6e5 Removing bogus references in iboffload code.
cmr:v1.7:reviewer=hjelmn

This commit was SVN r28834.
2013-07-17 22:35:24 +00:00