1
1

6548 Коммитов

Автор SHA1 Сообщение Дата
Nathan Hjelm
e4f105ffb3 revert change that shouldn't have been part of r28952
This commit was SVN r28953.

The following SVN revision numbers were found above:
  r28952 --> open-mpi/ompi@cb90a4a7fc
2013-07-25 20:23:55 +00:00
Nathan Hjelm
cb90a4a7fc Add simple algorithms to support MPI_IN_PLACE for MPI_Alltoall, MPI_Alltoallv, and MPI_Alltoallw.
Working on faster algorithms for tuned that will come at a later time.

cmr=v1.7.3:ticket=trac:2965

This commit was SVN r28952.

The following Trac tickets were found above:
  Ticket 2965 --> https://svn.open-mpi.org/trac/ompi/ticket/2965
2013-07-25 19:19:41 +00:00
Nathan Hjelm
99adeb7f6e Fix support for complex datatypes when fortran is not available but _Complex is
This commit was SVN r28951.
2013-07-25 19:08:21 +00:00
Edgar Gabriel
012f99c3b6 ompi_ignore this component until we find a solution for the dependence on
libmpi for the external process that is being spawned by this component.

This commit was SVN r28945.
2013-07-24 20:02:47 +00:00
Nathan Hjelm
22868b9f68 MPI_T: add man pages for MPI_T_* functions and fix typos in tool file names
This commit was SVN r28943.
2013-07-24 18:19:40 +00:00
Jeff Squyres
f7337b8f77 Correct faulty max payload and MTU computations (and update some
debugging that helped us find those).

This commit was SVN r28942.
2013-07-24 16:06:28 +00:00
Ralph Castain
db214a2321 Refs trac:3697 - use the opal_pmi_error function instead of ompi_error as the returned error codes are from PMI
This commit was SVN r28941.

The following Trac tickets were found above:
  Ticket 3697 --> https://svn.open-mpi.org/trac/ompi/ticket/3697
2013-07-24 04:05:41 +00:00
Jeff Squyres
5323051047 Use sysfs to check MPI has enough VFs, QPs, and CQs
Use the new sysfs files to check that there are enough VFs, QPs, and
CQs for all the MPI processes on this server.
    
Move the checking code into its own subroutine to make it smaller and
easier to read/grok.

This commit was SVN r28937.
2013-07-24 00:38:32 +00:00
Nathan Hjelm
90f5bd9424 Add missing f08 binding declarations for MPI_Count functions
This commit was SVN r28936.
2013-07-23 19:00:06 +00:00
Rolf vandeVaart
a3995f73d3 Update trunk to match OMPI 1.7.3 due to code reviews.
This commit was SVN r28934.
2013-07-23 17:58:21 +00:00
Nathan Hjelm
c6e586a81d MPI-3: fortran support for large counts using derived datatypes
Jeff:
 - Make sure not to go over 72 characters.  Love Fortran!
 - Ensure to include 'mpif-config.h' in Type_size_x.

This commit was SVN r28933.
2013-07-23 15:36:03 +00:00
Nathan Hjelm
c4c69b4ddf MPI-3: add support for large counts using derived datatypes
Add support for MPI_Count type and MPI_COUNT datatype and add the required
MPI-3 functions MPI_Get_elements_x, MPI_Status_set_elements_x,
MPI_Type_get_extent_x, MPI_Type_get_true_extent_x, and MPI_Type_size_x.
This commit adds only the C bindings. Fortran bindins will be added in
another commit. For now the MPI_Count type is define to have the same size
as MPI_Offset. The type is required to be at least as large as MPI_Offset
and MPI_Aint. The type was initially intended to be a ssize_t (if it was
the same size as a long long) but there were issues compiling romio with
that definition (despite the inclusion of stddef.h).

I updated the datatype engine to use size_t instead of uint32_t to support
large datatypes. This will require some review to make sure that 1) the
changes are beneficial, 2) nothing was broken by the change (I doubt
anything was), and 3) there are no performance regressions due to this
change.

Increase the maximum number of predifined datatypes to support MPI_Count

Put common get_elements code to ompi/datatype/ompi_datatype_get_elements.c

Update MPI_Get_count to reflect changes in MPI-3 (return MPI_UNDEFINED when the count is too large for an int)

This commit was SVN r28932.
2013-07-23 15:35:14 +00:00
Matthias Jurenz
c4a7dded5f Changes to VT:
- configure: Removed double slashes in path names which make trouble when building RPMs on Fedora (see #3688)

This commit was SVN r28924.
2013-07-23 08:12:18 +00:00
Nathan Hjelm
1349b825c2 MPI-2.2: Add C++ datatypes to mpi.h and fix support for MPI_C_*COMPLEX
This commit was SVN r28919.
2013-07-22 23:45:45 +00:00
Ralph Castain
59a71765cf Hmmm...these error outputs will never occur, which is probably not what the author intended. So do the output and THEN jump to the error exit.
This commit was SVN r28918.
2013-07-22 22:58:03 +00:00
Edgar Gabriel
8ffc1aac89 update the _component.c files in ompio to use the explicit assignment of the
mca_register_component_params element of the structure.

This commit was SVN r28914.
2013-07-22 21:11:05 +00:00
Nathan Hjelm
b17cd13c09 sharedfp: ensure sharedfp components register their parameters in mca_register_component_params not mca_component_open
This commit was SVN r28910.
2013-07-22 17:53:58 +00:00
Jeff Squyres
b437041aeb Update one more comment.
This commit was SVN r28908.
2013-07-22 17:29:00 +00:00
Jeff Squyres
4b6006402d Use the RTE framework instead of calling ORTE directly.
Brian (rightfully) hit me on the head with the
don't-use-ORTE-use-the-rte-framework clue bat; the usnic BTL now
nicely plays with the RTE framework.

This commit was SVN r28907.
2013-07-22 17:28:23 +00:00
Jeff Squyres
ca9da8a554 Fix minor typo in the comments/docs.
This commit was SVN r28905.
2013-07-22 17:24:17 +00:00
Rolf vandeVaart
67badf384c Only search SONAME of library. Expand comments.
This commit was SVN r28904.
2013-07-22 15:54:45 +00:00
Brian Barrett
e1d72409cd add missing header
This commit was SVN r28897.
2013-07-21 19:40:31 +00:00
Brian Barrett
704f1ecc18 fix non-orte builds of PSM
This commit was SVN r28893.
2013-07-21 19:12:32 +00:00
Brian Barrett
05ab9cbaa6 Need to ship pmi_internal.h
This commit was SVN r28891.
2013-07-21 19:00:50 +00:00
Brian Barrett
495384d8b7 Update documentation in rte.h to match recent changes
This commit was SVN r28887.
2013-07-20 22:14:12 +00:00
Brian Barrett
414ba3dad8 Update PMI RTE to match error handling changes that were part of r28852.
Note that the PMI RTE still doesn't listen for asynchronous errors, so
the error handler still won't ever actually do anything :).

This commit was SVN r28886.

The following SVN revision numbers were found above:
  r28852 --> open-mpi/ompi@e4e678e234
2013-07-20 22:09:02 +00:00
Brian Barrett
5bfd980968 update PMI RTE component to adapt to ORTE changes
This commit was SVN r28885.
2013-07-20 22:06:47 +00:00
Brian Barrett
d984d25da3 Remove orte header file from sharedfp components (OMPI layer should not
include ORTE layer with the RTE framework).  Thankfully, nothing used
orte_show_help, so easy fix.

This commit was SVN r28884.
2013-07-20 22:03:44 +00:00
Jeff Squyres
194b285447 First commit of the Cisco usNIC BTL.
This BTL accesses the Cisco usNIC Linux device via the Linux verbs
API via Unreliable Datagram queue pairs.  A few noteworthy points:

 * This BTL does most of its own fragmentation; it tells the PML that
   it has a very high max_send_size (much higher than the network
   MTU).
 * Since UD fragments are, by definition, unreliable, the usnic BTL
   handles all of its own reliability via a sliding window approach
   using the opal_hotel construct and many tricks stolen from the
   corpus of knowledge surrounding efficient TCP.
 * There is a fun PML latency-metric based optimization for NUMA
   awareness of short messages.
 * Note that this is ''not'' a generic UD verbs BTL; it is specific to
   the Cisco usNIC device.

This commit was SVN r28879.
2013-07-19 22:13:58 +00:00
Jeff Squyres
3546163c48 Devices that do not support RC QP's are also intentionally skipped;
don't warn about skipping them.

This commit was SVN r28874.
2013-07-19 19:05:18 +00:00
Ralph Castain
e4e678e234 Per the RFC and discussion on the devel list, update the RTE-MPI error handling interface. There are a few differences in the code from the original RFC that came out of the discussion - I've captured those in the following writeup
George and I were talking about ORTE's error handling the other day in regards to the right way to deal with errors in the updated OOB. Specifically, it seemed a bad idea for a library such as ORTE to be aborting the job on its own prerogative. If we lose a connection or cannot send a message, then we really should just report it upwards and let the application and/or upper layers decide what to do about it.

The current code base only allows a single error callback to exist, which seemed unduly limiting. So, based on the conversation, I've modified the errmgr interface to provide a mechanism for registering any number of error handlers (this replaces the current "set_fault_callback" API). When an error occurs, these handlers will be called in order until one responds that the error has been "resolved" - i.e., no further action is required - by returning OMPI_SUCCESS. The default MPI layer error handler is specified to go "last" and calls mpi_abort, so the current "abort" behavior is preserved unless other error handlers are registered.

In the register_callback function, I provide an "order" param so you can specify "this callback must come first" or "this callback must come last". Seemed to me that we will probably have different code areas registering callbacks, and one might require it go first (the default "abort" will always require it go last). So you can append and prepend, or go first. Note that only one registration can declare itself "first" or "last", and since the default "abort" callback automatically takes "last", that one isn't available. :-)

The errhandler callback function passes an opal_pointer_array of structs, each of which contains the name of the proc involved (which can be yourself for internal errors) and the error code. This is a change from the current fault callback which returned an opal_pointer_array of just process names. Rationale is that you might need to see the cause of the error to decide what action to take. I realize that isn't a requirement for remote procs, but remember that we will use the SAME interface to report RTE errors internal to the proc itself. In those cases, you really do need to see the error code. It is legal to pass a NULL for the pointer array (e.g., when reporting an internal failure without error code), so handlers must be prepared for that possibility. If people find that too burdensome, we can remove it.

Should we ever decide to create a separate callback path for internal errors vs remote process failures, or if we decide to do something different based on experience, then we can adjust this API.

This commit was SVN r28852.
2013-07-19 01:08:53 +00:00
Ralph Castain
8a8b4896be Need to protect libgen.h as some systems might not have it
This commit was SVN r28845.
2013-07-18 20:21:37 +00:00
Edgar Gabriel
185e365dad make the sm sharedfp component compile on Mac.
This commit was SVN r28844.
2013-07-18 20:17:14 +00:00
Edgar Gabriel
93cef82873 remove the ylib component from the fcoll framework. It is not used, there are
no plans to use it. We can always recover it from svn if we would ever change
our minds.

This commit was SVN r28840.
2013-07-18 16:18:06 +00:00
Pavel Shamis
68969ba6e5 Removing bogus references in iboffload code.
cmr:v1.7:reviewer=hjelmn

This commit was SVN r28834.
2013-07-17 22:35:24 +00:00
Rolf vandeVaart
49663fb802 Move CUDA-aware configurary to its own file and other minor changes due to review.
This commit was SVN r28832.
2013-07-17 22:12:29 +00:00
Edgar Gabriel
6e8522fec5 infuse life into the shared file pointer framework. For this:
- extend the framework API
 - remove the dummy component, not require anymore
 - add four components to perform the actual job.

This commit was SVN r28828.
2013-07-17 21:55:24 +00:00
Edgar Gabriel
ac694b7056 in preparation for the new shared file pointer components to be committed
soon:
 - add a new abstraction layer to be used internally for some operations
 - add a new mca parameter to control lazy intialization of shared file
 pointer structures

This commit was SVN r28826.
2013-07-17 21:30:50 +00:00
Vishwanath Venkatesan
ce8f8f0829 Changing the MPI Datatype from MPI_LONG to OMPI_OFFSET_DATATYPE for send/recv offsets
This commit was SVN r28822.
2013-07-17 19:16:53 +00:00
Nathan Hjelm
d4c6029cf3 sbgp/ibnet: set mca_sbgp_ibnet_component.mtu to IBV_MTU_1024 before registering it. cmr:v1.7:reviewer=pasha
This commit was SVN r28821.
2013-07-17 19:16:31 +00:00
Rolf vandeVaart
7a45be8bde Fix variable initialization.
This commit was SVN r28819.
2013-07-17 17:37:35 +00:00
Nathan Hjelm
5999906dec Remote duplicate mpi/tool from DIST_SUBDIRS
This commit was SVN r28818.
2013-07-17 04:35:47 +00:00
Nathan Hjelm
f0aeb36d80 Fix warnings in ob1 introduced by the pvar commit
This commit was SVN r28817.
2013-07-17 03:41:05 +00:00
Ralph Castain
956317ac1e Cleanup the MPIT errors - include the generated mpit lib in libompi, set ignores, remove the now unused mpit directory
This commit was SVN r28816.
2013-07-17 03:13:13 +00:00
Rolf vandeVaart
f95c95cf79 Additional cleanup of how libraries and paths are searched.
This commit was SVN r28815.
2013-07-16 18:40:55 +00:00
Nathan Hjelm
38bcbc4696 mpit: fix behavior when returning strings
This commit was SVN r28804.
2013-07-16 16:03:48 +00:00
Nathan Hjelm
e6e9f2c6fd Add profiling function definitions for MPI_T and add a missing type into mpi.h
This commit was SVN r28803.
2013-07-16 16:03:33 +00:00
Nathan Hjelm
35673ea400 Add example performance variables to ob1: unexpected message queue length, posted receive length
This commit was SVN r28801.
2013-07-16 16:02:25 +00:00
Nathan Hjelm
d446675526 MCA: Per-RFC, add support for performance variables
This commit adds an API for registering and querying performance
variables (mca_base_pvar) in the MCA base. The existing MCA variable
system API has been updated to reflect the new API: MCA variable
groups have performance variables, and new types have been added (double,
unsigned long long) to reflect what is required by the MPI_T
interface. Additionally, the MCA variable group code has been split
into its own set of files: mca_base_var_group.[ch].

Details of the new API can be found in doxygen comments in the header:
mca_base_pvar.h.

Other changes to the variable system:

 - Use an opal_hash_table to speed up variable/group lookup.

 - Clean up code associated with MCA variable types.

 - Registered performance variables are printed by ompi_info -a. In the
   future an option should be added to control this behavior.

Changes to OMPI:

 - Added full support for the MPI_T performance variable interface.

This commit was SVN r28800.
2013-07-16 16:02:13 +00:00
Rolf vandeVaart
54b1fbdb4a Better error message code. Remove commented out code.
This commit was SVN r28793.
2013-07-15 22:27:34 +00:00