4303 Commits

Author SHA1 Message Date
George Bosilca
2d33c9ee39 Stop complaining about an overwritten default parameter.
This commit was SVN r28322.
2013-04-10 19:44:37 +00:00
Jeff Squyres
8405975bf6 Be a little more conservative about initializing devices and modules
(i.e., ensure that more data items get zeroed out/set to NULL) so that
if something goes wrong during initialization, we don't try to clean
up something that isn't there (and segv).

The chance of this happening on the trunk is very low (and will also
be low once the verbs improvements are brought over to v1.7).  But it
can actually happen in the v1.6 branch (e.g., if no CPC is available,
we'll try to get the length of the endpoints list, but the endpoints
list is NULL).  

Hence, even though the real goal is to get this functionality over to
v1.6, I figured I'd commit to the trunk/CMR to v1.7 just to try to
keep commonality in the openib between all three where possible.

This commit was SVN r28317.
2013-04-09 21:55:31 +00:00
Ralph Castain
45af6cf59e The move of the orte_db framework to opal required that we create an opaque opal_identifier_t type as OPAL cannot know anything about the ORTE process name. However, passing a value down to opal and then having the db components reference it causes alignment issues on Solaris Sparc platforms. So pass the pointer instead and do the old "memcpy" trick to avoid the problem.
This commit was SVN r28308.
2013-04-08 23:34:16 +00:00
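The "memcpy" trick mentioned in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `example_identifier_t`, `example_load_id`, and `example_store_id` are hypothetical names, and the 64-bit width is assumed from the `uint64_t` definition described elsewhere in this log.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an opaque 64-bit process identifier. */
typedef uint64_t example_identifier_t;

/* Read an identifier from memory that may not be 8-byte aligned.
 * Casting src to (const example_identifier_t *) and dereferencing it can
 * fault on strict-alignment CPUs such as SPARC; memcpy has no alignment
 * requirement, so it is safe for any source address. */
static example_identifier_t example_load_id(const void *src)
{
    example_identifier_t id;
    assert(src != NULL);
    memcpy(&id, src, sizeof(id));
    return id;
}

/* Write an identifier to a possibly unaligned destination the same way. */
static void example_store_id(void *dst, const example_identifier_t *id)
{
    assert(dst != NULL && id != NULL);
    memcpy(dst, id, sizeof(*id));
}
```

Passing a pointer and copying byte-wise keeps the lower layer from ever dereferencing a misaligned 64-bit value; compilers typically turn the fixed-size `memcpy` into a plain load or store on platforms where that is safe.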
Nathan Hjelm
4e95d691a7 pml/ob1: do not reset the convertor if one was not created (size = 0).
This macro is only used on the failure path so the additional if statement
should not have any effect on performance.

cmr:v1.7

This commit was SVN r28292.
2013-04-05 01:40:11 +00:00
Pavel Shamis
fed6e60131 Fixing OpenIB BTL compilation failure for cases when
BTL_OPENIB_MALLOC_HOOKS_ENABLED is disabled.

This commit was SVN r28290.
2013-04-04 20:17:18 +00:00
Pavel Shamis
aa1f5697b4 In order to prevent name conflicts in XRC (MOFED) enabled mode
OFACM's ib_address_t was renamed to ofacm_ib_address_t

This commit was SVN r28289.
2013-04-04 20:02:17 +00:00
Nathan Hjelm
e8d9944456 sbgp/ibnet: fix param -> var update errors
This commit was SVN r28284.
2013-04-03 20:17:18 +00:00
Nathan Hjelm
75093155ab bcol/iboffload: fix still more errors from param -> var updates
This commit was SVN r28283.
2013-04-03 19:57:03 +00:00
Nathan Hjelm
47a1897710 bcol/iboffload: fix more errors from param -> var updates
This commit was SVN r28281.
2013-04-03 18:55:46 +00:00
Nathan Hjelm
31a498c2a1 bcol/iboffload: fix errors from param -> var updates
This commit was SVN r28280.
2013-04-03 18:33:19 +00:00
Ralph Castain
66f3a81488 Cleanup warnings found when building v1.7
cmr:v1.7

This commit was SVN r28279.
2013-04-03 17:37:02 +00:00
Vishwanath Venkatesan
74c418b860 Adding typecasting with intptr_t to remove warnings.
This commit was SVN r28278.
2013-04-03 17:07:43 +00:00
Vishwanath Venkatesan
784337aab1 typecasting with intptr_t to remove warnings
This commit was SVN r28276.
2013-04-03 17:06:02 +00:00
Jeff Squyres
64d39a4e97 Technically speaking, we're creating a QP with 1 send WQE and 1
receive WQE, so it's good form to have a CQ with 2 entries, not 1.

This commit was SVN r28256.
2013-03-28 13:11:31 +00:00
George Bosilca
9c6374b515 Swap the open and register.
This commit was SVN r28253.
2013-03-27 22:19:57 +00:00
Nathan Hjelm
f1fa290157 btl/vader: add missing return statement
This commit was SVN r28252.
2013-03-27 22:16:21 +00:00
Nathan Hjelm
113fadd749 btl/vader: do not use common/sm for shared memory fragments
This commit was SVN r28250.
2013-03-27 22:10:02 +00:00
Nathan Hjelm
9d4a26f47d Update OMPI frameworks to use the MCA framework system.
Notes:
  - This commit also eliminates the need for an available components list in use
    in several frameworks. None of the code in question was making use of the
    priority field of the priority component list item so these extra lists were
    removed.
  - Cleaned up selection code in several frameworks to sort lists using opal_list_sort.
  - Cleans up the ompi/orte-info functions. Expose the functions that construct the
    list of params so they can be used elsewhere.

patches for mtl/portals4 from brian

missed a few output variables in openib

This commit was SVN r28241.
2013-03-27 21:17:31 +00:00
Nathan Hjelm
c041156f60 Update ORTE frameworks to use the MCA framework system.
This commit was SVN r28240.
2013-03-27 21:14:43 +00:00
Nathan Hjelm
cf377db823 MCA/base: Add new MCA variable system
Features:
 - Support for an override parameter file (openmpi-mca-param-override.conf).
   Variable values in this file can not be overridden by any file or environment
   value.
 - Support for boolean, unsigned, and unsigned long long variables.
 - Support for true/false values.
 - Support for enumerations on integer variables.
 - Support for MPIT scope, verbosity, and binding.
 - Support for command line source.
 - Support for setting variable source via the environment using
   OMPI_MCA_SOURCE_<var name>=source (either command or file:filename)
 - Cleaner API.
 - Support for variable groups (equivalent to MPIT categories).

Notes:
 - Variables must be created with a backing store (char **, int *, or bool *)
   that must live at least as long as the variable.
 - Creating a variable with the MCA_BASE_VAR_FLAG_SETTABLE flag enables the use of
   mca_base_var_set_value() to change the value.
 - String values are duplicated when the variable is registered. It is up to
   the caller to free the original value if necessary. The new value will be
   freed by the mca_base_var system and must not be freed by the user.
 - Variables with constant scope may not be settable.
 - Variable groups (and all associated variables) are deregistered when the
   component is closed or the component repository item is freed. This
   prevents a segmentation fault from accessing a variable after its component
   is unloaded.
 - After some discussion we decided we should remove the automatic registration
   of component priority variables. Few components actually made use of this
   feature.
 - The enumerator interface was updated to be general enough to handle
   future uses of the interface.
 - The code to generate ompi_info output has been moved into the MCA variable
   system. See mca_base_var_dump().

opal: update core and components to mca_base_var system
orte: update core and components to mca_base_var system
ompi: update core and components to mca_base_var system

This commit also modifies the rmaps framework. The following variables were
moved from ppr and lama: rmaps_base_pernode, rmaps_base_n_pernode,
rmaps_base_n_persocket. Both lama and ppr create synonyms for these variables.

This commit was SVN r28236.
2013-03-27 21:09:41 +00:00
Ralph Castain
317915225c Finish the binding cleanup by removing the no-longer-used binding level scheme. This proved to be fallible as there is no guarantee that the hierarchy it used matched the physical reality of the machine (e.g., is L3 "above" the socket or not). Still have to complete the ppr update, but get the rest of it correct.
This commit was SVN r28223.
2013-03-26 20:09:49 +00:00
Jeff Squyres
44e371a65d Remove (bogus) port number from the opal_output -- there's no port
number associated with creating a QP.

This commit was SVN r28222.
2013-03-26 19:48:50 +00:00
Vishwanath Venkatesan
e092cc34e0 Fixing the read all bugs discovered by Coverity
This commit was SVN r28189.
2013-03-20 20:27:09 +00:00
Samuel Gutierrez
8ce2041102 Cleanup in error path. Fixes CID 967211. Thanks, Jeff.
This commit was SVN r28183.
2013-03-19 20:00:08 +00:00
Jeff Squyres
2513122d31 Remove extraneous semicolon.
This commit was SVN r28180.
2013-03-18 23:58:11 +00:00
Jeff Squyres
7ac02fb9d4 Two fixes for the ROMIO io module:
 * Don't call PMPI_* anything from our module code; that's terribly
   bad form (and disallowed!).  Instead, do the proper back-end stuff
   to reset the error handler on the file handle.
 * If we've already started to MPI_Finalize, then just give up and
   don't actually perform all the file closing actions (because
   ROMIO's file close calls MPI_Barrier, which will obviously fail if
   MPI_Finalize has already been invoked).  Bad user behavior should
   be punished (by leaking resources, not closing the file properly,
   etc.).

This commit was SVN r28177.
2013-03-18 20:11:20 +00:00
Vasily Filipov
7bda23dd84 SBGP, BCOL: add missing "show_help.h" includes.
This commit was SVN r28163.
2013-03-10 09:11:09 +00:00
Brian Barrett
65109de931 Fix leak of comm and datatype references for mprobe/improbe and fix a request leak in improbe
This commit was SVN r28157.
2013-03-07 21:55:22 +00:00
Brian Barrett
db858827df Fill in more of the process info structure when using PMI
This commit was SVN r28152.
2013-03-06 19:32:47 +00:00
Brian Barrett
a67d768ee4 quick hack to get things compiling again. Still need to fill in the fixme parts. sigh.
This commit was SVN r28150.
2013-03-06 18:33:25 +00:00
Nathan Hjelm
3c5cd95087 mtl/psm: add missing header for opal_show_help (one more)
This commit was SVN r28147.
2013-03-05 00:18:51 +00:00
Nathan Hjelm
25d0d97d6b mtl/psm: add missing header for opal_show_help
This commit was SVN r28146.
2013-03-05 00:17:48 +00:00
Nathan Hjelm
213cb79fab mtl/psm: add missing header for opal_show_help
This commit was SVN r28145.
2013-03-05 00:15:11 +00:00
Rolf vandeVaart
037729dcbb Add a search path. Refactor code.
This commit was SVN r28142.
2013-03-01 21:50:56 +00:00
Rolf vandeVaart
5c761d701d Remove tabs for spaces, fix some error messages.
This commit was SVN r28141.
2013-03-01 19:13:06 +00:00
Rolf vandeVaart
ebe63118ac Remove dependency on libcuda.so when building in CUDA-aware support. Dynamically load it if needed.
This commit was SVN r28140.
2013-03-01 13:21:52 +00:00
Ralph Castain
a4b6fb241f Remove all remaining vestiges of the Windows integration
This commit was SVN r28137.
2013-02-28 17:31:47 +00:00
Nathan Hjelm
b5a2cd1cce remove csum pml
This commit was SVN r28133.
2013-02-28 00:17:56 +00:00
Brian Barrett
1370d4569a workaround for case when MD can't span all of memory (sigh)
This commit was SVN r28132.
2013-02-27 17:02:45 +00:00
Vasily Filipov
f897c8a1e0 MTL MXM: STREAM support for isend and irecv.
This commit was SVN r28122.
2013-02-27 13:21:30 +00:00
Ralph Castain
8d2fa3693b First cut at removing the native Windows support. Remove all the Windows-specific components, and the .windows files sprinkled around. Remove the Windows platform files and MTT scripts. Update the NEWS to point Windows users to the cygwin package.
This commit was SVN r28116.
2013-02-26 20:44:56 +00:00
Ralph Castain
bd9265c560 Per the meeting on moving the BTLs to OPAL, move the ORTE database "db" framework to OPAL so the relocated BTLs can access it. Because the data is indexed by process, this requires that we define a new "opal_identifier_t" that corresponds to the orte_process_name_t struct. In order to support multiple run-times, this is defined in opal/mca/db/db_types.h as a uint64_t without identifying the meaning of any part of that data.
A few changes were required to support this move:

1. the PMI component used to identify rte-related data (e.g., host name, bind level) and package them as a unit to reduce the number of PMI keys. This code was moved up to the ORTE layer as the OPAL layer has no understanding of these concepts. In addition, the component locally stored data based on process jobid/vpid - this could no longer be supported (see below for the solution).

2. the hash component was updated to use the new opal_identifier_t instead of orte_process_name_t as its index for storing data in the hash tables. Previously, we did a hash on the vpid and stored the data in a 32-bit hash table. In the revised system, we don't see a separate "vpid" field - we only have a 64-bit opaque value. The orte_process_name_t hash turned out to do nothing useful, so we now store the data in a 64-bit hash table. Preliminary tests didn't show any identifiable change in behavior or performance, but we'll have to see if a move back to the 32-bit table is required at some later time.

3. the db framework was a "select one" system. However, since the PMI component could no longer use its internal storage system, the framework has now been changed to a "select many" mode of operation. This allows the hash component to handle all internal storage, while the PMI component only handles pushing/pulling things from the PMI system. This was something we had planned for some time - when fetching data, we first check internal storage to see if we already have it, and then automatically go to the global system to look for it if we don't. Accordingly, the framework was provided with a custom query function used during "select" that lets you separately specify the "store" and "fetch" ordering.

4. the ORTE grpcomm and ess/pmi components, and the nidmap code, were updated to work with the new db framework and to specify internal/global storage options.

No changes were made to the MPI layer, except for modifying the ORTE component of the OMPI/rte framework to support the new db framework.

This commit was SVN r28112.
2013-02-26 17:50:04 +00:00
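The fetch ordering described in the commit above (check internal storage first, then fall back to the global system) might look something like the sketch below. Everything here is hypothetical and for illustration only: the `example_*` names, the fixed-size array stores, and the integer values are not the db framework's API, and caching a global hit in local storage is an added assumption rather than something the commit message states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_MAX_ENTRIES 16

/* Hypothetical opaque 64-bit process identifier. */
typedef uint64_t example_id_t;

typedef struct {
    example_id_t id;
    const char *key;   /* pointer is stored, not copied */
    int value;
} example_entry_t;

/* Hypothetical store; stands in for either internal or global storage. */
typedef struct {
    example_entry_t entries[EXAMPLE_MAX_ENTRIES];
    size_t count;
} example_store_t;

/* Insert or update a value; returns 0 on success, -1 if the store is full. */
static int example_store_put(example_store_t *s, example_id_t id,
                             const char *key, int value)
{
    assert(s != NULL && key != NULL);
    for (size_t i = 0; i < s->count; i++) {
        if (s->entries[i].id == id && strcmp(s->entries[i].key, key) == 0) {
            s->entries[i].value = value;
            return 0;
        }
    }
    if (s->count == EXAMPLE_MAX_ENTRIES) {
        return -1;
    }
    s->entries[s->count++] = (example_entry_t){ id, key, value };
    return 0;
}

/* Look up a value; returns 0 and fills *value on a hit, -1 on a miss. */
static int example_store_get(const example_store_t *s, example_id_t id,
                             const char *key, int *value)
{
    assert(s != NULL && key != NULL && value != NULL);
    for (size_t i = 0; i < s->count; i++) {
        if (s->entries[i].id == id && strcmp(s->entries[i].key, key) == 0) {
            *value = s->entries[i].value;
            return 0;
        }
    }
    return -1;
}

/* Fetch in two tiers: internal storage first, then the global store. */
static int example_fetch(example_store_t *local, const example_store_t *global,
                         example_id_t id, const char *key, int *value)
{
    if (example_store_get(local, id, key, value) == 0) {
        return 0;                       /* already held internally */
    }
    if (example_store_get(global, id, key, value) != 0) {
        return -1;                      /* not known anywhere */
    }
    /* Assumption: remember the global answer locally for next time. */
    (void) example_store_put(local, id, key, *value);
    return 0;
}
```

Separating the two tiers this way is what lets one component own all internal storage while another only talks to the global system, which is the division of labor the "select many" change is described as enabling.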
Ralph Castain
70a28c8a27 Now that we are using local ranks in OMPI, we need to define an ompi_local_rank_t and equate it to orte_local_rank_t. Change the sm btl to use the correct abstraction.
This commit was SVN r28098.
2013-02-22 17:48:53 +00:00
Samuel Gutierrez
af5ed9b25c OMPI_NODE_RANK_INVALID ==> OMPI_LOCAL_RANK_INVALID
This commit was SVN r28096.
2013-02-21 18:28:07 +00:00
Samuel Gutierrez
4bf0134901 Remove debug.
This commit was SVN r28095.
2013-02-21 18:21:22 +00:00
Samuel Gutierrez
b7791963f2 Fix sm BTL initialization for MPI_Comm_spawn and friends. Thanks to Jeff for
finding the issue.

This commit was SVN r28094.
2013-02-21 18:19:46 +00:00
Nathan Hjelm
55cf850eca Add comment about r28083
This commit was SVN r28084.

The following SVN revision numbers were found above:
  r28083 --> open-mpi/ompi@5411e28c00
2013-02-20 21:42:13 +00:00
Nathan Hjelm
5411e28c00 btl/openib: don't align fragments on 2 byte boundaries (changed to 8)
cmr:v1.6,v1.7

This commit was SVN r28083.
2013-02-20 21:27:01 +00:00
Rolf vandeVaart
da3e9ff906 Add show_help.h where needed.
This commit was SVN r28071.
2013-02-19 15:42:09 +00:00
Brian Barrett
3c83618799 fix a missing header file issue with IB
This commit was SVN r28070.
2013-02-18 18:29:14 +00:00