As per discussion in the June 2013 developer meeting these
flags will be used by the PML in the future to request
asynchronous progress on an operation. The naming was chosen
to reflect that a BTL supports this mode (MCA_BTL_FLAG_SIGNALED)
and that a descriptor should "signal" the remote side to wake
up and progress the message (MCA_BTL_DES_FLAG_SIGNAL).
Future commits will update OB1 to take advantage of this
feature when performing the RDMA get or RDMA rendezvous
protocols.
This commit was SVN r28612.
This commit is the trunk version of what is needed for #3626.
Add the "ignore_device" field to the INI file. This allows us to
specifically list devices that should be ignored by the openib BTL
(such as the Intel Phi, at least as of May 2013 -- see #3626).
Also add the Intel Phi to the ini file, and set its ignore_device=1.
Finally, add the concept of counting intentionally ignored verbs
devices. Devices are ignored for one of two reasons:
* If the number of allowed ports on that device is 0 (i.e., if
if_include/if_exclude was set such that we're intentionally
ignoring this device).
* If the INI ignore_device field for this device is set to 1.
Once we have the count of devices that were intentionally ignored,
only show the "Hey, there's verbs devices that you're not using!"
show_help message if there are devices that were ''unintentionally''
ignored.
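The counting rule above can be sketched as follows. This is a minimal illustration in Python, not the openib BTL code; the function name and the dictionary keys ("used", "allowed_ports", "ignore_device") are hypothetical stand-ins for whatever the BTL tracks per device.

```python
def should_warn_unused_devices(devices):
    """Return True only if some device was ignored unintentionally.

    Each device is a dict with hypothetical keys:
      'used'          -- True if the BTL is actually using the device
      'allowed_ports' -- ports left after if_include/if_exclude filtering
      'ignore_device' -- value of the INI ignore_device field (0 or 1)
    """
    unused = 0
    intentionally_ignored = 0
    for dev in devices:
        if dev["used"]:
            continue
        unused += 1
        # The two "intentional" reasons listed in the commit message.
        if dev["allowed_ports"] == 0 or dev["ignore_device"] == 1:
            intentionally_ignored += 1
    # Warn only when at least one unused device was not ignored on purpose.
    return unused > intentionally_ignored
```

With this rule, a node whose only unused device is listed with ignore_device=1 produces no warning, while a device that was skipped for any other reason still does.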
This commit was SVN r28589.
The following Trac tickets were found above:
Ticket 3626 --> https://svn.open-mpi.org/trac/ompi/ticket/3626
The primary issue with udcm is that the immediate data in message
acks were often bogus. This caused the sender to keep trying even
though a message was received and acked. The fix is to use the
source LID and QP to determine which message is being acked. In
most cases this should work well since only one message will be
in flight to any peer.
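As a rough illustration of the fix, acks can be matched by the sender's (LID, QP) pair instead of by the immediate data. The sketch below is Python, not the udcm code; the class and method names are hypothetical, and it assumes at most one message in flight per peer, as the text notes.

```python
class PendingMessages:
    """Track in-flight messages keyed by the peer's (LID, QP) pair."""

    def __init__(self):
        self._in_flight = {}  # (lid, qp) -> message

    def sent(self, peer_lid, peer_qp, message):
        # Assumption: only one message is in flight to any peer.
        self._in_flight[(peer_lid, peer_qp)] = message

    def acked(self, src_lid, src_qp):
        # Identify the acked message from the ack's source LID and QP,
        # ignoring any (possibly bogus) immediate data.
        return self._in_flight.pop((src_lid, src_qp), None)
```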
This commit was SVN r28444.
of individual regions (each region is a multiple of page size in
length), and each process claims its own regions by binding them to its
local memory. Each process would end up membinding something like 16
individual regions in the overall shmem segment.
There were two errors in this code relating to the memory affinity
pinning. Some combination of these two errors would lead to kernel
panics (!) on my RHEL 6.2 x86_64 machines when used with mmap'ed
shared memory (not posix or sysv shared memory, curiously enough):
1. The shared memory segment is initially divided into two regions:
control and data. The control starts at the beginning of the shmem
segment, the data starts after that. The data portion, unfortunately,
was ''not'' aligned to a page. So all the multiple-of-page-size
regions that we divvy up were also not aligned on page boundaries. And
therefore all the regions we tried to membind were not on page
boundaries.
The solution was to ensure that the data portion started on a page
boundary. Then all of the individual regions were on page boundaries,
too.
That being said, in my tests, Linux mbind() fails gracefully when the
address is not on a page boundary. So I'm not sure how this worked at
all / led to a kernel panic...
2. There was some bad pointer math that resulted in membinding regions
larger than they should have been, resulting in region overlaps.
There were definitely overlaps between regions in the same process;
it's likely that there were overlaps between regions of multiple
processes, too -- I'm not sure (and don't care to figure out :-) ).
The solution was to fix the pointer math so that each region membinds
exactly itself and no neighboring/overlapping regions.
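The two fixes amount to simple offset arithmetic, sketched below in Python rather than in the C of the actual code. The 4096-byte page size, the function names, and the parameters are assumptions made for illustration only.

```python
PAGE_SIZE = 4096  # assumed page size; the real code would query the system


def align_up(offset, alignment=PAGE_SIZE):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def region_bounds(control_size, region_pages, num_regions):
    """Return (start, end) byte offsets of each region in the data portion.

    Fix 1: the data portion starts on a page boundary after the control area.
    Fix 2: each region covers exactly its own length, so regions never overlap.
    """
    data_start = align_up(control_size)
    region_len = region_pages * PAGE_SIZE
    return [(data_start + i * region_len, data_start + (i + 1) * region_len)
            for i in range(num_regions)]
```

Because every region length is a multiple of the page size, aligning the start of the data portion is enough to put every region on a page boundary.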
cmr:v1.7.2:reviewer=samuel
This commit was SVN r28442.
- increase the number of wqes to minimize the number of RNRs
- it is better to have a high watermark and post a relatively small number of wqes
- increase the TX queue size
This commit was SVN r28440.
from the list (just for good measure), and then free() it (without
using _SAFE, we were accessing memory that was just free()'d to get to
the next item). Also be a little more thorough -- DESTRUCT the list
when we're all done.
This commit was SVN r28429.
(i.e., ensure that more data items get zeroed out/set to NULL) so that
if something goes wrong during initialization, we don't try to clean
up something that isn't there (and segv).
The chance of this happening on the trunk is very low (and will also
be low once the verbs improvements are brought over to v1.7). But it
can actually happen in the v1.6 branch (e.g., if no CPC is available,
we'll try to get the length of the endpoints list, but the endpoints
list is NULL).
Hence, even though the real goal is to get this functionality over to
v1.6, I figured I'd commit to the trunk/CMR to v1.7 just to try to
keep commonality in the openib between all three where possible.
This commit was SVN r28317.
This macro is only used on the failure path so the additional if statement
should not have any effect on performance.
cmr:v1.7
This commit was SVN r28292.
Notes:
- This commit also eliminates the need for an available components list in use
in several frameworks. None of the code in question was making use of the
priority field of the priority component list item so these extra lists were
removed.
- Cleaned up selection code in several frameworks to sort lists using opal_list_sort.
- Cleans up the ompi/orte-info functions. Expose the functions that construct the
list of params so they can be used elsewhere.
patches for mtl/portals4 from brian
missed a few output variables in openib
This commit was SVN r28241.
Features:
- Support for an override parameter file (openmpi-mca-param-override.conf).
Variable values in this file cannot be overridden by any file or environment
value.
- Support for boolean, unsigned, and unsigned long long variables.
- Support for true/false values.
- Support for enumerations on integer variables.
- Support for MPIT scope, verbosity, and binding.
- Support for command line source.
- Support for setting variable source via the environment using
OMPI_MCA_SOURCE_<var name>=source (either command or file:filename)
- Cleaner API.
- Support for variable groups (equivalent to MPIT categories).
Notes:
- Variables must be created with a backing store (char **, int *, or bool *)
that must live at least as long as the variable.
- Creating a variable with the MCA_BASE_VAR_FLAG_SETTABLE flag enables the use of
mca_base_var_set_value() to change the value.
- String values are duplicated when the variable is registered. It is up to
the caller to free the original value if necessary. The new value will be
freed by the mca_base_var system and must not be freed by the user.
- Variables with constant scope may not be settable.
- Variable groups (and all associated variables) are deregistered when the
component is closed or the component repository item is freed. This
prevents a segmentation fault from accessing a variable after its component
is unloaded.
- After some discussion we decided we should remove the automatic registration
of component priority variables. Few components actually made use of this
feature.
- The enumerator interface was updated to be general enough to handle
future uses of the interface.
- The code to generate ompi_info output has been moved into the MCA variable
system. See mca_base_var_dump().
opal: update core and components to mca_base_var system
orte: update core and components to mca_base_var system
ompi: update core and components to mca_base_var system
This commit also modifies the rmaps framework. The following variables were
moved from ppr and lama: rmaps_base_pernode, rmaps_base_n_pernode,
rmaps_base_n_persocket. Both lama and ppr create synonyms for these variables.
This commit was SVN r28236.
* Don't call PMPI_* anything from our module code; that's terribly
bad form (and disallowed!). Instead, do the proper back-end stuff
to reset the error handler on the file handle.
* If we've already started to MPI_Finalize, then just give up and
don't actually perform all the file closing actions (because
ROMIO's file close calls MPI_Barrier, which will obviously fail if
MPI_Finalize has already been invoked). Bad user behavior should
be punished (by leaking resources, not closing the file properly,
etc.).
This commit was SVN r28177.
A few changes were required to support this move:
1. the PMI component used to identify rte-related data (e.g., host name, bind level) and package them as a unit to reduce the number of PMI keys. This code was moved up to the ORTE layer as the OPAL layer has no understanding of these concepts. In addition, the component locally stored data based on process jobid/vpid - this could no longer be supported (see below for the solution).
2. the hash component was updated to use the new opal_identifier_t instead of orte_process_name_t as its index for storing data in the hash tables. Previously, we did a hash on the vpid and stored the data in a 32-bit hash table. In the revised system, we don't see a separate "vpid" field - we only have a 64-bit opaque value. The orte_process_name_t hash turned out to do nothing useful, so we now store the data in a 64-bit hash table. Preliminary tests didn't show any identifiable change in behavior or performance, but we'll have to see if a move back to the 32-bit table is required at some later time.
3. the db framework was a "select one" system. However, since the PMI component could no longer use its internal storage system, the framework has now been changed to a "select many" mode of operation. This allows the hash component to handle all internal storage, while the PMI component only handles pushing/pulling things from the PMI system. This was something we had planned for some time - when fetching data, we first check internal storage to see if we already have it, and then automatically go to the global system to look for it if we don't. Accordingly, the framework was provided with a custom query function used during "select" that lets you separately specify the "store" and "fetch" ordering.
4. the ORTE grpcomm and ess/pmi components, and the nidmap code, were updated to work with the new db framework and to specify internal/global storage options.
No changes were made to the MPI layer, except for modifying the ORTE component of the OMPI/rte framework to support the new db framework.
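The "select many" fetch behavior described in item 3 can be pictured as walking an ordered list of components until one has the data. This is a Python sketch only; the class and function names are hypothetical and do not correspond to the db framework's real API.

```python
class Component:
    """Hypothetical stand-in for a db component (e.g., hash or PMI)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.lookups = 0

    def store(self, key, value):
        self.data[key] = value

    def fetch(self, key):
        self.lookups += 1
        return self.data.get(key)


def db_fetch(key, fetch_order):
    """Try each component in fetch order; return the first value found."""
    for component in fetch_order:
        value = component.fetch(key)
        if value is not None:
            return value
    return None
```

Placing the internal (hash) component first in the fetch order means the global system is consulted only when the data is not already held locally.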
This commit was SVN r28112.
ompi_show_help, because opal_show_help is replaced with an
aggregating version when using ORTE, so there's no reason to
directly call orte_show_help.
This commit was SVN r28051.
r27987 - MTL MXM: ver. 2.0 interface changes.
This commit was SVN r28026.
The following SVN revision numbers were found above:
r27987 --> open-mpi/ompi@2735658d81
necessarily mean an error -- it could (and usually does) mean that the
peer realized that we both initiated a connect at the same time, and
therefore it decided to hang up.
I also added a friendly show_help error message for other cases where
recv_blocking() fails (i.e., "Something went wrong. Kaboom! Your job
will abort...").
This commit was SVN r28023.
The following Trac tickets were found above:
Ticket 3494 --> https://svn.open-mpi.org/trac/ompi/ticket/3494
flags, and mca flags are kept separate until the very end. The main configure
wrapper flags should now be modified by using the OPAL_WRAPPER_FLAGS_ADD
macro. MCA components should either let <framework>_<component>_{LIBS,LDFLAGS}
be copied over OR set <framework>_<component>_WRAPPER_EXTRA_{LIBS,LDFLAGS}.
The situations in which WRAPPER CPPFLAGS can be set by MCA components were
narrowed to match the one use case where it makes sense.
This commit was SVN r27950.
all pending fragments when the destination goes down. This allows the PML
to recalibrate its behavior, either find an alternate route or just give up.
This commit was SVN r27881.
party configure.in scripts to be configure.ac so that Automake stops
complaining about them.
This commit was SVN r27791.
The following SVN revision numbers were found above:
r27790 --> open-mpi/ompi@675a2f5c48
using the modex or RML to share sm initialization information, have node rank 0
create a file containing initialization information in a well-known place. Then
during add_procs, the rest of the node processes requiring sm BTL initialization
will just read from that file to complete their initialization.
This commit was SVN r27789.
There was a race condition in the eager get protocol where the RDMA complete message could be received before the local completion of the SMSG message that started the eager get protocol.
cmr:v1.7
This commit was SVN r27740.
* btl sendi(): if the message can be sent inline, try to avoid the signal
* a signal is requested once per 64, or when:
  - there are no send wqes
  - the message cannot be sent inline
  - any btl method other than sendi() is used
This commit was SVN r27724.
An upcoming BTL from Cisco used ofud as a starting point, and should
probably be used as a starting point for any future UD-based BTL.
And this OFUD BTL is obviously still in history if anyone ever wants
to resurrect it.
This commit was SVN r27655.
of modules, print a BTL_ERROR and exit(1) (previous behavior was to
segv). This at least explicitly tells the developer that their BTL
component is behaving badly.
This commit was SVN r27634.
fbtl modules. This implementation, in line with all other collective modules, tries to
keep all the file-ops as contiguous as possible.
This commit was SVN r27611.
Reasoning: The old behavior was a little confusing. mca_base_components_open does not open an output stream, so it is a little unexpected that mca_base_components_close closes one. In addition, several frameworks (that don't use mca_base_components_close) failed to close their output in the framework close function, and others closed their output a second time. This change improves the semantics of mca_base_components_open/close, as they are now symmetric in their functionality.
This commit was SVN r27570.
pml/v:
- If vprotocol is not being used, vprotocol_include_list is leaked. Assume vprotocol never takes ownership (see below) and always free the string.
coll/ml:
- (patch verified) calling mca_base_param_lookup_string after mca_base_param_reg_string is unnecessary. The call to mca_base_param_lookup_string causes the value returned by mca_base_param_reg_string to be leaked.
- Need to free mca_coll_ml_component.config_file_name on component close.
btl/openib:
- calling mca_base_param_lookup_string after mca_base_param_reg_string is unnecessary. The call to mca_base_param_lookup_string causes the value returned by mca_base_param_reg_string to be leaked.
vprotocol/base:
- There was no way for pml/v to determine if vprotocol took ownership of vprotocol_include_list. Fix by never taking ownership (always use strdup).
mca/base:
- param_lookup results in storage->stringval being a newly allocated string if the mca parameter has a string value. Ensure this string is always freed.
cmr:v1.7
This commit was SVN r27569.
It appears the problem was not with the command line parser but with the rsh plm. I don't know why this problem was not occurring before the command line parser changes, but it appears to be resolved now.
This commit was SVN r27527.
The following SVN revision numbers were found above:
r27451 --> open-mpi/ompi@d59034e6ef
r27456 --> open-mpi/ompi@ecdbf34937
Not sure what happened here, but the resulting trunk wouldn't even configure. After spending time fixing that problem, I found it wouldn't compile due to multiple syntax errors that had been introduced in both the OPAL and OMPI layer. This raised questions as to the completeness of the work.
Given that the author is departing, I pinged Jeff about it and we agreed to revert this for now. Hopefully, it can either be fixed by the author prior to actual departure, or someone else can pick it up (now that it is in the history) and fix it.
This commit was SVN r27511.
The following SVN revision numbers were found above:
r27508 --> open-mpi/ompi@12c3c743de
r27509 --> open-mpi/ompi@79e4a8ca38
r27510 --> open-mpi/ompi@1ad5ff625a
* Only register the progress function on first call to a non-blocking
collective operation, to try to reduce overall performance impact
* Fix tag management in roll-over case
This commit was SVN r27498.
# The processes register their information and continue.
# Actual printing of timing information happens at file close.
# Triggered by MCA parameter at runtime
This commit was SVN r27442.
# The processes register their information and continue.
# Actual printing of timing information happens at file close.
# Triggered by MCA parameter at runtime
This commit was SVN r27441.
# The processes register their information and continue.
# Actual printing of timing information happens at file close.
# Triggered by MCA parameter at runtime
This commit was SVN r27440.
# This is triggered based on an MCA parameter and can be used with all collective modules.
# Individual queues maintained for read and write.
# The additional communication to combine data is done at file-close so that the
actual timing of collective-operations will not get affected.
# The queues are initialized in file-open
This commit was SVN r27439.
- fix the Fortran layer to use new macros to convert Fortran-to-C status
- change the C internals to pull out old OMPI_SET_STATUS* macros
Also, change name of "status" argument in topo_test_f.c to "topo_type".
This commit was SVN r27403.
* Add OMPI_COMMON_VERBS_FLAGS_NOT_RC, which looks for a device that
does ''not'' support RC
* Add ompi_common_verbs_find_max_inline(), and remove that code from
the openib BTL component
This commit was SVN r27393.
ompi/mca/sbgp/basesmsocket
orte/mca/rmaps/lama
Remove stale configure.params files from the sbgp framework as the OMPI build system no longer looks at those files.
This commit was SVN r27377.
1. Multiple aggregator with non-contiguous datatype,
2. Memory corruption bugs.
Cleaned version, with proper initialization and memory management.
This commit was SVN r27370.
* Moved "check basics" sanity check from openib BTL to common/verbs
(which also allows us to have openib ''not'' include
<infiniband/driver.h>, which is a Very Good Thing)
* Add new ompi_common_verbs_qp_test() function, which tests to see
whether a device supports RC and/or UD QPs. The openib BTL now
uses this function to ensure that the device supports RC QPs.
* Rename ompi_common_verbs_find_ibv_ports() to be
ompi_common_verbs_find_ports() -- the "ibv" was redundant.
* Re-work ompi_common_verbs_find_ports() to use
ompi_common_verbs_qp_test() instead of testing for RC/UD QPs itself
* Add bunches of opal_output_verbose() to the find_ports() routine
(to help diagnose connectivity problems -- imagine running with
--mca btl_base_verbose 10; you'll see all the find_ports() test
results)
* Make ompi_common_verbs_qp_test() warn if devices/ports are supplied
in the if_include/if_exclude strings that do not exist (quite
similar to what the openib BTL does today).
* Add ompi_common_verbs_mca_register() function, which registers
common verbs MCA params. It will also register MCA param synonyms
for these MCA params to upper-level components (e.g.,
btl_<upper-level-component>_<the-mca-param>).
* common_verbs_warn_nonexistent_if: warn if
if_include/if_exclude-specified devices or ports do not exist.
This commit was SVN r27332.
We ran into a case where the OMPI SVN trunk grew a new acceptable MCA
parameter value, but this new value was not accepted on the v1.6
branch (hwloc_base_mem_bind_failure_action -- on the trunk it accepts
the value "silent", but on the older v1.6 branch, it doesn't). If you
set "hwloc_base_mem_bind_failure_action=silent" in the default MCA
params file and then accidentally ran with the v1.6 branch, every OMPI
executable (including ompi_info) just failed because hwloc_base_open()
would say "hey, 'silent' is not a valid value for
hwloc_base_mem_bind_failure_action!". Kaboom.
The only problem is that it didn't give you any indication of where
this value was being set. Quite maddening, from a user perspective.
So we changed how ompi_info handles this case. If any framework open
function returns OMPI_ERR_BAD_PARAM (either because its base MCA params
got a bad value or because one of its component register/open
functions returns OMPI_ERR_BAD_PARAM), ompi_info will stop, print out
a warning that it received an error, and then dump out the parameters
that it has received so far in the framework that had a problem.
At a minimum, this will show the user the MCA param that had an error
(it's usually the last one), and ''where it was set from'' (so that
they can go fix it).
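The stop-and-dump behavior described above can be sketched like this. It is an illustration in Python, not ompi_info's implementation; the status values, the function name, and the file path in the usage example are placeholders.

```python
SUCCESS = 0
ERR_BAD_PARAM = -5  # arbitrary placeholder standing in for O???_ERR_BAD_PARAM


def open_frameworks(frameworks):
    """Open frameworks in order; on a bad parameter, stop and report.

    frameworks is a list of (name, open_fn) pairs. Each open_fn appends
    (param, value, source) tuples to the list it is given as it registers
    parameters, and returns a status code.
    """
    for name, open_fn in frameworks:
        params = []
        status = open_fn(params)
        if status == ERR_BAD_PARAM:
            report = [f"Warning: framework '{name}' returned an error"]
            # Dump everything registered so far, including where it was set.
            report += [f"  {p} = {v} (source: {src})" for p, v, src in params]
            return status, report
    return SUCCESS, []
```

In this sketch the offending parameter is the last line of the report, which matches the observation that the bad value is usually the last one registered before the failure.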
We updated ompi_info to check for O???_ERR_BAD_PARAM from each of
the framework opens. Also updated the doxygen docs in mca.h for this
O???_ERR_BAD_PARAM behavior. And we noticed that mca.h had MCA_SUCCESS
and MCA_ERR_??? codes. Why? I think we used them in exactly one
place in the code base (mca_base_components_open.c). So we deleted
those and just used the normal OPAL_* codes instead.
While we were doing this, we also cleaned up a little memory
management during ompi_info/orte-info/opal-info finalization.
Valgrind still reports a truckload of memory still in use at ompi_info
termination, but they mostly look to be components not freeing
memory/resources properly (and outside the scope of this fix).
This commit was SVN r27306.
The following Trac tickets were found above:
Ticket 3275 --> https://svn.open-mpi.org/trac/ompi/ticket/3275
ompi_comm_split(), and the entire set of periods from the old
communicator have already been copied to the new communicator. But up
here in mca_topo_base_cart_sub(), we need to subset the periods that
are actually stored on the new communicator according to remain_dims
(just like we did for the set of dimensions).
This commit renames a few variables to be a little less misleading,
and then adds a loop to copy over the periods information. I could
have added this into the first loop (that subset-copies the
dimensions), but this code is already confusing enough and this is not
a performance-critical section: so I made it a new loop.
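The subsetting described above looks roughly like the following. This is a Python sketch of the idea, not the C code in mca_topo_base_cart_sub(); the function and variable names are illustrative.

```python
def cart_sub_topology(dims, periods, remain_dims):
    """Keep only the dimensions (and their periods) selected by remain_dims."""
    new_dims = []
    for d, keep in zip(dims, remain_dims):
        if keep:
            new_dims.append(d)

    # Second, separate loop: subset the periods in exactly the same way.
    new_periods = []
    for p, keep in zip(periods, remain_dims):
        if keep:
            new_periods.append(p)

    return new_dims, new_periods
```

For example, a 4x3x2 Cartesian topology with periods (1, 0, 1) and remain_dims (true, false, true) yields a 4x2 sub-topology with periods (1, 1).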
Note that all the topo code will be revamped a bit when the new
MPI-2.2 topo stuff (currently off in a mercurial branch) finally makes
it back to the SVN trunk. But that new stuff will only get to v1.7 --
this commit will need to be CMR'ed to v1.6.x.
cmr:v1.7
cmr:v1.6.2
This commit was SVN r27248.
The following Trac tickets were found above:
Ticket 3294 --> https://svn.open-mpi.org/trac/ompi/ticket/3294
some new common OpenFabrics functionality to ompi/mca/common/verbs.
Also move everything that was in ompi/mca/common/ofautils under
ompi/mca/common/verbs.
* Move ofautils -> verbs
* Add new functionality in ompi/mca/common/verbs (see doxygen
comments in ompi/mca/common/verbs/common_verbs.h for details):
* ompi_common_verbs_find_ibv_ports()
* ompi_common_verbs_port_bw()
* ompi_common_verbs_mtu()
* '''If you're writing verbs-based code, you should be using this
common functionality'''
* Adapt openib BTL to use some trivial common functionality in
common/verbs
* Don't use "#ifdef OMPI_HAVE_RDMAOE", use
"#if defined(HAVE_IBV_LINK_LAYER_ETHERNET)"
* Update the following to include/link against common/verbs
* bcol/iboffload
* sbgp/ibnet
* btl/openib
This commit was SVN r27212.
that causes MPI jobs to abort if there is not enough registered memory
available (vs. just warning).
This commit was SVN r27140.
The following Trac tickets were found above:
Ticket 3258 --> https://svn.open-mpi.org/trac/ompi/ticket/3258
The project includes the following components and frameworks:
- ML Collective component
- NETPATTERNS and COMMPATTERNS common components
- BCOL framework
- SBGP framework
Note: By default the ML collective component is disabled. To enable the new
collectives, the user should raise the priority of the ml component (coll_ml_priority).
=============================================
Primary Contributors (in alphabetical order):
Ishai Rabinovich (Mellanox)
Joshua S. Ladd (ORNL / Mellanox)
Manjunath Gorentla Venkata (ORNL)
Mike Dubman (Mellanox)
Noam Bloch (Mellanox)
Pavel (Pasha) Shamis (ORNL / Mellanox)
Richard Graham (ORNL / Mellanox)
Vasily Filipov (Mellanox)
This commit was SVN r27078.
technically this is a necessary thing to do, it wasn't a tragedy that
we didn't have it because err was initialized to 0 in the beginning of
the functions where this problem occurred. Also, OMPI will likely
abort if one of the MCA_PML_CALLs actually incurs an error (or, even
if it doesn't, MPI doesn't define the behavior anyway ;-) ).
But looking forward to an FT-aware world, fixing this issue is a Good
Thing. Many thanks to Hristo Iliev for pointing out the issue.
This commit was SVN r27070.
- OMPI_SUCCESS
- OMPI_ERROR
- OMPI_ERR_RESOURCE_BUSY
If an "OMPI_ERR_OUT_OF_RESOURCE" occurs, the request is added to the pending list, and will be handled later. An error message
should not be printed to the user in this case. This is not an error, but rather a notification of a possible valid condition.
Only in the case of "OMPI_ERROR" should it be printed to the user.
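The policy above can be summarized in a short sketch. This is Python for illustration, not the openib BTL code; the string constants stand in for the real error codes, and the function name and arguments are hypothetical.

```python
OMPI_SUCCESS = "OMPI_SUCCESS"
OMPI_ERROR = "OMPI_ERROR"
OMPI_ERR_OUT_OF_RESOURCE = "OMPI_ERR_OUT_OF_RESOURCE"


def handle_return_code(rc, request, pending, messages):
    """Queue on resource exhaustion; report to the user only on OMPI_ERROR."""
    if rc == OMPI_ERR_OUT_OF_RESOURCE:
        # Not an error: a valid condition, so retry later and stay quiet.
        pending.append(request)
    elif rc == OMPI_ERROR:
        messages.append(f"error while processing request {request}")
    return rc
```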
This commit was SVN r27065.
btl_openib_connect_udcm when notifying not to listen to an fd to ensure
that the main thread does not continue until the service thread has
processed the message
Adds ability to send message to openib async thread to tell it to
ignore the ERR state on a specific QP. Adds this call to udcm_module_finalize
so when we set the error state on the QP it doesn't cause the
openib async thread to abort the mpi program prematurely
Fixes trac:3161
This commit was SVN r27064.
The following Trac tickets were found above:
Ticket 3161 --> https://svn.open-mpi.org/trac/ompi/ticket/3161
receive queues value) so that we don't break the use of RDMA CM, and
therefore break RoCE.
This commit was SVN r27017.
The following SVN revision numbers were found above:
r26869 --> open-mpi/ompi@fe0e7f81df
aren't separated out into individual commits; they represent a few
months of work in the Mercurial branch, and it seemed error-prone to
try to break them up into multiple SVN commits.
* Remove 2nd overloaded interfaces for MPI_TESTALL, MPI_TESTSOME,
MPI_WAITALL, and MPI_WAITSOME in the "mpi" module implementations
(because we're not allowed to have them, anyway -- it causes
complications in the profiling interface). This forced an MPI-2.2
errata in the MPI Forum; we applied the errata here (the array of
statuses parameter could not have a specific dimension specified in
the dummy argument). Fixes trac:3166.
* Similarly, fix type for MPI_ARGVS_NULL in Fortran
* Add MPI-3.0 function MPI_F_SYNC_REG (Fortran interfaces only).
* Add MPI-3.0 MPI_MESSAGE_NO_PROC in the mpi_f08 module.
* Added mpi_f08 handle comparison operators, per MPI-3.0 addendum to
the F08 proposal at the last Forum meeting.
* Added missing type(MPI_File) and type(MPI_Message) in mpi_f08 module.
* Fix --disable-mpi-io configure switch with all Fortran interfaces
* Re-factor the Fortran header files to be fundamentally simpler and
easier to maintain. Fortran constant values in the header files
are now generated by a script named mpif-values.pl during
autogen.pl (they were previously generated by mpif-common.pl, but
it was quite a bit more subtle/complex). A second commit will
follow this one to update svn:ignore values (just to ensure we
don't muck up the first commit with the SVN client getting confused
by the changed ignore values and new/changed files).
* Fix some dependencies for compile ordering in
ompi/mpi/fortran/use-mpi-ignore-tkr/Makefile.am.
* Fix bad wording in several places (.m4 file name, ompi_info output,
etc.): we previously said "F08 assumed shape" when we really meant
"F08 assumed rank" (for Fortran gurus, those are very different
things).
* Removed the GREEK/SVN version string from mpif.h. It really had no
purpose being there.
Still to be done:
* Handling of 2D array of strings in MPI_COMM_SPAWN_MULTIPLE still
isn't right yet. Not sure how many people really care about this
:-), but it is still broken.
This commit was SVN r26997.
The following Trac tickets were found above:
Ticket 3166 --> https://svn.open-mpi.org/trac/ompi/ticket/3166
- return MPI_ERR_OTHER instead of MPI_SUCCESS for the functions that are not
yet implemented
- add another field to the mca_io_ompio_file_t structure to point back to the
ompi_file_t structure.
This commit was SVN r26908.
- For now we'll use 8192 as a base value
- We leave the adjust_cq() as is
- For the long term we can work on an appropriate setting to expose through the INI file.
8K CQEs are 512K per process, which is 8MB for ppn=16
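The memory figures above follow from simple arithmetic, worked through below in Python. The 64-byte CQE size is not stated in the text; it is inferred here from the quoted numbers (512K / 8192) and should be treated as an assumption.

```python
CQE_COUNT = 8192     # base CQ size chosen in this commit
CQE_SIZE_BYTES = 64  # assumption: implied by 512K / 8192, not stated in the text
PPN = 16             # processes per node, as in the example

per_process_bytes = CQE_COUNT * CQE_SIZE_BYTES  # 524288 bytes = 512K
per_node_bytes = per_process_bytes * PPN        # 8388608 bytes = 8MB
```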
This commit was SVN r26877.
ibv_get_device_list_compat() and not finding it, I finally realized
that it was a function in OMPI. So let's name it with a proper ompi_
prefix, not an ibv_ prefix.
This commit was SVN r26867.
it to a negative number). Get rid of the multiplication in the critical
path, and keep the functions as simple as possible.
This commit was SVN r26864.
move). Extended common sm API with: mca_common_sm_module_create and
mca_common_sm_module_attach. Please note that the new routines aren't currently
used -- but will be...
This commit was SVN r26845.
alignment, which typically causes problems on SPARC. Further, the pointer
manipulation to access elements in a round schedule was clumsy. This change
introduces macros to facilitate addressing and make it more portable.
This commit was SVN r26802.
1000. Refs trac:3154.
IB/iWarp vendors need to get together to figure out a real fix.
This commit was SVN r26777.
The following SVN revision numbers were found above:
r26730 --> open-mpi/ompi@5315c91baf
The following Trac tickets were found above:
Ticket 3154 --> https://svn.open-mpi.org/trac/ompi/ticket/3154
Among other things, this patch deals with the following issues:
* fix ompi-checkpoint argument parsing
* ompi-restart -showme prints an extraneous "Restarted child with PID"
message. Move around the debug statement to avoid this.
* fixes for the state machine changes
This commit was SVN r26770.
- change the location where we mark the file view as contiguous and the
condition on how it is determined to be contiguous
- remove the unnecessary include statements
This commit was SVN r26763.