get bitten by header depending on having already included
the corresponding [opal|orte|ompi]_config.h header.
When separating, things like [OPAL|ORTE|OMPI]_DECLSPEC
are missed.
Script to add the corresponding header in front of all following
(taking care of possible #ifdef HAVE_...)
- Including some minor cleanups to
- ompi/group/group.h -- include _after_ #ifndef OMPI_GROUP_H
- ompi/mca/btl/btl.h -- nclude _after_ #ifndef MCA_BTL_H
- ompi/mca/crcp/bkmrk/crcp_bkmrk_btl.c -- still no need for
orte/util/output.h
- ompi/mca/pml/dr/pml_dr_recvreq.c -- no need for mpool.h
- ompi/mca/btl/btl.h -- reorder to fit
- ompi/mca/bml/bml.h -- reorder to fit
- ompi/runtime/ompi_mpi_finalize.c -- reorder to fit
- ompi/request/request.h -- additionally need ompi/constants.h
- Tested on linux/x86-64
This commit was SVN r20720.
Studio compilers again. It has been broken since
1/14/2009 when some changes exposed a bug in autoconf
and how it handles support for the restrict keyword.
Basically, Sun Studio C supports the restrict keyword
but Sun Studio C++ does not.
I am also pursuing a fix with the autoconf folks, but
this change was needed to get things building again.
This commit was SVN r20351.
size. The correct way is to cast to an integer type that has the same length, and
then allow the compiler to upgrade to the read type.
This commit was SVN r19944.
but not anything in libc. Which causes an incorrect answer for
AC_CHECK_FUNCS. Work around that by also checking for the
#define.
This commit was SVN r18730.
to allow explicit grouping of hot functions into similar code
sections upon link-time. Should decrease TLB misses (iff the code-
section is really too large)...
Candidates for __opal_attribute_hot__ are MPI_Isend MPI_Irecv,
MPI_Wait, MPI_Waitall
Candidates for __opal_attribute_cold__ are MPI_Init, MPI_Finalize and
MPI_Abort...
This commit was SVN r18421.
* The opal_sys_timer_get_cycles() call was implemented for
Sparc v9 using inline assembly, but not in the assembly files.
This would only currently matter on Linux Sparc systems using
a compiler that didn't support inline assembly (not many of
those), but it should be there for completion.
* The linux timer component would always build on non-Alpha
platforms, rather than only building on platforms where
opal_sys_timer_get_cycles() was implemented. This would
only matter on a very narrow set of platforms that we don't
really support, but still, it could be more right. We now
only build the component on platforms where we have the
assembly call to get the cycle counter.
* Added a comment to opal/sys/timer.h to note that the linux
timer component needed to be updated if another platform was
added.
This should be harmless to commit. It will only really change
behaviors on platforms we don't have assembly support for, which
currently won't make it through configure. It really only matters
when (if?) we support atomic operations through libatomic_ops.
This commit was SVN r17887.
* Conditionalize around `static inline` using
`OPAL_HAVE_INLINE_ATOMIC*` macros
* Remove redundant `opal_atomic*` prototypes (they
belong in the top-level `sys/atomic.h`
This commit was SVN r16957.
The following SVN revision numbers were found above:
r16807 --> open-mpi/ompi@b7c885247a
(e.g., gcc) are indifferent about this, while others are more
particular (e.g., Sun Studio 12).
* Typo: `asms.s` to `asm.s`
* Eliminate "foo is multiply-defined" linker errors on
Solaris by making the declarations in `opal/sys/atomic.h` agree
with their corresponding definitions (use `static inline` in *both*
places).
This commit was SVN r16807.
* General TCP cleanup for OPAL / ORTE
* Simplifying the OOB by moving much of the logic into the RML
* Allowing the OOB RML component to do routing of messages
* Adding a component framework for handling routing tables
* Moving the xcast functionality from the OOB base to its own framework
Includes merge from tmp/bwb-oob-rml-merge revisions:
r15506, r15507, r15508, r15510, r15511, r15512, r15513
This commit was SVN r15528.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r15506
r15507
r15508
r15510
r15511
r15512
r15513
1. Galen's fine-grain control of queue pair resources in the openib
BTL.
1. Pasha's new implementation of asychronous HCA event handling.
Pasha's new implementation doesn't take much explanation, but the new
"multifrag" stuff does.
Note that "svn merge" was not used to bring this new code from the
/tmp/ib_multifrag branch -- something Bad happened in the periodic
trunk pulls on that branch making an actual merge back to the trunk
effectively impossible (i.e., lots and lots of arbitrary conflicts and
artifical changes). :-(
== Fine-grain control of queue pair resources ==
Galen's fine-grain control of queue pair resources to the OpenIB BTL
(thanks to Gleb for fixing broken code and providing additional
functionality, Pasha for finding broken code, and Jeff for doing all
the svn work and regression testing).
Prior to this commit, the OpenIB BTL created two queue pairs: one for
eager size fragments and one for max send size fragments. When the
use of the shared receive queue (SRQ) was specified (via "-mca
btl_openib_use_srq 1"), these QPs would use a shared receive queue for
receive buffers instead of the default per-peer (PP) receive queues
and buffers. One consequence of this design is that receive buffer
utilization (the size of the data received as a percentage of the
receive buffer used for the data) was quite poor for a number of
applications.
The new design allows multiple QPs to be specified at runtime. Each
QP can be setup to use PP or SRQ receive buffers as well as giving
fine-grained control over receive buffer size, number of receive
buffers to post, when to replenish the receive queue (low water mark)
and for SRQ QPs, the number of outstanding sends can also be
specified. The following is an example of the syntax to describe QPs
to the OpenIB BTL using the new MCA parameter btl_openib_receive_queues:
{{{
-mca btl_openib_receive_queues \
"P,128,16,4;S,1024,256,128,32;S,4096,256,128,32;S,65536,256,128,32"
}}}
Each QP description is delimited by ";" (semicolon) with individual
fields of the QP description delimited by "," (comma). The above
example therefore describes 4 QPs.
The first QP is:
P,128,16,4
Meaning: per-peer receive buffer QPs are indicated by a starting field
of "P"; the first QP (shown above) is therefore a per-peer based QP.
The second field indicates the size of the receive buffer in bytes
(128 bytes). The third field indicates the number of receive buffers
to allocate to the QP (16). The fourth field indicates the low
watermark for receive buffers at which time the BTL will repost
receive buffers to the QP (4).
The second QP is:
S,1024,256,128,32
Shared receive queue based QPs are indicated by a starting field of
"S"; the second QP (shown above) is therefore a shared receive queue
based QP. The second, third and fourth fields are the same as in the
per-peer based QP. The fifth field is the number of outstanding sends
that are allowed at a given time on the QP (32). This provides a
"good enough" mechanism of flow control for some regular communication
patterns.
QPs MUST be specified in ascending receive buffer size order. This
requirement may be removed prior to 1.3 release.
This commit was SVN r15474.
VxWorks. Still some issues remaining, I'm sure.
Refs trac:1010
This commit was SVN r15320.
The following Trac tickets were found above:
Ticket 1010 --> https://svn.open-mpi.org/trac/ompi/ticket/1010
* Move ipv6comat.h code into opal_config_bottom.h and change into some
more intelligent testing of structures
* Change opal's if interface to use sockaddr instead of sockaddr_storage,
as the RFCs suggest we do
* Move the networking code in opal that isn't directly related to if
detection into net.h
* Add quicky function to get the port out of either a sockaddr_in
or sockaddr_in6, saving a bunch of code in the oob.
* Update TCP oob and btl with new interface
This commit was SVN r14679.
via the visibility feature that is provided by some compilers.
Per default this feature is disabled, to enable it you need to
configure with --enable-visibility and obviously you need a compiler
with visibility support. Please refer to the wiki for more information.
https://svn.open-mpi.org/trac/ompi/wiki/Visibility
This commit was SVN r14582.
finally brings in functionality that is already on the 1.2 branch, and
was developed and tested in the v1.2ofed branch (and other places).
Short version of new features:
* Support for ibv_fork_init()
* Automatically fill in the openib BTL bandwidth value by
querying the HCA port
* Installdirs functionality
* Fixes to always use -I in the Fortran wrapper compilers (#924)
* Gleb's mpool updates
* Remove some kruft in btl/openib/configure.m4, therefore
fixing the harmless warnings noted in #665
* Bunches of updates to the Linux RPM spec file
I.e., effectively the same thing that r14411 brought to the v1.2
branch.
Also effectively brought in r14432 and r14433 (some fixes on top of
the original r14411 commit to v1.2). Still need to bring in the moral
equivalent of r14445 after this commit (fixes to installdirs).
This commit was SVN r14449.
The following SVN revision numbers were found above:
r14411 --> open-mpi/ompi@83b31314ae
r14432 --> open-mpi/ompi@a48f160595
r14433 --> open-mpi/ompi@68f346d2bc
r14445 --> open-mpi/ompi@13d366b827
supported, but "q" was for long long. This isn't ANSI
C and causes a warning when using PRI?64 macros. We
don't support versions prior to OS X 10.3, so we dont'
need such backward compatibility. Instead, redefine
the macros to be "ll", which is ANSI C and doesn't
cause a compiler warning.
Fixes trac:868
This commit was SVN r14358.
The following Trac tickets were found above:
Ticket 868 --> https://svn.open-mpi.org/trac/ompi/ticket/868
allocated from mpool memory (which is registered memory for RDMA transports)
This is not a problem for a small jobs, but for a big number of ranks an
amount of waisted memory is big.
This commit was SVN r13921.
the dispatch function opal_atomic_cmpset_acq_xx was used to call
opal_atomic_cmpset_acq_32 would cause the compiler / linker to fix up a
jump address to be itself, leading to an infinite loop.
We're still looking into exactly what caused this, but during the
investigation into the hang, we determined that the compiler (both pathcc
and gcc) weren't always inlining both the call to opal_atomic_cmpset_acq_xx
and opal_atomic_cmpset_acq_32, meaning there was a function call in
opal_atomic_lock. The atomic lock will always be 32 bit, so there's no
need for the dispatch function, so might as well remove the dispatch
function that may or may not be inlined.
A bug fix that leads to potentially better performance. Gotta love the
few times that happens...
This commit was SVN r13651.
to flush all writes pending (ie, the data being protected) out of the memory
manager before we write the spinlock unlock. Only need a wmb instead of
full mb, which is at least slightly less intrusive. Also, after much
thought, no need for a memory barrier in init.
This commit was SVN r13649.
- Set ompi-specific autoconf cache-variables
- Implement one function to check for availability of an
attribute with the possibility for a cross-check.
- Do cross-checks for
__attribute__(format)
__attribute__(nonnull)
__attribute__(sentinel)
__attribute__(warn_unused_result)
- Grep the compilers warnings for keywords regarding ignored
attributes.
- Include also the no_instrument_function
This commit was SVN r13556.
The OMPI_HAVE_ATTRIBUTE flag is always set when the oldest attribute
is supported. But in all cases when the OMPI_HAVE_ATTRIBUTE is set and
one of the attributes is not supported the empty declaration is missing.
This commit was SVN r13390.
most common attributes. Useful attributes may be:
- optimization (pure, const, malloc--restrictness)
- data layout (packed)
- interface definition, API (deprecated, visibility, weak alias)
- compiler hint for code analysis and debugging (nonnull, format)
This commit was SVN r13369.
George wrote the initial patch, I extended it slightly and am responsible for all bugs found.
Refs trac:587
This commit was SVN r13023.
The following Trac tickets were found above:
Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
* Make sure that the pval always writes to the correct portion of the
lval. This only matters on 32 bit big endian machines.
* On 32 bit machines when assigning to pval, the other 4 bytes of lval
weren't being written, which could lead to bogus data
We use macros so that there aren't casts all over the code and the pval
assignment can occur to the correct 4 bytes. Refs trac:587
This commit was SVN r12974.
The following Trac tickets were found above:
Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
process when creating a datatype from an internal description.
Refs trac:640
This commit was SVN r12877.
The following Trac tickets were found above:
Ticket 640 --> https://svn.open-mpi.org/trac/ompi/ticket/640
provide one for the internals of Open MPI. For mpi.h, typedef MPI_Aint
either to ptrdiff_t or whatever we used as ptrdiff_t if that type doesn't
actually exist.
This commit was SVN r12212.
The following SVN revision numbers were found above:
r12146 --> open-mpi/ompi@8852c00c36
there is an exit path out of the loop
* Reformat assembly to match other platforms
* Update base file for non-inline assembly to match changes in
the inline version
This commit was SVN r11803.
* Use $31 instead of mnemonic zero for the gcc inline
assembly test, as the GNU assembler doesn't like
zero, but both Tru64 and GNU assembler should be fine
with $31
* Disable Linux timer component on Alpha. The CPU timer
rolls over every 10 seconds or less, so it's kinda
worthless for our needs.
* Fix some escaping issues when local functions are
denoted with a $
* Remove C++ comments from the Alpha assembly.
* Add base assembly code for the non-inlined functions
on Alpha
This commit was SVN r11764.
- everything statically built (dynamically opened).
- OPAL, ORTE and OMPI static libraries and all the components
as dynamic files(DLL).
- everything as dynamic files (DLL).
This commit was SVN r11461.
the BOOL type predefined on Windows in C does not match the C++ bool type
for the same compiler. One is an int when the other is a char.
Make sure we check for bool for all non C++ compilers.
This commit was SVN r11429.
The same treatement will happens on all sub-projects. The .h files
have to be C++ compatibles and all symbols with an external visibility
have to get the {PROJECT}_DECLSPEC in front of the prototype.
This commit was SVN r11340.
range of OS friendly path management functions, such as opal_basename
opal_dirname. They should always be used instead of basename and
dirname. There are several functions which allow us to create paths
that are compatible with the OS.
The OPAL_ENV_SEP define should be used (instead of ':') when a env
variable is splitted.
This commit was SVN r11336.
One of the most important is the IOVBASE_TYPE which should be used
all over the places for casting to and from the iov_base field of
the iovec struct. The reason is because windows does not support
any kind of implicit conversion (not even through void*). All
conversions should be EXPLICIT.
This commit was SVN r11328.
different macros, one for each project. Therefore, now we have OPAL_DECLSPEC,
ORTE_DECLSPEC and OMPI_DECLSPEC. Please use them based on the sub-project.
This commit was SVN r11270.
need memory barriers to actually do something other than hint
to the compiler not to reorder memory-related instructions. The
IA64 instruction for memory barriers is "mf".
Fixes bug #137.
This commit was SVN r10401.
are 3 arguments: the pointer to the memory location to prefetch, the type of
operation that will be done on the memory (read or write) and the expected
locality.
This commit was SVN r10294.
whether an if statement is likely to be taken and for prefetching memory.
Current macros:
OPAL_LIKELY(expression)
OPAL_UNLIKELY(expression)
OPAL_PREFETC(address)
This commit was SVN r10278.
as I understand how that one works and don't really understand what
was in the amd64 code (which was copied from before I started working
on the inline assembly). This fixes the race condition we were
seeing on PGI causing test failures
* sync non-inline assembly with inline assembly version
This needs to go to the v1.0 branch
This commit was SVN r9539.
* Don't do the .in -> .tmp -> header thing for the prefixes and versions.
It causes some severe cleanup issues all to save 4 files from rebuilding
when configure is run.
* Clean up some makefiles so it's clear what is being installed/disted
This commit was SVN r9260.
installation directories) in configure, the files that depend on this
information are not properly rebuilt. If you need this information,
don't setup a -D in the Makefile.am - instead, include
opal/install_dirs.h.
* Use the : option in AC_CONFIG_FILES to avoid needing to expose that
we are playing around with temporary files with our headers to avoid
rebuilding
* Clean up the version file information a bit, and like the install
directory stuff, make sure that there is a dependency so that
ompi_info gets rebuilt properly when a version number changes.
This commit was SVN r9256.
- moved hton64 and ntoh64 from the bunch of places it had been copied
into one header file
- properly set and use the btl_tcp's nbo option to put things in
network byte order on the wire if both sides don't have the same
endianness
- Put the OB1 PML's headers (with a couple exceptions I need to discuss
with Tim) in network byte order on the wire if both sides don't have
the same endianness
- since it was needed for the TCP BTL, move the orte_process_name_t
HTON and NTOH macros from the TCP OOB to ns_types.h
This commit was SVN r9145.
- move files out of toplevel include/ and etc/, moving it into the
sub-projects
- rather than including config headers with <project>/include,
have them as <project>
- require all headers to be included with a project prefix, with
the exception of the config headers ({opal,orte,ompi}_config.h
mpi.h, and mpif.h)
This commit was SVN r8985.
Windows installation provide the 32 and 64 bits atomic operations and compile
only the relevant part. For 64 bits there is a problem, but I already put a
big comment about it in the configure file ...
This commit was SVN r8480.
originally suggested by Ralf Wildenhues, to try to speed autogen, configure,
and make (and possibly even make install). Use automake's include directive
to drastically reduce the number of Makefile files (although the number of
Makefile.am files is the same - most are just included in a top-level
Makefile.am). Also use an Automake SUBDIRs feature to eliminate the
dynamic-mca tree, which was no longer really needed. This makes adding
a framework easier (since you don't have to remember the dynamic-mca
tree) and makes building faster (as make doesn't have to recurse through
the dynamic-mca tree)
This commit was SVN r7777.