- Remove printing of CFLAGS in configure.m4
- Set MCA_BTL_FLAGS_SEND flag
- Improved error handling during module initialization
- Extract the address of each interface with dat_ia_query
- Start playing around with fragment stuff - probably wrong
- Misc code cleanup (removal of GM-specific code)
This commit was SVN r8801.
if-clause, getting rid of local schedule variable.
This commit was SVN r8778.
The following SVN revision numbers were found above:
r8771 --> open-mpi/ompi@2fadddebc8
new wrapper compilers for the OMPI layer. This should require no changes
at all for anyone (other than running autogen, of course)
This commit was SVN r8772.
know that the request is inactive we don't have to call fini). Remove if's from
the critical path. Change a do in while to make sure we do the minimum in all cases.
All in all it decrease the latency for mvapi by something between 0.15 and 0.20
micro-seconds. But it's just a first step ...
This commit was SVN r8770.
increased the datatype ref count, and there is no way to have a convertor without a valid
request, so there is no need for the convertor to increase the reference count of the datatype
again. Therefore, if one want to use the datatype for something else than MPI request, one
has to manage the reference counts outside the datatype layer.
This commit was SVN r8769.
thread finish a request between the moment when we check the request status and the moment
when we acquire the lock, and if there are no more pending request on the pipe we will be
stuck foreverr as nobody have any reason to broadcast the request condition. I check the
req_any and it doesn't contain the same bug.
This commit was SVN r8760.
the descriptors to vanish. The PML was thinking that they are in the btl_cache when they
weren't ... It lead to memory consumption on most environments when compiled with
thread enabled. After modification the latency went down by nearly 0.5 microseconds.
Simple way to trigger the bug: limit the number of maximum items in the free list and run
any communication intensive application (like Netpipe).
This commit was SVN r8741.
- Move mca_btl_udapl_error/mca_btl_module_init to mca_btl_udapl.c and rename it
- White space cleanups
- Free the uDAPL evd and ia handles in mca_btl_udapl_finalize
This commit was SVN r8705.
- fall back to compile test for windows paffinity component
when cross compiling
- fall back to platform guess when checking for threads having
different pids with pthreads (yes on linux, no elsewhere)
- pass the proper host, target, and build flags to the
ROMIO configure script
With these changes, cross-compiling should be possible with the exception
of the Fortran 77 and Fortran 90 bindings. Fortran 77 can be cross-
compiled if cache values are provided for type sizes and alignment.
This commit was SVN r8702.
- Borrow configure.m4 from the mvapi btl. One of the uDAPL headers emits a
warning when -pedantic is enabled, so strip it out.
- Change function check in ompi_check_dapl.m4 from dat_ia_open to
dat_registry_list_providers.. dat_ia_open wasn't working right
- Make the references to prepare_dst, put, and get NULL for now
- Add opal_output() calls in all the udapl interface functions for debugging
- Add evd_qlen component parameter to control event dispatcher queue length
- First stab at component_init and module_init
- Misc cleanups - whitespace, dead code removal
- Update copyrights to 2006
This commit was SVN r8701.
r8698), with changes below:
- Split wrapper flags into those required for each of the three projects,
and cleaned up some cruft (including the LIBMPI_EXTRA_*FLAGS) through-
out the build system
- Added opal_init_util and opal_finalize_util to allow init / cleanup
of all the opal code that doesn't require the MCA system
- Create standalone key=value file parser, based on the one that used
to be in the mca param parser, so that it can be shared in multiple
places
- Add wrapper datafiles for opal, orte, and ompi wrappers, and add
wrapper compiler with support for all the old features
This commit was SVN r8699.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r8690
r8698
this was implemented using a chain (tree followed with pipeline) by setting the chain fanout to a factor of size etc but the chain datastructure was fixed in length and if exceeded the topo create returned a null which isn't helpfull in cid next function of comdup...
Anyway two fixes, first we do have a real linear function so changed the decision function and second altered the
topo chain create to force chain fanouts of less than 1 to 1 and fanouts bigger than max to max.
next check in will change chain to dynamically allocd array (reallocable) but we shouldn't ever use a chain fanout for a linear tree anyway.
(lession must rerun all tests for all data sizes when changing decision functions)
This commit was SVN r8662.
As the common part is the one that create the shred memory file it seems
logical to make it destroy the file as well. Therefore, the code for
unmapping the file is in a common place.
This commit was SVN r8622.
On some systems (like windows) qsort can modify the order of modules when the
comparaison function return 0. As we expect to have the
mca_btl_sm_add_procs_same_base_addr called before mca_btl_sm_add_procs we have
force the qsort to return the SM modules in the correct order. Giving to the
same_addr module a slightly higher priority solve this problem.
This commit was SVN r8620.
(windows). Instead use the LN_S variable exported by the Makefile (set to
"ln -s" on all Unixes and to "cp -p" on windows).
When we remove an executable use the correct extension for its name
(add $(EXEEXT) to the name).
This commit was SVN r8616.
implementation was not thread safe). See lengthy comment in
ompi/mpi/cxx/intercepts.cc::ompi_mpi_cxx_op_intercept() for a full
explanation.
This commit was SVN r8606.
(apparently we've been doing this in opal and orte, but not in ompi
yet). All public symbols begin with "ompi_coll_tuned_" (not
mca_coll_tuned_) except the component struct. Now this component
passes the illegal symbol report with no hits.
This commit was SVN r8589.
testing. Note that this effectively replaces the "basic" component as
the baseline collective component. Please report any problems with
this component.
If you run into problems with this component, you can disable it with:
--mca coll_tuned_priority 0
This commit was SVN r8575.
confuses the Lahey compiler (because it doesn't want to name a module
mpi_kinds.ompi_module). We shouldn't need this because we're already
touching a sentinel file (mpi_kinds.ompi_module) for make
dependencies; we don't need the compiler-generated module to be named
that.
This commit was SVN r8571.
- Copied the template BTL and renamed everything
- Compiles and shows up correctly in ompi_info, not tested past that
- Should be ignored for everyone but me
This commit was SVN r8544.
deallocation came from the allocator (malloc, fee, etc) or somewhere
else (the user calling mmap/munmap, etc). Going to be used by Galen
to determine if it is worth searching the allocations tree
Set flag if it is possible to intercept mmap (not always possible
due to a circular dependency between mmap, dlsym, and calloc)
This commit was SVN r8521.
- Fix some typos from last commit
- Add collectives of intercommunicators
- Move the static current_op member from Intracomm to Comm
--> this is still a remaining problem: the global variable
current_op is not thread safe!
This commit was SVN r8520.
the same algorithm as in the zlib library, as it is supposed to be the fastest
one and still give good error detection. For those who want to activate the
checksum, there is a define at the bottom of the datatype_internal.h file.
This commit was SVN r8509.
Get the diff down as close as possible, so only whitespace and a few
rearrangements of code (in mca_pml_teg_add_procs, uniq is nicer to
read), so awked diff of both is minimal.
This commit was SVN r8489.
- Test for MPI_REQUEST_NULL within the often executed loop not
necessary, but left as a comment
- The req_fini may potentially return an error, do check for it in
req_wait.
This commit was SVN r8488.
- remove windows socket initialization (it's already in the TCP component)
- protect all used header files
- remove the unused ones.
This commit was SVN r8434.
empty for non threaded builds, but somehow just by moving the code a
little bit around and removing 2 call to lock/unlock the latency for TCP
went down by 2 micro-seconds ...
This commit was SVN r8426.
number of recv system call by caching the data. Each endpoint has a buffer
(the size is an MCA parameter) that can be use as a cache. Before each receive
operation this buffer is added at the end of the iovec list. All data that are
not expected by the fragment will go in this cache. If the cache contain data
all subsequent receive will just memcpy the data into the BTL buffers.
The only drawback is that we will spin around the receive_handle until all the
cached data is readed by the PML layer. This limitation come from the fact that
the event library is unable to call us if there is no events on the socket.
Therefore we are unable to keep the data in the cache until the next loop
into the progress engine.
This commit was SVN r8398.
the script did not complete successfully
- ROMIO likes to default to using the system compiler if no compiler is
specified. This can lead to using a different compiler for ROMIO as
for Open MPI, which is not always a good thing. Reset the default to
behave the same way Open MPI does.
This commit was SVN r8380.
- when processing out of order list - reset match to null on each iteration
- check matched request type and if probe - complete probe and queue fragment
on unexpected list
This commit was SVN r8339.
the page address three situations:
1. if the array of requests are all REQUEST_NULL, set flag=true and
index=MPI_UNDEFINED
2. if the array of requests are not all REQUEST_NULL, if any of them
finished, set flag=true and index=whatever_the_index_was
3. if the array of requests are not all REQUEST_NULL, if none of them
finished, set flag=false and index=MPI_UNDEFINED
With regards to the index value, we are currently doing 1 and 2, but
not 3. More specifically, index should be set to MPI_UNDEFINED if no
requests are found to have completed (regardless of whether they are
REQUEST_NULL or not).
This patch fixes this problem.
This commit was SVN r8319.
functionality. This is a global registration, so all components have to be made
aware of the registration -- therefore, do things at the component level, not the
module level.
This commit was SVN r8314.
case of:
sizeof(MPI_Flogical) != sizeof (int)
and
Fortran value of .TRUE. != 1
as is often the case.
- Check in configure the value of .TRUE., the C-type coresponding to
logical and check, that fortran compiler does not do something strange
with arrays of logicals
- Convert all occurrences of logicals in the fortran wrappers, only
in case it is needed.
*Please note* Implementation of MPI_Cart_sub needed special treatment.
- Output these value in ompi_info -a
- Clean up the prototypes_mpi.h to just have a single definition and
thereby deleting the necessity for prototypes_pmpi.h
- configured, compiled and tested with F90-program, which uses
MPI_Cart_create and MPI_Cart_get:
linux ia32, gcc (no testing, as no f90)
linux ia32, gcc --disable-mpi-f77 --disable-mpi-f90 (had a bug there)
linux ia32, icc-8.1
linux opteron, gcc-3.3.5, pgcc, pathccx/pathf90 (tested just
pgi-compiler)
linux em64t, gcc, icc-8.1 (tested just icc)
This commit was SVN r8254.
increase the previous connection code was broken. It can take as much as 60 seconds to connect
64 processes. Now we do not create the connections when we add the procs but only when we send
them the first message. Now it take only 1.6 seconds to setup a 64 procs MPI job over MX (doing a 2 steps barrier in order to insure that we create all the connections).
This commit was SVN r8252.
displacement (for both the inter and intra-communicator version). The
displacements in scatterv are given in multiples of the sendtype.
This fix should probably make to v1.0.1 as well?
This commit was SVN r8251.
argument is defined to be
integer (KIND=MPI_ADDRESS_KIND)
we have to map that to an MPI_Aint in the f->c interface, else the
typecast back from the C to the Fortran interface will be wrong as well.
The same holds (at least) for all other new MPI-2 datatype functions
which take a byte-displacement or an address as an argument.
This commit was SVN r8243.
Thanks to Anthony Chan for pointing this out.
Note that these will only work for the Fortran compiler that Open MPI
was configured with -- since these values, are, by definition,
single-value, they cannot support all 4 values that Open MPI may
generate for the different Fortran name-mangling schemes. A lengthy
comment in ompi_mpi_init.c explains this in more detail. Added to the
README to explain this situation, as well as the forthcoming
.TRUE. Fortran fixes.
This commit was SVN r8231.
* turns out (duh!) that there was a reason that the <projectdir>dir
variable was set in the AM conditional. If not, stupid directories
are created and not needed... duh.
This commit was SVN r8205.
component/base Makefile.am files, reducing the time configure spends
stamping out Makefiles at the end
* Install base_impl.h file when devel-headers are being installed
This commit was SVN r8200.
- Separate out the registration of the MCA params into a standalong
function that is invoked by ompi_mpi_init() (so that ompi_info can
see these params)
- Rename the params to "mpi_ddt_*" instead of "datatype_*" so that
they fit into the common naming scheme
This commit was SVN r8196.
do for mpirun/orterun. This will allow -mca btl foo,self to work as
expected when doing MPI_COMM_SPAWN and friends.
This should be pushed to the v1.0 branch
This commit was SVN r8170.
is labeled as internal so the users will not see it but it is not read-only so we can still
play with it (that's for our internal tests). This is supposed to dissapear later after the
next (or next next) release of the MX library, but we need it now as a quick fix before the
release.
This commit was SVN r8161.
Remove the old function from the dt_unpack.c and activate the new one from dt_copy.c
Add a MCA param ompi_copy_debug to get messages about the local memory copies in the new function.
Slightly change the prototype of the function to keep the compilers happy on some platforms.
This commit was SVN r8142.
- remove dead code that isn't used anywhere (originally ompi_fifo_t
was going to be a generalized class, but now it's exclusively used
in the sm stuff, so there's no point in the generalized code that
definitely *won't* work with the sm btl, or is not being used now
[SVN always has history so we can go back])
- had to add an interprocess lock in the area where the writer may
create a new circular buffer to ensure that the reader's tail
doesn't accidentally end up back in the same old buffer while the
head continues on to a new circular buffer (this was what was
happening to cause some intel tests to hang -- e.g., MPI_Scan_c,
MPI_Send_fairness_c and MPI_Isend_fairness_c). Unbelievably, this
may actually *increase* performance because it may order things
better. Will do performance testing tomorrow. We're fairly certain
that this lock can probably be removed and the code fixed in a
different way, but we're under a deadline and correctness comes
first, so it's been added to the to-do list to come back and
re-examine this case later.
This commit was SVN r8136.
- Add big comment about a general overview of what the sm btl is doing
- random small code cleanups
- fix instances of mca_btl_sm[0] to mca_btl_sm[1] where relevant
- remove a lot of unused, confusing, and incorrect interface functions
from ompi_fifo.h and ompi_circular_buffer.h. These functions, if
they were used, would not work properly with the scheme that the sm
btl uses with the fifos (i.e., receiver makes right -- if necessary)
- add some missing offset computations in the fifo and circular buffers
- change the types of offsets to be ssize_t, not size_t
- remove an offset parameter from a function that didn't need it
This commit was SVN r8135.
- Ensure to convert base_shared_mem_flags to be a relative offset in
the global storage, and then to convert that back to an absolute
virtual address before we try to use it
- Don't double increment n_local_procs when calculating the peer rank
during bootstrapping of the different base address case
Something else is still wrong; if mmap() returns a different base
address, things don't work (i.e., segv or hang forever when you try to
send a message). More specifically, the bootstrapping now seems to
correctly handle the case when mmap() base addresses are different,
but the message passing does *not* -- it always assumes that the
mmap() base addresses are the same.
Still working on the fix for that -- want to checkpoint what has been
done so far to facilitate working on different machines...
This commit was SVN r8134.
Lots of misc fixes: printfs->opal_output, handles fanin/out correctly for forced ops
unused vars, correct calculations on meaning of 'msgsize' for decision functions
(varies depending on algorithm), etc
This commit was SVN r8113.
When compiling C++ code that includes something that looks for the C++
header file "memory" (stupid C++ headers not having .h extensions), it
goes through the header file search path, which includes $(topsrcdir)/opal,
so it finds the directory $(topsrcdir)/opal/memory/ and tries to load
that as the memory header file and all goes downhill.
This commit was SVN r8111.
functions (e.g., MPI_REDUCE). We don't generate the back-end
subroutines for them (because it makes an expontential number of
subroutines, and compilers literally will segv), so we shouldn't
generate the f90 interfaces for them, either. This allows user's MPI
F90 apps to automaitcally fall through to the F77 bindings for these
functions.
This commit was SVN r8094.
name was changed to shorten it too early (and then not restored), so
the "interface" name was not output correctly into
mpi-f90-interfaces.h. Change to make it like the other long functions
-- temporarily change it to a shorter name while outputing the
subroutines, and then revert it when outputting the end interface.
This commit was SVN r8086.
- first we setup the connections in the begining with all the peers
- MX does not handle well the case where several peers make connections to the same
destination simultaneously.
So I change the order in which we connect. First we compute our rank in the array,
then in a round-robin fashion we setup connection starting with our left neighboard.
This commit was SVN r8075.
This flag is inherited by all datatypes create with unavailable datatypes. Basically, we let the user create the wrong datatype but we dont let him using it for any pt2pt communications or any pack/unpack.
This commit was SVN r8069.
comm_fill_rest there is no need for calling ompi_set_group_rank, since
we know already the rank of the process in the new comm. In case the
process was not part of the new communicator (rank = MPI_UNDEFINED)
calling this function caused a segfault on some platforms.
This commit was SVN r8060.
Limit the amount of data to be packed to the remaining on the convertor. This make the things a lot simpler in the pack/unpack functions.
This commit was SVN r8028.
Do not ignore the type and extent of the last optimized basic type in some special cases.
Update the last fake END_LOOpP with the correct value for the first_elem_disp field.
This commit was SVN r8023.
never stored on the stack. It is partially stored on the stack depending on the loops
but every time we pack/unpack a basic datatype we take in account again it's displacement.
This approach make the whole logic a lot simpler. In same time I split the big functions
in several basic block.
This commit was SVN r8021.