* make buffers really big so that we pass allocmem until we figure out
why we're not flow controlling as I expected
* set event queue to invalid initially and use that as the enabled test
rather than a separate bool - shrinks the module a bit
* add dropped count checks, with a panic if one occurs. Still need to
implement some type of retransmit logic.
This commit was SVN r5704.
- don't free the send buffer unless the converter tells us we need to
- properly do the math to determine when the receive buffer has been
fully used and unlinked itself
This commit was SVN r5703.
* Minor formatting fixes in XGrid RAS component
* Code cleanup in XGrid PLS component:
- If we can't get daemon contact information, kill the job at the XGrid
level
- Add MCA parameter pls_xgrid_delete_job that will delete the job from
XGrid when complete (this seems like standard behavior, so it's the
default)
- Remove compiler warning about getting the name of a XGGrid object
- Properly populate the daemon information for the killing code
This commit was SVN r5697.
more than we have asked for (on my G5). Anyway, now I hope I have enough memory to print out
the full description of the datatype.
This commit was SVN r5690.
Many changes to headers for OMPI_DECLSPEC, and
proper placement of c_plusplus defines in those files.
mca/gpr/replica and tools are the two sets of directories
that still need work for the Windows build for this pass.
This commit was SVN r5688.
- app->num_procs changed to a size_t, which hosed the initialization
of its value to -1 (not sure why the compiler didn't complain
#$%@#$%), which was there to catch the case when the user forgot to
specify -np (or some other equivalent). Fixed.
This commit was SVN r5672.
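The pitfall behind r5672 can be reproduced in a few lines. This is a minimal sketch, not the orterun code: app_t, NUM_PROCS_UNSET, app_init() and np_was_given() are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the application context; not the real ORTE type. */
typedef struct {
    size_t num_procs;
} app_t;

/* Assigning -1 to a size_t is well defined in C: the value wraps to SIZE_MAX.
 * That is why the compiler accepts the assignment without a diagnostic, and
 * why a later "num_procs < 0" style test can never be true. */
#define NUM_PROCS_UNSET ((size_t) -1)

/* Mark the process count as "not given on the command line". */
void app_init(app_t *app)
{
    app->num_procs = NUM_PROCS_UNSET;
}

/* Compare against the same unsigned sentinel instead of testing for < 0. */
bool np_was_given(const app_t *app)
{
    return app->num_procs != NUM_PROCS_UNSET;
}
```

The sentinel and the check use the same unsigned constant, so the "user forgot -np" case is still caught after the field became a size_t.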
- Change all uses of *printf'ing a size_t to use an explicit cast to
(unsigned long) and the %lu escape
- change ORTE_GPR_REPLICA_MAX_SIZE to INT_MAX until bug 1345 is fixed
(i.e., until we allow size_t in MCA params)
- ns_base_local_fns.c:orte_ns_base_get_proc_name_string(): changed
from %0X -> %lu
- ORTE_NAME_ARGS added explicit (unsigned long) casts, and changed all
usages of ORTE_NAME_ARGS to use %lu's
This commit was SVN r5644.
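As an illustration of the cast-plus-%lu convention from r5644, here is a minimal sketch. format_size() is a hypothetical helper, not a function in the tree.

```c
#include <stddef.h>
#include <stdio.h>

/* Format a size_t without relying on %zu: cast explicitly to unsigned long
 * and print with %lu. Returns what snprintf() returns, i.e. the number of
 * characters that would have been written, excluding the terminator. */
int format_size(char *buf, size_t buflen, size_t value)
{
    return snprintf(buf, buflen, "%lu", (unsigned long) value);
}
```

One caveat with this approach: on platforms where unsigned long is narrower than size_t (64-bit Windows, for example), values above ULONG_MAX are truncated by the cast.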
As an FYI: the pack/unpack routines should be happy with a NULL string (and appear to be so). Issue here was that the constructor was not called, which means that the string pointer was not initialized to NULL as it ordinarily would have been.
This commit was SVN r5639.
1. Instead of removing various src/ component directories, simply
"flatten" the Makefile.am structure by having only a single
top-level Makefile.am for the component, and having it include
src/Makefile.extra (which is where the source files are listed).
This effectively makes the build faster because "make" does not
traverse down into src/, and we don't build a Makefile for that
directory.
2. Did end up moving topo/unity/src/* into topo/unity, which is where
I figured out that option #1 would be a bit easier (and safer,
considering that other developers are actively working in various
src/ directories -- moving things around while they're working
would be Bad!)
3. Did not consolidate most of the io/romio component because of the
nightmare of sym links (especially w.r.t. VPATH builds) in the
included ROMIO distribution. I wasted too much time trying to get
that stuff right and finally gave up -- this is a "low hanging
fruit" optimization, after all.
This commit was SVN r5629.
- sends/recvs short messages (less than first frag size)
- does not properly ACK messages, so Ssend() is borked
- leaks memory like there's no tomorrow
- don't use it just yet
This commit was SVN r5625.
1. Added pid_t to the dps
2. Processes now "register" their local pid and update their location (i.e., nodename) on the registry during mpi_init
3. Added a new error code for values that exceed maximum for their data type (useful when transitioning a value from one variable to another of different size)
4. Fixed a few places where size_t was being incorrectly handled
5. Updated dps_test to cover pid_t types
This should now provide support for TotalView connection - which David is pursuing.
This commit was SVN r5623.
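Item 3 of r5623 (an error code for values that exceed the maximum of their data type) could be used along these lines. This is a sketch only: the EX_* codes and u64_to_u32() are hypothetical and are not the names used in ORTE.

```c
#include <stdint.h>

/* Hypothetical return codes for illustration; the real ORTE constants differ. */
enum {
    EX_SUCCESS = 0,
    EX_ERR_VALUE_OUT_OF_BOUNDS = -1
};

/* Move a value from a 64-bit variable into a 32-bit one, refusing any value
 * that would be silently truncated by the narrowing conversion. */
int u64_to_u32(uint64_t in, uint32_t *out)
{
    if (in > UINT32_MAX) {
        return EX_ERR_VALUE_OUT_OF_BOUNDS;
    }
    *out = (uint32_t) in;
    return EX_SUCCESS;
}
```

On failure the destination is left untouched, so the caller can report the error without having to undo a partial update.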
biggie), so we gain nothing there. On 10.4, it's implemented directly,
but doesn't support devices (which messes up pty support and IO
forwarding).
This commit was SVN r5621.
on all 64-bit architectures. The problem was that, for unpack, the source pointer was cast to a
specific type (uint32_t for 32-bit data) and then hton* was applied. The result was ... unexpected.
This patch always memcpy's the data into a temporary variable of the correct size before calling
the ntoh* functions, so we can ensure that the data is always correctly aligned.
Moreover, I added a debugging layer. OMPI_OUTPUT is used to print out the data being packed and
unpacked. It generates a lot of output but should hopefully allow us to spot a few bugs. This layer is not
complete; the output stream descriptor is set to -1 (no output).
This commit was SVN r5617.
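The copy-then-convert pattern described in r5617 looks roughly like the sketch below. unpack_u32() is a hypothetical name, not the routine in the dps code; ntohl() comes from the POSIX <arpa/inet.h> header.

```c
#include <arpa/inet.h>  /* ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value in network byte order from a buffer position that may
 * not be 4-byte aligned. Casting src to (const uint32_t *) and dereferencing
 * it is undefined behavior for a misaligned address; copying the bytes into
 * a correctly typed local first avoids that. */
uint32_t unpack_u32(const void *src)
{
    uint32_t tmp;

    memcpy(&tmp, src, sizeof(tmp));
    return ntohl(tmp);
}
```

Compilers typically turn the fixed-size memcpy() into a plain load where the architecture allows it, so the safe version usually costs nothing extra.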
Anyway, now I'm able to run on several 64-bit architectures (Athlon and G5), so
I suppose that we are back online on 64 bits.
This commit was SVN r5616.
everything in one directory. Still have only one Makefile, so it shouldn't
change build time at all
* Now that I finally understand the header system for data, refactor a little
bit of the code to match what really should be happening
* start of a hacked up send() - puts the data for a 0 byte message on the
other side, and all the pointers are where I think they should be. So
my plan of attack will work. But I think I'm going to have to use
iovecs instead of memcpy() real soon now.
This commit was SVN r5610.
one is selected it will be used for all purposes: small messages and long messages (even if the
long message is still split into several fragments). When 2 PTLs exist per peer,
the first one is for latency (small messages and rendezvous requests) while the second one
will be used for bandwidth.
This commit was SVN r5600.
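The selection rule in r5600 can be sketched as follows. The types ptl_t and peer_t and the function select_ptl() are hypothetical and heavily simplified; they do not reflect the actual PML/PTL structures.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real PTL and peer structures. */
typedef struct {
    const char *name;
} ptl_t;

typedef struct {
    ptl_t *latency;    /* always set: small messages and rendezvous requests */
    ptl_t *bandwidth;  /* NULL when only one PTL exists for this peer */
} peer_t;

/* Pick the PTL for a fragment. With a single PTL everything goes through it;
 * with two, only the bulk fragments of a long message use the bandwidth PTL. */
const ptl_t *select_ptl(const peer_t *peer, int is_bulk_fragment)
{
    if (is_bulk_fragment && peer->bandwidth != NULL) {
        return peer->bandwidth;
    }
    return peer->latency;
}
```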
Jeff sent me the way to do that automatically, and I'm pretty sure I'm not the only one who missed some
of the functionality of our build system. The idea is really cool: let only the developer of a
component have it active until it reaches a stable state. For everyone else, the .ompi_ignore
file prevents the component from being compiled.
cd src/mca/pml/uniq
echo $USER > .ompi_unignore
svn add .ompi_unignore
svn ci .ompi_unignore
This commit was SVN r5595.
The idea behind this PML is to minimize the overhead of managing multiple PTLs. For each node, UNIQ keeps two PTLs:
one for latency and one for bandwidth. In the next version I want to add a configure parameter to allow the user
to select how many PTLs they want: one or two.
This commit was SVN r5593.
based around PTL_MD_MAX_SIZE, which apparently isn't implemented in
Cray's Portals implementation. Time to rethink that design :/
This commit was SVN r5576.
HEADS UP: string versions of names are now presented in DECIMAL format - not HEX as they previously were. If you used the name services functions (as you were supposed to do) to access these names, you will not have any problems. If you did it yourself, then you need to fix it - my suggestion would be that you fix your code by using the name service functions to avoid future problems.
This commit was SVN r5571.
1. *correctly* fix the printing of size_t variables. Need to do this through a #define, not just typecast things. Thanks to Jeff/Brian for suggesting a cleaner way to do it (as opposed to just doing the #define at the print location). Note that not ALL of the prints have been "fixed" yet - will continue to identify them.
2. Add int64 and size_t to the pack/unpack unit tests.
3. Fix a bug in the int64 pack/unpack system.
This commit was SVN r5570.
the trick: I decided to always print it as an unsigned long and explicitly cast everything to this type.
Thus, I changed all printf formats from %d to %lu and cast all arguments to the correct type (unsigned long).
This commit was SVN r5568.
when dps_internal.h gets touched. Anyway, the name says it should be internal to the dps system, so
there is no reason to have it included everywhere.
This commit was SVN r5555.
Merged in from:
svn merge -r5506:5553 https://svn.open-mpi.org/svn/ompi/tmp/hetero .
This commit was SVN r5552.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r5506
r5553
Merged from:
svn merge -r5496:5506 https://svn.open-mpi.org/svn/ompi/tmp/hetero .
This commit was SVN r5551.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r5496
r5506
Merged in from:
svn merge -r5448:5496 https://svn.open-mpi.org/svn/ompi/tmp/hetero .
This commit was SVN r5550.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r5448
r5496
from:
svn merge -r5440:5448 https://svn.open-mpi.org/svn/ompi/tmp/hetero .
This commit was SVN r5549.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r5440
r5448
soon.
Make ORTE_EXIT_CODE be the same as INT32, not INT8. This allows the
full propagation of the value returned by waitpid() rather than just
the lowest 8 bits. Also change the naming of it in orterun to be
exit_status, not exit_code (per POSIX standard naming convention).
orterun now returns the first nonzero exit status that it receives.
This commit was SVN r5530.
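The "first nonzero exit status" rule from r5530 amounts to the following. This is a sketch with a hypothetical helper, first_nonzero_status(), operating on an array of already collected statuses; it is not the orterun implementation.

```c
#include <stddef.h>

/* Return the first nonzero entry of statuses[], or 0 if every process
 * reported 0. The entries are kept as full ints so that no bits of the
 * value obtained from waitpid() are lost on the way. */
int first_nonzero_status(const int *statuses, size_t n)
{
    size_t i;

    for (i = 0; i < n; ++i) {
        if (statuses[i] != 0) {
            return statuses[i];
        }
    }
    return 0;
}
```

Keeping the full width matters because a raw wait status can be nonzero only in its upper bits; stored in an 8-bit field, such a value would collapse to 0 and look like success.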
functions we destroy the frame pointer for the next call, so very, very weird things
happened. Like a seg fault on i = 50!!!
Both the 32- and 64-bit versions have been modified, but only the 32-bit version has been tested (for lack of resources).
This commit was SVN r5525.
The following SVN revision numbers were found above:
r2 --> open-mpi/ompi@58fdc18855
status that we get (which naturally returns 0 if all return 0). This
should pick up nonzero returns from main() after MPI_FINALIZE, but the
gpr is still reporting 0 while testing. So orterun looks correct for
this behavior -- investigating what's happening on the ORTE side is a
different commit...
This commit was SVN r5521.
MPI_COMPLEX*x, and some optional C datatypes in MPI reduction
operations. These types are not technically supported by the letter
of the MPI standard, but are implied by the spirit of it (and there
are definitely users that use them in real applications)
- Add checks in configure for back-end C types for MPI_INTEGER*x and
MPI_REAL*x
- Create C data structs for MPI_COMPLEX*x
- Fixed typo for MPI_INTEGER8 in mpi.h
- Updated configure macros to create MPI_FORTRAN_INTEGER* defines, as
opposed to MPI_FORTRAN_INT, which was causing [me] lots of confusion
(between C "*_INT" names and Fortran "*_INT" names). This caused
some trivial updates in ddt, ompi_info, and the MPI layer to match.
- Update ompi_info to show whether we have each MPI_INTEGER*x,
MPI_REAL*x, and MPI_COMPLEX*x
- Extended reduction operations for optional datatypes:
- "C integer" now includes long long int, long long, and unsigned
long long
- "Fortran integer" now includes MPI_INTEGER*x
- "Floating point" now includes MPI_REAL*x
- "Complex" now includes MPI_COMPLEX*x
This commit was SVN r5511.
the whole datatype code, make it a little bit more readable and add some
additional checks for correctness. At the same time, I moved some internal structures
from the external .h include to the internal one.
ddt_test.c gets one more datatype to test. This one looks like those used
in the BLACS test code.
This commit was SVN r5498.
ompi_progress, unless someone is actually using it (MPI-2 dynamic,
TCP PTL). This is only for end of MPI_Init to start of MPI_Finalize.
All other times, the event library will be progressed every call
into ompi_progress().
This commit was SVN r5479.
was getting confused with the recent addition of "-lutil" to get the
system-level libutil (i.e., Libtool was confusing the two and doing
Bad Things).
This commit was SVN r5467.
Ralf W (core libtool developer). There are still a few more to plug
in ompi_info (mainly concerned with shutting down OMPI/ORTE
subsystems), but they can wait...
This commit was SVN r5466.
we are part of the source tree and not defined otherwise, we are going
with an "always defined if ompi_config.h is included" policy. If
ompi_config.h is included before mpi.h or before OMPI_BUILDING is set,
it will set OMPI_BUILDING to 1 and enable all the internal code that
is in ompi_config_bottom.h. Otherwise, it will only include the
system configuration data (enough for defining the C and C++ interfaces
to MPI, but not perturbing the user environment).
This should fix the problems with bool and the like that the Eclipse
folks were seeing. It also cleans up some build system hacks that
we had along the way.
Also, don't use int64_t as the default size of MPI_Offset, because it
requires us including stdint.h in mpi.h, which is something we really
shouldn't be doing.
And finally, fix a ROMIO Makefile that didn't set -DOMPI_BUILDING=1,
as ROMIO includes mpi.h, but not ompi_config.h
This commit was SVN r5430.
Comment out pipes stuff for Windows. Need to come back and fix this properly in the future.
--This line, and those below, will be ignored--
M iof_base_setup.c
This commit was SVN r5424.
* Update cmpset test to call memory barrier when needed before checking the
results
* remove unneeded sync from cmpset_32 on Power PC
This commit was SVN r5420.
- argv[0] should be the name of the executable for the spawned processes.
- if we free a dynamic communicator (instead of disconnecting),
the counter for dynamic communicators has to be decreased as well,
else we core in finalize.
This commit was SVN r5419.
- Print error messages with the basename(argv[0]) rather than
hard-coded argv[0] so that you can see an error message beginning
with "mpirun" when you run mpirun, etc.
- For all processes that died due to a signal:
- If the signal was not SIGKILL, display the first N of them (where
N defaults to 1)
- If more than N processes died due to a non-SIGKILL signal, print
"And X more processes aborted..." kind of message
- Add --aborted command line parameter to change the default value
of N
- Also print out the total number of processes that died due to
SIGKILL, with a disclaimer that it's impossible to know if we
killed them or someone else killed them
This commit was SVN r5406.
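The basename(argv[0]) prefix from the first item of r5406 can be sketched like this. format_error() is a hypothetical helper, not the orterun code; basename() is the POSIX function from <libgen.h>.

```c
#include <libgen.h>  /* basename (POSIX) */
#include <stddef.h>
#include <stdio.h>

/* Build "<program>: <msg>" using only the last path component of argv0, so
 * the message starts with "mpirun" when the user ran /some/path/mpirun.
 * POSIX allows basename() to modify its argument, so it is given a private
 * copy rather than argv[0] itself. Paths longer than the copy buffer are
 * truncated in this sketch. */
int format_error(char *out, size_t outlen, const char *argv0, const char *msg)
{
    char copy[4096];

    snprintf(copy, sizeof(copy), "%s", argv0);
    return snprintf(out, outlen, "%s: %s", basename(copy), msg);
}
```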