Rename some internals to conform better to the rest of the project.
Don't use a fragment for the ack on the match; just use an already registered buffer.
Delete a useless file (ptl_gm_addr.h). The structure is already present in the ptl_gm_peer.h file.
This commit was SVN r3933.
- rendez-vous protocol for long messages (internally used by the GM driver)
- first step toward overlapping registration/transfer
- use the correct type (size_t and uint64_t) for some internal types to conform to the PML layer
Some minor changes:
- remove some useless macros
- clean-up the GM defines
- rename some GM MCA parameters
- correctly use the limit between eager and rendez-vous protocol
- speed up the code a little (don't allocate useless fragments)
- when allocating a fragment, set all useful fields in the struct to zero
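The eager/rendez-vous limit mentioned in the list above can be sketched as follows. This is only an illustration, not the PTL code: `proto_t`, `choose_protocol` and the `eager_limit` parameter are hypothetical names, and treating a message of exactly `eager_limit` bytes as eager is an assumption.

```c
#include <stddef.h>

/* Hypothetical protocol tags, for illustration only. */
typedef enum { PROTO_EAGER, PROTO_RENDEZVOUS } proto_t;

/* Pick the protocol from the message size.
 * Assumption: a message of exactly eager_limit bytes still goes eager;
 * anything larger uses the rendez-vous protocol. */
static proto_t choose_protocol(size_t msg_size, size_t eager_limit)
{
    return (msg_size <= eager_limit) ? PROTO_EAGER : PROTO_RENDEZVOUS;
}
```

Using size_t for both arguments matches the type change listed above; the comparison is where an off-by-one on the limit would show up.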
This commit was SVN r3927.
Use the GM FAST event to avoid a call to gm_unknown.
Don't allocate a fragment for the match message (either on the send or the receive side).
This commit was SVN r3905.
first invocation of MPI_File_open or MPI_File_delete (whichever is
first). The io framework is then only closed down if it was
successfully opened.
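The open-on-first-use and close-only-if-opened behaviour described above might look roughly like this. It is a sketch under assumptions: every name here (`io_ensure_open`, `io_finalize`, the stub open/close hooks and their counters) is hypothetical, and it is not thread-safe.

```c
#include <stdbool.h>

/* Hypothetical state and hooks; none of these names come from the commit. */
static bool io_opened = false;
static int open_calls = 0;
static int close_calls = 0;

/* Stand-ins for the real framework open/close; 0 means success. */
static int io_framework_open(void)   { open_calls++; return 0; }
static void io_framework_close(void) { close_calls++; }

/* Called at the top of the file-open and file-delete entry points:
 * the framework is opened on whichever call comes first. */
static int io_ensure_open(void)
{
    if (io_opened) {
        return 0;
    }
    int rc = io_framework_open();
    if (rc == 0) {
        io_opened = true;
    }
    return rc;
}

/* Called at shutdown: close only what was successfully opened. */
static void io_finalize(void)
{
    if (!io_opened) {
        return;
    }
    io_framework_close();
    io_opened = false;
}
```

A program that never touches the file routines never pays for opening the framework, and shutdown does nothing in that case.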
This is the first [atomic] step to having a progress thread in the
ROMIO component; it wasn't strictly *necessary*, but it's logically
the same direction and provided a good test case.
This commit was SVN r3895.
independently of the size of the data. Strangely, I get nearly the same performance as NetPipe GM (which uses registered memory), and that is really close to the maximum performance of the Myrinet cards available on the cluster. However, the load on the CPU is higher.
I still have to investigate exactly how this fits with the send/recv of non-contiguous datatypes.
This commit was SVN r3877.
Look up the peer information only when we need it for later use.
Small optimizations (move some functions into the .h files and turn them into static inline).
Cleanups, cleanups and finally cleanups ...
This commit was SVN r3870.
now have a limited stack attached. If we handle contiguous data then we will use this stack, avoiding the free/malloc for the stack management. In
all other cases the convertor works as before: a stack containing the required number of elements will be allocated. This small modification
decreases the latency for GM by nearly 0.7 microseconds as reported by NetPipe.
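The attached-stack idea above can be sketched as follows. This is an illustration only: `convertor_t`, `stack_elem_t`, `INLINE_STACK_DEPTH` and both functions are hypothetical names, and the depth of 5 and the field layout are assumptions rather than the real convertor definition.

```c
#include <stdlib.h>
#include <stddef.h>

#define INLINE_STACK_DEPTH 5   /* hypothetical size of the attached stack */

/* Hypothetical stack element. */
typedef struct { size_t index; size_t count; } stack_elem_t;

typedef struct {
    stack_elem_t  inline_stack[INLINE_STACK_DEPTH]; /* lives inside the convertor */
    stack_elem_t *stack;       /* points at inline_stack or at heap memory */
    size_t        stack_size;  /* number of usable elements */
} convertor_t;

/* Contiguous data that fits uses the attached stack (no malloc);
 * everything else gets a heap stack of the required size.
 * Returns 0 on success, -1 if the allocation fails. */
static int convertor_prepare_stack(convertor_t *c, int contiguous, size_t needed)
{
    if (contiguous && needed <= INLINE_STACK_DEPTH) {
        c->stack = c->inline_stack;
        c->stack_size = INLINE_STACK_DEPTH;
        return 0;
    }
    c->stack = malloc(needed * sizeof(stack_elem_t));
    if (c->stack == NULL) {
        return -1;
    }
    c->stack_size = needed;
    return 0;
}

/* Free the stack only if it came from the heap. */
static void convertor_release_stack(convertor_t *c)
{
    if (c->stack != c->inline_stack) {
        free(c->stack);
    }
    c->stack = NULL;
    c->stack_size = 0;
}
```

The latency gain reported above would come from the first branch: the common contiguous case skips the allocator on both setup and teardown.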
This commit was SVN r3866.
buffer for the SIOCGIFCONF ioctl to complete successfully. Also, use
the sa_len member of the ifreq's ifr_addr member, if available, for
computing offsets of the ifreq structures.
Since this has the potential to break people, setting the env
variable OMPI_orig_if will result in the old code being used. This
will be removed once the new code survives a couple days in the
wild.
This commit was SVN r3845.